                           Table of Contents
Section Title / Contents
Table of Contents
BOSC Computational Toxicology Subcommittee Members
Meeting Agenda
Charge Questions
Computational Toxicology Implementation Plan for FY2009-2012
BOSC Poster Abstracts
CTRP Scientist Biosketches
Training, Outreach and Leadership by the NCCT
      - Scientific Leadership Roles
      - Mentoring
      - Computational Toxicology Rotational Fellowship Program
      - Communities of Practice Presentations
      - Partnership Agreements (MTAs, MoUs, CRADAs, and IAGs)
CTRP Bibliographic Information
      - NCCT
      - New Starts
Prior BOSC Letter  Reports and ORD Responses
      - 2005 Report and Response
      - 2006/2007 Report and Response
      - 2008 Report and Response
           UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
                                             COMPUTATIONAL
                                                 TOXICOLOGY

-------
           BOSC
           BOARD OF SCIENTIFIC COUNSELORS
                                                                     September 2, 2009

                   COMPUTATIONAL TOXICOLOGY SUBCOMMITTEE
                                        ROSTER

      CHAIR
      George Daston, Ph.D.
      Research Fellow
      Miami Valley Laboratories
      The Procter & Gamble Company


      Members

      James Clark, Ph.D.
      Distinguished Scientific Associate
      Exxon Mobil Research & Engineering Co.

      Richard Di Giulio, Ph.D.
      Professor
      Nicholas School of the Environment and Earth Sciences
      Duke University

      Ali Faqi, DVM, Ph.D., DABT
      Director, Developmental and Reproductive Toxicology
      MPI Research, Inc.

      Lawrence Hunter, Ph.D.
      Director, Center for Computational Pharmacology and Computational Bioscience Program
      University of Colorado

      M. Moiz Mumtaz, Ph.D.
      Science Advisor
      Division of Toxicology and Environmental Medicine
      Agency for Toxic Substances and Disease Registry

      Dennis Paustenbach, Ph.D., CIH, DABT
      President and Founder
      ChemRisk, Inc.

      John Quackenbush, Ph.D.
      Professor of Biostatistics and Computational Biology
      Department of Biostatistics and Computational Biology
      Dana-Farber Cancer Institute
A Federal Advisory Committee for the U. S. Environmental Protection Agency's Office of Research and Development

-------
Santiago Schnell, Ph.D.
Associate Professor of Molecular and Integrative Physiology
Brehm Investigator, Brehm Center for Type 1 Diabetes Research and Analysis
Research Associate Professor of Computational Medicine and Bioinformatics
University of Michigan Medical School

Cynthia Stokes, Ph.D.
Independent Consultant

Katrina Waters, Ph.D.
Computational Biology & Bioinformatics Group
Battelle

-------
                  BOARD  OF SCIENTIFIC COUNSELORS
                       COMPUTATIONAL TOXICOLOGY SUBCOMMITTEE
                                          DRAFT AGENDA
                                         September 29-30, 2009
                           Hilton Raleigh-Durham Airport at Research Triangle Park
                                          4810 Page Creek Lane
                                           Durham, NC 27703
                                        Telephone: (919) 941-6000
                                                                                          August 26, 2009
    Tuesday, September 29, 2009

    12:00 p.m. - 12:30 p.m.    Registration

    12:30 p.m. - 12:40 p.m.    Welcome and Introductions               Dr. George Daston,
                               - New Subcommittee Members              Subcommittee Chair
                               - Draft Charge
                               - Meeting Agenda

    12:40 p.m. - 12:45 p.m.    DFO Remarks                             Ms. Lori Kowalski, Office of
                                                                       Research and Development (ORD)

    12:45 p.m. - 1:00 p.m.     Computational Toxicology Research       Mr. Lek Kadeli, ORD Acting
                               Program (CTRP) - Critical Component     Assistant Administrator (AA)
                               of EPA Science in the 21st Century

    1:00 p.m. - 1:45 p.m.      CTRP Overview                           Dr. Robert Kavlock, Director,
                                                                       National Center for Computational
                                                                       Toxicology (NCCT)

    1:45 p.m. - 2:15 p.m.      Introduction to Poster Session I:       Dr. Ann Richard, NCCT
                               Informatics, Exposure Science,
                               ORD, and External Partners

    2:15 p.m. - 4:15 p.m.      Poster Session I                        Subcommittee/ORD

    4:15 p.m. - 5:15 p.m.      Poster Session I: Discussion            Subcommittee/ORD

    5:15 p.m. - 6:15 p.m.      Comments on the CTRP                    Dr. Peter Preuss, Director, ORD/National Center
                                                                       for Environmental Assessment; Mr. Jim Jones,
                                                                       Deputy AA, EPA/Office of Prevention,
                                                                       Pesticides, and Toxic Substances; Dr. John
                                                                       Bucher, Associate Director, National
                                                                       Toxicology Program, National Institutes of
                                                                       Health; Dr. Cal Baier-Anderson, Senior Health
                                                                       Scientist, Environmental Defense Fund

    6:15 p.m.                  Adjourn

-------
         BOSC COMPUTATIONAL TOXICOLOGY SUBCOMMITTEE SEPTEMBER 2009 MEETING AGENDA
Wednesday, September 30, 2009

8:00 a.m. - 8:30 a.m.      Introduction to Poster Session II:      Dr. Thomas Knudsen, NCCT
                           High Throughput Screening, Toxicity
                           Predictions, Virtual Tissues, and
                           Uncertainty Analysis

8:30 a.m. - 10:30 a.m.     Poster Session II                       Subcommittee/ORD

10:30 a.m. - 11:30 a.m.    Poster Session II: Discussion           Subcommittee/ORD

11:30 a.m. - 12:00 noon    CTRP Future: Providing High             Dr. David Dix, Acting Deputy
                           Throughput Decision Support             Director, NCCT
                           Tools for Screening and Assessing
                           Chemical Exposure, Hazard, and Risk

12:00 noon - 12:15 p.m.    Public Comment

12:15 p.m. - 1:15 p.m.     Working Lunch                           Subcommittee

1:15 p.m. - 3:30 p.m.      Subcommittee Working Time               Subcommittee

3:30 p.m.                  Adjourn

-------
                BOARD OF  SCIENTIFIC COUNSELORS
                                                                               August 24, 2009

                                      DRAFT Charge
                          Computational Toxicology Subcommittee
                               September 29-30, 2009 Meeting

    Background

    The National Center for Computational Toxicology (NCCT) became operational on February 20,
    2005.  On April 25-26, 2005, the BOSC Computational Toxicology Subcommittee held its first
    public meeting at the Office of Research and Development's (ORD) Research Triangle Park (RTP),
    North Carolina facility, where the majority of NCCT staff is located. This meeting was intended as
    the first of several consultative reviews of the Center's progress, and was prospective in nature, due
    to the newness of the Center. The Subcommittee developed a letter report from the April meeting
    which addressed six charge questions that concentrated on the NCCT's strategic goals; its
    collaborations, and connectedness to the rest of the Agency and to outside scientists; its staffing
    plan; and its thematic choices. The letter report was finalized by the BOSC Executive Committee
    and transmitted to ORD in July 2005.  A formal response of the NCCT to the review was provided
    to the BOSC at their September 2005 Executive Committee meeting.

    The Subcommittee met again in a June 2006 review to continue to provide the NCCT with advice
    on the progress the Center had made, since Spring 2005, in fulfilling its mission and  strategic goals.
    The subcommittee addressed nine charge questions, which touched on the new extramural
    bioinformatics centers, effectiveness of Center work in Agency research, use of computational tools,
    feedback on the first generation computational toxicology implementation plan, effectiveness of
    Center communication of research, outcomes, and responsiveness to stakeholder needs. A letter
    report was finalized by the BOSC Executive Committee and transmitted to ORD in December 2006.
    A formal response of the NCCT to the review was provided to  the BOSC at their January 2007
    Executive Committee meeting.

    In December 2007, the Subcommittee met to discuss progress the NCCT had made with respect to
    five NCCT activities: ToxCast; Information Management/Information Technology (IM/IT) -
    Informatics; Virtual Liver; Developmental Systems Biology; and Arsenic Biologically-Based Dose
    Response Model. A letter report was finalized by the BOSC Executive Committee and transmitted
    to ORD in September 2008. A formal response of the NCCT to the review was provided to the
    BOSC at their February 2009 Executive Committee meeting. That response included more details
    on progress with informatics (ACToR, DSSTox, ToxRefDB), chemical prioritization (ToxCast and
    ExpoCast), and systems modeling (v-Liver and v-Embryo); and an explanation of the disinvestment
    in arsenic research.

    September 2009 Review

    The purpose of the September 2009 review is to provide the NCCT and broader ORD
    Computational Toxicology Research Program (CTRP) with 1)  advice/recommendations on the
    progress the CTRP has made in the past four and a half years in fulfilling its mission and  strategic
goals; and 2) advice/recommendations on whether the NCCT should continue as an established
organization beyond its original five-year charter.  In particular, the subcommittee will address the
following questions:

Charge Question 1: What is your evaluation of the progress the CTRP has made in achieving its
original goals and objectives, and whether it has efficiently utilized available resources?

Charge Question 2: To what extent and how effectively has the CTRP utilized internal and external
partnerships to foster its goals?

Charge Question 3: What evaluation can you provide relative to the contributions of the CTRP to
the advancement of transforming the field of toxicity testing?

Charge Question 4: To what extent do the ORD intramural projects, the extramural STAR centers,
and the five stated CTRP management priorities described in the FY09-12 implementation plan
combine to efficiently support the goal of providing high throughput decision support tools for
screening and assessing chemical exposure, hazard and risk to human health?

Charge Question 5: The NCCT was established as an organization with a five-year charter ending
in February 2010, which would continue dependent on:  1) meeting established goals; and 2) having
continuing mission-critical goals and objectives. What recommendation(s) can you provide the
Agency regarding continuation of the NCCT as an established organization, and the criticality of its
goals and objectives to EPA?

-------
    U.S. EPA OFFICE OF RESEARCH AND DEVELOPMENT
  COMPUTATIONAL TOXICOLOGY RESEARCH PROGRAM

   IMPLEMENTATION PLAN FOR FISCAL YEARS 2009-2012


  Providing High Throughput Decision Support Tools for Screening
         and Assessing Chemical Exposure, Hazard and Risk


                BOSC Review Draft- 24 August, 2009
                             DISCLAIMER:
This document has been reviewed by the U.S. EPA Office of Research and Development (ORD)
and approved for public release, but does not necessarily constitute official Agency policy. This
Plan follows the first generation FY2005-2008 Computational Toxicology Research Program
(CTRP) Implementation Plan, and provides a strategic overview of research for FY2009-2012.
This Plan was reviewed by ORD senior management and members of the Science Council, as
well as the Computational Toxicology Subcommittee of the ORD Board of Scientific Counselors
(BOSC) on September 29-30, 2009, in RTP, NC.

-------
  EPA CompTox Research Program FY2009-2012                BOSC Review Draft- 24 August, 2009
TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES
ACRONYMS
EXECUTIVE SUMMARY

I.  History of the CTRP and the NCCT
   A. Defining the Mission of Computational Toxicology at EPA
   B. Timeline of CTRP Development
   C. Resources for the CTRP
      1. Funding and NCCT Personnel
   D. BOSC Reviews of the CTRP
   E. NRC Report and EPA's Strategic Plan for 21st Century Toxicology
      1. NRC Report on Toxicity Testing in the 21st Century
      2. EPA Strategic Plan for Evaluating the Toxicity of Chemicals
   F. Significant Accomplishments of the CTRP: FY2006-2008
      1. Accomplishments of the NCCT
      2. Accomplishments by the CTRP ORD Intramural Partners
      3. Accomplishments by STAR Grantees in the CTRP
   G. Summary on Retrospective of the CTRP and NCCT
II. Revision of the CTRP for FY2009-2012
   A. Maturation of the Program
   B. CTRP Integration across Other ORD Laboratories and Centers
   C. Regional and Program Interactions
   D. Priority Areas for CTRP Management
      1. Toxicity Predictions and Chemical Prioritizations Incorporating Exposure
      2. Strengthening Cross-ORD Collaborations
      3. Tox21: A Federal Partnership Transforming Toxicology
      4. Communicating Computational Toxicology
         a.  EPA Program Office Training and Implementation of Computational Tools
         b.  Communities of Practice for Chemical Prioritization and Exposure Science
      5. Developing Clients for Virtual Tissues
III. CTRP Project Summaries for FY2009-2012
   A. Intramural Projects Coordinated by NCCT
      1. ACToR
      2. DSSTox
      3. ToxRefDB
      4. ChemModel
      5. ToxCast™
      6. ExpoCast™
      7. v-Embryo™
      8. v-Liver™
      9. Uncertainty
   B. Intramural Projects Coordinated by NERL and NHEERL
      1. NERL
      2. NHEERL
   C. Extramural STAR Grantee Projects
   D. Summary Integration of the CTRP Projects for FY2009-2012
IV. Appendices
   A. Intramural CTRP Projects
      1. Project Plans
         a.  ACToR - Aggregated Computational Toxicology Resource
         b.  DSSTox - Chemical Information Technologies in Support of Toxicology
             Modeling
         c.  ToxRefDB - Toxicity Reference Database
         d.  ChemModel - The Application of Molecular Modeling to Assessing Chemical
             Toxicity
         e.  ToxCast™ - Screening and Prioritization of Environmental Chemicals Based on
             Bioactivity Profiling and Predictions of Toxicity
         f.  ExpoCast™ - Exposure Science for Screening, Prioritization, and Toxicity
             Testing
         g.  v-Embryo™ - Virtual Embryo
         h.  v-Liver™ - Virtual Liver Project
         i.  Uncertainty Analysis in Toxicological Modeling
      2. Project Outcomes Table
   B. Extramural STAR Centers
   C. FY2004 "New Start" Award Bibliography
   D. EPA Strategic Plan for Evaluating the Toxicity of Chemicals

-------
                                 LIST OF FIGURES

Figure 1 - ORD Computational Toxicology Research Program Development
Figure 2 - CTRP Budget History
Figure 3 - NCCT Organizational Chart
Figure 4 - Computational Toxicology in ORD
Figure 5 - The Future State: Using Hazard and Exposure Predictions to Prioritize Testing and
Monitoring
Figure 6 - Applying Computational Toxicology Along the Source to Outcome Continuum
                                LIST OF TABLES

Table 1 - Project Outcomes Table

-------
ACRONYMS

ACToR Aggregated Computational Toxicology Resource
BOSC Board of Scientific Counselors
CPCP Chemical Prioritization Community of Practice
CoP Communities of Practice
CTISC Computational Toxicology Implementation and Steering Committee
CTRP Computational Toxicology Research Program
DNA Deoxyribonucleic acid
DSSTox Distributed Structure-Searchable Toxicity Database
EDC Endocrine Disrupting Compounds
EPA U.S. Environmental Protection Agency
ExpoCast™ Exposure Forecasting Project
ExpoCop Exposure Science Community of Practice
FTE Full Time  Equivalents
FTTW Future of Toxicity Testing Workgroup
FY Fiscal Year
HTS High Throughput Screening
IAG Interagency Agreements
IRIS Integrated Risk Information System
KEGG Kyoto Encyclopedia of Genes and Genomes
LTG Long Term Goal
MICA Mechanistic Indicators of Childhood Asthma
MOA Modes or Mechanisms of Action
MOU Memorandum of Understanding
MTA Material Transfer Agreements
MYP Multi Year Plan
NCCT National Center for Computational Toxicology
NCEA National Center for Environmental Assessment
NCER National Center for Environmental Research
NCGC National Institutes of Health Chemical Genomics Center
NGO Non-Governmental Organization
NERL National Exposure Research Laboratory
NHEERL National Health and Environmental Effects Research Laboratory
NHGRI National Human Genome Research Institute
NIEHS National Institute of Environmental Health Sciences
NIH National Institutes of Health
NPD National Program Directors
NRC National Research Council
NRMRL National Risk Management Research Laboratory
NTP National Toxicology Program
OECD Organization for Economic Cooperation and Development
OPP Office of Pesticide Programs
OPPT Office of Pollution Prevention and Toxics
OPPTS Office of Prevention, Pesticides, and Toxic Substances
ORD Office of Research and Development
OSCP Office of Science Coordination and Policy
OW Office of Water
QSAR Quantitative Structure Activity Relationship
RFA Request for Applications
RNA Ribonucleic acid
SAB Science Advisory Board
SEE  Senior Environmental Enrollee
STAR Science to Achieve Results
ToxCast™ Toxicity Forecasting Project
ToxRefDB Toxicity Reference Database
v-Embryo™ Virtual Embryo Project
v-Liver™  Virtual Liver Project
v-Liver-KB v-Liver Knowledgebase

-------
EXECUTIVE SUMMARY
This document lays out the fiscal year 2009-2012  objectives of the U.S. Environmental
Protection Agency (EPA), Office of Research and Development (ORD) research program in
computational  toxicology.  Computational toxicology is the  application of mathematical and
computer  models  to  help assess  chemical hazards  and risks  to  human  health  and  the
environment.   Supported  by  advances  in  informatics,  high-throughput screening  (HTS)
technologies, and systems biology, EPA is developing robust and flexible computational tools
that can be applied to the thousands of chemicals in commerce, and contaminant mixtures found
in America's  air,  water,  and  hazardous-waste  sites.  The ORD  Computational Toxicology
Research Program (CTRP) is composed of three main elements. The largest component is the
National Center for Computational Toxicology  (NCCT),  which was  established in 2005  to
coordinate research on chemical screening and prioritization, informatics, and systems modeling.
The  second element consists of related activities in the National Health and Environmental
Effects  Research Laboratory (NHEERL) and  the  National  Exposure Research  Laboratory
(NERL). The third and final component consists of academic centers working on various aspects
of computational toxicology and funded by the EPA  Science to Achieve  Results (STAR)
program. Together these elements form the key components in the implementation of both the
initial strategy, A Framework for a Computational  Toxicology Research Program (US EPA,
2003), and the newly released The U.S. Environmental Protection Agency's Strategic Plan for
Evaluating the Toxicity of Chemicals (US EPA, 2009).  Key intramural projects of the CTRP
include digitizing legacy toxicity testing information in the Toxicity Reference Database (ToxRefDB),
predicting toxicity (ToxCast™) and exposure (ExpoCast™), and creating virtual liver
(v-Liver™) and virtual embryo (v-Embryo™) systems models. EPA-funded STAR centers are also
providing bioinformatics, computational toxicology data and models, and developmental toxicity
data and models. All CTRP projects participate in the Agency's formal quality assurance (QA)
program and regularly undergo peer review. The models and underlying data are being made
publicly available through  the Aggregated Computational Toxicology Resource (ACToR), the
Distributed Structure-Searchable Toxicity  (DSSTox)  Database  Network,   and other EPA
websites. Thus the CTRP is providing the foundation for advancing high-throughput toxicology
and risk assessments,  and thereby closing critical data  gaps for thousands of chemicals and
helping  EPA better assess and manage chemical risk.

The  CTRP is  evolving  beyond the  initial focus  on hazard identification  and chemical
prioritization,  as expressed in the new long term goal of providing high-throughput decision
support  tools for assessing  chemical exposure, hazard and risk. There is an increasing emphasis
on using high-throughput bioactivity profiling data in systems modeling to support quantitative
risk  assessments, and greater involvement  in developing complementary higher  throughput
exposure models.  Discussions are  well underway between NCCT,  NHEERL, NERL,  the
National Risk Management Research  Laboratory  (NRMRL),  the  National  Center  for
Environmental  Assessment (NCEA),  and the National Center for Environmental Research
(NCER, managers of the STAR program) on how the CTRP will play a major role in future
integrated ORD programs centered on looking at chemical hazards and risks  from a life cycle
viewpoint. This integrated approach will enable  analysis  of  life  stage susceptibility,  and
understanding of the exposures, pathways and key events by which chemicals exert their toxicity
in developing systems (e.g., endocrine related pathways). The CTRP will be a critical component
in next generation risk assessments utilizing quantitative high-throughput data and providing a
much higher capacity for assessing chemical toxicity than is currently available.

This second generation CTRP  implementation  plan  is highly consistent with the Agency's
priority for improving the management of chemical and contaminant risks. A January 2009
memo sent to all EPA employees by Administrator Jackson listed managing chemical risks as
one of five top priorities and stated:

   "More than 30 years after Congress enacted the Toxic Substances Control Act, it is clear that
   we are not doing an adequate job of assessing and managing risks of chemicals in consumer
   products, the workplace and the environment.  It is now time to revise and strengthen EPA's
   chemicals management and risk assessment programs."

With  contributions from across  ORD, the CTRP will provide EPA  program offices better
decision analysis tools for hazard and exposure screening and assessment, which can then  be
used to better manage the risks of chemicals. The CTRP is acquiring an international reputation
for leadership in the introduction of innovative high-throughput technologies and computational
approaches for identifying toxicity pathways  and characterizing  response  to  environmental
exposures. Through this effort, problems in EPA's chemicals management and risk assessment
programs will be addressed and solutions developed.

-------
                      Mission statement of Computational
                      Toxicology: To integrate modern
                      computing and information technologies
                      with molecular biology to provide the
                      Agency with decision support tools for
                      high-throughput risk assessment.
I.      HISTORY OF THE CTRP AND THE NCCT

A.     Defining the Mission of Computational Toxicology at EPA

Computational toxicology applies mathematical
and computer models and molecular biological
and chemical approaches to explore both
qualitative and quantitative relationships
between chemical exposure and adverse health
outcomes. Recent technological advances make
it possible to develop molecular profiles using
high-throughput and high content methods that
identify the impacts of environmental exposures on living organisms. With these tools, scientists
can produce a more detailed understanding of the hazards and risks of a much larger number of
chemicals. The integration of modern computing with molecular biology and chemistry is
allowing  scientists to better understand a chemical's progression through the environment to the
target tissue within an organism, and ultimately to the key steps that trigger an adverse health
effect. Currently, risk estimates are most often based on gross outcomes of disease such as
occurrence of cancer, a neurological disorder, or a visible birth defect. The National Research
Council, in its 2007 report Toxicity Testing in the 21st Century: A Vision and a Strategy, called
for a concerted effort to move toxicology from a primarily descriptive science to a more
predictive one by utilizing largely human-based in vitro studies to understand the biological
pathways by which chemically induced diseases occur. EPA's CTRP is working to aid this
transformation by evaluating the key molecular changes occurring in the function of critical
human toxicity pathways within cells, tissues, individuals and populations. The key will be
connecting these changes quantitatively and systemically to the types of adverse health effects
that have been the traditional basis of EPA risk assessments and using this understanding to
reduce the current uncertainties in the extrapolation of effects across dose, species and
chemicals. The ORD CTRP is composed of three main elements. The largest component is the
NCCT, which was established in 2005 to coordinate research on chemical screening and
prioritization, informatics, and systems modeling. The second element consists of related
activities in the NHEERL and NERL laboratories of ORD. The third and final component
consists of academic centers working on various aspects of computational toxicology and funded
by the EPA STAR program.

The rapid and ongoing success of the CTRP is improving hazard and exposure identification:
it is helping to close data gaps, identify toxicity pathways, suggest modes of action, and focus
limited resources more efficiently on the highest-priority chemicals. Besides these
initial outcomes from the higher throughput approach of the CTRP, informatics and modeling
efforts will provide more in-depth and quantitative molecular understanding of how biological
systems respond to environmental chemicals. These knowledgebases and in silico tools will
reduce or quantify uncertainties relating to biological susceptibility,  species differences and dose
response  as part of a faster and more intelligent targeted testing paradigm in support of
quantitative risk assessments.

-------
B.     Timeline of CTRP Development

In FY2002, Congress ordered a redirection of $4 million from available EPA funds,

       "...for the research, development and validation of non-animal alternative chemical
       screening and prioritization methods, such as rapid, non-animal screens and Quantitative
       Structure Activity Relationships (QSAR), for potential inclusion in EPA's current and
       future relevant chemical evaluation programs."

To fulfill this directive, the EPA embarked on development of a research program that: (1) was
consistent with the Congressional mandate; (2) complemented and leveraged related on-going
Agency sponsored efforts to consider alternative test methods; (3) further advanced the research
to support the Agency's mission; and (4) would not duplicate the mission and programs in this
area conducted by other agencies (see Figure 1 for a timeline of CTRP development). Thus the
CTRP was initiated to target these goals and, in the process, significantly advance toxicology and
risk assessment as currently practiced by the Agency and the broader environmental sciences
community. In FY2002-2003 pilot projects were funded to demonstrate computational
toxicology could be adapted to the study of endocrine disruptors. Early successes of these efforts
included refinement of estrogen receptor ligand binding data for development of quantitative
structure-activity models, evaluation of EPA-developed cell lines for detecting estrogen and
androgen activities from various species and the development of an alternative test method for
evaluating effects on steroidogenesis in H295R cells (see EDSP Assay Status).

Figure 1. ORD Computational Toxicology Research Program Development

FY02: Congressional Redirect; EDC Pilot Projects
FY03: CTRP Design Team; SAB and BOSC reviews; RTP CTRP Workshop; STAR HTS RFA
FY04: CTRP Framework; CTISC; 7 CTRP Proof of Concepts; STAR Systems Biology RFA
FY05: NCCT Launch; DSSTox v1; ToxCast Concept; sBOSC I
FY06: 1st CTRP Implementation Plan; ToxCast Design; 1st and 2nd STAR Centers; sBOSC II
FY07: 1st Title 42s; Chemical Prioritization CoP; ToxCast Launch; ToxRefDB Launch;
      v-Tissues Launch; Intl Science Forum
FY08: NCCT Staffing Complete; NAS 21st Century Vision; EPA 21st Century Strategy; Tox21 MOU;
      ACToR Launch; DSSTox v2; 3rd STAR Center; sBOSC III
FY09: 2nd CTRP Implementation Plan; ACToR v1; ExpoCast Launch; ToxRefDB v1;
      ToxCast I Complete; 1st ToxCast Summit; ToxCast II Launch; V-Tissues 09;
      4th STAR Center; sBOSC IV

With increasing attention to and expectations for the CTRP in FY2003, ORD developed A
Framework for a Computational Toxicology Research Program, which was published in
FY2004 and provided strategic direction for the program. This document was the product of a
cross-ORD design team of scientists and was endorsed by the Science Advisory Board (SAB).
ORD hosted a workshop in Research Triangle Park in late FY2003 to introduce the CTRP
framework (Kavlock et. al., 2003), from which the three objectives for EPA computational
toxicology were translated into the three initial Long Term Goals (LTGs) for the program:

       • Risk assessors use improved methods and tools to better understand and describe the
       linkages of the source-to-outcome paradigm,
       • EPA Program Offices use advanced hazard characterization tools to prioritize and
       screen chemicals for toxicological evaluation and
       • EPA assessors and regulators use new and improved methods and models based on the
       latest science for enhanced dose-response assessment and quantitative risk assessment.

With issuance of the CTRP framework, ORD began the process of implementing a more
formalized program. A cross-Agency working group, the Computational Toxicology
Implementation Steering Committee (CTISC), was formed in FY2004 to oversee the selection
and funding of projects across ORD. Seven cross-ORD projects were initiated as a result of CTISC
action, and these "new start" projects became a critical component of the first generation CTRP
implementation plan. Greater detail on the accomplishments of these seven projects is provided
below in Section I.F.2.

In October 2004, the then EPA Science Advisor and Assistant Administrator for ORD, Dr. Paul
Gilman, announced the formation of the National Center for Computational Toxicology (NCCT),
which began official functions in February 2005. The announcement states:

       "The Center will advance the science needed to more quickly and efficiently evaluate the
       potential risk of chemicals to human health and the environment. The Center will
       coordinate and implement EPA's research on computational toxicology to provide tools
       to conduct more rapid risk assessments and improve the identification of chemicals for
       testing that may be of greatest risk."

NCCT quickly became the hub for ORD CTRP research. NCCT formed key partnerships with
the other Laboratories and Centers within ORD, which formed the second critical element.
Partnerships with NHEERL, NERL, NRMRL, and NCEA helped in the execution not only of the
seven cross-ORD "new start" projects awarded by the CTISC in 2004, but also of several
original NCCT-led projects, including the Distributed Structure-Searchable Toxicity Database
(DSSTox), ToxCast™, ToxRefDB, and the virtual tissues projects looking at liver and embryo.
Greater detail on the accomplishments and future plans for these and other NCCT projects is
provided below in Section I.F.1.

The third critical component of the CTRP is extramural partners and research, much of which is
supported by NCER through the STAR program. In FY2003 and FY2004, two separate STAR
Requests for Applications (RFAs) were issued that funded projects in HTS and systems biology.
In FY2006, two STAR academic centers to support the advancement of bioinformatics in
environmental health were funded, with a third center for computational toxicology funded in
FY2008. An award for a fourth center, which will focus on pathways and models of
developmental toxicity, was made in late FY2009. Additional information on the STAR centers,
their accomplishments, and future plans is provided in Section I.F.3.

In ORD's Computational Toxicology Research Program Implementation Plan for FY2006-2008,
these various research efforts supporting the three LTGs identified in the CTRP framework were
grouped into five research tracks: (A) Development of Data for Advanced Biological Models;
(B) Information Technologies Development and Application; (C) Prioritization Method
Development and Application; (D) Providing Tools and System Models for Extrapolation across
Dose, Life Stage, and Species; and (E) Advanced Computational Toxicology Approaches to
Improve Cumulative Risk Predictions. Within these five tracks, all of the NCCT, ORD
intramural, and STAR-funded extramural project plans and interconnections were defined in this
first generation implementation plan. The next generation of the CTRP implementation plan
carries forward many of these same research components into FY2009-2012.

As noted in the EPA Strategic Plan for Evaluating the Toxicity of Chemicals (see Section I.E.2),
many environmental statutes require that EPA consider both human and ecological health risks.
While the initial emphasis of the CTRP has been on improvements in human health
assessments, in order to provide the critical mass and resources necessary to be successful, the
program must over time address ecological concerns as well. As we look to the future, we
see opportunities to leverage existing efforts such as the Tox21 project (see Section II.D.3) and
research on alternative species (e.g., the zebrafish projects underway at NHEERL and at the
newest STAR Center) to help transition the CTRP to a more balanced human and ecological
assessment program.

C.     Resources for the CTRP

1.     Funding and NCCT Personnel

Funding of the CTRP has been relatively stable over the past several years, and this has allowed
the program to develop consistent with the strategic plan. In FY2009, the program was funded at
~$15M and 32 FTEs. Approximately 50% of the resources are allocated to NCCT, 25% to the STAR
program, and the remainder to NHEERL and NERL. The majority of the FTEs (~22) are located in
the NCCT. Figure 2 displays the history of the budget through to the President's FY2010
request, which would provide an increase in funding, initially to support Phase II of ToxCast™,
but then broader aspects of the program in the out years.

Figure 2 - CTRP Budget History (FTEs and $ in millions)
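
As a rough worked illustration of the allocation described above, the sketch below converts the
approximate FY2009 figures quoted in the text (~$15M total; about 50% to NCCT, 25% to STAR, the
remainder to NHEERL and NERL) into dollar amounts. The inputs are rounded values, so the results
are approximations only. The function name is illustrative, and the remainder is not split between
NHEERL and NERL because the text does not give that breakdown.

```python
def allocate(total_musd, shares):
    """Split a total (in $M) by fractional shares; the leftover is reported as 'remainder'."""
    amounts = {name: total_musd * frac for name, frac in shares.items()}
    amounts["remainder"] = total_musd - sum(amounts.values())
    return amounts


# Approximate FY2009 values taken from the paragraph above.
fy2009 = allocate(15.0, {"NCCT": 0.50, "STAR": 0.25})

for name, musd in fy2009.items():
    print(f"{name}: ~${musd:.2f}M")
```

Here "remainder" stands for the combined NHEERL and NERL share.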

The NCCT is organized into three primary functional groups: Chemical Prioritization, Systems
Modeling, and Informatics, with a small group of administrative support personnel (Figure 3). In
addition to the permanent federal staff of 23, there are a number of postdoctoral, predoctoral,
student contractors, and Senior Environmental Enrollees (SEE) (A Grantee Organization) who
support various aspects of the program. In 2006 and 2007 NCCT successfully recruited three
senior-level Title 42 scientists in bioinformatics and systems biology. These hires have proven
critical for NCCT to establish core research projects in predictive and systems toxicology and the
necessary informatics and computational infrastructure. More recently, in 2008, several junior
research positions were filled by scientists coming through the NCCT postdoctoral program.
Full CVs of the staff are provided at the NCCT website. The NCCT is currently recruiting a
public affairs specialist to improve both internal and external communications.

Figure 3 - NCCT Organizational Chart

  ORD NCCT Organizational Chart

    EPA   Environmental Protection Agency
    ORD   Office of Research and Development
    NCCT  National Center for Computational Toxicology

    Director: Robert Kavlock
    Deputy Director (on detail): Jerry Blancato

    Administration
      Management Analyst: Karen Dean
      Program Analyst: Sandra Roberts
      SEE Grantee: Dorothy Goodson
      Public Affairs Specialist: TBD

    Chemical Prioritization
      David Dix (Acting Deputy Director)
      Daniel Rotroff - UNC Graduate Student
      Keith Houck
      Stephen Little
      James Rabinowitz

    Systems Modeling
      Elaine Cohen Hubal
      Tom Knudsen

D.     BOSC Reviews of the CTRP

To help guide the CTRP, ORD established a standing subcommittee of the ORD BOSC to
provide review and advice to NCCT and the CTRP. To view all prior BOSC reviews and ORD
responses please visit the attached link (NCCT BOSC Reviews).

The BOSC first met in April 2005 to review the organization of NCCT, initial plans for
implementation, and progress of the early CTRP work. The panel commented very favorably on
the Center's early progress and the means outlined to achieve its goals. The composition of staff,
plans for future hiring, establishment of working partnerships, and the Center's strategic plan
were especially highlighted. Several recommendations were made for consideration. The two main
recommendations were to develop a formal implementation plan for the future and to
develop Communities of Practice (CoPs) within the EPA, which could serve a networking
function for interested scientists. ORD's Computational Toxicology Research Program
Implementation Plan (FY 2006 - 2008) was completed in April of 2006, in time for the next
BOSC review. Two CoPs have been organized and are discussed later in this document. A few
other minor suggestions were also made, which were addressed in the formal ORD response to
the review.

On June 19-20, 2006, a second BOSC site visit was held to assess and evaluate NCCT progress
in executing the first implementation plan and incorporating prior BOSC recommendations. In
this review it was noted that, in the 16 months of its existence, "NCCT has made substantial
progress in (1) establishing goals and priorities; (2) making connections within and outside EPA
to leverage staff's considerable modeling experience; (3) expanding its capabilities in informatics;
and (4) significant contributions to research and decision making throughout the Agency." The
BOSC also noted that "many of the recommendations made by the BOSC during its first review
have been acted on by NCCT." This review occurred just before NCCT hired the first two
scientists under ORD's new Title 42 authority, which brought needed experience in informatics
and systems biology to address several of the key recommendations of the BOSC from this
second review.

The third review by the BOSC occurred on December 17-18, 2007. In this review, the BOSC
noted that NCCT continued to make substantial progress in setting priorities and goals, and
specifically acknowledged the "increased capabilities in bioinformatics through the funding of
two STAR centers and in informatics and systems biology through staff hires; expansion of its
technical approaches to even more programs within the Agency; and formation of an extensive
collaboration with the NIEHS and the NHGRI for its ToxCast™ project." They went on to
recommend that: (1) client offices participate in future reviews to ensure all parties understand
how NCCT's efforts can address the most relevant needs of the Agency; (2) interactions with
risk assessors in the Agency be enhanced, particularly related to how ToxCast™ might be
utilized by them; (3) complementary efforts in exposure prioritization be undertaken; (4) the
directions and milestones be detailed and applications of the virtual tissues to risk assessment be
clarified; and (5) the database for compilation and rigorous quantitative analysis of the
ToxCast™ data be defined more precisely. Overall, the review considered the goals of the Center
to be well described, very ambitious and innovative, as well as important for the future of
research at EPA.

The next BOSC review is scheduled for September 29-30, 2009, and will focus on the products
from the first CTRP implementation plan and the future directions outlined in this second
generation plan.

E.     NRC Report and EPA's Strategic Plan for 21st Century Toxicology

Two significant activities have increased the visibility and importance of EPA's computational
toxicology research efforts over the past two years. The first is a report from the NRC,
commissioned by the EPA, that presented a vision of how toxicological evaluations should be
conducted in the future. The second activity, motivated by the NRC report, was development of
an EPA strategic plan to transform how the Agency addresses chemical toxicity.

1.     NRC Report on Toxicity Testing in the 21st Century

In 2007, the NRC released a report titled Toxicity Testing in the 21st Century: A Vision and a
Strategy, which outlines a long-term vision for developing novel approaches to
chemical toxicity characterization and prediction. This vision addresses several concerns about
the current "gold standard" methods for toxicity characterization, which rely heavily on extensive
animal testing. These concerns are the desire to reduce the number of animals used in testing, to
reduce the overall cost and time required to characterize each chemical, and to increase the level
of mechanistic understanding of chemical toxicity.

The NRC report outlines an approach for toxicity determination in which each chemical would
be first characterized for a number of properties related to environmental distribution, exposure
risk, physico-chemical properties, and metabolism. These activities fall under the heading
"chemical characterization," which (with the exception of metabolism) excludes bioactivity in the
target organism. The second stage is "toxicity pathway characterization," in which a series of
cell-based and non-cell-based (in vitro) tests would be used to indicate which (if any) "toxicity
pathways" are activated by the test chemical. A major challenge posed by this approach is that
few such toxicity pathways are currently understood, and assays to probe these pathways are
therefore generally lacking. Next, for a subset of chemicals, "targeted testing" would be carried
out to refine our understanding of the effects of triggering specific toxicity pathways. Targeted
testing might involve additional in vitro assays or a limited amount of in vivo animal testing. A
final phase is "dose response and extrapolation modeling," which would use new and existing data
and models to perform low dose extrapolation, toxicokinetics, and exposure estimation. All
phases would have significant computer modeling components. The end result of these studies
would be a determination of the potential toxic effects (including mode and mechanism of
action) of a compound, as well as estimates of dose response behavior.

There is a relatively simple argument as to why an in vitro-based approach should be able to
predict whole animal or human toxicity. The effect of a chemical is ultimately due to direct or
indirect molecular interactions of the chemical with one or more cellular components. These
interactions can be receptor or enzyme binding, disruption of a lipid membrane, localized
production of free radicals, or non-specific dephosphorylation. Whatever the type of
interaction, if two chemicals have the same biological interactions and have the same
distribution and kinetics within an organism, then the two chemicals should present the same
bioactivity profiles and potential toxic effects. This concept highlights a major benefit of the
in vitro, mechanism-based approach: it provides a way to extrapolate from one chemical to the
next based on a set of relatively inexpensive and quick biochemical or cellular assays.
Achieving this vision, however, will take many years due to a number of circumstances that are
described in the NRC report. It is noteworthy that the NRC report did not downplay how
difficult the task of developing this new approach to toxicity testing would be, proposing a
timeline of 20 years with annual investment on the order of $100M to fully achieve this vision.
However, for initial chemical screening and prioritization, a generalized set of assays for key
biological targets and pathways can be implemented in a much shorter time and at much lower
cost.
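
The extrapolation argument above (chemicals with the same biological interactions should show
the same bioactivity profiles) can be illustrated with a minimal sketch. It assumes each chemical
is summarized as the set of assays in which it was active, and it compares chemicals by Jaccard
similarity. The chemical names, assay names, and the choice of Jaccard similarity are illustrative
assumptions, not data or methods taken from the NRC report or the CTRP.

```python
def jaccard(hits_a, hits_b):
    """Jaccard similarity of two sets of active assays (1.0 means identical profiles)."""
    a, b = set(hits_a), set(hits_b)
    if not a and not b:
        return 1.0  # two chemicals inactive in every assay share the same (empty) profile
    return len(a & b) / len(a | b)


# Hypothetical assay hit calls for three made-up chemicals.
profiles = {
    "chem_A": {"ER_binding", "AR_binding", "membrane_disruption"},
    "chem_B": {"ER_binding", "AR_binding", "membrane_disruption"},
    "chem_C": {"ER_binding", "oxidative_stress"},
}


def most_similar(query, profiles):
    """Return (name, similarity) for the chemical whose profile is closest to the query."""
    others = [
        (name, jaccard(profiles[query], hits))
        for name, hits in profiles.items()
        if name != query
    ]
    return max(others, key=lambda pair: pair[1])


print(most_similar("chem_A", profiles))
```

In this toy data set chem_B has the same profile as chem_A, so it is the natural candidate for
extrapolation, while chem_C shares only one of four assays. The argument in the text also requires
similar distribution and kinetics, which a profile comparison of this kind does not capture.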

2.     EPA Strategic Plan for Evaluating the Toxicity of Chemicals

In response to the release of the NRC report, EPA established an intra-agency workgroup, the
Future of Toxicity Testing Workgroup (FTTW), under the auspices of the Science Policy
Council. The FTTW includes representatives from across the Agency, including the Regions and
all major Program Offices. It produced The U.S. Environmental Protection Agency's Strategic
Plan for Evaluating the Toxicity of Chemicals, which serves as a blueprint to ensure a leadership
role for EPA in pursuing the directions and recommendations presented  in the 2007 NRC report.
The strategy presents the Agency's vision of how to incorporate a new scientific paradigm and
new tools into toxicity testing and risk assessment practices with ever-decreasing reliance on
traditional approaches. The overall goal of the strategy is to provide the tools and approaches
necessary to move from a near exclusive use of animal tests for predicting human health effects
to a process that relies more heavily on in vitro assays, especially those using human cell lines.

The program envisioned in this strategy builds upon the traditionally major components of the
risk assessment process (i.e., hazard identification, dose response, exposure assessment, and risk
characterization) by overlaying a toxicity pathways approach on the source → fate &
transport → exposure → outcome continuum. Specific components include three integrative and
interactive focal areas:

    •   Toxicity Pathway-Based Chemical Screening and Prioritization. This research will
       focus  on (1) identifying toxicity pathways and deploying in vitro assays to characterize
       the ability of chemicals to perturb those pathways, and (2) further development and
       implementation of the ToxCast™ concept to establish the predictive relationship of the
       new assays for identifying adverse outcomes in humans or ecological populations.

    •   Toxicity Pathway-Based Risk Assessment. This element will focus on reducing key
       risk assessment uncertainties currently associated with the extrapolation of data from
       animal studies to humans, from high doses to relevant human exposures, and to different
       population susceptibilities (e.g., children). The program will achieve these ends by (1)
       developing knowledgebases of toxicity pathways, toxicological responses, and
       information on biological  networks; (2) constructing dynamic computational models of
       tissue biology that link diverse data together to understand the progression of events from
       exposure to effect;  and (3) demonstrating that this new vision of toxicity testing
       adequately predicts human risk using case studies.

   •   Institutional Transition. Implementing major changes in toxicity testing of
       environmental chemicals and incorporating new types of toxicity data into risk
       assessment will require significant institutional changes in terms of (1) how EPA
       transitions to the use of new types of data and models; (2) how EPA deploys resources
       necessary to implement the new toxicity testing paradigm, such as hiring of scientists
       with particular scientific expertise and training of existing scientific  staff; and (3) how
       EPA educates stakeholders and the public.

While the workgroup identified a range of partners in this effort and some planning on the
relative roles of these partners has been done, the specific areas of work to be conducted and
funded by EPA versus these other partners need to be further assessed. Decisions on these
relative roles will have a significant impact on EPA resources required to implement the vision.
Regardless, the CTRP will play a central role in all three goals of the strategy - from identifying
and conducting high throughput screening on chemical libraries of interest, to developing
systems-level models of biology for application in risk assessment, to training program offices in
the understanding and use of the new technologies.

F.     Significant Accomplishments of the CTRP: FY2006-2008

1.     Accomplishments of the NCCT

With the CTRP in existence for nearly six years, and the NCCT for more than four of
those six, a number of projects contained within the CTRP are now delivering important
accomplishments to regulatory offices within the Agency, toxicologists in the international
community, and the regulated chemical industry. These accomplishments are summarized
below. Greater detail is available in the descriptions of the ten NCCT-led projects provided in
Appendix IV.A, and additional information on the annual milestones and projected impacts of
the projects is provided in summary format in Appendix IV.C.

A number of projects on informatics, chemical prioritization and systems level models of biology
are beginning to provide the foundation for high throughput, decision making tools that EPA
program offices can apply to chemical hazard screening and risk assessment. These include:

          •  ToxRefDB: A relational, electronic toxicity reference database (ToxRefDB)
             developed in partnership with the Office of Pesticide Programs that contains
             results of over 30 years and $2B worth of rat and mouse chronic, rat
             multigenerational, and rat and rabbit developmental studies for over 400
             chemicals (Martin et al., 2009a; Martin et al., 2009b; Knudsen et al., 2009). This
             relational database is allowing the Agency, for the first time, to readily discern
             patterns of toxicity in the assays and to assess the value of the design of the assays
             in assessing toxicity. It will also be invaluable in the interpretation of the HTS
             data derived in the ToxCast™ Program. This database will be expanded to
             include Developmental Neurotoxicity Assays, and potentially results from the
             Endocrine Disruptor Screening Program, thus affording a one-stop shop for
             animal bioassay data.

        •   ACToR: The Aggregated Computational Toxicology Resource (ACToR)
           provides an Internet based portal of information on chemical structure, bioassay
           and toxicology data for environmental chemicals from 200 sources of public data
           for over 500,000 chemicals, and provides a central integrated public resource for
           all DSSTox, ToxCast™ and ToxRefDB data (Judson et al., 2009). ACToR was
           released to the public in the second quarter of FY2009, and will undergo periodic
           updates over the next several years as we add data and functionalities, and
           respond to user feedback.

        •   DSSTox: The Distributed Structure-Searchable Toxicity Database Network
           (DSSTox) was updated with several high-interest chemical inventories (including
           ToxCast™ Phase I chemicals, EPA High Production Volume Chemicals, and the
           two major public genomics inventories, GEO and ArrayExpress), an on-line
           structure-browser, and linkages and coordination with internal and outside
           resources such as ACToR and PubChem (Williams et al., 2009; Williams et al.,
           2009). DSSTox, in coordination with ACToR, is also responsible for all chemical
           information registration and review for the ToxCast™ and Tox21 projects (see the
           Tox21 entry below for a more detailed explanation), and is the source of high
           quality chemical structures for the OECD QSAR Toolbox.

        •   ToxCast™: ToxCast™ is a major effort to evaluate the comprehensive use of
           HTS assays to provide biological fingerprints of activity that can be used to
           predict adverse outcomes in rodents and in humans (Dix et al., 2007). With the
           initiation of 9 contracts in FY2007, the CTRP began to generate a myriad of
           in vitro-based molecular data for the ToxCast™ program. Data collection for over
           300 chemicals from over 500 assays in Phase I of this chemical prioritization
           program was completed in quarter 2 of FY2009. An internal EPA workshop
           regarding ToxCast™ was held in March 2008, and a public ToxCast™ Summit
           workshop was held in May 2009 with over 30 national and international Material
           Transfer Agreement (MTA) analysis partners participating. Analyses are
           focusing along three dimensions: (1) comparing bioactivity profiles across
           chemical classes; (2) correlating specific assays or pathways with toxic
           phenotypes; and (3) correlating phenotypic syndromes with bioactivity profiles. It
           is expected that analysis of the data will continue over several years, both by EPA
           and external groups, as the compendium of data represents a truly unique and
           innovative resource. A significant number of additional partners were brought
           into the program in FY2008 and FY2009 via MTAs. Additional contract awards
           are anticipated to augment the breadth of biological pathways contained within
           Phase II, which will launch in late FY2009. A critical feature of Phase II is the
           plan to include drugs that have failed in human clinical trials, thus providing
           human toxicity data upon which to benchmark ToxCast™ bioactivity profiles.
           One major pharmaceutical company, Pfizer, has already agreed to collaborate in
           this regard, and the Health and Environmental Sciences Institute has adopted the
           concept as their Emerging Issues Proposal for 2009, promoting the opportunity
           for a much wider group of pharmaceutical companies to contribute failed drugs
           and clinical data. Thus Phase II of ToxCast™ will be generating HTS data on 700
           additional chemicals, some with animal toxicity information, or clinical and
           additional data on human disease, susceptibility, and variability that will
           contribute to the goal of expanding, verifying and translating in vitro bioactivity
           into predictions of potential toxicity.

        •   Tox21: To garner complementary expertise across the federal government to
           transform the field of toxicology to a more predictive science, EPA signed a
           Memorandum of Understanding (MOU) with the National Toxicology Program
           (NTP)/National Institute of Environmental Health Sciences (NIEHS) and the
           National Institutes of Health Chemical Genomics Center (NCGC)/National
           Human Genome Research Institute (NHGRI) in February
           2008 (Collins et al., 2008). This "Tox21" consortium now has four active working
           groups identifying chemicals, assays, informatic analyses and targeted testing as
           plans proceed to have nearly 10,000 chemicals under study at the NCGC by the
           fourth quarter of FY2009 (Kavlock et al., 2009). Supported by Interagency
           Agreements (IAGs) between the NTP and EPA, this effort will conduct 50 or
           more HTS assays on this enlarged chemical library every year for the next several
           years.

        •   Cumulative Risk: Tools of computational toxicology have been used in
           integrating data for the cumulative risk assessments of cholinesterase inhibiting
           pesticides (organophosphates and carbamates) and in developing a model for the
           effects of iodomethane, a fumigant, on development. These methods have been
           used to set safe exposure limits for these  chemicals in the field and home.

        •   Communities of Practice (CoP): Though not projects or tools in the typical
           sense, NCCT has formed several CoPs promoting the utilization of CTRP
           research in specific areas of computational toxicology. Each CoP has a charter
           and an open membership policy, and is co-chaired by a member of the NCCT.
           The CoPs operate via activities such as teleconferences, face-to-face meetings,
           team rooms, and workshops. By bringing together members from different parts
           of EPA, ORD and the outside scientific, regulatory and regulated community,
           CoPs help to promote the adoption of common practices and ontologies, guide
           development of common databases and software usage, aid in construction of
           training materials, provide recommendations on efficiencies of relevant
           operations, and act as a public outreach mechanism for ORD activities. To date,
           two CoPs have been established, on Chemical Prioritization and on Exposure
           Science. Both of these are large, active groups that meet monthly and have
           brought a wealth of ideas, interest and international collaboration to various
           CTRP projects.

        •   International Science Forum on Computational Toxicology. In 2007, the
           CTRP hosted EPA's annual Science Forum. This was the first EPA Science
           Forum held outside of Washington, D.C. and featured an international overview
           of the state of the science on computational toxicology (Kavlock et al., 2008).
           More than 300 people attended this Forum on advances in computer sciences,
           molecular biology and chemistry, and systems models that can be used to
           increase the efficiency and the effectiveness by which the hazards and risks of
           environmental chemicals are determined. The Forum surveyed the state of the art
           in many areas of computational toxicology and identified key areas important to
           move the field forward: proof-of-concept studies demonstrating the additional
           predictive power gained; more researchers comfortable generating and working
           with high throughput data and using it in computational modeling; and regulatory
           authorities willing to embrace new approaches as they gain scientific acceptance.
           Ideas from this Forum and efforts bridging from it were helpful in EPA's
           development of The U.S. Environmental Protection Agency's Strategic Plan for
           Evaluating the Toxicity of Chemicals.
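
One of the ToxCast™ analysis dimensions listed above, correlating specific assays with toxic
phenotypes, can be sketched as a 2x2 contingency table relating in vitro assay activity to an
in vivo outcome. The chemical identifiers and hit calls below are made up for illustration, and
the odds ratio with a 0.5 continuity correction is an assumed, generic association measure rather
than the method used in the ToxCast™ analyses.

```python
def contingency(assay_hits, phenotype_pos, chemicals):
    """Count chemicals by (assay active?, phenotype observed?) into a 2x2 table."""
    a = b = c = d = 0
    for chem in chemicals:
        hit, tox = chem in assay_hits, chem in phenotype_pos
        if hit and tox:
            a += 1  # active in vitro and toxic in vivo
        elif hit:
            b += 1  # active in vitro only
        elif tox:
            c += 1  # toxic in vivo only
        else:
            d += 1  # neither
    return a, b, c, d


def odds_ratio(a, b, c, d):
    """Odds ratio with a 0.5 continuity correction so empty cells do not divide by zero."""
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))


# Hypothetical library of ten chemicals with made-up hit calls.
chemicals = [f"chem_{i}" for i in range(1, 11)]
assay_hits = {"chem_1", "chem_2", "chem_3", "chem_4"}
phenotype_pos = {"chem_1", "chem_2", "chem_3", "chem_5"}

table = contingency(assay_hits, phenotype_pos, chemicals)
print(table, round(odds_ratio(*table), 2))
```

An odds ratio well above 1 suggests the assay is enriched among chemicals showing the phenotype.
With a library this small the estimate is very uncertain, so a real analysis would also need a
significance test and correction for the many assay-phenotype pairs examined.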

2.     Accomplishments by CTRP ORD Intramural Partners

The seven cross-ORD "new start" research projects initiated by the CTISC in FY2004 advanced
the field of computational toxicology and were important components in the first generation
CTRP implementation plan and establishment of the NCCT. Details on the organization, goals,
and cross-ORD participants for these projects were provided in the FY2006-2008 CTRP
Implementation Plan, and a comprehensive listing of publications from these projects is included
in Appendix C.  Descriptions of major accomplishments are provided below:

       Linkage of Exposure and Effects Using Genomics, Proteomics, and Metabolomics in
       Small Fish Models: This project used a combination of whole organism endpoints,
       genomic, proteomic, and metabonomic approaches, and computational modeling to (a)
       identify new molecular biomarkers of exposure to endocrine disrupting compounds
       (EDCs) representing several modes/mechanisms of action (MOA) and (b) link those
       biomarkers to effects that are relevant for both diagnostic and predictive risk assessments
       using small fish models. Data from the project have provided the basis for several
       important predictive modeling efforts. For example, a graphical systems model was
       developed that defines the HPG axis of small fish and enables
       consideration of the interactive nature of a perturbed system at multiple levels of
       biological organization, ranging from changes in gene, protein and metabolite expression
       profiles to effects in cells/tissues that directly influence reproductive success. A second
       modeling effort involves development of a steady-state model for ovarian tissue to
       predict synthesis and release of testosterone and estradiol. Results from the model were
       successfully compared to data generated from the fathead minnow. Model-predicted
       concentrations of the two steroids over time corresponded well with both baseline
       (control) data, and information from experiments in which estradiol synthesis was
       blocked by fadrozole. Modeling also has focused on the consequences of perturbations in
       the HPG axis relative to effects in individuals and populations. Here, a population model
       which employs a Leslie matrix in conjunction with the logistic equation  (to account for
       density dependence) was used to translate laboratory toxicity information into prediction
       of population trajectories. In these analyses, changes in steroid or vitellogenin
       concentrations in female fish were first related to fecundity, and this relationship was
       then used to predict population status in fish exposed to EDCs that inhibit production of
       vitellogenin, most notably compounds that depress steroid synthesis (e.g., fadrozole,
    prochloraz, trenbolone). That analysis is unique in that it focuses on biochemical
    endpoints, female steroids and vitellogenin, which both reflect the toxic MOA of EDCs and
    have a functional relationship to reproductive success (formation of eggs). As such,
    within the overall systems framework for the project, this computational model can serve
    as the basis via which genomic information can be quantitatively linked to responses in
    populations.
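
    The population modeling step described above can be illustrated with a minimal sketch.
    It is not the project's model: the vital rates, carrying capacity, and the assumed
    fecundity reduction are hypothetical values chosen for illustration, and the
    density-dependent term exp(-r N / K) is one common way to combine a Leslie matrix with
    logistic-type density dependence, assumed here because the text does not give the
    exact form used.

```python
import math


def leslie_matrix(fecundity, survival):
    """Leslie matrix: age-specific fecundities in row 0, survival on the subdiagonal."""
    k = len(fecundity)
    m = [[0.0] * k for _ in range(k)]
    m[0] = [float(f) for f in fecundity]
    for i, s in enumerate(survival):
        m[i + 1][i] = float(s)
    return m


def project(matrix, n):
    """One step of the density-independent projection (matrix times age vector)."""
    return [sum(a * x for a, x in zip(row, n)) for row in matrix]


def growth_rate(matrix, iters=500):
    """Dominant eigenvalue (finite rate of increase) by power iteration."""
    n = [1.0 / len(matrix)] * len(matrix)
    lam = 1.0
    for _ in range(iters):
        nxt = project(matrix, n)
        lam = sum(nxt)  # n sums to 1, so the new total is the growth factor
        n = [x / lam for x in nxt]
    return lam


def simulate(matrix, n0, r, capacity, steps):
    """Project with a logistic-type damping factor exp(-r * N / K) applied each step."""
    n, totals = list(n0), [sum(n0)]
    for _ in range(steps):
        damp = math.exp(-r * sum(n) / capacity)
        n = [damp * x for x in project(matrix, n)]
        totals.append(sum(n))
    return totals


# Hypothetical vital rates for three age classes (illustration only).
FECUNDITY = [0.0, 4.0, 6.0]
SURVIVAL = [0.3, 0.5]
K = 1000.0

baseline = leslie_matrix(FECUNDITY, SURVIVAL)
r = math.log(growth_rate(baseline))  # intrinsic rate of increase, unexposed population

# Assumed exposure scenario: fecundity reduced to 25% of control.
exposed = leslie_matrix([0.25 * f for f in FECUNDITY], SURVIVAL)

control_totals = simulate(baseline, [100.0, 50.0, 25.0], r, K, steps=200)
exposed_totals = simulate(exposed, [100.0, 50.0, 25.0], r, K, steps=200)
print(round(control_totals[-1]), round(exposed_totals[-1]))
```

    Under these assumed values the control population settles near the carrying capacity,
    while the exposed population, whose growth rate falls below 1, declines toward zero.
    In an actual analysis the fecundity reduction would come from the measured
    relationship between steroid or vitellogenin concentrations and fecundity.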

    A Systems Approach to Characterizing and Predicting Thyroid Toxicity Using an
    Amphibian Model:  The main objective of this work was to develop a hypothalamic-
    pituitary-thyroid (HPT) model which is capable of integrating data from different levels
    of biological organization into a coherent system. A simulation model has been
    developed to describe the thyroid axis of X. laevis tadpoles. Information pertaining to
    normal baseline HPT-axis development was collected to compare to the perturbed
    system. Thyroid and pituitary gland culture systems were developed to investigate the
    response of these components in isolation from the HPT-axis feedback mechanisms
    within the animal. Genes responsive to TSH were also measured in the thyroid gland and
    pituitary in vivo and in vitro in response to chemical exposure and TSH stimulation.
    Genes involved in the T4 synthesis pathway that robustly responded to TSH, such as NIS
    and thyroid peroxidase (TPO), were not changed when challenged with chemical alone.
    Other thyroid-specific genes such as thyroid transcription factor and thyroglobulin were
    not TSH-responsive. Efforts were made to measure TPO activity from X. laevis, but the
    small amount of tissue available prevented development of an assay. Therefore,
    porcine thyroid glands were used to isolate and measure TPO activity. Twenty-four
    chemicals have been tested in the in vitro assay for their capacity to inhibit TPO activity.
    Although the majority of the tested chemicals were negative, several were identified as
    TPO inhibitors. These positives were tested in the thyroid gland explant culture assays
    and the TH released into the culture media was measured by RIA. One of the test
    chemicals, a mercaptobenzothiazole, inhibited T4 release from the thyroid glands at
    concentrations that were not overtly toxic to the gland. Chemicals that test positive in the
    TPO inhibition assay and the ex vivo thyroid gland TH release assay will be tested in the
    abbreviated amphibian metamorphosis assay to determine activity in the HPT-axis in
    vivo. This will begin to provide information on the predictive ability of the in vitro and ex
    vivo assays for identifying thyroid-disruptive chemicals.

    Risk Assessment of the Inflammogenic and Mutagenic Effects of Diesel Exhaust
    Particulates: A  Systems Biology Approach:  This project utilized a systems approach
    to developing and applying predictive computational models that quantitatively describe
    relationships between the composition of DEP and its genotoxic and inflammogenic
    potencies. A significant accomplishment is the development of a prototype
    ESP sampler, which has substantially improved DEP collection yield
    relative to the conventional filter methods previously used for particle capture.
    These analyses have produced an unprecedented physicochemical characterization of the
    DEP, including XRF analysis of metal content and GC/MS analysis of organic species, as
    well as determinations of OC/EC, particle size and aerodynamic characteristics,  etc. By
    design, this phase generated inflammogenic and mutagenic DEPs of varying composition
    to address the central hypothesis regarding the inflammogenic and mutagenic potency of
    DEP. An extensive database has been generated and is currently being analyzed for
    publication and use in model development.

    The Mechanistic Indicators of Childhood Asthma (MICA) study: A Systems Biology
    Approach to Improve the Predictive Value of Biomarkers for Assessing Exposure,
    Effects, and Susceptibility in the Detroit Children's Health Study:  This
    computational toxicology effort integrated rodent and human research across the source-
    to-outcome continuum to link gene expression with clinical outcomes and biomarkers of
    exposure, early effect, and susceptibility for two broad classes of chemicals: polycyclic
    aromatic hydrocarbons (PAHs) and metals. In collaboration with Michigan State
    investigators, exposures of rodents to concentrated air particulates were completed using
    state-of-the-art mobile air exposure chambers. Genomic analysis of rodent blood and lung
    tissues highlighted tissue-specific patterns of gene expression related to airborne
    exposures. For the children's study (whose protocols were approved by three separate
    Institutional Review Boards), a 20-page respiratory health questionnaire was mailed to the
    parents of 6,883 children aged 7 to  12 years recruited through the Henry Ford Health
    System. An indoor/outdoor MICA-Air component was added using an innovative
    participant-based air sampling approach that included measurements of nitrogen dioxide
    and selected volatile organic compounds. A subset (205) of these children volunteered for
    clinical examinations (including measurements of pulmonary function and exhaled nitric
    oxide), collection of blood (for both genetic and gene expression analysis), and collection
    of nail and urine samples. Our analysis strategy represents a true "systems" approach in that each
    type of data is examined in the context of the broader MICA data set. This systems
    approach affords more robust conclusions, because the predictive value of biomarkers
    from particular data slices can be assessed for biological and statistical validity against
    the diverse set of supporting data.

    An Approach to Using Toxicogenomic Data in Risk Assessment:  Dibutyl Phthalate
    (DBP) Case Study: To address how genomic data may be used most effectively in risk
    assessment, this project was initiated with the goals of: 1) developing an approach to
    using toxicogenomic data in risk assessment and 2) testing this approach in a case study.
    Recognizing that genomic data type (e.g., species, organ, design, method) varies, the
    approach included formulating questions that the toxicogenomic data may  inform. Since
    microarray data is often informative of the mechanism or mode of action (MOA) of a
    chemical, the approach included an assessment of the toxicogenomic dataset in
    conjunction with the toxicity dataset in order to relate the affected endpoints (identified in
    the toxicity dataset evaluation) to the pathways (identified in the toxicogenomic dataset
    evaluation) as a method for informing the mechanism of action. Dibutyl phthalate (DBP)
    was selected for the case study, focusing on the male reproductive outcomes, since it has
    a relatively large and consistent genomic dataset, phenotypic anchoring of certain gene
    expression data for these outcomes, and an ongoing Integrated Risk Information System
    (IRIS) assessment. The case study team concluded that the "genomic dataset" should
    include all gene expression data (single gene, global gene expression, protein, RNA) in
    the evaluation as these data taken together provide a stronger basis for reproducibility of
    the global gene expression study findings. This evaluation found that the gene level
    findings  from the DBP genomic studies (i.e., microarray, RT-PCR, and protein
    expression) were highly consistent in both the identification of differentially expressed
    genes (DEGs) and their direction of effect. This project also identified research needs for
    toxicity and toxicogenomic studies for use in risk assessment. These include: 1) toxicity
    study designs that parallel those of the toxicogenomic studies (i.e., dose, timing of exposure,
    organ/tissue evaluated), so that comparable toxicity and toxicogenomic studies aid
    our understanding of the linkage between gene expression changes and phenotypic
    outcomes; 2) exposure time-course microarray data to develop a regulatory network
    model; 3) TK data generated in relevant studies (time, dose, tissue), with relevant
    internal dose measures to derive the best internal dose metric; and 4) multiple doses in
    microarray studies in parallel with phenotypic anchoring.

    Development of Microbial Metagenomic Markers for Environmental Monitoring
    and Risk Assessment: This project focused on the development of nonculture-based
    genomic methods for environmental monitoring and risk assessment. The research
    focused on the use of a microbial community genome (metagenome) approach to identify
    novel nucleic acid sequence markers for fecal contamination and source identification.
    The basic experimental design consisted of challenging genomic microbial community
    DNA from different fecal samples in genome subtraction studies to enrich for
    host-specific microbial genes. Sequencing complete fecal metagenomes generated redundant
    information, making the selection of host-specific genes difficult. To address this
    limitation, a novel approach called genome fragment enrichment was developed to select
    for DNA fragments present in a specific fecal microbial community and absent in other
    fecal communities. A patent disclosure was filed on this specific method. In each case,
    hundreds of enriched DNA fragments were sequenced and assigned a putative functional
    role. While several assays were developed, these have to be further evaluated,
    particularly to determine  the geographic stability of these methods, not only in the U.S.,
    but also in other parts of the world. To this end, we are working with researchers from
    different regions in the U.S. and from countries such as Canada, Brazil, Austria,
    Singapore, and Spain. Results of this research will overcome the current limitation of
    assessing microbial water quality by measuring bacterial densities, an approach that is
    both time consuming and uninformative about the sources impacting a watershed.
    Such information is necessary to implement adequate pollution control and remediation
    practices.

    Simulating Metabolism of Xenobiotic Chemicals as a Predictor  of Toxicity: The
    MetaPath research project is a collaboration with OPP scientists developing a capability
    for forecasting the metabolism of xenobiotic chemicals of EPA interest, to predict the
    most likely formed chemical metabolites, and to interface that information with toxic
    effect models allowing prediction of parent chemical toxic potential and the identity of
    chemical metabolites of equal or greater toxicity than the parent chemical. Key
    milestones include developing and expanding the in vivo and in vitro liver metabolism
    database especially for chemicals  and transformation reactions underrepresented in the
    current database; finalizing development of the searchable metabolism database and
    continuing to populate existing databases  with additional metabolism data; enhancing the
    performance of the existing metabolic simulator by incorporating reliable metabolism
    data and expansion of relevant transformation reactions; and conducting in vitro
       experimentation to verify maps and metabolites forecasted by the metabolic simulator
       and evidence for enhanced estrogenicity.

3.     Accomplishments by STAR Grantees in the CTRP

The STAR grants and centers have been a critical component of the CTRP from its inception. A
brief summary of the accomplishments of the three existing STAR centers for this program is
provided below. More information and fuller descriptions of these centers are available in
Appendix B.

NCER funded two STAR environmental bioinformatics centers as part of the CTRP in FY2006.
The Research Center for Environmental Bioinformatics and Computational Toxicology at the
University of Medicine  & Dentistry of New Jersey (UMDNJ), Piscataway, NJ, and The Carolina
Environmental Bioinformatics Research Center at the University of North Carolina (UNC),
Chapel Hill, are operating as cooperative agreements and helping to facilitate the application of
bioinformatics tools and approaches to environmental health issues supported by the CTRP.

To date, the UMDNJ center has made progress expanding the framework of the FDA
ArrayTrack to ebTrack, an integrated bioinformatics system for environmental research and
analysis enabling the integration, curation, management, first-level analysis and interpretation of
environmental and toxicological data from diverse sources. Other major accomplishments
include the enhancement of Shape Signatures QSAR technology for chemical hazard
identification; metabolic engineering tools for identifying important pathways within the  overall
hepatocyte metabolism; and computational procedures for quantifying the structure of molecular
bionetworks via the S-space Network Identification Protocol (SNIP) and the  Closed-Loop
Identification Protocol (CLIP).

Major accomplishments of The Carolina Environmental Bioinformatics Research Center include the
development and refinement of a mouse model of variation in genetic susceptibility relevant to
human populations, pathway modeling in genomic analysis, and new methods in quantitative
structure-activity relationship (QSAR) modeling relevant to toxicity. This work complements other work in
the CTRP, utilizing the unique strengths of the STAR Center in genetics, toxicology, and
statistical modeling. An early outcome of this work included dissection of the genetic  regulation
of liver gene expression. In addition, the Carolina Environmental Bioinformatics Research Center has
refined expression quantitative trait locus (eQTL) analysis procedures. These methods serve the
larger goal of elucidating the underlying mechanisms of toxicity. The Center has also  developed
high-quality methods for testing biological pathway involvement in toxicogenomics studies, and
a novel hierarchical two-step approach to model chemical structure for in vitro/in vivo toxicity
data.

In FY2008 NCER funded a third center, through a cooperative agreement, The Carolina Center
for Computational Toxicology at the UNC at Chapel Hill. This Center is applying high-
performance computing techniques and resources to in silico multi-scale modeling applications
at the cellular, organ, and system-wide level. In its first year, the center has begun to design
and implement advanced mathematical approaches for modeling biological systems and the
biological-chemical interactions represented in ToxCast™ and other datasets.

Another high priority for EPA is to understand the molecular and cellular processes that, when
perturbed, result in developmental toxicity. With a project start date of November 2009, NCER
responded to this need by funding the Texas-Indiana Virtual STAR Center: Data-Generating in
vitro and in silico Models of Developmental Toxicity in Embryonic Stem Cells and Zebrafish at
the University of Houston, Texas A&M Institute for Genomic Medicine, and Indiana University.
This center will bridge the interface of in vitro data generation and in silico model
development to answer critical biological questions related to toxicity pathways important to
human development. This research should result in an improved predictive capacity for
estimating outcomes or risk associated with developmental exposure to environmental toxicants.

G.    Summary on Retrospective of the CTRP and NCCT

The first five years of the CTRP and NCCT have seen a great deal of progress and
accomplishment. These accomplishments have come from across ORD, from the STAR
grantees, and increasingly from the NCCT as the Center has become fully staffed and its projects
mature. As the first implementation plan comes to an end, the CTRP is poised to carry
out the vision of the NRC report  on Toxicity Testing in the Twenty First Century: A Vision and
Strategy, and The U.S. Environmental Protection Agency's Strategic Plan for Evaluating the
Toxicity of Chemicals. This CTRP implementation plan for FY2009-2012 will go on to explain
revisions to the program and specific projects that will be management priorities into the future.

II.     REVISION OF THE CTRP FOR FY2009-2012

A.     Maturation of the Program

The accomplishments of the CTRP noted above have enabled the program to evolve beyond the
early focus on hazard identification and chemical prioritization into broader areas of risk
assessment. In  doing so, a number of research activities are becoming increasingly intertwined.
For example, results of the ToxCast™ program are now providing  data for models being
developed in the virtual tissue programs, and in turn, the virtual tissue programs are beginning to
guide targeted testing needs by other parts of the program. As a result, activities have
blended across the three long-term goals (LTGs) contained in the first
implementation plan. While the LTGs served us well in the initial years of the program, we have
reduced them in the current plan to a single goal: Providing High Throughput Decision
Support Tools for Screening and Assessing Chemical Exposure, Hazard and Risk.  Following
The U.S. Environmental Protection Agency's Strategic Plan for Evaluating the Toxicity of
Chemicals, we will see increased emphasis on the use of new data types being generated in the
CTRP in quantitative risk assessment, and greater involvement in aspects of high throughput
exposure models and in analysis of results coming out of high throughput bioactivity profiling.
As noted in other places, discussions are already well underway between NCCT and NHEERL,
NERL and NCEA on the shape of such an expansion. This expansion will incorporate progress
from ToxCast™ and ExpoCast™, as well as the v-Liver™ and v-Embryo™ projects, into
informatics tools and databases that can be used for both research and regulatory applications.
For example, in FY2011 we expect ToxCast™ will be winding down Phase II of its three part
development program, providing in vitro and in vivo toxicity  data on a total of 1,000 compounds
across hundreds of molecular targets and biological pathways. Over this same time period, the
Tox21 consortium will be building a chemical screening library of up to 10,000 chemicals, and
conducting approximately one assay per week on this library  and providing additional in vitro
bioactivity data. Within the ACToR database and website, and in the analysis application
ToxMiner, all of this bioactivity data will be merged with chemical information and the exposure
data being curated from the ExpoCast™ project. At this stage in FY2011, steps will be taken to
provide EPA's program offices with decision analysis tools that incorporate hazard and exposure data
for prioritization of further chemical testing. Prior to that time the CTRP will be hosting periodic
training courses on the new science and tools for program office and regional personnel. We
envision this decision analysis tool being part of the ToxMiner application. In addition to
statistically based predictive toxicology tools, ToxMiner will be able to reference the mapping of
ToxCast™ assays to biological pathways, curated into a computable format from sources such as
the Kyoto Encyclopedia of Genes and Genomes (KEGG).
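
As a rough illustration of how hazard and exposure data might be combined for
prioritization, the sketch below ranks chemicals by the ratio of an estimated exposure to
the lowest bioactive concentration observed in vitro. This is an assumed, simplified
scoring scheme and not the ToxMiner decision analysis tool; the class, function names,
chemical names, and values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chemical:
    name: str
    min_ac50_um: float   # lowest in vitro AC50 across assays, in µM (hypothetical)
    exposure_um: float   # estimated exposure concentration, in µM (hypothetical)


def priority_score(chem):
    """Exposure-to-bioactivity ratio: larger means exposure is closer to bioactive levels."""
    return chem.exposure_um / chem.min_ac50_um


def prioritize(chems):
    """Return chemicals ordered from highest to lowest priority for further testing."""
    return sorted(chems, key=priority_score, reverse=True)


# Made-up example inputs, for illustration only.
candidates = [
    Chemical("chem_A", min_ac50_um=1.0, exposure_um=0.01),
    Chemical("chem_B", min_ac50_um=50.0, exposure_um=0.001),
    Chemical("chem_C", min_ac50_um=0.5, exposure_um=0.1),
]

ranked = prioritize(candidates)
for chem in ranked:
    print(chem.name, priority_score(chem))
```

A single ratio is used here only to keep the example short; a real tool would presumably
weigh many assays, pathways, and sources of uncertainty.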

Depending on the precise pace of progress in ToxCast™ and  ExpoCast™, and the development
of screening and prioritization tools, additional resources could be freed up to support other
components of the CTRP in FY2011 and FY2012. However,  chemical prioritization will remain
a priority area until it reaches the stated objectives. A natural  extension of the chemical
prioritization projects is using this data for systems modeling in the v-Liver™ and v-Embryo™
projects. It is these systems models that will help translate the toxicity and exposure pathway
information coming out of ToxCast™ and ExpoCast™ into next-generation, higher throughput
risk assessments. In conjunction with OPPTS, we have also begun an examination of the
feasibility of using HTS tools to evaluate the potential hazards of nanomaterials. This effort will
start with carbon nanotubes and nanomaterials being used in the OECD Working Party on
Manufactured Nanomaterials and include materials submitted to the EPA via the
Premanufacturing Notification program of the Toxic Substances Control Act. We expect
the application of bioactivity profiling of nanomaterials to become an increasingly important
component of the CTRP as we gain experience with their handling and screening results.
Another major consideration in the evolution of these programs, which screen high-priority
chemicals in toxicity pathways for testing prioritization, is tailoring them appropriately so that
the research can be reduced to practice for regulatory use. At current funding levels, the
burden of putting into practice the screening and prioritization of 5,000-10,000 chemicals using a
comprehensive battery of HTS assays could not be borne by the ORD CTRP alone.

Relative to the STAR program, the number of funded Centers will reach up to five (two on
environmental bioinformatics, one on computational toxicology, and one or two on virtual
tissues) by FY2010, and consideration will need to be given to second-generation Centers as
the first Centers awarded approach the end of their funding cycles.

For historical purposes, the following is provided as transition information relative to the original
three-LTG structure, with notation of the degree of emphasis for the research activities that will be
carried over into the new implementation plan. Much of the increased activity is predicated on
continued and increasing support of the CTRP, such as is being witnessed in the FY2010
budget process. The following section describes research areas of increased and decreased effort
relative to the FY2006-FY2008 three LTGs.

       Long-Term Goal 1 - EPA risk assessors use improved methods and tools to better
       understand and describe linkages across the source  to outcome paradigm. Work was
       directed towards computational models and modeling systems that represented
       comprehensive descriptions of the underlying biology of adverse impacts  caused by
       exposure to environmental agents. The whole-systems biology modeling approach was to
       develop a range of models, from those describing pharmacodynamic connections
       between exposure and effects to those describing complex endogenous pathways and the
       perturbations in such pathways resulting from environmental exposures. Also, ways to
       incorporate and use "omics" information in these models were explored. Finally,
       attempts were made at formulating models of common, but complex, disease processes
       which are then exacerbated by exposures to exogenous substances and stressors through
       the development of virtual organ models, the first being the virtual liver and second the
       virtual embryo.

          •  Increase - efforts to develop virtual models of tissues (liver and embryo) that link
             across levels of biological organization from molecular to cellular to tissue level
             responses

          •  Increase - Coordination of efforts across NCCT, NERL, NHEERL, and NCEA to
             ensure models of aspects of the source to outcome paradigm can be integrated and
             scaled to meet the increasing needs of chemical evaluations.

        •  Decrease - Research efforts related to the validation and acceptance of PBPK
           models

        •  Decrease after FY2009 - Linkage of exposure and effects using genomics,
           proteomics, metabonomics in small fish models

     Long-Term Goal 2 - EPA Program Offices use advanced hazard characterization tools
    to prioritize and screen chemicals for toxicological evaluation.  Molecular biological
    tools were employed to develop fingerprints of biological activity of chemicals of
    concern to the EPA.  Computational models were  applied to the fingerprints to derive
    associations with classical measures of toxicity derived from animal studies so that
    predictive models were developed leading to more efficient testing paradigms and
    reduction in uncertainties in inter-species extrapolation. Proof-of-concept demonstration
    of ToxCast™, the forecasting tool, is scheduled to provide a number of EPA program
    offices with an extremely useful tool to improve the efficiency and effectiveness of
    hazard identification and risk assessment methodologies. There were also new and
    innovative ways developed to assimilate, evaluate, and use the myriad of data associated
    with molecular and chemical information. Increasingly integrated chemical-biological
    effects databases are intended to spur new capabilities for data mining and chemical
    categorization in conjunction with HTS data. Development of advanced computational
    chemistry methods also provided in silico means to predict complex interactions of
    environmental chemicals with biochemical receptors which can then lead to adverse
    effects.

        •  Increase - Support for Phase II of ToxCast™. The additional chemicals, which
           could total up to  1,000, will include a greater number of pesticides, pesticidal
           inerts, antimicrobials, industrial chemicals, water contaminants and failed drug
           candidates. Phase II should be winding down sometime in FY2011, depending on
           the exact pace of progress in FY 2009 and FY2010.
        •  Increase - Develop methods to analyze data and relate the results from the
           ToxCast™ studies to potential for hazard and risk from realistic human exposure
           levels.
        •  Increase - Interactions between NCCT and NHEERL on identification of
           additional critical toxicity pathways to include in the chemical prioritization
           program.

        •  Increase - In conjunction with NERL develop ExpoCast™, the exposure
           component of a chemical prioritization process.

        •  Increase - Cross-governmental collaborations to employ quantitative high
           throughput screening assays to predict human toxicity by engagement with
           relevant components of the NTP/NIEHS, NCGC/NHGRI and FDA in the Tox21
           program.

       Long-Term Goal 3 - EPA risk assessors and regulators use new models based on the
       latest science to reduce uncertainties in dose-response assessment, cross-species
       extrapolation, and quantitative risk assessment.  The intent of this goal was to develop
       additional key modules for computational models of biological processes relevant to the
       induction of toxicity for high priority environmental chemicals. These modules would
       help assess the interaction of exposure to environmental chemicals with other processes
       such as underlying disease and concomitant intake of pharmacological agents. As a
       result, EPA will be less reliant on default assumptions for risk assessment and better able
       to accurately characterize the true uncertainty associated with risk predictions for various
       chemical  classes (e.g., EDCs) under conditions more relevant to actual exposures and
       lifestyles.

          •  Decrease - Chemical-specific efforts, in favor of tools that have greater generic
             applicability.

          •  Increase - Analysis of the resulting HTS data to (1) provide mode of action
             information to specific risk  assessments being conducted by EPA; (2) provide
             rationale for grouping of cumulative risk assessments based on toxicity pathways;
             (3) design a higher throughput risk assessment approach for chemicals based
             upon exposure potential and perturbation of toxicity pathways;  and (4) develop
             methods for analyzing and quantifying the uncertainty in dose-response model
             predictions.

These modifications to levels of effort and emphases are consonant with distillation of the three
LTGs from the FY2006-2008 implementation plan into a single goal--Providing High
Throughput Decision Support Tools for Screening and Assessing Chemical Exposure, Hazard
and Risk. This will  more efficiently  support the use of new data types being generated in the
CTRP in quantitative risk assessments, as well as providing high throughput hazard and exposure
data and models for screening. Since the majority of the HTS data is being generated for human
molecular targets and pathways,  this will support a transition from the current  dependence on
animal-based toxicology. In combination with clinical data from pharmaceutical partners,
expanding efforts bringing in human data on exposure, and multi-scale systems models, it should
be possible to improve both the pace and quality of risk assessments.

B. CTRP Integration across ORD Laboratories and Centers

Due to the relatively small size of NCCT and the ambitious nature of the CTRP mission, a key
part of the process of advancing  the  science involves developing partnerships that are both within
and external to ORD, so as to best leverage resources committed to the effort. Within ORD, the
majority of research in health, ecology, and risk assessment is contained within a variety
of Multi-Year Plans (MYPs) that incorporate efforts from multiple Laboratories and Centers and
are  coordinated by National Program Directors (NPDs). In the case of the CTRP, the Director of
the NCCT also leads the ORD Computational Toxicology Program. At present, there is an
ongoing active dialogue between ORD Laboratories and Centers, and relevant NPDs, regarding
the  future directions of the CTRP and other programs. To move beyond just dialogue, the NCCT
has hosted, or is hosting, rotational fellows from across ORD (NERL, NHEERL and NCEA to
date) who spend 4 months or more working with NCCT scientists on CTRP projects.

In response to the Administrator's priorities, a pilot effort is being considered by ORD that
addresses problems of broad national significance through highly integrated multi-disciplinary
research efforts. In the last several months, NCCT, NERL, NHEERL, NRMRL and NCEA have
been exploring opportunities for greater synergy in execution of the CTRP, and each Laboratory
and Center is actively developing implementation plans for such an effort. This integration will
include aspects of the Human Health Research Program MYP and the Safe Pesticides/Safe
Products MYP. The challenge will be to ensure these plans are adequately integrated and the
outputs are suitably ambitious in nature. As EPA has just released its Strategic Plan for the
Evaluation of the Toxicity of Chemicals (U.S. EPA 2009), there is a window of opportunity to
continue this momentum within the CTRP, while also expanding efforts to include critical
aspects of exposure assessment (NERL), toxicity pathway coverage and targeted testing
(NHEERL), life cycle analysis of chemical use (NRMRL), and quantitative risk assessment
(NCEA). If successful, this integration of efforts across multiple laboratories, centers, and
research programs could meld the numerous ongoing ORD research efforts in computational
toxicology into a more functional integrated multidisciplinary research program.

Figure 4. Computational Toxicology in ORD. [Diagram. Vertical axis: Data Repositories,
Quantitative Tools, Virtual Systems. Horizontal axis: Environmental Release, Environmental
Concentration, Individual Exposure, Internal Dose, Biological [...]. Components shown include
Ambient Monitoring, Personal Monitoring, Biomonitoring, HTS, Assay Development, In vitro,
Rodent models, Human studies, rTK, ToxRefDB, TTDB, QSAR, MetaPath, and Populations.]

Figure 4 shows the stages in the source-to-outcome continuum on the horizontal axis, and
existing components of ORD research and development infrastructure needed to support risk
assessment on the vertical axis, and products/tools for decision support on the far right. At
present, many of these efforts are not integrated, and in some cases are not fully compatible. A
major emphasis in the near future will be to bring more coordination and integration to these
currently diverse efforts. To overcome these challenges, integrated cross-disciplinary teams,
involving exposure scientists, effects researchers, risk assessors, risk managers and
computational modelers will need to be further developed.

Understanding of complex, interrelated environmental stressors and potential impacts on human
health has grown tremendously in recent years. Basic and clinical sciences, however, have
significantly outpaced risk assessment science. Insights into health and disease exist, but it is
unclear how to incorporate this information into risk assessment with current methodologies.
Risk assessment will have to address a fundamental paradigm shift away from a reliance on animal
toxicology data derived primarily from rodent bioassays. The need for this shift is made more
immediate by the challenges of applying new types of data stemming from advances in
computational toxicology and the huge volume of data that will be generated from the European
Union's Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) Program. High
throughput data will need to be translated into knowledge to support science-based decisions in
risk assessment. The areas where this knowledge is expected to have an impact include defining
toxicity pathways, informing risk assessors about interpretation of multiple modes of action
for toxicity, and providing insight into human variability in key pathways and human
susceptibility. The impacts of this information will be qualitative and quantitative. For this type
of information to be incorporated quantitatively, numerous challenges exist, not the least of which
is extrapolation from in vitro test systems to in vivo human health outcomes. However, the
challenges are not insurmountable if ORD, all of EPA and key extramural partners work together
in  a coordinated, multidisciplinary fashion. The initial impacts of this new paradigm will
probably be seen in cases of chemicals lacking significant data sets but for which toxicity
predictions or rankings can be developed from HTS data. These results may be used to derive
estimates of potency relative to chemicals that have much larger databases and affect the same
toxicity pathways. In addition, challenges in extrapolating from effective concentration in vitro
will require additional considerations in the development of environmental exposure estimates.
Promising CTRP efforts are underway with antimicrobials and pesticidal inerts, EDSP
compounds, industrial chemicals such as phthalates and perfluorinated chemicals, and
hepatocarcinogens and teratogens. Results from these studies have been presented, and are
being made available through the peer-reviewed literature, EPA databases, and decision
support software tools.

The CTRP will continue to provide  critical research components, work on integrating these
efforts across ORD, and facilitate the institutional transition necessary to see these tools
integrated by EPA programs and regions. This includes quantitative and mechanistic
experimental data (e.g., ToxCast™) that are useful for chemical prioritization and also support
systems models that could be useful for quantitative risk assessments.  Data generated from
experimental systems can be used to define toxicity pathways and link them to adverse events
via the mode-of-action framework. This definition of adverse, versus As toxicity pathways and
key events are defined, development of high throughput assays for measuring impact on these
pathways, and bioindicators of downstream key events will provide the tools for HTS of
compounds and evaluation of outcome predictions in target populations. Future efforts will also
need to focus on the translation of the computational and high throughput methods into
information that can be used in risk assessment. This is most apparent with respect to actual use
in quantitative risk assessment (i.e., being able to use in vitro or computational methods to
develop a point of departure for an IRIS assessment or other type of risk assessment), but is also
relevant to prioritization (i.e., how the relative potency information derived from the in vitro
assays or models compares to in vivo potency) if information from screening assays is going to be
used to drive further testing decisions. Key issues in developing the next generation of risk
assessments include:

   •   Defining adversity: Before these methods can be used in risk assessment, it will be vital
       to understand how 'perturbations of toxicity pathways' relate to adverse effects that are of
       concern for human or ecological health.
   •   Development of decision methods and criteria for using high-throughput data in risk
       assessment.

The large interdisciplinary projects required to meet the goals of this program are dependent on
well integrated data repositories such as ACToR. These databases not only provide inputs for
empirical models for prioritization/hazard identification, but also give access to the
well-structured data required for data mining to define and evaluate toxicity pathways and for
dose-response modeling. As hazard identification and risk characterization tools are developed, a gradual shift from
reliance on empirical models supported by limited mechanistic information to incorporation of
detailed mode-of-action for predicting hazard can proceed. The first step in this process will be
the development of prioritization tools out of ToxCast™ which incorporate toxicity pathway
information to identify chemicals of most concern and to direct more detailed testing and
modeling toward chemicals and toxicity pathways of highest importance. As computational
models are developed that relate the perturbation of a toxicity pathway quantitatively to an
adverse outcome (aided by development of virtual tissues), it should become possible to not only
prioritize but also screen for chemicals where the evidence is convincing  of either safety or
toxicity.  The toxicity pathways and modes of action for those chemicals where evidence is
inconclusive would then become  the subject of further experimental study. Eventually,
quantitative prediction of risk from HTS data could be derived based on evaluation of toxicity
pathway predictions from systems models of tissues and organs (e.g., v-Liver™ and
v-Embryo™), and perhaps even simulated target populations.

Figure 5. The Future State: Using Hazard and Exposure Predictions to Prioritize Testing and
Monitoring. [Diagram. Chemicals are grouped by ToxCast hazard prediction (H) and by high versus
low exposure potential (E). Labeled outcomes: Lower Priority for Testing and Monitoring; Low
Priority for Bioactivity Profiling; Intelligent, Targeted Testing; Human Biomonitoring.]

A major step in higher throughput modeling and prediction of exposure will come from the
CTRP ExpoCast™ project, a collaboration of NERL and the NCCT focused on providing
Biologically-Relevant Exposure Science for 21st Century Toxicity Testing. The ExpoCast™
project is described in more detail in the project plans attached in the appendices.  It should be
noted that pre-existing and other NERL research, databases and models will be critical to the
success of ExpoCast™ and the broader goals of ORD's CTRP. As ExpoCast™, ToxCast™ and
other modeling efforts succeed, it should become possible to prioritize testing and monitoring of
large numbers of chemicals and other environmental contaminants. These prioritizations would be
feasible because they would be based on hazard (H) and exposure (E) predictions of calculable
and reasonable uncertainty that were derived solely from in silico and in vitro data (Figure 5).
Their use will depend on validation of these predictions and on development of decision support
tools and software for translation into a form useful for chemical prioritization by EPA program offices.
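
As a rough illustration of the prioritization idea in Figure 5, the following Python sketch bins
chemicals by whether their hazard (H) and exposure (E) predictions exceed a cutoff. The score
scale, the 0.5 cutoffs, the bin labels, and the chemical names are hypothetical and are not taken
from ToxCast™ or ExpoCast™; an actual scheme would also need to carry the uncertainty attached
to each prediction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Hypothetical screening result for one chemical (scores assumed to lie in [0, 1])."""
    name: str
    hazard: float    # assumed in vitro / in silico hazard score (H)
    exposure: float  # assumed exposure-potential score (E)


def priority_bin(p: Prediction, hazard_cut: float = 0.5, exposure_cut: float = 0.5) -> str:
    """Assign an illustrative priority bin from the H and E flags."""
    high_h = p.hazard >= hazard_cut
    high_e = p.exposure >= exposure_cut
    if high_h and high_e:
        return "HE"    # both flagged: candidate for targeted testing and monitoring
    if high_h:
        return "H"     # hazard flagged, low exposure potential
    if high_e:
        return "E"     # exposure flagged, low hazard prediction
    return "none"      # neither flagged: lower priority


def prioritize(preds):
    """Group chemical names by bin, preserving input order within each bin."""
    bins = {"HE": [], "H": [], "E": [], "none": []}
    for p in preds:
        bins[priority_bin(p)].append(p.name)
    return bins


if __name__ == "__main__":
    # Made-up chemicals and scores, for illustration only.
    demo = [
        Prediction("chem_A", 0.9, 0.8),
        Prediction("chem_B", 0.7, 0.1),
        Prediction("chem_C", 0.2, 0.6),
        Prediction("chem_D", 0.1, 0.1),
    ]
    for label, names in prioritize(demo).items():
        print(label, names)
```

Fixed cutoffs are used here only to keep the sketch short; a ranking that weighs the size and
uncertainty of each prediction would be an equally plausible design.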

As suggested by the NRC report "Toxicity Testing in the 21st Century: A Vision and a Strategy"
and The U.S. Environmental Protection Agency's Strategic Plan for Evaluating the Toxicity of
Chemicals, the majority of current CTRP efforts are centered on advancing toxicity testing for
assessing human health effects of environmental agents. However, under environmental
legislative mandates (e.g., the Toxic Substances Control Act; the Federal Insecticide, Fungicide,
and Rodenticide Act; and the Clean Water Act), most EPA programs regulate compounds to
ensure both environmental and human health risks are properly managed. Statutory language and
resulting policy typically require decisions for chemicals that encompass environmental and
human health risks such that the CTRP will also need eventually to develop higher throughput
and computational approaches for ecotoxicology and risk assessment. Notable progress has been
made under the previous CTRP implementation plan, and within other ORD research programs, on
the development and use of toxicity pathway models, toxicology knowledgebases  (e.g.,
ECOTOX), and systems biology models (e.g., small fish model) in the field of environmental
science. In follow up to the current CTRP implementation plan for FY2009-2012, opportunities
will exist to bring together relevant disciplines, data, and models across both human health and
environmental risk assessment applications. As noted previously, we will be using existing
resources and projects to leverage increased attention to ecologically relevant areas; however,
given the challenges of developing a system for relevance to human health assessment, the
near-term goals of the CTRP will have to remain largely focused on that area. It is expected that
future versions of the CTRP and related implementation plans will accommodate further
progress in ecotoxicology that can be incorporated and merged with the efforts relating to human
health in this current CTRP plan.

C.     Regional and Program Interactions

The CTRP is working closely with Program Offices, including the Office of Prevention,
Pesticides, and Toxic Substances (OPPTS), and the Office of Water (OW) to create informatics
tools and databases of use to the goals of both research and regulation. Within the past year,
full-day sessions have been spent briefing senior management in these and other Program and
Regional Offices on the activities of the CTRP. This has resulted  in shared participation, input
and authorship between Program and CTRP staff in planning, research projects, scientific
publications, and promulgation of regulatory decisions. Many examples of collaborations with
Program Offices exist, including DSSTox, ACToR, ToxRefDB and ToxCast™. Regional input
has been more limited: briefings on the CTRP have been given to  particular Regions (e.g., 6 and
7), and to the regional risk assessors group. However, additional dialogue is needed as products
begin to emerge from the program that could be of utility to the Regions, especially in relation to
the Toxics Release Inventory (TRI) and Superfund programs.

In moving forward over the next several years,  the key points of engagement with the Programs
will be their use of the DSSTox, ToxRefDB and ACToR databases as sources of chemical,
toxicity and exposure information, and predictions and prioritizations based on results from
ToxCast™, ExpoCast™ and the Virtual Tissues projects. More detailed descriptions of what
these engagements will be are provided in the individual project descriptions in the appendices.
Some specific examples include the continued co-development of ToxRefDB to include
additional guideline test results, as well as to be positioned to record the output  of the Tier 1
Endocrine Disrupter Screening batteries.

OPPTS-OPP: Driven by its needs to assess the effects of pesticidal inerts and  anti-microbial
agents, both of which suffer from limited data availability, the Office of Pesticide Programs has
been a strong proponent of this new approach to toxicity testing. OPP has defined a Strategic
Direction for New Pesticide Testing and Assessment Approaches  focused on developing and
evaluating new technologies in molecular, cellular, and computational sciences  to supplement or
replace more traditional methods of toxicity testing and risk assessment. This integrated
approach to testing and assessment is moving toward a new paradigm where in  vivo (animal)
testing is targeted to the most likely hazards of concern. As defined by OPP, the path forward
will include close collaborations with the CTRP, in order to predict chemical toxicity and
exposure through application of efficient and effective  screening tools including new in vitro
assays that rapidly provide biological profiles of the toxicological  potential of chemicals (i.e.,
ToxCast™). Exposure and biomonitoring data will also be critical to interpreting toxicity data
and evaluating the effectiveness of the new testing and assessment paradigm. Over the next five
years, OPP plans to enhance its integrated approach to testing and assessment to better determine
what toxicity data are needed to further refine risk assessments for chemicals that do not have
extensive toxicity information (e.g., inert ingredients, certain antimicrobial and biochemical
pesticides, and metabolites and degradates of pesticide active ingredients). Over the next 10-15
years, as experience is gained and as understanding of toxicity pathways increases, an enhanced
integrated testing and assessment approach will be implemented for all pesticides including
conventional agricultural pesticides. This approach will fully integrate hazard and exposure data
along with advanced systems modeling based on new in vitro data and an understanding of
toxicity pathways to better predict risks and to determine what additional data are necessary (i.e.,
virtual tissues). The key goals of integrated approaches to testing and assessment are to:
          •   improve our ability to set priorities for what data to require,
          •   ensure that the data requirements are focused on the right issues, and
          •   efficiently reach the end result of effective risk assessment.
This approach would provide the ability to focus testing on pesticide chemicals and the effects
that could most likely result in harm. As a result, testing would
          •   use fewer animals,
          •   take less time,
          •   be less expensive in data generation and review, and
          •   explore a broader range of potential adverse effects.
These goals and approach are wholly consistent with the goals and approach of the CTRP, and
reflect the close working relationship between OPP and ORD in developing these tools for
integrated approaches to testing and assessment.

CTRP researchers are working with OPP and other parts of OPPTS, and with the Organization for
Economic Cooperation and Development (OECD) to explore and evaluate regulatory application
of molecular screening assays in relation to chemical testing guidelines. The goals of this effort
are to provide tools for: 1) improving the understanding of mechanisms of toxicity; 2) identifying
biomarkers of toxicity and exposure; 3) reducing uncertainty in grouping of chemicals for
assessments, inter-species extrapolation, effects on susceptible populations, etc.; and 4) providing
alternative methods for chemical screening, hazard identification and characterization. Also, the
CTRP and OPP are working with the OECD on reducing and refining current animal testing
guidelines using the ToxRefDB database and other tools (e.g., Extended One-Generation
Reproductive Toxicity Test Guideline).

OPPTS-OPPT: For the Office of Pollution Prevention and Toxics there will be several key
points of intersection with the CTRP over the next several years.  These include using HTS
technologies and bioactivity profiling to lend biologically based support to strengthening and
potentially revising chemical categories currently in the new and existing chemicals programs; to
evaluate the relative hazard of chemicals being considered by the Design for the Environment
program; to provide toxicity pathway data on specific groups of chemicals of high concern
(e.g., the perfluorinates); and to evaluate the feasibility of HTS approaches to
characterization of the bioactivity profiles of manufactured nanoparticles. On the exposure side,
the CTRP expects to be engaged with exposure efforts within OPPT such as the IUR (Inventory
Update Rule) as procedures are developed for broad scale predictions of exposure potential
across the life cycles of chemicals. CTRP researchers are working with OPPT and with the
OECD to access and incorporate CTRP tools (e.g., DSSTox, ToxRefDB, ER QSAR models) into
the OECD QSAR Application Toolbox. The Toolbox is an international effort to provide tools
for grouping chemicals, based on the understanding of mechanisms of toxicity in order to
extrapolate properties and effects from tested chemicals to untested chemicals.

OPPTS-OSCP:  For more than a decade the Office of Science Coordination and Policy has led
EPA's work to fulfill mandates contained within the Food Quality Protection Act of 1996 to
identify chemicals that can interfere with the function of natural hormones (i.e., endocrine
disrupters). This is a mode of action that may result in significant adverse consequences in
developing organisms (e.g., embryo, fetus, neonates, and children) should they be exposed to
levels sufficient to cause perturbations of their endocrine systems. Animal models have shown
that the levels that impact developing organisms can be much lower than those impacting adults
and that exposures during development can lead to adverse effects not seen until adulthood.  The
Agency has adopted a Tier 1  screening battery that is designed to detect whether a chemical
substance interacts with the estrogen, androgen or thyroid hormonal systems. Combinations of in
vitro and in vivo assays are used to provide complementary measurements that detect the
endocrine disrupting potential of a chemical. Given the rapid advances in computational and
molecular sciences, discussions are underway with OPPTS on the next generation of tools that
could be more efficiently applied to the large number of chemicals of potential concern for
ability to disrupt the function of the endocrine system. Included in these discussions is the
integrative potential of contributions from the CTRP in HTS of chemicals and in providing
informatics and analysis solutions and tools; from the Human Health Research Program in
conducting targeted follow up testing and exposure analysis; from the Endocrine Disrupters
Research Program in understanding mechanisms of action, developing methods for assessing
cumulative risks, and improving the ability to extrapolate results across species; from the Human
Health Risk Assessment Program in assessing the contributions of multiple exposures (e.g.,
chemicals with common/different modes of endocrine action) across critical life stages; and from
the Risk Management Research Program in providing information on chemical life cycles that pose
the greatest potential exposure and risk, and in developing tools for greener chemicals or
processes and other mitigation strategies.

The CTRP has identified a large collection of up to  10,000 chemicals that are of high priority for
EPA Program Offices. Using available funding, a subset of 700 will be used in Phase II of
ToxCast™ (cost per chemical of ~$20k). Going beyond the screening of these chemicals in the
full suite of ToxCast™ assays (>500), ORD has the capacity under existing contracts to acquire
the entire collection of chemicals and to screen them in a subset of ToxCast™ assays that cover a
broad spectrum of endocrine related activities. The cost of such HTS assays would be on the
order of $1-4k per chemical depending on precisely which subset of ToxCast™ assays were
included. The use of HTS assays for receptor binding and transcriptional activation and (Q)SARs
for these endpoints to obtain empirical or predicted information on chemicals for which no data
were available could provide an ordered list of which chemicals have the highest potential to
interact with the estrogen, androgen and thyroid receptors. Along with exposure information, this
prioritized list could then be used to guide other effects research, exposure analyses, and
assessment and management activities in the various parts of ORD. The results could be used to
select chemicals for entry into subsequent phases of EPA's EDSP.
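
The screening costs quoted above can be checked with simple arithmetic. The short Python sketch
below uses only the figures stated in the text (700 Phase II chemicals at roughly $20k each; up
to 10,000 chemicals at $1-4k each). The variable and function names are illustrative, and the
totals are order-of-magnitude estimates rather than budget figures.

```python
def total_cost(n_chemicals: int, cost_per_chemical_usd: int) -> int:
    """Total screening cost in US dollars for a flat per-chemical price."""
    return n_chemicals * cost_per_chemical_usd


# Figures as stated in the text; all are approximate.
PHASE_II_CHEMICALS = 700
FULL_SUITE_COST = 20_000            # ~$20k per chemical, full ToxCast assay suite
FULL_COLLECTION = 10_000            # "up to 10,000" high-priority chemicals
SUBSET_COST_RANGE = (1_000, 4_000)  # $1-4k per chemical, endocrine-related subset

phase_ii_total = total_cost(PHASE_II_CHEMICALS, FULL_SUITE_COST)
subset_low, subset_high = (total_cost(FULL_COLLECTION, c) for c in SUBSET_COST_RANGE)

if __name__ == "__main__":
    print(f"Phase II, full suite: ${phase_ii_total:,}")
    print(f"Full collection, endocrine subset: ${subset_low:,} to ${subset_high:,}")
```

Under these figures, screening the 700 Phase II chemicals in the full suite comes to about
$14 million, while screening all 10,000 chemicals in the endocrine-related subset would come to
roughly $10 million to $40 million.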

OW: For the Office of Water, the CTRP expects to be engaged with the assessment of the
bioactivity of chemicals on the Contaminant Candidate List (CCL) and the derivation of
subsequent CCLs. In the course of identifying and assessing CCL compounds, the OW has to
collect and analyze hazard and exposure data on thousands of the same compounds addressed in
ACToR, DSSTox and other CTRP databases.  In future iterations of this process, the CTRP will
be able to  assist the OW with data curation and analysis, and provide a wealth of new hazard and
exposure data that is computable, as well as prioritization tools designed to use this data.

D.     Priority Areas for CTRP Management

1.     Toxicity Predictions and Chemical Prioritizations Incorporating Exposure

With the publication of the first predictive bioactivity signatures from Phase I, and the initial
proof of concept of the ToxCast™ program, we will have hazard predictions that can be
incorporated into prioritizing chemicals for further screening and testing. Phase II, scheduled for
launch in late 2009, will explore greater diversity of chemical structures and classes in order to
evaluate the robustness of the signatures identified in Phase I. As indicated in Figure 5, having
viable exposure predictions or estimations will also be critical to the envisioned prioritization
scheme. ExpoCast™ will provide an overarching framework for the science required to
characterize biologically-relevant exposure, and can thus inform chemical prioritization by
linking information on potential toxicity of environmental chemicals to real-world health
outcomes. NCCT management will support continued collaborations with NERL and other
critical partners, within EPA and externally, to improve accessibility to EPA human exposure
data and create a consolidated EPA exposure database focused on measured concentrations in
biological  media. As NCCT pushes ahead into Phase II of ToxCast™, expanding the compounds
in the screening program to include nanomaterials and chemicals with demonstrated human
toxicities (e.g., failed pharmaceuticals), it will coordinate these efforts with ExpoCast™ in order
to maximize utility of datasets to develop predictive models and decision support software for
chemical prioritizations.

2.     Strengthening Cross-ORD Collaborations

Given the  broad nature of the challenges facing computational toxicology, the CTRP must
engage collaborative partners across ORD in order to be successful. Cross-ORD collaborations
have been a part of the CTRP from its inception, and will continue to be a dominant feature of
the program. Numerous collaborations from previous years will carry forward, including
linkages at the management level, such as the MOU with NHEERL and NERL to provide NCCT
with administrative support functions for funds control, extramural management, quality
assurance  and information management. Key research partnerships have developed between
NCCT and the rest of ORD. In addition, NCCT continues to advise NCER on the formulation of
ideas for new computational toxicology RFAs, providing suggestions for scientific peer
reviewers, serving on relevancy reviews as appropriate, and collaborating with the cooperative
research partners of the STAR grants program. The ORD's multi-year planning process provides
another opportunity for linkage between the CTRP and related research efforts. Each of the
MYPs is led by a National Program Director with support from staff of the Laboratories and
Centers. The NCCT participates actively in four of the MYP teams. Three contain similar
research activities for screening and prioritizing chemicals, i.e., Endocrine Disrupting Chemicals
(EDCs, LTG III), Safe Pesticides/Safe Products (SP2, LTG I), and Drinking Water Research
Program (DW, LTG II) whereas the fourth has a major focus on the incorporation of biologically
based mode-of-action information into quantitative risk assessment (the Human Health Research
Strategy and the Human Health MYP, LTG II). Consideration is also being given to  how these
data and analyses can be incorporated into next generation risk assessments, in association with
ORD's Human Health Risk Assessment program and NCEA. NCCT management meets at least
quarterly with the National Program Directors for these MYPs, in part to continue dialogue on
sharing and coordination of resources between programs and to ORD Laboratories and Centers
beyond NCCT. Besides financial resources, NCCT is cultivating cross-ORD collaborations
through shared postdoctoral fellows and other students, and through NCCT rotational fellowships
for ORD scientists. To date, scientists from NHEERL, NERL and NCEA have participated in
details of 4 months or longer in NCCT to work on collaborative research projects.

ORD is in the midst of a transformation to a system of integrated, multidisciplinary (IMD)
research projects. The CTRP is actively engaged in planning and discussions on one such IMD
proposal entitled "Decision Support Tools for Preventing, Reducing and Managing
Chemical Risks." This IMD project will address the tens of thousands of chemicals and millions
of products that current regulatory decision tools cannot assess in terms of
impact on life-stage vulnerability, genetic susceptibility, disproportionate exposures and
cumulative risk. The project will incorporate some of the predictive, high-throughput tools for
exposure and hazard being developed as part of the CTRP, scale them up and, with attention to
critical life-stage impacts, create prioritization algorithms and next generation risk assessments
that highlight viable management options for prevention, mitigation and risk reduction.

3.     Tox21: A Federal Partnership Transforming Toxicology

The NRC report on Toxicity Testing in the 21st Century has significant implications  for human
health risk assessment, and in order to accelerate progress in this area, two NIH institutes and
EPA have entered into a formal collaboration known as Tox21 to identify mechanisms of
chemically induced biological activity, prioritize chemicals for more extensive toxicological
evaluation, and develop more predictive models of in vivo biological response. Consistent with
the vision outlined by Krewski et al. in the NRC report, success in achieving these goals is
expected to result in methods for toxicity testing that are more scientific and cost-effective as
well as models for risk assessment that are more mechanistically based. As a consequence, a
reduction or replacement of animals in regulatory testing is anticipated to occur in parallel with
an increased ability to evaluate the large numbers  of chemicals that currently lack adequate
toxicological evaluation. Ultimately, Tox21 is expected to deliver biological activity profiles that
are  predictive of in vivo toxicities for the thousands of understudied substances of concern to
regulatory authorities in the United States, as well as in many other countries.

The Tox21 collaboration is being coordinated through  a five-year MOU, which leverages the
strengths of each organization. The MOU builds on the experimental toxicology expertise at the
NTP, headquartered at the NIH/NIEHS; HTS technology of the NIH/NCGC, managed by the
NHGRI; and the computational toxicology capabilities of the EPA's NCCT. Each party brings
complementary expertise to bear on the application of novel methodologies to evaluate large
numbers of chemicals for their potential to interact with the myriad of biological processes
relevant to toxicity. A central aspect of Tox21 is the unique capability of the NCGC's
high-speed, automated screening robots to simultaneously test thousands of potentially toxic
compounds in biochemical and cell-based HTS assays, and an ability to target this resource
toward environmental health issues. As mentioned by Krewski et al., EPA's ToxCast™ Program
is an integral and critical component for achieving the Tox21 goals laid out in the MOU.

To support the goals of Tox21, four focus groups (Chemical Selection, Biological
Pathways/Assays, Informatics, and Targeted Testing) have been established; these focus groups
represent the different components of the NRC vision described by Krewski et al. The Chemical
Selection group is coordinating the selection of chemicals for the Tox21 compound library to test
at the NCGC. A chemical library of nearly 2,400 chemicals selected by NTP and the EPA is
already under study at the NCGC and results from several dozen HTS assays are already
available. In the near term, this library will be expanded to approximately 8,400 compounds,
with an additional 1,400 compounds selected by the NTP; 2,800 compounds selected by the EPA
and provided by the CTRP; and 2,800 clinically approved drugs selected by the NCGC.
Compound selection is currently based largely on the compound having a defined chemical
structure and known purity, on the extent of its solubility and stability in dimethyl sulfoxide
(DMSO), the preferred solvent for HTS assays conducted at the NCGC, and on the compound
having low volatility. Implementing quality control procedures for ensuring identity, purity, and
stability of all compounds in the library is an important responsibility of this group. A subset of
the Tox21 chemical library will be included in Phase II of the ToxCast™ program, which will
examine a broader suite of assays in order to evaluate the predictive power of bioactivity
signatures derived in Phase I.
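
The selection criteria described above can be illustrated with a minimal sketch. It is not the Chemical Selection group's actual procedure: the `Candidate` record, its field names, and the reduction of each criterion to a yes/no flag are assumptions made for illustration only (the text speaks of the "extent" of DMSO solubility and stability, which in practice would be graded rather than binary).

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for a compound nominated to the screening library."""
    name: str
    has_defined_structure: bool
    purity_known: bool
    dmso_soluble: bool    # simplification: the text refers to the *extent* of solubility
    dmso_stable: bool     # simplification: the text refers to the *extent* of stability
    low_volatility: bool


def is_eligible(c: Candidate) -> bool:
    """Return True only if the candidate meets every stated selection criterion."""
    return all([
        c.has_defined_structure,
        c.purity_known,
        c.dmso_soluble,
        c.dmso_stable,
        c.low_volatility,
    ])


def select_for_library(candidates):
    """Keep the candidates that pass all criteria, preserving input order."""
    return [c for c in candidates if is_eligible(c)]


if __name__ == "__main__":
    pool = [
        Candidate("chem-A", True, True, True, True, True),
        Candidate("chem-B", True, True, False, True, True),   # fails DMSO solubility
        Candidate("chem-C", True, False, True, True, True),   # purity unknown
    ]
    print([c.name for c in select_for_library(pool)])
```

A graded version would replace the boolean flags with measured values and thresholds; those thresholds are not given in this plan and are deliberately not invented here.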

The Biological Pathways and Assays group is identifying critical cellular toxicity pathways for
interrogation using biochemical- and cell-based high-throughput screens and prioritizing HTS
assays for use at the NCGC. Assays already performed at the NCGC include those to assess (1)
cytotoxicity and activation of caspases in a number of human and rodent cell types, (2)  up-
regulation of tumor suppressor p53, (3) agonist/antagonist activity for a number of nuclear
receptors, and (4) differential cytotoxicity in several cell lines associated with an inability to
repair various classes of DNA damage. Other assays under consideration include those  for a
variety of physiologically important molecular pathways (e.g., cellular stress responses) as well
as methods for integrating human and rodent hepatic metabolic activation into reporter  gene
assays. Based on the results obtained, this group will construct test batteries useful for
identifying hazard for humans and for prioritizing chemicals for further, more in-depth
evaluation. As Tox21 progresses, it will offer an excellent opportunity to incorporate assays
specifically relevant to the assessment of chemical hazard to wildlife. For example, as assays
become available for the reactivity of nuclear hormone receptors (e.g., estrogen and androgen)
from multiple species, we will have the ability to directly compare their responsiveness and test
whether a particular species might be more sensitive to perturbations than others.

The Informatics Group is developing databases to store all Tox21-related data and evaluating the
results obtained from testing conducted at the NCGC and via ToxCast™ for predictive toxicity
patterns. To encourage independent evaluations and/or analyses of the Tox21 test results, all data
as well as the comparative animal and human data, where available, will be made publicly
accessible via various databases, including EPA's Aggregated Computational Toxicology
Resource (ACToR), NIEHS' Chemical Effects in Biological Systems (CEBS), and the National
Center for Biotechnology Information's PubChem.

As HTS data on compounds with inadequate testing for toxicity becomes available via Tox21,
there will be a need to test selected compounds in more comprehensive assays. The Targeted
Testing group is developing strategies and capabilities for this purpose using assays that involve
higher order testing systems (e.g., roundworms (Caenorhabditis elegans), zebrafish embryos,
rodents).

In addition to the testing activities, the MOU promotes coordination and sponsorship of
workshops, symposia, and seminars to educate the various stakeholder groups, including
regulatory scientists and the public, with regard to Tox21-related activities.

4.    Communicating Computational Toxicology

As the CTRP has matured, communication of progress has developed beyond the publication of
peer-reviewed papers to include implementation of software and databases, websites and other
applications. Given the importance of communicating and disseminating the products of the
CTRP, the NCCT is currently recruiting a public affairs/communications specialist to
strengthen efforts in this area.

      a.     EPA Program Office Training and Implementation of Computational Tools

      The NCCT and CTRP partners from across ORD have given numerous seminars, held
      multiple 1-2 day workshops, and provided specific training and installation of
      computational tools for EPA program offices. These ad hoc approaches are now being
      formalized into an online menu of lectures and tutorials on a broad range of
      computational toxicology topics.  Senior scientists from NCCT, NHEERL and NERL are
      contributing to this resource, and a FY2009 NCCT recruitment of a communication
      specialist will accelerate development of this effort. In FY2010, computational
      toxicology training will initially focus on the tools ready for program office use,
      including DSSTox, ACToR,  ToxRefDB and the ToxMiner tool for analyzing ToxCast™
      data.

      b.     Communities of Practice for Chemical Prioritization and Exposure Science

      On the scientific level, the NCCT has initiated two Communities of Practice (CoP) in the
      areas of Chemical Prioritization and Exposure Science that are intended to unite
      practitioners in the designated fields. The concept of the CoPs was suggested by the
      BOSC in April 2005, and has since been adopted as a primary means of communication
      and integration of activities across ORD, the EPA, and outside entities and stakeholders.
      These efforts will serve to enhance communication and coordination, develop common
      standards, promote consistency, evaluate and provide guidance on best practices,
      recommend research priorities, and provide training to interested parties.

5.     Developing Clients for Virtual Tissues

The virtual tissue projects are developing systems models of the liver and embryo that will make
the data from CTRP databases and chemical prioritization projects more useful in quantitative
risk assessments. Over the course of design and implementation of these systems models, interim
milestones and deliverables will need to be developed and communicated to Program Offices.
The process of communicating these products, and taking feedback from the Programs, will
serve to identify and establish longer-term clients for these ambitious projects that are utilizing
cutting-edge science that is very different from current regulatory practice.

The Virtual Liver (v-Liver™) project is actively engaging Program Office personnel to address
challenges in mode of action (MOA) elucidation and quantitative dose-response prediction for
chronic liver injury. In FY2009, chemicals that work through nuclear receptor pathways (e.g.,
CAR, PXR, PPARs) were chosen for a proof-of-concept. Information  on these chemicals is
being used to populate the v-Liver™ Knowledgebase (v-Liver-KB) and develop a liver simulator
model. Program Office staff from OPPTS and NCEA were consulted on the selection of these
chemicals from key classes of pesticides and industrial chemicals. The first deliverable for risk
assessors will be v-Liver-KB, which formally organizes information on normal hepatic functions
and their perturbation by chemical stressors into pathophysiologic states. The v-Liver-KB will be
deployed as an interactive web-based and desktop tool to intuitively browse and query
physiologic knowledge on chemicals. This can then be used to hypothesize and test putative
MOA(s), and to link assay results from ToxCast™ and ToxRefDB with other evidence curated
from the literature. This  system will provide computable information on key events that
transparently indicate the uncertainties and data gaps, and that make inferences on MOA from
experimental data. In addition, we will work closely with risk assessors to customize the system
for specific requirements. Beta versions of the liver simulator will be applied to Program Office
issues relating to key chemical classes; this will be an intensively collaborative process between
CTRP scientists and OPPTS and NCEA staff. CTRP scientists working on the v-Liver™ project
will also work with OPP staff on retrospective analyses of chronic and cancer in vivo test data,
further introducing OPP to the v-Liver-KB and the simulator as appropriate.

The motivation for the Virtual Embryo (v-Embryo™) is the scientific need to understand mechanisms
of toxicity and predict developmental defects from complex datasets. The research goal is to
simulate embryonic tissues reacting to perturbation across chemical class, system, stage, genetic
makeup, dose and time. Data input is detailed knowledge of molecular embryology, as well as
high-throughput data from in vitro models on signaling pathways and cellular phenotypes. Initial
efforts have focused on retrospective analyses of regulatory data from reproductive and
developmental toxicity tests in ToxRefDB.  This work has been a collaborative effort with OPP
that has fostered working relationships between the CTRP scientists and Program Office staff.
Next steps are to identify appropriate developmental and reproductive toxicities that will support
development of systems models, and are also of regulatory interest to Program Offices. This will
include the incorporation of high throughput data from morphogenetically-competent in vitro
assays such as the Embryonic Stem Cell test and Zebrafish Embryo Test through collaborations
with NHEERL and outside partners. Virtual Embryo's first goals are to create a knowledgebase
and simulation engine that enable in silico reconstruction of key developmental landmarks,
which develop by regulating conserved signaling pathways and cellular processes and that are
sensitive to environmental chemicals of scientific and regulatory interest to EPA Program
Offices.

III.   CTRP PROJECT SUMMARIES FOR FY2009-2012

A.    Intramural Projects Coordinated by NCCT

There are nine CTRP projects coordinated by investigators in NCCT which collectively comprise
the core of the CTRP moving forward into FY2009-2012. Individual project plans for each of
these are appended to this document (Section IV). These projects span the source-to-outcome continuum
of toxicology research (Figure 6), providing critical components for next generation risk
assessments.

Figure 6. Applying Computational Toxicology Along the Source to Outcome Continuum
[Figure not reproduced. Recoverable labels: Source/Stressor Formation, Environmental Conc., External Dose; ExpoCast.]

1.     ACToR - Aggregated Computational Toxicology Resource
      Lead/Principal Investigator: Richard Judson

ACToR is a web-based informatics platform, organized at the top level by chemical and
chemical structure, which is indexing, collecting, and organizing many types of data on
environmental chemicals (Judson et al 2008). Environmental chemicals are defined as those
likely to be in the environment, including all chemicals regulated or tracked by the EPA, as well
as related chemicals, such as pharmaceuticals, that find their way into water sources.  ACToR is
indexing and linking to data from hundreds of sources, including the EPA, FDA, CDC, NIH,
academic groups, other governmental agencies (state and national) and international
organizations, such as the WHO. Information being indexed and gathered includes in  vivo
toxicity, in vitro bioassay data, use levels, exposure information, chemical structure, regulatory
information and other descriptive data. Planning for the project began in mid-FY2007; beta
versions have been available inside the EPA since early FY2008; and a public version became
available in December 2008. ACToR consists of a back-end database and a front-end web
interface built on low-cost, publicly accessible applications and tools. Over the next 3 years,
ACToR will expand to include more publicly available resources and data, including more
information extracted from text reports and tabularized, and more information on chemical use
and exposure. The latter effort will be coordinated with the efforts of ExpoCast™ and NERL to
identify, index and extract data from exposure-related resources of highest interest and
importance to EPA programs. In planned upgrades to ACToR, the ability of users to perform
flexible searches across different layers of data will be enhanced, and customized data
downloads will be implemented. ACToR will serve as the primary vehicle to aggregate and
publicly disseminate all published data associated with the ToxCast™, ToxRef, and Tox21
research projects. Additionally, the ToxMiner and NCCT Chemical Repository systems are
being developed as part of ACToR. These are data repositories and data analysis engines for the
ToxCast™/Tox21 projects.

2.     DSSTox- Chemical Information Technologies in Support of Toxicology Modeling
       Lead/Principal Investigator: Ann Richard

The DSSTox project has implemented high quality data review procedures for standardized
chemical structure annotation, created linkages connecting diverse toxicity resources within and
outside of EPA, and published high quality EPA chemical inventories and toxicity data files
spanning over  10,000 substances. The broad utility of DSSTox data files for cheminformatics
and modeling applications provides significant opportunities to influence the course of predictive
modeling strategies and to encourage wider engagement of toxicologists in toxicity data
representation. The DSSTox project will continue efforts to expand chemical data file offerings
into less well-represented areas of toxicology (immunotoxicology, toxicogenomics, etc.), and
provide varied representations of summary toxicity endpoints. In addition, this research will
explore new representations of chemical structure in relation to the biology (e.g., analog
measures, chemical features, chemical classes), and new representations of biological endpoints
in relation to modeling (e.g., quantitative endpoints in terms of potency, summarized or grouped
effects, qualitative active and inactive classes). These efforts will be designed to complement and
augment projects in NCCT (ToxCast™, ACToR, and Tox21) that are working to improve
capabilities to access, mine, and integrate chemical-biological activity information from existing
and new data, both within and outside EPA, in support of toxicity prediction efforts. In  close
coordination with ACToR, which is the primary informatics resource for ToxCast™ and Tox21
chemical and biological data, DSSTox will provide the initial chemical registration IDs, structure
annotation, and quality review of ToxCast™ and ToxRef inventories, as well as the expanded
Tox21 chemical testing library, helping to ensure quality and consistency of chemical
information across the various NCCT programs.

3.     ToxRefDB - Toxicity Reference Database
       Lead/Principal Investigator: Matthew Martin

Thirty years of registration toxicity data and open literature studies have been historically stored
as hardcopy and scanned documents by the EPA and others. A significant portion of these data
have now been processed into standardized and structured toxicity data within the EPA's
Toxicity Reference Database (ToxRefDB), including chronic, cancer, developmental, and
reproductive studies from laboratory animals (Martin et al 2009a; Martin et al 2009b; Knudsen
et al 2009). ToxRefDB is a collaborative project between NCCT and OPP. These data are now
accessible and mineable within ToxRefDB and are serving as a primary source of validation for
U.S. EPA's ToxCast™ research program in predictive toxicology. In addition to providing
reference toxicity information to research efforts, ToxRefDB will be mined for information on
the role and impact of previous and current study guidelines on the regulation of environmental
chemicals. The initial collection of studies in March of 2006 focused primarily on the reviews of
registrant-submitted toxicity studies on pesticide active ingredients. ToxRefDB design,
development and implementation were completed in mid-FY2006 with ongoing updates to the
standardized vocabulary and data entry tool interface. The entry of over 2,000 studies spanning
the majority of the ToxCast™ Phase I chemical set was completed in late FY2008. These
initial datasets are either published, in press, or submitted, and are being made publicly
available through the ToxRefDB website. A web-based query tool for the entire contents of
ToxRefDB will become available to the public in 2009, in conjunction with a quarterly update of
the EPA ACToR program.  ToxRefDB will continue to enter available data from chronic, cancer,
developmental, and reproductive studies with a focus on potential ToxCast™ Phase II chemicals.
The availability and entry of toxicity  data into ToxRefDB will also guide the selection of
ToxCast™ Phase II chemicals. Over the next year, ToxRefDB will also be expanded to capture
developmental neurotoxicity (DNT) study data and possibly in vivo data submitted to the EDSP.
In addition, ToxRefDB is being used for retrospective analyses by various EPA,  OECD and
other workgroups working  on revisions to animal test guidelines and other projects.

4.     ChemModel- Application of Molecular Modeling to Assessing Chemical Toxicity
       Lead/Principal Investigator: James Rabinowitz

This project is using modern molecular modeling methods developed for the discovery of novel
pharmaceutical agents to computationally predict toxicant-biomolecular target interactions. A
library of computational models of relevant biomolecular targets is being developed. Molecular
modeling approaches may then be used to interrogate this library for the capacity of specific
environmental molecules to interact with  each target. The endocrine system provides a test for
the utility of this approach because many  of the pathways for toxicity and the macromolecular
targets in those pathways have been identified. Appropriate experimental crystal structures of
many of the receptor protein targets are available to create the computational library of targets.
The ultimate objective of this research is to develop a library of biomolecular targets for
chemical toxicity, and the methods appropriate for their application to predicting the capacity of
a chemical to interact with these targets. This library of targets may then be used in conjunction
with other approaches  as part of a chemical prescreen.


5.     ToxCast™- Screening and Prioritization of Environmental Chemicals Based on
       Bioactivity Profiling and Predictions of Toxicity
       Lead/Principal Investigator: Keith Houck

The objective of the ToxCast™ research program is to develop a cost-effective and rapid
approach for screening and prioritizing a large number of chemicals for toxicological testing.
Using data from HTS bioassays developed in the drug discovery field, ToxCast™ is generating
data, constructing databases, and building computational models to forecast the potential human
toxicity of chemicals. HTS bioassays for ToxCast™ are also being provided by NHEERL
partners. These hazard predictions should provide EPA regulatory programs, including OPP,
with science-based information helpful in prioritizing chemicals for more detailed toxicological
evaluations, ultimately leading to reduced animal testing. Furthermore, the toxicity pathways
identified from this dataset and project will be critical to transforming the practice of risk
assessment for environmental chemicals and contaminants (in collaboration with NCEA and
EPA Program Offices). ToxCast™ is a multi-year effort that is divided into three distinct phases:
       •   Phase I: 300 chemicals assayed in over 600 different HTS bioassays, to create
          predictive bioactivity signatures based on the known toxicity of the chemicals;
       •   Phase II: focused on confirmation and expansion of ToxCast™ predictive signatures,
          generating HTS data on 700 additional chemicals;
       •   Phase III: ToxCast™ expanded to thousands of environmental chemicals for which
          little toxicological information is  available.
Once ToxCast™ has gone through successful initiation of Phase III, the data and toxicity
predictions will be ready for deployment throughout numerous EPA program offices. NCCT will
work to link these hazard predictions with exposure predictions, and create integrated database
analysis tools facilitating customized chemical prioritizations appropriate to specific programs.
Beyond the initial application of ToxCast™ data and tools to prioritizing chemicals for further
screening, testing and monitoring, secondary applications will include the Virtual Tissues
systems modeling projects, and next generation risk assessments with NCEA, NHEERL and
EPA program offices.
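
The idea of linking hazard predictions with exposure predictions to produce program-specific prioritizations can be sketched as follows. This is an illustrative assumption, not the ToxCast™ or ExpoCast™ method: the scores are taken to be already normalized to the range 0 to 1, the weighted-sum rule is one simple choice among many, and the function name `prioritize` and the parameter `hazard_weight` are hypothetical.

```python
def prioritize(chemicals, hazard_weight=0.5):
    """Rank chemicals by a weighted sum of hazard and exposure scores.

    chemicals: dict mapping a chemical name to a (hazard, exposure) tuple,
        each score assumed to be pre-normalized to [0, 1].
    hazard_weight: share of the priority given to hazard; the remainder
        goes to exposure. A program office could tune this (hypothetical).
    Returns the chemical names, highest priority first; ties break by name.
    """
    if not 0.0 <= hazard_weight <= 1.0:
        raise ValueError("hazard_weight must lie in [0, 1]")
    exposure_weight = 1.0 - hazard_weight
    scores = {
        name: hazard_weight * hazard + exposure_weight * exposure
        for name, (hazard, exposure) in chemicals.items()
    }
    return sorted(scores, key=lambda name: (-scores[name], name))


if __name__ == "__main__":
    toy = {"A": (0.9, 0.2), "B": (0.4, 0.4), "C": (0.1, 0.9)}
    print(prioritize(toy))                      # balanced weighting
    print(prioritize(toy, hazard_weight=0.2))   # exposure-driven weighting
```

Changing `hazard_weight` reorders the same chemicals, which is the sense in which a prioritization can be customized to the needs of a specific program.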

6.     ExpoCast™- Exposure Science for Screening, Prioritization, and Toxicity Testing
       Lead/Principal Investigator: Elaine Cohen Hubal

ExpoCast™ will provide an overarching framework for the science required to characterize
biologically-relevant exposure as a critical part of the  CTRP (Cohen Hubal, 2009). The
ExpoCast™ program will foster novel exposure science research to (1) inform chemical
prioritization, (2) understand system response to chemical perturbations and implications at the
individual and population levels, and (3) link information on potential toxicity of environmental
contaminants to real-world health outcomes.  An important early component of ExpoCast™ will
be to consider how best to consolidate and link human exposure data for chemical prioritization
and toxicity testing. ExpoCast™ represents a strong collaboration between NCCT and NERL,
with both parties providing leadership and critical scientific contributions towards this
transformation of exposure science. Initial research will focus on identifying and evaluating
novel approaches for characterizing exposure to prioritize chemicals and developing modeling
approaches for considering exposure potential based on chemical properties, sources (e.g.,
consumer products), uses, lifecycle, and individual/population vulnerability. Beyond the initial
application of ExpoCast™ data and tools to prioritizing chemicals for further screening, testing
and monitoring, secondary applications will include next generation risk assessments with
NCEA, NERL and EPA program offices.

7.    v-Embryo™ - Virtual Embryo
      Lead/Principal Investigator:  Thomas Knudsen

The motivation for the Virtual Embryo (v-Embryo™) is the scientific need to understand mechanisms of
toxicity and predict developmental defects from complex datasets. EPA must evaluate
environmental chemicals for potential effects on development. Part of this challenge is to
understand mechanisms by which chemicals disrupt prenatal development. Unfortunately, the
mechanisms of prenatal developmental toxicity are not understood in sufficient depth or detail
for risk assessment purposes. Because embryonic tissues are regulated simultaneously by
pathways that control genetic patterning, molecular clocks, morphogenetic tissue
rearrangements, and cellular differentiation, there is a need for computational (in silico) models to
address this complexity. EPA's v-Embryo™ will comprise a framework to merge data and
knowledge about developmental processes, leading to cell-based computational models that can
be used to analyze mechanisms in developmental toxicity. This is a collaborative project between
NCCT, NHEERL, and NCEA. Data input is detailed knowledge of molecular embryology, high-
throughput data from in vitro models, signaling pathways, and cellular phenotypes. Output
models aim for the modular reconstruction of a developing embryo from cell-based models of
morphogenesis and differentiation.

8.    v-Liver™ - The Virtual  Liver Project
      Lead/Principal Investigator:  Imran Shah

The Virtual Liver (v-Liver™) computational paradigm represents tissues as cellular systems in
which discrete individual cell-level responses give rise to complex physiologic outcomes. In this
model, cell-level responses are governed by a self-regulating network of normal molecular
processes, and adverse histopathologic effects arise due to chronic stimulation by environmental
chemicals. The v-Liver™ proof of concept (PoC) is being developed by: (i) focusing on
environmental chemicals responsible for hepatocarcinogenesis in rodent studies; (ii) organizing
mode-of-action (MOA) knowledge on the relevant molecular and cellular processes perturbed by
these chemicals; (iii) developing a tissue simulation platform to investigate the uncertainties in
MOA and neoplastic lesion formation; and (iv) evaluating in vitro assays to predict lesion
development across chemicals and doses. Virtual Liver is a collaborative project between NCCT
and NHEERL.

9.    Uncertainty Analysis in  Toxicological Modeling
      Lead/Principal Investigator:  R. Woodrow Setzer

The goals of this project are to develop standardized and more efficient computational
approaches for parameter estimation and model selection; standardize approaches for model
evaluation for PBPK and other dynamic models; and develop methods for constructing priors
(probabilistic summaries of current knowledge) for model parameters, based on existing
computational  methods and data  sets. The initial motivation for this work was the need to
standardize and make more sophisticated parameter estimation and model evaluation for PBPK
models being used by OPP, and that emphasis will continue in the early phase of this project.
However, all models relevant to toxicological risk assessment have similar requirements, and this
project will coordinate closely with the Virtual Tissues and ToxCast™ projects. In particular, the
project will collaborate with ToxCast™ in developing approaches for quantifying uncertainty in
ToxCast™ predictions and prioritizations.
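
As an illustration of what a prior on a model parameter contributes to parameter estimation, the following is a minimal sketch and not the project's method. It assumes a one-compartment kinetic model in place of a full PBPK model, a lognormal prior on the elimination rate constant, and noise-free synthetic observations; every function name, hyperparameter, and data value is hypothetical.

```python
import math


def concentration(dose, volume, k_elim, t):
    """One-compartment bolus model: C(t) = (dose / volume) * exp(-k_elim * t)."""
    return (dose / volume) * math.exp(-k_elim * t)


def log_prior(k_elim, mu=math.log(0.2), sigma=0.5):
    """Lognormal prior on k_elim, up to a constant (hypothetical hyperparameters)."""
    z = (math.log(k_elim) - mu) / sigma
    return -0.5 * z * z - math.log(k_elim)


def log_likelihood(k_elim, times, observed, dose, volume, sd=0.1):
    """Independent lognormal measurement error on each observed concentration."""
    total = 0.0
    for t, y in zip(times, observed):
        resid = math.log(y) - math.log(concentration(dose, volume, k_elim, t))
        total += -0.5 * (resid / sd) ** 2
    return total


def grid_posterior(times, observed, dose, volume, grid):
    """Normalized posterior weights for k_elim evaluated on a fixed grid."""
    logs = [log_prior(k) + log_likelihood(k, times, observed, dose, volume)
            for k in grid]
    peak = max(logs)  # subtract the peak before exponentiating for stability
    weights = [math.exp(v - peak) for v in logs]
    norm = sum(weights)
    return [w / norm for w in weights]


# Synthetic, noise-free data generated with k_elim = 0.3 per hour (illustrative).
grid = [0.01 * i for i in range(1, 101)]
times = [1.0, 2.0, 4.0, 8.0]
observed = [concentration(100.0, 10.0, 0.3, t) for t in times]

posterior = grid_posterior(times, observed, 100.0, 10.0, grid)
posterior_mode = grid[posterior.index(max(posterior))]
posterior_mean = sum(k * p for k, p in zip(grid, posterior))
```

With informative data the posterior concentrates near the value used to generate the observations; with sparse data it would stay closer to the prior, which is why documented, defensible priors matter for model evaluation.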

B.     Intramural Projects Coordinated by NERL and NHEERL

Several key components and research projects of the CTRP are coordinated by NERL, NHEERL
or both in conjunction with NCCT and other collaborators. In addition,

1. National Exposure Research Laboratory

NERL is conducting human exposure research for screening, prioritization, and toxicity testing
in collaboration with and complementary to the ExpoCast™ project. Exposure science is crucial
for addressing many of our important and complex environmental health  issues and is essential
in order for toxicity testing to be valuable in public health protection. There is a clear need for a
collaborative effort across  the exposure and risk assessment community to ensure the required
exposure science, data and tools are ready to address immediate needs resulting from application
of high-throughput in vitro technologies for toxicity testing. A coherent program is required to
formulate significant exposure questions posed by these novel in vitro toxicity data, develop
creative approaches  for applying existing exposure information and tools to address these
questions, and finally identify key exposure research needs to interpret the toxicity data for risk
assessment. The authors  of the National Academies report (NRC, 2007) emphasize that
population-based data and  human exposure information are required at each step of their vision
for toxicity testing and risk assessment. The collaborating NERL  and NCCT scientists have
identified and will be conducting research in the following priority exposure research areas to
support chemical screening, prioritization, and toxicity testing (Sheldon and Cohen Hubal,
2009): (1) accessible and linkable exposure databases; (2) exposure screening tools for
accelerated chemical prioritization; (3) computational tools for dose reconstruction and source-
to-outcome analyses; (4) tools for understanding the  fundamental processes and factors
influencing human exposures; and (5) efficient monitoring methods to measure and interpret
biologically-relevant exposure metrics. Susceptibility, vulnerability, and life-stage aspects are
integral to each of these.

The NERL-directed aspects of the ExpoCast™ program will include research required to
understand fundamental processes and factors influencing human exposures as well as
development of the tools required to facilitate efficient exposure assessment. This research will
be implemented in the following broad areas:

       Prioritization and screening. Two related research activities will be implemented, the
first developing high quality, high quantity exposure databases aligned with the NCCT
databases, and the second developing and evaluating  screening models for risk assessment.
NERL will inventory, compile and organize available environmental and exposure data into
readily accessible databases.  These databases will be  efficiently organized so  that they are
aligned and integrated with other NCCT databases, allowing the users to link source,
environmental, exposure, and effects data for chemicals and degradates/metabolites. Research
will be conducted to develop the next generation of predictive environmental fate and transport,
exposure and dose screening models that can be linked with the corresponding NCCT,
NHEERL, and other Agency toxicity screening-level models. Available screening models will be
inventoried and assessed. A workshop of international experts will be scheduled for evaluating
the available models currently being used by various international organizations. New/refined
models will be developed, based on the evaluation results, and linked with appropriate Agency
toxicity models for future risk assessments.

       Linked exposure-dose models. Research will be conducted to efficiently link NERL's
environmental, exposure and dose models and databases for supporting Program Office specific
exposure and risk assessments. Emphasis will be on developing tools and approaches to facilitate
rapid assessment.

       Biomarkers.  Collaborative research with scientists from ORD, CDC, and academia will
be implemented to design studies for developing and evaluating tools to interpret the results of
exposure biomarker studies  and link these results to indicators of first biological response. The
Metabolic Simulator and Metapath research tools will be upgraded and their utility for risk
assessments will be evaluated using industry provided data. Collaborative observational studies
will be conducted with CDC, NHEERL, and others to develop and refine models for relating
measured exposure biomarker results with environmental exposures (both forward and reverse
dosimetry). Research will be conducted with NCCT and NHEERL to understand how 'omics-
based exposure biomarker data, combined with chemical biomarker results, can be used to link
exposure with indicators of effects.
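
The distinction between forward and reverse dosimetry can be sketched as follows. This is a deliberately simplified illustration that assumes linear kinetics, complete absorption, and steady state, so that concentration equals intake rate divided by clearance. It is not one of the NERL models, and all names and numbers are hypothetical.

```python
def forward_dosimetry(intake_mg_per_kg_day, clearance_l_per_kg_day):
    """Forward: external intake rate -> predicted steady-state concentration (mg/L)."""
    return intake_mg_per_kg_day / clearance_l_per_kg_day


def reverse_dosimetry(biomarker_mg_per_l, clearance_l_per_kg_day):
    """Reverse: measured biomarker concentration -> intake rate consistent with it."""
    return biomarker_mg_per_l * clearance_l_per_kg_day


# Hypothetical clearance values standing in for inter-individual variability.
clearances = [0.5, 1.0, 2.0]   # L/kg/day
measured_biomarker = 0.02      # mg/L, illustrative value only

# Each plausible clearance implies a different intake for the same biomarker level.
intake_estimates = [reverse_dosimetry(measured_biomarker, cl) for cl in clearances]
intake_range = (min(intake_estimates), max(intake_estimates))
```

Reporting a range rather than a single value reflects that one biomarker measurement is compatible with several exposure levels once kinetic variability is taken into account.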

       Observational studies. Collaborative research with NCS and/or STAR awardees will be
conducted to identify and characterize the key factors influencing children's exposures to
pesticides and other chemicals. Research activities will also be implemented to characterize real-
world exposures to multiple chemicals (mixtures) in targeted communities and/or vulnerable
populations (including children). Tools will be developed for predicting high exposures,
understanding the factors contributing to these exposures, and providing input for the development
and evaluation of risk reduction strategies.

2. National Health and Environmental Effects Research Laboratory

Research within NHEERL is both parallel to and integrated with the CTRP. While some of this
research has been ongoing for a number of years, a new effort to expand this program is
currently underway. The overall goal is the prediction of chemical toxicity to humans  and
wildlife based on understanding fundamental biology and its perturbation by toxicants. The
approach is based on the elucidation of key events that link initiating events to adverse outcomes,
and leverages expertise in human studies, whole animal toxicology in a wide range  of species,
and cellular and molecular biology to identify toxicity pathways associated with adverse health
and ecological outcomes. The four focus areas for NHEERL research and the integration of these
efforts into the CTRP are described below.

       Linkage of environmental exposure to perturbation of toxicity pathways. For the
appropriate application of high throughput assays based on toxicity pathways to chemical
screening (hazard identification) and for risk assessment, the environmental exposure levels and
their relation to the exposure at the cellular level must be known. Pharmacokinetic studies and
models will be used to determine the relationship between external exposures and tissue dose in
vivo, while cellular and molecular biology studies of in vivo effects and subsequent development
of toxicity pathway assays will broaden the scope of screening assays.

       Linkage of toxicity pathway perturbations to adverse outcomes. Research here is focused
on identifying toxicity pathways, key events, and modes of action (MOA) as they relate to
adverse outcomes and disease. Mode of Action (MOA) is defined as a sequence of key events
and processes, starting with interaction of an agent with a cell, proceeding through operational
and anatomical changes, and resulting in an adverse health effect. Global biology
measures ("omics") will be used to discover new toxicity pathways that will then be translated
into medium- and high-throughput in vitro and non-mammalian screening assays. Results from
these new technologies will be compared with predictions from in vivo experiments. In cases
where molecular targets are established for chemical classes, quantitative structure-activity
relationship ([Q]SAR) models and read-across methods will be developed to predict the
toxicological potential of untested chemicals.  These efforts will be coordinated with activities in
OPPTS, OECD and the European Union to ensure the efforts meet Agency and International
needs for regulatory purposes.
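
The idea behind read-across can be sketched as a similarity-weighted estimate from tested analogs. The example below is an illustrative assumption rather than the NHEERL approach: it represents each chemical as a set of structural feature labels, uses the Tanimoto coefficient as the similarity measure, and averages the activity of the nearest tested neighbors. The chemicals, features, and activity values are invented for the example.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two collections of structural features."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def read_across(query_features, tested, k=2):
    """Similarity-weighted mean activity of the k most similar tested chemicals.

    `tested` maps a chemical name to (feature_set, activity). Returns None when
    no tested chemical shares any feature with the query.
    """
    scored = sorted(
        ((tanimoto(query_features, feats), activity)
         for feats, activity in tested.values()),
        reverse=True,
    )[:k]
    total = sum(score for score, _ in scored)
    if total == 0:
        return None
    return sum(score * activity for score, activity in scored) / total


# Hypothetical tested chemicals: (structural features, activity in some assay).
tested = {
    "chem_A": ({"aromatic_ring", "nitro", "halogen"}, 1.0),
    "chem_B": ({"aromatic_ring", "nitro"}, 1.0),
    "chem_C": ({"alkyl_chain", "hydroxyl"}, 0.0),
}
query = {"aromatic_ring", "nitro", "hydroxyl"}

prediction = read_across(query, tested, k=2)
```

In practice the choice of structural representation and similarity threshold largely determines how defensible such an estimate is, which is one reason coordination on common methods matters.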

       Development of toxicity pathway assays.  NHEERL is developing assays for
neurodevelopmental, immunotoxicity and cellular stress responses. As these assays are
established, they will be used to expand the breadth of assays in the ToxCast™ testing program.
To date, the  ToxCast™ Phase I chemicals have been tested in several neurodevelopmental and
cellular stress assays. Through participation in the Tox21 MOU, NHEERL has contributed
additional toxicity pathway assays related to the assessment of stress responses in cellular
systems. In addition, NHEERL will be performing secondary screening and targeted testing to
explore insights and hypotheses generated from the ToxCast™ and Tox21 HTS efforts. These
follow-up studies are designed to evaluate findings from the screening and provide quantitative
in vivo relationships.

       Quantitative models for risk assessment. As quantitative relationships are established
between assay conditions and environmental exposures for humans and wildlife, a transition to
toxicity pathway-based risk assessment becomes technically feasible. Data to define these
quantitative  relationships will be generated in collaboration with NCEA and other ORD partners
to ensure the suitability for use in quantitative modeling. Research integrated with NERL
modeling efforts will make PK models compatible with exposure models. Modeling of MOA is
being conducted in collaboration with the NCCT through modeling of Virtual
Tissues/Systems/Organisms, as well as less detailed BBDR modeling projects. Through the
integrated nature of this research, it is anticipated that in vivo models will be generated for
comparison with in vitro toxicity pathway screening results. In addition, models will be
developed which can take in vitro results directly as input and make quantitative in vivo
predictions for target  organisms. The v-Liver™ and v-Embryo™ projects are two NCCT-
NHEERL collaborations currently underway in this area. An additional virtual cardiopulmonary
project is currently being developed based on extensive NHEERL research in this area and the
CTRP "new start" project looking at the mechanistic indicators of childhood asthma (MICA).

       Special considerations for ecology research.  The actions presented thus far are
applicable to both health and ecological problems. Ecology has some specific issues, however,
which must be addressed in parallel with the efforts described above. First, while human health
risk assessment can be conducted on the basis of a susceptible subpopulation's risk of an adverse
outcome, most ecological concerns relate to the effect on the viability of the population.
NHEERL has demonstrated success in linking MOA models to population models, and this work
will be extended to toxicity pathway-based models to be used for ecological risk
assessment as relevant. Second, the NRC report highlights the many problems that arise when
extrapolating among species and recommends that species extrapolation be avoided by focusing
on human cells for toxicity pathway assays. This is not possible for ecological risk assessment as
the number of relevant species is much too large for direct testing in each one. Therefore, the
identification of appropriate sentinel species and development of toxicity pathway assays in these
species will be coupled with the development of methods for species extrapolation.

C. Extramural STAR Grantee Projects

Bioinformatics, implicit in many of the research projects contained within the CTRP, is one area of
research that is needed. This rapidly emerging technology is crucial to the computational
toxicology program and there remains a large gap in ORD relative to the ability to analyze the
high volumes of molecular data and to predict potential toxicity, modes of action, and ultimately
risk. To help bridge this gap, NCER has supported the establishment of two STAR
Environmental Bioinformatics Research Centers (EBRC). The Research Center for
Environmental Bioinformatics and Computational Toxicology at the University of Medicine &
Dentistry of New Jersey (UMDNJ), Piscataway, NJ,  and The Carolina Environmental
Bioinformatics Research Center at the University of North Carolina, Chapel Hill,  are operating
as cooperative agreements and helping to facilitate the application of bioinformatics tools and
approaches to environmental health issues supported by the CTRP.

In the next year, UMDNJ researchers will begin the design of new ebTrack interfaces to open
source databases and to various "external" and Center-developed modeling tools for facilitating
wider deployment and applicability of the ebTrack/ArrayTrack system for integrative analyses of
various types of genomic, proteomic, and metabonomic data. Additionally, plans are underway
to refine the environmental bioinformatics Knowledge Base (ebKB) and to make a public beta
version of ebKB available; implement a modular "Virtual Liver" with alternative levels of detail
in describing physical structure of the liver with respect to toxicokinetic and toxicodynamic
processes with case studies focusing on environmentally relevant chemicals; and refine the
framework for DORIAN (Dose-Response Information Analysis) modules representing  different
scales of biological complexity ranging from molecule-molecule  interactions to biochemical
networks to virtual organs and systems.

For the Carolina Bioinformatics Center, goals for the next year include: (i) continuing progress
in dose-response pathway modeling and analysis of ToxCast™ Phase I data; (ii) the continuation
of QSAR modeling of multiple animal toxicity endpoints; and (iii) the development of QSAR
and other statistical models to use in vitro biological data to predict in vivo toxicity endpoints.
Other plans include the development of specific data-mining algorithms for genomic databases,
and extending computational work on fast approaches for genome-wide expression QTL analysis
to human haplotypes.

The third STAR Center is the Carolina Center for Computational Toxicology at the University of
North Carolina, Chapel Hill. This Center is developing fine-scale predictive simulations of the
protein-protein/-chemical interactions in nuclear receptor networks; mapping chemical-perturbed
networks and devising modeling tools that can predict the pathobiology of compounds based on
a limited set of biological data; building tools that will enable toxicologists to understand the role
of genetic diversity between individuals in responses to toxicants; and creating unbiased
discovery-driven prediction of adverse chronic in vivo outcomes based on statistical modeling of
chemical structures and high-throughput screening.

Another high priority for EPA is to understand the molecular and cellular processes that, when
perturbed, result in developmental toxicity. In response to this need, NCER has funded the
Texas-Indiana Virtual STAR Center: Data-Generating in vitro and in silico Models of
Developmental Toxicity in Embryonic Stem Cells and Zebrafish at the University of Houston,
Texas A&M Institute for Genomic Medicine, and Indiana University. As chemical production
increases worldwide, there is a concordant increased need for determining the hazard and risk to
human health at realistic  exposure levels. The main objective of the proposed multidisciplinary
Texas-Indiana Virtual STAR (TIVS) Center is to contribute to a more reliable chemical risk
assessment through the development of high throughput in vitro and in silico screening models
of developmental toxicity. Specifically, the TIVS Center aims to generate in vitro models of
murine embryonic stem cells and zebrafish for developmental toxicity. The data produced from
these models will be further exploited to produce predictive in silico models for developmental
toxicity on processes that are relevant also for human embryonic development.

D. Summary Integration of the CTRP Projects for FY2009-2012

The CTRP spans several  ORD Laboratories and Centers, as well as the extramural STAR grants
program. Collectively these various components of the CTRP are developing new methods and
tools that will enhance our ability to predict adverse effects and understand the mechanisms
through which chemicals induce harm. Advances from the CTRP will give EPA the ability to
screen and assess a larger number of chemicals than traditional methods allow. In addition, EPA
is collaborating with other governmental and private organizations to leverage resources and
access complementary expertise in order to accelerate progress in high-priority research areas.

Throughout the various components of the CTRP,  focus is maintained on addressing a number of
key science questions:
       •   What  are the key linkages in the  continuum between the source of a chemical in the
          environment and its adverse outcomes?
       •   How can we develop predictive models for screening and testing?
       •   How can we improve quantitative risk assessment and reduce uncertainty by using
          advanced computational techniques?
       •   How can we enhance dose-response modeling, especially in low-dose ranges, to
          include knowledge of molecular events?

In order to address these questions, ORD's CTRP will continue to provide informatics, chemical
prioritization and systems modeling solutions for EPA. ACToR, DSSTox, and ToxRefDB;
ToxCast™ and ExpoCast™; and v-Liver™ and v-Embryo™ are all excellent examples of
extensive, multidisciplinary and integrated projects that bring together the talents of ORD, along
with extramural scientists from EPA-funded STAR centers, to provide the high-throughput
decision support tools for screening and assessing chemical exposure, hazard and risk.

IV. APPENDICES

A.  Intramural CTRP Projects

1. Project Plans

 a. ACToR - Aggregated Computational Toxicology Resource

Lead/Principal Investigator: Richard Judson

Version: March 4, 2009

Research Issue/Relevance: The EPA faces a significant issue in that many chemicals that are
in widespread use, and that the Agency regulates, have little or no toxicology
information. The EPA ToxCast™ program is one effort that is addressing this problem by
screening many of these chemicals using high-throughput techniques and helping prioritize
which ones are candidates for more detailed testing. To be effective, ToxCast™ needs to know
what chemicals are in need of screening, and needs to know what is already known (or not) about
these chemicals. ACToR is providing this information. In addition, the ToxCast™, ToxRefDB
and Tox21 projects each need unified, sophisticated informatics support and
management of the large amounts of chemical and biological information that are central aspects
of these projects. Additional capabilities tied to this informatics resource will be  needed to
address the data analysis and toxicity prediction challenges of the ToxCast™/Tox21 projects.

A related issue is that other EPA programs as well as external stakeholders need  easy access to
information on environmental chemicals both within and beyond the set of interest to ToxCast™.
Currently, toxicity and exposure data associated with environmental chemicals do not adhere
to standardized representations, and are widely dispersed across many databases and Internet
resources, many of which are difficult to access or search. The ACToR project is addressing this
challenge by creating a central, standardized, publicly accessible chemical-informatics platform
to enable searching and cross referencing of chemical-associated toxicity information to aid
prioritization and hazard identification of environmental chemicals.

Purpose/Objective/Impact: ACToR aims to provide a unified, centralized resource of data on
environmental chemicals including toxicology, in vitro assay data, and chemical  structure
information. By gathering information on the type and location of toxicity or exposure data
associated with environmental chemicals into a single, searchable, publicly accessible website,
ACToR is providing the basis for chemical selection and screening within NCCT projects, such
as ToxCast™ and Tox21. ACToR is also coordinating with the DSSTox project to incorporate
quality chemical review and structure-annotation for the chemical data sets of highest interest to
the various NCCT projects. In addition to its use in supporting various NCCT and EPA projects,
ACToR is a publicly available EPA resource that enables other government agencies, industry,
and academic researchers to quickly search and collate toxicity-related information on chemicals
of interest. As such, it will promote and encourage other entities to  adopt standards for chemical
representation and broadly survey chemical information pertaining  to toxicology resources on
the Internet.

Synopsis: ACToR is a web-based informatics platform, organized at the top level by chemical
and chemical structure, that is indexing, collecting, and organizing many types of data on
environmental chemicals. Environmental chemicals are defined as those likely to be in the
environment, including all chemicals regulated or tracked by the EPA, as well as related
chemicals, such as pharmaceuticals that find their way into water sources. ACToR is indexing
and linking to data from hundreds of sources, including the EPA, FDA, CDC, NIH, academic
groups, other governmental agencies (state and national) and international organizations, such as
the WHO. Information being indexed and gathered includes in vivo toxicity, in vitro bioassay
data, use levels, exposure information, chemical structure, regulatory information and other
descriptive data. Planning for the project began in mid-FY07; beta versions have been available inside
the EPA since early FY08, and a public version became available in December 2008. ACToR
consists of a back-end database and a front-end web interface built on low-cost,  publicly
accessible applications and tools. Over the next 3 years, ACToR will expand to include more
publicly available resources and data, including more information extracted from text reports and
tabularized, and more information on chemical use and exposure. The latter effort will be
coordinated with the efforts of ExpoCast™ and NERL to identify, index and extract data from
exposure-related resources  of highest interest and importance to EPA programs. In planned
upgrades to ACToR, the  ability of users to perform flexible searches across different layers of
data will be enhanced, and  customized data downloads will be implemented. ACToR will serve
as the primary vehicle to aggregate and publicly disseminate  all published data associated with
the ToxCast™, ToxRef, and Tox21 research projects. Additionally, the ToxMiner and NCCT
Chemical Repository systems are being developed as part of ACToR. These are data repositories
and data analysis engines for the ToxCast™/Tox21 projects.
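
The kind of chemical-centered indexing described in this synopsis can be sketched in a few lines. This is a hypothetical illustration only and does not reflect the actual ACToR schema or software: the record fields, identifiers, source names, and data types are all assumptions made for the example.

```python
from collections import defaultdict


def aggregate(records):
    """Index records from many sources as chemical_id -> data_type -> entries."""
    index = defaultdict(lambda: defaultdict(list))
    for rec in records:
        entry = (rec["source"], rec["value"])
        index[rec["chemical_id"]][rec["data_type"]].append(entry)
    return index


def data_gaps(index, required_types):
    """For each chemical, list the required data types with no entry yet."""
    return {
        chem: [t for t in required_types if t not in by_type]
        for chem, by_type in index.items()
    }


# Hypothetical records, as if collected from three separate sources.
records = [
    {"chemical_id": "CHEM-001", "source": "source_1",
     "data_type": "in_vivo_toxicity", "value": "chronic study available"},
    {"chemical_id": "CHEM-001", "source": "source_2",
     "data_type": "in_vitro_assay", "value": "active"},
    {"chemical_id": "CHEM-002", "source": "source_3",
     "data_type": "use_level", "value": "high production volume"},
]

index = aggregate(records)
gaps = data_gaps(index, ["in_vivo_toxicity", "in_vitro_assay", "use_level"])
```

A gap listing of this kind corresponds to the stated need to know what is, and is not, already known about a chemical before it is selected for screening.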

Partnerships/Collaborations (Internal & External):
   1. EPA ToxCast™ program - provide data for use in selecting chemicals and providing
      toxicology data for validation; provide route for publication of data
   2. Tox21 partnership - provide data for use in selecting chemicals and providing toxicology
      data for validation; provide route for publication of data
   3. DSSTox coordination - align methods for registering high-interest chemical inventories
      (ToxCast™, ToxRef, Tox21, DSSTox published data files), utilizing DSSTox chemical
      information quality review and structure-annotation within ACToR
   4. EPA Centers and Offices (OPPT/OPP/NCEA/OW) - provide data on chemicals of
      interest

Milestones/Products:
FY09

   1. Initial public deployment.
   2. Significant version 2, including refined chemical structure information.
   3. Develop workflow for tabularization of data buried in text reports.
   4. Integrate all ToxCast™ and ToxRefDB data.
   5. Quarterly releases with new data.

FY10

    1.   Quarterly releases with new data.
    2.   Implementation of a process to gather tabular data on priority chemicals from text
       reports.
    3.   Survey sources of chemical use and exposure data and import any remaining sources.
    4.   Develop flexible query interface and data download process.
    5.   Develop process to extract data from open literature.

FY11

    1.   Quarterly releases with new data.


FY12

    1.   Quarterly releases with new data.

Keywords:
Computational Toxicology; ACToR; ToxCast™; DSSTox; Database

QA Project Plan: Category II. ACToR development is guided by a series of Standard
Operating Procedures. These govern all aspects of the project, including data acquisition,
formatting, quality assurance, database filling and maintenance, and system administration. All
QA plans are archived in the EPA internal QA system.

 b. DSSTox - Chemical Information Technologies in Support of Toxicology Modeling

Lead/Principal Investigator:  Ann M. Richard

Research Issue/Relevance: A central regulatory mandate of the EPA is to assess the potential
health and environmental risks of large numbers of chemicals released into the environment,
often in the absence of relevant test data. Significant advances in toxicity prediction capabilities
are predicated on the ability to store, mine, and analyze information on many levels in relation to
chemicals and their effects on biological systems. Standardized, high quality chemical structure
annotation and searchability of toxicity-related information across the Internet and within EPA
programs is a crucial requirement for creating effective data linkages and gathering relevant data.
Equally important is the need to incorporate meaningful chemical structure and property
representations, based on principles of organic chemistry and biologically informed measures of
chemical similarity, into toxicity modeling efforts. Finally, successful modeling efforts will
depend upon suitable representations of biological activity, both HTS and in vivo, in relation to
chemical structure. NCCT's ToxCast™ and Tox21 projects  are employing high-throughput
screening tests to probe biochemical target interactions, chemical pathways, and cellular
responses potentially relevant to toxicity for thousands of chemicals  of high potential exposure
and environmental interest. The goal is to use these data, in conjunction with legacy data and
chemical structure considerations, to infer meaningful patterns and to develop models to predict
a range of in vivo bioassay responses. Biologically informed  toxicity prediction models that
incorporate chemical structure-activity considerations are likely to provide the best means for
prioritizing large lists of chemicals for potential hazard - a pressing need for many EPA
programs - and charting a path forward for a more efficient and cost-effective screening and
testing paradigm.

Purpose/Objective/Impact: The current research will use the DSSTox project framework to
incorporate strict quality standards for chemical information  across NCCT projects (ToxCast™,
ACToR, Tox21), and expand comparability and linkages of summary toxicity data in the context
of a standardized cheminformatics environment. This project will also use these data
foundations to promote and explore new ways to associate chemical  structure with biological
activity, extending traditional structure-activity relationships  (SAR) towards new paradigms for
biologically informed structure-based toxicity prediction.  These efforts have the potential to
impact a wide variety of EPA program offices that heavily rely on chemical information
resources and have a need for structure-based data exploration, analog searching, and improved
toxicity prediction models when limited test data are available. These include programs within
OPPTS  [e.g., Green Chemistry, PreManufacture-Notification Program (PMN), Office of
Pesticide Programs (OPP), HPV Testing Program], as well as EPA's IRIS Program, Office of
Water, and Office of Environmental Information. New information technologies that incorporate
strict quality standards and more flexible and diverse means for assessing biological and
chemical similarity will also improve the identification of toxicologically relevant analogs by
enhancing the ability to explore data and quantify associations across diverse chemical and
biological data domains.

Synopsis: The DSSTox project has implemented high quality data review procedures for
standardized chemical structure annotation, created linkages  connecting diverse toxicity
resources within and outside of EPA, and published high quality EPA chemical inventories and
toxicity data files spanning over 10,000 substances. The broad utility of DSSTox data files for
cheminformatics and modeling applications provides significant opportunities to influence the
course of predictive modeling strategies and to encourage wider engagement of toxicologists in
toxicity data representation. The DSSTox project will continue efforts to expand chemical data
file offerings into less well-represented areas of toxicology (immunotoxicology, toxicogenomics,
etc.), and provide varied representations of summary toxicity endpoints. In addition, this research
will explore new representations of chemical structure in relation to the biology  (e.g., analog
measures, chemical features, chemical classes), and new representations of biological endpoints
in relation to modeling (e.g., quantitative endpoints in terms of potency, summarized or grouped
effects, qualitative active and inactive classes). These efforts will be designed to complement and
augment projects in NCCT (ToxCast™, ACToR, and Tox21) that are working to improve
capabilities to access, mine, and integrate chemical-biological activity information from existing
and new data, both within and outside EPA, in support of toxicity prediction efforts. In close
coordination with ACToR, which is the primary informatics resource for ToxCast™ and Tox21
chemical and biological data, DSSTox will provide the initial chemical registration IDs, structure
annotation,  and quality review of ToxCast™ and ToxRef inventories, as well as the expanded
Tox21 chemical testing library, helping to ensure quality and consistency of chemical
information across the various NCCT programs.

Partnerships/Collaborations (Internal & External):  The DSSTox project is being
coordinated and linked with a number of public efforts (ILSI, ToxML, LHASA UK, PubChem,
ChemSpider), and government research laboratories (NIEHS, NTP, FDA) that are promoting
controlled toxicity vocabularies, adopting data standards,  and migrating diverse  toxicity data into
the public domain. DSSTox is also aligned with major NCCT projects (ToxCast™, ToxRefDB,
ACToR, Tox21), providing key quality review procedures and cheminformatics support,
expanding DSSTox data file publications of toxicological data in support  of predictive modeling,
and enhancing linkages to public resources such as PubChem for disseminating bioassay results
to the broader modeling community. In coordination with ACToR, partnerships  and
collaborations with scientists across EPA (NHEERL, OPPT, OPP, NERL) are being forged to
improve cheminformatic capabilities  across the Agency from a unified chemical structure
perspective, most recently extending into exposure data arenas (ExpoCast™). Research
collaborations are ongoing with SAR modelers at UNC (A. Tropsha, H. Zhu) and in the data
mining and SAR community (C. Yang, R. Benigni, E. Benfenati) to improve methods to
incorporate biological considerations into SAR models. OECD is using DSSTox as a source of
high-quality, quality-controlled chemical structures and activities in its QSAR Toolbox.
Finally, we are pursuing closer collaborations with toxicogenomics resources such as the
National Center for Biotechnology Information (GEO), the European Bioinformatics Institute
(ChEBI and ArrayExpress), and the NIEHS CEBS toxicogenomics resource.

Milestones/Products:
FY09
    1.  Publish paper on ToxCast™ 320 chemical inventory from SAR modeling perspective.
    2.  Publish papers and coordinate efforts with NCBI (GEO) and EBI (ArrayExpress) to
       structure-annotate and provide chemical linkages to microarray data for toxicogenomics.
    3.  Restart Chemoinformatics Communities of Practice using EPA's Science Portal.
   4. Publish DSSTox files for ToxRef and ToxCast™ inventories and selected summary
      endpoints, and facilitate publication and linkage to the NLM PubChem project. Compile
      and publish public genetic toxicity data and SAR predictions for ToxCast 320.
   5. Continue expansion of DSSTox public toxicity database inventory for use in modeling.
   6. Perform primary chemical review and structure annotation of the ToxCast™/Tox21
      chemical testing libraries, coordinating with ACToR and within a central chemical
      registry.
FY10
   1.  Publish DSSTox files for Tox21 inventory and selected summary endpoints, and facilitate
       publication and linkage to the NLM PubChem project.
   2.  Publish DSSTox files for NTP study areas (Immunotox, Genetox, etc.) to facilitate
       incorporation into ACToR and encourage broader SAR examination; explore new
       approaches to SAR modeling based on feature categories within existing DSSTox files
       and ToxCast™ data.
   3.  Explore new approaches to SAR modeling based on classifiers and feature categories.
   4.  Expand CEBS collaboration to incorporate DSSTox GEO and ArrayExpress files, create
       chemical linkage to ILSI Developmental Toxicity database and facilitate structure-
       searching.
   5.  Advise and assist efforts within ExpoCast™ to identify and chemically annotate
       important exposure-related public data resources.

FY11
   1.  In collaboration with ACToR, establish procedures and protocols for automating
       chemical annotation of new experimental data generated by NCCT Programs
       (ToxCast™, Tox21) and in collaboration with CEBS or NHEERL.
   2.  Document and employ PubChem analysis tools in relation to published DSSTox and
       ToxCast™ data inventory in PubChem.
   3.  Collaborate with SAR modeling efforts to predict ToxCast™ endpoints using in vitro
       data.
   4.  Continue expansion of DSSTox public toxicity database inventory for use in modeling
       with co-publication and linkage to ACToR and PubChem.
FY12
   1.  Redesign DSSTox website to provide hosting of donated chemical descriptors, properties
       and predictions for high interest inventories.
   2.  Publish master tables of DSSTox IDs and high quality structures to serve as public data
       registry for toxicology, particularly for EPA,  FDA, and NTP datasets.
   3.  Promote use of chemical registry system from ToxCast™/Tox21, linked to DSSTox
       content and integrated into ACToR, more broadly within EPA.
   4.  Collaborate with SAR modeling efforts to expand modeling to address Tox21 chemicals
       and endpoints.
   5.  Continue expansion of DSSTox public toxicity database inventory into toxicity and
       exposure areas not effectively linked to current databases.

Keywords: DSSTox; prediction; cheminformatics; structure-activity; SAR

QA Project Plan:  Category II. The DSSTox project, involving the compilation,
standardization, and review of chemical data largely extracted from secondary public data
sources, has a large intrinsic QA component.  All QA plans are archived in the EPA internal QA
system. Documentation and procedures contributing to overall QA objectives include:

   >  Maintenance of a DSSTox web page describing QA procedures for obtaining and
       reviewing chemical information prior to inclusion in the DSSTox Master file:
          http://www.epa.gov/ncct/dsstox/ChemicalInfQAProcedures.html
   >  File versioning, error tracking within data files, publication of Log files with version
       history, and full documentation of published DSSTox data files;
   >  Maintenance of the DSSTox Master File tables in ACCESS, cross-referencing, and use of
       automated scripts to perform field content error checks for new file generation;
   >  Coordinated publication in ACToR and PubChem, with checks for structure consistency;
   >  Review of all newly published DSSTox data files by Source collaborators;
   >  Error reporting system associated with DSSTox published files and structure browser.
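
The plan does not say which field content checks the automated scripts perform. As one illustration of the kind of check such a script might run, the sketch below validates the check digit of a CAS Registry Number, which is a standard, publicly documented rule: the final digit equals the position-weighted sum of the preceding digits, modulo 10. The record layout and the field name `casrn` are hypothetical and are not taken from the DSSTox Master File.

```python
import re

# CAS Registry Number layout: 2-7 digits, hyphen, 2 digits, hyphen, 1 check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_casrn(casrn: str) -> bool:
    """Return True if the string is a well-formed CAS RN with a correct check digit."""
    match = CAS_PATTERN.match(casrn.strip())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight each digit by its position counted from the right, starting at 1.
    weighted_sum = sum(
        int(digit) * weight
        for weight, digit in enumerate(reversed(digits), start=1)
    )
    return weighted_sum % 10 == check_digit


def find_casrn_errors(records):
    """Return (row index, value) pairs for records whose 'casrn' field fails the check.

    'records' is a list of dicts; the 'casrn' key is a hypothetical field name.
    """
    return [
        (index, record.get("casrn", ""))
        for index, record in enumerate(records)
        if not is_valid_casrn(record.get("casrn", ""))
    ]


if __name__ == "__main__":
    sample = [{"casrn": "7732-18-5"}, {"casrn": "80-05-6"}]
    print(find_casrn_errors(sample))
```

A check of this kind catches transcription errors in identifiers only; it says nothing about whether the identifier matches the annotated structure, which is the part of the review the plan assigns to expert curation.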

c. ToxRefDB - Toxicity Reference Database

Lead/Principal Investigator: Matthew Martin

Version: April 02, 2009

Research Issue/Relevance: As the EPA moves toward a new chemical toxicity testing
paradigm, the vast library of laboratory animal toxicity study information that is publicly
available, and which the Agency has received and continues to receive, will provide context
for many of the technologies applied recently to toxicity testing and screening. However, the
animal toxicity study information has not been made electronically accessible, searchable, or
computable. ToxRefDB is the relational database designed and developed to electronically
capture all of the relevant information spanning thirty years of health effect data from the agency
and beyond. The EPA ToxCast™ program is one effort using high-throughput techniques to
prioritize chemicals for further testing. To be effective, ToxCast™ needs reference toxicity
information to provide the interpretive context for the large amounts of screening data.
ToxRefDB is providing the reference in vivo toxicity data for programs such as ToxCast™ in a
searchable and computable format.

Additionally, the utility of ToxRefDB is broader than use as a validation dataset for ToxCast™.
Regulatory scientists have begun to assess the role and impact of previous and current guideline
studies and components of those studies in the regulation and assessment of chemicals.
ToxRefDB will be the primary data source for numerous retrospective analyses and may have a
large impact on the future revisions to existing guideline studies.

Purpose/Objective/Impact: The ToxRefDB project has provided access to a wealth of in vivo
toxicity data in a structured and searchable format. These data are being released through a series
of manuscripts which are currently submitted for publication or in preparation. In addition,
ToxRefDB data will be publicly available through the ToxRefDB website. This will fill a major
gap in the environmental toxicology community as very limited resources in this area exist in the
public domain. Such information should have high utility in building and interpretation of
predictive toxicology models. In addition, researchers and regulatory scientists can access the
toxicity information to address numerous questions specific to hazard identification and
characterization of environmental chemicals along with retrospective analyses that will direct
and evaluate possible changes to guideline toxicity studies.

Synopsis: Thirty years of registration toxicity data and open literature studies have been
historically stored as hardcopy and scanned documents by the EPA and others. A significant
portion of these data, including chronic, cancer, developmental, and reproductive studies from
laboratory animals, have now been processed into standardized and structured toxicity data,
within the EPA's Toxicity Reference Database (ToxRefDB). These data are now accessible and
mineable within ToxRefDB and are serving as a primary source of validation for EPA's
ToxCast™ research program in predictive toxicology. In addition to providing reference toxicity
information to research efforts, ToxRefDB will be mined for information  on the role and impact
of previous and current study guidelines on the regulation of environmental chemicals. The
initial collection of studies in March of 2006 focused primarily on the reviews of registrant
submitted toxicity studies on pesticide active ingredients. ToxRefDB design, development and
implementation were completed in mid-FY2006 with ongoing updates to the standardized
vocabulary and data entry tool interface. The entry of over 2,000 studies spanning the majority of
the ToxCast™ Phase I chemical set was completed late FY2008. These initial datasets are
either published, in press, or submitted, and are being made publicly available
through the ToxRefDB website. Over the next year, a web-based query tool for the entire
contents of ToxRefDB will become available to the public, released in
conjunction with a quarterly update of the EPA ACToR program. ToxRefDB will continue to
enter available data from chronic, cancer, developmental, and reproductive studies with a focus
on potential ToxCast™ Phase II chemicals. The availability and entry of toxicity data into
ToxRefDB will also guide the selection of ToxCast™ Phase II chemicals. Over the next year,
ToxRefDB will also be expanded to capture developmental neurotoxicity (DNT) study data and
possibly in vivo data submitted to the Agency as part of the Endocrine Disruptor Screening
Program (EDSP). Retrospective analyses of reproductive toxicity studies will be completed by
late FY2009, and ToxRefDB will be used for additional analyses, including rat and mouse
chronic/cancer study and rat and rabbit developmental study assessments.

Partnerships/Collaborations (Internal & External):
   1.  EPA ToxCast™ program - provide the reference toxicity information for interpreting the
       screening data with respect to animal toxicity information
   2.  Tox21 partnership - provide the reference toxicity information for interpreting the
       screening data with respect to animal toxicity information
   3.  EPA ACToR program - provide ACToR with ToxRefDB data for chemical indexing and
       searchability
   4.  EPA Centers, Offices, and Labs (OPPT/OPP/OSCP/NHEERL) - offices provide legacy
       toxicity data; provide searchable and computable toxicity data to offices; entry of
       additional study types and from various sources
   5.  OECD  (including BfR, RIVM, PMRA) - evaluation of current testing guidelines and
       assessment of proposed new guidelines, e.g., extended one-generation reproduction
       toxicity study

Milestones/Products:
FY09
   1.  Publication on ToxRefDB
   2.  Release of stand-alone ToxRefDB data entry tool
   3.  ToxRefDB webpage online
   4.  Initial public release of selected chronic/cancer endpoints
   5.  Public release of selected reproductive toxicity endpoints
   6.  Public release of selected developmental toxicity endpoints
   7.  Collection of ToxCast™ Phase II chemical  toxicity data
   8.  Public release of ToxRefDB web-based query tool
   9.  Complete entry of targeted set of chemicals and study types for Phase II of ToxCast™
   10. Complete reproductive toxicity study retrospective analysis
FY10
   1.  Quarterly releases with new data in conjunction with ACToR
   2.  Implementation of a process to gather and enter open literature studies
   3.  Expansion of ToxRefDB to capture DNT studies and EDSP data
   4.  Complete retrospective analyses on other major study types
   5.  Release of ToxRefDB live data entry tool

FY11
   1.  Quarterly releases with new data in conjunction with ACToR

FY12
    1.   Quarterly releases with new data in conjunction with ACToR

Keywords:  Computational Toxicology; ToxRefDB; ToxCast™; ACToR;
Database

QA Project Plan: Category II. ToxRefDB development is guided by a series of Standard
Operating Procedures. These govern all aspects of the project, including data acquisition,
formatting,  quality control and assurance, data entry and maintenance, and system
administration. All QA plans are archived in the EPA internal QA system.

d. ChemModel - The Application of Molecular Modeling to Assessing Chemical Toxicity

Lead/Principal Investigator:  James Rabinowitz

Research Issue/Relevance:  Insufficient experimental information exists for the evaluation of
the potential of a large number of environmental chemicals to cause toxicity and other
environmental effects. Where data do exist, they are often not ideal for this task. The Agency
often must make decisions about specific chemicals when lacking an ideal data set. Molecular
modeling offers one way to estimate the relevant missing information: extrapolation from
existing information on the chemical of interest and other similar chemicals. Knowledge of the
mechanisms of toxicity provides a rational basis for application of these computational tools.
The results of these models may be used in
conjunction with experimental information to inform decisions about the relevant chemical and
to prioritize the requirements for obtaining missing experimental data.

Purpose/Objective/Impact: The overall objective of this research is to develop an approach
(including the necessary tools) for the application of molecular modeling methods to Agency
problems, particularly problems resulting from the requirement to make preliminary decisions
about chemicals in a data poor environment. This includes the preliminary evaluation of
chemical toxicity and the prioritization of chemical testing needs. Knowledge of the potential
mechanisms of toxicity provides a rational basis for extrapolation from existing  information to
derive information about chemicals for which little data exists. The differential step in many
mechanisms of toxicity may be generalized as the interaction between a small molecule (a
potential toxicant) and one or more macromolecular targets. (The small molecule may be the
chemical itself or one of its descendants). Using modern molecular modeling methods developed
for the discovery of novel pharmaceutical agents, it is possible to computationally predict these
toxicant-biomolecular target interactions  using a combination of direct computer modeling of
atomic interactions between the toxicant-target pair and correction factors derived from
experimentally-derived interactions with  similar targets. To employ this approach, a library of
computational models of relevant biomolecular targets is being developed. Molecular modeling
approaches may then be used to interrogate this library for the capacity of specific environmental
molecules to interact with each target. These approaches were developed for the discovery of
new pharmaceuticals where the objective is to discover molecules that interact most potently
with the target. However, the Agency's need is to discover if chemicals of environmental interest
interact with the target, even if their interaction  is much weaker than seen with potential
pharmaceutical agents. An objective of this research is to evaluate the relevant molecular
modeling methods in relation to this Agency requirement.

The endocrine system provides a test for  the utility of this approach because many of the
pathways for toxicity and the macromolecular targets in those pathways have been identified.
Appropriate experimental crystal structures of many of the receptor protein targets are available
to create the computational library of targets. Additionally, experimental data on the capacity of a
library of chemicals to displace the natural ligand from the rat estrogen receptor are available from
a single source. While most of the chemicals in this library have been shown not to interact with
the receptor in this laboratory assay, the few that displace estrogen are orders of magnitude less
potent than the natural ligand. The data from this chemical library provide an opportunity to test
the toxicant-target approach and its capacity to separate less potent chemicals from a large
number of similar inactive chemicals. In addition, this exploration addresses the specific Agency
need to evaluate the potential of chemicals to disrupt the endocrine system. A similar approach
may be used to investigate other health effects and fate and metabolism.

The ultimate objective of this research is to develop a library of biomolecular targets for
chemical toxicity, and the methods appropriate for their application to predicting the capacity of
a chemical to interact with these targets. This library of targets may then be used in conjunction
with other approaches as part of a chemical prescreen.

Synopsis: The differential step in many mechanisms of chemical toxicity may be generalized as
the interaction between a small molecule (a potential toxicant) and one or more macromolecular
targets. The small molecule may be the chemical itself or one of its descendants. Describing the
potential of a molecule to participate in interactions of this type is a source of insight into
chemical toxicity. In this project, a series of molecular models (148) for critical toxicity targets
is being developed, and methods to evaluate the capacity of a small molecule to interact with
these targets are being assessed. These methods are adopted from those used in the design of
novel pharmaceuticals. A
study of a library of 280 environmental chemicals interacting with the estrogen receptor target is
in the final stages of completion. In this library, 14 of the chemicals are weakly active (3-5
orders of magnitude less active than estrogen) and the others are inactive. Modeling the potential
interaction of these chemicals with the rat estrogen receptor provides an ordered list of
molecules. The best results are achieved using a pharmacophore filter. With that approach, all
14 active chemicals are identified in the first 22 chemicals. In addition to the importance of
these results relative to potential binding to the environmentally important estrogen receptor,
they indicate that this approach may be used to find chemicals that interact weakly with the
target. All 150 of the targets have been interrogated with the ToxCast™ chemicals. Based on
the results from the estrogen receptor study, pharmacophores for as many of the targets as
possible will be developed. The analysis of these data will proceed by comparison with specific
ToxCast™ endpoint data and in concert with short-term data to evaluate more complex
biological endpoints. The logical extension is to consider the androgen receptor, where relevant
data for comparison and developing a pharmacophore are available. When sufficient data on
other biological macromolecules that are relevant to the Agency requirements become available,
the current library of targets will be expanded. This library of targets will be used to study
chemicals and families of chemicals of importance to the Agency. The Toxicant-Target
approach described above models molecular identification processes. In collaboration with other
EPA scientists, similar approaches are being applied to the steps that follow identification. A
study on the differential metabolism of pyrethroids and the effect of stereo-structure on
biological clearance is underway, as is a study of perfluorinated chemicals.
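
The estrogen receptor result quoted above (all 14 actives ranked within the first 22 of 280 chemicals) can be restated as an enrichment over random selection. The sketch below is a worked calculation using only the counts given in the synopsis. The enrichment factor is a common virtual-screening summary statistic; it is used here as an illustration, and the plan does not state that the authors report this particular metric.

```python
def enrichment_factor(actives_found, n_selected, total_actives, library_size):
    """Ratio of the hit rate in the selected top of the ranked list
    to the hit rate expected from picking chemicals at random."""
    hit_rate_selected = actives_found / n_selected
    hit_rate_library = total_actives / library_size
    return hit_rate_selected / hit_rate_library


# Counts taken from the synopsis text.
library_size = 280    # chemicals in the estrogen receptor library
total_actives = 14    # weakly active chemicals in the library
n_selected = 22       # top-ranked chemicals examined
actives_found = 14    # actives recovered within those top-ranked chemicals

fraction_screened = n_selected / library_size
recall = actives_found / total_actives
ef = enrichment_factor(actives_found, n_selected, total_actives, library_size)

if __name__ == "__main__":
    print(f"Fraction of library examined: {fraction_screened:.1%}")
    print(f"Fraction of actives recovered: {recall:.0%}")
    print(f"Enrichment over random selection: {ef:.1f}x")
```

Under these counts, examining roughly 8% of the ranked list recovers every active, an enrichment of about 12.7 over random selection.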

Partnerships/Collaborations (Internal & External):  Scientists from NHEERL/RTD have
provided the database of the interactions of molecules with the estrogen receptor. We continue to
interact with them relative to this data and the biological details of the computational modeling
effort. Scientists from NERL/HEASD are collaborators on the study of the metabolism and fate of
pyrethroids. Collaboration with CDC scientists relative to using the target-toxicant approach to
investigate the interaction of environmental chemicals with nervous system enzymes and
receptors is being developed. As described above these studies apply methods developed for
pharmaceutical discovery to model the capacity of an environmental chemical to interact with a
macromolecular target.

Milestones/Products:
FY09
    1.  Report on the capability of the target-toxicant paradigm to identify chemicals that bind
       weakly to the estrogen receptor, including a description of the method.
    2.  Description of the library of 148 biological macromolecule targets.
    3.  Report on molecular modeling studies of the potential biological effects of the perfluoro
       compounds.

FY10
    1.  Report on the metabolism of pyrethroids and the effects of three-dimensional chemical
       structures.
    2.  Description of additional targets added to the target library.
    3.  Report on the interaction of ToxCast™ chemicals with nuclear receptor targets and the
       importance of pharmacophore filters.

FY11
    1.  Report on the integration of results from the target library and available experimental
       parameters.
    2.  Report on the comparison of results with and without pharmacophore filter for
       ToxCast™ chemicals.

FY12
    1.  Comparison of results using the target library with experimentally determined activities,
       particularly when observed at the molecular level.
    2.  Report of the potential use of molecular modeling and the target toxicant paradigm for
       regulatory purposes, including a discussion of the OECD principles either as they
       currently exist or relative to molecular modeling specific principles.

Keywords:
Molecular modeling; Protein binding; toxicity prescreening; weak interactions

QA Project Plan: Category IV.  The quality objectives  for molecular modeling are to achieve
the best balance between reasonable computational speed and model performance. Development
is guided by a series of Standard Operating Procedures. These govern all aspects of the project,
including data acquisition, formatting, quality assurance, database filling and maintenance, and
system administration. All QA plans are archived in the EPA internal QA system.

e. ToxCast™ - Screening and Prioritization of Environmental Chemicals Based on
Bioactivity Profiling and Predictions of Toxicity

Lead/Principal Investigator: Keith Houck

Research Issue/Relevance:  The objective of the ToxCast™ research program developed by
the NCCT of the EPA's ORD is to develop cost-effective, innovative approaches to efficiently
screen and prioritize a large number of chemicals for toxicological testing. Using data from state-
of-the-art high-throughput screening (HTS) bioassays developed by the pharmaceutical industry,
ToxCast™ is building computational models to predict the potential human toxicity of
chemicals. These hazard predictions should provide the Agency's regulatory programs with
science-based information that will be helpful in setting priorities for more targeted toxicological
evaluations that will help the Agency focus on those chemicals and endpoints with the greatest
potential for causing adverse effects in humans. The ultimate goal of ToxCast™ is to deliver an
affordable, efficient, science-based system for categorizing chemicals  according to their
predicted toxicities.

An essential component of the ToxCast™ research program is the development of a
standardized, reference database containing animal toxicity studies called ToxRefDB.
ToxRefDB is being populated with the results of guideline animal toxicity studies on pesticidal
active chemicals that are submitted to the Agency by manufacturers as a requirement of licensing
a pesticide product. ToxRefDB is, for the first time, providing a searchable, mineable historical
database for accessing a wealth of reference in vivo study data. Most importantly, ToxRefDB
will provide the essential interpretive context to anchor ToxCast™ in vitro data (i.e., HTS and
genomic data) to animal toxicity endpoints with selected ToxRef in vivo outcomes serving as the
basis for developing predictive in vitro bioactivity profiles and signatures. Equally essential to
the overall success of this project will be the development of a suitable informatics and analysis
infrastructure for storing, relating and extracting patterns from all data associated with the
ToxCast™ project, including chemical, HTS, and in vivo data elements.

ToxCast™ databases and predictive models for the potential toxicity of environmental chemicals
will be useful to EPA program offices for chemical prioritization. For example, the Office of
Pesticide Programs (OPP) and the Office of Pollution Prevention and Toxics (OPPT) anticipate
taking advantage of ToxCast™ models and datasets to prioritize in vivo animal testing of
products that have limited toxicity data available, such as:
       •  antimicrobial pesticides
       •  inert ingredients  in pesticide products
       •  manufacturing process impurities
       •  metabolites and environmental degradates of concern
       •  new and existing industrial chemicals
After internal clearance and external peer review, the information in ToxRefDB and HTS data
generated on the chemicals screened in ToxCast™ will be publicly available at
www.epa.gov/ncct/toxrefdb and www.epa.gov/ncct/toxcast.

Purpose/Objective/Impact: The objective of the ToxCast™ research program is to develop a
cost-effective and rapid approach for screening and prioritizing a large number of chemicals for
toxicological testing (Dix et al., 2007). Using data from HTS bioassays developed in the drug
discovery community, ToxCast™ is generating data, constructing databases, and building
computational models to forecast the potential human toxicity of chemicals. These hazard
predictions should provide EPA regulatory programs, including OPP, with science-based
information helpful in prioritizing chemicals for more detailed toxicological evaluations,
ultimately leading to reduced animal testing. ToxCast™ is currently in the proof-of-concept
phase, wherein over 300 chemicals have been assayed in over 600 different HTS bioassays,
creating rich bioactivity profiles for these chemicals. The Phase I chemicals are primarily
conventional pesticide actives that have been extensively evaluated using traditional mammalian
toxicity testing, and hence have a number of well characterized toxicity outcomes (e.g.,
carcinogenicity; and developmental, reproductive and neural toxicity). These in vivo data, in
turn, have been extracted from the evaluations conducted by OPP scientists and were used to
construct and populate the ToxRefDB. Comparable toxicity data from other toxicity sources
(e.g., National Toxicology Program) are also being captured in ToxRefDB. A broader and more
diverse set of complementary data on thousands of chemicals is being identified and collated in
EPA's Aggregated Computational Toxicology Resource (ACToR).  ACToR (and the analysis
component, ToxMiner) is providing the essential informatics infrastructure for housing,
integrating and analyzing all chemical and assay data associated with the ToxCast™ project, also in
the context of a much larger world of web-accessible chemical-toxicological information.
DSSTox, in turn, is providing high quality, standardized chemical structure indexing for ACToR
and ToxCast™, including other high-interest Agency chemical-data inventories.

ToxRefDB is critical to developing predictive signatures, because it links ToxCast™ HTS in
vitro data to in vivo toxicity endpoints associated with the same chemicals. The toxicity data in
ToxRefDB and the HTS data generated in ToxCast™ will be made publicly available through
EPA websites and databases. The first manuscript on ToxRefDB was recently published (Martin
et al., 2008), presenting toxicity profiles from two-year rodent bioassays on 310 chemicals. A
similar analysis is nearing completion on multigeneration reproduction and prenatal
developmental test data for the  ToxCast™ chemicals in ToxRefDB, profiling the toxicity
potential of this chemical set across generation, life-stage, and different classes of endpoints.
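
The link between in vitro HTS results and in vivo endpoints described above can be pictured as a chemical-by-chemical comparison of two sets of calls. The sketch below is a minimal illustration, not the ToxCast™ signature method, which this plan does not specify: it scores a single assay against a single endpoint with a 2x2 contingency table. All chemical identifiers and hit calls are invented placeholders, not ToxCast™ or ToxRefDB data.

```python
from typing import Dict, NamedTuple


class AssayEndpointStats(NamedTuple):
    true_pos: int
    false_pos: int
    false_neg: int
    true_neg: int
    sensitivity: float
    specificity: float
    balanced_accuracy: float


def score_assay_against_endpoint(
    assay_hits: Dict[str, bool], endpoint_calls: Dict[str, bool]
) -> AssayEndpointStats:
    """Compare in vitro hit calls with in vivo endpoint calls.

    Only chemicals present in both inputs are scored, which mirrors the
    need for the same chemicals to appear in both data sources.
    """
    shared = sorted(set(assay_hits) & set(endpoint_calls))
    tp = sum(assay_hits[c] and endpoint_calls[c] for c in shared)
    fp = sum(assay_hits[c] and not endpoint_calls[c] for c in shared)
    fn = sum((not assay_hits[c]) and endpoint_calls[c] for c in shared)
    tn = sum((not assay_hits[c]) and (not endpoint_calls[c]) for c in shared)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    balanced = (sensitivity + specificity) / 2
    return AssayEndpointStats(tp, fp, fn, tn, sensitivity, specificity, balanced)


# Hypothetical placeholder calls for illustration only.
toy_assay_hits = {"chem_A": True, "chem_B": True, "chem_C": False,
                  "chem_D": False, "chem_E": True, "chem_F": False}
toy_endpoint_calls = {"chem_A": True, "chem_B": False, "chem_C": False,
                      "chem_D": True, "chem_E": True, "chem_F": False,
                      "chem_G": True}  # chem_G has no assay call and is skipped

toy_stats = score_assay_against_endpoint(toy_assay_hits, toy_endpoint_calls)

if __name__ == "__main__":
    print(toy_stats)
```

A practical signature would presumably combine many assays and would need to account for testing hundreds of assays against many endpoints; the single-assay table only shows why shared chemical identifiers across the two data sources are a prerequisite.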

ToxCast™ is a multi-year effort that is divided into three distinct phases:
1. Phase I: 300 chemicals assayed in over 600 different HTS bioassays, to create predictive
   bioactivity signatures based on the known toxicity of the chemicals;
2. Phase II: focused on confirmation and expansion of ToxCast™ predictive signatures,
   generating HTS data on at least 300 additional chemicals;
3. Phase III: ToxCast™ expanded to thousands of environmental chemicals for which little
   toxicological information is available.

Once ToxCast™ has gone through successful initiation  of Phase III, the data and toxicity
predictions will be ready for deployment throughout numerous EPA program offices. NCCT will
work to link these hazard predictions with exposure predictions, and create integrated database
analysis tools facilitating customized chemical prioritizations appropriate to specific programs.
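
One simple way to picture linking hazard predictions with exposure predictions is a ratio-based ranking. The sketch below is only an illustration: the field names, the example values, and the use of an exposure-to-potency ratio as the priority score are assumptions, not an Agency method.

```python
from typing import NamedTuple


class Chemical(NamedTuple):
    name: str
    bioactive_dose: float      # hypothetical hazard estimate, mg/kg/day
    predicted_exposure: float  # hypothetical exposure estimate, mg/kg/day


def priority_score(chem: Chemical) -> float:
    """Higher when predicted exposure approaches the bioactive dose."""
    return chem.predicted_exposure / chem.bioactive_dose


def prioritize(chemicals, top_n=None):
    """Rank chemicals from highest to lowest priority score."""
    ranked = sorted(chemicals, key=priority_score, reverse=True)
    return ranked if top_n is None else ranked[:top_n]


# Placeholder chemicals with made-up values.
inventory = [
    Chemical("chem_A", bioactive_dose=10.0, predicted_exposure=0.001),
    Chemical("chem_B", bioactive_dose=0.5, predicted_exposure=0.01),
    Chemical("chem_C", bioactive_dose=100.0, predicted_exposure=0.5),
]
ranking = [c.name for c in prioritize(inventory)]
```

A program office with different priorities could substitute its own scoring function for `priority_score` without changing the ranking step.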

-------
  EPA CompTox Research Program FY2009-2012                 BOSC Review Draft- 24 August, 2009


Synopsis: The primary goals for ToxCast™ are the completion of Phase I data collection and,
concurrently, development of the informatics infrastructure to store and analyze these data,
derivation of predictive signatures from Phase I data, validation of these signatures with Phase II
data, and the application of these predictions to the prioritization of chemicals in various
chemical and nanomaterial testing programs. Success in meeting these four principal goals will
lead to secondary applications in developing human toxicity pathway analyses, and in high
throughput risk assessments.

Partnerships/Collaborations (Internal & External):
Tox21 with NTP and NCGC
OECD Molecular Screening Project
OECD Enhanced One Generation Reproductive Test Guideline
EPA/ORD/NHEERL DNT Team
EPA/ORD/NCEA Phthalate Team
MOUs and MTAs with over 20 external organizations collaborating on ToxCast™ assays,
chemicals and data analysis.

Milestones/Products:
FY09
    1. Completion of ToxCast™ Phase I data collection.
    2. Provide annotated Phase I data sets for public access.
    3. Derivation of predictive signatures from ToxCast™ Phase I data.
    4. Multiple publications on ToxCast™ data sets.
    5. Publication on signature generation (SCOUT item, April 2009).
    6. Publication on analysis of NR pathways and toxicity.
    7. Convene first ToxCast™ Data Summit for identifying promising prediction models from
      intramural and extramural sources.
    8. Finalize selection of chemicals for next 3-5 years of ToxCast™ and Tox21 projects.
    9. Prioritize and order 4000-6000 chemicals in collaboration with other Tox21 partners.

FY10
    1. Publications describing approaches to combining exposure, PK and in vitro assays to do
      risk prioritization.
    2. Evaluate compatibility of nanomaterials of diverse classes with ToxCast™ assays.
    3. Prioritize and select assays to be run  for ToxCast™ Phase II.
    4. Completion of ToxCast™ Phase II data collection.

FY11
    1. Confirmation of ToxCast™ predictive signatures with Phase II data.
    2. Publications on signature confirmations and applications.

FY12
    1. Application of ToxCast™ predictions to the prioritization of chemicals in various EPA
      chemical and nanomaterial testing programs.

Keywords (three to five):  ToxCast™; high-throughput screening; hazard; predictive
toxicology; chemical prioritization

QA Project Plan: Category I. The ToxCast™ Quality Management Plan includes an Information
Management QA Project Plan, an NCGC QA Project Plan (for IA), and 9 separate Contractor QA
Plans and Records. All QA plans are archived in the EPA internal QA system.

-------
f. ExpoCast™ - Exposure science for screening, prioritization, and toxicity testing.

Lead/Principal Investigator:  Elaine Cohen Hubal

Research Issue/Relevance:  High visibility efforts in toxicity testing and computational
toxicology raise important research questions and opportunities for exposure scientists. There is
a clear need for a collaborative effort across the exposure and risk assessment community to
ensure that the required exposure science, data and tools are ready to address immediate needs
resulting from application of high-throughput in vitro technologies for toxicity testing. A
coherent program is required to formulate significant exposure questions posed by these novel in
vitro toxicity data, develop creative approaches for applying existing exposure information and
tools to address these questions, and finally identify key exposure research needs to interpret the
toxicity data for risk assessment. The authors of the National Academies report (NRC, 2007)
emphasize that population-based data and human exposure information are required at each step
of their vision for toxicity testing, and that these exposure data will continue to play a critical
role in guiding both the development and use of the toxicity information. Exposure research
questions posed in this report include how to: (1) use information on host susceptibility and real-
world exposures to interpret and extrapolate in vitro test results; (2) use human exposure data to
select doses for toxicity testing so information on biological effects pertains to environmentally-
relevant exposures; and (3) relate human exposure data from biomonitoring surveys to
concentrations that perturb toxicity pathways to identify biologically-relevant exposures. The
NCCT has identified the need to include exposure information for chemical prioritization,
modeling system response to chemical exposures across multiple levels of biological
organization (through  to the population level), and linking information on potential toxicity of
environmental contaminants to real-world health outcomes (Cohen Hubal et al., 2008).
Together, scientists from NCCT and NERL's Human Exposure Research Program have
identified and will be conducting research in the following priority exposure research areas to
support chemical screening, prioritization, and toxicity testing (Sheldon and Cohen Hubal,
2009): (1) accessible and linkable exposure databases; (2) exposure screening tools for
accelerated chemical prioritization; (3) computational tools for dose reconstruction and source-
to-outcome analyses; (4)  tools for understanding the fundamental processes and factors
influencing human exposures; and (5) efficient monitoring methods to measure and interpret
biologically-relevant exposure metrics.  Susceptibility, vulnerability, and life-stage aspects are
integral to each of these.

Purpose/Objective/Impact: The ExpoCast™ program is being initiated in FY09 to ensure the
required  exposure science and computational tools are ready to address global needs for rapid
characterization of exposure potential arising from the manufacture and use of tens of thousands
of chemicals and to meet challenges posed by new toxicity testing approaches. The overall goal
of this project is to develop novel approaches and tools for screening, evaluating and classifying
chemicals, based on the potential for biologically-relevant human exposure, to inform
prioritization and toxicity testing. An emphasis will be placed on conducting research to mine
and translate scientific advances and tools in a broad range of fields to provide information that
can be used  to support enhanced exposure assessments for decision making  and improved
environmental health.  Advanced exposure databases, computational tools, and analysis
approaches are required to prioritize chemicals, to design effective in vitro screening protocols,
and to interpret the results of these screening tests for human health risk assessment. Approaches
for integrating information on genetic susceptibility, life-stage, and population-level
vulnerabilities with in vitro toxicity data are also required to improve public-health decision
making. This initiative will advance Agency tools for efficiently characterizing and classifying
chemicals based on potential for biologically-relevant exposures. The improved exposure science
and knowledge will subsequently inform the characterization of environmentally-relevant
toxicity.

Synopsis of NCCT Directed Research: ExpoCast™ will provide an overarching framework
for the science required to characterize biologically-relevant exposure in support of the Agency
computational toxicology program. Broadly and long-term, the ExpoCast™ program will foster
novel exposure science research to: (1) inform chemical prioritization; (2) understand the
system's response to chemical perturbations resulting from environmentally relevant exposures
and how these translate to relevant biological changes at the individual and population levels; and (3)
link information on potential toxicity of environmental contaminants to real-world health
outcomes. The NCCT directed aspects of the ExpoCast™ program will have a strong focus on
research required to interpret and translate in vitro hazard data in the context of real-world
exposures for risk assessment. Research will be conducted jointly with NERL to leverage
expertise and resources required to meet objectives of this multidisciplinary project.

 An important early component of ExpoCast™ will be to consider how best to consolidate and
link human exposure and exposure factor data for chemical prioritization and toxicity testing.
Under the ExpoCast™ program NCCT and NERL scientists will collaboratively:
   •   Evaluate and recommend approaches for improving accessibility to EPA human exposure
       and exposure  factor data and for facilitating links between exposure and toxicity data
       (e.g., through DSSTox and ACToR systems);
   •   Advocate for the creation of a consolidated EPA exposure database focused on measured
       and predicted concentrations in exposure and biological media;
   •   Propose standards for human exposure data representation.
Early research activities will focus on identifying and evaluating novel approaches for
characterizing exposure to prioritize chemicals and developing modeling approaches for
considering exposure potential based on chemical properties, sources (e.g., consumer products),
uses, lifecycle, and individual/population vulnerability. Specific tasks pertinent to these goals
include:
   •   Analysis of extant exposure data to identify the  critical metrics and develop simple
       indices for representing biologically-relevant personal exposure over time, place,
       lifestage, and lifestyle or behavior.
   •   Development of novel approaches for characterizing biologically-relevant exposure to
       prioritize chemicals, for example:
          o   Application of residential models for prioritizing SVOCs.
          o   Development of a dermal uptake model suite.
          o   Application of biomonitoring equivalent (BE) approach to interpret ToxCast™
              data.
   •   Development of human exposure knowledgebase.
   •   Application of genomic tools and other biomarkers of exposure and susceptibility to
       consider population-level vulnerabilities for toxicity testing and risk assessment.
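
The first task above, a simple index for personal exposure over time and place, can be sketched as a time-weighted average across microenvironments. This is an illustrative assumption about what such an index might look like; the function names, units, and example values are hypothetical.

```python
def time_weighted_exposure(profile):
    """Time-weighted average concentration over microenvironments.

    `profile` is a list of (hours, concentration) pairs for one person-day,
    with concentration in, for example, micrograms per cubic meter.
    """
    total_hours = sum(hours for hours, _ in profile)
    if total_hours <= 0:
        raise ValueError("profile must cover a positive duration")
    return sum(hours * conc for hours, conc in profile) / total_hours


def exposure_index(profile, intake_rate, body_weight):
    """Hypothetical index: average concentration scaled by intake per kg.

    intake_rate is in cubic meters per day and body_weight in kilograms,
    so the result is in micrograms per kilogram per day.
    """
    return time_weighted_exposure(profile) * intake_rate / body_weight


# Made-up person-days: an adult splitting time between home and work,
# and a young child who stays at home.
adult_day = [(16, 2.0), (8, 0.5)]
child_day = [(24, 2.0)]

adult_index = exposure_index(adult_day, intake_rate=16.0, body_weight=80.0)
child_index = exposure_index(child_day, intake_rate=10.0, body_weight=20.0)
```

With these made-up inputs the child's index exceeds the adult's even though the home concentration is the same, which is the kind of lifestage difference the task description refers to.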


NCCT Partnerships/Collaborations (Internal & External): The ExpoCast™ project will be
conducted in close collaboration with the ToxCast™ program and through extensive
collaboration with NERL principal investigators in the Human Exposure Research Program.
Integration of research directed by both NCCT and NERL will be enhanced by activities such
as the Computational Toxicology Rotational Fellowship Program. To jumpstart joint research
activities, a NERL investigator will join the rotational fellowship program in the summer of 2009 to
focus on improving access to human exposure data. Partnerships with the other labs and centers
will also be developed as the ExpoCast™ program advances. For example, collaboration with
NHEERL on the Mechanistic Indicators of Childhood Asthma (MICA) study is providing the
opportunity to pilot advanced computational approaches for evaluating multi-factorial biomarker
data (including genomic data) across the exposure-outcome continuum to investigate the
interplay of environmental and genetic factors on complex disease.
       External collaborations include research with Dr. John Little of Virginia Tech, to develop
improved tools for rapidly predicting exposure associated with SVOCs used or emitted in the
residential environment. Dr. Sean Hays and coworkers of Summit Toxicology will be exploring
application of the Biomonitoring Equivalent (BE) approach for interpretation of ToxCast™ data.
In a collaboration established through the Bio-chem Redirect Program and implemented through the
ISTC (International Science and Technology Center), Dr. Petr Nikitin of the Natural Science
Center (NSC) of the A.M. Prokhorov General Physics Institute, Russian Academy of Sciences, is
leading research to develop a multi-channel immunosensor for detection of pyrethroids. The
primary goal of this project is development of a biochip technology that avoids sophisticated
labeling steps. This is an example of the type of research required to address needs for advanced
exposure monitoring tools. In collaboration with the ICCA-LRI, we are participating in planning
a workshop focused on developing innovative tools to characterize biologically-relevant
environmental exposures and their implications for health risks. Finally, in collaboration with
the Environmental Bioinformatics STAR center at UMDNJ, we will be presenting a symposium
at the ISES 2009 annual meeting.
       Input to and feedback on ExpoCast™ will be solicited through the ExpoCoP (Exposure
Science Community of Practice) to facilitate integration and collaboration across the broader
scientific community. ExpoCoP includes representatives from the ORD labs and centers, Agency
program offices, other federal government agencies, academia, industry, and environmental
advocacy groups. International representatives also participate in the ExpoCoP.

Milestones/Products:
FY09
• EHP paper with NERL, "Exposure as part of a systems approach for assessing risk"
• Tox Sci Forum paper "Biologically-Relevant Exposure for Toxicity Testing"
• ExpoCoP monthly teleconference, ESC resource, face-to-face meeting at ISES 2009
• ICCA-LRI workshop, "Connecting Innovations in Biological, Exposure and Risk Sciences:
Better Information for Better Decisions"
• SVOC workshop, "Semi-Volatile Organic Compounds (SVOCs) in the Residential
Environment"
• ISES 2009 Annual Conference, "Transforming Exposure Science for the 21st Century"
• Symposium at ISES 2009,  "Integrative Exposure Biology and Computational Toxicology for
Risk Assessment"
• Survey and identify high priority exposure data resources for initial chemical indexing in
collaboration with ACToR and DSSTox
• ExpoCast™ conceptual framework and research plan

FY10
• Workshop to review/evaluate current exposure prioritization tools
• Position paper recommending standards for exposure data representation.
• White paper defining exposure space and plan for assessing exposure data landscape.
• White paper exploring development and application of human exposure knowledgebase.
• Begin implementation of standards across exposure databases of highest interest and utility for
NCCT projects

FY11
• Manuscript describing extant data analyses to identify critical determinants for exposure
classification and chemical prioritization based on potential for exposure.
• Guide incorporation and further development of simple exposure estimation tools within the
ACToR system for use in prioritization.

FY12
• Apply exposure index for prioritization to a subset of ToxCast™ compounds to evaluate the concept.

Keywords  (three to five): Exposure Science, Chemical Prioritization and Toxicity Testing,
Vulnerable Populations, Susceptibility

QA Project Plan: Category III. Quality Assurance (QA) of modeling projects serves at least
two overlapping goals: 1) Verification - Reproducibility of results is essential for the scientific
method, and 2) Continuity - Proper documentation of results allows future researchers (or the
same researcher after a long period of time) to return to a project without excessive amounts of
time spent understanding what was done before. All QA plans are archived in the EPA internal
QA system.

-------
g. Virtual Embryo

Lead/Principal Investigator:  Thomas B. Knudsen, PhD

Research Issue/Relevance:  Research issue: The Virtual Embryo project (www.epa.gov/ncct/v-
Embryo/) is motivated by scientific and regulatory needs to understand mechanisms of
developmental toxicity. A key research issue is to model how the embryo reacts to
environmental chemicals as a 'complex system.' Navigating this complexity requires detailed
knowledge of molecular embryology, data on cellular systems using high-throughput (HTP)
screening approaches, and computational models of network dynamics and multi-cellular
function.

Morphogenesis entails a dynamic tissue flow that is driven by conserved cell signaling pathways
and cellular reaction networks that follow these pathways during stimulus, mutation or injury [1].
Our strategy is to modularize the embryo as a collection of tractable models that represent the
cell as a computational unit [2]. In this strategy, the cell is treated as an autonomous agent that
processes local signals and selects from a repertoire of core behaviors that include growth,
differentiation, mitosis, apoptosis, migration, adhesion, and cell-shape changes. Specific rules for
signal-response are programmed for molecular pathways, cellular dynamics, and unique biology
per morphogenetic system [3].  Sophisticated imaging techniques can reveal complex dynamics
of cell-cell relationships and a 'morphogenetic blueprint' of early development [4].
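
The agent view described above can be sketched in a few lines of Python. This is a minimal illustration only, not the project's implementation: the signal names (`growth_factor`, `morphogen`, `death_signal`), the thresholds, and the ordering of the rules are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """A cell treated as an autonomous agent with a small internal state."""
    size: float = 1.0
    differentiated: bool = False

    def select_behavior(self, signals: dict) -> str:
        """Map local signal levels (0..1) to one core behavior."""
        if signals.get("death_signal", 0.0) > 0.8:
            return "apoptosis"
        if signals.get("morphogen", 0.0) > 0.6 and not self.differentiated:
            return "differentiation"
        if signals.get("growth_factor", 0.0) > 0.5:
            # Divide once the cell has doubled in size, otherwise keep growing.
            return "mitosis" if self.size >= 2.0 else "growth"
        return "adhesion"


def step(cells, signals):
    """Advance every agent by one time step and return the new population."""
    next_cells = []
    for cell in cells:
        behavior = cell.select_behavior(signals)
        if behavior == "apoptosis":
            continue  # the cell is removed from the population
        if behavior == "growth":
            cell.size += 0.5
        elif behavior == "differentiation":
            cell.differentiated = True
        elif behavior == "mitosis":
            cell.size /= 2.0
            next_cells.append(Cell(size=cell.size,
                                   differentiated=cell.differentiated))
        next_cells.append(cell)
    return next_cells
```

A chemical perturbation could be represented in this sketch by altering the `signals` dictionary or the thresholds, and population-level outcomes then follow from repeated calls to `step`.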

Although much is known about molecular signaling networks  that drive morphogenesis,
considerably less is known about the nature of 'higher-order' processes that control collective
cellular behavior [3]. In complex systems, molecular networks can invoke higher-level processes
through signal-input and response-output relationships as determined by the timing and function
of signal strength and dynamic range [5]. We apply this hypothesis in the context of
environmental stressors on embryonic development. Understanding network state relationships
will be required to predict non-linear dose-response relationships and when a breakdown of
higher-order control systems may occur [6]. Cell-based computational models have been used to
predict emergent properties that arise from cooperative transactions of cell groups behaving as a
self-regulating system [7]. Homeostasis, adaptation, and repair are a few examples of emergence
in a perturbed system [8].

Relevance: A new strategy for developmental toxicity testing involves screening multiple
chemicals through cell-based in vitro assays [1]. The goal is to build robust signatures of toxicity
that translate into in vivo predictions [9]. Because tissues are more than a cumulative sum of
individual cell behaviors, computational models can accelerate this effort [7]. A 'virtual tissue'
(VT) representation reconstructs a broad range of biological responses following cell-based rules
and systems-level controls. VT models can rapidly  sweep parameter space following chemical or
genetic perturbation to predict aggregate cellular behaviors and higher-order responses [10].

Purpose/Objective/Impact: Purpose: EPA's Virtual Embryo (v-Embryo™) will comprise a
knowledgebase (VT-KB) of relevant information and a simulation engine (VT-SE) that serves as the front
end to the modeling software. Proof-of-principle work is underway to explore the range of potential
applications where comprehensive simulation is reliable and to find the limit in scale or stages
where studying the real embryo is not cost-effective. Computational models have been built by
others for in silico reconstruction of chondrogenesis [11], gastrulation [12], angiogenesis [13],
and somitogenesis [14]. These models were implemented as hybrid cellular automata using the
open-source CC3D tissue simulation environment [www.CompuCell3D.org], which is being
evaluated for the Virtual Embryo. Initial models will focus on specific morphogenetic systems
that replay important concepts in experimental embryology and that are targets in developmental
toxicity. The end goal is a library of computer-driven simulations that can be manipulated in
silico and correlated with in vitro responses or in vivo phenotypes to predict developmental
toxicity. Applications for developmental toxicity align with EPA's strategy on the future of
toxicity testing [15] and can leverage unique pathway-based data for numerous chemicals tested
in mouse embryonic stem cells (mESC), free-living zebrafish (ZF) embryos, and ToxCast™
assays [16] to:

   •   Simulate key signaling pathways, interlocking genetic networks and cellular dynamics in
       developing tissues;

   •   Model how embryonic cells react as agents to chemical exposure individually, and
       collectively as a complex system;

   •   Analyze emergent behaviors and canalizing influences following stimulus / injury /
       perturbation; and

   •   Understand how this complexity contributes to the differential susceptibility of
       embryonic tissues across chemical, dose, stage, genetic makeup and time.
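
The CC3D environment mentioned above is built around the cellular Potts model, in which cells occupy sets of lattice sites and rearrange through energy-driven copy attempts. The following is a minimal standalone sketch of that technique in plain Python, not CC3D code and not its API; the energy terms, constants, and lattice size are hypothetical choices for illustration.

```python
import math
import random

J_CONTACT = 2.0        # hypothetical adhesion penalty between unlike IDs
LAMBDA_VOLUME = 1.0    # hypothetical strength of the volume constraint
TARGET_VOLUME = 4      # hypothetical target cell size in lattice sites
TEMPERATURE = 1.0      # hypothetical fluctuation amplitude


def neighbors(x, y, width, height):
    """Von Neumann neighbors on a periodic lattice."""
    return [((x + 1) % width, y), ((x - 1) % width, y),
            (x, (y + 1) % height), (x, (y - 1) % height)]


def energy(lattice):
    """Contact energy over neighbor pairs plus a quadratic volume term."""
    height, width = len(lattice), len(lattice[0])
    contact = 0.0
    volumes = {}
    for y in range(height):
        for x in range(width):
            sigma = lattice[y][x]
            volumes[sigma] = volumes.get(sigma, 0) + 1
            # Count each pair once by looking only right and down.
            for nx, ny in (((x + 1) % width, y), (x, (y + 1) % height)):
                if lattice[ny][nx] != sigma:
                    contact += J_CONTACT
    # ID 0 is the surrounding medium and has no volume constraint.
    volume = sum(LAMBDA_VOLUME * (v - TARGET_VOLUME) ** 2
                 for sigma, v in volumes.items() if sigma != 0)
    return contact + volume


def metropolis_step(lattice, rng):
    """Attempt to copy a random neighbor's cell ID into a random site."""
    height, width = len(lattice), len(lattice[0])
    x, y = rng.randrange(width), rng.randrange(height)
    nx, ny = rng.choice(neighbors(x, y, width, height))
    old, new = lattice[y][x], lattice[ny][nx]
    if old == new:
        return False
    before = energy(lattice)
    lattice[y][x] = new
    delta = energy(lattice) - before
    if delta <= 0 or rng.random() < math.exp(-delta / TEMPERATURE):
        return True
    lattice[y][x] = old  # reject the copy and restore the previous ID
    return False
```

Recomputing the full lattice energy on every attempt keeps the sketch short; a practical implementation would evaluate only the local change in energy.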

Objective: The main objective is to build data-rich models that can be used to analyze causal
relationships during environmental and genetic perturbation. One initial prototype will focus on
the powerful sine oculis network that controls early eye development - conserved morphogenetic
pathways [17], patterns of malformation and sensitivity to chemical perturbation [18]. There are
many advantages of focusing on eye malformations and specifically the molecular events related
to Pax6 leading to this endpoint: strong knowledgebase, relevant human phenotypes, and
underlying genetic susceptibility and environmental sensitivity. The underlying molecular
pathways and signaling networks scale to tissue-level developmental effects in many different
systems. A second prototype will focus on the patterning systems controlling early limb-bud
development. The specific objectives are as follows.

1. Build knowledgebase (VT-KB) and front end simulation engine (VT-SE) for developmental
   processes and toxicities.

   Rationale: VT-KB is required to initially store gene-gene and gene-phenotype  associations
   that will be used to provide the rules for cell-based modeling in the VT-SE. The framework
   will rely on information from the literature, data from national repositories (GO, EMAGE,
   GXD, MPO, ZFIN, OMIM) and pathway analysis software. VT-KB design, development and
   implementation are supported through ITS-ESE contract No.: 68-W-04-005, Task Order No.
   058: Technical Support for Development of Developmental Systems Toxicity Network
   (DevToxNet). Perl scripts written for information extraction and assisted curation of relevant
   facts from the scientific literature return an output matrix that is loaded into MySQL. The initial application
   will build gene-gene and gene-phenotype associations during eye and limb development for
   rat, mouse, zebrafish, or human species. VT-SE comprises a front end to the modeling
   software (DDLab, CC3D, C++, Python, Blender3D, GanttPV). An interactive tool is being
   developed with support from the Environmental Modeling and Visualization Laboratory
   under ITS-ESE contract No.: 68-W-04-005, Task Order No. 02: Virtual Embryo: Simulation
   and Visualization Project Management Plan. Initial application of VT-SE is to construct a
   cell-based, network-driven model for lens-retina induction that reconstructs the cellular
   dynamics and morphogenetic blueprint of ocular dysmorphogenesis. This model can be
   quantitative in terms of the degree of severity of chemical-induced defects (graded dose
   response) and relative risk for incidence of responding embryos (quantal dose response).

2.  Construct cell-based computational models for prototype morphogenetic processes and
   embryonic modules.

   Rationale: The second specific aim uses the VT-SE to model modular embryonic systems,
   specific morphogenetic events and their perturbation. The strategy will apply agent-based
   models (ABMs). The initial prototypes (optic cup, limb-bud) have well-characterized signaling
   networks and differential  susceptibility to chemical disruption; other systems will be added
   over time. Both small prototypes are  organized by complex self-regulating networks of signal
   molecules commuted from cell signaling centers and described mathematically as Turing
   gradients. Prevailing models  entail reciprocal induction in which heterotypic interactions
   between presumptive lens epithelium and prospective neural retina lead to formation of the
   optic cup, and interactions between apical ectodermal ridge and underlying mesenchyme
   drive polarized outgrowth of the paddle-shaped limb-bud. Both processes are organized by a
   self-regulating network of genes and signaling gradients (FGFs, BMPs, SHHs). Whereas the
   dual-reciprocating models set the stage for emergence of the optical  neuraxis and
   appendicular skeleton, respectively, they do not explain higher-order processes that  control
   geometry and size of these rudiments, nor do they account for differential susceptibility to
   teratogens [18]. For this purpose, we propose the cellular Potts model (CPM), an extension
   of the large-Q Potts model, implemented in CC3D [19] and managed with Python software as part of the VT-SE.

3.  Specify rules for component interactions of developmental pathways at the cellular and
   molecular scales.

   Rationale: Network structures for regulatory pathways and cellular systems will be portrayed
   using a Boolean (on-off) formalism. A Boolean Network (BN) qualitatively captures system
   behavior probabilistically (PBN) or deterministically (DBN): the former is more biologically
   plausible whereas the latter is a modeling tool of the whole process,  which enables us to
   simulate, analyze, and manipulate different parts of the system. Both models incorporate
   rule-based dependencies for gene-gene and cell-cell interactions that can be built with
   information from the VT-KB. States of the network as determined by wiring and rules will be
   characterized using DDLab software, which identifies stable attractors in complex systems
   [20]. The attractor concept implies that a finite number of stable cell states exist in a complex
   system as pathways of differentiation or canalization. Order and timing are of great importance
   for predictive modeling of genetic errors or cellular disruptions in the embryo caused by phase-
   specific chemical effects. As such, the ability to introduce pre-defined or stochastic lesions
   will enhance the functionality of VT-SE. Virtual Embryo is building a prototype interface
   that manages the order and timing of gene expression and signaling events for Python-based
   components in the VT-SE. This tool is being developed under ITS-ESE contract No.: 68-W-
   04-005, Task Order No. 02: Virtual Embryo - Simulation and Visualization.

4. Analyze abnormal developmental trajectories predicted from cellular modes of action that
   follow chemical perturbations.

   Rationale: A high-fidelity computer program that links cellular processes with network-level
   function can be evaluated for its capacity to  evolve features not explicitly coded in the cell-
   based model. As noted above, this emergence is important for manifesting the response to
   genetic errors and cellular disruptions that are introduced to the model as targets of
   developmental toxicity, based on simulated and experimental data. Such a computational
   model can reveal an interaction of mechanisms at the cellular and molecular scales to
   produce  emergent phenomena that manifest  as abnormal developmental phenotypes [10].
   Rules will derive from simulated data and semi-arbitrary parameters in ocular or related
   systems  since it will take a major experimental effort to model parameters kinetically for all
   relevant  pathways, reactants and interactions. Eventually, such information would be helpful
   to build a quantitative model. CC3D software advancements will be needed to implement
   molecular motors for core cell behaviors and to parallelize this implementation.
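
Objective 3 above describes regulatory networks as Boolean networks whose stable states are attractors, and notes the value of introducing pre-defined lesions. A minimal sketch of the deterministic case follows. It is plain Python rather than DDLab, and the three-gene wiring and rules are hypothetical, chosen only to show a fixed point, a cycle, and the effect of a knockout.

```python
from itertools import product


def update(state):
    """Synchronous deterministic update: every gene reads the previous state."""
    a, b, c = state
    return (
        a or c,        # gene A sustains itself or is induced by C
        a and not c,   # gene B is induced by A and repressed by C
        b,             # gene C is induced by B
    )


def with_knockout(update_fn, index):
    """Return an update rule in which one gene is held off (a pre-defined lesion)."""
    def lesioned(state):
        nxt = update_fn(state)
        return tuple(False if i == index else v for i, v in enumerate(nxt))
    return lesioned


def find_attractors(update_fn, n_genes):
    """Follow every initial state until it revisits a state; collect the cycles."""
    attractors = set()
    for start in product((False, True), repeat=n_genes):
        seen = []
        state = start
        while state not in seen:
            seen.append(state)
            state = update_fn(state)
        cycle = seen[seen.index(state):]
        # Rotate each cycle to a canonical start so it is counted only once.
        k = cycle.index(min(cycle))
        attractors.add(tuple(cycle[k:] + cycle[:k]))
    return attractors


wild_type = find_attractors(update, 3)
lesioned = find_attractors(with_knockout(update, 0), 3)
```

In this toy network the intact wiring supports both a quiescent fixed point and a four-state cycle, while knocking out gene A leaves only the quiescent state. Exhaustive enumeration is feasible only for small networks, since the state space doubles with each added gene.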

Impact: This research aims to improve mechanistic understanding and predictive modeling of
developmental toxicity. Biological models that are simple enough to be computationally
feasible (tractable) and yet complex enough to compute integrated cellular behaviors (rational)
can reveal key events in multi-cellular organization,  classify abnormal developmental trajectories
from genetic network inference, and predict chemical dysmorphogenesis from pathway-level
data. The initial focus on existing data,  with use of the modeling effort to identify data gaps and
to help guide the design of experiments for generating additional data  as needed, can produce
results that provide significant new information  on likely dose-response and time-course
behaviors of developmental toxicants. Most developmental modeling has been qualitative,
showing the link between fundamental  processes and morphogenesis.  Virtual Embryo is moving
to a different, quantitative level through knowledge of molecular embryology and pathway-level
data from high-throughput screening efforts. That resource can have an impact on HTP
hypothesis testing (parameter sweeps) to inform experimental design,  or to dry-run intractable
experiments complicated by time, scale, and cost (monetary, animal). Over the short-term, we
anticipate the work will draw greater attention to integrative thinking,  the application of
computational models to understand mechanisms, and approaches for  uncertainty analysis and
understanding how large uncertainties about parameter values will affect quantitative prediction.
Expanding the prototype models to broader representation of stages, tissues and species will be
an intermediate step towards the more visionary reconstruction of a 'Virtual Embryo'.
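
The dose-response behaviors and parameter sweeps mentioned above can be illustrated with a short sketch that separates a graded response (severity in one embryo) from a quantal response (fraction of embryos affected). The Hill curve, the log-normal tolerance distribution, and all parameter values are assumptions made for illustration, not results or methods from the project.

```python
import math
from statistics import NormalDist


def graded_severity(dose, ec50=10.0, hill=2.0, max_severity=1.0):
    """Graded response: severity of the defect in one embryo (Hill curve)."""
    if dose <= 0:
        return 0.0
    return max_severity * dose ** hill / (ec50 ** hill + dose ** hill)


def quantal_incidence(dose, median_tolerance=10.0, log10_sd=0.3):
    """Quantal response: fraction of embryos affected at this dose.

    Assumes individual tolerances are log-normally distributed.
    """
    if dose <= 0:
        return 0.0
    tolerance = NormalDist(mu=math.log10(median_tolerance), sigma=log10_sd)
    return tolerance.cdf(math.log10(dose))


def sweep(doses, ec50_values, hill_values):
    """Parameter sweep: graded severity for every (ec50, hill, dose) combination."""
    return {
        (ec50, hill, dose): graded_severity(dose, ec50=ec50, hill=hill)
        for ec50 in ec50_values
        for hill in hill_values
        for dose in doses
    }


grid = sweep(doses=[1.0, 10.0, 100.0], ec50_values=[10.0], hill_values=[1.0, 2.0])
```

Sweeping uncertain parameters in this way shows how strongly a predicted response depends on values that have not been measured, which is one simple form of the uncertainty analysis the text anticipates.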

Synopsis: The Virtual Embryo is motivated by the scientific need to understand mechanisms of
toxicity and predict developmental defects from complex datasets. The research goal is to
simulate embryonic tissues reacting to perturbation across chemical class, system, stage, genetic
makeup, dose and time. Data input is detailed knowledge of molecular embryology, high-
throughput data from in vitro models, signaling pathways, and cellular phenotypes. Output
models aim for the modular reconstruction of a developing embryo from cell-based models of
morphogenesis and differentiation.

Milestones/Products:
FY09-10
    1.  project plan: Category III QAPP
    2.  recruit: postdoctoral fellow
    3.  manuscript: application of VT-KB to analyze ToxRefDB developmental toxicity studies
    4.  model: VT-KB based qualitative (structural) model of self-regulating ocular gene
       network
    5.  model: VT-SE based cell-based computational model of lens-retina induction
    6.  manuscript: ocular morphogenesis, gene network inference, analysis and modeling

FY10-11
    1.  project plan: extend lens-retina model to other stages and species
    2.  model: incorporate pathway data from ToxCast™, mESC  and ZF embryos
    3.  manuscript: sensitivity analysis for developmental trajectories and phenotypes
    4.  project plan: integrate with other morphogenetic models (ES cells, Zfish)

FY11-12
    1.  manuscript: test model against predictions for pathway-based dose-response relationship
    2.  manuscript: uncertainty analysis of models for complex systems
    3.  model: computer program of early eye development using rules-based architecture, cell-
       based simulators and systems-wiring diagrams

Keywords (three to five):  embryo development; systems biology; computational modeling

QA Project Plan: Category III. The proposed designation for Virtual Embryo is Quality
Assurance Category III. This designation recognizes its origin as a basic research project
(Category IV) that is moving into proof of concept phase (Category III). A Virtual Embryo
Quality Management Plan (QMP) will be constructed as the project moves into the proof of
concept phase.

Phase-I (development): first-generation ABMs based on the small prototype systems of lens
induction and polarized limb outgrowth (2009-10).

Phase-II (evaluation): sensitivity analysis using data for ToxCast™ chemicals that disrupt  eye
and/or limb development or Tox21 assays of relevant signaling pathways (2010-11).

Phase-III (expansion): uncertainty analysis of quantitative models that simulate reaction to
perturbation across chemical, system, stage, genetic makeup, dose and time (2011-12).

The anticipated Date of Elevation to Category-II is 2012. This is based on the premise that
research enabled by these models will reduce uncertainty in risk assessment for prenatal
developmental toxicity, through understanding of complex mechanisms of environmental
chemicals and their impact on complex developing systems. We also anticipate a successful
Virtual Embryo in the long-term can reduce the reliance on animal testing for prenatal
developmental toxicity. Many of the 10,000 chemicals EPA is concerned with do not have such
information available.

 References

1.  National Research Council, Committee on Toxicity and Assessment of Environmental Agents
   (2007). Toxicity Testing in the Twenty-first Century: A Vision and a Strategy. Washington,
   DC: National Academies Press (http://www.nap.edu/catalog/11970.html).

2.  Thorne BC, Bailey AM,  DeSimone DW and Peirce SM (2008) Agent-based modeling of
   multicell morphogenetic processes during development. Birth Defects Res (part C) 81: 344-
   353

3.  Lewis J (2008) From signals to patterns: space, time, and mathematics in developmental
   biology. Science 322: 399-403

4.  Keller PJ, Schmidt AD, Wittbrodt J and Stelzer EHK (2008) Reconstruction of zebrafish early
   embryonic development by scanned light sheet microscopy. Science 322:  1065-1069

5.  Janes KA, Reinhardt HC and Yaffe MB (2008) Cytokine-induced signaling networks
   prioritize dynamic range over signal strength. Cell 135: 343-354

6.  Andersen ME, Yang RSH, French CT, Chubb LS and Dennison JE (2002) Molecular circuits,
   biological switches, and nonlinear dose-response relationships. Env Hlth Persp 110: 971-978

7.  Knudsen TB and Kavlock RJ (2008) Comparative bioinformatics and computational
   toxicology. In: Developmental Toxicology Volume 3, Target Organ Toxicology Series. (B
   Abbott and D Hansen, editors) New York: Taylor and Francis, Chapter 12, pp 311-360

8.  Basanta D, Miodownik M and Baum B (2008) The evolution of robust development and
   homeostasis in artificial organisms. PLOS Computat Biol 4(3): e1000030

9.  Martin MT, Houck KA, McLaurin K, Richard A and Dix DJ (2007) Linking regulatory
   toxicological information on environmental chemicals with high-throughput screening (HTS)
   and genomic data. The Toxicologist CD - J. Soc Toxicol 96: 219-220

10. Andersen T, Newman R and Otter T (2009). Shape homeostasis in virtual embryos. Artificial
   Life 15(2): 161-183.

11. Chaturvedi R, Huang C, Kazmierczak B, Schneider T, Izaguirre JA, Glimm T, Hentschel HE,
   Glazier JA, Newman SA and Alber MS (2005) On Multiscale approaches to three-dimensional
   modeling of morphogenesis. J R Soc Interface 2: 237-253

12. Cui C, Yang X, Chuai M, Glazier JA and Weijer CJ (2005) Analysis of tissue flow patterns
   during primitive streak formation in the chick embryo. Developmental Biol 284: 37-47

13. Mahoney AW, Smith BG, Flann NS and Podgorski GJ (2008). Discovering novel cancer
   therapies: a computational modeling and search approach. In: IEEE Symposium on
   Computational Intelligence in Bioinformatics and Bioengineering. Sun Valley, ID: CIBCB.

14. Glazier JA, Zhang Y, Swat  M, Zaitlen B and Schnell S (2008) Coordinated action of N-CAM,
   N-cadherin, EphA4, and ephrin B2 translates genetic prepatterns into structure during
   somitogenesis in chick. Curr Topics Devel Biol 81:205-247

15. Future of Toxicity Testing Workgroup, US EPA (2009) The U.S. Environmental Protection
   Agency's Strategic Plan for Evaluating the Toxicity of Chemicals
   http://www.epa.gov/osa/spc/toxicitytesting/

16. Chapin R, Augustine-Rauch K, Beyer B, Daston G, Finnell R, Flynn T, Hunter S, Mirkes P,
   O'Shea KS, Piersma A, Sandler D, Vanparys P and van Maele-Fabry G (2008) State of the art
   in developmental toxicity screening methods and a way forward: a meeting report addressing
   embryonic stem cells, whole embryo culture, and zebrafish. Birth Defects Res (Part B) 83:
   446-456

17. Chow RL and Lang RA (2001) Early eye development in vertebrates. Annu Rev Cell Dev Biol
   17: 255-296

18. Green ML, Singh AV, Zhang Y, Nemeth KA, Sulik KK and Knudsen TB (2007)
   Reprogramming of genetic  networks during initiation of the fetal  alcohol syndrome. Devel
   Dynam 236: 613-631

19. Cickovski TM, Huang C, Chaturvedi R, Glimm T, Hentschel HGE, Alber MS, Glazier JA,
   Newman SA and Izaguirre JA (2005) A framework for three-dimensional simulation of
   morphogenesis. IEEE/ACM Trans Comput Biol Bioinfor 2: 1-15

20. Wuensche A (1996) Discrete dynamics Lab (DDLab) Available from [http://www.ddlab.com]

-------
h. The Virtual Liver Project: v-Liver™

Lead/Principal Investigator:  Imran Shah

Research Issue: The Virtual Liver project (http://www.epa.gov/ncct/virtual liver) is aimed at
providing decision support tools for evaluating chemical-induced adverse liver outcomes across
chemicals, doses and species using in vitro data. Considering nuclear receptor (NR) mediated
liver cancer as an archetypal chronic adverse outcome, we focus on the research issues: Which
molecular circuits and cellular states altered by chemicals lead to cell damage, death and
proliferation? How are these cellular perturbations propagated across tissues as lesions? How can
we organize this complexity computationally to develop Virtual Tissues?

Two important perturbed cellular phenotypes, or states, in carcinogenesis are: (i) initiation, in
which chemical mutagens cause DNA damage, rendering a cell resistant to apoptosis and to inhibition
of cell proliferation; and (ii) promotion, in which mitogenic signals persistently stimulate the
initiated cell, creating focal proliferation. Increasing evidence suggests that the nuclear receptor
(NR) superfamily mediates rodent hepatocarcinogenesis for a number of environmental
chemicals (Butler 1996). For example, di(2-ethylhexyl)-phthalate (DEHP) and perfluorooctanoic
acid (PFOA) are PPAR-α activators (Maloney and Waxman 1999), while pesticides like
conazoles and pyrethroids activate either PXR or both PXR and CAR (Kretschmer and Baldwin
2005). The role of NR mediated activity in molecular circuits is being actively explored through
genomic profiling,  and the dose-dependence of specific molecular switches is being assayed
across hundreds of environmental chemicals in ToxCast™.

Propagating cellular alterations spatially requires information flow between cells, which
normally occurs on a backdrop of microanatomic spatial zones with heterogeneous levels of
nutrients and distinct spatial distribution of intracellular states (Pette and Wimmer 1979;
Oinonen et al. 1998). The microanatomic distribution of xenobiotics causes zonal alterations in
cell states (Kato et al. 2001) that can progress to cell injury and even death. Hepatocyte (HC)
death stimulates neighboring cells to replicate (regenerative proliferation). Necrotic death can
also lead to Kupffer cell (KC) activation, migration and release of inflammatory cytokines,
which can locally accelerate cell injury. There is evidence for such HC-KC interactions in
PPAR-α mediated hepatotoxicity and cancer (Rusyn 1998; Roberts 2007). Mitogens have also
been shown to disrupt gap junction communication between cells (Krutovskikh et al. 1995),
which can reduce their homeostatic capacity. Advanced imaging, histomorphometry (HMP) and
molecular assays are making it feasible to extract local information on cells in a microanatomic
context.

To computationally model this level of biological complexity requires some simplifying
assumptions about  the modular organization of physiologic events across scales.  We hypothesize
a cell-oriented abstraction for developing "Virtual Tissues" with the following assumptions: (a)
tissues can be represented as a complex cellular system;  (b) cells are the unit of function and  can
be modeled as autonomous agents that use molecular circuits to make decisions;  and (c) injury is
a collective response of the multi-agent system to persistent stress.

Relevance:  Current approaches for assessing the risk of adverse effects in humans are based on
animal testing, which is time consuming, resource intensive and fraught with uncertainty. Novel
strategies are necessary to efficiently and effectively evaluate the risk of thousands of
environmental chemicals. Integrative computational systems, in vitro models and assays offer an
avenue for more cost-effective and humane alternatives for the future of toxicity testing. Liver
toxicity is a frequent outcome in rodent testing and it is difficult to evaluate its relevance in
humans.

Purpose: The v-Liver™ will provide in silico decision support tools to: (a) analyze the mode-
of-action in light of available data and prior knowledge, and (b) quantitatively simulate an MOA
at environmentally relevant tissue doses. This will inform/evolve biologically-based dose-
response models to include more relevant physiologic details necessary for predicting the human
risk of injury at low doses.

Objective: The primary objective is to develop an integrated in silico/in vitro framework that:
(a) aids intelligent hypothesis generation about the plausible sequence of molecular, cellular and
tissue events perturbed by a test chemical, and (b) quantitatively simulates the risk of these
events in humans at environmentally relevant tissue doses using in vitro data. The v-Liver™
proof-of-concept (PoC) will focus on a subset of 20 environmental chemicals with known rodent
toxicity in ToxRefDB and in vitro data in ToxCast™. There are two specific goals of the PoC:

I.  The v-Liver™ Knowledgebase (KB).
   Knowledgebased, or semantic, approaches (Karp 2001) are important for computationally
   modeling incomplete and evolving insight on complex processes. They enable integration of
   disparate biological information from literature, -omic data, or pathway databases at different
   scales into a coherent computable representation that is flexible, extensible and transparent. A
   more important advantage of semantic approaches is their support for automated reasoning,
   which is important  for inferring plausible sequences of events perturbed by new chemicals.
   Large-scale knowledgebased  approaches have become feasible due to semantic web
   technology and available ontologies for different levels of biological organization. The v-
   Liver™ Knowledgebase (v-Liver-KB) will represent normal hepatic functions and their
   perturbation by chemical stressors into pathophysiologic states using description logic
   expressed in OWL  and stored in the Sesame semantic repository. To facilitate the
   construction of the  v-Liver-KB, a Cytoscape plugin is being developed to synthesize
   information from different biological databases into OWL using a custom ontology.  This
   system also supports SPARQL-based queries, interactive visualization and information
   export in RDF. Information about the 20 PoC chemicals will be represented in the KB at
   multiple biological levels describing events and causal relationships between these events
   based on evidence from experiments or the literature. The main outputs of the KB will be:
       1. A computable logical description of the molecular circuits and cellular states involved
          in normal hepatocyte and Kupffer cell function, based on the literature.
       2. A computable logical description of perturbations in molecular switches and cell
          states due to PoC chemicals.
       3. Interactive web-based tools to browse, query and explore the v-Liver-KB
          in order to analyze alternative MOAs in light of HTS, omic or other cell-based assays.
       4.  Intelligent inference tools to explore alternative pathways perturbed by chemicals
          based on existing information on partial orders in the KB.
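
The idea of inferring plausible event sequences from causal relationships stored in the KB can be sketched as a path search over a small directed graph. This is a minimal illustration only: the event names and edges below are hypothetical placeholders loosely based on the processes described in this section, not curated KB content, and a plain Python dictionary stands in for the OWL/Sesame representation.

```python
from typing import Dict, List, Optional

# Hypothetical causal edges: each event maps to the events it can lead to.
CAUSES: Dict[str, List[str]] = {
    "NR activation": ["mitogenic signaling"],
    "mitogenic signaling": ["hepatocyte proliferation", "gap junction disruption"],
    "hepatocyte necrosis": ["Kupffer cell activation", "hepatocyte proliferation"],
    "Kupffer cell activation": ["cytokine release"],
    "cytokine release": ["hepatocyte injury"],
}

def event_sequences(start: str, goal: str, graph: Dict[str, List[str]],
                    path: Optional[List[str]] = None) -> List[List[str]]:
    """Return every acyclic causal path from `start` to `goal` (depth-first search)."""
    path = (path or []) + [start]
    if start == goal:
        return [path]
    sequences: List[List[str]] = []
    for next_event in graph.get(start, []):
        if next_event not in path:  # avoid revisiting events (no cycles)
            sequences.extend(event_sequences(next_event, goal, graph, path))
    return sequences

for sequence in event_sequences("hepatocyte necrosis", "hepatocyte injury", CAUSES):
    print(" -> ".join(sequence))
```

In a semantic repository the same question would more naturally be posed as a query over typed relationships; the sketch only shows the underlying reasoning step of chaining causal links into candidate sequences.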

II.  The v-Liver™ Simulator (Sim).
    The spatial model of the hepatic lobule will be developed using a multi-agent system (MAS)
    (Axelrod 1997; Epstein and Axtell 1996; Athale et al. 2005) in which hepatocytes and
    Kupffer cells will be modeled as autonomous agents. Information on molecular circuits in
    the KB will be used to describe chemical-induced molecular perturbations of nuclear
    receptors (NRs), namely CAR, PXR and PPAR-α. We are developing a variation of
    Probabilistic Boolean Networks (Kauffman 1993; Shmulevich et al. 2002) to describe the
    dynamics of individual agent decisions regarding state of stress, injury, cell cycle
    progression, apoptosis, necrosis or migration (KC).  These will be augmented and calibrated
    using available literature and/or ToxCast™ data on  the PoC chemicals. We believe this work
    will advance the two-stage clonal growth models of cancer (Conolly and Andersen 1997) by
    including relevant information on molecular pathways and cell-communication. The agent
    population will be initially situated in a 2-dimensional regular spatial grid to model in vitro
    conditions and a simplified cross-section through the hepatic lobule. Portal to centrilobular
    blood flow will be initially represented as a gradient of nutrients and xenobiotics (estimated
    from organ dose), which can be extended to model more complex flows if necessary.
    Simulating the MAS will generate a spatial distribution of cellular alterations that can be
    interpreted as tissue lesions. Hence, the v-Liver™ Simulator (v-Liver-Sim) will dynamically
    simulate the key molecular and cellular perturbations leading to adverse effects in hepatic
    tissues. The predictions will be evaluated for the PoC chemicals using ToxCast™ data.  The
    main outputs of the v-Liver™ Simulator are:

          •  A large-scale tissue simulation engine to enable quantitative exploration of
              alternative physiologic processes and their histopathologic outcomes.
          •  A computational interface to integrate the tissue simulator with a PBPK modeling
              system to investigate individual exposure / population variability.
          •  Interactive tools to communicate with the simulation engine and to visualize
             results of simulations across chemicals, MOAs, doses and species.
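
The simulator design described above (cell agents on a 2-dimensional grid, a portal-to-centrilobular gradient, and probabilistic Boolean update rules) can be sketched as follows. Everything specific here is an assumption made for illustration: the two state variables, the candidate rules and their probabilities, the linear gradient, and the exposure threshold are hypothetical and are not taken from the v-Liver-Sim design.

```python
import random

# Probabilistic Boolean network: each node has several candidate Boolean
# rules, one of which is selected at random (with the given probability)
# at every update. Rules and probabilities are purely illustrative.
RULES = {
    "stressed": [
        (0.8, lambda s, exposed: exposed),
        (0.2, lambda s, exposed: s["stressed"]),
    ],
    "injured": [
        (0.7, lambda s, exposed: s["injured"] or (s["stressed"] and exposed)),
        (0.3, lambda s, exposed: s["injured"]),
    ],
}

def pbn_step(state, exposed, rng):
    """Synchronously update all nodes of one cell agent from its old state."""
    new_state = {}
    for node, candidates in RULES.items():
        weights = [p for p, _ in candidates]
        rules = [rule for _, rule in candidates]
        chosen = rng.choices(rules, weights=weights, k=1)[0]
        new_state[node] = bool(chosen(state, exposed))
    return new_state

def simulate_lobule(rows, cols, steps, threshold, seed=0):
    """Run the agents on a rows x cols grid; column 0 is the portal side."""
    rng = random.Random(seed)
    grid = [[{"stressed": False, "injured": False} for _ in range(cols)]
            for _ in range(rows)]
    for _ in range(steps):
        for r in range(rows):
            for c in range(cols):
                # Assumed linear gradient: 1.0 at the portal edge, 0.0 centrally.
                concentration = 1.0 - c / (cols - 1)
                exposed = concentration >= threshold
                grid[r][c] = pbn_step(grid[r][c], exposed, rng)
    return grid

grid = simulate_lobule(rows=4, cols=5, steps=20, threshold=0.5, seed=0)
injured_per_column = [sum(grid[r][c]["injured"] for r in range(4)) for c in range(5)]
print("injured cells per column (portal to central):", injured_per_column)
```

The per-column counts play the role of a very crude "lesion" map: injury appears only in the zone where the assumed concentration exceeds the threshold. Cell-cell communication, Kupffer cell migration and proliferation are deliberately left out of this sketch.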

Impact:  The v-Liver™ will impact the future of toxicity testing by providing computational
tools to explore the mode of action for new environmental chemicals using background
knowledge, chemical structure, and/or in vitro assays, and to provide an initial assessment of
hepatic lesion formation. Focusing on 20 NR-activating environmental chemicals and their
hepatic lesions through a subset of molecular pathways will demonstrate the v-Liver™ proof-of-concept
(PoC). The project is also expected to contribute to on-going assessments of pesticides
and persistent toxics by providing useful information about the human relevance of any liver
effects and their putative dose-response. If successful, the Virtual Tissues will be able to leverage
available screening data from ToxCast™, fill any data gaps with targeted studies, and reduce the
time, cost and number of animal studies required.

Synopsis: The v-Liver™ computational paradigm represents tissues as cellular systems in
which discrete individual cell level responses give rise to complex physiologic outcomes. In this
model cell level responses are governed by a self-regulating network of normal molecular
processes and adverse histopathologic effects arise due to chronic stimulation by environmental
chemicals. The v-Liver™ PoC is being developed by: (i) focusing on environmental chemicals
responsible for hepatocarcinogenesis in rodent studies; (ii) organizing mode-of-action (MOA)
knowledge on  the relevant molecular and cellular processes perturbed by these chemicals; (iii)
developing a tissue simulation platform to investigate the uncertainties in MOA and neoplastic
lesion formation; and (iv) evaluating in vitro assays to predict lesion development across
chemicals and  doses.

Partnerships/Collaborations (Internal & External): EPA Collaborations: NCCT  ToxCast™
in vitro assays  and their linkage with physiologic outcomes; NCCT v-Embryo™ and cell level
modeling of tissue responses; NCCT/NHEERL/PBPK modeling to infer internal dose; NHEERL
Genomics Core on MoA for PPAR-α activators; NHEERL PK Branch on hepatic xenobiotic and
T3/T4 metabolism. External Collaborators: UNC Carolina Center for Computational Toxicology;
UMDNJ PBPK Modeling.

Milestones/Products:
FY09
    1. Prioritize proof of concept (PoC) environmental chemicals with clients
    2. KB: Information about PoC chemicals using ToxCast assays
    3. KB: Cytoscape KB visualization and analysis tool
    4. Cell response: Initial molecular circuit describing hepatic cell functions
    5. Tissue Simulator: Develop / use MAS framework

FY10
    1. Tissue Simulator: Test liver lesion formation
    2. Integrate molecular circuits for MOA chemicals in Tissues
    3. Evaluate simulator using PoC chemicals and ToxCast™ data to predict outcomes

FY11
    1. KB inference tool for analyzing MOA for new chemicals/mixtures
    2. Extend lobule simulator to liver and integrate with PBPK model

FY12
    1. Evaluate impact of genomic variation on cellular responses and lesion formation
    2. Evaluate v-Liver™ for simulating human pathology outcomes using  clinical data

Keywords (three to five): Virtual tissues; knowledgebases; mode-of-action modeling; dose-response
modeling; nuclear receptor-mediated hepatocarcinogenesis.

QA Project Plan: Category III. QA of modeling projects serves at least two overlapping goals:
1) Verification - Reproducibility of results is essential for the scientific method, and 2)
Continuity - Proper documentation of results allows future researchers (or the same researcher
after a long period of time) to return to a project without excessive amounts of time spent
understanding  what was done before. Since the v-Liver™ project requires multiple researchers
working over several years, ensuring both continuity of modeling efforts and reproducibility of
modeling results is vital. All QA plans are archived in the EPA internal QA system.

-------
i. Uncertainty Analysis in Toxicological Modeling

Lead/Principal Investigator:  R. Woodrow Setzer

Research Issue/Relevance:  The analysis of uncertainty in toxicological modeling is critical to
the EPA because the Agency is increasing its use of toxicological models in regulatory decisions,
and any use of model predictions in a rational decision process must consider the uncertainty of
those predictions. The recent National Academy report on risk assessment activities in the U.S.
EPA emphasizes the importance of incorporating a quantification of uncertainty in risk
assessments.

In the context of models, the easiest form of uncertainty to address is that about parameter
values, assuming we have the correct model. This is the sort of uncertainty that statistical
methodologies were designed to estimate, and is typically quantified through confidence
intervals or probability distributions. However, we are rarely completely confident about the
models we use, and often the uncertainty about the underlying processes taking place or the best
way to characterize those processes in a model can be quite substantial. This form of uncertainty
is prevalent throughout dose-response analysis, from simple empirical modeling used in
benchmark dose analysis, to pathway modeling used in virtual tissues, and is called model
uncertainty.

Both forms of uncertainty are quantified through comparisons of models with data, by
quantifying the degree to which different parameter values for a given model, and the best-fitting
parameter values among different models, yield model predictions that are consistent with the
data. "Consistency" is quantified through the mediation of an additional, statistical, model that
relates the biological model to the data by describing the variability in the data as it is affected by
the biological model.
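
The comparison described above can be made concrete with a small sketch. It assumes a Gaussian measurement-error model as the "additional, statistical, model", uses made-up dose-response data, and compares two hypothetical biological models (linear and saturating) by grid search. The Akaike information criterion (AIC) is used here only as one familiar way to compare best-fitting models; the project text does not specify a criterion.

```python
import itertools
import math

def gaussian_log_likelihood(observed, predicted, sigma):
    """Statistical model: each observation = model prediction + N(0, sigma^2) error."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2) - (y - mu) ** 2 / (2 * sigma ** 2)
        for y, mu in zip(observed, predicted)
    )

def linear(dose, slope):
    return slope * dose

def saturating(dose, vmax, km):
    return vmax * dose / (km + dose)

def best_fit(model, grids, doses, observed, sigma):
    """Grid search: parameter uncertainty is explored within one model form."""
    best_params, best_ll = None, -math.inf
    for params in itertools.product(*grids):
        predicted = [model(d, *params) for d in doses]
        ll = gaussian_log_likelihood(observed, predicted, sigma)
        if ll > best_ll:
            best_params, best_ll = params, ll
    return best_params, best_ll

def aic(log_likelihood, n_params):
    return 2 * n_params - 2 * log_likelihood

# Made-up data for illustration only.
doses = [0.0, 1.0, 2.0, 4.0, 8.0]
observed = [0.0, 1.9, 3.1, 4.2, 4.9]
sigma = 0.3  # assumed known measurement error

slope_grid = [0.1 + 0.05 * i for i in range(29)]
vmax_grid = [4.0 + 0.5 * i for i in range(9)]
km_grid = [0.5 + 0.5 * i for i in range(8)]

lin_params, lin_ll = best_fit(linear, [slope_grid], doses, observed, sigma)
sat_params, sat_ll = best_fit(saturating, [vmax_grid, km_grid], doses, observed, sigma)

# Model uncertainty: compare the best-fitting members of each model form.
lin_aic, sat_aic = aic(lin_ll, 1), aic(sat_ll, 2)
print("linear:", lin_params, round(lin_aic, 1))
print("saturating:", sat_params, round(sat_aic, 1))
```

Varying the parameters within one model form addresses parameter uncertainty, while the comparison between the two best fits addresses model uncertainty; both rest on the same likelihood.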

In principle this process  is straightforward, and there are standard statistical  procedures for
carrying the process out. However, toxicological models comprise a wide array of modeling
techniques, for example, simple algebraic expressions, systems of ordinary or partial differential
or difference equations, and may involve agent-based or stochastic process models. Such models
can be quite complex, with many parameters whose values are known with varying degrees of
uncertainty. Statistical models usually need to accommodate multiple hierarchical levels of
variability: variation among studies and among individuals in addition to the usual measurement
error, which should be allowed to differ among  studies and endpoints. Information for estimating
parameters and evaluating models themselves may come from well-characterized experimental
data as well as from tabulated values (for example, physiological parameters like organ weights) or
computed values (for example, computed partition coefficients for a physiologically-based
pharmacokinetic, or PBPK, model).

Thus, despite the existence of sound statistical theory, the application of good statistical practice
for these models can be difficult, requiring thoughtful application of both statistical and
computational expertise  and quite a bit of 'art' to get good results. The key challenges in
uncertainty analysis for toxicological models in risk assessment are to develop computationally
efficient, transparent, and statistically valid approaches that may be implemented without the
development of extensive case-specific programming.

Purpose/Objective/Impact: The objective of this project is to develop tools and best practices
to facilitate the quantification of uncertainty in toxicological models. Early efforts will focus on
PBPK models, specifically models being developed as part of a joint project between the
Agency's Office of Research and Development and Office of Pesticide Programs to develop a
cumulative risk assessment for the pyrethroid pesticides. However, similar methods are
applicable to other forms of toxicological models, and examples will be pursued in the virtual
tissue projects in NCCT, and in evaluating uncertainty in the ToxCast predictors.

In recent years,  it has become clear that Bayesian statistical methods are ideal for estimating
parameters for PBPK models, because of the ease with which partial information about
parameters is included, through informative priors; the ease with which hierarchical models are
constructed; and the fact that, as long as informative priors are used, lack of identifiability of
model parameters, which is not generally possible to diagnose a priori, is not an impediment to
completing a valid analysis (failure of identifiability of parameters may be diagnosed from
analyses of the posterior parameter distribution). The same logic should apply to other
toxicological models. Thus, this project will center on tools to apply Bayesian methods to such
models.
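
A minimal sketch of this Bayesian approach is given below. It is not a PBPK model: a one-compartment elimination model stands in for the differential-equation system, the data are noise-free synthetic values, and the prior, error model, and sampler settings are all assumptions chosen for illustration. The sampler is a plain random-walk Metropolis algorithm, one of the simplest MCMC methods.

```python
import math
import random

def concentration(t, dose, volume, k_elim):
    """One-compartment stand-in for a PBPK model: C(t) = (dose/volume) * exp(-k*t)."""
    return dose / volume * math.exp(-k_elim * t)

def log_posterior(log_k, times, observed, dose, volume, sigma, prior_mean, prior_sd):
    """Unnormalized log posterior for log(k) with an informative normal prior."""
    log_prior = -0.5 * ((log_k - prior_mean) / prior_sd) ** 2
    k_elim = math.exp(log_k)
    # Log-normal measurement error on concentrations (assumed).
    log_like = sum(
        -0.5 * ((math.log(y) - math.log(concentration(t, dose, volume, k_elim))) / sigma) ** 2
        for t, y in zip(times, observed)
    )
    return log_prior + log_like

def metropolis(log_post, start, n_samples, step, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target."""
    rng = random.Random(seed)
    current, current_lp = start, log_post(start)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples

TIMES = [0.5, 1.0, 2.0, 4.0, 8.0]
# Synthetic, noise-free observations generated with k = 0.5 (illustration only).
OBSERVED = [concentration(t, 10.0, 2.0, 0.5) for t in TIMES]

def log_post(log_k):
    # Prior centered on k = 0.3, deliberately offset from the value behind the data.
    return log_posterior(log_k, TIMES, OBSERVED, 10.0, 2.0, 0.1, math.log(0.3), 0.5)

samples = metropolis(log_post, start=math.log(0.3), n_samples=5000, step=0.05)
kept = [math.exp(s) for s in samples[1000:]]  # discard burn-in
posterior_mean_k = sum(kept) / len(kept)
print(f"posterior mean of k: {posterior_mean_k:.3f}")
```

With informative data the posterior concentrates near the value used to generate the observations despite the offset prior. With sparse or uninformative data the posterior would instead stay close to the prior, which is how a lack of identifiability would show up in the posterior distribution.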

More specific objectives of this project are:
    •  Develop the computational tools to carry out Bayesian analyses of large dynamic models
       as efficiently as possible. The standard approach to evaluating the posterior in a Bayesian
       analysis of complex models is to generate random samples from the distribution by
       Markov Chain Monte Carlo (MCMC) methods. For models that are expensive to
       compute, such as those (like PBPK models) expressed as solutions to systems of
       differential equations, an implementation that uses cluster computing may
       exploit some parallel structures in the problem. This objective will develop a
       standard computational approach to parallelizing such problems, and will explore
       alternative approaches to implementing MCMC methods with an eye towards
       computational efficiency. This objective also includes the development of modeling tools
       to facilitate dynamic modeling in the statistical language R.
    •  Describe a language specifically for expressing PBPK models. The language should be
       extensible, be capable of incorporating systems models expressed in SBML, and should
       use semantics specific to PBPK models to facilitate model checking. The language would
       be ideal as a way to archive PBPK models and as the definitive way to communicate
       PBPK models in the literature.
    •  Adapt statistical model evaluation approaches to complex toxicological models, and
       develop examples to demonstrate the behavior of different model evaluation
       methodologies in the face of various model failures.
    •  Develop a general approach to developing priors for chemical-specific PBPK model
       parameters. There are already methods for computing chemical-specific parameters either
       from physical chemical properties or in vitro assays (depending upon the parameter). For
       this objective, data sets of measured chemical-specific parameters will be compared to
       values predicted from computational or in vitro methods. Statistical methods such as
       regression will be used to adjust the predictions, and the variance about the resulting
       regression lines used to characterize the prior uncertainty about such predictions.
    •  Develop approaches for quantifying uncertainty of ToxCast-like predictors involving
       HTS data as inputs, explicitly evaluating the importance of variability and design of HTS
       assays on prediction and prioritization uncertainty.
    •  Develop examples for parameter estimation, model evaluation and overall uncertainty
       analysis drawn from PBPK models for pyrethroid pesticides and the encompassing cumulative
       risk analysis being conducted in collaboration with NHEERL, NERL, and OPP; from selecting
       among molecular and cellular models used in developing a virtual liver model; and from others
       developed in collaboration with the virtual tissues projects in NCCT.
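
The regression-based construction of priors described in the objectives above can be sketched as follows. The paired values are invented for illustration (they are not measured or predicted partition coefficients), and the sketch makes a simplifying assumption: the prior standard deviation is taken to be the residual standard deviation alone, ignoring uncertainty in the fitted regression coefficients.

```python
import math
import statistics

def fit_calibration(predicted, measured):
    """Ordinary least squares of measured on predicted values.

    Returns (slope, intercept, residual standard deviation).
    """
    mean_x = statistics.fmean(predicted)
    mean_y = statistics.fmean(measured)
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, measured))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(predicted, measured)]
    resid_sd = math.sqrt(sum(r ** 2 for r in residuals) / (len(predicted) - 2))
    return slope, intercept, resid_sd

def prior_for(new_prediction, slope, intercept, resid_sd):
    """Normal prior (mean, sd) for a chemical that has only a predicted value."""
    return intercept + slope * new_prediction, resid_sd

# Hypothetical paired values on a log10 scale (illustration only).
predicted_values = [0.0, 1.0, 2.0, 3.0]
measured_values = [0.2, 1.1, 2.3, 3.2]

slope, intercept, resid_sd = fit_calibration(predicted_values, measured_values)
prior_mean, prior_sd = prior_for(1.5, slope, intercept, resid_sd)
print(f"prior for new chemical: mean {prior_mean:.2f}, sd {prior_sd:.3f}")
```

The regression line adjusts the computational prediction toward what has been observed for other chemicals, and the scatter about the line supplies the prior spread; such a prior could then be passed to a Bayesian analysis of the PBPK model.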

Synopsis: This plan will target three areas: standardizing and making more efficient
computational approaches for parameter estimation and model selection; standardizing
approaches for model evaluation for PBPK and other dynamic models; and development of
methods for constructing priors (probabilistic summaries of current knowledge) for model
parameters, based on existing computational methods and data sets. The initial motivation for
this work was the need to standardize and make more sophisticated parameter estimation and
model evaluation for PBPK models, and that emphasis will continue in the early phase of this
project. However, all models relevant  to toxicological risk assessment have similar
requirements, and this project will coordinate closely with the virtual tissue and dose-response
projects.

Partnerships/Collaborations (Internal & External):  Internal: Jimena Davis (post doc);
Richard Judson, John Wambaugh, Imran Shah, Thomas Knudsen.
External: ORD/NHEERL: Mike Hughes, Kevin Crofton, Tim Shafer, Ginger Moser, Rory
Conolly; ORD/NERL: Rogelio Tornero, Valerie Zartarian,  Xianping Xue; OPPTS/OPP: Anna
Lowit,  David Miller, Ed Scollon; NIEHS/NTP: Mike DeVito.

Milestones/Products:
FY09
    1.  Contribution to 2009 SOT CED course on "Uncertainty and Variability in PBPK
       Models".
   2.  Submission of ms(s) on improvement of computational efficiency in Bayesian  analyses
       of PBPK models using MCMC, and assessment of convergence (Wambaugh, Davis,
       Garcia, Setzer).
   3.  Submission of ms on assessing  fit of PBPK models  (and, through example, other
       complex mechanistic models), to data, whether the models are 'fit' to the data using
       Bayesian or other methods, or are parameterized a priori from in vitro data (Wambaugh,
       Davis, Garcia, Setzer).
   4.  Submission of ms reviewing approaches for constructing priors for PBPK model
       parameters (that is, establishing a priori estimates for model parameters in advance of
       using in vivo PK data, with a characterization of uncertainty) (Davis, Setzer, Tornero,
       DeVito).
   5.  Draft ms on the estimation of model parameters and comparison of alternative  model
       forms for a PBPK model for permethrin (in preparation of SAP review in FY10) (Davis,
       Setzer, Tornero, DeVito)
   6.  Draft ms on global sensitivity analysis for exposure-dose model for permethrin, in
       preparation for SAP review in FY10 (Davis, Setzer, Tornero, Zartarian, Chiu).
   7.  Estimation of PBPK parameters for deltamethrin and two other pyrethroids
       complete (Davis, Setzer, Tornero).
FY10
   1.  MS on permethrin PBPK parameter estimation and model comparison submitted.
   2.  MS on global sensitivity analysis for permethrin exposure-dose model submitted.
   3.  SAP review of permethrin PBPK exposure-dose model.
   4.  Submission of ms on parameter estimation for deltamethrin and two other pyrethroids.
   5.  Completion of exposure-dose-effect model for 'mini-cumulative' risk assessment and
       draft document for SAP review in early FY11, including global 'exposure to effect'
       sensitivity analysis (Davis, Setzer, Tornero, DeVito, Crofton, Shafer, Lowit, Scollon,
       Miller, ...)
   6.  Substantial completion of modeling and uncertainty analysis for vLiver (Setzer, Shah,
       Wambaugh)
   7.  R package "RDynamic", for simplifying dynamic modeling in the statistical language R,
       submitted to CRAN.
   8.  Description of ontologically-aware PBPK language, working name 'SemanticPK'
       drafted, and translator to R completed (using code developed for RDynamic).
   9.  Discussion of uncertainty about ToxCast phase I predictions at 2010 SOT.
   10. Problem identification and early stages of evaluating alternative pathway formulations in
       vLiver model (Setzer, Shah, Wambaugh).
   11. Problem identification, timelines, and early stages of evaluating alternative model
       formulation for BBDR and virtual embryo models completed (Setzer, Conolly, Knudsen).
FY11
    1.  SAP on 'mini-cumulative' risk assessment for pyrethroids
    2.  Preparation for SAP for cumulative risk assessment for pyrethroids.
    3.  Publication of virtual tissue and BBDR modeling results.
    4.  Initial steps in constructing test datasets for analysis "competition" recommended by
       UVPKM.
    5.  Submission of ms on "RDynamic" to the Journal of Statistical Software.

FY12
    1.  Submission of ms on 'SemanticPK'.
    2.  R package "RDynamic", for simplifying dynamic modeling in the statistical language R,
       submitted to CRAN.

Keywords (three to five): uncertainty analysis; physiologically based pharmacokinetic models;
Statistics; Bayes methods; prior; Markov Chain Monte Carlo

QA Project Plan: Category III. This project is assigned QA Category III because of the
significance of the pyrethroid cumulative risk assessment, to which this plan contributes.
Several quality objectives apply to this project: 1) Computer code developed for the project must
faithfully execute the intent of the code, whether that intent is described mathematically, or in
terms of other software (for example, implementations of the same model in different
programming languages must give identical results for identical inputs); 2) Distribution versions
of software packages must be installable and useable by a reasonably sophisticated person; 3)
Analyses must be transparent: it must be possible to replicate all analyses from the archived files
and information; and 4) Data sources must be transparent; in particular, data from the literature
must be annotated to be adequate for purpose, and extent of literature searches must be
documented.  All QA plans are archived in the EPA internal QA system.
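
The first quality objective above (different implementations of the same model must agree for the same inputs) can be illustrated with a small consistency check. This is only a sketch under stated assumptions: the one-compartment elimination model, the function names, the test cases, and the relative tolerance are all hypothetical and are not taken from the project's code. Because one implementation is a closed-form solution and the other is a numerical integration, agreement is checked to within a tolerance rather than bit-for-bit.

```python
import math


def conc_analytic(dose, volume, k_elim, t):
    """Closed-form concentration for a one-compartment bolus model (illustrative)."""
    return (dose / volume) * math.exp(-k_elim * t)


def conc_rk4(dose, volume, k_elim, t, n_steps=1000):
    """Same model, integrated with fixed-step fourth-order Runge-Kutta."""
    conc = dose / volume          # initial concentration at t = 0
    h = t / n_steps               # step size

    def rate(c):
        # dC/dt = -k_elim * C
        return -k_elim * c

    for _ in range(n_steps):
        k1 = rate(conc)
        k2 = rate(conc + 0.5 * h * k1)
        k3 = rate(conc + 0.5 * h * k2)
        k4 = rate(conc + h * k3)
        conc += (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return conc


def implementations_agree(cases, rel_tol=1e-8):
    """True if both implementations agree within rel_tol for every input case."""
    return all(
        math.isclose(conc_analytic(*case), conc_rk4(*case), rel_tol=rel_tol)
        for case in cases
    )


# Hypothetical input cases: (dose, volume, elimination rate, time).
CASES = [
    (100.0, 10.0, 0.3, 24.0),
    (50.0, 5.0, 0.1, 1.0),
    (100.0, 10.0, 0.3, 0.0),
]

if __name__ == "__main__":
    print("implementations agree:", implementations_agree(CASES))
```

A check of this kind could be archived with the analysis files so that the comparison itself is replicable, in line with the third objective.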
IV. APPENDICES cont. -2. Project Outcomes Table
1. Providing High Throughput Computational Tools for the Identification of Chemical Exposure, Hazard and Risk

Project Title: ACToR - Aggregated Computational Toxicology Resource

Outputs/Outcomes FY09
• Initial public deployment.
• Significant version 2, including refined chemical structure information.
• Develop workflow for tabularization of data buried in text reports.
• Integrate all ToxCast and ToxRefDB data.
• Quarterly releases with new data.

Outputs/Outcomes FY10
• Quarterly releases with new data.
• Implementation of a process to gather tabular data on priority chemicals from text reports.
• Survey sources of chemical use and exposure data and import any remaining sources.
• Develop flexible query interface and data download process.
• Develop process to extract data from open literature.

Outputs/Outcomes FY11
• Quarterly releases with new data.

Outputs/Outcomes FY12
• Quarterly releases with new data.

Expected Impacts
Enables access to cross-chemical data by EPA program offices, NCCT, and other ORD organizations and external stakeholders. Improves EPA data transparency.

Project Title: DSSTox - Distributed Structure-Searchable Toxicity Database Network

Outputs/Outcomes FY09
• MS: SAR Perspective of ToxCast 320 chemical inventory.
• MSs: NCBI (GEO) and EBI (ArrayExpress) structure-annotations and linkages to microarray data.
• Restart Chemoinformatics CoP.
• Publish files for ToxRefDB and ToxCast inventories and selected summary endpoints, and facilitate publication and linkage to PubChem.
• Publish public genetic toxicity data and SAR predictions for ToxCast 320.
• Continue expansion of DSSTox public toxicity database inventory.
• Primary chemical review and structure annotation of ToxCast/Tox21 libraries within a central registry.

Outputs/Outcomes FY10
• Publish files for Tox21 inventory and selected summary endpoints and facilitate linkage to PubChem.
• Publish files for NTP study areas.
• Explore new approaches to SAR based on feature categories.
• Expand CEBS collaboration to incorporate DSSTox GEO and ArrayExpress files and create chemical linkage to ILSI Developmental Toxicity database.
• Assist efforts within ExpoCast regarding chemical annotation.

Outputs/Outcomes FY11
• Establish procedures and protocols for automating chemical annotation of new experimental data generated by NCCT and in collaboration with CEBS or NHEERL.
• Document and employ PubChem analysis tools in relation to published DSSTox and ToxCast data.
• Collaborate with SAR modeling efforts to predict ToxCast endpoints.
• Continue expansion of DSSTox public toxicity database inventory for use in modeling with co-publication and linkage to ACToR and PubChem.

Outputs/Outcomes FY12
• Redesign DSSTox website to provide hosting of donated chemical descriptors, properties and predictions.
• Publish master tables of DSSTox IDs and high quality structures.
• Promote use of chemical registry system from ToxCast/Tox21 more broadly within EPA.
• Collaborate with SAR modeling efforts to expand modeling to address Tox21 chemicals and endpoints.
• Continue expansion of DSSTox public toxicity database inventory into new toxicity and exposure areas.

Expected Impacts
Adoption of DSSTox chemical standards more broadly within EPA and across other government Agencies to improve quality and read-across capabilities.
Promote public data dissemination and encourage greater public participation of industry and commercial sources in public toxicity database and modeling efforts.
Facilitate improved toxicity prediction models and data mining capabilities across wider span of endpoints and chemicals impacting Hazard ID and risk assessment.

Project Title: ToxRefDB

Outputs/Outcomes FY09
• MSs: Chronic/cancer, multigeneration and developmental modules.
• Release of stand-alone data entry tool.
• ToxRefDB webpage online.
• Collection of ToxCast Phase II chemical toxicity data.
• Public release of ToxRefDB web-based query tool.
• Complete entry of targeted set of chemicals and study types for Phase II of ToxCast.
• Complete reproductive toxicity study retrospective analysis.

Outputs/Outcomes FY10
• Quarterly releases with new data in conjunction with ACToR.
• Implementation of a process to gather and enter open literature studies.
• Expansion of ToxRefDB to capture DNT studies and EDSP data.
• Complete retrospective analyses on other major study types.
• Release of ToxRefDB live data entry tool.

Outputs/Outcomes FY11
• Quarterly releases with new data in conjunction with ACToR.

Outputs/Outcomes FY12
• Quarterly releases with new data in conjunction with ACToR.

Expected Impacts
Enables access to traditional toxicological data in a structured and computable format, extending the utility of the data beyond chemical risk assessment and into broad research applications, including guiding novel toxicity characterization methods and transparent data-driven retrospective analyses leading to refined animal use, a more predictive toxicology paradigm, and more efficient chemical safety assessments.

Project Title: Chem Model - The Application of Molecular Modeling to Assessing Chemical Toxicity

Outputs/Outcomes FY09
• MS: Capability of the target-toxicant paradigm to identify chemicals that bind weakly to the estrogen receptor, including a description of the method.
• Description of the library of 148 biological macromolecule targets.
• MS: Molecular modeling studies of the potential biological effects of the perfluoro compounds.

Outputs/Outcomes FY10
• MS: The metabolism of pyrethroids and the effects of three dimensional chemical structures.
• Description of additional targets added to the target library.
• MS: The interaction of ToxCast chemicals with nuclear receptor targets and the importance of pharmacophore filters.

Outputs/Outcomes FY11
• MS: The integration of results from the target library and available experimental parameters.
• MS: The comparisons of results with and without pharmacophore filter for ToxCast chemicals.

Outputs/Outcomes FY12
• Comparison of results using the target library with experimentally determined activities.
• MS: The use of molecular modeling and the target toxicant paradigm for regulatory purposes, including a discussion of the OECD principles relative to molecular modeling specific principles.

Expected Impacts
An approach will be provided to prioritize chemicals for their ability to influence the endocrine system by competing with natural ligands for the binding sites of receptors. The application of a method used to find strong-acting, drug-like chemicals will be evaluated for its capability of discovering weakly active chemicals. The capability of these methods will be used for finding weakly active chemicals. Parameters based on the capacity of a chemical to interact with macromolecular targets for toxicity will be available for applications of computational methods to screen for or eventually predict chemical toxicity.

Project Title: ToxCast™

Outputs/Outcomes FY09
• Completion of ToxCast Phase I.
• Provide Phase I data sets to public.
• Derivation of predictive signatures from ToxCast Phase I data.
• MSs: ToxCast Phase I data sets.
• MS: Signature generation.
• MS: NR pathways and toxicity.
• Convene first ToxCast Data Summit for identifying prediction models.
• Finalize selection of chemicals for Phase II of ToxCast and Tox21.

Outputs/Outcomes FY10
• MSs: Describing approaches to combining exposure, PK, and in vitro assays to do risk prioritization.
• Evaluate compatibility of nanomaterials of diverse classes with ToxCast assays.
• Prioritize and select assays to be run for ToxCast Phase II.
• Completion of ToxCast Phase II data collection.

Outputs/Outcomes FY11
• Confirmation of ToxCast predictive signatures with Phase II data.
• MSs: Signature confirmations and applications.

Outputs/Outcomes FY12
• MSs: Profiling of large chemical sets for potential for hazard.

Expected Impacts
Program Offices will have tools to prioritize chemicals for targeted toxicity testing as well as insight into potential mechanisms of toxicity.

Project Title: ExpoCast™: Exposure science for screening, prioritization, and toxicity testing

Outputs/Outcomes FY09
• ExpoCoP monthly teleconference, ESC resource, face-to-face meeting at ISES 2009.
• MSs: EHP and ToxSci on the role of exposure in transforming toxicology.
• SVOC workshop, "Semi-Volatile Organic Compounds (SVOCs) in the Residential Environment".
• Survey and identify high priority exposure data resources for initial chemical indexing in collaboration with ACToR and DSSTox.
• ExpoCast conceptual framework and research plan.

Outputs/Outcomes FY10
• Position paper recommending standards for exposure data representation.
• White paper defining exposure space and plan for assessing exposure data landscape.
• White paper exploring development and application of human exposure knowledge base.
• Begin implementation of standards across exposure databases of highest interest and utility for NCCT projects.
• Award contracts.

Outputs/Outcomes FY11
• MS: describing extant data analyses to identify critical determinants for exposure classification and chemical prioritization based on potential for exposure.
• Guide incorporation and further development of simple exposure estimation tools within the ACToR system for use in prioritization.

Outputs/Outcomes FY12
• Apply exposure index for prioritization to subset of ToxCast compounds to evaluate concept.

Expected Impacts
Advance Agency tools for efficiently characterizing and classifying chemicals based on potential for biologically-relevant exposure; inform characterization of environmentally-relevant toxicity.

Project Title: Virtual Embryo (v-Embryo™) - The Virtual Embryo Project

Outputs/Outcomes FY09
• MS: application of VT-KB to analyze ToxRefDB developmental toxicity studies.
• VT-KB based qualitative (structural) model of self-regulating ocular gene network.
• VT-SE based cell-based computational model of lens-retina induction.
• MS: ocular morphogenesis, gene network inference, analysis and modeling.

Outputs/Outcomes FY10
• Extend lens-retina model to other stages and species.
• Incorporate pathway data from ToxCast, mESC and ZF embryos.
• MS: Sensitivity analysis for key biological pathways.
• MS: Developmental trajectories and phenotypes in computational models.
• Integrate with other morphogenetic models (ES cells, Zfish).

Outputs/Outcomes FY11
• MS: Test of model against predictions for pathway-based dose-response relationship.
• MS: Uncertainty analysis of models for complex systems model.
• Model: Computer program of early eye development using rules-based architecture, cell-based simulators and systems-wiring diagrams.

Outputs/Outcomes FY12
• MS: Integrated eye morphogenesis.
• Integration of computational models of different systems.
• Evaluation of integrative model with data from ES cells, Zfish.

Expected Impacts
Framework for in silico reconstruction of the embryo to facilitate navigation of complex relationships and predict systems-level behavior (outcome) from data on biochemical, molecular and cellular changes.

Project Title: Virtual Liver (v-Liver™) - The Virtual Liver Project

Outputs/Outcomes FY09
• Prioritize proof of concept (PoC) environmental chemicals with clients.
• Knowledge Base (KB): Information about PoC chemicals using ToxCast assays.
• KB: Cytoscape KB visualization and analysis tool.
• Cell response: Initial molecular circuit describing hepatic cell functions.
• Tissue Simulator: Develop/use MAS framework.

Outputs/Outcomes FY10
• Tissue Simulator: Test liver lesions formation.
• Integrate molecular circuits for MOA chemicals in tissues.
• Evaluate simulator using PoC chemicals and ToxCast data to predict outcomes.

Outputs/Outcomes FY11
• KB inference tool for analyzing MOA for new chemicals/mixtures.
• Extend lobule simulator to liver and integrate with PBPK model.

Outputs/Outcomes FY12
• Evaluate impact of genomic variation on cellular responses and lesion formation.
• Evaluate v-Liver for simulating human pathology outcomes using clinical data.

Expected Impacts
Proof-of-Concept decision support tools to enable tiered-testing:
a) Reduce uncertainty in evaluating the effect of chemicals on normal hepatic pathways.
b) Estimate dose-dependent adverse hepatic effects in individuals and variability in populations.

Project Title: Uncertainty - Uncertainty Analysis in Toxicological Modeling

Outputs/Outcomes FY09
• MS: Improvement of computational efficiency in Bayesian analyses of PBPK models using MCMC.
• MS: Assessing fit of PBPK models to data using Bayesian or other methods.
• MS: Reviewing approaches for constructing priors for PBPK model parameters.
• MS: Estimation of model parameters and comparison of alternative model forms for permethrin PBPK model.
• MS: Global sensitivity analysis for exposure-dose model for permethrin.
• Estimation of PBPK parameters for deltamethrin and two other pyrethroids complete.

Outputs/Outcomes FY10
• MS: permethrin PBPK.
• MS on global sensitivity analysis for permethrin exposure-dose model submitted.
• MS: Parameter estimation for deltamethrin and two other pyrethroids.
• Completion of exposure-dose-effect model for 'mini-cumulative' risk assessment.
• Substantial completion of uncertainty analysis for vLiver.
• Description of ontologically-aware PBPK language and translation to R.
• Problem identification for evaluating alternative pathway formulations in vLiver model.
• Problem identification evaluating alternative model formulation for BBDR and virtual embryo models.

Outputs/Outcomes FY11
• SAP on 'mini-cumulative' risk assessment for pyrethroids.
• Preparation for SAP for cumulative risk assessment for pyrethroids.
• Publication of virtual tissue and BBDR modeling results.
• Initial steps in constructing test datasets for analysis "competition" recommended by UVPKM.
• Submission of ms on "RDynamic" to the Journal of Statistical Software.

Outputs/Outcomes FY12
• Submission of ms on 'SemanticPK'.
• R package "RDynamic", for simplifying dynamic modeling in the statistical language R, submitted to CRAN.

Expected Impacts
Improved cumulative risk assessment for pyrethroid pesticides, based on realistic quantitative assessment of uncertainties.

B. Extramural STAR Centers Projects

1.  The Research Center for Environmental Bioinformatics and Computational Toxicology at the
University of Medicine & Dentistry of New Jersey (UMDNJ), Piscataway, brings together a
team of computational scientists with diverse backgrounds in bioinformatics, chemistry and
environmental science, from UMDNJ, Rutgers, and Princeton Universities, and the US Food and
Drug Administration's Center for Toxicoinformatics. The team is addressing multiple elements
of the source-to-outcome sequence for toxic pollutants as well as developing tools for toxicant
characterization. The computational tools developed through this effort will be extensively
evaluated and refined through collaboration between STAR  Center scientists as well as with
colleagues from the three universities and the EPA. Particular emphasis is on methods that
enhance current risk assessment practices and reduce uncertainties. Researchers are also
developing a web accessible Environmental Bioinformatics Knowledge Base that will provide a
user-oriented interface to an extensive set of information and modeling resources.

2.  The Carolina Environmental Bioinformatics Research Center at the University of North
Carolina, Chapel Hill, is developing new analytic and computational methods, creating efficient
user-friendly tools to disseminate the methods to the wider community, and applying the
computational methods to molecular toxicology and other studies. The Center brings together
multiple investigators and disciplines, combining expertise in biostatistics, computational
biology, chemistry, and computer science to  advance the field of Computational Toxicology.
Researchers focus on providing biostatistical support to the Center by performing analyses and
developing new methods in Computational Biology. The Center is also creating a framework for
merging data from various technologies in a  systems-biology approach.

3.  The Carolina Center for Computational Toxicology at the University of North Carolina, Chapel
Hill, will advance the field of computational toxicology through the
development of new methods and tools, as well as through collaborative efforts. The Center is
utilizing a bottom-up approach to predictive computational modeling of adverse effects of toxic
agents. The emphasis spans from the fine-scale predictive simulations of the protein-protein and
protein-chemical interactions in nuclear receptor networks, to mapping chemical-perturbed networks and
devising modeling tools that can predict the pathobiology of the test compounds based on a
limited set of biological data, to building tools that will enable toxicologists to understand the
role of genetic diversity between individuals in responses to  toxicants, to unbiased discovery-
driven prediction of adverse chronic in vivo outcomes based on statistical modeling of chemical
structures, high-throughput screening and the genetic makeup of the organism. In each project,
new computer-based models will be developed and published that represent the state-of-the-art.
The tools produced within  each project will be widely disseminated, and the emphasis will be
placed on their usability by the risk assessment community and the investigative toxicologists
alike.  The synthesis of data from a variety of sources will move the field of computational
toxicology from a hypothesis-driven science toward a predictive science.

4.  Texas-Indiana Virtual STAR Center: Data-Generating in vitro and in silico Models of
Developmental Toxicity in Embryonic Stem Cells and Zebrafish at the University of Houston,
Texas A&M Institute for Genomic Medicine, and Indiana University.  Project Period:
Project start: November 1,  2009
Project end: October 31, 2012

Description
Objectives/Hypothesis:
As chemical production increases worldwide, there is increasing evidence of the hazardous
effects of chemicals on human health at today's exposure levels, which implies that current chemical
regulation is insufficient. Thus, a restructuring of the risk assessment procedure will be required
to protect future generations. Given the very large number of man-made chemicals and the likely
complexity of their various and synergistic modes of action, emerging technologies will be
required for the restructuring. The main objective of the proposed multidisciplinary Texas-Indiana
Virtual STAR (TIVS) Center is to contribute to a more reliable chemical risk assessment
through the development of high throughput in vitro and in silico screening models of
developmental toxicity. Specifically, the TIVS Center aims to generate in vitro models of murine
embryonic stem cells and zebrafish for developmental toxicity. The data produced from these
models will be further exploited to produce predictive in silico models for developmental
toxicity in processes that are also relevant to human embryonic development.

Approach:
The project is divided into three Investigational Areas: zebrafish models, murine embryonic stem
cell models and in silico simulations. The approaches are to:
   1.  Generate developmental models suitable for high throughput screening. Zebrafish
       developmental models (transgenic GFP/EGFP/RFP  models of crucial steps in
       development) and embryonic stem cell (ESC) differentiation models (transgenic beta-geo
       models of crucial steps in differentiation) will be generated. Important morphology
       features and signaling pathways during development will be documented. The impact of
       environmental pollutants on development and differentiation will be assessed  in the
       models. Finally, the models will be refined for high  throughput screening and
       automation.
   2.  Generate a computational model that faithfully recreates the major morphological
       features of normal wild-type zebrafish development (i.e., segmentation into somites,
       proper patterning of vascular and neural systems)  and the differentiation to three
       primitive layers (endoderm, mesoderm and ectoderm) in mouse embryonic stem cells.
       The data for simulations are produced from developed high information content zebrafish
       and ESC models. Once a working model of normal development has been generated, we
       will carry out a directed series of parameter sweeps  to try to create developmental defects
       in silico. We will compare the results of computationally created defects with
       experimentally-generated defects in zebrafish and embryonic stem cells. Best  matches
       between the two datasets will suggest hypotheses  about possible mechanisms by which
       defects occur.
   3.  Perform proof-of-concept  experiments of the in vitro and in silico test platforms with a
       blind test of chemicals.

Techniques will include molecular biology methods applied to zebrafish and ESC models, such as
cloning, imaging, in vitro differentiation and in vitro exposure studies, as well as in silico
mathematical simulations.
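
The parameter-sweep step in approach 2 (perturb a working model, then look for the simulated outcomes that best match an experimentally observed defect) can be sketched in a few lines. Everything here is a hypothetical stand-in: the toy "somite count" model, its two parameters, and the matching rule are assumptions chosen for illustration and do not represent the TIVS Center's simulations.

```python
from itertools import product


def simulate_somite_count(growth_rate, signal_decay, steps=20):
    """Toy stand-in for a developmental simulation (hypothetical model).

    A signal decays each step; a somite forms whenever the product of the
    growth rate and the remaining signal exceeds a fixed threshold.
    """
    signal = 1.0
    somites = 0
    for _ in range(steps):
        signal *= (1.0 - signal_decay)
        if signal * growth_rate > 0.25:
            somites += 1
    return somites


def parameter_sweep(growth_rates, signal_decays):
    """Run the toy model over the full grid of parameter combinations."""
    return {
        (g, d): simulate_somite_count(g, d)
        for g, d in product(growth_rates, signal_decays)
    }


def best_matches(sweep, observed_count, top_n=3):
    """Rank parameter sets by how closely the simulated count matches the observation."""
    ranked = sorted(
        sweep.items(),
        key=lambda item: (abs(item[1] - observed_count), item[0]),
    )
    return ranked[:top_n]


if __name__ == "__main__":
    sweep = parameter_sweep([0.5, 1.0, 1.5], [0.0, 0.1, 0.3, 0.5])
    # Hypothetical observation: an exposed embryo with a reduced somite count.
    for params, count in best_matches(sweep, observed_count=5):
        print(params, count)
```

In this sketch the parameter sets returned by best_matches play the role of candidate hypotheses about mechanism; a real comparison would use richer phenotype descriptions than a single count.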

Expected Results (Outputs/Outcomes):
In collaboration with other initiatives taken in the field of chemical safety, our generated results
and models will contribute to large screening efforts to prioritize chemicals for further risk
assessment. We will specifically contribute:
   •   9 transgenic fish lines validated for toxicity screening
   •   16 embryonic stem cell models validated for toxicity screening
   •   High information content models on development and differentiation to produce data for
       in silico simulations, within the project and elsewhere
   •   Computational models for developmental toxicology of normal development and of
       mechanisms by which chemical perturbations cause experimentally-observed
       developmental defects
   •   Information on developmental toxicity on 39 compounds.

C. FY2004 "New Start" Award Bibliography

Project Title:  Linkage of Exposure and Effects Using Genomics, Proteomics, and Metabolomics in
Small Fish Models

Peer Reviewed Publications:
Ankley, G.T., K.M. Jensen, E.J. Durhan, E.A. Makynen, B.C. Butterworth, M.D. Kahl, D.L. Villeneuve, A.
Linnum, L.E. Gray, M. Cardon, V.S. Wilson. 2005. Effects of two fungicides with multiple modes of action
on reproductive endocrine function in the fathead minnow (Pimephales promelas). Toxicol. Sci. 86, 300-308.

Ankley, G.T., K.M. Jensen, M.D. Kahl, E.A. Makynen, L.S. Blake, K.J. Greene, R.D. Johnson and D.L.
Villeneuve. 2007. Ketoconazole in the fathead minnow (Pimephales promelas): reproductive toxicity and
biological compensation. Environ. Toxicol. Chem. 26, 1214-1223.

Ankley, G.T., D.H. Miller, K.M. Jensen, D.L. Villeneuve and D. Martinovic. 2008.
Relationship of plasma sex steroid concentrations in female fathead minnows to reproductive success and
population status. Aquat. Toxicol. 88, 69-74.

Ankley, G.T., D. Bencic, M. Breen, T.W. Collette, R. Conolly, N.D. Denslow,
S. Edwards, D.R. Ekman, K.M. Jensen, J. Lazorchak, D. Martinovic, D.H. Miller, E.J.
Perkins, E.F. Orlando, N. Garcia-Reyero, D.L. Villeneuve, R.-L. Wang and K.
Watanabe. 2009. Endocrine disrupting chemicals in fish: Developing exposure
indicators and predictive models of effects based on mechanisms of action. Aquat.
Toxicol. 92, 168-178.

Breen, M.S., D.L. Villeneuve, M. Breen, G.T. Ankley and R.B. Conolly. 2007.
Mechanistic computational model of ovarian steroidogenesis to predict biochemical responses to endocrine
active compounds. Ann. Biomed. Engin. 35, 970-981.

Ekman, D.R., Q. Teng, K.M. Jensen, D. Martinovic, D.L. Villeneuve, G.T. Ankley and T. W. Collette.
2007. NMR analysis of fathead minnow urinary metabolites:  a potential approach for studying impacts of
chemical exposures.  Aquat. Toxicol. 85,  104-112.

Ekman, D.R., Q. Teng, D.L. Villeneuve, M.D. Kahl, K.M. Jensen, E.J. Durhan, G.T. Ankley and T.W.
Collette. 2008. Investigating compensation and recovery of fathead minnow (Pimephales promelas)
exposed to 17α-ethynylestradiol with metabolite profiling. Environ. Sci. Technol. 42, 4188-4195.

Ekman, D.R., Q. Teng, D.L. Villeneuve, M.D. Kahl, K.M. Jensen, E.J. Durhan, G.T. Ankley and T.W.
Collette. 2009. Profiling lipid metabolites yields unique information on gender- and time-dependent
responses of fathead minnows (Pimephales promelas) exposed to 17α-ethynylestradiol. Metabolomics 5, 22-32.

Garcia-Reyero, N., D.L. Villeneuve, K.J. Kroll, L. Liu, E.F. Orlando, K.H. Watanabe, M.S. Sepulveda, G.T.
Ankley and N.D. Denslow.  2009.  Expression signatures for a model androgen and antiandrogen in the
fathead minnow ovary. Environ. Sci. Technol. 43, 2614-2619.

Garcia-Reyero, N., K.J. Kroll, L. Liu, E.F. Orlando, K.H. Watanabe, M.S. Sepulveda, D.L. Villeneuve, E.J.
Perkins, G.T. Ankley and N.D. Denslow. 2009. Gene expression responses in male fathead minnows
exposed to binary mixtures of an estrogen and antiestrogen. BMC Genomics, In Press.

Johns, S.M., M.D. Kane, N.D. Denslow, K.H. Watanabe, E.F. Orlando, D.L. Villeneuve,
G.T. Ankley and M.S. Sepulveda. 2009. Characterization of ontogenetic changes in gene expression in the
fathead minnow (Pimephales promelas). Environ. Toxicol. Chem. 28, 873-880.

Martyniuk, C.J., S. Alvarez, S. McClung, D.L. Villeneuve, G.T. Ankley and N.D. Denslow. 2009.
Quantitative proteomic profiles of androgen receptor signaling in the liver of fathead minnows (Pimephales
promelas). J. Proteome Res. In Press.

Martinovic, D., L.S. Blake, E.J. Durhan, K.J. Greene, M.D. Kahl, K.M. Jensen, E.A. Makynen, D.L.
Villeneuve and G.T. Ankley. 2008. Characterization of reproductive toxicity of vinclozolin in the fathead
minnow and co-treatment with an androgen to confirm an anti-androgenic mode of action. Environ. Toxicol.
Chem. 27, 478-488.

Miller, D.H., K.M. Jensen, D.L. Villeneuve, M.D. Kahl, E.A. Makynen, E.J. Durhan and G.T. Ankley.
2007. Linkage of biochemical responses to population-level effects: a case study with vitellogenin in the
fathead minnow (Pimephales promelas). Environ. Toxicol. Chem. 26, 521-527.

Perkins, E.J., N. Garcia-Reyero, D.L. Villeneuve, D. Martinovic, S.M. Brasfield, L.S.
Blake, J.D. Brodin, N.D. Denslow and G.T. Ankley. 2008. Perturbation of gene
expression and steroidogenesis with in vitro exposure of fathead minnow ovaries to
ketoconazole. Mar. Environ. Res. 66, 113-115.

Villeneuve, D.L.,  P. Larkin, I. Knoebl, A.L. Miracle, M.D. Kahl, K.M. Jensen, E.A.  Makynen, E.J. Durhan,
B.J. Carter, N.D. Denslow and G.T. Ankley.  2007. A graphical systems model to facilitate hypothesis-
driven ecotoxicogenomics research on the brain-pituitary-gonadal axis. Environ. Sci. Technol. 40, 321-330.

Villeneuve, D., L. Blake, J. Brodin, K. Greene, I. Knoebl, A. Miracle, D. Martinovic and G.T. Ankley.
2007.  Transcription of key genes regulating gonadal steroidogenesis in control and ketoconazole- or
vinclozolin-exposed fathead minnows.  Toxicol. Sci. 98, 395-407.

Villeneuve, D.L.,  L.S. Blake, J.D. Brodin, J.E. Cavallin, E.J. Durhan, K.M. Jensen, M.D. Kahl, E.A.
Makynen, D. Martinovic, N.D. Mueller and G.T. Ankley. 2008.  Effects of a 3β-hydroxysteroid
dehydrogenase inhibitor, trilostane, on the fathead minnow reproductive axis. Toxicol. Sci. 104, 113-123.

Villeneuve, D.L.,  N.D. Mueller, D. Martinovic, E.A. Makynen, M.D. Kahl, K.M. Jensen, E.J. Durhan, J.E.
Cavallin, D. Bencic and G.T. Ankley.  2009. Direct effects, compensation and recovery in female fathead
minnows exposed to a model aromatase inhibitor. Environ. Health Perspect. 117, 624-631.

Villeneuve, D.L., R.-L. Wang, D.C. Bencic, A.D. Biales, D. Martinovic, J.M.
Lazorchak, G. Toth and G.T. Ankley. 2009. Altered gene expression in the brain and ovaries of zebrafish
exposed to the aromatase inhibitor fadrozole: microarray analysis and hypothesis generation.  Environ.
Toxicol. Chem. In Press.

Wang, R.-L., A. Biales, D. Bencic, D. Lattier, M. Kostich, D. Villeneuve, G.T. Ankley, J. Lazorchak and G.
Toth. 2008a.  DNA microarray application in ecotoxicology: experimental design, microarray scanning, and
factors impacting transcriptional profiles in a small fish species.  Environ. Toxicol. Chem. 27, 652-663.

Wang, R.-L., D. Bencic, A. Biales, D. Lattier, M. Kostich, D. Villeneuve, G.T. Ankley, J. Lazorchak and G.
Toth. 2008b.  DNA microarray-based ecotoxicological discovery in a small fish species. Environ. Toxicol.
Chem. 27, 664-675.

Watanabe, K.H., K.M. Jensen, E.F. Orlando and G.T. Ankley. 2007. What is
normal? A characterization of the values and variability in reproductive endpoints of the fathead minnow,
Pimephales promelas.  Comp. Biochem. Physiol. 146, 348-356.

Watanabe, K.H., Z. Li, K. Kroll, D.L. Villeneuve, N.J. Szabo, E.F. Orlando, M.S.
Sepulveda, T.W. Collette, D.R. Ekman, G.T. Ankley and N.D. Denslow. 2009. A physiologically-based
model of endocrine-mediated responses of male fathead minnows to 17α-ethinylestradiol. Toxicol. Sci. In
Press.

Project Title: Simulating Metabolism of Xenobiotic Chemicals as a Predictor of Toxicity

Peer Reviewed Publications:
Mazur, C.  S.; Kenneke, J. F. 2008. Cross-species comparison of conazole fungicide metabolites using rat and
rainbow trout (Oncorhynchus mykiss) hepatic microsomes and purified human CYP 3A4. Environmental
Science and Technology, 42:947-954.

Mazur, C. S.; Kenneke, J. F.; Tebes-Stevens, C.; Okino, M. S.; Lipscomb, J. C. 2007. In vitro metabolism of
the fungicide and environmental contaminant trans-bromuconazole and implications for risk assessment.
Journal of Toxicology and Environmental Health, Part A, 70:1241-1250.

Kenneke, J. F. 2006. Environmental fate and ecological risk assessment for the reregistration of antimycin A
(PC Code 006314), Appendix D: In vitro mammalian metabolism and Appendix G: Summary of antimycin
hydrolysis research, U.S. EPA, Office of Pesticide Programs, Reregistration Eligibility Decision (RED) on
Antimycin A

Project Title: Risk Assessment of the Inflammogenic and Mutagenic Effects of Diesel Exhaust
Particulates: A Systems Biology Approach

Peer Reviewed Publications:
Cao, D., Bromberg, PA and Samet, JM (2007). Diesel-induced Cox-2 expression involves chromatin
modification via degradation of HDAC1 and recruitment of p300. Am. J. Respir. Cell. Mol. Biol. 37:232-239.
Cao, D., Tal, T., Graves, L., Gilmour, I., Linak, W., Reed, W., Bromberg, P., and Samet, J., Diesel Exhaust
Particulate (DEP)-Induced Activation of Stat3 Requires Activities of EGFR and SRC in Airway Epithelial
Cells, American Journal of Physiology: Lung, Cell, & Molecular Physiology, 292, L422-L429 (2007).

Cho, S.-H., Yoo, J.-I., Turley, A.T., Miller, C.A., Linak, W.P., Wendt, J.O.L., Huggins, F.E., and Gilmour,
M.I., Relationships between Composition and Pulmonary Toxicity of Prototype Particles from Coal
Combustion and Pyrolysis, Proceedings of the Combustion Institute, 32, in press (2008).

Ciencewicki, J., Gowdy, K., Krantz, Q.T., Linak, W.P., Brighton, L., Gilmour, M.I., and Jaspers, I., Diesel
Exhaust Enhanced Susceptibility to Influenza Infection is Associated with Decreased Surfactant Protein
Expression, Inhalation Toxicology, 19, 1121-1133 (2007).

DeMarini, D.M., Brooks, L.R., Warren, S.H., Kobayashi, T., Gilmour, M.I., and Singh P., Bioassay-Directed
Fractionation and Salmonella Mutagenicity of Automobile and Forklift Diesel Exhaust Particles,
Environmental Health Perspectives, 112, 814-819 (2004).

Gottipolu, R.R., Wallenborn, J.G.,  Karoly, E.D., Schladweiler, M.C., Ledbetter, A.D., Krantz,  Q.T., Linak,
W.P., Nyska, A., Johnson, J.A., Thomas, R., Richards, J.E., Jaskot, R.H., and Kodavanti, U.P. (2009). One-
month Diesel Exhaust Inhalation Produces Hypertensive Gene Expression Pattern in Healthy Rats,
Environmental Health Perspectives. 117:38-46.

Gowdy, K., Krantz, Q.T., Daniels,  M., Linak, W.P., Jaspers, I., and Gilmour, M.I., Modulation of Pulmonary
Inflammatory Responses and Anti-microbial  Defenses in Mice Exposed to Diesel  Exhaust, Toxicology &
Applied Pharmacology, 229, 310-319 (2008).

Linak, W.P., Yoo, J.I., Wasson, S.J., Zhu, W., Wendt, J.O.L., Huggins, F.E., Chen, Y., Shah, N., Huffman,
G.P., and Gilmour, M.I., Ultrafine Ash Aerosols from Coal Combustion: Characterization and  Health
Effects, Proceedings of the Combustion Institute, 31, 1929-1937 (2007).

Reed, W., Gilmour, I., DeMarini, D., Linak, W. and Samet, J. (2008). Gene Expression Profiles of Human
Airway Epithelial Cells Exposed to Diesel Exhaust Particles of Varying Composition. In Preparation.

Saxena, RK, Williams, W & Gilmour, MI. (2007) Suppression of basal and cytokine induced expression of
MHC, ICAM 1  and B7 markers on mouse lung epithelial cells exposed to diesel exhaust particles. Am J
Biochem Biotech. 3(4). 187-192.

Saxena, RK., Gilmour, MI., & MD Hayes. Uptake of diesel exhaust particles by lung epithelial cells and
alveolar macrophages. Biotechnology, 2007, 3 (4). 187-192

Singh, P., DeMarini, D.M., Dick, C.A.J., Tabor,  D., Ryan, J., Linak,  W.P., Kobayashi, T., and  Gilmour, M.I.,
Bioassay-Directed Fractionation, Physiochemical Characterization, and Pulmonary Toxicity of Automobile
and Forklift Diesel Exhaust Particles in Mice, Environmental Health Perspectives, 112(8)  820-825 (2004).

Stevens, T., Krantz, Q.T., Linak, W.P., Hester, S., and Gilmour, M.I., Increased Transcription of Immune
and Metabolic Pathways in Naive and Allergic Mice Exposed to Diesel Exhaust, Toxicological Sciences,
102(2), 359-370 (2008).

Stevens, T., Linak, WP., Gilmour, MI. Differential potentiation of allergic lung disease in mice exposed to
chemically distinct diesel samples.  Tox Sci.  107(2), 522-534.

Stevens, T., Hester, S, & Gilmour MI. Differential transcriptional changes in mice exposed to chemically
distinct diesel samples.  Submitted.

Tal, T., Bromberg, P.A., Kim, Y. and Samet, J.M. (2008). Tyrosine phosphatase inhibition induces
epidermal growth factor receptor activation in human airway epithelial cells exposed to diesel exhaust.
Toxicol. Appl. Pharmacol. 233:382-388.

Project Title: Development of Microbial Metagenomic Markers for Environmental Monitoring and
Risk Assessment

Peer Reviewed Publications:
Lamendella R, Santo Domingo JW, Yannarell AC, Ghosh S, Di Giovanni G, Mackie RI, Oerther DB.
Evaluation of swine-specific PCR assays used for fecal source tracking and analysis of molecular diversity of
Bacteroidales-swine specific populations. Appl Environ Microbiol. 2009 Jul 24. [Epub ahead of print]

Lu J, Santo Domingo JW, Hill S, Edge TA. Microbial Diversity and Host-specific Sequences of Canadian
Goose Feces. Appl Environ Microbiol. 2009 Jul 24.

Santo Domingo, J.W. and T.A. Edge. 2009. Identification of primary sources of faecal pollution. In Safe
Management of Shellfish and Harvest Waters. G. Rees., K. Pond, D. Kay and J. Santo Domingo. IWA
Publishing, London, UK.

Lee, Y.-J., M. Molina, and J.W. Santo Domingo, J.D. Willis, M. Cyterski, D.M. Endale, and O.C. Shanks.
2008. A temporal assessment of cattle fecal pollution in two watersheds using 16S rRNA gene-based and
metagenome-based assays. Appl. Environ. Microbiol. 74:6839-6847.

Lu, J. and J.W. Santo Domingo. 2008. Turkey fecal microbial community structure and functional gene
diversity revealed by 16S rRNA gene and metagenomic sequences. J. Microbiol. 46:469-477.

Lu, J., J.W. Santo Domingo, R. Lamendella, T.Edge, and S.Hill. 2008. Phylogenetic diversity and molecular
detection of gull feces. Appl. Environ. Microbiol. 74: 3969-3976.

Lamendella, R., Santo Domingo J.W., Kelty C, and Oerther DB. 2008. Occurrence of bifidobacteria in feces
and environmental waters. Appl. Environ. Microbiol. 74:575-584.

Santo Domingo, J.W., D.G. Bambic, T.A. Edge, and S. Wuertz. 2007. Quo vadis source tracking? Towards a
strategic framework for environmental monitoring of fecal pollution. Water Res. 41:3539-3552.

Lu, J., J.W. Santo Domingo, and O.C. Shanks. 2007. Identification of chicken-specific fecal microbial
sequences using a metagenomic approach. Water Res. 41:3561-3574.

Shanks, O., J.W. Santo Domingo, J. Lu, C.A. Kelty, and J. Graham. 2007. PCR Assays for the identification
of human fecal pollution in water. Appl. Environ. Microbiol. 73: 2416-2422.

Vogel, J.R., D.M. Stoeckel, R. Lamendella, R.B. Zelt, J.W. Santo Domingo, S.R. Walker, and D.B. Oerther.
2007. Identifying fecal sources in a selected catchment reach using multiple source-tracking Tools. J.
Environ. Qual. 36:718-729.

Lamendella, R., J. W. Santo Domingo, D. Oerther, J. Vogel, and D. Stoeckel. 2007. Assessment of fecal
pollution sources in a small northern-plains watershed using PCR and phylogenetic analyses of Bacteroidetes
16S rDNA. FEMS Microbiol. Ecol. 59:651-660.

Santo Domingo, J.W., Lu, J., Shanks, O., Lamendella, R., Kelty, C. A., and Oerther, D.B. "Development of
host-specific markers for source tracking using a novel metagenomic approach," Water Environment
Federation, Proceedings of Disinfection 2007, Pittsburgh, PA, February 4-7, 2007.

Shanks, O., J. W. Santo Domingo, R. Lamendella, C.A. Kelty, and J. Graham. 2006. Competitive
metagenomic DNA hybridization identifies host-specific genetic markers in cattle fecal samples. Appl.
Environ. Microbiol. 72:4054-4060.

Shanks, O., J. W. Santo Domingo, and J. Graham. 2006. Use of competitive DNA hybridization to identify
differences in the genomes of two closely related fecal indicator bacteria. J. Microbiol. Methods. 66:321-330.

Project Title: A Systems Approach to Characterizing and Predicting Thyroid Toxicity Using an
Amphibian Model

Peer Reviewed Publications:
Sternberg, Thoemke, Hornung, Tietge, and Degitz. Regulation of thyroid-stimulating hormone release from
pituitary by T4 during metamorphosis in Xenopus laevis. (In review)

Serrano, Higgins Witthuhn, Korte, Hornung, Tietge, and Degitz In vivo assessment and potential diagnosis
of xenobiotics that perturb the thyroid pathway: Part I. Differential protein profiling of Xenopus laevis brain
tissues by two-dimensional polyacrylamide gel electrophoresis and peptide-labeling with isobaric tags for
relative and absolute quantification (iTRAQ) following exposure to model T4 inhibitors.  (In review)

Conners, K., Korte, J.J., Anderson, G., Degitz, S.J. Characterization of thyroid hormone transporting protein
expression during tissue-specific metamorphosis in Xenopus tropicalis (In review)

Hornung, M.W., Degitz, S.J., Korte, L.M., Olson, J., Kosian, P.A., Linnum, A.L., Tietge, J.E. Inhibition of
thyroid hormone release from cultured amphibian thyroid glands by methimazole, 6-propylthiouracil, and
perchlorate. (Completed, NHEERL In-House review. To be submitted with the following Hornung et al.
paper).

Degitz, S.J., Hornung, M.W., Korte, J.J, Holcombe, G.W, Kosian, P.A., Thoemke, K.R.,  Helbing, C., Tietge,
J.E. In vivo and in vitro regulation of genes in the thyroid gland following exposure to the model T4
synthesis inhibitors methimazole, 6-propylthiouracil, and perchlorate. (In preparation. To be submitted with
above paper).

Hornung, M., Burgess, E.  Tandem in vitro and ex vivo thyroid gland assays to screen xenobiotic chemicals
for thyroid hormone synthesis inhibition. (In preparation).

Nichols systems model paper (In preparation).


Tietge, Butterworth, Kosian, Hammermeister, Hornung, Haselman, Degitz. Analysis of Thyroid Hormone and
Related Iodo-Compounds in Complex Samples by Inductively Coupled Plasma Emission/Mass
Spectrometry. (In preparation)

Tietge, Butterworth, Haselman, Holcombe, Korte, Kosian, Wolfe, Degitz. Early Temporal Effects of Three
Thyroid Hormone Synthesis Inhibitors in Xenopus laevis. (In preparation)

Project Title: Mechanistic Indicators of Childhood Asthma (MICA): A Systems Biology Approach for
the Integration of Multifactorial Environmental Health Data

Peer Reviewed Publications:
Kim SJ, Dix DJ, Thompson KE, Murrell RN, Schmid JE, Gallagher JE, Rockett JC.
Effects of storage, RNA extraction, genechip type, and donor sex on gene expression profiling of human
whole blood. Clin Chem. Jun;53(6): 1038-45. (2007)

Vesper, S., McKinstry, C., Haugland, R., Neas, L., Hudgens, E., Heidenfelder, B., and Gallagher, J.
Environmental Relative Moldiness Index (ERMIsm) as a Tool to Identify Mold Related Risk Factors for
Childhood Asthma. Sci Total Environ. May 1;394(1): 192-6 (2008)

Johnson M, Hudgens E, Williams R, Andrews G, Neas L, Gallagher J, Ozkaynak H. "A Participant-Based
Approach to Indoor/Outdoor Air Monitoring in Community Health Studies" Journal of Exposure Science
and Environmental Epidemiology. (2008), 1-10  (2008).

Cohen Hubal E, Richards A., Shah I, Edwards S, Gallagher  J, Kavlock R, Blancato, J  Exposure Science and
the US EPA National Center for Computational Toxicology  J Expo Sci Environ Epidemiol. November
(2008).

Heidenfelder B, Reif D, Harkema JR, Cohen Hubal E, Hudgens E, Bramble L, Wagner G, Harkema
JR, Morishita M, Keeler G, Edwards SW and Gallagher J. Comparative Microarray Analysis and
Pulmonary Changes in Brown Norway Rats Exposed to Ovalbumin and Concentrated Air Particulates. Tox
Sci. volume 108 2009 March 2 (2009)

Heidenfelder B, Johnson M, Hudgens E, Inmon J, Hamilton R, Neas L, and Gallagher J. Increased plasma
reactive oxidant levels and their relationship to blood cells, total IgE, and allergen-specific IgE in asthmatic
children. Journal of Asthma, accepted (2009).

Williams AH, Gallagher JE, Hudgens E, Johnson MM, Mukerjee S, Ozkaynak H, Neas LMN. EPA
Observational studies of children's respiratory health in Detroit and Dearborn, Michigan. Proceedings of
AWMA 102nd, June 16-19; Detroit, Michigan. (2009)

J.E. Gallagher, E.A. Cohen Hubal, S.W. Edwards. Invited book chapter "Biomarkers of Environmental
Exposure" in "Biomarkers of Toxicity: A New Era in Medicine." Editors: Vishal S. Vaidya and Joseph V.
Bonventre. Publisher: John Wiley and Sons, Inc. October 1, (2009)

Markey M. Johnson, Ron Williams, Zhihua Fan, Lin, Edward Hudgens, Jane Gallagher, Alan Vette, Lucas
Neas, Haluk Ozkaynak Indoor and outdoor concentrations of nitrogen dioxide, volatile  organic compounds,
and polycyclic aromatic hydrocarbons among MICA-Air households in Detroit, Michigan submitted AWMA
(2009)


Gallagher, J; Reif, D; Heidenfelder, B; Neas, L; Hudgens, E; Williams, A; Inmon, J; Rhoney, S; Andrews, G;
Johnson, M; Ozkaynak, H; Edwards, S; Cohen-Hubal, E. Mechanistic Indicators of Childhood Asthma
(MICA): A systems biology approach for the integration of multifactorial environmental health data.
Submitted: Journal of Exposure Science and Environmental Epidemiology (2009).

-------


D. EPA Strategic Plan for Evaluating the Toxicity of Chemicals
EPA/100/K-09/001 | March 2009
www.epa.gov/osa

United States Environmental Protection Agency

The U.S. Environmental Protection Agency's Strategic Plan for
Evaluating the Toxicity of Chemicals

[Cover figure: Chemicals; Receptors / Enzymes / etc.; Direct Molecular Interaction; Pathway Regulation / Genomics; Cellular Processes; Tissue / Organ / Organism Tox Endpoint]

Office of the Science Advisor
Science Policy Council

-------
                                                                    EPA 100/K-09/001
                                                                        March 2009
                   The U.S. Environmental Protection Agency's
                            Strategic Plan for Evaluating
                               the Toxicity of Chemicals
                                   Office of the Science Advisor
                                     Science Policy Council
                                U.S. Environmental Protection Agency
                                     Washington, DC 20460
-------
                                            DISCLAIMER
            Mention of trade names or commercial products does not constitute endorsement or
            recommendation for use. Notwithstanding any use of mandatory language such as "must" and
            "require" in this document with regard to or to reflect scientific practices, this document does not
            and should not be construed to create any legal rights or requirements.

-------
                                     AUTHORS AND CONTRIBUTORS
                               Future of Toxicity Testing Workgroup Co-Chairs
             Michael Firestone, Office of Children's Health Protection and Environmental Education, U.S. EPA
             Robert Kavlock, Office of Research and Development, U.S. EPA
             Hal Zenick, Office of Research and Development, U.S. EPA

                                         Science Policy Council Staff
             Melissa Kramer, Office of the Science Advisor, U.S. EPA

                             Future of Toxicity Testing Workgroup Representatives
             Marcia Bailey, Region 10, U.S. EPA
             Arden Calvert, Office of the Chief Financial Officer, U.S. EPA
             Laurel Celeste, Office of General Counsel, U.S. EPA
             Vicki Dellarco, Office of Prevention, Pesticides, and Toxic Substances, U.S. EPA
             Scott Jenkins, Office of Air and Radiation, U.S. EPA
             Gregory Miller, Office of Policy, Economics, and Innovation, U.S. EPA
             Nicole Paquette, Office of Environmental Information, U.S. EPA
             Santhini Ramasamy, Office of Water, U.S. EPA
             William Sette, Office of Solid Waste and Emergency Response, U.S. EPA

                                            Other Contributors
             Katherine Anitole, Office of Prevention, Pesticides, and Toxic Substances, U.S. EPA
             Hugh Barton, Office of Research and Development, U.S. EPA
             Norman Birchfield, Office of the Science Advisor, U.S. EPA
             Michael Brody, Office of the Chief Financial Officer, U.S. EPA
             Rory Conolly, Office of Research and Development, U.S. EPA
             David Dix, Office of Research and Development, U.S. EPA
             Stephen Edwards, Office of Research and Development, U.S. EPA
             Andrew Geller, Office of Research and Development, U.S. EPA
             Karen Hamernik, Office of Prevention, Pesticides, and Toxic Substances, U.S. EPA
             Jean Holmes, Office of Prevention, Pesticides, and Toxic Substances, U.S. EPA
             Richard Judson, Office of Research and Development, U.S. EPA
             Thomas Knudsen, Office of Research and Development, U.S. EPA
             Julian Preston, Office of Research and Development, U.S. EPA
             Kathleen Raffaele, Office of the Science Advisor, U.S. EPA
             Ram Ramabhadran, Office of Research and Development, U.S. EPA
             James Samet, Office of Research and Development, U.S. EPA
             Patricia Schmieder, Office of Research and Development, U.S. EPA
             Banalata Sen, Office of Prevention, Pesticides, and Toxic Substances, U.S. EPA
             Imran Shah, Office of Research and Development, U.S. EPA
             Linda Sheldon, Office of Research and Development, U.S. EPA
             John Vandenberg, Office of Research and Development, U.S. EPA
             Maurice Zeeman, Office of Prevention, Pesticides, and Toxic Substances, U.S. EPA

                                          External Peer Reviewers
             John R. Bucher, Ph.D., Associate Director, National Toxicology Program, National Institute of
                   Environmental Health Sciences
             George Daston, Ph.D., Research Fellow, P&G
             Daniel Krewski, Ph.D., MHA, Professor and Director, McLaughlin Centre for Population Health
                   Risk Assessment, University of Ottawa
             Martin Stephens, Ph.D., Vice President for Animal Research Issues, The Humane Society of the
                   United States

-------
                                           TABLE OF CONTENTS
             LIST OF FIGURES
             LIST OF TABLES
             ACRONYMS
             1. Introduction
             2. Regulatory Applications and Impacts
               2.1 Chemical Screening and Prioritization
               2.2 Toxicity Pathway-Based Risk Assessment
               2.3 Institutional Transition
             3. Toxicity Pathway Identification and Chemical Screening and Prioritization
               3.1 Strategic Goal 1: Toxicity Pathway Identification and Assay Development
               3.2 Strategic Goal 2: Chemical Prioritization
             4. Toxicity Pathway-Based Risk Assessment
               4.1 Strategic Goal 3: Toxicity Pathway Knowledgebases
               4.2 Strategic Goal 4: Virtual Tissues, Organs, and Systems: Linking Exposure, Dosimetry, and Response
               4.3 Strategic Goal 5: Human Evaluation and Quantitative Risk Assessment
             5. Institutional Transition
               5.1 Strategic Goal 6: Operational Transition
               5.2 Strategic Goal 7: Organizational Transition
               5.3 Strategic Goal 8: Outreach
             6. Future Steps
             Appendix: Other Related Activities
             References

-------
                                           LIST OF FIGURES

            Figure 1. Toxicity Pathways
            Figure 2. Toxicity Pathways Target Multiple Levels of Biological Organization
            Figure 3. ToxCast™
            Figure 4. Toxicity Pathways to Dose-Response
            Figure 5. Knowledgebase Development
            Figure 6. Relative (%) Emphasis of the Three Main Components of this Strategic Plan over its
                    Expected 20-year Duration

                                           LIST OF TABLES

            Table 1. Strategic Plan: Applications and Impacts

-------
                                             ACRONYMS

            ACToR      Aggregated Computational Toxicology Resource
            FIFRA      Federal Insecticide, Fungicide, and Rodenticide Act
            FTTW       Future of Toxicity Testing Workgroup
            HTS        High Throughput Screening
            IRIS        Integrated Risk Information System
            NRC        National Research Council of the National Academies
            OPPTS      Office of Prevention, Pesticides, and Toxic Substances
            ORD        Office of Research and Development
            QSAR       Quantitative Structure-Activity Relationship
            SAR        Structure-Activity Relationships

-------
                                                  1.  INTRODUCTION

              KPA bases its regulatory' decisions on a wide range of tools and information that represent the best
              available science. In some situations, where very limited or no animal toxieity data exist. EPA may
              use tools such as structure-activity relationships (SAR) and quantitative structure-activity
              relationship (QSAR) modeling, together with information on exposure to make decisions about
              priority setting and the need for further evaluation (e.g., for new chemicals in the toxics program.
              high production volume chemicals, and pesticide inerts). To establish regulatory standards, KPA
              relies heavily on toxieity testing to evaluate clinical or pathological effects in experimental animal
              models. As such, toxieity testing and related research is currently a multi-billion dollar activity that
              engages thousands of research scientists, risk assessors, and risk managers throughout the world.
              To that end, the historical path taken in toxieity testing of environmental agents has generally been
              either to make incremental modifications to existing tests or to add additional tests to cover
              endpoints not previously considered (e.g., developmental neuroloxieity). This approach has led
              over time to a continual increase  in the number of tests, cost of testing, use of laboratory animals,
              and time to develop and review the resulting data. Moreover, the application of current toxieity
              testing and risk assessment approaches to meet existing, and evolving,  regulatory needs has
              encountered challenges in obtaining data on the lens of thousands of chemicals to which people are
              potentially exposed and in accommodating increasingly complex issues (e.g.. lifestage
              susceptibility, mixtures, varying exposure scenarios, cumulative risk, understanding mechanisms
              of toxieity and their implications in assessing dose-response, and eharacteri/alion of uncertainty) .

              While the challenges of such information gaps are great, the explosion of new scientific tools in
              computational, informational, and molecular sciences offers great promise to  address these
              challenges and greatly strengthen toxieity testing and risk assessment approaches. Proven benefits
              have been demonstrated in allied fields such as medicine and phannaccuticals. Although untapped,
              the potential application to toxieity testing and risk assessment has also been reeogni/ed by KPA as
              witnessed by the issuance of a series of papers that provided guidance on the use of genotnic data/
              To better anticipate the potential  contribution ofnew technologies and  scientific advances to issues
              associated with toxieity testing and risk assessment,  KPA commissioned the National Research
              Council (NRC) in 2004 to review existing strategies (NRC, 2006) and develop a long range vision
              for toxieity testing and risk assessment (NRC. 2007). In the subsequent release of Toxicily Testing
              in the 21st Century: a  1'ision and a Strategy; a landmark transformation in toxieity testing and risk
              assessment is envisioned that  focuses on 'toxieity pathways."" This approach is based on the
              rapidly evolving scientific understanding of how  genes, proteins, and small molecules interact to
              form molecular pathway's that maintain cell function. The goal is to determine how exposure to
              environmental agents can perturb these pathways causing a cascade of subsequent key events
              1 These limitations have been described more fully in A Review of the Reference Dose and Reference Concentration
              Processes: http://www.epa.gov/ncea/iris/RFD_FINAL[1].pdf
              2 Interim Policy on Genomics (2002): http://www.epa.gov/osa/spc/genomics.htm; Genomics White Paper (2004):
              http://www.epa.gov/osa/pdfs/EPA-Genomics-White-Paper.pdf; Interim Guidance for Microarray-Based Assays
              (2007): http://www.epa.gov/osa/spc/pdfs/epa_interim_guidance_for_microarray-based_assays-external-review_draft.pdf
              3 Toxicity pathways are cellular response pathways that, when sufficiently perturbed, are expected to result in
              adverse health effects.

-------
   EPA CompTox Research Program FY2009-2012

IV. APPENDICES cont.
                                                             BOSC Review Draft- 24 August, 2009
              leading to adverse health effects. This sequence of events is illustrated in Figure 1 wherein the
              introduction of an environmental stressor may trigger such a cascade. Successful application of
              these new scientific tools and approaches will inform and produce more credible decision making
              with an increased efficiency in design and costs and a reduction in animal usage.

Figure 1. Toxicity Pathways. Toxicity pathways describe the processes by which perturbations of normal
biological processes due to exposure to a stressor (e.g., chemical) produce changes sufficient to lead to
cell injury and subsequent events (modified from NRC, 2007).

              Other agencies have also recognized the need for this transformative shift, including the National
              Toxicology Program in their Roadmap for the Future and the Food and Drug Administration in their
              Critical Path Program. In anticipating the emergence, and potential, of this new scientific
              paradigm, EPA's Office of Research and Development (ORD) and some of the Agency's regulatory
              programs have also begun to redirect resources in intramural and extramural research programs to
              "jump start" the process of transformation. For example, ORD created the National Center for
              Computational Toxicology in 2006. Likewise, ORD National Laboratories and Centers have also
              begun to incorporate these new scientific tools to better support the research being conducted
              under several of its multiyear research plans. Several ongoing projects address the use of in vitro
              assays in risk assessment and toxicity testing (e.g., Guyton et al., 2008), and assessments under
              the Integrated Risk Information System (IRIS) program are describing and evaluating published
              genomic data. EPA's Office of Prevention, Pesticides, and Toxic Substances (OPPTS) is also
              actively involved in the development and transition of computational toxicology tools into
              regulatory practice. OPPTS has developed a multi-year strategic plan to advance computational
              toxicology tools in its risk assessment and management paradigm. Current activities include
              assisting ORD by providing the necessary databases to support the development of models for
              efficiently and credibly predicting toxic potency and levels of exposure, beta testing the new
              computer models, training staff, and initiating plans for successful international coordination and
              stakeholder involvement. Furthermore, recognizing the need to partner to achieve the vision and
              goals laid out by the NRC, EPA recently signed a Memorandum of Understanding for research
              cooperation with the National Toxicology Program and the National Institutes of Health
              Chemical Genomics Center as a substantive step forward in building collaborations across sister
              federal agencies.6 EPA is also working actively at the international level with programs such as
              the Organization for Economic Cooperation and Development (OECD) through the Molecular
              4 Computational toxicology is the application of mathematical and computer models and molecular biological
              approaches to improve the Agency's prioritization of data requirements and risk assessments (from A Framework
              for a Computational Toxicology Research Program, EPA 600/R-03/065).
              5 http://cfpub.epa.gov/ncea/iris/index.cfm
              6 http://www.epa.gov/comptox/articles/comptox_mou.html

-------
             Screening Initiative, the Integrated Approaches for Testing and Assessment Workgroup, Test
             Guideline Committees, and the QSAR Expert Group to ensure global harmonization of any new
             approach that originates from the research program. A more complete listing of these
             collaborations may be found in the appendix.

             In response to the release of the NRC reports, EPA has established an intra-agency workgroup, the
             Future of Toxicity Testing Workgroup (FTTW), under the auspices of the Science Policy
             Council. The FTTW includes representatives from across the Agency, including the Regions and
             all major Program Offices. It has produced this current document, which will serve as a blueprint
             for ensuring a leadership role for EPA in pursuing the directions and recommendations presented
             in the 2007 NRC report. This document presents a strategy that is consistent with the NRC's
             directions and recommendations. It presents the Agency's vision of how to incorporate a new
             scientific paradigm and new tools into toxicity testing and risk assessment practices with
             ever-decreasing reliance on traditional apical approaches. The overall goal of this strategy is to
             provide the tools and approaches to move from a near exclusive use of animal tests for predicting
             human health effects to a process that relies more heavily on in vitro assays, especially those
             using human cell lines. The topics to be covered include (1) the applications and impacts/benefits
             for various types of regulatory activities (Section 2), (2) the research to be conducted to facilitate
             the screening and prioritization of environmental agents (Section 3), (3) the implementation of a
             toxicity pathway-based approach to risk assessment (Section 4), and (4) the critical companion
             component, namely, the institutional transition that must occur before the changes can be fully
             implemented (Section 5).

             As described in Section 6, the workgroup recognizes that the full implementation of the vision
             set out in this strategy will require a significant investment of resources over a long period of
             time. The workgroup has identified a range of partners in this effort, and some planning on the
             relative role of these partners has begun, although the specific areas of work to be
             conducted/funded by EPA versus other partners need further assessment. Decisions on the
             relative roles will have a significant impact on the EPA resources required to implement the vision.

             Since the NRC charge and report centered on advancing toxicity testing for assessing human
             health effects of environmental agents, this strategic plan is presented primarily within that
             context. However, under environmental legislative mandates (e.g., the Toxic Substances Control
             Act; the Federal Insecticide, Fungicide, and Rodenticide Act; and the Clean Water Act), most
             EPA programs must regulate compounds to ensure both environmental and human health risks
             are properly managed. Since statutory language and/or resulting policy typically require single
             regulatory decisions for a chemical(s) that encompass environmental and human health risks at
             the same time, accelerated and cost effective approaches for both areas are critical to realize
             programmatic benefits. As in the human health arena, development and application of
             approaches described in this strategy apply to ecotoxicology and risk assessment as well. Notable
             progress is being made within EPA Laboratories and Centers on the development and use of
             toxicity pathway models and the creation of prioritization schemes, toxicology knowledgebases,
             and systems biology models in the field of environmental science. The bringing together of
             relevant disciplines to share data and integrate models is critical to fully achieve increased
             efficiency in toxicity testing and a reduction in animal usage for both human health and
             environmental risk assessment. Consequently, the Agency will be implementing this strategy in a
             manner that addresses both human health and ecological risk assessment. Future versions of the

-------
           strategy will summarize progress made in advancing integrated testing and assessment capability
           and revisit remaining challenges.

-------
                                2.  REGULATORY APPLICATIONS AND IMPACTS

             The research arising from implementation of this strategy will change the nature of the methods,
             models, and data that will inform the major components of the risk assessment process (i.e.,
             hazard identification, dose response, exposure assessment, and risk characterization). Without
             attempting to be all-inclusive, Table 1 presents some of the major cross-office applications and
             impacts of these new scientific approaches, with more in-depth discussion of the planned work
             described in Sections 3-5. The three components of this strategic plan, namely, chemical
             screening and prioritization, toxicity pathway-based risk assessment, and institutional transition,
             are not independent elements but rather highly interactive and integrative efforts that will
             maximize the value and application of the research generated.

             2.1.   Chemical Screening and Prioritization

             An ongoing need of several regulatory offices is to have tools to assist in chemical screening and
             prioritization, e.g., high production volume chemicals, air toxics, the drinking water Contaminant
             Candidate Lists, and Superfund chemicals. These programs consider anticipated exposure and
             hazard to select chemicals to evaluate in longer-term, whole-animal laboratory studies. An early
             use for data developed under the new paradigm will be as an efficient and cost effective screen
             for several types of chemical toxicity. Thus, risk assessors could use in silico (computer-based)
             technologies and structure/molecular bioactivity profiling from diagnostic high-throughput in
             vitro assays, along with predicted exposure/dose information, to predict chemicals most likely to
             cause hazards of concern for humans. This approach will also enable risk assessors to determine
             the specific effects, in vivo data, and exposures that would be most useful to assess, quantify, and
             manage. As the technology develops, EPA will be able to screen previously untested chemicals
             using libraries of chemical, molecular, biological, and toxicological data and models to identify
             the types of adverse effects that they are most likely to produce in standard animal bioassays.
             More importantly, EPA will be able to gain better insight into whether such effects would likely
             be manifest in humans under various exposure scenarios. As noted earlier, these needs are
             common to a number of federal agencies; discussions are underway to develop more common
             paradigms among federal agencies to facilitate data sharing.
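
             As a rough illustration of the screening logic described above, the following Python sketch
             ranks chemicals by combining a predicted hazard score with a predicted exposure score. The
             chemical names, the 0-to-1 score scales, and the simple product used to combine the two scores
             are hypothetical assumptions made for illustration only; the strategy does not prescribe a
             scoring formula.

```python
from dataclasses import dataclass


@dataclass
class Chemical:
    name: str
    hazard_score: float    # hypothetical scale: 0 (inactive) to 1 (highly bioactive)
    exposure_score: float  # hypothetical scale: 0 (negligible) to 1 (widespread)


def priority(chem: Chemical) -> float:
    """Illustrative priority: the product of the hazard and exposure scores."""
    return chem.hazard_score * chem.exposure_score


def rank_for_testing(chemicals):
    """Return chemicals sorted from highest to lowest illustrative priority."""
    return sorted(chemicals, key=priority, reverse=True)


# Made-up candidates; none of these values come from real assay data.
candidates = [
    Chemical("chem_A", hazard_score=0.9, exposure_score=0.2),
    Chemical("chem_B", hazard_score=0.5, exposure_score=0.8),
    Chemical("chem_C", hazard_score=0.1, exposure_score=0.9),
]
ranked = [c.name for c in rank_for_testing(candidates)]
```

             In this toy example a chemical with moderate hazard but high exposure (chem_B) ranks ahead of
             one with high hazard but low exposure (chem_A), which is the kind of trade-off a combined
             hazard and exposure screen is meant to surface.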

             2.2.   Toxicity Pathway-Based Risk Assessment

             The current approach to risk assessment includes uncertainties associated with (1) the human
             relevance of laboratory animal studies (species extrapolation), (2) the use of high doses in
             animals to estimate risk associated with lower environmental/ambient exposures (dose
             extrapolation), and (3) predicting the risk to susceptible populations. In recent years, the
             consideration of such issues has been better informed by the incorporation of information on
             potential modes of action through which toxicity may be expressed. The approach outlined
             earlier in Figure 1 focuses on perturbations  in baseline biological processes that may lead down
             toxicity pathways to adverse health outcome(s). Combining this information with distributional
             data on population characteristics of exposure and dose (magnitude, frequency, and duration)
             provides a scientifically based approach for reducing the uncertainties associated with current
             risk assessments. By relying on a quantitative understanding of perturbations in toxicity
             pathways that lead to adverse health effects, the new approach to toxicity testing and risk
             assessment envisioned in this  document will greatly increase  EPA's capacity to assess individual

-------
             chemicals and their mixtures. The new approach will also increase EPA's confidence that the
             Agency's assessments adequately protect human health. Realization and acceptance of this new
             approach will likely encounter numerous challenges, but the effort is expected to ultimately lead
             to better protection of human health.
Table 1. Strategic Plan: Applications and Impacts

Toxicity Pathway Identification and Chemical Screening & Prioritization
   •   Need to screen 10,000s of chemicals for wide range of endpoints in a manner that considers
       toxicity pathways and the potential for human exposure.
   •   Need to limit cost and animal usage, improve timeliness, and decrease uncertainty in testing
       decisions.
   •   Identification of toxicity pathways for key toxicological endpoints.
   •   Combine in silico and bioprofiles from HTS7 along with QSAR approaches linked to animal
       study data.
   •   Offices would be better able to direct efforts and resources to chemicals with greatest potential
       risk. Significant increase in efficiency with marked reduction in cost for toxicity testing.

Toxicity Pathway-Based Risk Assessment
   •   For many chemicals, the current approach relies on expensive animal testing that takes time to
       conduct and review. Limitations in the design of in vivo studies often prevent complete
       evaluation of all endpoints and hazard/risk scenarios of concern.
   •   Limited understanding of biological mechanisms most often leads to uncertainty in assessing
       cumulative risk or extrapolating in vitro to in vivo or across doses, lifestages, species, or
       genetic diversity.
   •   New scientific understanding and tools in molecular, computational, and information sciences
       consistent with applications in allied areas such as medicine and pharmaceuticals represent a
       path forward.
   •   Reliance on increased understanding of how perturbations of biological processes at
       environmentally relevant concentrations trigger events (i.e., toxicity pathway(s)) that may lead
       to adverse health outcomes.
   •   Develop linked exposure/dose models to inform dosing levels for toxicity testing and inform
       risks.
   •   More scientifically relevant data on which to base EPA's regulatory decisions and/or impact
       analyses that rely on these risk assessments.

Institutional Transition
   •   Implementing the new approach will require significant institutional investment in operational
       and organizational transition and in public outreach.
   •   EPA lacks appropriate expertise and sufficient funding to fully and most efficiently utilize the
       new toxicity testing technologies when making regulatory decisions.
   •   Fully adopting the new paradigm should be supported by mechanistically based proof-of-concept
       and verification studies. Further, such adoption will require additional training of existing
       staff and hiring new staff conversant in state-of-the-science knowledge in fields such as
       toxicology, biochemistry, bioinformatics, etc.
   •   A well informed public will have greater confidence as EPA greatly expands the number of
       chemicals assessed for possible risks and improves existing strategies for hazard and risk
       assessment.

             7 High-Throughput Screening (HTS) refers to robotic technologies developed by the pharmaceutical industry for
             drug development that enable the ability to evaluate the effects of hundreds to thousands of chemicals per day on
             molecular, biochemical, or cellular processes.

-------
             2.3.   Institutional Transition

             Implementing major changes in toxicity testing of environmental contaminants and incorporating
             new types of toxicity data into risk assessment will require significant institutional change
             involving:

                •   Operational transition - how EPA will transition to the use of new types of data and
                    models for toxicity testing and risk assessment;
                •   Organizational transition - how EPA will deploy resources necessary to implement the
                    new toxicity testing paradigm such as hiring of scientists with particular scientific
                    expertise and training of existing scientific staff and risk managers;
                •   Outreach - efforts by EPA to share information with the public and improve risk
                    communication.

             The process of moving from research to regulatory acceptance for  implementing new science
             related to toxicity testing will be an iterative and long-term effort (likely encompassing more
             than a decade). Essential to this iterative process will be the demonstration that the predictive
             nature of these new approaches is superior to that of our current practices for toxicity testing and
             risk assessment. It will be critical to begin activities geared toward regulatory acceptance early in
             the process of implementing this strategic plan.

-------
      3.  TOXICITY PATHWAY IDENTIFICATION AND CHEMICAL SCREENING
                                   AND PRIORITIZATION

The advancements in biotechnology brought about by the sequencing of the human genome and
the investment in high throughput screening tools to mine large chemical libraries for potential
drugs have for the first time allowed a broad scale, unbiased examination of the molecular and
cellular targets of chemicals. At this time, the examination of the relationships between the
molecular and cellular targets of chemicals and the traditional endpoints of toxicity is at an early
stage of development. Even upon characterization of these types of relationships, significant
phenotypic data will be required to critically establish the role of toxicity pathways in evaluating
hazards and risks. The great potential is that identification of a toxicity pathway and
development of an in vitro bioassay for studying its chemical interactions will enable evaluation
of the effects of thousands of chemicals in that pathway. Broadening this approach to the many
toxicity pathways present in living systems allows a new avenue for identifying those chemicals
that pose the greatest potential hazard. Knowledge of the toxicity pathways triggered by any one
chemical will also allow targeting of specific in vivo tests to more fully characterize the potential
hazard and risk. The identification of toxicity pathways for key target tissues, organs, and
lifestages, and their linkage across levels of biological organization and exposure pathways and
intensities are core elements of this strategy.

Figure 2. Toxicity Pathways Target Multiple Levels of Biological Organization. Levels shown, from
top to bottom: Chemicals; Receptors / Enzymes / etc. (Direct Molecular Interaction); Pathway
Regulation / Genomics; Cellular Processes; Tissue / Organ / Organism Tox Endpoint.

As indicated in Figure 2, chemicals may interact with a single pathway (the blue chemical) or
multiple pathways (the yellow chemical). Also, multiple pathways can lead to the same
expression of toxicity in the target organ as signaling pathways converge on common elements.
It is important to note that multiple mechanisms of action for any particular adverse response likely
exist, and that many environmental pollutants are likely to have multiple mechanisms of action. Two
critical components of the toxicity pathway concept are (1) extending knowledge of molecular
perturbations and cell signaling pathways to understand linkages between levels of biological
organization and (2) extending knowledge of in vitro and in vivo
markers relevant to adaptive changes and/or adverse outcomes (see Section 5). As the research
moves forward, it will be important to capture quantitative relationships between the molecular
events and the higher order changes. Demonstration of plausible connectivity along the
mechanism of action from initiating event to adverse outcome will serve as the rationale for
using data from subcellular or cell-based in vitro assays for not only chemical prioritization but
also predictive risk assessment. As toxicity pathways are identified, relevant in vitro assays can
              8 Mode of action is defined as a sequence of key events and processes, starting with interaction of an agent with a
              cell, proceeding through operational and anatomical changes, and resulting in an adverse health effect. Mechanism
              of action implies a more detailed understanding and description of events, often at the molecular level, than is meant
              by mode of action.

-------
             be utilized and their results compared to in vivo studies as appropriate given the need to predict
             effects in humans or other species. While comparing responses to those in animal bioassays will
             be an early milestone of this strategy, the ultimate goal is the prediction of human risk.
             Therefore, efforts will shift towards that goal as experience with the approach increases. An
             added benefit to the toxicity pathway approach is that mixtures or their components could be
             evaluated in this manner, and as knowledge grows, it will be possible to predict where
             interaction with multiple toxicity pathways might be expected to lead to non-additive outcomes.
             This latter activity will be an important outcome of the research highlighted in Section 4.2
             (Strategic Goal 4) that is focused on the development of virtual tissue models. As noted below,
             virtual tissue models will also provide a basis for predicting emergent properties of tissues by
             integrating knowledge of molecular and cellular behaviors obtained from reductionist in vitro
             approaches.

             In 2007, EPA launched ToxCast™9 in order to develop a cost-effective approach for prioritizing
             the toxicity testing of large numbers of chemicals in a short period of time. Using data from a
             broad range of state-of-the-art HTS bioassays developed in the pharmaceutical industry,
             ToxCast™ is building computational models to forecast the potential human toxicity of
             chemicals. Results from the HTS bioassays are being analyzed for signatures of bioactivity that
             correlate with known toxicities. These hazard predictions will provide EPA regulatory programs
             with science-based information helpful in prioritizing chemicals for more detailed toxicological
             evaluations, and lead to more efficient use of animal testing.
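
             To make the idea of a bioactivity signature concrete, the sketch below correlates binary assay
             results (1 = active, 0 = inactive) with a known toxicity label across a small set of chemicals.
             The assay names, the data, and the use of a plain Pearson correlation are hypothetical
             assumptions for illustration; this is not a description of the models ToxCast actually uses.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / sqrt(var_x * var_y)


# Hypothetical data: each list holds one assay's result for six chemicals.
assay_hits = {
    "assay_1": [1, 1, 0, 0, 1, 0],
    "assay_2": [0, 1, 0, 1, 0, 1],
    "assay_3": [1, 1, 0, 0, 0, 0],
}
# Hypothetical known in vivo toxicity outcome for the same six chemicals.
known_toxic = [1, 1, 0, 0, 1, 0]

correlations = {name: pearson(hits, known_toxic) for name, hits in assay_hits.items()}
best_assay = max(correlations, key=correlations.get)
```

             Assays whose activity pattern tracks the known outcome (here assay_1) would be candidate
             components of a signature, while an assay with negative or near-zero correlation contributes
             little on its own.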

             The research described here focuses on two major strategic goals:
                 1) Identification of toxicity pathways and deployment of in vitro assays to characterize the
                    ability of chemicals to perturb those pathways in different biological contexts, and
                 2) Implementation of ToxCast™, with an initial focus on providing input for chemical
                    prioritization, shifting over time to providing input for dose-response modeling.
             A key feature of ToxCast™ is the phased nature of implementation (see Strategic Goal 2, Section
             3.2), from proof of concept, to forward validation, and finally to reduction to practice. The
             number of chemicals will grow from the hundreds to the thousands, and the number of assays
             will change as experience and biology dictate. As the number of chemicals and breadth of
             toxicity pathways covered increase, ToxCast™ will improve as a unique resource to build
             chemo-informatic-based predictions of chemicals' potential human toxicity. Such advancements should
             help promote improved QSAR models and data upon which to build virtual tissue models.

             Exposure science also plays a large role in this strategy. Simpler and more reliable screening
             models are needed that predict exposures to chemicals so that information from the full
             source-to-outcome continuum is brought into consideration in the evaluation of chemicals, a critically
             important step for new chemicals that have not yet been released into the environment. Examples
             of such simple methods and models for new chemicals can be found at EPA's Sustainable
             Futures Initiative.10 Additional such models should further evaluate exposure based on the life
             cycle of intended product use and the physical-chemical properties of the chemicals. This
             9 http://www.epa.gov/ncct/toxcast/
             10 http://www.epa.gov/oppt/sf/

-------
             research should include the expansion of computational chemistry methods to further predict
             exposures as well as methods to predict release into the environment during product life cycle.
             Several additional screening-level models are currently under development in Canada and
             Europe. Research in this area should be coordinated with these groups to facilitate an
             international approach for chemical screening. EPA should promote easy public access to all of
             these additional models through the Internet.

             3.1.   Strategic Goal 1: Toxicity Pathway Identification and Assay Development

             The most systematic and extensive approach currently underway for screening and prioritization
             is EPA's ToxCast™. Fully implementing the proposed strategy for more efficient toxicity testing
             will utilize a combination of the more exploratory ToxCast™ chemical signature approach (see
             Strategic Goal 2), and the more hypothesis-driven approaches to elucidating toxicity pathways.
             Developing systems-based models will require comprehensive identification of the biological
             processes that can result in toxicity when they are perturbed by chemical exposures. Therefore,
             toxicity pathway identification and development of appropriate in vitro assays to characterize the
             dose-response and time course of perturbations to those pathways will be needed. Measurement
             of chemical form and concentration from in vitro assays will also be important in
             hypothesis-driven research that seeks to establish linkages between perturbations of toxicity pathways and
             adverse effects, as well as for establishing structure-activity relationships. These research goals
             will utilize a range of methods (e.g., transcriptomic, proteomic, metabolomic, cellular, and
             biochemical analyses) to identify toxicity pathways using in vivo and in vitro systems. The in
             vitro assays and toxicity pathways already included in the ToxCast™ project will be a part of this
             research, but additional assays providing greater coverage of relevant toxicity pathways will
             need to be developed. For example, developmental neurotoxicity key responses are known to
             include cell proliferation, apoptosis, differentiation (into different cell types and creating
             different functionality/architecture of a cell), neurite outgrowth, synaptogenesis, and myelination
             (Coecke et al., 2007; Lien et al., 2007), but the underlying molecular pathways are not yet
             completely identified. Through the informed use of newer "systems-based" approaches (Edwards
             & Preston, 2008), the flow of molecular regulatory information underlying the control of these
             cellular events can be characterized, classified, and modeled. To facilitate use in risk assessment,
             these studies will be coupled with mechanism of action-based studies, including animal and
             human components as described in Strategic Goal 4.

             Current priorities for research include developing in vitro assays for the key targets of chemicals
             in the environment for which limited knowledge is available (e.g., developmental neurotoxicity,
             immunotoxicity, reproductive toxicity) as well as for relatively well-characterized toxicity
             pathways such as stress response signaling. Studies representative of the full range of human
             variability will be necessary to characterize processes that may occur more readily in sensitive
             populations (e.g., asthmatics) or at certain lifestages (e.g., prenatal development). Additional
             emphasis needs to be placed on toxicities demonstrated to occur in humans. For example, clinical
             trials or post-marketing surveillance for pharmaceuticals, as well as molecular and genetic
             epidemiology studies, afford the opportunity to examine effects of chemicals already introduced
             into the environment that may not currently be well assessed by in vivo animal toxicity studies.
             Some of these pathways may be important for environmental chemicals with respect to human
             variability or exposure to complex mixtures.

3.2.    Strategic Goal 2: Chemical Prioritization

[Figure 3 graphic; legible labels: HTS, -omics; Bioinformatics / Machine Learning; Cancer, ReproTox, DevTox, NeuroTox, PulmonaryTox, ImmunoTox; cost scale in $ thousands.]

This strategy extends approaches that are currently under development for EPA's ToxCast™
program to include greater coverage of toxicity pathways and chemicals. The goal of the
ToxCast™ program is to provide a comprehensive assessment of toxicity pathways for a
relatively low cost per chemical (current estimates are in the range of $20,000-23,000). ToxCast™
(see Figure 3) was designed to collect data from a wide range of in vitro assays, mostly
mechanistic in nature, to prioritize which chemicals to test further and which in vivo studies
were likely most important. This screening and prioritization approach provides a near-term
benefit during an extended transition to the more comprehensive proposed vision. As more
comprehensive descriptions of processes involved in toxicological responses become available,
different assays may be identified to replace those in the initial ToxCast™ effort, and the
relationship to in vivo studies will shift from prioritization to providing input for
dose-response modeling.

ToxCast™ is being developed in a phased manner. During FY08-09 substantial progress will be
made on the first two phases of the ToxCast™ program (Dix et al., 2007; Kavlock et al., 2008).
Phase I is a proof of concept involving 320 chemicals that have robust in vivo animal toxicity
information. These chemicals have been profiled using over 400 high- and medium-throughput in
vitro assays. From these in vitro bioactivity profiles, classifiers or signatures predictive of
chemicals' in vivo toxicity are being derived. Phase II will involve validation of the predictive
bioactivity and expansion of the diversity of chemicals tested. Phase III is the most relevant to
this strategic plan, as it would begin to apply the knowledge gained in Phases I and II to the tens
of thousands of chemicals of concern to EPA regulatory offices. An adaptation of the approach
to evaluate the hazardous properties of nanomaterials is also anticipated.
                                       Figure 3. ToxCast™ is using a variety of HTS assays to develop bioactivity
                                       signatures that are predictive of effects in traditional toxicity testing approaches.
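
As a rough illustration of how a predictive "signature" might be derived from in vitro bioactivity profiles, the sketch below keeps the assays whose hit rate differs most between chemicals that are toxic and non-toxic in vivo, then uses those assays to flag a new chemical. It is not the ToxCast™ method: the chemical names, assay names, hit calls, in vivo labels, and both thresholds are hypothetical placeholders chosen only to make the example runnable.

```python
from typing import Dict, List

# Hypothetical toy data: 1 = assay hit, 0 = inactive. The assay and chemical
# names are placeholders, not real ToxCast identifiers.
PROFILES: Dict[str, Dict[str, int]] = {
    "chem_A": {"assay_1": 1, "assay_2": 1, "assay_3": 0},
    "chem_B": {"assay_1": 1, "assay_2": 0, "assay_3": 0},
    "chem_C": {"assay_1": 0, "assay_2": 0, "assay_3": 1},
    "chem_D": {"assay_1": 0, "assay_2": 0, "assay_3": 0},
}

# Hypothetical in vivo labels (True = toxic in the reference animal study).
IN_VIVO: Dict[str, bool] = {
    "chem_A": True,
    "chem_B": True,
    "chem_C": False,
    "chem_D": False,
}


def hit_rate(assay: str, chemicals: List[str]) -> float:
    """Fraction of the given chemicals that are active in one assay."""
    return sum(PROFILES[c][assay] for c in chemicals) / len(chemicals)


def derive_signature(min_gap: float = 0.5) -> List[str]:
    """Keep assays hit more often by toxic than by non-toxic chemicals."""
    toxic = [c for c, is_toxic in IN_VIVO.items() if is_toxic]
    non_toxic = [c for c, is_toxic in IN_VIVO.items() if not is_toxic]
    assays = sorted(next(iter(PROFILES.values())))
    return [
        a for a in assays
        if hit_rate(a, toxic) - hit_rate(a, non_toxic) >= min_gap
    ]


def predict(profile: Dict[str, int], signature: List[str]) -> bool:
    """Flag a chemical if it hits at least half of the signature assays."""
    if not signature:
        return False
    hits = sum(profile[a] for a in signature)
    return hits / len(signature) >= 0.5


if __name__ == "__main__":
    signature = derive_signature()
    print("signature assays:", signature)
    print("flag new chemical:", predict({"assay_1": 1, "assay_2": 0, "assay_3": 1}, signature))
```

A real analysis would use many more chemicals and assays, a statistical or machine learning classifier, and validation on chemicals held out from training, which is the role the text assigns to Phase II.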

                               4. TOXICITY PATHWAY BASED RISK ASSESSMENT

              The goals of the proposed new strategy for toxicity testing include collecting mechanistic data,
              largely in vitro, for the purpose of predicting human risk from exposure to chemicals. Prediction
              of in vivo effects in humans requires a combination of measurements and computer modeling to
              link in vitro responses to tissue dosimetry to alterations in the structure and function of tissues
              and organs. A substantial challenge will be to address the range of human variability arising from
              differences in age, life stage, genetics, disease susceptibility, epigenetics, diet, disease status, and
              other factors that potentially influence or interact with toxicity pathways.

              [Figure 4 diagram; legible labels: Environmental Chemicals, Molecular Sensing, Cellular
              Signaling, Tissue Responses, Knowledgebase, Toxicity Pathways, Molecular Networks, Cellular
              Networks, Virtual Tissues, Dose-Response.]
              Figure 4. Toxicity Pathways to Dose-Response. The vertical arrows at each step in the process
              reflect the iterative nature of experimentation and modeling needed to gain full understanding
              of both the toxicity pathway determination and the relationship to normal biology.

              The initial process for predicting human risk under this new approach could be summarized as
              (1) characterizing or predicting potential human exposures; (2) estimating the resulting chemical
              dosimetry (magnitude, frequency, and duration) for target pathways, tissues, or organs; (3)
              measuring toxicity pathway response at doses consistent with human exposures; (4) predicting
              the in vivo human response resulting from pathway perturbations; (5) quantifying the range of
              human variability and susceptibility; and (6) validating predictions utilizing in vivo systems (e.g.,
              laboratory animals, human data). In the current state of mechanistic toxicology (top row of
              Figure 4), chemicals are administered to the test animals (usually at high doses), a variety of
              biochemical approaches are used to detect alterations in molecular pathways, the data are mined
              to describe the ensuing cellular alterations (e.g., oxidative stress damage, mitochondrial
              dysfunction), and tissue changes are confirmed at the level of morphology or function. The
             bottom row of the figure depicts the vision for future ways of assessing risk, which includes
             determining the key toxicity pathways, defining approaches for examining perturbations in
             molecular networks, and translating the results to responses at the cell, and ultimately tissue and
             organ level, using computational models of the relevant systems. The expectation is that
             assessments in the future will utilize data from in vitro studies, and the need for in vivo animal
             testing will be substantially reduced. However, until the state of science of this new approach has
             reached a level of confidence for use in regulatory decision making, the traditional approach to
             toxicity testing will continue into the foreseeable future. With time, we  expect that it will be
             progressively augmented and ideally replaced by computational models that integrate the
             information generated from non-animal sources into predictive models of response based upon
             the underlying biology. The vertical arrows at each step in the process reflect the iterative nature
             of experimentation and modeling needed to gain full understanding of both the toxicity pathway
             determination and the relationship  to unperturbed biology. One anticipated outcome of the
             development of virtual tissues will be an increased understanding of the role of metabolism and
             of intra- and inter-cellular signaling pathways. This understanding will  lead to the development
             of improved in vitro systems that, for example, might include combined cell-based systems to
             provide metabolic competency or to better reflect the  intercellular responses in heterogeneous
             tissues.

             As the transition progresses, it is important that increased emphasis be placed on
             examination of exposure concentrations that are expected to occur in the environment. The key
             difference in future toxicity evaluations will be the transition to a focus on ways in which
             molecular pathways (as detected by in vitro models) are perturbed by chemical exposure
             throughout the range of exposures, from environmental levels to the higher dose levels commonly
             used in contemporary toxicity studies. Dosimetry measurements coupled with computational
             modeling will be critical for predicting in vivo exposure levels of concern and for determining
             relevant in vitro concentrations. Some responses of targeted toxicity pathways can be evaluated
             in simpler cell culture models, whereas, in other cases, multiple in vitro assays may be necessary
             for the integration of multiple pathways that produce in vivo responses. These situations would
             require biologically based models for the responses as well as for chemical dosimetry in order to
             predict the integrated in vivo response.
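
To make the idea of examining a pathway response across the full exposure range concrete, the following minimal sketch evaluates a Hill-type concentration-response curve from low (environmental) to high (test-dose) concentrations. The Hill form is a common choice for in vitro concentration-response data, but it is an assumption here rather than a model specified by this plan, and the concentrations, AC50, slope, and maximum response are all hypothetical values.

```python
from typing import List


def hill_response(conc_um: float, top: float, ac50_um: float, slope: float) -> float:
    """Hill curve: rises from 0 toward `top`, reaching top/2 at the AC50."""
    if conc_um <= 0:
        return 0.0
    ratio = (conc_um / ac50_um) ** slope
    return top * ratio / (1.0 + ratio)


def response_profile(
    concs_um: List[float],
    top: float = 100.0,      # hypothetical maximum response (% of control)
    ac50_um: float = 10.0,   # hypothetical half-maximal concentration (uM)
    slope: float = 1.5,      # hypothetical Hill slope
) -> List[float]:
    """Evaluate the curve at each concentration in the list."""
    return [hill_response(c, top, ac50_um, slope) for c in concs_um]


# Hypothetical range spanning environmental to high test concentrations (uM).
CONCENTRATIONS_UM = [0.01, 0.1, 1.0, 10.0, 100.0]

if __name__ == "__main__":
    for conc, resp in zip(CONCENTRATIONS_UM, response_profile(CONCENTRATIONS_UM)):
        print(f"{conc:>7.2f} uM -> {resp:6.2f} % of maximum response")
```

With these placeholder parameters the predicted perturbation is negligible at the lowest concentration and near maximal at the highest, which illustrates why responses measured only at high doses say little about environmentally relevant exposures.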

             Implementing this new paradigm requires organization of existing scientific information;
             computational methods for exposure, chemical dosimetry, and perturbations of biological
             processes; and evaluation of the  methods for risk assessment applications. The research program
             to implement this element of the strategy is defined by three goals: development of toxicity
             pathway and exposure knowledgebases; development of virtual tissues, organs, and systems; and
             evaluation of human relevance.

             4.1.   Strategic Goal 3: Toxicity Pathway Knowledgebases

             The underlying basis of the 2007 NRC report is that there  are a finite number of toxicity
             pathways (i.e., in the hundreds) that could be queried  using in vitro assays to obtain insights into
             the ability of chemicals to perturb those pathways. It refers to several stress pathways (e.g.,
             oxidative stress response) and notes the general listing of signaling pathways in a previous NRC
             report (2006). However, an inventory of toxicity pathways and their involvement in a variety of
             toxicological responses needs to be created. Likewise, from exposure science there needs to be a
             complementary effort focusing on those chemical properties and computational methods that
             could be used to reliably predict behaviors in the environment and exposures. This effort would
             include information on stability in the environment, likely routes for exposure, potential for
             bioaccumulation, and extent of metabolism. Therefore, a strategic goal is the development of a
             knowledgebase for toxicity pathways and exposure. Knowledgebases differ from traditional
             databases in the extent of integration of information and the inclusion of tools that can draw
             inferences from amongst the diverse elements.

             The knowledgebase would serve a variety of functions throughout the research and development
             effort associated with implementing this new approach to toxicity testing and will become a
             standard tool in the risk assessments of the future. ACToR (Figure 5), the Aggregated
             Computational Toxicology Resource under development in ORD, is an example of the needed
             approach of bringing together diverse types of information into a system where
             interrelationships of individual database elements (e.g., traditional toxicology, chemical
             structure information, high throughput screening data, molecular pathway analysis, chemical
             data repositories, peer reviewed published literature, and internal Agency databases) can be
             explored and utilized (Judson et al., 2008). Key steps in development of these knowledgebases
             include: (1) creating electronic repositories of existing toxicity information; (2) developing
             semantics for describing toxicity pathways; (3) automating pathway inference tools to aid in
             discovering mechanistic links between genomic information and molecular and cellular
             observations; and (4) creating a toolbox with a user-friendly interface to organize, access,
             and analyze toxicity pathway assay results.
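
The distinction drawn above between a database and a knowledgebase (integrated records plus tools that draw inferences across them) can be sketched in a few lines. The example below is a toy, not a description of ACToR: the class, the relation names (`active_in`, `measures`, `linked_to`), and every chemical, assay, pathway, and outcome in it are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple


class ToyKnowledgebase:
    """Stores (subject, relation, object) facts and chains them for inference."""

    def __init__(self) -> None:
        self._edges: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add(self, subject: str, relation: str, obj: str) -> None:
        self._edges[(subject, relation)].add(obj)

    def objects(self, subject: str, relation: str) -> Set[str]:
        # .get() avoids creating empty entries for unknown keys.
        return set(self._edges.get((subject, relation), set()))

    def infer_outcomes(self, chemical: str) -> Dict[str, Set[str]]:
        """Follow chemical -> assay -> pathway -> outcome links.

        Returns each candidate outcome with the pathways that support it.
        """
        supported: Dict[str, Set[str]] = {}
        for assay in self.objects(chemical, "active_in"):
            for pathway in self.objects(assay, "measures"):
                for outcome in self.objects(pathway, "linked_to"):
                    supported.setdefault(outcome, set()).add(pathway)
        return supported


def build_example() -> ToyKnowledgebase:
    """Populate the toy knowledgebase with hypothetical facts."""
    kb = ToyKnowledgebase()
    kb.add("chem_X", "active_in", "assay_er_binding")
    kb.add("assay_er_binding", "measures", "estrogen receptor signaling")
    kb.add("estrogen receptor signaling", "linked_to", "reproductive toxicity")
    return kb


if __name__ == "__main__":
    print(build_example().infer_outcomes("chem_X"))
```

The inference step is what separates this from a plain lookup table: no single stored fact links the chemical to the outcome, yet the chain of facts does.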

             Figure 5. Knowledgebase Development. ACToR brings together a diverse set of currently
             unlinked resources available from internal and external sources into a system with a
             user-friendly interface to readily mine and analyze toxicity data.

             4.2.   Strategic Goal 4: Virtual Tissues, Organs, and Systems: Linking Exposure,
                    Dosimetry, and Response

             Computational techniques relevant to this strategy fall into two general branches: knowledge
             discovery (data collection, mining, and analysis), represented in Strategic Goal 3, and dynamic
             computer simulation (mathematical modeling at various levels of detail), described in this
             section. The central premise of the latter approach is that critical effects of environmental agents
             on molecular-, cellular-, tissue-, and organ-level pathways can be captured by computational
             models that focus on the flow of molecular regulatory information (Knudsen & Kavlock, 2008).
             This information flow is influenced by genetic and environmental signals, with the net outcome
             being the emergent properties associated with baseline or abnormal collective cell behavior.
             Thus, computational systems modeling will be used to predict organ injury due to chemical
             exposure by simulating: (1) the dynamics and characteristics of exposure and dose, (2) the
             dynamics of perturbed molecular pathways, (3) their linkage with processes leading to alterations
             of cell state, and (4) the integration of the molecular and cellular responses into a physiological
             tissue model. By placing a strong emphasis on understanding the biology of the system and the
             key regulatory components, these virtual tissue models represent a significant opportunity to
             better understand the linkage between chemically induced alterations in toxicity pathways and
             effects at the organ level. This research represents an ambitious effort, conceivable for the first
             time due to the current technological advances. Virtual tissue and organ system models will
             initially include liver, cardiopulmonary function, selected immune system tissues, multi-organ
             endocrine axes, and developing embryonic tissues.  Development of these virtual tissue and organ
             systems will require newly generated data to both fill data gaps identified within the iterative
             process and test the predictive nature of these virtual systems. Comparative studies should
             include pathways fundamentally reliant upon cell signaling (e.g., cell proliferation, apoptosis,
             cell adhesion), intermediary metabolism (e.g., glycolysis, oxygen utilization, fatty acid
             biosynthesis), differentiation-specific functions (e.g., extracellular matrix remodeling), and other
             categories as developed above (see Strategic Goal 1) to ensure that predictions are broadly
             applicable. The wealth of existing data from  NTP assays, published reports, and previous EPA
             intramural studies will be leveraged wherever possible with additional experiments designed to
             fill data gaps. Such efforts will also help answer how well in vitro experimental systems
             represent the full range of diverse cells present in the human body, how variability observed in
             the human population can modify quantitative predictions of in vivo dose-response, how
             exposure conditions influence outcomes, and how well the virtual tissue models represent the
             underlying processes.

             Not all toxicity pathways are likely to be expressed in every tissue, and likewise not all tissues
             are likely to manifest adverse outcomes following chemical perturbation. Chemicals that affect
             the same toxicity pathway can do so via a number of different (and overlapping) mechanisms,
             and development of assays across toxicity pathways leading to the same outcome is a necessary
             component of the proposed strategy. Some toxicities are manifest only when multiple cell types
             and specific cell-cell interactions are present. Other toxicities may be dependent upon tissue
             geometry and three-dimensional architecture. Examples include signaling between hepatocytes
             and Kupffer cells, or the many forms of signaling between epithelial and mesenchymal cells. As
             such, developers of virtual cells, tissues, organs, and systems must always bear in mind the need
             to remain relevant to the processes critical to expressions of toxicity in vivo. Consistent with the
             NRC vision (2007), this need will likely entail a continued although decreasing role for in vivo
             systems for the foreseeable future.

             A premise of the new toxicity testing strategy is that computational methods combined with an
             understanding of biological and exposure processes can be used to develop a more efficient and
             accurate approach for predicting risks from many chemicals. On the exposure side, models have
             been developed and are available that predict fate and transport, environmental concentration,
             exposures, and doses. These models work at multiple scales; for multiple sources, routes, and
             pathways; and for multiple chemicals, although each model only addresses a single process or
             compartment. Research is needed so that such models can take into account weathering of
             contaminants, differences in bioavailability of contaminants, variations in exposures with age,
             and variability in exposures within populations. Research is also needed to combine these models
             across various scales to develop a linked source-to-outcome modeling framework, to evaluate the
             framework using multiple chemicals and exposure scenarios, and to improve the computational
              efficiency for the approach. Ultimately, these exposure models will be linked to the virtual tissue
              models for utilizing in vitro toxicity test results in quantitative risk assessments. Given the
              complexity of the challenges present in addressing each of these components, this effort
              represents a long-term goal of the strategy. However, efforts must begin now to put us on the
              path to achieving the ultimate vision of Toxicity Testing in the 21st Century (NRC, 2007).

              The derived computational models must accurately describe the processes and mechanisms that
              determine exposure and effect. They must have reliable input parameters in order to quantify
              these processes. On the exposure side, our current understanding of processes and factors for
              many classes of chemicals and pathways (i.e., dermal and incidental ingestion) is limited. New
              approaches will be evaluated that will allow us to address the most significant uncertainties.
              Relational databases populated with data on exposures, exposure factors, activity patterns, and
              biomarkers will be developed as described. Informatic approaches or applications of network
              theory could potentially be used to provide a better understanding of important exposures, as
              well as exposure/response relationships. In the 2007 NRC report, emphasis was placed on
              biomarkers and their role in relating real world exposure to in vivo and in vitro biological
              response. They were also proposed as primary indicators in surveillance programs for tracking
              predicted exposures and health outcomes. Because of this emphasis, novel approaches for using
              biomarkers and integrating them into new risk assessment approaches will be investigated for
              chemicals already existing in the human environment. Perhaps such biomarker data can be used
              to improve predictive exposure models that will  be relied upon for new  chemicals not yet
              introduced into the environment.

              4.3.   Strategic Goal 5: Human Evaluation and Quantitative Risk Assessment

              The critical challenge of this new vision for toxicity testing using mechanistic in vitro assays,
              targeted in vitro or in vivo testing, and computational models is to demonstrate that it
              successfully and adequately predicts human toxicological responses. Proof of concept efforts
              need to address this challenge both retrospectively and prospectively. Existing human data from
              pharmaceutical and environmental studies will be used to the extent possible. Human data could
              come from a range of sources including case reports, epidemiological studies (e.g., from the
              National Children's Study), and clinical trials. EPA has extensive experience obtaining human
              clinical data following exposure to the criteria air pollutants (e.g., ozone, particulate matter) and
              other chemicals (e.g., MTBE).* Engagement of the pharmaceutical industry and the Food and
              Drug Administration to access toxicity findings from clinical trials of drugs that were
              successfully registered or that failed to be registered would be a desirable component of this
              effort. Limited data may be available for some nutrients or dietary supplements as well.

              * All EPA conducted or supported research is subject to and must comply with EPA regulations
              on the protection of human subjects. See
              http://www.epa.gov/fedrgstr/EPA-GENERAL/2006/February/Day-06/g1045.htm,
              http://www.epa.gov/oamrtpnc/forms/1000_17a.pdf

              Such efforts will help address the question of the extent to which key events (critical
              perturbations) that are predictive of health endpoints (e.g., cancer, immunosuppression, kidney
              disease) must be demonstrated or whether the perturbation of baseline biological processes
              sufficient to induce substantial cellular level response (e.g., a stress response) should be
              considered an adequate endpoint for risk assessment. Linking a specific pathway perturbation to
             a particular target organ endpoint has the advantage of predicting outcomes that are already used
             in risk assessment, while alternative approaches raise issues of which endpoints should and
             should not be considered for risk assessment. This approach is relatively straightforward for
             some effects (e.g., hemolysis of red blood cells by EGBE, where the effect and the mechanism of
             action leading to it are qualitatively the same, even if quantitatively different). Linkage is more
             complicated for effects observed in animals that may predict human effects that are related, but
             not identical to, the outcomes in animals (e.g., developmental effects in an animal model may
             predict developmental effects in humans, but the exact manifestation might be different). On the
             other hand, as knowledge is gained about the interaction of chemicals with molecular targets, and
             this knowledge is combined with information on how perturbations of those targets are translated
             to responses in species-specific patterns (e.g., how activation of certain transcription factors leads
             to species-specific tissue responses), it will be increasingly possible to predict human outcomes
             from in vitro studies that identify mechanism of action. Clearly this aspect will need to be
             addressed on a case-by-case basis as we gain experience.

             To be most useful in evaluation of risk  to humans, the pathway-based efforts should ideally be
             tied to a known mechanism of action, such as  via the use of quantitative biologically based, dose-
             response models. Understanding of the relevant mechanism of action will enable the
             identification of biomarkers for key event parameters (linked to toxicity pathways) that can be
             monitored in human studies for those chemicals already released into the environment at
             significant levels. These biomarkers could be measured in observational human studies to
             provide in vivo data to support the underlying  pathway-based  model. In addition, genetic
             susceptibility in humans identified via whole genome association studies will provide support for
             pathway-based models when genes critical for a key toxicity pathway are associated with
             susceptibility. Finally, the use of quantitative models requires estimation of uncertainty and
             variability in the predictions from in vitro assays and computational models. Formal methods for
             model evaluation are essential for demonstrating the success of this new approach to toxicity
             testing and risk assessment.
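
One simple way to estimate uncertainty and variability in a model prediction is Monte Carlo sampling, sketched below. This is an illustrative assumption, not a method prescribed by this plan: the prediction is reduced to the product of an uncertain in vitro potency estimate and a person-to-person scaling factor, and both lognormal distributions and all of their parameters are hypothetical.

```python
import math
import random
from typing import Dict


def simulate_effect_dose(n_draws: int = 5000, seed: int = 0) -> Dict[str, float]:
    """Propagate uncertainty and variability through a toy prediction.

    Returns the 5th, 50th, and 95th percentiles of the simulated values.
    All distributions and parameter values are hypothetical placeholders.
    """
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    draws = []
    for _ in range(n_draws):
        # Uncertainty: in vitro potency estimate (arbitrary units), median 10.
        potency = rng.lognormvariate(math.log(10.0), 0.5)
        # Variability: person-to-person scaling factor, median 1.
        person_factor = rng.lognormvariate(0.0, 0.4)
        draws.append(potency * person_factor)
    draws.sort()

    def percentile(q: float) -> float:
        """Nearest-rank style percentile on the sorted draws."""
        return draws[int(q * (len(draws) - 1))]

    return {
        "p5": percentile(0.05),
        "p50": percentile(0.50),
        "p95": percentile(0.95),
    }


if __name__ == "__main__":
    summary = simulate_effect_dose()
    print({name: round(value, 2) for name, value in summary.items()})
```

Reporting the spread between the 5th and 95th percentiles alongside the median keeps the uncertainty visible in the result. A fuller treatment would sample uncertainty and variability in separate nested loops so their contributions can be distinguished.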

                                         5.  INSTITUTIONAL TRANSITION

              Implementing major changes in toxicity testing of environmental contaminants and incorporating
              new types of toxicity data into risk assessment will require significant institutional changes. This
              section will touch upon three major thrusts of implementing institutional transition: operational
              transition, organizational transition, and outreach.

              5.1.    Strategic Goal 6: Operational Transition

              Operational transition covers the technical aspects associated with EPA's implementation of a
              new toxicity testing paradigm and associated changes in risk assessment. It will consider such
              disparate topics as the importance of grounding the science, ensuring consistency of approaches
              within EPA, and working with outside partners and issues associated with the use of new models
              and tools.

              The NRC "envision[s] a future in which tests based on human cell systems can serve as better
              models of human biologic responses than apical studies in different species." Achieving such a
              future, however, will require substantial research to study and define various toxicity pathways.
              In evaluating possible options for the future of toxicity testing, the NRC eventually chose an
              option involving both in vitro and in vivo tests but based primarily upon human biology and the
              attendant use of substantially fewer animal studies that would be focused on mechanism and
              metabolism. Their vision for the next 10 to 20 years relies on understanding perturbations of
              critical  cellular responses and the use of computational approaches for assessing hazard and risk.

              A paradigm shift in toxicity testing based on pathway perturbation will likely require significant
              methodological advances and future changes to EPA's risk assessment guidelines. Although it is
              infeasible to denote a specific timeline for how long it will take to substantially complete the
              strategic goals associated with toxicity pathway identification, chemical screening and
              prioritization, and toxicity pathway-based risk assessment, this plan takes the view that advances
              are likely to be gradual over the next decade or two. The good news is that toxicity testing
              research efforts have already begun moving EPA and others towards the use of in silico
              technologies and high throughput testing systems. The speed at which we are able to complete
              this transition will  depend on the availability of increased research funding.  It is  important to
              note that our understanding of toxicity pathways for some apical endpoints (e.g., hepatotoxicity)
              may be developed at a faster pace than others (e.g., neurotoxicity), thus allowing more rapid
              introduction of newer high-throughput in vitro testing methods.

              Grounding the Science - From a broad regulatory perspective, data  used by EPA to support
              regulatory decisions will be shaped by the statutory language covering the action, regulatory
              policies, and the resulting time and resources allocated to the assessment. Where appropriate, use
              of data  should be consistent with the EPA guidance articulated in a number of science policy and
              guidance documents, including toxicity testing guidelines, risk assessment guidelines12,
              information quality guidelines13, and peer review guidance.14
              12 http://www.epa.gov/risk/guidance.htm
              13 http://www.epa.gov/quality/informationguidelines/
              14 http://www.epa.gov/peerreview/pdfs/Peer%20Revi

             To implement this new paradigm, regulators, stakeholders, and the public will need to develop
             confidence that the data generated can be used effectively and that public health will continue to
             be protected. A step-wise implementation is envisioned: first, experience will be gained from
             proof of concept studies using data from chemicals (e.g., pesticides) with a large set of toxicity
             data developed using the current paradigm. Availability of both new and traditional types of data
             will  allow extrapolation and comparison of results across methodologies.

             Optimally, early success stories that meet programmatic needs  in specific areas such as
             mechanism of action analyses or cumulative risk assessments will demonstrate the broader
             applicability of computational toxicology within the Agency. Reliability of the testing paradigm
             will  need to be evaluated via a comprehensive development and review process, involving public
             comment, harmonization with other agencies and international  organizations, and peer review by
             experts in the field. Bringing new methods into regulatory practice will require several phases
             starting from the development of the science and technologies, to technology transfer and
             building the regulatory infrastructure, to incorporation of the new tools into decision making.

             Because this transformative paradigm will rely on new and complex science and will  likely be
             surrounded by some controversy, an important part of regulatory acceptance will be to conduct
             research that will verify the approaches and models that will come to replace much of the way
             toxicity testing and risk assessments are conducted in the Agency today. An  important
             component of the effort to develop new approaches to testing will be to translate the research
             into  regulatory applications.

             Issues Associated With the Use of New Methods and Models - For this new paradigm to be
             successful, new methods and models should be thoroughly evaluated prior to their application
             and use in regulatory decision making. The computer-based models used by  the Agency should
             be publicly available. Testing methods should be accompanied by documentation that describes
             (1) the method and its theoretical basis, (2) the techniques used to verify that the method is
             accurate, and (3) the process  used to evaluate whether the method and the results are sufficient to
             provide an adequate basis for its use in regulatory decision making. Access to data to  allow for
             third party independent replication of results, to the extent practicable,  is essential. Such review
             is appropriate before the Agency relies on data from such a method.

             Working With Outside Partners -The appendix provides details about the many outside parties
             EPA will need to partner with in order to implement this strategic plan, including:

                •  Other federal bodies such as the National Toxicology Program (NTP) and the NIH
                    Chemical Genomics Center (NCGC), with whom EPA has a memorandum of
                    understanding to collaborate;
                •  The Interagency Coordinating Committee on the Validation of Alternative Methods
                    (ICCVAM), which is made up of representatives from 15 federal agencies that generate
                    or use toxicological data;
                •  Foreign governmental parties and programs such as REACH, which is the new European
                    Union Regulation on Registration, Evaluation, Authorization, and Restriction of
                    Chemicals that went into effect June 1, 2007;
             '* See htlp;;''epa.gov;crem libraw ORFAfguidancedrafll -JB pJf


                 •   The OECD (Organization for Economic Co-Operation and Development), which
                    represents over 30 countries in the Americas, Europe and Asia;
                 •   Academia;
                 •   Chemical industry; and
                 •   Non-governmental organizations.

             Case Study Development - Significant challenges, such as interpretation and communication of
             data obtained using new toxicity testing approaches, will emerge under a new paradigm for
             toxicity testing. A key feature of a successful communication strategy will be to develop case
             studies using new kinds of data that can serve as a basis to explore, evaluate, and most
             importantly explain hazard, dose-response, and exposure information in a risk assessment
             framework. Characterization  of risk information, both qualitative and quantitative, in a manner
             suitable for communication to risk managers will be a significant challenge for the research and
             risk assessment community, but it will be crucial if the new toxicity testing paradigm is to reach
             its potential.

             5.2.   Strategic Goal 7: Organizational Transition

             Organizational transition is meant to cover changes in direction over time with regard to
             deployment of human capital resources necessary to implement the new toxicity testing
             paradigm such as hiring of scientists with particular scientific expertise and training of existing
             scientific staff. For example, EPA has hired key new scientific staff and initiated training,
             including three new training courses in genomics designed and implemented by EPA's Risk
             Assessment Forum. Additional resources and training programs will be needed in both EPA's
             research program and its regulatory and regional programs.

             As noted in Section 2, several intra-agency, interagency, and international activities are already
             underway to begin the transformation that will change the nature of toxicity data generated and
             how it is used to assess chemically induced risks to human health. Substantial funding will be
             needed to provide the scientific basis for creating new testing tools; to verify the utility of new
             testing tools, including conducting peer review; to develop and standardize data-storage, data-
             access, and data-management systems; to evaluate predictive power for humans; and to improve
             the understanding of the implications of test results and how they can be applied in risk
             assessments used in environmental decision-making.

             EPA expects that the use of less expensive, high-throughput testing methods will allow for the
             generation of toxicity data for thousands of currently untested or under-tested chemicals. The
             availability of these new data will likely lead to the need for more staff to interpret the data for
             many more chemicals and manage their risks. Additionally, toxicity databases such as EPA's
             IRIS and models used to assess risks may need to undergo substantial  changes in the long term
             requiring future resources.

             5.3.   Strategic Goal 8: Outreach

             Outreach consists of those efforts that will be used to help educate the public and stakeholders as
             well as improve risk communication.

             In reaching out to the public, it will be important to re-emphasize points made by EPA
             Administrator Carol Browner in a 1995 memorandum to senior Agency staff about the Agency's
             policy related to its new Risk Characterization Program.16 This memorandum described the
             importance of adhering to the "core values of transparency, clarity, consistency, and
             reasonableness (which) need to guide each of us in our day-to-day work, from the toxicologist
             reviewing the individual (scientific) study, to the exposure and risk assessors, to the risk
             manager, and through to the ultimate decision-maker." Further, "because transparency in
             decision-making and clarity in communication will likely lead to more outside questioning of our
             assumptions and science policies, we must be more vigilant about ensuring that our core
             assumptions and science policies are consistent and comparable across programs, well grounded

             Stakeholder Involvement - Implementation of a paradigm shift in toxicity testing and related
             changes to risk assessment methods and practices will require a sustained effort over many years;
             the NRC envisioned some 10 to 20 years to reach its goal. This transition to
             new methods and approaches will need to be transparent, including efforts to share information
             with both the public and risk managers. It will be critical to effectively communicate with
             stakeholders (the public, scientists, federal and state agencies, industry, the mass media,
             nongovernmental organizations) about the new tools and the overall program regarding its
             strengths, limitations, and uncertainties. One way to enhance stakeholder involvement and ensure
             cooperation is to hold periodic workshops where all parties can gather to share information and
             progress; another tool is for EPA to establish a web portal to detail advancements in the science
             and relate these to improvements in risk assessment methods and practice.

             Collaboration among different elements in the research community involved in relevant research
             on new testing approaches will be needed to take advantage of the new knowledge, technologies,
             and analytical tools as they are developed, and collaboration between research and regulatory
             scientists will be vital to ensure that the methods developed can be reliably used in risk
             assessments of various types (initially qualitative, but ultimately both qualitative and
             quantitative). Mechanisms for ensuring sustained communication and collaboration, such as data
             sharing, will also be needed. Independent review and evaluation of the new toxicity testing
             paradigm should be conducted to provide advice for midcourse corrections, weigh progress,
             evaluate new and emerging methods, and make any necessary refinements in light of new
             scientific challenges and advances. This may be accomplished using existing EPA mechanisms for
             peer review, e.g., through reviews by the Board of Scientific Counselors, the Science Advisory
             Board, and the FIFRA Scientific Advisory Panel. For testing that the Agency may wish to
             require, performance standards should be considered so that individual methods from any
             qualified source may be used. The NRC (2007) stressed that "in vitro tests would be developed
             not to predict the results of current [animal] apical  toxicity tests but rather as [human] cell-based
             assays that are informative about mechanistic responses of human tissues to toxic chemicals.
             The  [NRC] committee is aware of the implementation challenges that the new toxicity-testing
             paradigm would face." Presumably, establishing regulatory confidence that the new approaches
             are robust and protective of human health will be at the forefront of future challenges for EPA
             and its partners.
             16 http://www.epa.gov/oswer/riskassessment/pdf/1995_0521_risk_characterization_program.pdf


             Risk Communication - Communicating with policy makers and the public is an important part
             of any risk management exercise. The complexity of the emerging toxicity testing paradigm and
             how new types of data and information will be used to assess risk will make communication of
             results challenging; consequently, the Agency must work to build public trust in the adopted
             technologies. As the science moves away from well-established animal models, a significant
             effort must be made to share information with risk assessors/managers and the public by clearly
             describing test results and methodologies in a transparent manner. A fundamental aspect of
             gaining public trust is transparency. Therefore, education and effective communication with
             stakeholders (the public, scientists, regulatory authorities, industry, the mass media, and
             nongovernmental organizations) on the strengths, limitations, and uncertainties of the new
             tools and paradigm will be critical.

             Given that these new methods will be less intuitive than looking for traditional effects in whole
             animal studies, communication strategies will be very important. At this time, much of EPA's
             effort in this area is presented on the Agency's National Center for Computational Toxicology
             Web site.17 As the new toxicity testing paradigm continues to evolve, the Agency will need to be
             vigilant in maintaining an interactive Web site to describe each individual assay or method in use
             and where it fits into the exposure-response continuum.

             When communicating about risk, it is important for the Agency to address the source, cause,
             variability, uncertainty, and the potential adversity of the risks, including the degree of
             confidence in the risk assessment methodology, the rationale for the risk management decision,
             and the options for reducing risk (U.S. EPA, 1995; U.S. EPA, 1998). EPA will continue to
             interact with stakeholders in order to develop and maintain effective informational tools.
             17 http://www.epa.gov/comptox/

                                                 6.  FUTURE STEPS

             This strategic plan describes an ambitious and substantive change in the process by which
             chemicals are evaluated for their toxicity. The NRC (2007) suggested that such a transformation
             would require up to $100M per year in funding over a 10-20 year period to have a reasonable
             chance of reaching the goals. Even including the resources of sister agencies, the overall federal
             budget for the collaborative efforts does not approach the NRC proposed level of funding.
             Decisions on the relative role of EPA vis-a-vis other partners will have a major impact on the
             resources that EPA needs to dedicate to this effort. These decisions will have to be made as the
             strategy is implemented. Explanation of these decisions,  their rationale, and implications will be
             included in a subsequent implementation plan.

             Regardless of the level of funding that is ultimately applied to the vision of a more efficient and
             effective chemical safety evaluation effort, translation of this strategy into research and activities
             related to operational and organizational change will require development of an implementation
             plan as well as periodic peer review of directions and progress. Representatives from those EPA
             organizations most involved and impacted by the new vision  will play key roles in the
             implementation program. The Science  Advisory Board and/or the Board of Scientific Counselors
             will play key roles in the scientific peer review of the program. As noted in Section 4, there will
             be a progression in the implementation efforts from an early focus on hazard identification to
             a growing emphasis on the use of toxicity pathway characterization in risk assessment. Support
             for institutional transitions is also expected to increase over time as the tools and
             technologies emerge out of the research programs and become available for regulatory use.
             Figure 6 depicts one potential way that the level of effort of the three main activities
             involved in this strategy could change over time.

             [Figure 6: relative emphasis of Screening/Prioritization, Risk Assessment, and Institutional
             Transition plotted against Year, 2010 to 2025.]

             Figure 6. Relative (%) Emphasis of the Three Main Components of this Strategic Plan over its
             Expected 20-year Duration.


                                 APPENDIX: OTHER RELATED ACTIVITIES

             Other US Government Activities

             The National Toxicology Program (NTP) at the National Institute of Environmental Health
             Sciences (NIEHS) coordinates toxicological testing programs within the Department of Health
             and Human Services18. Similar to EPA, NTP is developing the use of computational models, in
             vitro assays, and non-mammalian in vivo assays targeting key pathways, molecular events, or
             processes linked to disease or injury for incorporation into a transformed chemical testing
             paradigm.

             The NIH Chemical Genomics Center (NCGC) of the National Human Genome Research
             Institute conducts ultra high throughput screening assays as part of the NIH's Molecular
             Libraries Initiative within the NIH Roadmap.

             A Memorandum of Understanding was recently signed by EPA, the NTP, and the NCGC to
             collaborate on generating a comprehensive map of the biological pathways affected by
             environmental chemical exposures and use this map to predict how potential chemical toxicants
             will affect various types of cells, tissues, and individuals. The hope is to refine many of the
             toxicity tests performed on animals and eventually supplant them with in vitro testing and
             computational prediction (Collins et al., 2008).

             In 2004 the Food and Drug Administration (FDA) produced a report20 addressing the need to
             translate the rapid advances in basic biomedical sciences into new preventions, treatments and
             cures. FDA holds large databases of human, animal, and in vitro data for screening drug
             candidates for toxicity that may also be useful for screening environmental chemicals. The
             FDA's National Center for Toxicological Research (NCTR) aims to develop methods for the
             analysis and integration of genomic, transcriptomic, proteomic, and metabolomic data to
             elucidate mechanisms of toxicity21. NCTR has coordinated the Microarray Quality Control
             (MAQC) project, with numerous partners including EPA (Shi et al., 2006). In addition, NCTR
             has provided its ArrayTrack database to EPA for storage of genomics data for research and
             possible regulatory use.

             The Interagency Coordinating Committee on the Validation of Alternative Methods
             (ICCVAM) was established by law in 2000 to promote development, validation, and regulatory
             acceptance of alternative safety testing methods. ICCVAM is made up of representatives from 15
             federal agencies that generate or use toxicological data. Emphasis is on alternative methods that
             will reduce, refine, and/or replace the use of animals in testing while maintaining and promoting
             scientific quality and the protection of human health and the environment22. The NTP
             Interagency Center for the Evaluation of Alternative Toxicological Methods (NICEATM)
             administers and provides scientific support for ICCVAM. ICCVAM/NICEATM evaluates test
             method submissions and nominations, prepares technical review documents, and organizes
             18 http  ntp mehs mh gi^ nip main  p'Ot N211 uc mitidU\ewiiUuilpath\shitepdper.html
             21 http://www.fda.gov/nctr/overview/mission.htm
             22 http://iccvam.niehs.nih.gov/about/ni_QA.htm
             scientific workshops and peer review meetings. For example, ICCVAM/NICEATM recently
             released a report23 that describes two in vitro cytotoxicity tests that can be used for estimating
             starting doses for acute oral toxicity tests, thereby reducing the number of animals used.

             Related Activities by Foreign Governments

             A new European Union Regulation on Registration, Evaluation, Authorization, and
             Restriction of Chemicals (REACH) went into effect June 1, 2007. The main goals of REACH
             are (1) to improve the protection of human health and the environment from risks associated with
             chemicals in commerce and (2) to promote alternative test methods. REACH requires
             manufacturers and importers to demonstrate they have appropriately identified and managed the
             risks of substances produced or imported in quantities of one ton or more per year per company.
             The new European Chemicals Agency (ECHA)24 will manage the system databases, coordinate
             evaluation of chemicals, and run a public database of hazard information25.

             The European Centre for the Validation of Alternative Methods (ECVAM)26 coordinates
             the validation of alternative test methods in the European Union. ECVAM develops, maintains,
             and manages a database on alternative procedures and promotes the development, validation, and
             international recognition of alternative test methods.

             The Japanese Center for the Validation of Alternative Methods (JaCVAM) is part of the
             Japanese National Institute of Health Sciences. JaCVAM has conducted validation studies for
             alternative test methods and participates in international validation efforts27.

             The Korean Center for the Validation of Alternative Methods (KoCVAM) is a branch of
             NITR, the National Institute of Toxicological Research. NITR is collaborating with the Korean
             Society for Alternatives to Animal Experiments (KSAAE) to refine methods in acute oral,
             reproductive/developmental, genetic, and endocrine toxicity testing28.

             The Organization for Economic Co-Operation and Development (OECD) represents 30
             countries in the Americas (including the United States), Europe, and Asia. The OECD
             "Guidelines for the Testing of Chemicals"29 provides a collection of internationally harmonized
             testing methods for a number of toxicological endpoints using in vivo, in vitro, and even
             alternative approaches. Test guidelines can be updated to reflect scientific advances and the
             state of the science if member countries agree to do so. A few OECD workgroups and efforts
             address issues relevant to this EPA strategy, e.g., the OECD QSAR Toolbox30 and the joint
             OECD/IPCS (International Programme on Chemical Safety) Toxicogenomics Working Group,
             which has developed a proposal for a Molecular Screening Project, modeled after EPA's
             ToxCast  program.
             23 http://iccvam.niehs.nih.gov/methods/acutetox/inv_nru_tmer.htm
             24 http://echa.europa.eu/reach_en.asp
             25 http://ec.europa.eu/environment/chemicals/reach/reach_intro.htm
             26 http://ecvam.jrc.it
             27 http://www.nihs.go.jp/english/index.html
             m http w\\wMA mi ac ip ivue P \Kk pdl
             "'http tiUinid souricoci Jorg\l ^vdHHltl 2? nw  f ;'rpsv/pcntxIitaL;pl5 about.hlm?jnlissn 1607310.x
             30 http www oeLd or

             Academia

             Numerous U.S. academic researchers and centers are funded by NIH or EPA's National Center
             for Environmental Research to develop assays and analysis methods that might be helpful to the
             goals of this EPA research strategy. This includes two Bioinformatics Centers funded by EPA in
             2006.

             The European Commission funds several large academic, government, and industry consortia
             that are conducting research that could lead to effective in vitro toxicity tests. The CASCADE
             Network of Excellence31 studies human health effects of chemical residues and contaminants in
             food and drinking water, designing assays to elucidate estrogen, testosterone, and thyroid
             hormone pathways for the development of mechanism- and disease-based test methods. The aim
             of the carcinoGENOMICS32 project is to develop in vitro methods for assessing the
             carcinogenic potential of compounds. ReProTect33 is optimizing an integrated set of
             reproductive/developmental tests for a detailed understanding of gametogenesis, steroidogenesis,
             and embryogenesis that can support regulatory decisions.

             Industry

             The European Partnership for Alternative Approaches to Animal Testing (EPAA)34 is a
             joint initiative from the European Commission and a number of companies and trade federations.
             Its purpose is to promote the development of alternative approaches to safety testing. The EPAA
             focuses on mapping existing research; developing new alternative approaches and strategies; and
             promoting communication, education, validation, and acceptance of alternative approaches.

             Non-Governmental Organizations (NGOs)

             The Comparative Toxicogenomics Database35 (CTD) elucidates molecular mechanisms by
             which environmental chemicals affect human disease. CTD includes manually curated data
             describing cross-species chemical-gene/protein interactions and chemical-disease and gene-disease
             relationships to illuminate molecular mechanisms underlying variable susceptibility and
             environmentally influenced diseases. These data will also provide insights into complex
             chemical-gene and protein interaction networks.

             The Johns Hopkins Center for Alternatives to Animal Testing36 supports the creation,
             development, validation, and use of alternatives to animals in research, product safety testing,
             and education. Similarly, AltTox.org37 provides information on non-animal methods for toxicity
             testing including a table38 that summarizes the alternative testing methods by endpoint that have
             been approved or endorsed internationally by at least one regulatory agency.
             31 http://www.cascadenet.org/
             32 http://www.carcinogenomics.eu
             33 http://www.reprotect.eu/
             34 http://ec.europa.eu/enterprise/epaa/index_en.htm
             35 http://ctd.mdibl.org/
             36 http://altweb.jhsph.edu/index.htm
             37 http://www.alttox.org/about/
             38 http://www.alttox.org/ttrc/validation-ra/validated-ra-methods.html

-------
CTRP BOSC Poster Abstracts
   September 29-30, 2009

         Session I
         Session II

-------

Session I

Poster  Title                                                       Presenter

1       ACToR: Aggregated Computational Toxicology Resource         R Judson
2       DSSTox and Chemical Information Technologies in Support of  A Richard
        Predictive Toxicology
3       Characteristics and Applications of the ToxRefDB In Vivo    M Martin
        Datasets from Chronic, Reproductive and Developmental
        Assays
4       Literature Mining and Knowledge Discovery Tools for         A Singh
        Virtual Tissues
5       ExpoCast: Exposure Science for Prioritization and Toxicity  E Cohen Hubal
        Testing
6       Computational Approaches and Tools for Exposure             C Tan
        Prioritization and Biomonitoring Data Interpretation
7       Advanced Exposure Metrics for Chemical Risk Analysis        T Collette
8       The U.S. "Tox21 Community" and the Future of Toxicology     R Tice (NTP/NIEHS)
9       NCER-Supported STAR Research Centers Advance the Field of   D Segal (NCER)
        Computational Toxicology
10      UNC The Carolina Environmental Bioinformatics Center        F Wright (UNC)
11      UMDNJ Environmental Bioinformatics & Computational          B Welsh (UMDNJ)
        Toxicology Center (ebCTC): Research Collaborations in
        Multi-Scale Modeling of Environmental Toxicants
12      UNC Systems Biology - Carolina Center for Computational     I Rusyn (UNC)
        Toxicology
13      UTH - The Texas-Indiana Virtual STAR Center;                Eva M. Bondesson (UTH)
        Data-Generating in vitro and in silico Models of
        Developmental Toxicity in Embryonic Stem Cells and
        Zebrafish
14      Mechanistic Indicators of Childhood Asthma (MICA): A        J Gallagher (NHEERL)
        Systems Biology Approach for the Integration of
        Multifactorial Environmental Health Data
15      Linkage of Exposure and Effects Using Genomics,             T Collette for G. Ankley (NERL)
        Proteomics, and Metabolomics in Small Fish Models
16      Development of a Searchable Metabolite Database and         J Jones (Athens)
        Simulator of Xenobiotic Metabolism
17      Risk Assessment of the Inflammogenic and Mutagenic          J Samet (NHEERL)
        Effects of Diesel Exhaust Particulates: A Systems
        Biology Approach
18      Development of Microbial Metagenomic Markers for            Jorge W. Santo Domingo (NERL)
        Environmental Monitoring
19      Develop a Systems Approach to Characterizing and            S Degitz (NHEERL)
        Predicting Thyroid Toxicity Using an Amphibian Model

-------
ACToR: Aggregated Computational Toxicology Resource

Authors: Richard Judson1, Tommy Cathey2, Thomas Transue2, Ann Richard1, Doris
Smith3, James Vail3,  Kaitlin Daniel1
1 National Center for Computational Toxicology, USEPA, RTP, NC 27711
2 Lockheed Martin, Contractor to the USEPA, RTP, NC 27711
3 The National Caucus and Center on Black Aged, Inc., Senior Environmental
Employment Program, Grantee to the USEPA, RTP, NC 27711

The EPA Aggregated Computational Toxicology Resource (ACToR) is a set of
databases compiling information on chemicals in the environment from a large number
of public and in-house EPA sources. ACToR has 3 main goals: (1) To serve as a
repository of public toxicology information on chemicals of interest to the EPA, and in
particular to be a central source for the testing data on all chemicals regulated by all
EPA programs; (2) To be a source of in vivo training data sets for building in vitro to in
vivo computational models; (3) To serve as a central source of chemical structure and
identity information for the ToxCast™ and Tox21 programs. There are 4 main
databases, all linked through a common set of chemical information and a common
structure linking chemicals to assay data: the public ACToR system (available at
http://actor.epa.gov); the ToxMiner database holding ToxCast and Tox21 data, along
with results from statistical analyses on these data; the Tox21 chemical repository, which
is managing the ordering and sample tracking process for the larger Tox21 project; and
the public version of ToxRefDB. The public ACToR system contains information on
~500K compounds with toxicology, exposure and chemical property information from
>400 public sources. The web site is visited by ~1,000 unique users per month and
generates ~1,000 page requests per day on average. The databases are built on open
source technology, which has allowed us to export them to a number of collaborating
organizations.
This work was reviewed by EPA and approved for publication but does not  necessarily
reflect official Agency policy.
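
The idea of "a common structure linking chemicals to assay data" can be illustrated with a small relational sketch. The example below is only an assumption about what such a layout might look like; the table names, column names, and rows are hypothetical placeholders and are not taken from the actual ACToR schema or data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical three-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE chemical (
            chemical_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE source (
            source_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL
        );
        -- Every assay record points at one chemical and one data source.
        CREATE TABLE assay_result (
            chemical_id INTEGER NOT NULL REFERENCES chemical(chemical_id),
            source_id   INTEGER NOT NULL REFERENCES source(source_id),
            assay_name  TEXT NOT NULL,
            result      TEXT NOT NULL
        );
    """)
    # Placeholder rows, invented purely for illustration.
    conn.executemany("INSERT INTO chemical VALUES (?, ?)",
                     [(1, "chemical A"), (2, "chemical B")])
    conn.executemany("INSERT INTO source VALUES (?, ?)",
                     [(1, "source X"), (2, "source Y")])
    conn.executemany("INSERT INTO assay_result VALUES (?, ?, ?, ?)", [
        (1, 1, "assay 1", "active"),
        (1, 2, "assay 2", "inactive"),
        (2, 1, "assay 1", "inactive"),
    ])
    return conn


def results_for(conn, chemical_name):
    """Return (source, assay, result) rows for one chemical across all sources."""
    return conn.execute("""
        SELECT s.name, a.assay_name, a.result
        FROM assay_result AS a
        JOIN chemical AS c ON c.chemical_id = a.chemical_id
        JOIN source   AS s ON s.source_id   = a.source_id
        WHERE c.name = ?
        ORDER BY s.name, a.assay_name
    """, (chemical_name,)).fetchall()


conn = build_demo_db()
print(results_for(conn, "chemical A"))
```

Because every record is keyed on the same chemical identifier, data aggregated from different sources can be retrieved with a single join.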

-------
DSSTox and Chemical Information Technologies in Support of Predictive
Toxicology

Ann M. Richard, National Center for Computational Toxicology,  USEPA, RTP, NC

The EPA NCCT Distributed Structure-Searchable Toxicity (DSSTox) Database project
initially focused on the curation and publication of high-quality, standardized, chemical
structure-annotated toxicity databases for use in structure-activity relationship (SAR)
modeling.  In recent years, the project has expanded to include: creation of DSSTox
files for high-interest EPA chemical inventories; strengthening structure-based linkages
among public resources; tailoring chemical and bioassay DSSTox content for
incorporation into NIH's PubChem; creating local structure-browsing capabilities of
DSSTox content and  inventories; and expanding comparability of and linkages to gene
expression data. Within the NCCT, the DSSTox project framework is applying strict
quality standards for chemical information, pertaining to both generic (ACToR,
ToxRefDB) and actual test substances (ToxCast™, Tox21).  Within these projects, we
are working to expand comparability and linkages of summarized toxicity data in the
context of a standardized cheminformatics environment.  Future research will build on
these cheminformatics data foundations and enriched data resources to develop data
mining strategies to explore new and flexible ways to relate chemical structure to
biological endpoints (e.g., reactivity groupings, biofunctional or toxicity-informed
similarity, chemical feature space), and new representations of biological endpoints in
relation to structure-based modeling (e.g., HTS clusters, bioassay profiles, summarized
or grouped effects, qualitative active and inactive classes).  Incorporating traditional
SAR concepts into this new HTS data-rich world poses conceptual and practical
challenges, but also holds great promise for improving toxicity prediction capabilities.
This work was reviewed by EPA and approved for publication but does not necessarily
reflect official Agency policy.
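
One common way to express structure-based similarity between chemicals is the Tanimoto (Jaccard) coefficient computed over sets of structural features. The abstract does not name a specific similarity measure, so the sketch below is an illustrative assumption; the feature labels and the small library are hypothetical.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) similarity between two collections of features."""
    a, b = set(features_a), set(features_b)
    union = a | b
    if not union:
        return 0.0  # two empty feature sets: treat as no similarity
    return len(a & b) / len(union)


def most_similar(query_features, library):
    """Return the library entry whose feature set is closest to the query."""
    return max(library, key=lambda name: tanimoto(query_features, library[name]))


# Hypothetical feature sets, for illustration only.
library = {
    "compound 1": {"aromatic_ring", "hydroxyl", "chloro"},
    "compound 2": {"ester", "alkyl_chain"},
}
query = {"aromatic_ring", "hydroxyl", "nitro"}

print(tanimoto(query, library["compound 1"]))
print(most_similar(query, library))
```

In practice the feature sets would come from a cheminformatics toolkit rather than hand-written labels, and the same function could be applied to bioassay profiles to obtain a toxicity-informed similarity.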

-------
Characteristics and Applications of the ToxRefDB In Vivo Datasets from
Chronic, Reproductive and Developmental Assays

Matthew Martin, NCCT/ORD, USEPA, Research Triangle Park, NC, USA.

ToxRefDB was developed to store data from in vivo animal toxicity studies. The
initial focus was populating ToxRefDB with pesticide registration toxicity data that
have historically been stored as hard-copy and scanned documents by the Office of
Pesticide Programs. A significant portion of these data have now been processed
into ToxRefDB in a standardized and structured format. ToxRefDB currently includes
chronic, cancer, sub-chronic, developmental, and reproductive studies on over 400
chemicals, many of which are pesticide active ingredients. These data are now
computable within ToxRefDB, and are serving as reference toxicity data for the
development of ToxCast™ predictive signatures as well as for retrospective
analyses assessing past performance of guideline toxicity studies and informing
potential changes to current guidelines. The three primary datasets currently being
used for predictive modeling have been published and include chronic, reproductive
and developmental endpoints. The rat and mouse chronic data primarily focus on
pathological endpoints related to progression and formation of tumors. The
reproductive data are from rat multi-generation studies and focus on reproductive
performance measures, reproductive organ pathologies, and offspring survival
decrements. The developmental endpoints are from rat and rabbit prenatal studies
for which detailed anatomical information is collected on observed malformations,
which were subsequently mapped to the developing system. Thus ToxRefDB
provides high quality, comparable data for over 200 of the 309 ToxCast™
chemicals from chronic, reproductive and developmental study types.

This work was reviewed by EPA and approved for publication but does not
necessarily reflect official Agency policy.

-------
Literature Mining and Knowledge Discovery Tools for Virtual Tissues

Singh AV1, Knudsen TB2 and Shah I2
1 Lockheed Martin, Contractor to the USEPA, RTP, NC
2 National Center for Computational Toxicology, USEPA, RTP, NC

Virtual Tissues (VTs) are in silico models that simulate the cellular fabric of tissues to
analyze complex relationships and predict multicellular behaviors in specific biological
systems such as the mature liver (v-Liver™) or developing embryo (v-Embryo™). VT
models require input of biological knowledge about the systems under investigation. We
are using VTs to model experimental  data, such as ToxCast™ in vitro assays, with
information about molecular pathways, cellular networks and clinical phenotypes in target
organ systems.  Knowledgebase development requires a flexible platform to extract and
organize relevant facts from the scientific literature and other sources of information. The
knowledge discovery workflow starts with information retrieval  (IR) by user-defined input
on single or multiple keywords to retrieve relevant PubMed abstracts, followed by
information extraction (IE) and relationship mapping (RM). Currently, we use the publicly
available '@Note'1 tool for highly customizable named entity recognition (NER). A
vocabulary of terms was built to describe pathologically relevant concepts using publicly
available ontologies (www.OBOfoundry.org/) including genes, pathways, anatomy, clinical
outcomes, and chemicals. The results from @Note are stored in a relational database for
statistical analyses to summarize relationships and map them to broader biological
concepts. The text-mining (TM) workflow is being implemented as a modular tool that
uses open-source libraries.  We are using this workflow to extract relevant facts about
hepatocarcinogenesis and embryo dysmorphogenesis. This poster will provide specific
examples of predicted associations that were data-mined from ToxCast_320 chemicals
and 76 ToxRefDB endpoints. Biomedical literature mining aims to identify and extract
plausible patterns for explicit (IE) and implicit (TM) concepts that are previously known
and unknown, respectively, and that can be used to better understand the inferred
associations in predictive modeling. [This work has been reviewed by EPA and cleared for
presentation, but does not reflect official Agency policy].

1 http://sysbio.di.uminho.pt/anote/wiki/index.php/Main_Page
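
The extraction and relationship-mapping steps of the workflow described above can be sketched as dictionary-based entity tagging followed by co-occurrence counting. This is a minimal illustration under stated assumptions and does not use or reproduce the @Note tool; the vocabulary, the sample sentences, and the function names are all hypothetical.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical controlled vocabulary; a real one would be drawn from ontologies.
VOCABULARY = {
    "chemical": {"chemical a", "chemical b"},
    "outcome": {"liver tumor", "cleft palate"},
}


def tag_entities(text, vocabulary):
    """Information extraction: find vocabulary terms present in one abstract."""
    lowered = text.lower()
    found = {}
    for category, terms in vocabulary.items():
        found[category] = {
            term for term in terms
            if re.search(r"\b" + re.escape(term) + r"\b", lowered)
        }
    return found


def cooccurrence(abstracts, vocabulary, left="chemical", right="outcome"):
    """Relationship mapping: count abstracts mentioning both members of a pair."""
    counts = Counter()
    for text in abstracts:
        entities = tag_entities(text, vocabulary)
        for pair in product(sorted(entities[left]), sorted(entities[right])):
            counts[pair] += 1
    return counts


# Made-up sentences standing in for retrieved PubMed abstracts.
abstracts = [
    "Chemical A exposure was associated with liver tumor formation in rodents.",
    "Liver tumor incidence rose after chemical A dosing; chemical B had no effect.",
    "Cleft palate was observed following chemical B exposure.",
]

for pair, n in sorted(cooccurrence(abstracts, VOCABULARY).items()):
    print(pair, n)
```

Simple co-occurrence counts every joint mention, including the negated one in the second sentence, which is one reason the counts would normally be passed to statistical analysis before being treated as candidate associations.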

-------
ExpoCast: Exposure Science for Prioritization and Toxicity Testing

Elaine A Cohen Hubal1 and Peter Egeghy2
1 National Center for Computational Toxicology, U.S. EPA, RTP, NC, USA
2 National Exposure Research Laboratory, U.S. EPA, RTP, NC, USA

The US EPA is completing the Phase I pilot for a chemical prioritization research
program, called ToxCast™.  Here EPA is developing methods for using
computational chemistry, high-throughput screening, and toxicogenomic
technologies to predict potential toxicity and prioritize limited testing resources.
There is a clear need for a parallel and collaborative effort across the exposure
and risk assessment community to provide the exposure science required for
interpretation of high-throughput in vitro toxicity data. A coherent research
program is required to advance exposure characterization to translate advances
and findings in computational toxicology for enhanced risk assessment, informed
decision making and improved public health.  US  EPA is initiating the
ExpoCast™ program  to ensure that the required exposure science and
computational tools are ready to address global needs for rapid characterization
of exposure potential  arising from  the manufacture and use of tens of thousands
of chemicals and to meet challenges posed by new  toxicity testing approaches.
ExpoCast™ will provide an overarching framework for science required to
characterize biologically-relevant exposure in support of the Agency
computational toxicology program. The overall goal of this program is to develop
novel approaches and tools for evaluating and classifying chemicals, based on
potential for biologically-relevant human exposure, to inform prioritization and
toxicity testing.  Broadly and long-term, the ExpoCast™ program will foster novel
exposure science research to (1) inform chemical prioritization, (2) understand
implications of system response to chemical perturbations at the individual and
population levels, and (3) link information on potential toxicity of environmental
contaminants to real-world health outcomes. This presentation will introduce
EPA's ExpoCast™ program.

This work has been reviewed and approved by the US EPA for publication but
does not necessarily reflect Agency policies.

-------
Computational Approaches and Tools for Exposure Prioritization and
Biomonitoring Data Interpretation

Cecilia Tan, Eric Weber, John Kenneke, Marsha Morgan, Daniel Chang, Michael-Rock
Goldsmith, Rogelio Tornero-Velez, Curt Dary
National Exposure Research Laboratory, Office of Research and Development, U.S.
Environmental Protection Agency, Research Triangle Park, NC 27711 USA

The ability to describe the source-environment-exposure-dose-response continuum is
essential for identifying exposures of greater concern to prioritize chemicals for toxicity
testing or risk assessment, as well as for interpreting biomarker data for better
assessment of exposure or risk. To link each element in this continuum, scientists at
the National Exposure Research Laboratory (NERL) and the National Center for
Computational Toxicology (NCCT) are collaborating to develop, evaluate, and apply
various computational approaches and tools including predictive environmental fate
modeling (i.e., Environmental Fate Simulator (EFS)), exposure modeling, physiologically
based pharmacokinetic (PBPK) modeling, pharmacodynamic modeling, and
biologically based dose-response modeling. Specifically, NERL currently directs
research activities in the following areas: (1) EFS; (2) screening level PBPK modeling;
and (3) interpretation and use of biomonitoring data. The components of EFS include:
a computational tool for calculating physical and chemical properties based on chemical
structure; a reaction pathway simulator for predicting transformation pathways and
products; linked databases populated with measured/calculated molecular descriptors
necessary for predicting physical transport and chemical reactivity; an expert system for
environmental characterization data needed to estimate partitioning behavior and
reactivity; and the EFS software that provides seamless linkage of disparate models
and databases. Screening level PBPK models are used to link external exposures to
tissue dosimetry for improved dose response assessment.  NERL scientists utilize a
combination of results from in vitro/in vivo studies and QSAR/computational chemistry
techniques to estimate chemical-specific parameters required for PBPK models.  This
knowledge and expertise are especially important for chiral chemicals, which exist as
mixtures of stereoisomers having different physical and/or biological properties but are
frequently treated in toxicity testing and risk assessment as single chemicals.  For
interpreting biomonitoring data, NERL is developing a framework to use the same
computational approaches (e.g., PBPK modeling) to assess the quantitative
relationships between biomarkers and human exposures. For example,  an exposure
study is underway to estimate non-occupational exposure to pyrethroids based on
urinary biomarkers.  This framework will identify the critical data gaps and uncertainties
in estimating human exposures and will help in designing future exposure and
epidemiological studies.
This work was reviewed by EPA and approved for publication but does not necessarily
reflect official Agency policy.
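
The link between an external exposure and an internal dose can be illustrated with a deliberately simplified one-compartment kinetic model. This is not a PBPK model and not NERL's method; it is a minimal sketch assuming constant-rate intake, first-order elimination, and a single well-mixed volume, with hypothetical parameter values chosen only to show the arithmetic.

```python
import math


def steady_state_concentration(intake_mg_per_h, volume_l, k_elim_per_h):
    """Steady-state concentration (mg/L) for constant intake: C_ss = R / (k * V)."""
    return intake_mg_per_h / (k_elim_per_h * volume_l)


def concentration(t_hours, intake_mg_per_h, volume_l, k_elim_per_h):
    """Concentration at time t, starting from zero:
    C(t) = C_ss * (1 - exp(-k * t)), the solution of dC/dt = R/V - k*C."""
    c_ss = steady_state_concentration(intake_mg_per_h, volume_l, k_elim_per_h)
    return c_ss * (1.0 - math.exp(-k_elim_per_h * t_hours))


# Hypothetical parameters, for illustration only.
R = 1.0    # intake rate, mg/h
V = 40.0   # volume of distribution, L
K = 0.1    # first-order elimination rate constant, 1/h

half_life = math.log(2) / K
print(steady_state_concentration(R, V, K))
print(concentration(half_life, R, V, K))
```

A PBPK model replaces the single compartment with several tissue compartments connected by blood flow, but the same mass-balance reasoning applies to each compartment. Inverting such a relationship, from a measured biomarker level back to an intake rate, is the basic idea behind exposure reconstruction from biomonitoring data.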

-------
Advanced Exposure Metrics for Chemical Risk Analysis

Timothy W. Collette, Quincy Teng, Drew Ekman, Jim Lazorchak, David Lattier,
Michael-Rock Goldsmith, Joachim Pleil, National Exposure Research Laboratory,
USEPA

Direct measurement of human exposure to environmental contaminants in real
time (when the exposure is actually occurring) is rare and difficult to obtain. This
frustrates both exposure assessments and investigations into the linkage
between chemical exposure and human disease.  However, it is feasible to
obtain information on the levels  of environmental contaminants (and their
metabolites and adducts) in the  biofluids of individuals that may have been
exposed. Furthermore,  it is feasible to obtain information on the occurrence of
specific diseases and other adverse conditions in various human demographics.
The Agency's exposure and risk assessments could be greatly improved if these
chemical biomarkers could be used to both reconstruct previous exposure
scenarios, and to predict the future likelihood of adverse effects. While progress
has, indeed, been made along these paths, biomarker methods based solely on
xenobiotics and their metabolites/adducts  are inherently limited.  This new
research program (still in the planning stages) is based on the belief that systems
biology approaches and 'omic-based biomarkers, when used in conjunction with
traditional biomarkers, offer great promise for both exposure reconstruction and for
elucidating the linkages between exposures and adverse outcomes.

A significant amount of research has already been devoted to the use of systems
biology approaches and 'omic techniques  (transcriptomics, proteomics, and
metabolomics) to screen chemicals for hazardous effects.  However, changes in
transcripts, proteins, and endogenous metabolites may, in some cases, be more
certain indicators of chemical exposures than of apical chemical effects.
Nonetheless, these powerful new techniques have rarely been applied as
biomarkers of exposure.  In comparison to (or in combination with) conventional
biomarkers, 'omic markers  of  exposure offer considerable promise for exposure
assessment.  Note that 'omic  markers are a unique pattern of a large  number
and wide variety of transcript,  protein, or endogenous metabolite changes.
These signatures may be more  informative and  more chemical-specific than a
conventional biomarker. Also, taking  advantage of the earlier research in  'omic
markers for effects, these markers of  exposure may, in some cases, be able to
identify exposure to a specific mode-of-action-active chemical.  Indeed, 'omic
markers can sometimes serve as a linkage across the source-to-outcome
continuum, functioning concomitantly  as markers of exposure, dose
characterization, and effects.

This abstract has been reviewed in accordance  with the U. S. Environmental
Protection Agency's peer and administrative review policies and approved for
presentation and publication.

-------
The U.S. "Tox21 Community" and the Future of Toxicology

Raymond Tice, Ph.D.1, Robert Kavlock, Ph.D.2, and Christopher Austin, M.D.3

1 Chief, Biomolecular Screening Branch, National Toxicology Program, National Institute
of Environmental Health Sciences, RTP, NC 27709
2 National Center for Computational Toxicology, USEPA, RTP, NC 27711
3 Director, NIH Chemical Genomics Center, National Human Genome Research Institute,
National Institutes of Health, Bethesda, MD 20892-3370

In early 2008, the National Institute of Environmental Health Sciences/National
Toxicology Program, the NIH Chemical Genomics Center, and the Environmental
Protection Agency's National Center for Computational Toxicology entered into a
Memorandum of Understanding to collaborate on the research, development, validation,
and translation of new and innovative test methods that characterize key steps in
toxicity pathways.  A central component is the exploration of high throughput screening
assays and tests using phylogenetically lower animal species (e.g., fish, worms), as well
as high throughput whole genome analytical methods, to evaluate  mechanisms of
toxicity. The goals of the "Tox21 Community" are to investigate the use of these new
tools to (1) prioritize substances for further in-depth toxicological evaluation, (2) identify
mechanisms of action for further investigation, and (3) develop predictive models for in
vivo biological response. Success is expected to result in test methods for toxicity
testing that are more mechanistically based and economically efficient; as a
consequence, a reduction or replacement of animals in regulatory testing is anticipated
to occur in parallel with an increased ability to evaluate the large numbers of chemicals
that currently lack adequate toxicological evaluation.  The initial focus of this
collaboration has been on identifying toxicity-related pathways (and assays for those
pathways), establishing a Tox21 library of ~10,000 compounds, and developing the
databases and bioinformatic tools needed to mine the resulting data. The coordinated
approaches being taken to achieve our goals, the lessons learned, and expectations for
the future will be presented.  This work was reviewed by EPA and approved for
publication but does not necessarily reflect official Agency policy.

-------
NCER-Supported STAR Research Centers Advance the Field of Computational
Toxicology

Deborah Segal, National Center for Environmental Research (NCER), USEPA,
Washington, DC

Advances in genomics and computer methods have positioned computational
toxicology at the forefront in the development of predictive models of exposures to
pollutants and their subsequent health effects. These models will improve the scientific
foundation for conducting risk assessments.  In an effort to advance the field, ORD's
National Center for Environmental Research (NCER), through the Science To Achieve
Results (STAR) program, has funded research centers that are integrating modern
computing and information technology with molecular biology and chemistry. The goals
of the STAR computational toxicology program are the following:

   •  Improve linkages across the source-to-outcome continuum
   •  Develop approaches for prioritizing chemicals for further screening and testing
   •  Produce better methods and predictive models for quantitative risk assessment

NCER has issued three Requests for Application (RFAs) for STAR computational
toxicology research centers. As a result of a 2005 RFA "Environmental Bioinformatics
Research Centers," EPA funded two centers that are developing statistical and
bioinformatics tools and approaches for predicting the toxicity of chemical exposures. A
2007 RFA, "Computational Toxicology Centers:  Development of Predictive
Environmental and Biomedical Computer-Based Simulations and Models," led to an
additional center. It is applying high-performance computing techniques and resources
to in silico multi-scale modeling applications at the cellular, organ, and system-wide
level to address the environmental problems and research needs facing the U.S. In July
of 2009, the most recent center grant was awarded  following the issuance of the RFA,
"STAR Computational Toxicology Research Centers: In vitro and in silico Models of
Developmental Toxicity Pathways." This center will  bridge the interface of in vitro data
generation and in silico model development to answer critical biological questions
related to toxicity pathways important to human  development.  The first three centers
are operating as cooperative agreements with ORD. This has enabled productive
collaborations between center PIs and EPA scientists and has resulted in numerous
jointly published journal articles in the peer reviewed literature.
This work was reviewed by EPA and approved for publication but does not necessarily
reflect official Agency policy.

-------
The Carolina Environmental Bioinformatics Center

Fred A. Wright, Alexander Tropsha, Leonard McMillan, Ivan Rusyn

The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599

The Carolina Environmental Bioinformatics Center brings together multiple investigators
and disciplines, combining expertise in biostatistics, computational biology, chem-
informatics and computer science to advance the field of Computational Toxicology.
The Center is developing novel analytic and computational methods, creating efficient
user-friendly tools to disseminate to the wider community, and applying the
computational methods to data relevant to chemical toxicology. Effort is divided into
three Research Projects, with an emphasis on collaboration within the Center and with
the EPA. Project 1: Biostatistics in Computational  Biology (PI Wright) provides
biostatistical support to the Center, performing analysis and developing new methods in
collaboration with EPA personnel and the computational  toxicology community. This
Project's investigators have engaged in numerous collaborations with EPA in diverse
areas, ranging from pathway dose-response methods development to analysis of
-omics response to pesticide exposures. The Project has also contributed directly to the
EPA NCCT's involvement with the Microarray Quality Control Consortium II. In addition,
Project 1 is actively working on prediction methods for EPA ToxCast™ data, bringing
together Center investigators and industry partners in machine learning methods for
toxicity prediction. Project 2: Chem-informatics (PI Tropsha) coordinates the compilation
and mining of data from relevant external databases and performs analysis and
methods development for building statistically significant  and externally predictive
Quantitative Structure-Activity Relationship models of chemical toxicology data.  In
addition, Project 2 is developing computational tools to perform these tasks, and has
longstanding collaborations with EPA investigators in DSSTox data  analysis.
Moreover, Project 2 is providing important chemical descriptor data to improve toxicity
prediction for ToxCast™ data analysis, and providing context for further prioritization in
chemical testing. Project 3: Computational Infrastructure for Systems Toxicology (PIs
Rusyn and McMillan) has created a framework for handling emerging -omics data on
genetic susceptibility in model organisms. In addition, Project 3 provides programming
expertise to create graphical tools that are used by partners within the Center and by
collaborators at the EPA. The synergy of the interactions within the Center, as well as
with the EPA and toxicological community, is strengthening and advancing the field of
computational toxicology through direct partnerships and the dissemination of tools
used by both bioinformatics and bench scientists.
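
As an illustration of what "externally predictive" means for a QSAR model, the following minimal Python sketch holds out an external set of chemicals, fits a model on the remainder, and scores it only on the held-out chemicals. It is a sketch under stated assumptions, not the Center's method: the single-descriptor linear model, the function names (fit_line, r_squared, external_validation) and the 25% hold-out fraction are all hypothetical choices made for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x for a single descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(observed, predicted):
    """Coefficient of determination of predictions against observations."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def external_validation(descriptor, activity, holdout_fraction=0.25, seed=0):
    """Fit on a training set and report R^2 on chemicals never used in fitting.

    holdout_fraction and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    indices = list(range(len(descriptor)))
    rng.shuffle(indices)
    n_external = max(2, int(len(indices) * holdout_fraction))
    external, training = indices[:n_external], indices[n_external:]
    intercept, slope = fit_line(
        [descriptor[i] for i in training], [activity[i] for i in training]
    )
    predictions = [intercept + slope * descriptor[i] for i in external]
    return r_squared([activity[i] for i in external], predictions)
```

The design point is that the external chemicals play no part in fitting, so the reported score reflects prediction rather than goodness of fit.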

-------
Environmental Bioinformatics & Computational Toxicology Center (ebCTC):
Research Collaborations in Multiscale Modeling of the Effects of Environmental
Toxicants

William J. Welsh and Panos G. Georgopoulos, University of Medicine and Dentistry -
R.W. Johnson Medical School

The USEPA-funded Environmental Bioinformatics and Computational Toxicology Center
(ebCTC) is a research consortium of the  University of Medicine and Dentistry - R.W.
Johnson Medical School, Princeton University, Rutgers  University, and the Center for
Toxicoinformatics of the US  Food and Drug Administration.  ebCTC augments USEPA's
research at its National  Center for  Computational  Toxicology  (NCCT)  through  the
development  and   application  of  novel computational  methods   supporting  the
mechanistic assessment of health risks associated with environmental factors.

ebCTC  brings  together  a  multidisciplinary  team  of  computational scientists and
engineers, with  backgrounds in cheminformatics, enviroinformatics, bioinformatics, and
mechanistic process modeling,  to address, collaboratively and simultaneously,  multiple
elements of the "environmental health sequence" from "source" (e.g. the release or
formation of a "stressor", such as a  chemical, radiological,  or  biological  agent  in an
environmental  medium) to  "outcome" (e.g. the  development of an  environmentally
caused disease). This effort is pursued through the study and elucidation of the cascade
of individual events and processes involved in the above source-to-outcome continuum
within a consistent and integrative multiscale analysis framework. The ebCTC approach
considers human  health  state as the result of coupled dynamic systems spanning
multiple scales  of "biological space"  (i.e.  involving processes and interactions at the
scales  of  molecules,  cells,  tissues, organs,  organisms,  and populations).   This
integrative  analysis framework embraces a general "reverse engineering" approach,
that incorporates multiple diagnostic and  prognostic modeling methodologies, to  study
and reveal  the hierarchical  structures and functional dynamics of multiscale biological
systems in relation to their perturbations by behavioral and environmental influences.

The methods and computational tools that  are being developed through the above effort
are extensively evaluated  and  refined  through collaborative  applications involving
ebCTC scientists and  colleagues from the three consortium  universities,  USFDA, and
USEPA.  Examples  of  ongoing  (or completed)  projects  that  involve  extensive
collaborations  of   ebCTC  with  USEPA  scientists  include: the  development and
application  of methods  for reconstructing population  exposures  to environmental
chemicals  from biomarker data;  the biologically-based  modeling  of  multimedia,
multipathway, and multiroute population exposures to arsenic and arsenic  compounds;
the development of biologically-based toxicokinetic and  toxicodynamic  models that
incorporate the effects of aging in physiological and biochemical  process dynamics; the
incorporation of toxicogenomic data in human health risk analyses with dibutyl phthalate
(DBP) as a case  study;  the pathway analysis of microarray data collected following
exposures to conazoles. ebCTC researchers are also interacting continuously and
extensively with USEPA scientists in numerous research activities, including the
development of methods and tools related to the Virtual Liver project and the
analysis of data that are becoming available through the ToxCast project.

-------
Carolina Center for Computational Toxicology

Ivan Rusyn, Shawn Gomez, Timothy Elston, Fred Wright, Alexander Tropsha

The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599

The Carolina Center for Computational Toxicology is engaged in a broad interdisciplinary
effort to devise novel tools, methods and knowledge to assist the regulatory agencies
and the greater environmental health sciences community in protecting the environment
and human health.  Using publicly available data, the Center applies knowledge and
expertise of the individual investigators and teams to develop complex predictive
modeling solutions  that range from mechanistic, interpretative data modeling to
discovery and decision support efforts. Furthermore, each of the Center's three
Research Projects is engaged in active collaboration with other projects within both the
Center and the EPA. Project 1: Biomedical modeling of chemical-perturbed networks
(PIs Gomez and Elston) is mapping chemical-perturbed networks and devising
modeling tools that can predict the pathobiology of the test compounds based on a
limited set of biological data. This group of investigators is actively collaborating with the
v-Liver® project and is examining the network biology relationships using ToxCast®
and ToxRefDB® data. Project 2: Toxico-genetic modeling (PIs Wright and Rusyn) is
building computational tools that will enable toxicologists to understand the role of
genetic diversity among individuals in responses to toxicants. This project is most
actively engaged in data analysis for the ToxCast®, ToxRefDB® and ACToR projects at
the EPA and is also conducting biological experiments in genetically defined human and
mouse cells in collaboration with the Tox21 partners (EPA, NIEHS/NTP and NCGC).
Project 3: Chem-informatics (PI Tropsha)  is engaged in unbiased discovery-driven
prediction of adverse chronic in vivo outcomes based on statistical modeling of known
relationships between  chemical structures, biological screening results, and the genetic
makeup of the organism. This project continues its historically tight interactions with
DSSTox, ACToR, ToxCast®, and ToxRefDB® teams within EPA through data analysis
and model development efforts. Collectively, the Center not only advances the field of
computational toxicology through the development of new methods, software, and applied
models but also advocates the application  of the knowledge into practice through
collaborative efforts with EPA and other governmental stakeholders. The emphasis of our
work is on the usability of the outputs by the risk assessment community and the
investigative toxicologists, thus facilitating the transition of the field of computational
toxicology from a hypothesis-driven toward a predictive science.

-------
NCER STAR GRANT ABSTRACT

EPA Grant Number: 83428901

Title: The Texas-Indiana Virtual STAR Center; Data-Generating  in vitro and in silico
Models of Developmental Toxicity in Embryonic Stem Cells and Zebrafish

Investigator(s):
1. Prof. Jan-Ake Gustafsson (Contact PI)      E-mail: jgustafsson@uh.edu
2. Prof. Richard H. Finnell             E-mail: rfinnell@ibt.tamhsc.edu
3. Prof. James A. Glazier              E-mail: glazier@indiana.edu
Institution(s):
1. University of Houston, Department of Biology and Biochemistry, Houston, TX 77204
2. The Texas A&M Institute for Genomic Medicine, Texas A&M University/Texas A&M
Health Science Center, Houston, TX 77030
3. Indiana University, Department of Physics, Bloomington, IN 47405-7003

EPA Project Officer: (leave blank)
Project Period:
Project start: November 1, 2009
Project end: October 31, 2012

Description
Objectives/Hypothesis:
As chemical production increases worldwide, there is increasing evidence of the
hazardous effects of chemicals on human health at today's exposure levels, which
implies that current chemical regulation is insufficient. Thus, a restructuring of the risk
assessment procedure  will  be  required to  protect  future generations. Given  the very
large  number  of man-made chemicals and the likely complexity of their various and
synergistic modes of action, emerging technologies will be required for the restructuring.
The main objective of the proposed multidisciplinary Texas Indiana Virtual STAR (TIVS)
Center is to contribute to a more reliable chemical risk assessment through the
development of high throughput in vitro and  in silico screening models of developmental
toxicity.  Specifically,  the  TIVS Center aims to generate in  vitro  models of murine
embryonic stem cells  and zebrafish for developmental toxicity.  The data produced from
these models  will be  further  exploited  to produce  predictive in silico  models for
developmental toxicity  on  processes  that  are relevant also for  human  embryonic
development.

Approach:
The project is divided into three Investigational Areas: zebrafish models, murine
embryonic stem cell models, and in silico simulations. The approaches are to:
   1.  Generate  developmental  models   suitable  for  high  throughput  screening.
      Zebrafish developmental models (transgenic GFP/EGFP/RFP models of crucial
      steps in development) and embryonic stem cell (ESC) differentiation  models
      (transgenic beta-geo models of crucial steps in differentiation) will be generated.
      Important morphology features  and signaling pathways during development will
      be  documented.  The impact of environmental  pollutants on development and
      differentiation will be assessed in the models. Finally, the models will be refined
      for high throughput screening and automation.
   2. Generate a computational model that faithfully recreates the major morphological
      features of normal wild-type zebrafish development (i.e., segmentation into
      somites, proper patterning of vascular and neural systems) and the differentiation
      to three primitive  layers  (endoderm, mesoderm  and  ectoderm)  in  mouse
      embryonic  stem cells. The data for simulations are produced from  developed
      high information content zebrafish and ESC models. Once a working model  of
      normal development has been generated, we will  carry out a directed series  of
      parameter  sweeps  to try to  create developmental  defects  in silico.  We will
      compare the results of computationally  created  defects with  experimentally-
      generated defects in zebrafish and embryonic stem cells. Best matches between
      the two datasets will suggest  hypotheses about possible mechanisms by which
      defects occur.
   3. Perform proof-of-concept experiments of the in vitro and in silico test platforms
      with a blind test of chemicals.

Techniques will be molecular biology techniques on zebrafish and ESC models, such as
cloning, imaging,  in vitro  differentiation  and  in vitro exposure studies,  and  in  silico
mathematical simulations.
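
The parameter-sweep step in approach 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Center's simulation: toy_development_model is a hypothetical stand-in for a real developmental model, its two parameters (growth_rate, signal_strength) and its output phenotype (somite count, vessel length) are invented for the example, and Euclidean distance is just one possible way to score a match between simulated and observed defects.

```python
import itertools
import math


def toy_development_model(growth_rate, signal_strength):
    """Hypothetical stand-in for a developmental simulation.

    Returns a phenotype tuple (somite_count, vessel_length).
    This is NOT a real biological model.
    """
    somite_count = 30.0 * growth_rate * signal_strength
    vessel_length = 10.0 * signal_strength
    return (somite_count, vessel_length)


def sweep(model, grid):
    """Run the model on every combination of parameter values in the grid."""
    names = sorted(grid)
    results = []
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((params, model(**params)))
    return results


def best_match(results, observed_phenotype):
    """Return the (params, phenotype) pair closest to the observed defect."""
    return min(results, key=lambda r: math.dist(r[1], observed_phenotype))
```

For example, sweeping growth_rate and signal_strength over [0.5, 1.0] and calling best_match with an experimentally observed phenotype returns the parameter setting whose simulated defect is nearest to it, which is the kind of match that would suggest a mechanistic hypothesis.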

Expected Results  (Outputs/Outcomes):
In collaboration with other initiatives taken in the field of chemical safety, our generated
results and models will contribute to large screening efforts to prioritize chemicals for
further risk assessment. Specifically, we will contribute:
   •  9 transgenic fish lines validated for toxicity screening
   •  16 embryonic stem cell models validated for toxicity screening
   •  High information content models on development  and differentiation to produce
      data for in silico simulations, within the project and elsewhere
   •  Computational models for developmental toxicology  of normal  development and
      of mechanisms by which chemical perturbations cause experimentally-observed
      developmental defects
   •  Information on the developmental toxicity of 39 compounds

All the data produced in this project will be released to public databases. The developed
models will be automated for high throughput screening.

Supplemental Keywords:
Risk  assessment, effects,  dose-response,  teratogen,  organism,  cellular,   infants,
chemicals, toxics,  aquatic ecosystem protection, pollution prevention, green chemistry,
public policy, environmental  chemistry,  biology,   physics, genetics,  mathematics,
modeling,  measurement methods.

-------
Mechanistic Indicators of Childhood Asthma (MICA): A Systems Biology
Approach for the Integration of Multifactorial Environmental Health Data

Jane Gallagher1, David Reif2, Edward Hudgens1, Ann Williams1, Mary Johnson1, Ron
Williams3, Haluk Ozkaynak3, Lucas Neas1, Brooke Heidenfelder1, Elaine Cohen Hubal2
and Stephen Edwards1

1 National Health and Environmental Effects Research Laboratory, USEPA, RTP, NC
2 National Center for Computational Toxicology, USEPA, RTP, NC
3 National Exposure Research Laboratory, USEPA, RTP, NC

Modern methods in molecular biology and advanced computational tools show promise
in elucidating complex interactions that occur between genes and environmental factors
in diseases such as asthma; however, appropriately designed studies are critical for
these methods to reach their full potential. We used a case-control study to investigate
whether genomic data (blood gene expression), viewed together with a spectrum of
exposure, effects and susceptibility markers (blood, urine and nail), provide a
mechanistic explanation for the increased susceptibility of asthmatics to ambient air
pollutants. We studied 205 non-asthmatic and asthmatic children (9-12 years of age)
who participated in a clinical study in Detroit, Michigan. The study combines a traditional
epidemiological design with an integrative approach to investigate the environmental
exposure of children to indoor-outdoor air. The study includes measurements of internal
dose (metals,  total and allergen specific IgE, PAH and VOC metabolites) and clinical
measures of health outcome (immunological, cardiovascular and respiratory). A parallel
study using allergen sensitized Brown Norway rats included expression data from both
blood and lung to establish relationships between transcriptional changes in the blood
and corresponding changes in the target tissue. Expected immunological indications of
asthma have been obtained. In addition, initial  results from our analyses point to the
complex nature of childhood health and risk factors linked to metabolic syndrome
(obesity, blood pressure and dyslipidemia). For example, 31% and 34% of the
asthmatic MICA subjects were overweight (BMI > 25) or hypertensive (age- and
gender-adjusted blood pressure values > 90th percentile), respectively.  This study represents a new
paradigm for epidemiological  studies in which traditional  health endpoints and
biomarkers are coupled with genetics and high content (Omics) data to expand the use
of mechanistic models for  human risk assessment.

This work was reviewed by EPA and approved for publication but does not necessarily
reflect official Agency policy.

-------
Linkage of Exposure and Effects Using Genomics, Proteomics, and
Metabolomics in Small Fish Models

Timothy W. Collette, Gerald Ankley (NERL), Dan Villeneuve, et al.2L); Drew
Ekman, David Bencic, Rong-Lin Wang, et al. (NERL); Rory Conolly (NCCT);
Nancy Denslow (University of Florida); Natalia Garcia-Reyero (Jackson State
University); Dalma Martinovic (University of St. Thomas); Ed Perkins (US Army
Corps of Engineers); Karen Watanabe (Oregon Health Sciences University)

Knowledge of possible toxic mechanisms/modes of action (MOA) of chemicals
can provide valuable insights as to appropriate methods for assessing exposure
and effects, thereby reducing uncertainties related to extrapolation across
species, endpoints and chemical structure. However, MOA-based testing
seldom has been used for assessing the ecological risk of chemicals. This is in
part because past regulatory mandates have focused more on adverse effects of
chemicals (reductions in survival, growth or reproduction) than the MOA through
which these effects are caused. A recent departure from this involves endocrine-
disrupting chemicals (EDCs), where there is a regulatory need for USEPA to
understand both MOA and adverse outcomes. To achieve this understanding,
advances in predictive approaches are required whereby mechanistic changes
caused by chemicals at  the molecular level can be translated into apical
responses meaningful to ecological risk assessment, such as effects on
development and reproduction, and ultimately population-level impacts.

This is a large, integrated project with collaborators from multiple ORD
laboratories/centers, other Federal agencies, and several universities (originally
through EPA's extramural grants program), that is employing two small fish
models, the fathead minnow (Pimephales promelas) and zebrafish (Danio rerio),
to develop better predictive tools for assessing the ecological risk of EDCs.  For
this work, a systems-based approach is being used to delineate toxicity pathways
for 12 model EDCs (muscimol, fipronil, haloperidol, apomorphine, ketoconazole,
trilostane, prochloraz, fadrozole, flutamide, vinclozolin, 17β-trenbolone and
17α-ethinylestradiol) with different known or hypothesized toxic MOA.  The studies
employ a combination of state-of-the-art genomic (transcriptomic, proteomic,
metabolomic), bioinformatic and modeling approaches,  in conjunction with whole
animal testing protocols, to develop response linkages across biological levels of
organization, ranging from molecular alterations to population impacts.

This abstract has been reviewed in accordance with the U. S. Environmental
Protection Agency's peer and administrative review policies and approved for
presentation and publication.

-------
Development of a Searchable Metabolite Database and Simulator of Xenobiotic
Metabolism

W. JACK JONES1; Pat Schmieder and Rick Kolanczyk2; Ovanes Mekenyan3
1 National Exposure Research Laboratory, Ecosystems Research Division, USEPA, GA
2 National Health and Environmental Effects Research Laboratory, Mid-Continent
Ecology Division, USEPA, Duluth, MN
3 Laboratory of Mathematical Chemistry, Bourgas University, Bourgas, Bulgaria.

      Methods and tools are needed by the EPA's Office of Prevention, Pesticides, and
Toxic Substances (OPPTS) to evaluate and prioritize chemicals for toxicity testing and
hazard assessment, and to enhance the interpretation of registrant data that is
submitted as part of the regulatory process to improve human health and ecological risk
assessments. An often overlooked process, the metabolic activation of chemicals
(production of potentially hazardous transformation products from parent chemicals of
concern), is considered to be an important factor for assessing risk to the environment
and human health. The primary goals of this project are to enhance the  ability to
interpret metabolism data via development of a metabolism database (mammalian liver)
that is searchable by text and chemical structure and additionally to develop an in silico
capability for reliably forecasting the metabolism of xenobiotic chemicals of EPA
concern.
      Metabolism data, collected from the peer-reviewed literature and from registrant-
submitted data (required for chemical registration/re-registration), has been coded for
risk assessor evaluation/use and for development, training and improvement of a
metabolic simulator. Metabolic pathway information is electronically stored in MetaPath,
a software system allowing sophisticated chemical structure/substructure search
queries to identify commonalities and differences in metabolites among  chemicals,
species, dosing regimes, etc. The system depicts metabolic pathways and provides
rapid  retrieval of metabolism study information and associated metadata including
metabolite quantities where available.  The database will be used by OPPTS scientists
to increase efficiency of metabolism data access and analysis for performance of risk
assessments.
      An initial version of a metabolic simulator is under development. The simulator
utilizes a library of more than 340 "functional-group" transformations targeting both in
vitro and in vivo mammalian liver metabolism. Literature-derived, experimentally
determined metabolic maps for diverse chemicals were used for initial simulator
training, with performance of the simulator enhanced by expanding the chemical domain
through focused collection of additional metabolism maps for transformations underrepresented
in the initial training set. Future research will include linking metabolism  predictions with
exposure and toxic effects models to enhance prioritization tools for toxicity testing and
chemical assessments for large chemical lists of concern.
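
      The idea of a simulator driven by a library of "functional-group" transformations can be illustrated with a short Python sketch. This is a toy under stated assumptions and not the project's simulator or the MetaPath software: molecules are represented only as tuples of functional-group labels, the three-entry TRANSFORMATIONS table is a hypothetical stand-in for the real library of more than 340 transformations, and the function name simulate_metabolism is invented for the example.

```python
from collections import deque

# Hypothetical miniature transformation library:
# functional group -> (reaction name, resulting functional group).
TRANSFORMATIONS = {
    "aryl_methyl_ether": ("O-demethylation", "phenol"),
    "phenol": ("glucuronidation", "glucuronide"),
    "ester": ("hydrolysis", "carboxylic_acid"),
}


def simulate_metabolism(parent, max_depth=2):
    """Breadth-first expansion of a parent chemical into predicted metabolites.

    parent is an iterable of functional-group labels (an illustrative
    abstraction, not a chemical structure). Returns a list of
    (generation, reaction, product) tuples.
    """
    start = tuple(sorted(parent))
    seen = {start}
    queue = deque([(start, 0)])
    metabolites = []
    while queue:
        molecule, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested generation
        for i, group in enumerate(molecule):
            if group not in TRANSFORMATIONS:
                continue
            reaction, new_group = TRANSFORMATIONS[group]
            product = tuple(sorted(molecule[:i] + (new_group,) + molecule[i + 1:]))
            if product not in seen:
                seen.add(product)
                metabolites.append((depth + 1, reaction, product))
                queue.append((product, depth + 1))
    return metabolites
```

      Adding an entry to the table extends the reactions the sketch can propose, which loosely mirrors how collecting metabolism maps for underrepresented transformations would broaden a simulator's chemical domain.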
      The  potential impact of this work is significant as it provides much needed tools
to EPA Offices such as OPPTS and the scientific community for evaluating the potential
role of metabolism  in enhancing or diminishing toxicity. Linkage of these tools with
exposure and toxic effects models will assist Agency scientists in prioritizing large
chemical lists for further toxicity evaluations, especially for data poor chemicals. These
tools will also allow risk assessors to more systematically and efficiently assess the
hazard of both parent chemicals and their potentially bioactive metabolites. This work
was reviewed by EPA and approved for publication but does not necessarily reflect
official Agency policy.

-------
Risk Assessment of the Inflammogenic and Mutagenic Effects of Diesel Exhaust
Particulates: A Systems Biology Approach

James M.  Samet,  NHEERL, M. Ian Gilmour, NHEERL, William Reed,  UNC-Chapel Hill,
David DeMarini, NHEERL, William Linak, NRMRL, Seung Cho, Arcadia Corp., Dongsun
Cao, UNC-Chapel Hill,  Hugh Barton, NHEERL/NCCT

Diesel exhaust particulate matter (DEP) is a ubiquitous ambient air contaminant derived
from mobile and stationary diesel fuel combustion. Exposure to DEP is associated with
carcinogenic and  immunotoxic effects  in  humans  and experimental animals.  At the
cellular  level,  these health  effects are  underlain by  genotoxic and inflammatory
properties of chemical compounds  present in DEP. DEP is composed of  elemental,
inorganic and organic compounds that vary widely in composition with the source of the
fuel,  engine operating conditions,  sampling  methods  and other parameters. The
genotoxic  and  inflammatory potencies  of DEP also  vary  with  its  physicochemical
properties,  and these  differences  along with multiple  health  effects  impede the
development of targeted regulatory strategies for mitigating the impact of DEP exposure
on  human health. While traditional  reductive toxicology approaches are not likely to
succeed in quantifying relationships between DEP composition and its numerous health
effects, generating a database  for  modeling the  toxicological  effects  of DEP  would
provide a  framework for quantitative  hazard identification.  This  project  undertook  a
systems approach towards the development of a predictive computational model that
quantitatively describes relationships between the composition of DEP and its genotoxic
and inflammogenic  potencies.  In phase  1 (Specific Aim  1), 16 distinct  DEP  were
generated using a combination of  fuels,  engine  types, engine loads and  collection
temperatures. These DEP were characterized through extensive chemical and physical
analyses.  In phase 2 (Specific Aims 2 and 3), the inflammogenic and genotoxic
potencies of each of the 16 DEP were determined quantitatively. Specific bioassays
measured the expression of the pivotal  inflammatory mediator IL-8 in cultured human
lung cells in  response to DEP exposure. Signaling  mechanisms that regulate the
expression of IL-8/MIP-2  in response to DEP exposure were also examined  in order to
provide mechanistic  insight and support for the models. The genotoxicity of the 16 DEP
was assayed using bacterial mutagenicity  assays. Phase 3 (Specific Aim 4) will  utilize
the generated  data  to construct a series  of statistical and  mathematical models that
quantitatively relate DEP  composition, its inflammogenic and  mutagenic effects and the
relevant intracellular signaling mechanisms.   Projects funded by this start-up award
produced  new findings ranging from the  physicochemical  properties of DEP to the
molecular  mechanisms  of toxicity  of  DEP  inhalation. Specifically, these projects
generated data on combustion factors that influence the chemical speciation of DEP,
identified the signal  transduction mechanisms activated by  DEP  exposure  of human
lung  cells, and  ranked  and  characterized  the  genotoxicity  of DEP  of varying
composition. The  information provided by  these projects has decreased uncertainty in
the risk assessment of DEP exposure and provided biological plausibility in  support of
regulatory efforts aimed at mitigating the health effects of DEP inhalation.  This work
was reviewed by  EPA and approved for  publication but does not necessarily reflect
official Agency policy.

-------
Development of Microbial Metagenomic Markers for Environmental Monitoring

Jorge W. Santo Domingo, NRMRL/ORD, AWBERC, Cincinnati, OH

Microbiological impairment of water is assessed by monitoring for the presence of
sanitary indicator bacteria. However, conventional methods used to detect bacterial
indicators do not provide any information on the sources impacting water.  Recently
developed microbial source tracking (MST) methods have been used to determine the
sources of fecal pollution and pathogens affecting surface waters. While several studies
have reported the successful application of MST methods,  none of the methods can
meet performance expectations in complex water systems. One of the basic problems
of MST is the need to rely on the use of culture based methods to assess  the primary
sources of pollution. Our research group has developed and evaluated  nonculture-
based genomic methods for environmental monitoring and risk assessment based on
fecal metagenome (microbial community genome) and 16S rRNA gene sequences. To
identify potential host-specific markers we developed a novel competitive hybridization
approach called genome fragment enrichment (GFE).  Using GFE we have identified
dozens of metagenome specific gene fragments which we then used in assay
development. Thus far, we have developed markers to track human, cattle, and poultry
sources of fecal pollution. In addition,  using sequencing analyses of 16S rDNA clone
libraries we have  identified several novel markers for waterfowl (i.e., gulls  and geese).
We have applied the latter assays in samples collected from multiple sites in Lake
Ontario. The results from these studies have confirmed the importance  of  waterfowl as
primary sources of fecal pollution  in the region. The methods developed in this study will
be useful in epidemiological studies and in the evaluation of risk management practices
designed to prevent,  reduce, and  eliminate pollution of recreational waters and waters
used as sources of drinking water.
This work was reviewed by EPA and approved for publication but does  not necessarily
reflect official Agency policy.

-------
Develop a Systems Approach to Characterizing and Predicting Thyroid Toxicity
Using an Amphibian Model

Sigmund Degitz, Mike Hornung, Joseph Tietge, National Health and Environmental
Effects Research Laboratory, Mid-Continent Ecology Division, Duluth, MN

This research makes use of in vitro and in vivo approaches to understand and
discriminate the compensatory and toxicological responses of the highly regulated
hypothalamic-pituitary-thyroid (HPT) system. Development of an initial systems model will
be based on the current
understanding of the HPT axis and the compensatory processes involved in thyroid
hormone homeostasis. Experiments have been conducted to better understand the
relationships of the critical sub-components of the system. Particular emphasis has
been placed on  understanding the relative importance of gene expression in the
pituitary, thyroid, and peripheral tissues under normal conditions and following exposure
to chemicals known to interfere with thyroid hormone (TH) synthesis. These molecular
changes are being linked to functional measurements of key hormones and enzymes
that are part of the HPT pathway, all of which are being interpreted in the context of
organismal-level effects.
The primary goal of this work is to develop a sufficient understanding of the HPT axis so
that predictive models can be developed, testing protocols can be abbreviated, and efforts
in inter-species extrapolation can be improved. One of the most likely uses for an HPT
systems model is to aid in the understanding and discrimination of different modes of
action. As such, this work further enables the development of quantitative
structure-activity relationships (QSARs) by providing a basis for sorting chemicals by
mode of action, a necessary step prior to quantifying features of chemical structure
associated with a particular type of toxicity. If these relationships can ultimately be
established, then predictive models can be developed to rank chemicals for future in vivo
testing.
This work was reviewed by EPA and approved for publication but does not necessarily
reflect official Agency policy.

-------

Session II

Poster  Title                                                           Presenter's Bio

1       Bioactivity Profiling Results from ToxCast Phase I Assays       K Houck
2       ToxMiner™ Software Interface for Visualizing and Analyzing      M Martin
        ToxCast Data
3       Evaluating the Toxicity Pathways Using High-Throughput          H Mortensen
        Environmental Chemical Data
4       Modeling and Predicting Cancer from ToxCast Phase I Data        R Judson
5       Endocrine Profiling and Prioritization Using ToxCast Assays     D Reif
6       Computational Molecular Modeling Methods for Screening for      J Rabinowitz
        Chemical Toxicity: The Toxicant-Target Approach
7       Development of high throughput methods for chemical             W Mundy et al (NHEERL)
        screening and prioritization
8       The Second Phase of ToxCast and Initial Applications to         D Dix
        Chemical Prioritization
9       Nuclear Receptor Activity and Liver Cancer Lesion Progression   I Shah
10      A Virtual Liver for Simulating Chemical-Induced Injury          J Wambaugh
11      Experimental Models for Quantitative Analysis of                Chris Gorton (NHEERL)
        Hepatocarcinogenesis
12      EPA's Virtual Embryo: Modeling Developmental Toxicity           T Knudsen
13      Predictive Modeling of Developmental and Reproductive           Hunter et al
        Toxicity Pathways
14      Adaptive Responses to Prochloraz Exposure that Alter            R Conolly
        Dose-Response and Time-Course Behaviors
15      Applying Uncertainty Analysis to a Risk Assessment for the      R Setzer
        Pesticide Permethrin
16      Methodology for Uncertainty Analysis of Dynamic                 J Davis
        Computational Toxicology Models

-------
Bioactivity Profiling Results from ToxCast Phase I Assays

Keith Houck, National Center for Computational Toxicology, USEPA, RTP, NC

The ToxCast™ Phase I library of 309 chemicals (320 substances including replicates)
was profiled against 470 in vitro assays to generate data to build initial predictive
models of in vivo toxicity. The in vitro assays included nine different technologies
encompassing cell-free, high-throughput screening assays as well as cell-based assays
in a variety of cell lines, primary human cells, and primary rat hepatocytes.
Concentration-response data were collected for all active chemical-assay combinations.
Data reproducibility was assessed by calculating the concordance of results for the
chemical replicates. Cell-based assays were, in general, more sensitive to chemical
effects than were biochemical assays. Chemical responses were diverse with a wide
range of promiscuity. Many expected interactions were noted in the data, including
endocrine and xenobiotic metabolism enzyme activity. These data are being used to
build correlations to in vivo toxicity endpoints as captured in the relational database
ToxRefDB.
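
The replicate concordance mentioned above can be illustrated with a minimal sketch. The abstract does not say how concordance was computed, so this assumes the simplest definition: the fraction of assays in which two replicate samples of the same chemical receive the same active/inactive call. The function name and the example calls are hypothetical.

```python
def replicate_concordance(calls_a, calls_b):
    """Fraction of assays where two replicate samples share the same call.

    calls_a, calls_b: equal-length sequences of booleans (True = active),
    one entry per assay, for two replicates of the same chemical.
    """
    if len(calls_a) != len(calls_b) or len(calls_a) == 0:
        raise ValueError("replicates must cover the same non-empty assay set")
    agreements = sum(1 for a, b in zip(calls_a, calls_b) if a == b)
    return agreements / len(calls_a)


# Hypothetical calls for one chemical run twice across five assays.
rep1 = [True, False, False, True, False]
rep2 = [True, False, True, True, False]
print(replicate_concordance(rep1, rep2))  # 4 of 5 calls agree -> 0.8
```

Other agreement statistics (for example, ones that correct for chance agreement) could be substituted without changing the overall idea.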
This work was reviewed by EPA and approved for publication but does not necessarily
reflect official Agency policy.

-------
ToxMiner™ Software Interface for Visualizing and Analyzing ToxCast Data

Matthew T. Martin2, Ling-Chieh Tsai1, David J. Dix2, Richard S. Judson2, David M. Reif2,
Russell S. Thomas1

1 The Hamner Institutes for Health Sciences, 6 Davis Drive, Research Triangle Park, NC
27709
2 National Center for Computational Toxicology, USEPA, RTP, NC 27709

The ToxCast™ dataset represents a collection of assays and endpoints that will require
both standard statistical approaches and customized data analysis workflows. To
analyze this unique dataset, we have developed an integrated database with a
Java-based interface called ToxMiner. The software is organized around both
chemical-centric and assay-centric visualization and analysis tools. Currently, these
tools include standard views of chemical and assay properties and results, the ability
to define subsets of chemicals and assay results based on specified criteria, correlation
matrices across assays, hierarchical clustering, and relative risk calculations. Machine
learning algorithms have been added using Weka. The ToxMiner software and database
will be made freely available.
This work was reviewed by EPA and approved for publication but does not necessarily
reflect official Agency policy.
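
As an illustration of the relative risk calculations listed among the analysis tools, the sketch below uses the standard epidemiological definition applied to a 2x2 table of assay call versus in vivo outcome. It is not ToxMiner code; the function name, argument names, and counts are hypothetical.

```python
def relative_risk(active_pos, active_neg, inactive_pos, inactive_neg):
    """Relative risk of an in vivo endpoint given an in vitro assay call.

    active_pos / active_neg: assay-active chemicals with / without the endpoint.
    inactive_pos / inactive_neg: assay-inactive chemicals with / without it.
    """
    n_active = active_pos + active_neg
    n_inactive = inactive_pos + inactive_neg
    if n_active == 0 or n_inactive == 0 or inactive_pos == 0:
        raise ValueError("relative risk is undefined for this table")
    risk_active = active_pos / n_active        # endpoint rate among actives
    risk_inactive = inactive_pos / n_inactive  # endpoint rate among inactives
    return risk_active / risk_inactive


# Hypothetical counts: 12 of 30 assay-active chemicals show the endpoint,
# versus 10 of 100 assay-inactive chemicals.
rr = relative_risk(active_pos=12, active_neg=18, inactive_pos=10, inactive_neg=90)
print(round(rr, 2))  # 0.4 / 0.1 -> 4.0
```

A value well above 1 would suggest that activity in the assay is associated with the in vivo endpoint; confidence intervals would be needed before drawing conclusions from small counts.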

-------
Evaluating the Toxicity Pathways Using High-Throughput Environmental
Chemical Data

Holly M. Mortensen*, David Reif, David Dix, Keith Houck, Robert Kavlock,
Richard Judson, National Center for Computational Toxicology, USEPA, RTP,
NC

The application of high-throughput screening (HTS) methods to the characterization
of human phenotypic response to environmental chemicals is a largely unexplored
area of pharmacogenomics. The U.S. Environmental Protection Agency (EPA), through
its ToxCast™ program, is developing predictive toxicity approaches that use in
vitro HTS to profile and model the bioactivity of environmental chemicals. Current
efforts draw from the extensive use of HTS technologies by pharma and biotech
industries for the purposes of drug discovery, with notable similarities and
differences. Output from the first phase of these experiments has been used to
construct target gene lists that have been linked with publicly available
information on gene and protein annotation, molecular, biological, and cellular
pathways and processes, and gene-disease associations. These data are integrated,
and can be accessed and queried using the ToxMiner™ database. Currently there is
no standard for analysis of available gene-pathway interaction data, and most
studies to date have focused on a single data source; however, by looking across
pathway data sources we illustrate, using computational network methods,
previously undefined toxicity and toxicity-related pathway coverage in relation to
global pathway space. Finally, we illustrate which pathways are affected by the
ToxCast™ chemicals screened in Phase I, and the relation of those pathways to
human disease.

This work was reviewed by EPA and approved for publication but does not
necessarily reflect official Agency policy.

-------
Modeling and Predicting Cancer from ToxCast Phase I Data

Authors: Richard Judson, David Dix, Robert Kavlock, Imran Shah, Keith Houck,
Thomas Knudsen, David Reif, Matt Martin, National Center for Computational
Toxicology, USEPA, RTP, NC

The ToxCast™ program is generating a diverse collection of in vitro cell-free and
cell-based HTS data to be used for predictive modeling of in vivo toxicity. We are
using these in vitro data, plus corresponding in vivo data from ToxRefDB, to
develop models for prediction and prioritization. This poster will focus on a set of
machine-learning-based models that produce toxicity signatures, which are
algorithms that yield a toxicity class prediction based on an association between
in vitro assay data and an in vivo endpoint, derived from training examples. We
demonstrate this approach with a signature for rat liver proliferative lesions using
data from the chemicals with rat chronic/cancer data in ToxRefDB, 61 of which
are positive for this endpoint. We also demonstrate the use of derived gene and
pathway perturbation scores, which are more aggregated predictors that can be
used in machine learning approaches. Qualitative uses of these perturbation
scores are demonstrated with relation to other in vivo endpoints in rodents and
humans, including their use in predicting whether a chemical will be a probable
human carcinogen.
This work was reviewed by EPA and approved for publication but does not
necessarily reflect official Agency policy.

-------
Endocrine Profiling and Prioritization Using ToxCast Assays

David Reif1, Matthew Martin1, Keith Houck1, Richard Judson1, Thomas Knudsen1,
Shirlee Tan2, Vicki Dellarco3, David Dix1 and Robert Kavlock1

1 National Center for Computational Toxicology, USEPA, RTP, NC
2 Office of Science Coordination and Policy, Washington, DC
3 Office of Pesticide Programs, USEPA, Washington, DC

The U.S. EPA's Endocrine Disruptor Screening Program (EDSP) is charged with
screening pesticide chemicals and environmental contaminants for their potential to
affect the endocrine systems of humans and wildlife (http://www.epa.gov/endo/). The
prioritization of chemicals for testing is a goal shared by both the EDSP and the U.S.
EPA's ToxCast™ program (http://epa.gov/ncct/toxcast/), in which a battery of 467 in
vitro, high-throughput screening assays has been used to assess a library of 309
environmental chemicals at a cost of less than 1% of that required for full-scale animal
testing. In order to aid the EDSP, we describe putative endocrine profiles for the entire
ToxCast™ library of 309 unique chemicals by focusing on assays involving the estrogen
(n=5), androgen (n=4) and thyroid (n=4) signaling pathways, as well as other nuclear
receptors and xenobiotic metabolizing enzymes (n=70) that have potential relevance to
endocrine signaling. Using these multi-assay profiles in combination with information on
relevant chemical properties, toxicity pathways, and in vivo study results, we present a
flexible ranking system by which chemicals can be prioritized for further screening. By
incorporating multiple sources of information (in vitro assays + chemical descriptors +
pathways + in vivo studies), this prioritization system offers a comprehensive look at a
given chemical's toxicity signature. Importantly, the signatures provide a transparent
look at the relative contribution of all information sources that determine an overall
priority ranking. The results demonstrate that combining multiple data sources into an
overall weight-of-evidence approach for prioritizing further chemical testing results in
more robust conclusions than any single line of support taken alone.
This work was reviewed by EPA and approved for publication but does not necessarily
reflect official Agency policy.

-------
Computational Molecular Modeling Methods for Screening for Chemical Toxicity:
The Toxicant-Target Approach

JR Rabinowitz and SB Little, National Center for Computational Toxicology, USEPA,
RTP, NC

The risk posed to human health and the environment by chemicals that result from
human activity often must be evaluated when relevant elements of the preferred data
set are unavailable. Therefore, strategies are needed that estimate this information and
prioritize the outstanding data requirements. Knowledge of the potential mechanisms
for activity provides a rational basis for the extrapolations inherent in the preliminary
evaluation of risk and the establishment of priorities for obtaining missing data for
environmental chemicals. The differential step in many mechanisms of toxicity may be
generalized as the interaction between a small molecule (a potential toxicant) and one
or more macromolecular targets. An approach based on computation of the interaction
between a potential molecular toxicant and a library of macromolecular targets of
toxicity has been proposed for chemical screening. A library of potential protein targets
for chemical toxicity has been developed from the Protein Data Bank
(www.rcsb.org/pdb). As a test of this approach, the interactions between targets
constructed from the rat estrogen receptor (ER) and molecules in a data set of chemicals
tested for the capacity to compete with the natural ligand for that receptor have been
computed using molecular "docking" methods. These methods were developed to aid
in the discovery of new pharmaceuticals (chemicals that bind strongly to the receptor).
In this application they are being tested for their capacity to identify molecules that bind
weakly to the receptor in a database of primarily inactive chemicals. The data set
being studied (KIERBL in DSSTox) contains 280 chemicals plus 17β-estradiol. Of
these chemicals, 14 compete with the natural ligand for the receptor and each binds 3-5
orders of magnitude more weakly than 17β-estradiol. Two different rapid computational
"docking" methods have been applied. Using these without consideration of the
geometry of binding between the toxicant and the target, all of the active molecules
were discovered in the first 16% of the chemicals. When a filter is applied based on the
geometry of a simplified pharmacophore for binding to the ER, the results are improved
and all of the active molecules were discovered in the first 8% of the chemicals. To
obtain no false negatives in the model that includes the pharmacophore filter, only 8
of the 280 molecules are false positives. These results indicate that molecular
"docking" algorithms that were designed to find the chemicals that act most strongly at a
receptor can efficiently separate weakly active chemicals from a library of primarily
inactive chemicals. The advantage of using a pharmacophore filter suggests that the
development of filters of this type for other receptors will prove valuable for other
potential targets. This approach may be used in conjunction with other molecular
parameters and bioassay data to address chemical prioritization. The evaluation of the
capability of these methods, or any multi-parameter method for chemical screening,
requires an understanding of the position of an untested chemical in the parameter
space of the model. A method for determining this position in the space of relevant
parameters is being developed.
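
The screening figures quoted above are mutually consistent: with 14 actives among 280 chemicals, recovering every active at the cost of 8 false positives means screening the top 22 ranked chemicals, or roughly 8% of the library. The sketch below shows that arithmetic on a ranked list. The ranking itself is invented for illustration (only the totals 280, 14, and 8 come from the abstract), and the function name is hypothetical.

```python
def screening_depth(ranked_is_active):
    """Return (fraction_screened, false_positives) at the last active's rank.

    ranked_is_active: booleans ordered from best to worst docking score,
    True where the chemical is a known binder.
    """
    n_actives = sum(ranked_is_active)
    if n_actives == 0:
        raise ValueError("ranking contains no active chemicals")
    # 1-based rank of the lowest-ranked active chemical.
    last_active_rank = max(
        rank for rank, active in enumerate(ranked_is_active, start=1) if active
    )
    fraction_screened = last_active_rank / len(ranked_is_active)
    # Inactive chemicals that must be accepted to avoid any false negative.
    false_positives = last_active_rank - n_actives
    return fraction_screened, false_positives


# Hypothetical ranking: 280 chemicals, 14 actives, the last active at rank 22.
ACTIVE_RANKS = {1, 2, 3, 4, 5, 7, 8, 10, 12, 14, 15, 18, 20, 22}
ranking = [rank in ACTIVE_RANKS for rank in range(1, 281)]

fraction, false_pos = screening_depth(ranking)
print(f"screened {fraction:.1%} of the library, {false_pos} false positives")
```

With this invented ranking the script reports 7.9% of the library screened and 8 false positives, matching the rounded figures in the abstract.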
This work was reviewed by EPA and approved for publication but does not necessarily
reflect official Agency policy.

-------
Development of high throughput methods for chemical screening and
prioritization

T. Shafer1, W. Mundy1, S. Simmons1, R. Luebke1, K. Crofton1, K. Houck2, D. Dix2

1 National Health and Environmental Effects Research Laboratory, USEPA, RTP, NC
2 National Center for Computational Toxicology, USEPA, RTP, NC

To implement the predictive toxicity testing envisioned in the NAS report on Toxicity
Testing in the 21st Century, rapid and efficient in vitro screens that are clearly linked to
adverse outcomes are needed. While high throughput screens exist for many different
biological endpoints, there are adverse outcomes for which efficient screens currently
do not exist. These include developmental neurotoxicity, immunotoxicity, and cellular
stress response pathways. The goal of this research program is to develop cell-based
assays linked to adverse outcomes. Two stress response pathway-based assays that
measure the activation of oxidative stress and unfolded protein (heat shock) responses
in human liver cells have been developed to date. Screens for potential developmental
neurotoxicants are being developed by identifying and utilizing high throughput, in vitro
assays for critical developmental processes that are sensitive to perturbation by
toxicants. A new project will develop in vitro screening assays for immunotoxicity, based
on cellular signaling by antigen processing cells and effector cells. This work will focus
on toxicant-induced alteration in cytokine production, which in turn modulates immune
function and development of allergic diseases. To date, screening assays for
proliferation, neurite outgrowth and cytotoxicity have been evaluated and used to
screen the ToxCast 320 chemicals at a single concentration. Compounds (~120) that
altered an endpoint by >3 standard deviations from the control mean were further
characterized with complete concentration-response curves. Analysis of these data to
determine relative potency and efficacy of these compounds is ongoing. Additionally,
the ToxCast 320 chemicals were screened using the two stress response pathway assays
in human liver cells. These data provided 15-point concentration-response curves for
each compound, which facilitated discrimination between "active" and "inactive"
compounds and generated relative potency (AC50) and efficacy information. These
stress assays are currently being used to screen compounds of interest to OPPT, and
have also been transferred to the NCGC for incorporation into their testing battery.
Overall, these results demonstrate the utility of cell-based HTS assays to screen and
prioritize chemicals for toxicity testing.
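
The two screening steps described above can be sketched briefly: flagging a single-concentration result that lies more than 3 standard deviations from the control mean, and reading potency (AC50) from a concentration-response curve. The abstract does not name a curve model, so the Hill equation used here is an assumption, as are the function names and all numeric values.

```python
import statistics


def is_hit(value, control_values, n_sd=3.0):
    """True if value lies more than n_sd sample SDs from the control mean."""
    mean = statistics.mean(control_values)
    sd = statistics.stdev(control_values)  # sample standard deviation
    return abs(value - mean) > n_sd * sd


def hill_response(conc, top, ac50, slope=1.0):
    """Hill-model response; equals top / 2 when conc == ac50."""
    return top * conc**slope / (ac50**slope + conc**slope)


# Hypothetical control wells (percent of control signal).
controls = [98.0, 101.0, 100.0, 99.0, 102.0]
print(is_hit(94.0, controls))  # 6 units from the mean, 3 SD is about 4.7
print(is_hit(97.0, controls))  # 3 units from the mean, inside the window

# Hypothetical compound with efficacy (top) 100 and AC50 of 10 uM.
print(hill_response(10.0, top=100.0, ac50=10.0))  # half-maximal response
```

In practice the Hill parameters would be fitted to the measured multi-point curve rather than assumed, and efficacy corresponds to the fitted top of the curve.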
This work was reviewed by EPA and approved for publication but does not necessarily
reflect official Agency policy.

-------
The Second Phase of ToxCast and Initial Applications to Chemical Prioritization

David Dix1, Keith Houck1, Richard Judson1,  Robert Kavlock1, Stephen Little1, Matthew Martin1, Holly
Mortensen1, David Reif1, Ann Richard1, Woodrow Setzer1, Andrew Beam2, Daniel Rotroff3, Maritja
Wolf4

1 National Center for Computational Toxicology, USEPA, RTP, NC
2 Dept. of Statistics, North Carolina State University, Raleigh, NC
3 Department of Environmental Sciences and Engineering, University of North Carolina at Chapel Hill,
NC
4 Lockheed Martin (Contractor to U.S. EPA), Research Triangle Park, NC

Tens of thousands of chemicals and other contaminants exist in our environment, but only a fraction of
these have been characterized for their potential hazard to humans. ToxCast™ is focused on closing
this data gap and improving the management of chemical risk through a high throughput screening
(HTS) system providing bioactivity profiles in a broad range of pathways relevant to carcinogenic,
mutagenic, reproductive and chronic toxicity. In Phase II of ToxCast™, 700 additional chemicals will
be screened in many of the same 467 assays used in Phase I, as well as additional HTS assays being
added to the program. These include cell-free assays, cell-based assays in a variety of human and rodent
primary cells and cell lines, and assays in zebrafish and other non-mammalian species. The 700
chemicals being prepared for Phase II of ToxCast™ are a subset of the 10,000-chemical library being
assembled for Tox21 HTS testing by NTP-NCGC-EPA. Approximately 400 Phase II chemicals are
pesticidal actives, inerts or antimicrobials, or industrial chemicals rich in existing animal toxicity data
and thus useful for verifying and expanding predictive toxicity signatures and pathways from Phase I
screening. Approximately 150 Phase II chemicals are pesticidal inerts, antimicrobials, or industrial
chemicals with limited toxicity data and in need of chemical categorization and prioritization. Another
150 Phase II chemicals are failed pharmaceutical compounds with animal and human toxicity data for
direct confirmation of human toxicity pathways and predictors. ToxCast™ HTS assays will provide an
in vitro threshold concentration for specific biological targets, genes, pathways or predictors, which in
combination with high throughput pharmacokinetic modeling can provide an estimated dose at which a
similar in vivo activity is expected. This estimated in vivo equivalent dose can then be used in
combination with chemical structures from DSSTox, and existing in vitro and in vivo toxicity data from
ToxRefDB, for predictive modeling and chemical prioritization. Upon successful completion of Phase
II, the ToxCast™ program will be prepared to conduct rapid, quantitative and high-quality hazard
characterizations and subsequent prioritizations on thousands of chemicals. In combination with
ExpoCast™ profiling of exposure potential, these chemical prioritizations can eventually be based
upon both hazard and exposure characterizations, providing ranking of chemicals for entry into
targeted testing specific to carcinogenic, mutagenic, developmental and reproductive, or chronic
toxicity. Decision support software for chemical prioritizations is being provided by ToxMiner™, an
integrated database and interface for ToxCast™ data analysis, visualization and uncertainty
assessment. ToxMiner is user-friendly and will be freely available, facilitating widespread
implementation. ToxCast data will also be available through ACToR, other EPA websites, and
PubChem. ToxCast provides a means of generating meaningful data on the thousands of untested
environmental chemicals and, with associated tools, a way to use these data to guide more intelligent,
targeted testing of environmental chemicals in the future.
This abstract was reviewed by EPA and approved for publication, but may not necessarily reflect
official Agency policy.

-------
Nuclear Receptor Activity and Liver Cancer Lesion Progression

Imran Shah, Keith Houck, Richard S. Judson, Robert J. Kavlock, Matthew T. Martin, John
Wambaugh, David J. Dix, National Center for Computational Toxicology, US EPA, RTP, NC

Nuclear receptors (NRs) are ligand-activated transcription factors that control diverse cellular
processes. Chronic stimulation of some NRs is a non-genotoxic mechanism of rodent liver
cancer with unclear relevance to humans. We explored this question using human CAR, PXR,
PPARα, LXR, ER, AR and AhR activity assays for 309 environmental chemicals from
ToxCast™, and liver histopathology data from long-term rodent testing studies from
ToxRefDB. The chemicals activated multiple human NRs in combinations that were
informative about rodent liver injury. In addition, some surprising relationships were observed
between the degree of human NR activity and the severity of hepatic lesions progressing to
cancer. The results have implications for nuclear receptor chemical biology and the
extrapolation of in vitro data for predicting liver cancer in humans. In this poster we report on
this analysis, highlighting putative relationships between NRs and cancer lesion progression.
Furthermore, we describe the selection of chemicals and cellular endpoints for modeling
NR-mediated mitogenic, mutagenic and cytotoxic processes involved in hepatocarcinogenesis.
This work was reviewed by EPA and approved for publication but does not necessarily reflect
official agency policy.

-------
A Virtual Liver for Simulating Chemical-Induced Injury

J Wambaugh, J Jack, and I Shah, National Center for Computational Toxicology,
USEPA, RTP, NC

The US EPA Virtual Liver (v-Liver™) is a tissue simulator that is designed to predict
histopathologic lesions, the gold standard for toxicity. We have developed an
approach for a biologically motivated model of a canonical liver lobule. The simulated
lobule is composed of discrete representations of hepatic cells that can each determine
their state and fate in response to their local environment, which is determined by a
dynamic graph of the interactions between cells and vascular segments.

We are simultaneously developing two interacting models - one model for the cellular
dynamics that drives how an individual hepatocyte responds to its local environment,
including local concentrations of endogenous and xenobiotic compounds, and a second,
tissue model that determines the local environment of each simulated hepatocyte,
including the impact of whole-organism environmental exposure. The two interacting
models provide a working framework in which research focusing on refining specific
aspects of a single scale, i.e. cellular or tissue, can determine consequences on both
scales.

We are investigating the molecular mechanisms underlying chemically-induced
physiological changes in hepatocytes.  Specifically, we are focusing on the roles of a
subset of the nuclear receptor superfamily - the so-called adopted orphan nuclear
receptors. Through in silico models, we hope to elucidate the biochemical processes
governing the important hepatocellular processes associated with the diseases and
disorders of liver toxicity.  The chemical induction of nuclear receptors has been linked
to a variety of important cellular processes in the liver, including proliferation, steatosis,
apoptosis, necrosis, and hyperplasia.  Nuclear receptor-mediated effects can have
drastic consequences in rodent hepatocytes.  Building in silico models will help to reveal
the differences between the human and rodent cell behavior with respect to nuclear
receptor activation/inhibition.

The Virtual Liver cellular dynamics model requires two modules of cellular signaling
networks: one for the effects of chemicals on nuclear receptor activation, crosstalk, and
regulation of gene expression, and a second describing the effects of nuclear receptor-
mediated gene expression on the cell signaling pathways, ultimately predicting changes
in cellular phenotypes. Whereas the first module is an investigation into gene
expression, the second module is the realization of that gene expression within the
context of normal cellular function. The second module provides a causal link between
nuclear receptor-mediated gene expression and cellular changes, including
proliferation, survival, death, and disease (cancer).

This poster will present preliminary aspects of the first cellular dynamics module.
Literature curation, the ToxCast™ data set, and the v-Liver™ Knowledgebase are being
used to establish a nuclear receptor crosstalk and gene expression simulation model.
We are interested in modeling these activities with a threshold networks approach to
Boolean networks that, while relatively simple, is capable of capturing the dynamics of
cellular processes with very low computational overhead.
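
To make the threshold-network idea concrete, the sketch below runs a tiny synchronous Boolean network in which a node switches on when the weighted sum of its inputs reaches its threshold. It is an illustration of the general technique only, not the v-Liver™ model: the node names, weights, and thresholds are all hypothetical.

```python
def step(state, weights, thresholds):
    """One synchronous update of a threshold Boolean network.

    state: dict of node -> 0 or 1.
    weights: dict of node -> {input_node: weight}; nodes without an entry
        are treated as external inputs and keep their current value.
    thresholds: dict of node -> activation threshold.
    """
    new_state = {}
    for node, value in state.items():
        if node not in weights:
            new_state[node] = value  # external input, held fixed
            continue
        total = sum(w * state[src] for src, w in weights[node].items())
        new_state[node] = 1 if total >= thresholds[node] else 0
    return new_state


# Hypothetical wiring: a ligand activates receptor_a, receptor_a represses
# receptor_b (crosstalk), and receptor_a switches on a target gene.
weights = {
    "receptor_a": {"ligand": 1},
    "receptor_b": {"receptor_a": -1},
    "target_gene": {"receptor_a": 1},
}
thresholds = {"receptor_a": 1, "receptor_b": 0, "target_gene": 1}

state = {"ligand": 1, "receptor_a": 0, "receptor_b": 1, "target_gene": 0}
for _ in range(3):
    state = step(state, weights, thresholds)
print(state)
```

After a few updates this toy network settles into a fixed point with receptor_a and target_gene on and receptor_b off. Each update costs only a handful of additions per node, which is the low computational overhead the abstract refers to.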

The Virtual Liver tissue model for microdosimetry makes use of a modified
physiologically-based pharmacokinetic (PBPK) approach to determine local
concentrations throughout the simulated lobule. Based upon ordinary differential
equations, this microdosimetry approach bypasses computationally intensive fluid
dynamics to rapidly determine the impact of environmental exposure on individual
hepatocytes within the simulated lobule. By providing a spatially extended environment,
the tissue model is intended to support physiologically based models for inter-cellular
communication and lesion progression as allowed by the cellular dynamics model. We
will also be presenting results of our microdosimetry model as they pertain to lobule
layout, heterogeneity of the hepatocellular environment, and the consequences of oral
vs. inhalation exposure.
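
The following is a heavily simplified sketch of how ordinary differential equations can stand in for fluid dynamics when estimating local concentrations. It assumes a single sinusoid divided into well-mixed zones in series, each receiving flow from the zone upstream and losing compound by first-order clearance, and it integrates the equations with a fixed-step Euler scheme. The compartment layout, function name, and parameter values are assumptions made for illustration and are not taken from the v-Liver™ tissue model.

```python
def simulate_sinusoid(c_in, n_zones, flow_per_volume, clearance, dt, n_steps):
    """Euler integration of dC_i/dt = (Q/V) * (C_{i-1} - C_i) - k * C_i.

    c_in: constant concentration entering the first zone.
    flow_per_volume: Q/V, flow divided by zone volume (1/time).
    clearance: k, first-order removal rate in each zone (1/time).
    Returns the list of zone concentrations after n_steps steps of size dt.
    """
    conc = [0.0] * n_zones
    for _ in range(n_steps):
        upstream = [c_in] + conc[:-1]  # concentration entering each zone
        derivs = [
            flow_per_volume * (up - c) - clearance * c
            for up, c in zip(upstream, conc)
        ]
        conc = [c + dt * d for c, d in zip(conc, derivs)]
    return conc


# Hypothetical parameters in arbitrary units.
zones = simulate_sinusoid(
    c_in=1.0, n_zones=3, flow_per_volume=1.0, clearance=0.25,
    dt=0.01, n_steps=5000,
)
print([round(c, 3) for c in zones])
```

At steady state each zone holds (Q/V) / (Q/V + k) = 0.8 times the concentration of the zone upstream, so the simulated values approach 0.8, 0.64, and 0.512. That gradient along the flow path is one simple source of heterogeneity in the hepatocellular environment.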

The Virtual Liver ultimately provides a framework for making predictions of in vivo
consequences based upon in vitro data. The cellular dynamics model is intended to be
calibrated with in vitro measures of chemical activity, while the tissue model can be
calibrated with histopathology slides and pharmacokinetic data. As chemically
perturbable simulations of homeostatic tissue function, virtual tissues will be powerful
tools for 21st Century Toxicology.

This work was reviewed by EPA and approved for publication but does not necessarily
reflect official Agency policy.

-------
Experimental Models for Quantitative Analysis of Hepatocarcinogenesis

Chris Gorton2, John Jack1, John Wambaugh1 and Imran Shah1
1 National Center for Computational Toxicology (NCCT), US EPA, RTP, NC
2 National Health and Environmental Effects Lab (NHEERL), US EPA, RTP, NC

Predictive models of chemical-induced liver cancer face the challenge of bridging
causative molecular mechanisms to adverse clinical outcomes. The latent sequence of
intervening events from chemical insult to toxicity is poorly understood because it
spans multiple levels of biological organization and timescales. The availability of
high-throughput molecular assays provides a global view of epigenetic, transcriptional
and pathway-level changes that can shed much needed light on the regulatory networks
perturbed by xenobiotic stressors. A key challenge in this process is to resolve the role
of these networks in the normal homeostatic response of cells as opposed to
irreversible alterations due to persistent stress. Linking molecular mechanisms to
neoplastic lesions will require quantitative assays on molecular changes and altered
cellular phenotypes, as that is the level of biological organization at which tissue
damage becomes manifest. This poster outlines the collaboration between NHEERL
and NCCT on an integrative experimental strategy aimed at developing a model of
nuclear receptor (NR)-mediated hepatocarcinogenesis (The US EPA Virtual Liver
v-Liver™). As a proof of concept we are using 20 NR-activating chemicals from the EPA
ToxCast™ Program to design short-term in vitro and in vivo studies to generate data on
a range of molecular and cellular endpoints.

This work was reviewed by EPA and approved for publication but does not necessarily
reflect official agency policy.

-------
EPA's Virtual Embryo: Modeling Developmental Toxicity

Knudsen TB1, Singh AV2, Rountree MR3, DeWoskin RS4 and Spencer RM2
1 National Center for Computational Toxicology, USEPA, RTP, NC
2 Lockheed Martin, Contractor to the USEPA, RTP, NC
3 National Center for Computational Toxicology, USEPA, RTP, NC
4 National Center for Environmental Assessment, USEPA, RTP, NC

Embryogenesis is regulated by concurrent activities of signaling pathways organized into
networks that control spatial patterning, molecular clocks, morphogenetic rearrangements
and cell differentiation. Quantitative mathematical and computational models are needed
to better understand how genetic errors and biochemical disruptions may perturb these
complex processes, leading to developmental defects. EPA's Virtual Embryo
(v-Embryo™) is an effort to build cell-based computational models using detailed
knowledge of molecular embryology and data from the ToxCast™ high-throughput in vitro
screening effort. The end goal is a library of simulations that can be manipulated in
silico and correlated with in vitro responses or in vivo phenotypes in predictive modeling
of developmental processes and toxicities. The Specific Aims of the project are to: build a
virtual tissue knowledgebase (VT-KB) relevant to development; construct a virtual tissue
simulation engine (VT-SE) for embryonic systems; specify rules for component
interactions of developmental signaling pathways; and analyze abnormal developmental
trajectories that follow perturbations. Software for these purposes includes open-access
programming environments such as CompuCell3D, Python, BioTapestry and GanttPV.
Initial models for proof of principle are focusing on two systems with extensive
experimental embryology and targets for disruption by environmental chemicals: limb-bud
development and optic cup development. The modeling effort can enhance EPA efforts
applying the latest scientific knowledge in quantitative models of dose-response
relationships and uncertainty analysis of developmental and reproductive toxicity. [This
work has been reviewed by EPA and cleared for presentation, but does not reflect official
Agency policy.]

-------
Predictive Modeling of Developmental and Reproductive Toxicity Pathways

Hunter ES1, Padilla S1, and Knudsen TB2
1 National Health and Environmental Effects Research Laboratory, USEPA, RTP, NC
2 National Center for Computational Toxicology, USEPA, RTP, NC

EPA must evaluate environmental chemicals for potential effects on development and
reproduction. Mechanistic information is essential to understanding how chemicals perturb
development; unfortunately, the mechanisms of prenatal developmental toxicity are not
understood in sufficient depth or detail for risk assessment purposes. This presentation
will explore new approaches for pathway-based prediction of developmental toxicity using
data from high-throughput screening (HTS) assays and from alternative models. These in
vitro platforms include: ToxCast™ HTS biochemical assays (NovaScreen), murine
embryonic stem (ES) cell lines, and zebrafish (ZF) embryos. Anchoring in vivo data was
obtained from Toxicity Reference Database (ToxRefDB) and includes chemical-endpoint
relationships for developmental  endpoints in pregnant rat and rabbit studies. Results have
been obtained for an initial pass of the ToxCast_320 chemical library in the ES cell and ZF
embryo assays, using one concentration level (25 µM and 80 µM, respectively).
Chemicals were ranked by activity in the ES cells (cytotoxicity, myosin heavy chain
immunoreactivity) and ZF larva (malformations, embryo lethality). Data inclusive of all the
diverse platforms and species studied here was obtained for ~215 chemicals. A
preliminary analysis of the single concentration data is being undertaken to correlate
relative developmental activities of chemicals against ES and ZF systems with the
NovaScreen and with ToxRefDB, to identify potential target pathways leading to adverse
developmental and reproductive endpoints. Although many of the important molecular
components of embryogenesis are highly conserved in these species, the developmental
processes and strategies can differ markedly; therefore, initial efforts are focused on data
standardization and calibration.  [This work has been reviewed by EPA and cleared for
presentation, but does not reflect official Agency policy].
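
The cross-platform comparison described above can be illustrated with a rank correlation.
This is only a sketch under stated assumptions: the abstract does not name a statistic,
Spearman correlation is swapped in here as one plausible choice, and the function names
and activity scores below are hypothetical, not ToxCast data.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical activity scores for five chemicals in two assay systems.
es_cell_activity = [0.9, 0.1, 0.4, 0.7, 0.2]
zebrafish_activity = [0.8, 0.2, 0.3, 0.9, 0.1]
rho = spearman(es_cell_activity, zebrafish_activity)
```

A rank-based statistic is used in this sketch because single-concentration screening data
support ordering chemicals by activity more readily than comparing magnitudes across
platforms.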

-------
Adaptive Responses to Prochloraz Exposure That Alter Dose-Response and Time-
Course Behaviors

Rory Conolly1, Miyuki Breen2, Dan Villeneuve3, Gary Ankley3

1 National Center for Computational Toxicology, USEPA, RTP, NC
2 North Carolina State University, Graduate Program with the USEPA, RTP, NC
3 National Health and Environmental Effects Research Laboratory, USEPA, RTP, NC
Dose response and time-course (DRTC) are, along with exposure, the major determinants of
health risk.  Adaptive changes within exposed organisms in response to environmental stress
are common, and alter DRTC behaviors to minimize the effects caused by stressors. In this
project, we are analyzing how several feedback regulatory loops in fathead minnows
compensate for endocrine stress due to the fungicide Prochloraz. Affected endpoints include
estradiol (E2) levels, ovarian aromatase mRNA, and vitellogenin levels.  The data show, for
example, a significant decrease in E2 levels followed by a return to baseline during prolonged
exposure to Prochloraz. Characterization of the mechanisms that underlie these kinds of
adaptive changes will build toward a refined description of DRTC behavior for Prochloraz,
thereby helping us to better understand when exposures pose health risks and when they do
not.  In addition, this project will help us to evaluate the possibility that activation of stress
response pathways is itself a  useful regulatory endpoint,  i.e., the possibility that it is
appropriate to regulate exposures such that stress response pathways are not overwhelmed
and without explicit consideration of downstream,  more apical endpoints.

This work was reviewed by EPA and approved for publication but does not necessarily reflect
official agency policy.
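
The dip in E2 followed by a return to baseline during continued exposure is the kind of
behavior a compensating feedback loop can produce. As a purely illustrative sketch (not
the authors' model), the toy system below assumes integral-type feedback in which
aromatase capacity keeps rising while E2 is below its set point. The state variables,
rate constants, and the 50% inhibition level are all hypothetical, in arbitrary units.

```python
def simulate_adaptation(inhibition=0.5, k_e=1.0, k_a=0.1, t_end=200.0, dt=0.01):
    """Euler integration of a hypothetical two-state feedback loop.

    e2        : estradiol level, scaled so the unexposed baseline is 1.0
    aromatase : synthesis capacity, scaled so the unexposed baseline is 1.0
    inhibition: fraction of aromatase activity blocked by the chemical
    """
    e2, aromatase = 1.0, 1.0
    times, e2_trace = [0.0], [e2]
    n_steps = int(round(t_end / dt))
    for step in range(1, n_steps + 1):
        # E2 relaxes toward the currently available (inhibited) synthesis capacity.
        d_e2 = k_e * (aromatase * (1.0 - inhibition) - e2)
        # Assumed integral feedback: capacity rises while E2 is below baseline.
        d_aromatase = k_a * (1.0 - e2)
        e2 += dt * d_e2
        aromatase += dt * d_aromatase
        times.append(step * dt)
        e2_trace.append(e2)
    return times, e2_trace, aromatase


times, e2_trace, final_aromatase = simulate_adaptation()
lowest_e2 = min(e2_trace)   # depth of the transient decrease
final_e2 = e2_trace[-1]     # level after prolonged exposure
```

In this toy system E2 falls quickly after exposure begins and then recovers as the
up-regulated capacity offsets the inhibition, so the apparent dose-response depends on
when the endpoint is measured.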

-------
Applying Uncertainty Analysis to a Risk Assessment for the Pesticide
Permethrin
R. Woodrow Setzer1, Jimena Davis1, Rogelio Tornero-Velez2, Jianping Xue2,
Valerie Zartarian2
1 National Center for Computational Toxicology, 2 National Exposure Research
Laboratory, ORD, US EPA, RTP, NC

We discuss the application of methods of uncertainty analysis from our previous
poster to the problem of a risk assessment for exposure to the food-use  pesticide
permethrin resulting from residential pesticide crack and crevice application.
Exposures are simulated by the SHEDS (Stochastic Human Exposure and Dose
Simulation) model, which is loosely coupled to a PBPK model for human internal
dose estimation.  This presentation discusses approaches for quantifying the
uncertainties at several points in the coupled  model:  parameter estimation in the
PBPK model; extrapolation to a human  model;  exposure parameters in  SHEDS;
and evaluation of overall uncertainty of the predictions of the coupled model and
application of sensitivity analysis to identify the most important contributors to
that uncertainty. Uncertainties in each component model are characterized as
probability distributions on the parameters of that model. In the case of the
PBPK model, the uncertainty distribution is derived from prior information about
parameter values as well as in vitro data specific to permethrin pharmacokinetics,
and is computed using Bayesian statistical methods. Extrapolating the PBPK
model from rodents to humans involves changing physiological parameters and
extrapolating from rodent to  human chemical-specific parameter values.
Uncertainties here are estimated both from limited human data and from
experience in using similar extrapolation methods for other chemicals.
Uncertainties in the SHEDS  model are derived from an understanding of the
uncertainty of the component distributions describing pesticide use and
parameters governing pesticide fate and human behavior relating to exposure.
The final output of the coupled model is a probability distribution of exposures
that characterizes the distribution of internal dose for a defined population. We
use Monte-Carlo methods to propagate the uncertainty in each of the
components to make confidence bands around this probability distribution.
Finally, global sensitivity analysis allows us to identify individual components of
uncertainty which contribute most to the overall uncertainty in the coupled
model's predictions. This work was reviewed by EPA and approved for
publication but does not necessarily reflect official Agency policy.
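
The propagation step described above can be sketched as a two-dimensional Monte Carlo
simulation: an outer loop draws uncertain parameters, an inner loop draws person-to-person
variability, and the spread of each population percentile across outer draws gives a
confidence band. This is a minimal sketch under stated assumptions, not SHEDS or the
permethrin PBPK model. The one-line dose model (exposure divided by clearance), the
lognormal distributions, and every parameter value are hypothetical.

```python
import math
import random
import statistics


def percentile(sorted_values, p):
    """Element at the rounded position p * (n - 1) of an already sorted list."""
    index = round(p * (len(sorted_values) - 1))
    return sorted_values[index]


def population_doses(rng, gm_exposure, clearance, n_people, sd_log=0.8):
    """Inner loop: variability in exposure across people for fixed parameters."""
    return sorted(
        rng.lognormvariate(math.log(gm_exposure), sd_log) / clearance
        for _ in range(n_people)
    )


def confidence_bands(n_outer=200, n_people=500, probs=(0.5, 0.95), seed=1):
    """Return {p: (lower, median, upper)} for each population percentile p."""
    rng = random.Random(seed)
    per_draw = []
    for _ in range(n_outer):
        # Outer loop: one draw of the uncertain parameters (hypothetical priors).
        gm_exposure = rng.lognormvariate(math.log(1.0), 0.3)
        clearance = rng.lognormvariate(math.log(2.0), 0.2)
        doses = population_doses(rng, gm_exposure, clearance, n_people)
        per_draw.append([percentile(doses, p) for p in probs])
    bands = {}
    for j, p in enumerate(probs):
        values = sorted(draw[j] for draw in per_draw)
        bands[p] = (
            percentile(values, 0.025),
            statistics.median(values),
            percentile(values, 0.975),
        )
    return bands


bands = confidence_bands()
```

Keeping the two loops separate is what lets the result distinguish uncertainty (the width
of each band) from variability (the difference between the 50th and 95th percentiles).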

-------
Methodology for Uncertainty Analysis of Dynamic Computational Toxicology
Models

Jimena Davis, John Wambaugh, Ramon I. Garcia, R. Woodrow Setzer, National Center
for Computational Toxicology, USEPA, RTP, NC

The task of quantifying the uncertainty in both parameter estimates and model
predictions has become more important with the increased use of dynamic
computational toxicology models by the EPA. Dynamic toxicological models include
physiologically-based pharmacokinetic (PBPK) models, closely-related (and often
coupled) pharmacodynamic models or biologically-based dose-response (BBDR)
models, and models such as those for virtual tissues. Given a set of values for
biological parameters describing the subject and chemical parameters describing the
compound, biological models can make predictions that both test our
understanding of how the data came to be (interpolation) and indicate what might occur under
different conditions (extrapolation).  Careful consideration must be given to determining
the value of model parameters and uncertainty about those values as well as the
selection of one model over another given the overall uncertainty about different
models. Quantitative uncertainty analyses are necessary for fully vetting models for
applications such as risk assessments.  Along with uncertainty, in these types of
systems variability must also be accounted for on various levels, such as variation in
model parameters across individuals in a  population as well as variation in experimental
data. Thus, the analysis of computational toxicology models requires valid statistical
methodologies that are capable of handling both uncertainty and variability accurately.
Several methodological issues are  generic to these dynamic toxicological models.
Often, values must be determined for parameters in the absence of hard chemical-
specific data, contributing parameter uncertainty.  We discuss  approaches for
developing prior distributions quantifying the uncertainty in chemical-specific parameters
based on comparisons of measured values with predicted values from computational or
in vitro methods. Such informative priors  are beneficial both in the context of Bayesian
estimation, and for assessing uncertainty  of predictions from dynamic models for which
in vivo data are entirely lacking.  One application of interest to the Agency in which
informative priors are particularly useful is the development of better
quantitative approaches for cumulative risk assessments of linked exposure-dose-effects
models.  A major part of this effort involves formulating PBPK models, which
include well known physiological parameters as well as unknown physicochemical
parameters that describe the uptake and disposition of chemicals or toxins through the
body. Using PBPK models as a motivating example, we discuss some of the
advantages and drawbacks associated with the use of hierarchical Bayesian analysis in
model calibration, uncertainty analysis, and model evaluation.  Conventional
computational methods for estimating parameters and evaluating their uncertainty,
which were developed for substantially simpler non-linear models, require lengthy
computations (e.g., weeks or even  months) when applied to dynamic models. We
discuss some attempts to standardize this analysis, address the  issue of efficient
computational time for deterministic (e.g.,  PBPK) models, and  deal with uncertainty in
stochastic models (e.g., agent-based virtual tissue models). Finally, we discuss
evaluating how well models describe data and approaches to evaluating model
uncertainty.  This work was reviewed by EPA and approved for publication but does not
necessarily reflect official Agency policy.
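
The role of an informative prior in parameter estimation can be sketched with a single
parameter. The example below is an assumption-laden illustration, not the hierarchical
Bayesian analysis the abstract discusses: it uses a one-compartment elimination model in
place of a PBPK model and a grid approximation in place of MCMC. The dose, volume, prior
settings, and the synthetic "observations" are all hypothetical.

```python
import math


def predicted_concentration(t, k_elim, dose=10.0, volume=5.0):
    """One-compartment bolus model: C(t) = (dose / volume) * exp(-k_elim * t)."""
    return dose / volume * math.exp(-k_elim * t)


def grid_posterior(times, observed, prior_median, prior_gsd,
                   sigma_log=0.2, n_grid=2001):
    """Posterior for k_elim on a grid that is evenly spaced in log(k).

    Prior:      log(k) ~ Normal(log(prior_median), log(prior_gsd))
    Likelihood: log(observed) ~ Normal(log(predicted), sigma_log)
    Returns (k_values, weights) with weights summing to 1.
    """
    log_lo, log_hi = math.log(1e-3), math.log(10.0)
    step = (log_hi - log_lo) / (n_grid - 1)
    log_k = [log_lo + i * step for i in range(n_grid)]
    mu, s = math.log(prior_median), math.log(prior_gsd)
    log_post = []
    for lk in log_k:
        k = math.exp(lk)
        lp = -0.5 * ((lk - mu) / s) ** 2
        for t, y in zip(times, observed):
            resid = math.log(y) - math.log(predicted_concentration(t, k))
            lp += -0.5 * (resid / sigma_log) ** 2
        log_post.append(lp)
    peak = max(log_post)  # subtract the peak before exponentiating for stability
    unnorm = [math.exp(lp - peak) for lp in log_post]
    total = sum(unnorm)
    return [math.exp(lk) for lk in log_k], [u / total for u in unnorm]


def summarize(k_values, weights):
    """Posterior geometric mean of k and posterior standard deviation of log(k)."""
    mean_log = sum(w * math.log(k) for k, w in zip(k_values, weights))
    var_log = sum(w * (math.log(k) - mean_log) ** 2
                  for k, w in zip(k_values, weights))
    return math.exp(mean_log), math.sqrt(var_log)


# Synthetic data: true k = 0.3 with fixed multiplicative errors (hypothetical).
times = [1.0, 2.0, 4.0, 8.0]
errors = [1.10, 0.90, 1.05, 0.95]
observed = [predicted_concentration(t, 0.3) * e for t, e in zip(times, errors)]
k_values, weights = grid_posterior(times, observed, prior_median=0.5, prior_gsd=2.0)
post_gm, post_sd_log = summarize(k_values, weights)
```

A grid is practical here only because there is one parameter. With the many parameters of
a real PBPK model the grid grows exponentially, which is one reason sampling methods and
their computational cost matter for dynamic models.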

-------
                     CTRP Scientist Biosketches
Maria Bondesson-Bolin
Elaine Cohen Hubal
Timothy W. Collette
Rory Conolly
J. Christopher Corton
Jimena L. Davis
Sigmund J. Degitz Jr.
David J. Dix
Peter Paul Egeghy
Jane E. Gallagher
Keith A. Houck
Edward "Sid" Hunter, III
John Jack
William Jack Jones
Richard Judson
Robert J. Kavlock
Thomas B. Knudsen
Stephen Blair Little
Matthew T. Martin
Holly M. Mortensen
William R. Mundy
James Rabinowitz
David M. Reif
Ann M. Richard
Ivan Rusyn
James M. Samet
Jorge W. Santo Domingo
Deborah  Segal
R. Woodrow Setzer
Imran Shah
Amar V. Singh
Cecilia Tan
Raymond R. Tice
John F. Wambaugh
William J. Welsh
Fred A. Wright

-------
                                         BIOGRAPHICAL SKETCH

NAME
Maria Bondesson-Bolin

POSITION TITLE
Research Assistant Professor
  EDUCATION/TRAINING
INSTITUTION AND LOCATION                          DEGREE            YEAR(s)    FIELD OF STUDY
                                                  (if applicable)
Uppsala University, Sweden                        BS                1988       Microbiology
Karolinska Institutet, Sweden                     PhD               1995       Cell & Molecular Biology
Ludwig Institute, Karolinska Institutet, Sweden   Postdoc           1995-96    Oncology & Pathology
Karolinska Institutet, Sweden                     Postdoc           1996       Cell & Molecular Biology
 A.  Positions and Honors.
Positions

1998-02          Research Assistant Professor, Department of Cell and Molecular Biology, Karolinska Institutet
2002             Senior Researcher, Department of Cell and Molecular Biology, Karolinska Institutet
2003             Research Secretary, Swedish Research Council/Medicine
2003-2009        Project Manager, CASCADE, Dept. of Biosciences and Nutrition, Karolinska Institutet
2009 - Present   Research Assistant Professor, University of Houston, Houston, TX
Honors

1998             Swedish Natural Science Research Council/Swedish Research Council, Research Assistant
                 Professorship
2002             Dissertation committee for Maria Lindebro; Mechanisms of regulation of dioxin receptor function
 B.  Peer-reviewed publications

    1.  Svensson C., Bondesson M., Nyberg E.,  Linder S.,  Jones N. and Akusjarvi G. Independent Transformation
       Activity by Adenovirus-5 E1A-Conserved Regions 1 or 2 Mutants. Virology 182:553-561, 1991

    2.  Linder S., Popowics P., Svensson S., Marshall H., Bondesson M. and Akusjarvi G. Enhanced Invasive Properties
       of Rat Embryo Fibroblasts Transformed by Adenovirus E1A Mutants with Deletions in the Carboxy-terminal Exon.
       Oncogene 7:439-443, 1992

    3.  Bondesson M.,  Svensson C., Linder S.  and Akusjarvi G. The Carboxy-terminal Exon of the Adenovirus  E1A
       protein is Required for E4F-dependent Transcription Activation. The EMBO Journal 11:3347-3354, 1992

    4.  Bondesson M., Mannervik M., Akusjarvi G. and Svensson C. An Adenovirus E1A Transcriptional Repressor
       Domain Functions as an Activator when Tethered to a Promoter. Nucleic Acids Research 22:3053-3060, 1994

    5.  Bondesson M. Transcriptional regulation by the adenovirus E1A proteins. Thesis 1995

    6.  Bondesson M., Ohman K., Mannervik M., Fan S. and Akusjarvi G. Adenovirus E4 Open Reading Frame 4 Protein
       Autoregulates E4 Transcription by Inhibiting E1A Transactivation of the E4 Promoter. Journal of Virology 70:3844-
       3851,1996

    7.  Wahlström G., Vennström B. and Bondesson M. The adenovirus E1A oncoprotein is a potent coactivator for
       thyroid hormone receptors. Molecular Endocrinology 1999 13:1119-1129

    8.  Castro D., Arvidsson M., Bondesson M. and Perlmann T. Activity of the Nurr1 carboxyl-terminal domain depends
       on cell type and integrity of the activation function 2. The Journal of Biological Chemistry 1999 274:37483-37490

    9.  Ichimura K.,  Bondesson  M., Goike H., Schmidt  E.,  Moshref A. and Collins VP.  Deregulation of the  p14
       ARF/MDM2/p53  pathway is  a  prerequisite for human astrocytic gliomas with G1/S transition  control gene
       abnormalities. Cancer Research 2000 60, 417-424




-------
    10. Nygard M., Wahlstrom G.M., Tokumoto Y.M.,  Gustafsson M.V. and Bondesson M. (2003) Hormone-dependent
       repression of the E2F-1 gene by thyroid hormone receptors. Molecular Endocrinology  17, 79-92

    11. Gustafsson MV, Zheng X,  Pereira  T, Gradin K, Jin S,  Lundkvist L,  Ruas JL, Poellinger L,  Lendahl  U and
       Bondesson M (2005). Hypoxia requires notch signaling to maintain the undifferentiated cell state. Developmental
       Cell 9, 617-628

    12. Nygard M, Becker N, Demeneix B, Pettersson K and Bondesson M  (2006). Thyroid hormone-mediated negative
       transcriptional regulation of necdin expression. Journal of Molecular Endocrinology 36(3), 517-30

    13. Turowska O, Nauman A, Pietrzak M, Popławski P, Master A, Nygard M, Bondesson M, Tanski Z and
       Puzianowska-Kuznicka M  (2007) Overexpression of E2F1 in clear cell Renal Cell Carcinoma: a  potential  impact
       of erroneous regulation by thyroid hormone nuclear receptors, Thyroid, 11:1039-48

    14. Bondesson M, Jonsson J, Pongratz I, Olea N, Cravedi J-P, Zalko D, Hakansson H, Halldin K, Di Lorenzo D, Behl
       C, Manthey D, Balaguer P,  Demeneix B,  Fini  JB, Laudet V, Gustafsson J-A, (2009) A CASCADE of Effects of
       Bisphenol A, In press Reproductive Toxicology, doi:10.1016/j.reprotox.2009.06.014

Other scientific publications

    15. Demeneix B, Gustafsson JA, Bondesson M et  al. (2005) Vote REACH for the safer management of chemicals in
       EU. Financial Times Nov7

    16. Bondesson M and Gustafsson JA (2006) Chemical Contaminants in food: The CASCADE Network of Excellence.
       Food Science and Technology 20; 34-36


 C. Research Support.

Ongoing Research Support

Agency: Swedish FORMAS                             Project Period: 11/01/2008 - 12/31/2009
Title: In vitro methods for endocrine disruption
Role on Project: PI                                     Total grant: $143,000

Agency: Swedish Research Council                      Project Period: 01/01/2007 - 12/31/2009
Title: Effects of environmental pollutants on nerve cell differentiation
Role on Project: PI                                     Total grant: $ 108,000

Agency: EU / I.D.# LSHM-CT-2005-018652               Project Period: 03/01/2006 - 02/31/2011
Title: CRESCENDO "Nuclear receptors during development and aging"
Coordinator: Barbara Demeneix, Ph.D, Professor (France)
Co-P.I.: Jan-Ake Gustafsson, M.D., Ph.D.,  Professor
Role on Project: Work Package Manager                 Total grant: $14.000.000 (for 20 research groups)

Agency: EU / I.D.# FOOD-CT-2004-506319               Project Period: 02/01/2004 - 02/28/2010
Title: CASCADE "Chemicals as contaminants in the food chain: An NOE for research, risk assessment and education"
Coordinator and P.I.: Jan-Ake Gustafsson, M.D., Ph.D., Professor
Role on Project: Project Manager (to  03/31/2009)           Total grant: $ 18.000.000 (for 25 research groups)

Agency: US-EPA                                      Project Period: 11/01/2009-10/31/2012
Title: The Texas-Indiana Virtual STAR Center; Data-Generating in vitro and in silico Models of Developmental
Toxicity in Embryonic Stem Cells and Zebrafish
P.I.: Jan-Ake Gustafsson M.D., Ph.D., Professor
Role on Project: Project Manager                        Total grant: $ 3.190.993 (for 3 research groups)


-------
        Principal Investigator/Program Director (Last, First, Middle):   Cohen Hubal, Elaine
                                     BIOGRAPHICAL SKETCH
          Provide the following information for the key personnel and other significant contributors in the order listed on Form Page 2.
                          Follow this format for each person. DO NOT EXCEED FOUR PAGES.
  NAME
  Elaine Cohen Hubal
  eRA COMMONS USER NAME
   POSITION TITLE
   Chemical Engineer
  EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION                               DEGREE            YEAR(s)      FIELD OF STUDY
                                                       (if applicable)
Massachusetts Institute of Technology, Cambridge, MA   S.B.              1984-1988    Chemical Engineer
North Carolina State University, Raleigh, NC           M.S.              1990-1992    Chemical Engineer
North Carolina State University, Raleigh, NC           Ph.D.             1992-1996    Chemical Engineer
CIIT                                                   Predoc Fellow     1992-1996

A. POSITIONS and HONORS

Research and Professional Experience:
March 2004 - present Research Scientist.  NCCT, US EPA
May 2004 - March 2004 Research Scientist.  One-year detail to ETD.  NHEERL, US EPA
May 2003 - Apr 2004 Acting Associate Director for Human Exposure Modeling. HEASD, NERL
Oct 2002 - Apr 2003 Acting Associate Director for Human Exposure Measurements. HEASD, NERL
1997-2004 Chemical Engineer.  HEASD, NERL
1996-1997 Research Chemical Engineer.  RTI, RTP, NC
1988-1990 Chemical Engineer.  Camp Dresser & McKee, Boston, MA

Selected Awards and Honors:
•   USEPA 2008 Scientific and Technological Achievement Awards (STAA), for research that improves our
    understanding of children's exposure to pesticides in the residential environment;
•   USEPA 2007 Children's Environmental Health Excellence Award
•   USEPA Bronze Medal for Commendable Service, for developing a risk assessment resource, a Framework
    for Assessing Health Risks of Environmental Exposures to Children 2006
•   USEPA Gold Medal for Exceptional Service, in recognition of advancing the scientific basis for assessing
    and monitoring children's environmental exposures through the development of agency-wide risk
    assessment guidance 2006

Invited Lectures/Symposia (selected):
Cohen Hubal, EA. Does exposure science imitate art? Plenary. International Council of Chemical
    Associations Long Range Research  Initiative (ICCA-LRI) workshop: Connecting Innovations in Biological,
    Exposure and Risk Sciences: Better Information for Better Decisions.  Charleston, South Carolina. June
    2009
Cohen Hubal, EA. Biologically relevant exposure science for toxicity testing. Presented to:
    The Strategic Science Team of the ACC Long-Range Research Initiative
    Washington, D.C.  May 13, 2009
Cohen Hubal, EA, T Pastoor.  Improving Exposure Science and  Dose Metrics for Toxicity Testing, Screening,
    Prioritizing, and Risk Assessment. ILSI Health and Environmental Sciences Institute (HESI) Annual
    Emerging Issues Forum, Tucson, AZ, Jan 20, 2009.

-------

Cohen Hubal, E, D Reif, S Edwards, L Neas, E Hudgens, J Gallagher. Mechanistic Indicators of Childhood
   Asthma (MICA) Study.  11th SAC Seminar:  New Trends in Chemical Toxicology. Moscow, Russian
   Federation. September 22-25, 2008.
Cohen Hubal, EA.  Considering Susceptibility: Translation for Risk Management.  ICCA-LRI workshop:
   Twenty-first century approaches to toxicity testing, biomonitoring, and risk assessment. Amsterdam, The
   Netherlands. June 16-17, 2008.
Cohen Hubal, EA. Computational Toxicology. Workshop on Toxicogenomics in Risk Assessment. Toxicology
   And Risk Assessment Conference. Cincinnati, OH  April 14-15, 2008.
Cohen Hubal, EA.  Invited panelist. Bridging Human and Ecological Exposure Sciences: A Window of
   Opportunity.  ISEA Annual Meeting, Durham, NC. October 14-18, 2007.
Cohen Hubal, EA.  Exposure Assessment. Fundamentals of Human Health Risk Assessment with a Case
   Study Approach.  Continuing Education Course AMOS. Society of Toxicology Continuing Education
   Course, Charlotte, North Carolina, March 25, 2007.
Cohen Hubal, EA. Application of Nanotechnology-Enabled Sensor Technologies for Monitoring Human
   Exposure to  Environmental Contaminants. EPA/ORCAS Nanotechnology Applications in Environmental
   Health Workshop. RTP, NC, April 20, 2006.
Cohen Hubal, EA.  Application of Micro- and Nanoscale Sensor Technologies for Monitoring Human Exposure
   to Environmental  Contaminants.  ILSI HESI Emerging Issues Forum. Puerto Rico, January 17, 2006.
Cohen Hubal, EA.  Framework to Use Biomonitoring Data to Inform Exposure Assessment in Children.
   WHO/IPCS Workshop on Advances in the Use of Biomarkers in Children, Buenos Aires, Argentina,
   November 17-18, 2005.

Leadership Provided to Scientific Community (selected):
• Chair Exposure Science for Screening Prioritizing and Toxicity Testing Community of Practice (ExpoCoP).
   June 2008-present.
•  Editorial Board Journal of Exposure Science and Environmental Epidemiology.  January 2007 - present.
•  Co-chair  International Society of Exposure Science (formerly ISEA) 2009 Annual Meeting: Transforming
   Exposure Science for the 21st Century, Nov 1-6, Minneapolis, MN.
•  Co-chair  International Council of Chemical Associations Long Range Research Initiative (ICCA-LRI)
   workshop: Connecting Innovations in Biological, Exposure and Risk Sciences: Better Information for Better
   Decisions. Charleston, South Carolina. June 2009
•  Member National Children's Study Data Access Committee.  2008- present
•  World Health Organization Temporary Adviser to plan the IPCS international workshop on "Identifying
   Important Life Stages for Monitoring and Assessing Risks from Exposures to Environmental
   Contaminants." 2009-present
•  Program  planning committee for the International Society of Exposure Analysis 2007 Annual Meeting.
   Chair symposium: Computational Toxicology. Durham, NC. October 14-18, 2007.
•  Member ILSI Health and Environmental Sciences Institute, Sensitive Subpopulations Working Group,
   2006-present.
•  Member ILSI Health and Environmental Sciences Institute, Biomonitoring Working Group 2004-present
•  Co-organized/co-chaired with Richard Judson, Session title "Genetic Variation, Gene-Environment
   Interactions and Environmental Risk Assessment" for International Science Forum on Computational
   Toxicology, May 21-23, 2007, RTP, NC.
•  Cohen Hubal, EA. Exposure Science for Computational Toxicology. US EPA NCCT Course on
   Computational Toxicology. Research Triangle Park, NC.   March 4, 2008.
• NCCT Cosponsor  with NCER. Program planning committee for US EPA Workshop on Research Needs for
   Community-Based Risk Assessment. Session organizer/chair: Data needs and measurement methods for
   CBRA. Research Triangle Park, NC. October 18-19, 2007.
•  Member US EPA  Risk Assessment Forum. June 2004 - 2009.

-------

B. SELECTED PUBLICATIONS
Cohen Hubal, EA. Biologically-Relevant Exposure Science for 21st Century Toxicity Testing
   Toxicol. Sci., Advance Access published on July 14, 2009; doi:10.1093/toxsci/kfp159
Sheldon, LS, and EA Cohen Hubal. Exposure as Part of a Systems Approach for Assessing Risk. Environ
   Health Perspect doi:10.1289/ehp.0800407 [Online 8 April 2009]
Cohen Hubal EA, Richard AM, Shah I, Gallagher J, Kavlock R, Blancato J, Edwards S. (2009) Exposure
   Science and the US EPA National Center for Computational Toxicology. Journal Expo Sci Environ
   Epidemiol Available online: Nov 5 2008 [Epub ahead of print]
Xu, Y., EA Cohen Hubal,  PA Clausen, JC Little.  Predicting Residential Exposure to Phthalate Plasticizer
   Emitted from Vinyl Flooring-A Mechanistic Analysis.  Environ. Sci. Technol., DOI: 10.1021/es801354f
   available online February 19, 2009.
Heidenfelder, BL, DM Reif, EA Cohen Hubal, EE Hudgens, LA Bramble,  JG Wagner, JR Harkema, M
   Morishita, GJ Keeler, SW Edwards, JE Gallagher. (2009) Comparative microarray analysis and pulmonary
   morphometric changes in brown Norway rats exposed to ovalbumin and/or concentrated air particulates.
   Toxicol Sci. 108(1), 207-221.
Sanchez, Y, K Deener, E Cohen Hubal, C  Knowlton, D Reif, D Segal. Research needs for community based
   risk assessment.  J Expo Sci Environ Epidem. 2009 Feb 25 [Epub ahead of print]
Cohen Hubal, EA, MG Nishioka, WA Ivancic, M Morara, P Egeghy. (2008) Comparing surface residue transfer
   efficiencies to hands using polar and non-polar fluorescent tracers. Environmental Science & Technology
   42 (3), 934-939.
Kavlock, RJ, G Ankley, J  Blancato, M Breen, R Conolly, D Dix, K Houck, E Cohen Hubal, R Judson, J
   Rabinowitz, A Richard, RW Setzer, I Shah, D Villeneuve, and E Weber.  (2008) Computational toxicology:
   A state of the science mini review.  Toxicological Sciences 103(1), 14-27.
Cohen Hubal, EA, J Moya, SG Selevan. (2008) A lifestage approach to assessing children's exposure.
   Developmental and Reproductive Toxicology. Birth Defects Res (Part B) 83:522-529.
Gallagher J., Hudgens E., Heidenfelder B.N., Reif D.M., Neas L., Williams A., Harkema J., Hester S., Edwards
   S.E., Cohen Hubal EA.  Mechanistic indicators of children's asthma study (MICA): A systems biology
   approach for the integration of multifactorial environmental health data. Submitted.
Xu, Y., EA Cohen Hubal,  PA Clausen, JC Little.  Predicting Residential Exposure to Phthalate Plasticizer
   Emitted from Vinyl Flooring - Sensitivity, Uncertainty, and Implications for Biomonitoring. Submitted.
Gallagher, JE, EA Cohen Hubal, SW Edwards. Invited chapter "Biomarkers of Environmental Exposure" in
   "Biomarkers of Toxicity: A New Era in Medicine." Editors Vishal S. Vaidya and Joseph V. Bonventre.
   Publisher John Wiley and Sons, Inc. Submitted.
RN Hines, D Sargent, H Autrup, LS Birnbaum, RL Brent, NG Doerrer, EA Cohen Hubal, DR Juberg, C Laurent,
   R Luebke, K Olejniczak, CJ Portier, W Slikker. Approaches for Assessing Risks to Sensitive Populations:
   Lessons Learned from Evaluating Risks in the Pediatric Population. Submitted.
Reif, DM, JE Gallagher, BL Heidenfelder, EE Hudgens, W Jones, CL Williams-DeVane, LM Neas, EA Cohen
   Hubal, SW Edwards.  Elucidating Asthma  Phenotypes via Integrated Analysis of Blood Gene Expression
   Data with Demographic and Clinical Information. In Preparation.
Reif, DM, CL Williams-DeVane, EA Cohen Hubal, W Jones, EE Hudgens, BL Heidenfelder, LM Neas, JE
   Gallagher, SW Edwards. Systems Modeling of Gene Expression, Demographic and Clinical Data to
   Determine Disease Endotypes. In preparation.
Firestone M, J Moya, E Cohen Hubal, V Zartarian. (2007) Identifying childhood age groups for exposure assessments
   and monitoring. Risk Analysis 27(3): 701-714.
Ryan, PB, TA Burke, EA Cohen Hubal, JJ  Cura, TE McKone.  (2007) Using Biomarkers to inform cumulative
   risk assessment.  Environ Health Perspect 115:833-84
deFur, PL, GW Evans, EA Cohen Hubal, AD  Kyle, RA Morello-Frosch, D Williams. (2007) Vulnerability as a
   function of individual and group resources in cumulative risk assessment. Environ Health Perspect
   115:817-824.
Cohen Hubal, EA, P Egeghy, K Leovic, G Akland.  (2006)  Measuring potential dermal transfer of a pesticide  to
   children in a daycare center. Environ Health Perspect 114(2):264-269.

-------

Cohen Hubal, EA. (2006) Uso de los Datos de Biomonitoreo para Informar sobre la Evaluación de la
   Exposición Infantil [Using Biomonitoring Data to Inform Exposure Assessment in Children] Acta
   Toxicologica Argentina [Journal of the Argentine Society of Toxicology]. 14(suplemento)17-19.
Barone S Jr, RC Brown, S Euling, E Cohen Hubal, CA Kimmel, S Makris, J Moya, SG Selevan, B Sonawane, T
   Thomas, C Thompson.  (2006) Visión General de la Evaluación del Riesgo en Salud Infantil Empleando un
   Enfoque por Etapas de Desarrollo [Overview of a Life Stage Approach to Children's Health Risk
   Assessment] Acta Toxicologica Argentina. 14(suplemento)7-10.
Birnbaum, LS, EA Cohen Hubal.  (2006) Polybrominated diphenyl ethers: a case study for application of
   biomonitoring data to characterize exposure.  Environ Health Perspect 114:1770-1775.

-------
        Principal Investigator/Program Director (Last, First, Middle):   Collette, Timothy W.
                                     BIOGRAPHICAL SKETCH
  NAME
  Timothy W. Collette
  eRA COMMONS USER NAME
   POSITION TITLE
   Research Chemist
  EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION             DEGREE            YEAR(s)    FIELD OF STUDY
                                     (if applicable)
Berry College, Rome, GA              B.S.              1981       Chemistry
University of Georgia, Athens, GA    Ph.D.             1985       Physical Chemistry
A. POSITIONS and HONORS

Research and Professional Experience:
1985 - Present       Research Chemist, Processes and Modeling Branch, Ecosystems Research Division,
                    NERL, U.S. EPA, Athens, GA
1981 - 1985         Teaching Assistant and Research Assistant, Department of Chemistry, University of
                    Georgia, Athens, GA.

Selected Awards and Honors : (from 1998 - 2009)
1999   STAA -  Methods and Monitoring Category
2000   STAA -  Methods and Monitoring Category
2000   Bronze Medal - Application of Raman Spectroscopy
2000   Athens Federal Executives Association Public Service Recognition Award
2001   Sigma Xi Outstanding  Research Paper Award - University of Georgia Chapter
2002   STAA -  Methods and Monitoring Category
2002   STAA -  Review Article Category
2003   STAA -  Methods and Monitoring Category
2004   Office of Pollution Prevention and Toxics Mission Award - PFOA Workgroup
2004   Bronze Medal - Computational Toxicology Design Team
2004   Bronze Medal - Promoting Strong Science in Agency Decisions
2004   Commendation from the Office of Acquisition Management
2004   STAA - Methods and Monitoring Category
2006   Gold Medal - Perchlorate Risk Characterization Team
2007   STAA - Fate and Transport Category
2008   STAA - Methods and Monitoring Category

Invited Lectures/Symposia (selected from about 35 during 1998 - 2009):
"The Value of GC-IR for Environmental Contaminant Identification," T.W. Collette, 49th Pittsburgh Conference,
New Orleans, LA, March (1998).
"Optimization of Modern Dispersive Raman Spectrometers for Molecular Speciation of Organics in Water,"
T.W. Collette and T.L. Wlliams, 26th Annual Meeting of the Federation of Analytical Chemistry and
Spectroscopy Societies, Vancouver CANADA, October (1999).
"Perchlorate in Fertilizers?: Analysis by Raman Spectroscopy" T.W.  Collette and T.L. Wlliams, 220th National
Meeting of The American Chemical Society, Washington, DC., August (2000).
"Speciation of Complex Organic Contaminants in Water with Raman  Spectroscopy" T.W. Collette and T.L.
Wlliams, 30th International Symposium on Environmental Analytical  Chemistry, Espoo, FINLAND, June
(2000).
"Determination of Perchlorate in Some Fertilizers And Plant Tissue by Raman Spectroscopy" T.W. Collette and
T.L. Williams, 222nd National Meeting of The American Chemical Society, Chicago IL, August (2001).
"The Role of Raman Spectroscopy in the Analytical Chemistry of Potable Water" T.W. Collette and T.L.
Williams, 222nd National Meeting of The American Chemical Society, Chicago IL, August (2001).
"Raman Analysis of Fertilizer and Plant Tissue Extracts for Perchlorate Contamination" T.W. Collette and T.L.
Williams, Eastern Analytical Symposium, Atlantic City NJ, October (2001).
"Principles of Infrared Spectroscopy and Spectral Interpretation" T.W. Collette, Joint Oil Analysis Program
Technology Showcase, U.S. Air Force, Pensacola, FL, April (2004).
"The Athens Lab's Role in EPA's Computational Toxicology Program", T.W. Collette. University of Georgia,
Interdisciplinary Toxicology  Program Retreat, Athens, GA, February (2006).
"Metabolomics in Small Fish Toxicology and Ecological Risk Assessments," T. Collette, D. Ekman, Q. Teng, D.
Villeneuve, and G. Ankley, 3rd International Conference of the Metabolomics Society, University of
Manchester, UNITED KINGDOM, June (2007).
"Metabolomics in Small Fish Toxicology and Other Environmental Applications", T. Collette, Fort Johnson
Marine Science Seminar Series, Charleston, South Carolina, October (2007).
"Assessing Exposures to  Regulated Chemicals using Metabolomics with Multiple Analytical Techniques", T.
Collette, D. Ekman, W. Garrison, M. Henderson, and Q. Teng,  2007 Eastern Analytical Symposium,
Somerset, New Jersey, November (2007).
"Fish Toxicogenomics: Moving into Monitoring and Regulation", NERC International Opportunity Workshop,
Pacific Environmental Science Centre, North Vancouver, CANADA, April (2008).

Assistance/Leadership Provided to the Scientific Community:
Society of Applied Spectroscopy, Chair,  National Tellers Committee: 2006 - present
Advisory Board: Comprehensive Analytical Chemistry:  1998 - 2007
Society of the Sigma Xi, University of Georgia Chapter, Admission Committee: 2002 - 2005
Editorial Advisory Board:  Vibrational Spectroscopy:  1999-2005
Coordinating Committee:  International Symposium on Environ. Anal. Chem.: 1996 - 2001
Special Issue Editor: Vibrational Spectroscopy: September 2000
Program Committee: SPIE  Symposium on  Environmental and Industrial Sensing: 1999
Exhibit Chairman:  11th International Conference on Fourier Transform Spectroscopy: 1997
Program Committee: International Symposium on Environ. Anal. Chem.:  1997,  1999, 2000
Associate Editor: Vibrational Spectroscopy:  1995 -1999

Assistance/Leadership Provided to the Agency:
EPA Cross-ORD Post-doc Recruitment Workgroup: 2005
EPA/ORD Computational Toxicology Implementation and Steering Committee: 2004 - present
EPA/OPPT Telomer Degradation Workgroup: 2004 - present
EPA/OPPT  PFOA Monitoring Workgroup: 2003 - present
EPA/ORD Safe Pesticides, Safe Products Long Term Goal 3 Workgroup: 2003 - present
EPA/ORD Computational Toxicology Research Initiative Design Team: 2002-2003

B. SELECTED PUBLICATIONS (selected from 1998 - 2008, total more than 60)
  "Identification of New Ozone Disinfection Byproducts in Drinking Water,"   S.D. Richardson, A.D. Thruston,
     Jr., T.V. Caughran, P.M. Chen, T.W. Collette, T.L. Floyd, K.M. Schenck, B.W. Lykins, Jr., G. Sun, and G.
     Majetich, Environ. Sci. Technol. 33, 3368-3377 (1999).
"Perchlorate Identification in Fertilizers," S. Susarla, T.W. Collette, A.W. Garrison, N.L. Wolfe, and S.C.
     McCutcheon, Environ. Sci. Technol. 33, 3469-3472 (1999).
"Identification of New Drinking Water Disinfection Byproducts Formed in the Presence of Bromide," S.D.
     Richardson,  A.D. Thruston, Jr., T.V. Caughran, P.M. Chen, T.W. Collette, T.L. Floyd, K.M. Schenck,
     B.W. Lykins, Jr., G.  Sun, and G.  Majetich, Environ. Sci. Technol. 33, 3378-3383 (1999).
"Optimization of Raman Spectroscopy for Speciation of Organics in Water," T.W. Collette, T.L. Williams, and
     J.C. D'Angelo, Appl. Spectros. 55, 750-766 (2001).
"Raman Spectroscopic Analysis of Fertilizers and Plant Tissue for Perchlorate" T.L Williams, R.B. Martin, and
     T.W. Collette, Appl. Spectros. 55, 967-988 (2001).
 "Analysis of Hydroponic Fertilizer Matrixes for Perchlorate: Comparison of Analytical Techniques " T.W.
     Collette, T.L Williams, E.T. Urbansky, M.L Magnuson, G.N. Hebert, and S.H. Strauss, Analyst, 128, 88-
     97 (2003).
"Degradation of Chlorpyrifos in Aqueous Chlorine Solutions: Pathways, Kinetics, and Modeling", S.E. Duirk and
     T.W. Collette, Environ. Sci. Technol., 40, 546 - 551 (2006).
"Monitoring the Speciation of Aqueous Free Chlorine from pH 1-12 with Raman Spectroscopy to Determine the
     Identity of the Potent Low-pH Oxidant",  D.P. Cherney, S.E. Duirk, J.C. Tarr, and T.W. Collette, Appl.
     Spectros., 60, 764 - 772 (2006).
 "Raman Spectroscopy-Based Metabolomics For Differentiating Exposures to Triazole Fungicides Using Rat
     Urine," D.P. Cherney, D.R. Ekman,  D.J. Dix, and T.W. Collette, Anal. Chem. 79, 7324-7332 (2007).

"NMR Analysis of Male Fathead Minnow Urinary Metabolites: A Potential Approach for Studying Impacts of
     Chemical Exposures", D.R. Ekman, Q.  Teng, K.M. Jensen, D. Martinovic, D.L. Villeneuve, G.T. Ankley,
     and T.W. Collette, Aquatic Tox. 85,  104 - 112 (2007).
"Chlorpyrifos Transformation by Aqueous Chlorine in the Presence of Bromide and Natural Organic Matter",
     S.E. Duirk, J.C. Tarr, and T.W. Collette, J. Agric. Food. Chem. 56, 1328 - 1335 (2008).
"Investigating Compensation and Recovery of Fathead Minnow (Pimephales promelas) Exposed to
     17α-Ethynylestradiol with Metabolite Profiling", D.R. Ekman, Q. Teng, D.L. Villeneuve, M.D. Kahl, K.M.
     Jensen, E.J. Durhan, G.T. Ankley, and T.W. Collette, Environ. Sci. Technol. 42, 4188-4195 (2008).
"International NMR-based Environmental Metabolomics Intercomparison Exercise", M.R. Viant, D.W. Bearden,
     J.G. Bundy, I.W. Burton, T.W. Collette,  D.R. Ekman, V. Ezernieks, T.K.  Karakach, C.Y. Lin, S. Rochfort,
     J.S. de Ropp, Q. Teng, R.S. Tjeerdema, J.A. Walter, and H. Wu, Environ. Sci. Technol. 43, 219-225
     (2009).
"Spectral Relative Standard Deviation: A Practical Benchmark in Metabolomics", H.M.  Parsons, D.R. Ekman,
     T.W. Collette, and M.R. Viant, Analyst.  134, 478-485 (2009).
"A Direct Cell Quenching Method for Cell-Culture Based Metabolomics", Q. Teng, W. Huang, T.W. Collette,
     D.R. Ekman, and C. Tan, Metabolomics. 5, 199-208 (2009).
"Profiling Lipid Metabolites Yields Unique Information on Sex- and Time-dependent Responses of Fathead
     Minnows (Pimephales promelas) Exposed to 17α-Ethynylestradiol", D.R. Ekman, Q. Teng, D.L.
     Villeneuve, M.D. Kahl, K.M. Jensen, E.J. Durhan, G.T. Ankley, and T.W. Collette, Metabolomics. 5,
     22-32 (2009).
"A Computational Model of the Hypothalamic-Pituitary-Gonadal Axis in Male Fathead Minnows Exposed to
     17α-Ethinylestradiol and 17β-Estradiol", K.H. Watanabe, Z. Li, K. Kroll, D.L. Villeneuve, N. Garcia-
     Reyero, E.F. Orlando, M.S. Sepulveda, T.W. Collette, D.R. Ekman, G.T. Ankley and N.D. Denslow,
     Toxicol. Sci., 109, 180-192 (2009).
"Endocrine-Disrupting Chemicals in Fish: Developing Exposure Indicators and Predictive Models of Effects
     based on Mechanism of Action",  G.T. Ankley, D.C. Bencic, M.S. Breen, T.W. Collette, et al. Aquatic Tox.,
     92, 168-178 (2009).
"Integrating Omic Technologies Into Aquatic  Ecological Risk Assessment And Environmental Monitoring:
     Hurdles, Achievements And Future Outlook", G. Van Aggelen, G. Ankley, W. Baldwin, D. Bearden, W.
     Benson, J. Chipman,  T. Collette, et al., Accepted - Environ. Health Perspect. (2009).

-------
        Principal Investigator/Program Director (Last, First, Middle): Conolly, Rory
                                     BIOGRAPHICAL SKETCH
NAME
Rory Conolly
eRA COMMONS USER NAME
Bongoeight
POSITION TITLE
Senior Research Biologist
  EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION                           DEGREE (if applicable)   YEAR(s)     FIELD OF STUDY
Harvard College, Cambridge, MA                     A.B.                     1972        Biology
Harvard School of Public Health                    Sc.D.                    1978        Physiology/Toxicology
Imperial Chemical Industries, Cheshire, England                             1978-1979   Biochemical Toxicology
A. POSITIONS and HONORS
Research and Professional Experience:
1979-1986      Assistant Professor of Toxicology, The University of Michigan
1986-1988      Research Manager, NSI Tech. Services Corp., Dayton, OH
1988-1989      Deputy Director, NSI Technology Services Corp., Dayton, OH
1989-1995      Scientist, Chemical Industry Institute of Toxicology, RTP, NC
1995-2004      Senior Scientist, CIIT Centers for Health Research, RTP, NC
2001-2004      Director, Center for Computational Biology & Extrapolation Modeling, CIIT Centers for
               Health Research, RTP, NC
2004-2005      Director, Center for Computational Systems Biology & Human Health Assessment, CIIT
               Centers for Health Research, RTP, NC
2004-2005      Senior Investigator, Computational Systems Biology & Human Health Assessment, CIIT
               Centers for Health Research, RTP, NC
2005-present   Senior Research Biologist, National Center for Computational Toxicology, ORD, U.S. EPA,
               RTP, NC
Professional Societies and Affiliations:
1981-present   Member, Society of Toxicology
1985-present   Member, Society for Risk Analysis
1985-present   Member, American Association for the Advancement of Science
1997-1998      President, Risk Assessment Specialty Section, Society of Toxicology
1998-2005      Member, U.S. EPA FIFRA Science Review Board
1998-present   Adjunct Professor of Biomathematics, North Carolina State
               University, Raleigh, NC
2001-2001      President, Biological Modeling Specialty Section, Society of Toxicology
2002-present   Faculty Affiliate, Department of Environmental and Radiological Health Sciences, Colorado
               State University, Fort Collins, CO
2004-2005      Member, NAS Board on Environmental Studies and Toxicology
2005-present   Adjunct Professor of Environmental Science, Wright State University, Dayton, OH
2009-present   Councilor, Risk Assessment Specialty Section, Society of Toxicology
Honors and Awards:
1991   Outstanding Presentation in Risk Assessment, Annual Meeting of the Society of Toxicology
1999   Outstanding Presentation in Risk Assessment, Annual Meeting of the Society of Toxicology
2003   Outstanding Presentation in Risk Assessment, Annual Meeting of the Society of Toxicology (2 awards)
2004   Best Published Paper in Risk Assessment, Risk Assessment Specialty Section, Society of Toxicology
2005   Arnold J. Lehman Award for career achievement in risk assessment, Society of Toxicology
2009   EPA Bronze medal, Perchlorate team, For exceptional assistance to the Office of Water on an
       important, highly visible, and scientifically complex health assessment of perchlorate.
 PHS 398/2590 (Rev. 09/04, Reissued 4/2006)

Selected Invitations at National & International Symposia (last 3 years):
 "Computational Modeling to Evaluate Candidate Modes of Action for Arsenic", presented at the Workshop
   session "How Can Biologically-Based Modeling of Arsenic Kinetics and Dynamics Inform the Risk
   Assessment Process?"  46th Annual Meeting, Society of Toxicology, Charlotte, NC, USA, March 27, 2007.
"Data needs and parameter estimation for the 2-stage clonal  growth (CG) model", presented at the
   symposium, Wither Clonal Growth Modeling, Society for Risk Analysis Annual Meeting, Henry B. Gonzalez
   Convention Center, San Antonio, TX, December 12, 2007.
"Integration of Pharmacokinetic (PK) and Pharmacodynamic  (PD) Modeling of Arsenic to Inform the Risk
   Assessment Process", teleconference presentation to the Society of Toxicology Risk Assessment Specialty
   Section, March 12, 2008 (with Elaina Kenyon and Hisham el-Masri).
"Use of Mode of Action (MoA) Information in Biologically Based Modeling for Arsenic:
   Dose-Response (DR) and Time-Course (TC) Considerations", presented at the symposium, Incorporation
   of Mode of Action into Mechanistically-Based Quantitative Models, 47th Annual Meeting, Society of
   Toxicology, Seattle WA, March 20, 2008.
"Computational Modeling in Concert with Laboratory Studies:  Application to B cell", presented at the
   symposium, Dioxin Toxicity: Mechanisms, Models & Potential Health Risks, Michigan State University,
   Superfund Program Conference, MSU Kellogg Center Lincoln Room, East Lansing, MI, October 20-21,
   2008.
"Biologically based dose-response modeling. The potential for accurate description of the linkages in the
   applied dose-tissue dose-health effect continuum", presented at the Workshop, From Genes to Organs:
   Advancements in Modeling Biological Systems, Society of Toxicology 48th Annual Meeting, Baltimore, MD,
   March 15-19, 2009.
"Studying the Basic Biology of B Cell Differentiation to Understand the Effects of 2,3,7,8-tetrachlorodibenzo-
   p-dioxin (TCDD) on Immune Function", Superfund Basic Research Program Webinar, presented as part of
   the Spring/Summer 2009 edition of Risk eLearning "Computational Toxicology: New Approaches for the
   21st Century." Co-presented with Norbert E. Kaminski.  June 24, 2009.

Selected Expert Committees/Advisory Panels/Organizing Activities
"Computational Systems Biology: The Integration of Data Across Multiple Levels of Biological Organization to
   Understand How Perturbations of Normal Biology Become Adverse Health Effects", NRC Committee on
   Models in the Regulatory Process, Workshop on Emerging Issues for Regulatory Environmental Modeling,
   National Academy of Sciences, Washington, DC, December 2, 2005.
Organized the minisymposium "Computational systems biology and health risks of environmental chemicals"
   as part of the joint SIAM/SMB Conference on the Life Sciences, Brownstone Hotel, Raleigh, NC, July 31 -
   August 4, 2006.
Organized the workshop "Systems Biology and the Health Risks of Environmental Chemicals" as part of the
   Seventh International Conference  on Systems Biology, Yokohama, Japan, October 9-13, 2006.
Co-organized (with Richard Phillips, Exxon-Mobil) the symposium "Computational Toxicology- Industry-Wide
   Initiative Plus EPA/NEHS Initiative" as part of the 27th Annual Meeting of the American College of
   Toxicology, Renaissance Esmeralda Resort & Spa, Indian Wells, California, November 5-8, 2006.
Invited discussion leader, "Additivity to background as a potential source of linearity. Applicability of the general
   argument of Crump, Hoel, Langley and Peto, 1976.  JNCI  58:1537-41",  NRC Workshop on the Implications
   of Receptor-Mediated Events on Dose-Response, NRC, Washington, DC, May 3-4, 2007.
"(A biologist's perspective on) Estimating low-dose risk from high-dose data and its associated uncertainty",
   NRC Workshop on Quantitative Approaches to Characterizing Uncertainty in Human Cancer Risk
   Assessment Based on Bioassay Results, NRC, Washington, DC, June 5, 2007.
Invited participant, IOM Brainstorming Session "Approximating Dose-Response Relationships Using Limited
   Data",  Institute of Medicine, Washington, DC, June  18, 2007.
Co-chair, ILSI Health And Environmental Sciences Institute, Emerging Issue Subcommittee on Methodology
   for Intermittent/ Short-term Exposure to Carcinogens (MISTEC), September 2008 - present.
Invited participant, ILSI-HESI Risk Assessment Brainstorming Session, Washington, DC, August 24-25, 2009.
Co-organizer, SOT 2010 Annual Meeting Workshop "Does Background Disease Lead to Low-Dose Linearity?",
   with  Harvey Clewell and Lorenz Rhomberg.

Selected Assistance/Advisory Support to the Agency
"Computational Toxicology and New Directions in Risk Assessment", SOT Contemporary Concepts in
   Toxicology Workshop:  Probabilistic Risk Assessment (PRA):  Bridging Components Along the Exposure-
   Dose-Response Continuum, Washington,  DC, July 25 - 27, 2005.
Analysis of a PBPK/PD model for perchlorate  for the Office of Water, 2008-2009.

B. SELECTED PUBLICATIONS (selected from >100).
Clewell, H.J., III, Quinn, D.W., Andersen, M.E., and Conolly, R.B. (1995). An improved approximation to
   the exact solution of the two-stage clonal growth model of cancer. Risk. Anal.  15, 467-473.
Goldsworthy, T.L., Conolly, R.B., and Fransson-Steen, R. (1996). Apoptosis and cancer risk
   assessment. Mutat. Res.  365, 71-90.
Kramer, D.A. and Conolly, R.B. (1997). Computer simulation of clonal growth cancer models.  I.
   Parameter estimation using an iterative absolute bisection algorithm. Risk Anal. 17, 115-126.
R.B.  Conolly and M.E. Andersen. (1997). Hepatic foci in rats after diethylnitrosamine initiation and
   2,3,7,8-tetrachlorodibenzo-p-dioxin promotion:  Evaluation of a quantitative two-cell model and of
   CYP1A1/1A2 as a dosimeter. Toxicol. Appl. Pharmacol. 146, 281-293.
Conolly, R.B., Beck, B.D., and Goodman, J.I. (1999). Stimulating research to improve the scientific
   basis of risk assessment. Toxicol. Sci., 49, 1-4.
You, L, Archibeque-Engle, S., Casanova,  M.,  Conolly, R.B., and Heck, H.d'A. (1999). Transplacental
   and lactational transfer of p,p'-DDE in Sprague-Dawley rats. Toxicol. Appl. Pharmacol. 157, 134-144.
Keys, D.A., Wallace, D.G., Kepler, T.B., and Conolly, R.B. (2000). Quantitative evaluation of alternative
   mechanisms of blood disposition of di(n-butyl) phthalate and mono(n-butyl) phthalate in rats.
   Toxicol. Sci. 53, 173-184.
Haag-Gronlund, M., Conolly,  R.B., Scheu,  G.,  Warngard, L., and Fransson-Steen, R. (2000). Analysis
   of rat liver foci growth with a quantitative two-cell model after treatment with 2,4,5,3',4'-
   pentachlorobiphenyl. Toxicol. Sci. 57, 32-42.
Conolly, R.B., Lilly, P.D., and Kimbell, J.S. (2000). Simulation modeling of the tissue disposition of
   formaldehyde to predict nasal DNA-protein cross-links in F344 rats, rhesus monkeys, and humans.
   Environ. Hlth. Perspect. 108(suppl 5), 919-924.
Ou, Y.C., Conolly, R.B., Thomas, R.S., Xu, Y., Andersen, M.E., Chubb, L.S.,  Pitot, H.C., and Yang,
   R.S.H. (2001). A clonal growth model: Time-course simulations of liver foci growth following penta-
   or hexachlorobenzene treatment in a medium-term bioassay. Cancer Res. 61, 1879-1889.
Conolly, R.B., Kimbell, J.S., Janszen, D., Schlosser, P.M., Kalisak, D., Preston, J., and Miller, F.J.
   (2003). Biologically motivated computational modeling of formaldehyde carcinogenicity in the F344
   rat. Toxicol. Sci. 75, 432-447.
Ou, Y.C., Conolly, R.B., Thomas, R.S., Gustafson, D.L, Long, M.E., Dobrev, I.D., Chubb, L.S., Xu, Y.,
   Lapidot, S.A., Andersen, M.E., and Yang,  R.S.H. (2003). Stochastic simulation of hepatic
   preneoplastic foci development for four chlorobenzene congeners in a medium-term bioassay.
   Toxicol. Sci. 73, 301-314.
Conolly, R.B. and Lutz, W.K.  (2004). Non-monotonic dose-response relationships: Mechanistic basis,
   kinetic modeling, and implications for risk assessment. Toxicol. Sci. 77:151-157.
Gaylor, D.W., Lutz, W.K., and Conolly, R.B. (2004) Statistical analysis of non-monotonic dose response
   relationships: Research design and analysis of nasal cell proliferation  in rats exposed to
   formaldehyde. Toxicol. Sci. 77:158-164.
Tan,  Y.-M., Butterworth, B.E., Gargas, M.L. and Conolly, R.B. (2003). Biologically motivated
   computational modeling of chloroform cytolethality and regenerative cellular proliferation. Toxicol.
   Sci. 75:192-200.
Price, P.S., Conolly, R.B., Chaisson, C.F., Gross, E.A., and Young, J.S. (2003). Modeling
   inter-individual variation in physiological factors used in PBPK models of humans. CRC Crit. Rev.
   Toxicol. 33:469-503.
Conolly, R.B., Kimbell, J.S., Janszen, D.J., Schlosser, P.M., Kalisak, D., Preston, J., and Miller, F.J.
   (2004). Human respiratory tract cancer risks of inhaled formaldehyde: Dose-response predictions
   derived from biologically-motivated computational modeling of a combined rodent and human
   dataset. Toxicol. Sci. 82:279-296.
Andersen, M.E., Dennison, J.E., Thomas, R.E., and Conolly, R.B. (2005). New directions in incidence-
   dose modeling.  TRENDS Biotechnol. 23,  122-127.
Conolly, R.B., Gaylor, D.W., and Lutz, W.K. (2005). Population variability in biological adaptive
   responses to DNA damage and the shapes of carcinogen dose-response curves. Toxicol. Appl.
   Pharmacol. 207, S570-S575.
Lutz, W.K., Gaylor, D.W., Conolly,  R.B., and Lutz, R.W. (2005). Nonlinearity and thresholds in dose-
   response relationships for carcinogenicity  due to sampling variation, logarithmic dose scaling, or
   small differences in individual susceptibility.  Toxicol. Appl. Pharmacol. 207, S565-S569.
Zhang, Q., Andersen, M.E., and Conolly, R.B. (2006). Binary gene induction and protein expression in
   individual cells. Theor. Biol. Med. Modelling 3:18, DOI:10.1186/1742-4682-3-18.
Tan, Yu-Mei, Liao, K.H.,  Conolly, R.B., Blount, B.C.,  Mason, A.M., and Clewell, H.J. (2006). Use of a
   physiologically based pharmacokinetic model to identify exposures consistent with human
   biomonitoring data for chloroform. J. Toxicol. Environ. Health, Part A, 69,  1727-1756,
   DOI: 10.1080/15287390600631367.
Breen, M.S., Villeneuve,  D.L., Breen, M., Ankley, G.T., and Conolly, R.B. (2007). Mechanistic
   computational model of ovarian steroidogenesis to predict biochemical responses to endocrine
   active compounds. Annals Biomed. Engineering, 35, 970-981, DOI: 10.1007/s10439-007-9309-7.
Conolly, R.B., and Thomas, R.S. (2007). Biologically motivated approaches to extrapolation from high
   to low  doses and the advent of systems biology:  The road to toxicological safety assessment.
   Human and Ecological Risk Assessment 13, 52-56.
Rhomberg, L.R.,  Baetcke, K., Blancato, J., Bus, J., Cohen, S., Conolly, R., Dixit, R., Doe, J., Ekelman, K.,
   Fenner-Crisp,  P., Harvey, P., Hattis, D., Jacobs, A., Jacobson-Kram,  D., Lewandowski, T., Liteplo, R.,
   Pelkonen, O.,  Rice, J., Somers, D., Turturro, A., West, W., and Olin, S. (2007).  Issues in the design and
   interpretation of chronic toxicity and carcinogenicity studies in rodents: Approaches to dose selection. Crit.
   Rev. Toxicol. 37: 729-837, DOI: 10.1080/10408440701524949.
Liao, K.H., Tan, Y.-M., Conolly, R.B., Borghoff, S.J., Gargas, M.L., Andersen, M.E., and Clewell, H.J., III.
   (2007). Bayesian estimation of pharmacokinetic and pharmacodynamic parameters in a
   mode-of-action-based cancer risk assessment for chloroform.  Risk Anal. 27, 1535-1551.
Nong, A., Tan, Y.-M., Krolski, M.E., Wang, J.,  Lunchick, C., Conolly, R.B., and Clewell, H.J., III. (2008).
   Bayesian calibration of a  physiologically based pharmacokinetic/pharmacodynamic model of carbaryl
   cholinesterase inhibition. J. Toxicol. Environ. Health 71, 1363-1381.
Ankley, G.T., Bencic, D.C., Breen,  M.S., Collette, T.W., Conolly, R.B., Denslow, N.D., Edwards, S.W., Ekman,
   D.R., Garcia-Reyero, N.,  Jensen, K.M., Lazorchak, J.M., Martinovic, D., Miller, D.H., Perkins,  E.J., Orlando,
   E.F., Villeneuve, D.L., Wang, R.-L, and Watanabe, K.H. (2009). Endocrine disrupting chemicals in fish:
   Developing exposure indicators and predictive models of effects based on mechanism  of action. Aquatic
   Toxicology doi:10.1016/j.aquatox.2009.01.013.
Breen, M.S., Breen, M., Terasaki, N., Yamazaki, M.,  and Conolly, R.B. (2009). Computational model of
   steroidogenesis in human H295R cells to predict biochemical response to endocrine active chemicals:
   Model  development for metyrapone. Submitted to Environmental Health Perspectives.
Bhattacharya, S., Andersen,  M.E.,  Conolly, R.B., Thomas, R.S., Kaminski, N.E., and Zhang, Q. A
   transcriptional regulatory switch underlying B cell terminal differentiation and its disruption by dioxin.
   Submitted to PLOS Biology.
Zhang, Q., Bhattacharya, S.,  Crawford,  R.B., Kline, D.E., Conolly, R.B., Thomas, R.S.,  Kaminski, N.E., and
   Andersen, M.E. Stochastic modeling of B lymphocyte terminal differentiation and its suppression by dioxin.
   Submitted to BMC Systems Biology.
Kitchin, K.T., and Conolly, R.B. Arsenic induced carcinogenesis - oxidative stress as a possible mode of action
   and future research needs for more biologically based risk assessment. Submitted to Chemical Research in
   Toxicology.
Luke, N.S., Sams, R.S., III, DeVito, M.J., Conolly, R.B., and El-Masri, H.A. Development of a quantitative
   model incorporating key events in a hepatotoxic mode of action to predict tumor incidence. Submitted to
   Toxicological Sciences.


-------
         Principal Investigator/Program Director (Last, First, Middle):  Corton, Jon Christopher
                                BIOGRAPHICAL SKETCH
NAME
Jon Christopher ("Chris") Corton
eRA COMMONS USER NAME
    POSITION TITLE

    Senior Research Biologist
EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION                               DEGREE (if applicable)   YEAR(s)   FIELD OF STUDY
Grinnell College, Grinnell, IA                         B.S.                     1979      Biology/Chemistry
University of Kansas Medical Center, Kansas City, KS   Ph.D.                    1984      Biochemistry
 A. POSITIONS and HONORS

 Research and Professional Experience:
 2009-Present  Senior Research Biologist, NHEERL, Integrated Systems Toxicology Division
 2006-2009     Leader, National Health and Environmental Effects Research Laboratory (NHEERL)
               Toxicogenomics Core Facility
 2005-Present  Senior Research Biologist, NHEERL, Environmental Carcinogenesis Division
 2002-2005    Consultant, ToxicoGenomics, Chapel Hill, NC
 1989-2002    Research Biologist, CIIT Centers for Health Research, RTP, NC
 1987-1989    Research Fellow, Duke University, Durham, NC

 Invited Lectures/Symposia (last 2 years):
 Corton, J. C. (2006) Invited presentation, "Modulation of xenobiotic metabolizing enzyme
    expression by caloric restriction through PGC-1 alpha and PPARalpha" Experimental
    Biology Annual Meeting, San Francisco, CA. April, 2006.
 Corton, J. C. (2007) Invited presentation, "Nuclear receptor activation - risk assessment and
    regulatory implications" Perfluoroalkyl Acids and Related Chemistries: Toxicokinetics and
    Mode-of-Action Workshop, Arlington, VA, Feb. 14-17, 2007.
 Corton, J. C. (2007) Invited presentation, "Nuclear receptor transcriptional networks"
    University of Tennessee, Memphis. Sept., 2007.
 Corton, J. C. (2007) Invited presentation, "Nuclear receptors and aging" Gordon Conference on
    Aging, Les Diablerets, Switzerland. Sept., 2007.
 Corton, J. C. (2007) Invited presentation, "Nuclear receptor transcriptional networks" University
    of Nice, Nice, France. October, 2007.
 Corton, J. C. (2007) Invited presentation, "Toxicogenomics of nuclear receptors" Bayer Crop
    Science, Sophia Antipolis, France, October, 2007.
 Corton, J. C. (2007) Invited presentation, "Nuclear receptor transcriptional networks" University
    of Burgundy, Dijon, France. October, 2007.
 Corton, J. C. (2007) Invited presentation, "Toxicogenomic dissection of the PFOA
    transcriptional profile" EuroTox 2007, Amsterdam, Netherlands, October, 2007.
Corton, J. C. (2008) Invited presentation, "Toxicogenomic dissection of the PFOA
   transcriptional profile" PFAA-II meeting, EPA, June, 2008.
Corton, J. C. (2009) Invited presentation, "Toxicogenomic dissection of nuclear receptor action"
   NC State University. March, 2009.

Assistance/Leadership Provided to the Scientific Community:
•  Advisory appointments (since 1998):  NIH/NIEHS: Chaired  11  review panel committees;
   reviewer on 30 panel committees. Member,  Scientific  Committee, Critical Assessment of
   Techniques  for Microarray Data  Analysis,  2000-2005;   Member, Peroxisome Proliferator
   Case Study Work Group, ILSI, Washington, D.C., 2001-2003; Reviewer, TCE Risk
   Assessment, NY Department of Health, 2005.
•  Society of Toxicology activities:  Member,  Continuing  Education Committee, 2006-2009;
   Chair,  Continuing Education Committee, 2008-2009; Co-chair,  8  symposia, roundtables,
   continuing education courses since  1999;  Secretary/Treasurer,  Carcinogenesis Specialty
   Section, 2006-2008.
•  Editorial  board appointments:  Archives of  Toxicology,  2001-2003;  Cell  Biology and
   Toxicology, 2001-present; International  Journal of Toxicology, 1998-2008; Toxicological
   Sciences, 2001-present; Toxicology, 1999-present; Toxicology Letters, 2001-present; PPAR
   Research,  2005-present;  Chemico-Biological  Interactions,   2005-present;  Journal   of
   Pharmacology and Experimental Therapeutics, 2009- present.
•  Academic affiliations: Adjunct Assistant Professor, Curriculum in Toxicology, University of
   North  Carolina,  Chapel Hill,  NC; Adjunct Assistant  Professor,  Integrated Toxicology
   Program,  Duke University, Durham, NC; Associate Member, Graduate Faculty, University
   of Louisiana, Monroe, LA
•  Miscellaneous:  Organizing  Committee,  Dioxin '91; founder and  leader, Triangle Array
   Users  Group,  April 1999-2006; Chair, platform session, Gordon Research Conference on
   Mycotoxins and  Phycotoxins, Waterville,  ME, 2005;  Chair, symposium, "Obesity as a
   modulator of chemical toxicity", AAAS annual meeting, San Francisco, CA. 2007; Member,
   Genomics Committee,  ILSI/HESI 2006-present.  Member, microarray quality  control
   (MAQC) committee, 2006-present.

Assistance/Leadership Provided to the Agency:
   •   Co-chair, PPARalpha workgroup, Risk Assessment Forum, 2005-present.
   •   Member, Data Analysis Workgroup, 2007-present.
   •   Member, Virtual Liver Project, 2006-present.
   •   Leader, NHEERL Toxicogenomics Core, 2006-present.

B. SELECTED PUBLICATIONS (selected from more than 65 total).

Valles E.G., Laughter  A.R., Dunn C.S., Cannelle  S., Swanson C.L., Cattley R.C. and Corton
   J.C. (2003) Role of the peroxisome proliferator-activated receptor alpha in responses to the
   hepatocarcinogenic phthalate, diisononyl phthalate (DINP). Toxicology. 191, 211-25.
Laughter, A.R., Dunn, C.S., Howroyd, P., Cattley, R.C., Swanson, C., Corton, J.C. (2004) Role
   of the peroxisome proliferator-activated receptor alpha in responses to trichloroethylene and
   metabolites, trichloroacetate and dichloroacetate. Toxicology. 203, 83-98.
Howroyd, P., Swanson, C., Dunn, C., Cattley, R.C. and Corton, J.C. (2004) Decreased
   Longevity and Acceleration of Age-Dependent Lesions in Mice Lacking the Nuclear
   Receptor Peroxisome Proliferator-Activated Receptor α (PPARα). Toxicological Pathology.
   32, 591-599.
Corton, J.C., Apte, U., Anderson, S.P., Limaye, P., Yoon, L., Latendresse, J., Dunn, C.S.,
   Everitt, J.I., Voss, K.A., Swanson, C., Wong, J.S., Gill, S.S., Chandraratna, R.A.S., Kwak,
   M.-K., Kensler, T.W., Stulnig, T.M., Steffensen, K.R., Gustafsson, J.-A. and Mehendale, H.
   (2004) Caloric restriction mimetics include nuclear receptor agonists. Journal of Biological
   Chemistry. 279, 46204-12.
Anderson, S.P., Dunn, C.S., Laughter, A.R., Yoon, L., Swanson, C., Stulnig, T.M., Steffensen,
   K.R., Chandraratna, R.A.S., Gustafsson, J-A and Corton, J.C.  (2004) Overlapping
   Transcriptional Programs Regulated by Peroxisome Proliferator-Activated Receptor α,
   Retinoid X Receptor and Liver X Receptor in Mouse Liver. Molecular Pharmacology. 66,
   1440-1452.
Anderson, S.P., Howroyd, P., Liu, J., Swanson, C., Bahnemann, R., and Corton, J.C. (2004) The
   transcriptional response to a peroxisome proliferator-activated receptor α (PPARα) agonist
   includes increased expression of proteome maintenance genes. Journal of Biological
   Chemistry. 279, 52390-52398.
Corton, J.C. and Lapinskas, P. (2005) Peroxisome proliferator-activated receptors: role in
   phthalate-induced male reproductive tract toxicity? Toxicological Sciences. 83, 4-17.
Lapinskas, P.J., Brown, S., Swanson, C., Cattley, R.C., and Corton, J.C. (2005) Role of
   peroxisome proliferator-activated receptor alpha in mediating phthalate-induced liver
   toxicity. Toxicology. 207, 149-163.
Stauber, A.J., Brown-Borg, H., Liu, J., Laughter, A., Staben, R.A., Coley, J.C., Swanson, C.,
   Voss, K.A., Kopchick, J.J. and Corton, J.C. (2005) Constitutive expression of peroxisome
   proliferator-activated receptor alpha and regulated genes in dwarf mice. Molecular
   Pharmacology. 67, 681-694.
Xiao S., Anderson S.P., Swanson C., Bahnemann R., Voss K.A., Stauber A.J. and Corton J.C.
   Activation of the Nuclear Receptor Peroxisome Proliferator-Activated Receptor alpha
   (PPARα) Enhances Hepatocyte Apoptosis. Toxicological Sciences. 92, 368-77.
Martin M.T., Brennan R., Hu W., Ayanoglu E., Lau C., Ren H., Wood C.R., Corton J.C.,
   Kavlock R.J., Dix D.J. (2007). Toxicogenomic Study of Triazole Fungicides and
   Perfluoroalkyl Acids in Rat Livers Predicts Toxicity and Categorizes Chemicals Based on
   Mechanisms of Toxicity. Toxicol Sci. 97:595-613.
Fostel JM, Burgoon L, Zwickl C, Lord P, Corton JC, Bushel PR, Cunningham M, Fan L,
   Edwards SW, Hester S, Stevens J, Tong W, Waters M, Yang C, Tennant R. (2007). Towards
   a checklist for exchange and  interpretation of data from a toxicology study.  Toxicol Sci.
   99:26-34.
MAQC Consortium. (2006). The MicroArray Quality Control (MAQC) project shows
   interplatform reproducibility of gene expression measurements. Nature Biotechnology. 24,
   1151-61.
Boedigheimer MJ, Wolfinger RD, Bass MB, Bushel PR, Chou JW, Cooper M, Corton JC, Fostel
   J, Hester S, Lee JS, Liu F, Liu J, Qian HR, Quackenbush J, Pettit S, Thompson KL. (2008).
   Sources of variation in baseline gene expression levels from toxicogenomics study control
   animals across multiple laboratories. BMC Genomics. 12, 285.
Rosen MB, Lee JS, Ren H, Vallanat B, Liu J, Waalkes MP, Abbott BD, Lau C, Corton JC.
   (2008). Toxicogenomic dissection of the perfluorooctanoic acid transcript profile in mouse
   liver: evidence for the involvement of nuclear receptors PPAR alpha and CAR.  Toxicol Sci.
   103, 46-56.
Rosen MB, Abbott BD, Wolf DC, Corton JC, Wood CR, Schmid JE, Das KP, Zehr RD, Blair
   ET, Lau C. (2008). Gene Profiling in the Livers of Wild-Type and PPARalpha-Null Mice
   Exposed to Perfluorooctanoic Acid (PFOA). Toxicol Pathol. 36, 592-607.
Lee JS, Ward WO, Wolf DC, Allen JW, Mills C, DeVito MJ, Corton JC. (2008). Coordinated
   changes in xenobiotic metabolizing enzyme gene expression in aging male rats. Toxicol Sci.
   106, 263-83.
Corton, JC (2008) Role of PPARalpha in mediating the effects of trichloroethylene and
   metabolites. Critical Reviews in Toxicology. 27,1.
Ren H, Vallanat B, Nelson DM, Yeung LW, Guruge  KS, Lam PK, Lehman-McKeeman LD,
   Corton JC. (2009)  Evidence for the involvement of xenobiotic-responsive nuclear receptors
   in transcriptional  effects upon perfluoroalkyl acid exposure in diverse  species. Reprod
   Toxicol. 27, 266-77
Elam MB, Cowan GS  Jr, Rooney RJ, Hiler ML, Yellaturu CR, Deng X, Howell GE, Park EA,
   Gerling IC, Patel D, Corton JC, Cagen LM, Wilcox HG, Gandhi M, Bahr MH, Allan MC,
   Wodi LA, Cook GA, Hughes TA, Raghow R. (2009). Hepatic Gene Expression in Morbidly
   Obese Women: Implications for Disease Susceptibility. Obesity (Silver Spring). In press.
Carmen  Gonzalez M, Corton JC, Cattley RC,  Herrera E,  Bocos C. (2009). Peroxisome
   proliferator-activated  receptor  alpha  (PPARalpha)   agonists  down-regulate   alpha2-
   macroglobulin expression by a PPARalpha-dependent mechanism. Biochimie. 91, 1029-35.
Michael J. Boedigheimer, Jeff W. Chou, Matthew Cooper, J. Christopher Corton, Jennifer Fostel,
   Raegan O'Lone, P. Scott Pine, John Quackenbush, Karol L. Thompson, and Russell D.
   Wolfinger (2009). Sources of Variance in Rat Liver and Kidney Baseline Gene Expression in
   a Large Multi-Site Dataset. Submitted. In Batch effects and experimental noise in microarray
   studies: sources and solutions, ed. Scherer, A. John Wiley & Sons Ltd, Publisher
Rosen MB, Lau C, Corton JC. (2009). Does Exposure to Perfluoroalkyl Acids Present a Risk to Human
   Health? Toxicol Sci.  In press.

Submitted

Beena Vallanat, Steven P. Anderson, Holly M. Brown-Borg, Hongzu Ren, Sander Kersten,
   Sudhakar Jonnalagadda, Rajagopalan Srinivasan and J. Christopher Corton (2009). Analysis
   of the Heat Shock Response in Mouse Liver Reveals Transcriptional Dependence on the
   Nuclear Receptor Peroxisome Proliferator-Activated Receptor alpha (PPARα). BMC
   Genomics. Submitted.
Gail M. Nelson, Gene J. Ahlborn, James W. Allen, Hongzu Ren, J. Christopher Corton, Michael
   P. Waalkes, Kirk T. Kitchin, and Don A. Delker (2009). Impact of Life Stage and Duration
   of Exposure on Liver Gene Expression in Arsenic-Treated Male C3H Mice.  Toxicology.
   Submitted.
Ren H, Aleksunes, LM, Wood, C, Vallanat B, George, M, Klaassen, CD, Corton JC. (2009)
   Characterization of Peroxisome Proliferator-Activated Receptor α (PPARα)-Independent
   Effects of PPARα Activators in the Rodent Liver: Di-(2-ethylhexyl) phthalate Activates the
   Constitutive Activated Receptor. Tox Sci Submitted.

-------
        Principal Investigator/Program Director (Last, First, Middle):   Davis, Jimena L.
                                     BIOGRAPHICAL SKETCH

NAME
Jimena L. Davis
eRA COMMONS USER NAME
POSITION TITLE
Mathematical Statistician
EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION                       DEGREE (if applicable)   YEAR(s)   FIELD OF STUDY
Clemson University, Clemson, SC                B.S.                     2003      Mathematical Sciences
North Carolina State University, Raleigh, NC   M.S.                     2005      Applied Mathematics
North Carolina State University, Raleigh, NC   Ph.D.                    2008      Computational Mathematics
NCCT, US EPA, Research Triangle Park, NC                                2008-     Uncertainty Analysis and Risk Assessment

A. POSITIONS and HONORS

Research and Professional Experience:
2008-Present   Cross ORD Postdoctoral Fellow, National Center for Computational Toxicology, US EPA,
               RTP, NC (Mentors: Woodrow Setzer and Rogelio Tornero-Velez)
2003-2008     Research Assistant, Department of Mathematics, North Carolina State University, Raleigh, NC
               (Advisor: H.T. Banks)
2006          Student Intern, Computational Biology Department, Sandia National Laboratories,
               Albuquerque,  NM (Advisor: Elebeoba E. May)
2002          Participant, Research  Experience for Undergraduates in Computational Number Theory and
               Combinatorics, Clemson University, Clemson, SC
Professional Societies and Affiliations:
2003-present   Member, Society for Industrial and Applied Mathematics
2003-present   Member, American Mathematical Society

Honors and Awards:
2009        Superior Accomplishment Recognition Award from the Office of Pesticide Programs
2004 - 2008  Department of Energy Computational Science Graduate Student Fellowship,
             North Carolina State University, Raleigh, NC
2006 & 2007  Association for the Concerns of African American Graduate Students Academic Achievement
             Award - College of Physical and Mathematical Sciences,
             North Carolina State University, Raleigh, NC
2003 - 2004  Statistical and Applied Mathematical Sciences Institute Fellowship,
             North Carolina State University, Raleigh, NC
2003 - 2004  Mathematics Department Fellowship,
             North Carolina State University, Raleigh, NC
2003        Faculty  Scholarship Award,
             Clemson University, Clemson, SC
2003        summa  cum laude Graduate,
             Clemson University, Clemson, SC
2002 & 2003  Beta Kappa Chapter of Phi Sigma Pi Scholarship Award,
             Clemson University, Clemson, SC
2001 - 2003  Multicultural Achievement Award,
             Clemson University, Clemson, SC
2001 - 2003  Susan & Harry Frampton Scholarship,
             Clemson University, Clemson, SC
2000         Mathematical Sciences Freshman Award,
             Clemson University, Clemson, SC
1999 - 2003  Palmetto Fellows Scholarship,
             Clemson University, Clemson, SC
1999 - 2003  Coca-Cola Clemson Scholarship,
             Clemson University, Clemson, SC

Selected Expert Committees/Advisory Panels/Organizing Committees:
2009 -present  President, EPA-RTP Networking and Leadership Training Organization (NLTO)

B. SELECTED PUBLICATIONS

Banks H.T., Davis J.L., and Hu S., "A Computational Comparison of Alternatives to Including Uncertainty in
   Structured Population Models," CRSC Tech. Rpt. CRSC-TR09-14, North Carolina State University, June
   2009; Three Decades of Progress in Systems and Control, to appear
Banks H.T., Davis J.L., Ernstberger S.L., Hu S., Artimovich E., and Dhar A.K., "Experimental Design and
   Estimation of Growth Rate Distributions in Size-Structured Shrimp Populations," CRSC Tech. Rpt. CRSC-
   TR08-20, North Carolina State University,  November 2008; Inverse Problems, to appear
Davis J.L., "Uncertainty Quantification in the Estimation of Probability Distributions on Parameters in Size-
   Structured Populations," Ph.D. Dissertation (2008)
Banks H.T., Davis J.L., Ernstberger S.L., Hu S., Artimovich E., Dhar A.K., and Browdy C.L., "A Comparison of
   Probabilistic and Stochastic Formulations in Modeling Growth Uncertainty and Variability," CRSC Tech.
   Rpt. CRSC-TR08-03,  North Carolina State University, February 2008; Journal of Biological Dynamics,
   Volume 3, pages 130 - 148 (2009)
Banks H.T. and Davis J.L., "Quantifying Uncertainty in the Estimation of Probability Distributions," CRSC Tech.
   Rpt. CRSC-TR07-21,  North Carolina State University, December 2007; Mathematical Biosciences and
   Engineering, Volume 5, pages 647 - 667 (2008)
Banks H.T. and Davis J.L., "A Comparison of Approximation Methods for the Estimation of Probability
   Distributions on Parameters," CRSC Tech. Rpt. CRSC-TR05-38, North Carolina State University, October
   2005; Applied Numerical Mathematics, Volume 57, pages 753 - 777 (2007)
Calkin N., Davis J., James K., Perez E., and Swannack C., "Computing the Integer Partition Function,"
   Mathematics of Computation, Volume 76,  pages 1619 - 1638 (2007)
Davis J., Fricks J., Macabea J., Stroud L., White G., and Wong A., "Evaluating a Physiologically Based
   Pharmacokinetic Model Proposed  for Use in Risk Assessment," 2003 Industrial Mathematics Modeling
   Workshop for Graduate Students,  CRSC Tech. Rpt. CRSC-TR04-07, North Carolina State University,
   March 2004

C. SELECTED PRESENTATIONS

"Quantifying Uncertainty in the Estimation of Probability Distributions on Parameters," SIAM 2008 Annual
   Meeting Graduate Student Workshop on Diversity, San Diego, CA, July 2008 (oral presentation)
"Quantifying Uncertainty in the Estimation of Probability Distributions," Department of Energy Computational
   Science Graduate Fellows' Annual Conference, Washington, DC, June 2008 (oral presentation)
"Estimation of Probability Distributions on Parameters in Size-Structured Populations," North Carolina State
   Mathematics Department Graduate Recruitment Weekend, Raleigh, NC, February 2008 (oral presentation)
"Uncertainty Quantification in the Estimation of Probability Distributions on Parameters," Applied Mathematics
   Graduate Student Seminar, Raleigh, NC, January 2008 (oral presentation)
"A Comparison of Approximation Methods for the Estimation of Probability Distributions on Parameters,"
   Infinite Possibilities Conference, Raleigh, NC, November 2007 (oral presentation)
"Using Confidence Bands to Quantify Uncertainty in the Estimation of Probability Distributions," Atlantic Coast
   Conference on Mathematics in the Life and Biological Sciences, Blacksburg, VA, May 2007 (oral
   presentation)
"A Study of Computational Approaches for Parameter Estimation in the  Escherichia coli K-12 Central Metabolic
   System," 2006 Student Symposium, Sandia National Laboratories, Albuquerque, NM, August 2006 (oral
   presentation)
"A Computational and Statistical Comparison of Approximation Methods for the Estimation of Probability
   Distributions on Parameters," Department of Energy Computational  Science Graduate Fellows' Annual
   Conference, Washington, DC, June 2006 (poster presentation)
 "Comparison of Two Approximation Methods in the Estimation of Growth Rate Distributions in Size-Structured
   Mosquitofish Populations," SIAM-SEAS Conference, Charleston, SC, March 2005 (oral presentation)
"Distributions of Growth Rates in  Size-Structured Mosquitofish Population Models," Journees Jeunes, Paris,
   France, March 2004 (oral presentation)

D. WORKSHOPS

Genomes to  Global Health:  Modeling of Infectious Diseases, Statistical and Applied Mathematical Sciences
   Institute,  Research Triangle Park, NC, September 2004
SAMSI/CRSC Undergraduate Workshop, North Carolina State University, Raleigh, NC, June 2004
Mathematics Meets Biology:  Epidemics, Data Fitting, and Chaos, MAA Prep Workshop, University of
   Louisiana at Lafayette, Lafayette, LA, May 2004
Data Mining  and Machine Learning, Statistical and Applied Mathematical Sciences Institute, Research Triangle
   Park, NC, September 2003
2003 Industrial Mathematics Modeling Workshop for Graduate Students, Center for Research in Scientific
   Computation, North Carolina State University,  Raleigh, NC, July 2003

E. CONTINUING EDUCATION

"Characterizing Variability and Uncertainty with Physiologically-Based Pharmacokinetic Models," Society of
   Toxicology, Baltimore, Maryland, March 2009

-------
        Principal Investigator/Program Director (Last, First, Middle):   Degitz Jr., Sigmund J.
                                     BIOGRAPHICAL SKETCH
          Provide the following information for the key personnel and other significant contributors in the order listed on Form Page 2.
                          Follow this format for each person. DO NOT EXCEED FOUR PAGES.
  NAME
  Sigmund J. Degitz Jr.
  eRA COMMONS USER NAME
   POSITION TITLE
   Toxicologist
  EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION                   DEGREE (if applicable)   YEAR(s)   FIELD OF STUDY
Northland College, Ashland, WI             B.S.                     1991      Biology
University of Illinois, Urbana-Champaign   Ph.D.                    1996      Toxicology

A. POSITIONS and HONORS

Research and Professional Experience:
1998-Present Toxicologist, U.S. EPA, Duluth, MN
1996-1998   Postdoctoral Research Fellow, University of North Carolina, Chapel Hill

Selected Awards and Honors:
University of Minnesota-Duluth, Adjunct Associate, Integrated Biosciences Graduate Program
Bronze Medal OECD Support Team. For leadership in development of internationally-harmonized EDC Test
       methods through the OECD. 2003
Bronze Medal Promoting Strong Science in Agency Decisions. For significant achievements in working with
       program offices to promote the use of strong science in Agency decisions. 2003
STAA award level 3
STAA award level 2

B. SELECTED PUBLICATIONS
Olmstead AW, Kosian PA, Korte JJ, Holcombe GW, Woodis KK, Degitz SJ. Sex reversal of the amphibian,
Xenopus tropicalis, following larval exposure to an aromatase inhibitor. Aquat Toxicol. 2009 Jan 31;91(2):143-50.
Olmstead AW, Korte JJ, Woodis KK, Bennett BA, Ostazeski S, Degitz SJ. Reproductive maturation of the
tropical clawed frog: Xenopus tropicalis. Gen  Comp Endocrinol. 2009 Jan 15;160(2):117-23
Helbing, CC, Ji L,  Bailey CM, Veldhoen N, Zhang F, Holcombe GW,  Kosian PA, Tietge JE, and SJ Degitz.
       (2007) Identification of gene expression indicators for thyroid axis disruption in a Xenopus laevis
       metamorphosis screening assay. Part 2: Effects on the tail and hindlimb
Helbing CC, Bailey CM, Ji L, Gunderson MP,  Zhang F, Veldhoen N, Skirrow RC, Mu R, Lesperance M,
       Holcombe GW, Kosian PA, Tietge JE, and SJ Degitz. (2007) Identification of gene expression indicators
       for thyroid axis disruption in a Xenopus laevis metamorphosis screening assay. Part 1: Effects on the
       brain. Aquat Toxicol. 2007 May 31;82(4):227-41.
Douglas JF, Denver R, Degitz SJ, Tietge JE, and LW Touart (2007) The hypothalamic-pituitary-thyroid (HPT)
       axis in frogs and its role in frog development and reproduction.  (In press)
Villeneuve DL, Miracle AL, Jensen KM,  Degitz SJ, Kahl MD, Korte JJ, Greene KJ, Blake LS, Linnum AL,
       Ankley GT (2006) Development of quantitative  real-time PCR assays for fathead minnow (Pimephales
       promelas)  gonadotropin beta subunit mRNAs to support endocrine disrupter research. Comp Biochem
       Physiol C Toxicol  Pharmacol.
Ankley GT, Daston GP, Degitz SJ, Denslow ND, Hoke RA, Kennedy SW,  Miracle AL, Perkins EJ, Snape J,
       Tillitt DE, Tyler CR, and D Versteeg (2006) Toxicogenomics in Regulatory Ecotoxicology
       Environmental Science and Technology 40(13):4055-4065.
Zhang F, Degitz SJ, Holcombe GW, Kosian PA, Tietge J, Veldhoen N, Helbing CC. (2006) Evaluation of gene
       expression endpoints in the context of a Xenopus laevis metamorphosis-based bioassay to detect
       thyroid hormone disrupters. Aquatic Toxicology. 76(1):24-36.
Degitz SJ, Holcombe GW, Flynn KM, Kosian PA, Korte JJ, Tietge JE. (2005) Progress towards development of
       an amphibian-based thyroid screening assay using Xenopus laevis. Organismal and thyroidal
       responses to the model compounds 6-propylthiouracil, methimazole, and thyroxine. Toxicological
       Sciences. 87(2):353-64.
Tietge JE, GW Holcombe, KM Flynn, PA Kosian, JJ Korte, LE Anderson, DC Wolf, and SJ Degitz. (2005).
       Metamorphic inhibition of Xenopus laevis by sodium perchlorate: effects on development and thyroid
       histology. Environmental Toxicology and Chemistry, 24:926-933.
Ankley GT, SJ Degitz, SA Diamond, and JE Tietge (2004). Assessment of environmental stressors potentially
       responsible for malformations in North American anuran amphibians. Ecotoxicology and Environmental
       Safety 58, 7-16.
Degitz SJ, Rogers JM, Zucker RM, Hunter ES 3rd. (2004) Developmental toxicity of methanol: Pathogenesis in
       CD-1 and C57BL/6J mice exposed in whole embryo culture.
       Birth Defects Res Part A Clin Mol Teratol. 70, 179-84.
Degitz SJ, Zucker RM, Kawanishi CY, Massenburg GS, Rogers JM. (2004) Pathogenesis of methanol-induced
       craniofacial defects in C57BL/6J mice. Birth Defects Res Part A Clin Mol Teratol. 70, 172-8.
Rogers JM, Brannen KC, Barbee BD, Zucker RM, Degitz SJ. (2004) Methanol exposure during gastrulation
       causes holoprosencephaly, facial dysgenesis, and cervical vertebral malformations in C57BL/6J mice.
       Birth Defects Res Part B Dev Reprod Toxicol. 71, 80-8.
Kosian PA, EA Makynen, GT Ankley, and SJ Degitz. (2003) Bioconcentration and Metabolism of All-Trans
       Retinoic Acid by Three Native North American Ranids. Toxicological Sciences 74, 147-156.
Degitz SJ, PA Kosian, GW Holcombe, JE Tietge, EJ Durhan, and GT Ankley. (2003) Comparing the effects of
       Retinoic Acid on amphibian limb development and lethality: chronic exposure results in lethality not limb
       malformations.  Toxicological Sciences 74, 139-146
Degitz, SJ, EJ Durhan, PA Kosian, GT Ankley, and JE Tietge. (2003) Development toxicity of  methoprene and
       its degradation products in Xenopus laevis.  Aquatic Toxicology 64, 97-105.
Gray, LE, Ostby, V. Wilson, C. Lambright, K. Bobseine, P. Hartig, A. Hotchkiss, C. Wold, J. Furr, M. Price, L
       Parks, R. Cooper, T. Stoker, S. Laws, S. Degitz, K.M. Jensen,  M.D. Kahl, JJ. Korte, E.A. Makynen,
       J.E. Tietge, and G.T. Ankley (2002) Xenoendocrine disrupters - tiered screening and testing: Filling key
       data gaps. Toxicology. 181-182, 371-82.
Simon R, JE Tietge, B Michalke, SJ Degitz, and KW Schramm. (2002) Iodine species and the endocrine
       system: thyroid hormone levels in adult Danio rerio and developing Xenopus laevis. Anal Bioanal
       Chem 372, 481-5
Tietge, JE, SA Diamond, GT Ankley, DL DeFoe, GW Holcombe, KM Jensen, SJ Degitz, GE Elonen, and E
       Hammer (2000). Ambient solar UV-B causes mortality in larvae of three species of Rana.
       Photochemistry and Photobiology 74, 261-268.
Degitz SJ, PA Kosian, EA Makynen, KM Jensen and GT Ankley (2000) Stage- and Species-specific
       Developmental Toxicity of All- Trans Retinoic Acid in Four Native North American Ranids and Xenopus
       laevis. Toxicological Sciences 57, 264-274.
Ankley GT, JE Tietge, GW Holcombe, DL DeFoe, SA Diamond, KM Jensen, and SJ Degitz. (2000) Effects of
       laboratory ultraviolet light and natural sunlight on survival  and development of Rana pipiens. Can. J.
       Zool. 78, 1092-1100.

-------
        Principal Investigator/Program Director (Last, First, Middle):   Dix, David J.
                                      BIOGRAPHICAL SKETCH
          Provide the following information for the key personnel and other significant contributors in the order listed on Form Page 2.
                           Follow this format for each person. DO NOT EXCEED FOUR PAGES.

NAME
David J. Dix
eRA COMMONS USER NAME
POSITION TITLE
Research Biologist
  EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION                                       DEGREE (if applicable)   YEAR(s)     FIELD OF STUDY
University of Illinois at Chicago                              B.S.                     1985        Biological Sciences
Rush University, Chicago IL                                    Ph.D.                    1990        Physiology
North Carolina State University, Raleigh NC                    Post-doc                 1990-1992   Biochemistry
National Institute of Environmental Health Sciences, RTP, NC   Post-doc                 1992-1995   Reproductive and Developmental Toxicology

A. POSITIONS and HONORS

Research and Professional Experience:
2009-present Acting Deputy Director, National Center for Computational Toxicology, U.S. EPA/ORD, RTP NC
2005-present Research Biologist, National Center for Computational Toxicology, U.S. EPA/ORD, RTP NC
2008-present  Adjunct Associate Professor, Department of Environmental Sciences and Engineering,
             University of North Carolina, Chapel Hill, NC.
2004-2005   Lead Research Biologist, Genomic Effects Team, U.S. EPA, ORD/NHEERL/RTD, RTP, NC.
2001-2007   Adjunct Assistant Professor, Department of Molecular and Environmental Toxicology, North
             Carolina State University, Raleigh, NC.
1997-1998   Adjunct Assistant Professor, Department of Biology, North Carolina Central University, Raleigh,
             NC.
1995-2004   Research Biologist, Gamete and Early Embryo Branch, Reproductive Toxicology Division,
             National Health and Environmental Effects Research Laboratory, Office of Research and
             Development, U.S. Environmental Protection Agency, RTP, NC.
1992-1995   Intramural Research Fellow, Laboratory of Reproductive and Developmental Toxicology,
             National Institute of Environmental Health Sciences, Research Triangle Park, NC (Mentor: Dr.
             E.M. Eddy).
1990-1992   Postdoctoral Research Associate, Department of Biochemistry, North Carolina State University,
             Raleigh, NC (Mentor: Dr. Elizabeth Theil).

Professional Societies and Affiliations:
1999-2006   Society for the Study of Reproduction
2001-present Society of Toxicology

Honors and Awards:
2006   EPA Superior Accomplishment Recognition Award (SARA) for creating and chairing the EPA Chemical
       Prioritization Community of Practice.
2007   SARA for fostering coordination between ToxCast and OECD Molecular Screening Project.
2007   EPA Quality Step Increase for outstanding performance.
2007   ORD Bronze Medal Award for characterizing toxicity pathways of conazoles.
2008   Promotion through Technical Qualifications Board to GS-15.
2008   ORD Scientific and Technological Achievement Award Level II for metabolomics program.

Selected invitations at National & International Symposia:
International Society Regulatory Toxicology Pharmacology annual meeting, Nov 2005, Baltimore, MD. "ORD's
   Computational Toxicology Research Program".
NAS/NRC workshop on Applications of Genomic Signatures, Dec 2005, Welches, OR. "Antifungal
   Toxicogenomics".
CARCINOGENOMICS 1st Annual EU Consortium Meeting, Nov 2007, Valencia, SPAIN. "EPA's ToxCast
   Program: Prioritizing the Toxicity Testing of Environmental Chemicals".
4th International Conference on Toxicogenomics (ICT), Korean Society of Toxicogenomics and
   Toxicoproteomics. Incheon, Korea, Nov 2008. "EPA's ToxCast Program..."
40th Annual Symposium of the Society of Toxicology of Canada, Montreal, Quebec, Dec 2008. "The U.S.
EPA's ToxCast Program..."

Selected Expert Committees/Advisory Panels/Organizing Committees:
1995-present  Mentored 9 undergraduate, 7 graduate, and 6 postdoctoral students; one EPA intern.
2005-present  Organizing Committee of Microarray Quality Control Project.
2005-present  Editorial Board, Toxicological Sciences.
2007-present  Editorial Board, Systems Biology in Reproductive Medicine.
2007-present  Scientific Advisory Board, EU project CARCINOGENOMICS.
2009-present  Scientific Advisory Board, EU project ChemScreen.

Selected Assistance/Advisory Support to the Agency:
2006-present  Contracting Officer's Representative (Project Officer) on eight ToxCast contracts.
2006-present  Developing ToxRefDB for OPP, OECD and extramural use.
2007-2008    ORD Future of Toxicology Working Group.
2009-present  Chemical prioritization effort for OPPTS.

B. SELECTED PUBLICATIONS (selected from a total of 75 peer-reviewed).
Tully DB, JC Luft, JC Rockett, H Ren, JE Schmid, CR Wood, DJ Dix (2005). Reproductive and genomic effects
   in testes from mice exposed to the water disinfectant byproduct bromochloroacetic acid. Reproductive
   Toxicology 19(3):353-366.
Bao W, JE Schmid, AK Goetz, H Ren, DJ Dix (2005). A database for tracking toxicogenomic samples and
   procedures. Reproductive Toxicology 19(3):411-419.
Ostermeier GC, RJ Goodrich, MP Diamond, DJ Dix, SA Krawetz (2005). Towards using stable spermatozoal
   RNAs for prognostic assessment of male factor fertility. Fertility and Sterility, 83:1687-94.
Barton HA, Tang J, Sey YM, Stanko JP, Murrell RN, Rockett JC,  Dix DJ (2006).  Metabolism of myclobutanil
   and triadimefon by human and rat cytochrome P450 enzymes and liver microsomes. Xenobiotica 36:793-
   806.
Denslow ND, JK Colbourne, DJ Dix, JH Freedman, CC Helbing, S Kennedy, PL Williams (2006). Selection of
   surrogate animal species for comparative toxicogenomics. In: Emerging Molecular and Computational
   Approaches for Cross-Species Extrapolations. Eds. W Benson and R Di Giulio. SETAC Press, Florida.
Dix DJ, Gallagher K, Benson WH, Groskinsky BL, McClintock JT, Dearfield KL, Farland WH (2006). A
   framework for the use of genomics data at the EPA. Nat Biotechnol 24:1108-11.
Goetz AK, Bao W, Ren H, Schmid JE, Tully DB, Wood C, Rockett JC, Narotsky MG, Sun G, Lambert GR, Thai
   SF, Wolf DC, Nesnow S, Dix DJ (2006). Gene expression profiling in the liver of CD-1  mice to characterize
   the hepatotoxicity of triazole fungicides. Toxicol Appl Pharmacol 215:274-84.
Kim SJ,  Dix DJ, Thompson KE, Murrell RN, Schmid JE, Gallagher JE, Rockett JC (2006). Gene expression in
   head hair follicles plucked from  men and women. Ann Clin Lab Sci 36:115-26.
Kim YK, Suarez J, Hu Y, McDonough PM, Boer C, Dix DJ, Dillmann WH (2006). Deletion of the inducible 70-
   kDa  heat shock protein genes in mice impairs cardiac contractile function and calcium handling associated
   with  hypertrophy. Circulation 113:2589-97.
Rockett  JC, Narotsky MG, Thompson KE, Thillainadarajah I, Blystone CR, Goetz AK, Ren H, Best DS, Murrell
   RN,  Nichols HP, Schmid JE, Wolf DC, Dix DJ (2006). Effect of conazole fungicides on reproductive
   development in the female rat. Reprod Toxicol 22:647-58.
Shi L et al. (2006). The MicroArray Quality Control (MAQC) project shows inter- and intraplatform
    reproducibility of gene expression measurements. Nat Biotechnol 24:1151-61.
Tully DB, Bao W, Goetz AK, Blystone CR, Ren H, Schmid JE, Strader LF, Wood CR, Best DS, Narotsky MG,
    Wolf DC, Rockett JC, Dix DJ (2006). Gene expression profiling in liver and testis of rats to characterize the
    toxicity of triazole fungicides. Toxicol Appl Pharmacol 215:260-73.
Cherney DP, Ekman DR, Dix DJ, Collette TW(2007). Raman spectroscopy-based metabolomics for
    differentiating exposures to triazole fungicides using rat urine. Anal Chem 79:7324-32.
Dix DJ, Houck KA, Martin MT, Richard AM, Setzer RW, Kavlock RJ (2007). The ToxCast program for
    prioritizing toxicity testing of environmental chemicals. Toxicol Sci 95:5-12.
Goetz AK, Ren H, Schmid JE, Blystone CR, Thillainadarajah I,  Best DS, Nichols HP, Strader LF, Wolf DC,
    Narotsky MG, Rockett JC, Dix DJ (2007). Disruption of testosterone homeostasis as a mode of action for
    the reproductive toxicity of triazole fungicides in the male rat. Toxicol Sci 95:227-39.
Kim SJ, Dix DJ, Thompson KE, Murrell RN, Schmid JE, Gallagher JE, Rockett JC (2007). Effects of storage,
    RNA extraction, genechip type, and donor sex on gene expression profiling of human whole blood. Clin
    Chem 53:1038-45.
Martin MT, Brennan RJ, Hu W, Ayanoglu E, Lau C, Ren H, Wood CR, Corton JC, Kavlock RJ, Dix DJ (2007).
    Toxicogenomic  study of triazole fungicides and perfluoroalkyl acids in rat livers predicts toxicity and
    categorizes chemicals based on  mechanisms of toxicity. Toxicol Sci 97:595-613.
Platts AE, Dix DJ, Chemes  HE, Thompson  KE, Goodrich R, Rockett JC, Rawe VY,  Quintana S, Diamond MP,
    Strader LF, Krawetz SA (2007). Success and failure in human  spermatogenesis as revealed by
    teratozoospermic RNAs. Hum Mol Genet 16:763-73.
Judson R, Richard A, Dix D, Houck K, Elloumi F, Martin M, Cathey T, Transue TR, Spencer R, Wolf M.
    (2008). ACToR-Aggregated Computational Toxicology Resource. Toxicol Appl  Pharmacol. 15;233(1):7-13.
Kavlock RJ, Ankley G, Blancato J, Breen M, Conolly R, Dix D, Houck K, Hubal E, Judson R, Rabinowitz J,
    Richard A, Setzer RW, Shah I, Villeneuve D, Weber E. (2008). Computational toxicology-a state of the
    science mini review. Toxicol Sci.  2008 May;103(1):14-27.
Barrier M, Dix DJ, Mirkes PE (2009). Inducible 70 kDa heat shock proteins protect embryos from teratogen-
    induced exencephaly: Analysis using Hspa1a/a1b knockout mice. Birth Defects Res A Clin Mol Teratol
    28;85(8):732-740.
Goetz AK, Dix DJ (2009). Mode of action for reproductive and hepatic toxicity inferred from a genomic study of
    triazole antifungals. Toxicol Sci. 110(2):449-62.
Goetz AK, Dix DJ (2009). Toxicogenomic effects common to triazole antifungals and conserved between rats
    and humans. Toxicol Appl Pharmacol. 238(1):80-9.
Judson R, Richard A, Dix DJ, Houck K, Martin M, Kavlock R, Dellarco V, Henry T, Holderman T, Sayre P, Tan
    S, Carpenter T,  Smith E (2009). The toxicity data landscape for environmental chemicals. Environ Health
    Perspect. 117(5):685-95.
Knudsen TB, Martin MT, Kavlock RJ, Judson RS, Dix DJ, Singh AV (2009). Profiling the activity of
    environmental chemicals in prenatal developmental toxicity studies using the U.S. EPA's ToxRefDB.
    Reprod Toxicol. 28(2):209-19.
Martin MT, Judson  RS, Reif DM, Kavlock RJ, Dix DJ (2009). Profiling chemicals based on chronic toxicity
    results from the U.S. EPA ToxRef Database. Environ Health Perspect. 117(3):392-9.
Martin MT, Mendez E, Corum DG, Judson RS, Kavlock RJ, Rotroff DM, Dix DJ (2009). Profiling the
    reproductive toxicity of chemicals from multigeneration studies in the toxicity reference database.
    Toxicol Sci. 110(1):181-90.
Goetz AK, JC Rockett,  H Ren, I Thillainadarajah, DJ Dix (in press). Inhibition of  Rat and Human
    Steroidogenesis by Triazole Antifungals. Systems Biol Repro Med. In press.
Houck KA, DJ Dix, RS Judson, RJ Kavlock, J Yang, EL Berg (2009). Profiling Bioactivity of the ToxCast
    Chemical Library Using  BioMAP  Primary Human Cell Systems. J Biomolec Screen. In press.

-------
        Principal Investigator/Program Director (Last, First, Middle):   Egeghy, Peter Paul
                                     BIOGRAPHICAL SKETCH

NAME
Peter Paul Egeghy
eRA COMMONS USER NAME
POSITION TITLE
Research Environmental Health Scientist
EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)

INSTITUTION AND LOCATION                     DEGREE    YEAR(s)   FIELD OF STUDY
University of California, Berkeley           B.A.      1990      Physical Environmental Sciences
University of California, Berkeley           M.P.H.    1993      Environmental Health
California State University, Northridge      M.S.      1994      Industrial Hygiene
University of North Carolina, Chapel Hill    Ph.D.     2001      Environmental Sciences & Engineering

A. POSITIONS and HONORS
Research and Professional Experience:
2009-Present   Research Fellow (detail), National Center for Computational Toxicology, US EPA, Research
               Triangle Park, NC
2004-Present   Research Environmental Health Scientist, National Exposure Research Laboratory, US EPA,
               Research Triangle Park, NC
2001-2004     Environmental Health Scientist (post-doctoral fellow), National Exposure Research
               Laboratory, US EPA, Las Vegas, NV
1994-1996     Assistant Environmental Health and Safety Manager, Olive View-UCLA Medical Center, Los
               Angeles, CA

Professional Societies and Affiliations:
2004-present   Member, International Society of Exposure Science (ISES)
2006-present   Member, International Association for Breath Research (IABR)
2006-2009     Member, Society for Risk Analysis (SRA)

Honors and Awards:
2009          National Exposure Research Laboratory Special Achievement Award: Technical Support to
               the Office of Pesticide Programs and the Health Effects Division
2008          USEPA Level I  Scientific and Technological Achievement Award (STAA): Identifying Important
               Sources, Pathways, and Routes of Children's Exposures to Pesticides in Their Environments
2008          USEPA Level II Scientific and Technological Achievement Award (STAA): Improved
               Understanding of Children's Exposure to Pesticides in the Residential Environment
2007          Children's Environmental Health Excellence Award: Science Achievement.
2007          National Exposure Research Laboratory Special Achievement Award: Children's Exposure
               Team Support of the Agency's Mission
2007          USEPA Superior Accomplishment Recognition Awards (Team): for contributions to the
               Moncure Air Quality Screening Study
2006          USEPA Superior Accomplishment Recognition Awards (Team): for contributions to the
               Analysis of Children's Exposure Factors Data.
1999-2001     NIEHS Biostatistics Traineeship Award

Selected Expert Committees/Advisory Panels/Organizing Committees:
2009          Organizing Committee Member, 2009 International Society of Exposure Science Conference,
               Minneapolis, MN
2008          Expert Panel Member, Novel Approaches for Assessing Exposure for School-Aged Children in
              Longitudinal Studies, US EPA STAR Grant Announcement, Washington, DC
2007-2008     Invited Author, USEPA's Scientific and Ethical  Approaches for Observational Exposure
              Studies, Research Triangle Park, NC
2007          Expert Panel, United States Army Research Institute of Environmental Medicine Workshop on
              Research Issues Related to JP8 Exposure Assessment, Natick, MA
2006-2007     Advisory Board Member, North Carolina Central University Environmental Risk and Impact in
              Communities of Color (ERICC) Initiative, Durham, NC
2005          Co-Chair, USEPA/NERL Workshop for the Analysis of Children's Exposure Measurements
              Data, Research Triangle Park, NC
2004          Expert Panel, State of EPA Mercury Science Teleconference Workshop Series
2003          Committee  Member, USEPA HEASD Authorship Guidance Council, RTP, NC
2002-2004     Expert Panel, USEPA/CDC NHANES Biomarker Interpretation Committee
2001          Section Lead, Biological Monitoring in Exposure Assessment, ISEA Exposure Assessment
              and Epidemiology Workshop, Charleston, SC

B. SELECTED PUBLICATIONS

Stout DM 2nd, Bradham KD, Egeghy PP, Jones PA, Croghan CW, Ashley PA, Pinzer E, Friedman W,
   Brinkman MC, Nishioka MG, Cox DC. (2009) American Healthy Homes Survey: a national study of
   residential pesticides measured from floor wipes. Environ Sci Technol. 43(12):4294-300.
Cohen Hubal E, Nishioka M, Ivancic W,  Morara M, Egeghy P. (2008) Comparing surface residue transfer
   efficiencies to hands using polar and non-polar fluorescent tracers. Environ Sci Technol. 42(3):934-9.
Lin YS, Egeghy PP, Rappaport SM.  (2008)  Relationships between levels of volatile organic compounds in air
   and blood from the general population. J Expo Sci Environ Epidemiol. 18(4):421-9.
Morgan MK, Sheldon LS, Thomas KW, Egeghy PP, Croghan CW, Jones PA, Chuang JC, Wilson NK. (2008)
   Adult and children's exposure to 2,4-D from multiple sources and pathways. J Expo Sci Environ Epidemiol.
   18(5):486-94.
Tulve NS, Egeghy PP, Fortmann RC, Whitaker DA, Nishioka MG, Naeher LP, Hilliard A. (2008)  Multimedia
   measurements and activity patterns  in an observational pilot study of nine young children. J  Expo Sci
   Environ Epidemiol. 18(1):31-44.
Egeghy PP, Sheldon LS, Fortmann RC, Stout II DM, Tulve NS,  Cohen Hubal EA, Melnyk LJ, Morgan MK,
   Jones PA, Whitaker DA,  Croghan CW, Coan A. (2007) Important Exposure Factors for Children: An
   Analysis of Laboratory and Observational Field Data Characterizing Cumulative Exposure to Pesticides.
   U.S.  Environmental Protection Agency,  Washington, DC, EPA/600/R-07/013.
Nakayama S, Strynar MJ, Helfant L, Egeghy P, Ye X, Lindstrom AB. (2007) Perfluorinated compounds in the
   Cape Fear Drainage Basin in North Carolina. Environ Sci Technol. 41(15):5271-6.
Kim D, Andersen ME, Chao  YC, Egeghy PP, Rappaport SM, Nylander-French LA. (2007) PBTK modeling
   demonstrates contribution of dermal and inhalation exposure components to  end-exhaled breath
   concentrations of naphthalene. Environ  Health Perspect. 115(6):894-901.
Kirrane E, Loomis D, Egeghy P, Nylander-French L. (2007) Personal exposure to benzene from fuel emissions
   among commercial fishers: comparison of two-stroke, four-stroke and diesel engines. J Expo Sci  Environ
   Epidemiol. 17(2):151-8.
Tulve NS, Driver J, Egeghy PP, Evans J, Fortmann RC,  Kissel JC, McMillan N, Melnyk LJ, Morgan MK, Starr
   JM, Stout DM, Strynar MJ. (2006) The U.S. EPA National Exposure Research Laboratory's (NERL's)
   Workshop on the Analysis of Children's Measurement Data. US EPA, Washington, DC,  EPA/600/R-06/026.
Chao YC, Kupper LL, Serdar B, Egeghy PP, Rappaport SM, Nylander-French LA. (2006) Dermal exposure to
   jet fuel JP-8 significantly contributes to the production of urinary naphthols in  fuel-cell maintenance
   workers. Environ  Health  Perspect. 114(2):182-5.
Cohen Hubal EA, Egeghy PP, Leovic KW, Akland GG. (2006) Measuring potential dermal transfer of a
   pesticide to children in a child care center. Environ Health Perspect. 114(2):264-9.
Egeghy PP, Quackenboss JJ, Catlin S, Ryan PB. (2005) Determinants of temporal variability in  NHEXAS-
   Maryland environmental  concentrations, exposures, and biomarkers. J Expo  Anal Environ Epidemiol.
   15(5):388-97.	
Serdar B, Egeghy PP, Gibson R, Rappaport SM. (2004) Dose-dependent production of urinary naphthols
   among workers exposed to jet fuel (JP-8). Am J Ind Med. 46(3):234-44.
Egeghy PP, Hauf-Cabalo L, Gibson R, Rappaport SM. (2003) Benzene and naphthalene in air and breath as
   indicators of exposure to jet fuel. Occup Environ Med. 60(12):969-76.
Serdar B, Egeghy PP, Waidyanatha S, Gibson R, Rappaport SM. (2003) Urinary biomarkers of exposure to jet
   fuel (JP-8). Environ Health Perspect. 111(14):1760-4.
Rhodes AG, LeMasters GK, Lockey JE, Smith JW, Yiin JH, Egeghy P, Gibson R. (2003) The effects of jet fuel
   on immune cells of fuel system maintenance workers. J Occup Environ Med. 45(1):79-86.
Egeghy PP, Nylander-French L, Gwin KK,  Hertz-Picciotto I, Rappaport SM. (2002) Self-collected breath
   sampling for monitoring low-level benzene exposures  among automobile mechanics. Ann Occup Hyg.
   46(5):489-500.
Egeghy PP, Tornero-Velez R,  Rappaport SM. (2000) Environmental and biological monitoring of benzene
   during self-service automobile refueling. Environ Health Perspect. 108(12):1195-202.

C. SELECTED PRESENTATIONS

Egeghy PP. Exposure and EPA's National Exposure Research Laboratory. International Council of Chemical
   Associations (ICCA) Long-Range Research Initiative (LRI) Workshop. Charleston, SC. June 2009 (oral
   presentation)
Egeghy PP, Tulve NS, Adetona O, Naeher LP. Using Pesticide Screening Questions to Identify the More
   Highly Exposed Participants in a Larger Cohort. International Society for Environmental Epidemiology
   (ISEE) and  International Society of Exposure Analysis (ISEA) Joint Annual Conference, Pasadena,
   California, October, 2008 (oral presentation)
Egeghy PP, Quackenboss JJ.  Within- and  Between-Person Variation in Environmental Concentrations of
   Metals, PAHs, and Pesticides Measured in NHEXAS-Maryland. Conference Internationale d'Epidemiologie
   et d'Exposition Environnementales, Paris, September 2006 (poster presentation)
Egeghy PP, Stout II DM, Tornero-Velez R, Furtaw Jr EJ. Prediction of Airborne Pesticide Distributional
   Parameters by Physiochemical Properties. Conference Internationale d'Epidemiologie et d'Exposition
   Environnementales, Paris, September 2006 (poster presentation)
Egeghy PP, Thomas KW. Estimates of Age-Specific Urinary Excretion Rates for Creatinine among Children.
   Conference Internationale  d'Epidemiologie et d'Exposition Environnementales, Paris, September 2006
   (oral presentation)
Egeghy PP, Tulve N, Stout II DM, Morgan M, Melnyk L, Cohen Hubal E, Fortmann R, Sheldon L. Identifying
   Important Factors Influencing Children's Exposures to Pesticides. EPA Science Forum, May 2006 (poster
   presentation)
Egeghy PP, Morgan MK, Croghan CW, Sheldon LS. Reliability of Biomarkers of Pesticide Exposure among
   Children and Adults in CTEPP-Ohio. International Society of Exposure Analysis (ISEA) 15th Annual
   Conference, Tucson, AZ, October 2005 (oral presentation)
Egeghy PP, Catlin S, Ryan PB, Quackenboss JJ. Determinants of Residential Lead  Exposure. International
   Society of Exposure Analysis, Stresa, Italy, September, 2003 (oral presentation)
Egeghy PP, Quackenboss JJ,  Ozkaynak AH, Ryan PB. Alternative Exposure Measurements to Improve
   Epidemiological Study Designs:  Determinants of Temporal Variability in Environmental Concentrations and
   Biomarkers. National Children's  Study Meetings, Baltimore, MD, December, 2002 (poster presentation)
Egeghy PP. Mixed Models Analysis of Urbanization Level on Chlorpyrifos Exposure. Presented at International
   Society of Exposure Analysis 2002 Conference, Vancouver, Canada, August, 2002  (poster presentation)

D. WORKSHOPS

Connecting Innovations in Biological Exposure and Risk Sciences: Better Information for Better Decisions.
   International Council of Chemical Associations (ICCA) Long-Range Research Initiative (LRI), Charleston,
   SC. June 2009
Research Approaches to Assessing Public Health Impacts of Risk Management Decisions Workshop, US
   EPA,  Research Triangle Park, NC, January 2008
Biomonitoring Workshop, International Life Sciences Institute (ILSI) Health and Environmental Sciences
   Institute (HESI), Research Triangle Park, NC, September 2004
Exposure Research Needs for Addressing Cumulative Risk, US EPA, Research Triangle Park, NC, February
   2004
Human Exposure and Dose Modeling Workshop, US EPA, Research Triangle Park, NC, October 2002
Exposure Assessment and Epidemiology Workshop, International Society of Exposure Analysis, Charleston,
   SC, November 2001

E. CONTINUING EDUCATION

"Facilitative Leadership Seminar," Western Management Development Center, Denver, CO, March 2007
"Exposure Assessment for Environmental Chemicals using Biomonitoring," Conference Internationale
   d'Epidemiologie et d'Exposition Environnementales,  Paris, September 2006
"Team Building Team Leadership Seminar," Western Management Development Center, Denver, CO, August
   2003
"Igniting Leadership at All Levels," US EPA Office of Research and Development Leadership Summit,
   Baltimore, MD, January 2003

-------
        Principal Investigator/Program Director (Last, First, Middle):    Gallagher, Jane E.
                                     BIOGRAPHICAL SKETCH
          Provide the following information for the key personnel and other significant contributors in the order listed on Form Page 2.
                           Follow this format for each person. DO NOT EXCEED FOUR PAGES.
  NAME
  Jane E. Gallagher
  eRA COMMONS USER NAME
    POSITION TITLE
    Research Scientist
  EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)

INSTITUTION AND LOCATION                             DEGREE    YEAR(s)   FIELD OF STUDY
State University of New York at Potsdam              B.S.      1976      Chemistry
Purdue University, West Lafayette, IN                M.S.      1981      Civil Engineer
University of Utah at Salt Lake City                 M.S.      1982      Chemical Engineer
University of North Carolina at Chapel Hill, NC      Ph.D.     1986      Environmental Toxicology

A. POSITIONS and HONORS

Research and Professional Experience:
1987-Present Health Research Scientist, Epidemiology Biomarker Branch, Human Studies Division, NHEERL,
             USEPA, RTP, NC
1999-Present Adjunct Asst. Professor, UNC School of Public Health, Environmental Sciences and
             Engineering
1986-1987   National Research Council (NRC) postdoctoral fellow, Genetic Toxicology Division, US
             EPA, Research Triangle Park, NC
1982-1986   Biologist, Laboratory of Pulmonary Pathobiology, National Institute of Environmental Health
             Sciences (NIEHS), RTP, NC
1981-1982   NIOSH traineeship, Rocky Mountain Center for Occupational Medicine, Salt Lake City, Utah

Selected Awards and Honors (recent):
Civil Engineering Alumni Achievement Award, Purdue University, 2006
Science and Technology Achievement Award (3), most recently "for contributions leading to a better
understanding of PM health effects using animal and complementary in vitro human cell test systems," 2005.

Invited Lectures/Symposia (recent):
Mechanistic Indicators of Childhood Asthma (MICA) Study, 17th Annual Conference of the International Society
      of Exposure Analysis, Durham/Research Triangle Park, NC. October 2007.
Mechanistic Indicators of Childhood Asthma (MICA): Integrating Environmental, Clinical and Susceptibility
      Markers to Improve the Impact of Human Air Pollution Studies. Public Health Implications of
      Biomonitoring, US EPA, September 24-25, 2007.
"Mechanistic Indicators of Childhood Asthma: A Computational Toxicology Study," US/Canadian Research
      Studies Workshop, Detroit, Michigan, October 21, 2005.
Integrating Biomonitoring Data into Risk Assessment, 31st Annual Meeting of The Toxicology Forum at the
      Given Institute, Aspen, Colorado, July 10-14, 2005.
Office of Children's Health Protection, Washington, DC, "Mechanistic Indicators of Childhood Asthma," 2005.
Society of Toxicology Annual Meeting, National Children's Study symposium, "Validation of Non-invasive
      Biological Sources for Application in Environmental Epidemiology Studies," Baltimore, MD, 2004.
Environmental Carcinogenesis Division, "Application of Biomarkers of Arsenic Exposure/Effect in Fallon,
      Nevada," Research Triangle Park, NC, 2005.
Office of Research and Development Annual National Children's Study progress meeting, "Non-invasive
      Samples for Gene/Environmental Studies," Research Triangle Park, NC, 2002, 2003.

Assistance/leadership provided to the scientific community:
  •   Editorial Board: Mutation Research 2000 - present.
  •   Journal Reviewer: Tox. Appl. Pharm., Tox. Sci., Inhal. Tox., Toxicol., Environ. Health, EHP.
  •   Adjunct Asst. Professor Environmental Sciences and Engineering, UNC Chapel Hill 2002-present.
  •   Research Advisor: Scott Rhoney, Master's student, School of Public Health, UNC Chapel Hill, NC
  •   Co-Advisor mentor for a community based air toxics emissions project: NC State
  •   Reviewer Biomarkers RFAs for National Center for Environmental Research
  •   Reviewer for World Health Organization NO2-PAH Health Effects document, 2003.

  Assistance/leadership provided to the agency:
  •   Development of a shared Access database for the integration of multifactorial environmental health data for
     shared NERL, NHEERL, and NCCT users, 2008.
  •   Panel member: "Integrating Biomonitoring Data into Risk Assessment," 31st Annual Meeting of The
     Toxicology Forum at the Given Institute, Aspen, Colorado, July 10-14, 2005.
  •   Topic Lead: Office of Research and Development (ORD) Human Health Risk Review- Prepared Board of
     Scientific review materials and coordinated cross-divisional session on "Oxidative Stress and Human
     Health"- Research Triangle Park, NC Feb 2005.
  •   Briefing for EPA's Office of Children's Health Protection: "Mechanistic Indicators of Childhood
     Asthma," Washington, DC, 2005.
  •   Principal Investigator: Successful application for an Office of Research and Development NCCT
     competitive grant proposal (0.9 million) involving investigators from National Exposure Research Lab
     Cincinnati and EPA's National Human Exposure and Effects Lab, 2005.
  •   Assistance to Office of Science Policy, US EPA- Assisted Office of Science Policy in the identification of
     EPA ongoing asthma research for EPA's Computer Retrieval of Information on Scientific Projects
     Initiative- 2004.

  B. PUBLICATIONS: (2007-2009)
  Kim SJ, Dix DJ, Thompson KE, Murrell RN, Schmid JE, Gallagher JE, Rockett JC.
  Effects of storage, RNA extraction, genechip type, and donor sex on gene expression profiling of human whole
  blood. Clin Chem. Jun;53(6):1038-45. (2007)

  Vesper S, McKinstry C, Haugland R, Neas L, Hudgens E, Heidenfelder B, and Gallagher J.
  Environmental Relative Moldiness Index (ERMIsm) as a Tool to Identify Mold Related Risk Factors for
  Childhood Asthma. Sci Total Environ. May 1;394(1):192-6 (2008)

  Johnson M, Hudgens E, Williams R, Andrews G, Neas L, Gallagher J, Ozkaynak H.  "A Participant-Based
  Approach to Indoor/Outdoor Air Monitoring in Community Health Studies" Journal of Exposure Science and
  Environmental Epidemiology. (2008), 1-10 (2008).

  Cohen Hubal E, Richard A, Shah I, Edwards S, Gallagher J, Kavlock R, Blancato J. Exposure Science and
  the US EPA National Center for Computational Toxicology. J Expo Sci Environ Epidemiol. November (2008).

  Heidenfelder B, Reif D, Harkema JR, Cohen Hubal E, Hudgens E, Bramble L, Wagner G,
  Morishita M, Keeler G, Edwards SW, and Gallagher J. Comparative Microarray Analysis and Pulmonary
  Changes in Brown Norway Rats Exposed to Ovalbumin and Concentrated Air Particulates. Tox Sci. Volume
  108, March 2 (2009)

Heidenfelder B, Johnson M, Hudgens E, Inmon J, Hamilton R, Neas L, and Gallagher J. Increased plasma
reactive oxidant levels and their relationship to blood cells, total IgE, and allergen-specific IgE in asthmatic
children. Journal of Asthma, accepted (2009)

Williams AH, Gallagher JE, Hudgens E, Johnson MM, Mukerjee S, Ozkaynak H, Neas LN. EPA
Observational studies of children's respiratory health in Detroit and Dearborn, Michigan. Proceedings of
AWMA 102nd, June 16-19, Detroit, Michigan (2009).

J. E. Gallagher, E. A. Cohen Hubal, S. W. Edwards. Invited book chapter "Biomarkers of Environmental
Exposure," in "Biomarkers of Toxicity: A New Era in Medicine." Editors: Vishal S. Vaidya and Joseph V. Bonventre.
Publisher: John Wiley and Sons, Inc. October 1 (2009)

Markey M. Johnson, Ron Williams, Zhihua Fan, Lin, Edward Hudgens, Jane Gallagher, Alan Vette, Lucas
Neas, Haluk Ozkaynak. Indoor and outdoor concentrations of nitrogen dioxide, volatile organic compounds, and
polycyclic aromatic hydrocarbons among MICA-Air households in Detroit, Michigan. Submitted, AWMA
(2009)

Gallagher J, Reif D, Heidenfelder B, Neas L, Hudgens E, Williams A, Inmon J, Rhoney S, Andrews G,
Johnson M, Ozkaynak H, Edwards S, Cohen Hubal E. Mechanistic Indicators of Childhood Asthma (MICA):
A systems biology approach for the integration of multifactorial environmental health data. Submitted: Journal
of Exposure Science and Environmental Epidemiology (2009)

In preparation

David M. Reif, Jane E. Gallagher, Brooke L. Heidenfelder, Ed E. Hudgens, Wendell Jones, ClarLynda
Williams-DeVane, Lucas M. Neas, Elaine A. Cohen Hubal, Stephen W. Edwards. Elucidating Asthma
Phenotypes via Integrated Analysis of Blood Gene Expression Data with Demographic and Clinical Information.
(Nature Genetics) 2009

David M. Reif*, ClarLynda Williams-DeVane*, Elaine A. Cohen Hubal, Wendell Jones, Ed E. Hudgens,
Brooke L. Heidenfelder, Lucas M. Neas, Jane E. Gallagher, Stephen W. Edwards.
* Authors contributed equally. Systems Modeling of Gene Expression, Demographic and Clinical Data to
Determine Disease Endotypes. PLOS Comp Bio 2009

-------
        Principal Investigator/Program Director (Last, First, Middle):    Houck, Keith A.
                                     BIOGRAPHICAL SKETCH
          Provide the following information for the key personnel and other significant contributors in the order listed on Form Page 2.
                           Follow this format for each person. DO NOT EXCEED FOUR PAGES.
NAME
Keith A. Houck
eRA COMMONS USER NAME
khouck
POSITION TITLE
Toxicologist
  EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)

  INSTITUTION AND LOCATION                     DEGREE     YEAR(s)     FIELD OF STUDY
  Guilford College, Greensboro, NC             B.S.       1980        Biology
  University of North Carolina, Chapel Hill    M.S.       1982        Chemistry
  Duke University, Durham, NC                  Ph.D.      1989        Pathology/Toxicology
  Genentech, Inc.                              Post-doc   1989-1992   Molecular and Cellular Biology

A. POSITIONS and HONORS
Research and Professional Experience:
2006-present Toxicologist, National Center for Computational Toxicology, USEPA, NC
2006-present Joint Research Committee Member/Consultant, Cystic Fibrosis Foundation Therapeutics
2005-Present Lead Generation Team/Consultant, NINDS Spinal Muscular Atrophy Project
2005         Independent Consultant, Dept. Cell and Tissues Engineering, Becton-Dickinson, RTP, NC
1994-2004   Research Advisor, Lilly Research Laboratories, Eli Lilly & Co., RTP, NC.
1992-1994   Senior Biologist, Sphinx Pharmaceuticals, RTP, NC.
1989-1992   Postdoctoral Fellow, Molecular Biology Department, Genentech, Inc.
1982-1985   Senior Research Analyst, Pathology Department, Duke University

Professional Societies and Affiliations:
1992-present American Association for the Advancement of Science
2001-present Society of Biomolecular Sciences (Member: Conference Committee)
2007-present Society of Toxicology (Specialty  Section: Nanotoxicology)

Honors and Awards:
2000  Changing the World Award, Eli Lilly & Co.
2003  Best Paper Award, Society of Biomolecular Sciences

Selected invitations at National & International Symposia:
U.S. EPA ToxCast Data Analysis Summit. RTP, NC, May 14-15, 2009,  "Characteristics of the ToxCast In Vitro
    Datasets from Biochemical and Cellular Assays".
15th Annual Conference of the Society of Biomolecular Sciences, Lille,  France, April 26-30, 2009, "Use of
    Primary Human Cell Systems for Creating Predictive Toxicology Profiles".
 GlobalChem Conference 2009, Baltimore, MD, April 8, 2009, "Evaluation of the ToxCast Suite of Cellular and
    Molecular Assays for Prediction of In Vivo Toxicity".
The Committee on Toxicity of Chemicals in Food, Consumer Products and the Environment (COT) 'toxicology
    for the 21st century', Manor Hotel, Meriden, UK, February 11, 2009, "The US EPA's ToxCast Program for
    the Prioritization and Prediction of Environmental Chemical Toxicity".
Predictive Human Toxicity and ADME/TOX Studies, Brussels, Belgium, January 20-21, 2009, "Prediction of
    Toxicity Potential using In Vitro Screening Assays from the EPA's ToxCast Program".
CASCADE Workshop on Synergies from Global Chemical Screening Programs (US-EPA/OECD) Oct. 1, 2007,
    "EPA's ToxCast™ Program for Predicting Hazard and Prioritizing Toxicity Testing of Environmental
    Chemicals".
13th Annual Conference of the Society of Biomolecular Screening, Montreal, Canada, April 15-19 2007,
   "Toxicity Profiling Using High-Throughput and High-Content Technologies".
Fall Symposium of the National Capital Area Chapter Society of Toxicology, Bethesda, MD, Dec. 11, 2006,
   "Environmental chemical hazard prediction by high-throughput screening and genomics approaches in the
   ToxCast program of the US Environmental Protection Agency".

Selected Expert Committees/Advisory Panels/Organizing Committees:
2010         Co-chair, Society of Biomolecular Sciences Regional  Meeting,  RTP, NC.
2009         ECETOC Workshop on Guidance on Identifying Endocrine Disrupting Effects, Barcelona, Spain.
2009         NIH Nanomaterial Grand Opportunity Grant Review Panel
2009         NIH ARRA Challenge Grant Review
2008-present  Clinical Chemistry and Clinical Toxicology Devices Panel of the Medical Devices Advisory
             Committee, Center for Devices and Radiological Health, Food and Drug Administration
2008         NIH Roadmap RFA on Assay Development for High Throughput Molecular Screening Grant
             Review Panel
2007         Co-chair, symposium, "Toxicity Profiling Using High-Throughput and High-Content
             Technologies" 13th Annual Conference of the Society of Biomolecular Screening,  Montreal,
             Canada, April 15-19 2007.
2007-present  Lecturer, North Carolina Central University, The Brite Center, Dept. of Pharmaceutical Sciences
2006         NIH Roadmap RFA on Assay Development for High Throughput Molecular Screening Grant
             Review Panel

Selected Assistance/Advisory Support to the Agency:
2006-present  TOCOR, contracts for ToxCast.
2002-2003    Genomics Coordinator, NHEERL Genomics and Proteomics Committee.
2002-2003    Principal Investigator, CRADA with Duke University to produce DNA microarrays for NHEERL.
2003-2006    Project Officer,  GeneChip processing contract for ORD/NHEERL.
2004-2005    Principal Investigator, CRADA with Affymetrix to develop in vitro toxicogenomics.
2004-2005    Project Officer,  contract for toxicogenomics with Iconix Pharmaceuticals for ORD.
2004-present  Co-Chair, Genomics Task Force Data Analysis Workgroup of EPA Science Policy Council.
2006-present  Contracting Officer's  Representative (Project Officer)  on eight ToxCast contracts.
2003-present  Consulting with Office of Pesticide Programs on conazole reproductive toxicity.
2006-present  Developing ToxRefDB for OPP use
2007         ORD Future of Toxicology Working Group

B. SELECTED PUBLICATIONS (selected from a total of 64 peer-reviewed).

Ostermeier GC, DJ Dix, D Miller, P Khatri, SA Krawetz. Spermatozoal RNA profiles of normal fertile men.
   Lancet, 360, 772-777, 2002.
Rockett JC, RJ Kavlock, C Lambright, L Parks, JE Schmid, VS Wilson, C Wood, DJ Dix. DNA arrays to monitor
   gene expression in  rat blood and uterus following 17-beta-estradiol exposure - biomonitoring
   environmental effects using surrogate tissues. Toxicological Sciences, 69, 49-59, 2002.
Richburg, JH,  Johnson, K, Schoenfeld, HA,  Meistrich, ML and Dix, DJ (2002). Defining the cellular and
   molecular  mechanisms of toxicant action in the testis. Toxicology Letters 135: 167-183.
Hampton CR,  A Shimamoto, CL Rothnie, J Griscavage-Ennis, A Chong, DJ Dix, ED Verrier, TH Pohlman
   (2003). HSP70.1 and -70.3 are required for late-phase protection induced by ischemic preconditioning of
   mouse hearts. Am J Physiol Heart Circ Physiol 285: H866-H874.
Schmitt E, A Parcellier, S Gurbuxani, C Cande, A Hammann, M Celia Morales, CR Hunt, DJ Dix, RT Kroemer,
   F Giordanetto,  M Jaattela, JM Penninger, A Pance, G Kroemer,  C Garrido (2003). Chemosensitization by
   a non-apoptogenic heat shock protein 70-binding apoptosis inducing factor mutant. Cancer Research,
   63(23): 8233-
Hunt CR, DJ Dix, GG Sharma, RK Pandita,  A Gupta,  M Funk, TK Pandita (2004). Genomic instability and
   enhanced  radiosensitivity in Hsp70.1/3-deficient mice. Molecular and Cellular Biology, 24(2):899-911.


-------

Rockett JC, ME Burczynski, AJ Fornace Jr, PC Herrmann, SA Krawetz, DJ Dix (2004). Surrogate tissue
   analysis: monitoring toxicant exposure and health status of inaccessible tissues through the analysis of
   accessible tissues and cells. Toxicology and Applied Pharmacology 194:189-199.
Rockett JC, P Patrizio, JE Schmid, NB Hecht, DJ  Dix (2004). Gene expression patterns associated with
   infertility in humans and rodent models. Mutation Research,  549:225-240.
Tully DB, JC Luft, JC Rockett, H Ren,  JE Schmid, CR Wood, DJ Dix (2005). Reproductive and genomic effects
   in testes from mice exposed to the water disinfectant byproduct bromochloroacetic acid. Reproductive
   Toxicology  19(3):353-366.
Bao W, JE Schmid, AK Goetz, H Ren, DJ Dix (2005). A database for tracking toxicogenomic samples and
   procedures. Reproductive Toxicology 19(3):411-419.
Ostermeier GC, RJ Goodrich, MP Diamond, DJ Dix, SA Krawetz (2005). Towards using stable spermatozoal
   RNAs for prognostic assessment of male factor fertility. Fertility and Sterility, 83:1687-94.
Barton  HA,  Tang J, Sey YM, Stanko JP, Murrell RN, Rockett JC, Dix DJ (2006). Metabolism of myclobutanil
   and triadimefon by human and rat cytochrome P450 enzymes and liver microsomes. Xenobiotica 36:793-
   806.
Denslow ND, JK Colbourne, DJ Dix, JH Freedman, CC Helbing, S Kennedy, PL Williams (2006). Selection of
   surrogate animal species for comparative toxicogenomics. In: Emerging Molecular and Computational
   Approaches for Cross-Species Extrapolations. Eds. W Benson and R Di Giulio. SETAC Press, Florida.
Dix DJ, Gallagher K, Benson WH, Groskinsky BL, McClintock JT, Dearfield KL, Farland WH (2006). A
   framework for the  use of genomics data at the EPA. Nat Biotechnol 24:1108-11.
Goetz AK, Bao W, Ren H, Schmid JE, Tully DB, Wood C, Rockett JC, Narotsky MG, Sun G, Lambert GR, Thai
   SF, Wolf DC, Nesnow S, Dix DJ (2006). Gene expression profiling in the liver of CD-1  mice to characterize
   the  hepatotoxicity  of triazole fungicides. Toxicol Appl Pharmacol 215:274-84.
Kim SJ, Dix DJ, Thompson KE, Murrell RN, Schmid JE, Gallagher JE,  Rockett JC (2006).  Gene expression in
   head hair follicles  plucked from men and women. Ann Clin Lab Sci 36:115-26.
Kim YK, Suarez J, Hu Y, McDonough  PM, Boer C, Dix DJ, Dillmann WH (2006). Deletion of the inducible 70-
   kDa heat shock protein genes in mice impairs cardiac contractile function and calcium handling associated
   with hypertrophy. Circulation 113:2589-97.
Rockett JC, Narotsky  MG, Thompson  KE, Thillainadarajah I, Blystone CR, Goetz AK, Ren H,  Best DS, Murrell
   RN, Nichols HP, Schmid JE, Wolf  DC, Dix DJ  (2006).  Effect of conazole fungicides on reproductive
   development in the female rat. Reprod Toxicol 22:647-58.
Shi L et al. (2006). The MicroArray Quality Control (MAQC) project shows inter- and intraplatform
   reproducibility of gene expression  measurements. Nat Biotechnol 24:1151-61.
Tully DB, Bao W, Goetz AK,  Blystone  CR,  Ren H, Schmid JE, Strader LF, Wood CR, Best DS, Narotsky MG,
   Wolf DC, Rockett JC,  Dix DJ (2006). Gene expression profiling in liver and testis of rats to characterize the
   toxicity of triazole fungicides. Toxicol Appl Pharmacol  215:260-73.
Cherney DP, Ekman DR, Dix DJ, Collette TW (2007). Raman spectroscopy-based metabolomics for
   differentiating exposures to triazole fungicides using rat urine. Anal Chem 79:7324-32.
Dix DJ, Houck KA, Martin MT, Richard AM, Setzer RW, Kavlock RJ (2007). The ToxCast program for
   prioritizing toxicity  testing of environmental chemicals. Toxicol Sci 95:5-12.
Goetz AK, Ren H, Schmid JE, Blystone CR, Thillainadarajah I, Best DS,  Nichols HP, Strader  LF, Wolf DC,
   Narotsky MG, Rockett JC, Dix DJ (2007). Disruption of testosterone homeostasis as a mode of action for
   the  reproductive toxicity of triazole fungicides in the male rat. Toxicol Sci 95:227-39.
Kim SJ, Dix DJ, Thompson KE, Murrell RN, Schmid JE, Gallagher JE, Rockett JC (2007). Effects of storage,
   RNA extraction, genechip type, and donor sex on gene expression profiling of human whole blood. Clin
   Chem 53:1038-45.
Martin MT, Brennan RJ, Hu W, Ayanoglu E, Lau C, Ren H, Wood CR, Corton JC, Kavlock RJ, Dix DJ (2007).
   Toxicogenomic study  of triazole fungicides and perfluoroalkyl acids in rat livers predicts toxicity and
   categorizes chemicals based on mechanisms of toxicity. Toxicol Sci 97:595-613.
Platts AE, Dix DJ, Chemes HE, Thompson KE, Goodrich  R, Rockett JC,  Rawe VY, Quintana  S, Diamond MP,
   Strader LF, Krawetz SA (2007). Success and failure in human spermatogenesis as revealed by
   teratozoospermic RNAs.  Hum Mol Genet 16:763-73.

-------
        Principal Investigator/Program Director (Last, First, Middle): Hunter, Edward S.
                                     BIOGRAPHICAL SKETCH

NAME: Edward S. Hunter, III (Sid)
eRA COMMONS USER NAME:
POSITION TITLE: Research Toxicologist

EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION                       DEGREE (if applicable)   YEAR(s)   FIELD OF STUDY
Hampden-Sydney College, Virginia               B.S.                     1975-80   Chemistry
Old Dominion University, Virginia              M.S.                     1980-83   Toxicology
University of North Carolina at Chapel Hill    Ph.D.                    1982-86   Anatomy/Embryology

A. POSITIONS and HONORS

Research and Professional Experience:
May 2009 -    Acting Chief, Systems Biology Branch, ISTD, NHEERL, ORD, US EPA, RTP, NC
May 2009 -    Research Toxicologist, Systems Biology Branch, ISTD, NHEERL, ORD, US EPA, RTP, NC
2008 - 2009:  Acting Director, Reproductive Toxicology Division, NHEERL, ORD, US EPA, RTP, NC
2005 - 2007:  Acting Chief, Gamete and Early Embryo Biology Branch, RTD, NHEERL, US EPA, RTP, NC
1993 - 2009:  Research Toxicologist, Developmental Biology Branch, RTD, NHEERL, US EPA, RTP, NC
1990 - 1993:  Toxicologist, Developmental and Reproductive Toxicology Group, NTP, NIEHS, RTP, NC
1987 - 1990:  Research Assistant Professor, Department of Cell Biology and Anatomy, UNC-Chapel Hill, NC
1986 - 1987:  Research Associate, Department of Anatomy, UNC-Chapel Hill, NC
Adjunct Associate Professor, Curriculum in Toxicology, UNC-Chapel Hill, NC

Professional Societies and Affiliations:
Memberships: Teratology Society
Editorial Boards: Section editor, Teratology, 2000-2002

Honors and Awards:
1999 - Clark Fraser Young Investigator Award, Teratology Society; 2003 - NHEERL Award: Development and
implementation of genomics, proteomics and bioinformatics capabilities within NHEERL. US EPA, ORD,
NHEERL

Selected invitations at National & International Symposia:
Developmental and Reproductive effects of drinking water contaminants. Gordon Research Conference.
Disinfection Byproducts. Mount Holyoke College, August, 2009; Effects of a drinking water concentrate on
Reproductive Development in a rat Bioassay. American Water Works Association Meeting, San Diego, CA,
June, 2009; Reproductive Development in a Multi-Generational Rat Bioassay of Drinking Water Concentrates
in the Four Lab Study, Toxicology and Risk Assessment Conference, Dayton, OH, Apr 2008; Embryonic
development: Gastrulation. Continuing Education Course. Society of Toxicology, Seattle, WA, Mar 2008;
Understanding Pathways of Toxicity: Making  sense of changing signals. Symposium Lecture. Teratology
Society, Vancouver, BC, June 2004; Long-Range Research Initiative Annual Science Meeting. American
Chemistry Council, Miami Florida, May, 2004; Toxic Damage to Developmental Signals, Spring Symposium,
Integrated Toxicology Program, Duke University, Feb 2003; Drinking Water and Reproduction Symposium,
Society of Toxicology, Nashville, April, 2002; Departmental Seminar, Department of Toxicology, North
Carolina State University, Raleigh, NC, November, 2001; International Conference on Genes and Gene

-------

Delivery for Diseases of Alcoholism, Alcohol Research Center, UNC at Chapel Hill, North Carolina, May,
2001; The role of apoptosis in developmental neurotoxicity and neurodegeneration in adults, Society of
Toxicology Symposium, San Francisco, CA, March, 2001; Early Craniofacial Development: Life among the
Signals, Triangle Consortium for Reproductive Biology, Research Triangle Park, January, 2001.

Selected Expert Committees/Advisory Panels/Organizing Committees:

Implementation planning for ToxTesting in the 21st Century - NHEERL, US EPA (2009-); Chair - NHEERL
Toxicogenomics Core Advisory Group (2008-); Revisioning the Health Divisions of NHEERL, US EPA
(2006-7); Coordinator, NHEERL Health Division Proteomics Users Group (2004-5); In Vitro Assays for
Developmental Toxicity Assessments Planning Committee. ILSI HESI Developmental and Reproductive
Toxicology Technical Committee. June 2005; Long-Range Research Initiatives. American Chemistry Council.
May, 2004; Science Policy Council Technical Framework on Genomics for EPA, Performance-Based Quality
Assurance Workgroup (2004-5); Statistically-based structure-activity relationship (SAR) Systems for
Developmental Toxicity: Limitations and Challenges. ILSI Risk Science Institute, Sept. 2003; NHEERL
Genomics and Proteomics Committee (2001-6, Chair 2006).
Symposium Organization: Using Embryonic Stem Cells for Developmental Toxicity. In Vitro Assays for
Developmental Toxicity Assessments. ILSI HESI Developmental and Reproductive Toxicology Technical
Committee. February 2007. Co-Organizer with Philippe VanParys; Genomics and Proteomics in Reproductive
and Developmental Toxicity. Society of Toxicology, Salt Lake City, UT, March 2003. Co-Organizer with Kim
Treinen; Drinking Water and Reproduction, Symposium Teratology Society, Florida, June, 2000.


B. SELECTED PUBLICATIONS.
Johnson CS, Zucker RM, Hunter ES 3rd, Sulik KK. (2008). Perturbation of retinoic acid (RA)-mediated limb
    development suggests a role for diminished RA signaling in the teratogenesis of ethanol. Birth Defects Res
    A Clin Mol Teratol. 79(9):631-41.
E.S. Hunter, III and Phillip Hartig (2008). Chapter 7: Targeted Gene changes affecting Developmental toxicity.
    In: Developmental Toxicology. B. Abbott and D. Hanson Eds.
Rice G, Teuschler LK, Speth TF, Richardson SD, Miltner RJ, Schenck KM, Gennings C, Hunter ES 3rd,
    Narotsky MG, Simmons JE. (2008) Integrated disinfection by-products research: assessing reproductive
    and developmental risks posed by complex disinfection by-product mixtures. J Toxicol Environ Health A.
    2008;71(17):1222-34.
Chapin R, Augustine-Rauch K, Beyer B, Daston G, Finnell R, Flynn T, Hunter S, Mirkes P, O'Shea KS,
    Piersma A, Sandler D, Vanparys P, Van Maele-Fabry G. (2008). State of the art in developmental toxicity
    screening methods and a way forward: a meeting report addressing embryonic stem cells, whole embryo
    culture, and zebrafish. Birth Defects Res B Dev Reprod Toxicol. Aug;83(4):446-56.
E. S. Hunter, III, Ellen Rogers, Ann Richard, Neil Chernoff (2006). Bromochloro-haloacetic acids: Effects on
    mouse embryos in vitro and QSAR considerations. Reproductive Toxicology. 21(3):260-6.
E. S. Hunter, III, Maria Blanton, Ellen Rogers, Leonard Mole, Neil Chernoff (2006). Short-term exposures to
    haloacetic acids produces dysmorphogenesis in mouse conceptuses in vitro. Reproductive
Karoly, E., J.E. Schmid, E.S. Hunter, III (2005). Ontogeny of transcript profiles during mouse early
    craniofacial development. Repro. Tox. 19:339-352.
S. Degitz, J Rogers and E.S. Hunter, III (2004). Developmental toxicity of methanol: Pathogenesis in CD-1 and
    C57BL/6J mice exposed in whole embryo culture. Birth Defects Research Part A. 70:179-184.
Corey S. Johnson, Maria R. Blanton, E. S. Hunter, III (2004). Effects of ethanol and hydrogen peroxide on
    mouse limb bud mesenchyme differentiation and cell death. In vitro Cell Dev Biol Anim 40: 108-112.
J.E. Simmons, L.K. Teuschler, C. Gennings, T.F. Speth, S.D. Richardson, R.J. Miltner, M.J. Narotsky, K.M.
    Schenck, E.S. Hunter, III, R.C. Hertzberg and G. Rice (2004). Component-based and whole-mixture


-------

    techniques for addressing the toxicity of drinking-water disinfection by-products mixtures. Journal of
    Toxicology and Environmental Health Part A, 2004, 67: 741-754.
L. White, E.S. Hunter, III, M.W. Miller, M.F. Ehrich, S. Barone, Jr. (2003). The role of apoptosis in
    neurotoxicology. In: In vitro Neurotoxicology: Principles and Challenges. Humana Press.
E.H. Rogers, E.S. Hunter, III, M.B. Rosen, J.M. Rogers, C. Lau, P.C. Hartig, B.M. Francis, N. Chernoff (2003).
    Lack of evidence for intergenerational reproductive effects due to prenatal and postnatal undernutrition in
    the female CD-1 mouse. Reproductive Toxicology. 17:519-525.
M.L. Fascineli, E.S. Hunter, III, Wilma De Grava Kempinas (2002). Fetotoxicity caused by the interaction
    between zinc and arsenic in mice. TCM, 22:315-327.
N. Chernoff, E.S. Hunter, III, L. L. Hall, M. B. Rosen, C. F. Brownie, D. Malarkey, M. Marr, J. Herkovits
    (2002). Lack of Teratogenicity of Microcystin-LR in the Mouse and Toad. Applied Toxicology. 22: 13-17.
J.M. Rogers and E.S. Hunter, III (2001). Redox redux: a closer look at conceptal low molecular weight thiols.
    Toxicol Sci. Jul;62(1):1-3.
E.S. Hunter, III and D.J. Dix (2001). Heat shock proteins Hsp70-1 and Hsp70-3 are necessary and sufficient to
    prevent arsenite-induced dysmorphology in mouse embryos. Molecular Reproduction and Development,
    59: 285-293.
G.R. Klinefelter, E.S. Hunter, III and M. G. Narotsky (2001). Reproductive and Developmental Toxicity
    Associated With Disinfection By-Products of Drinking Water. International Life Sciences Institute.
E.S. Hunter, III and Phillip Hartig (2000). Transient Modulation Of Gene Expression In The Neurulation Staged
    Mouse Embryo. New York Academy of Sciences. 919, 278-283.
K.W. Ward, E.H. Rogers and E.S. Hunter, III (2000). Comparative pathogenesis of haloacetic acid and protein
    kinase inhibitor embryotoxicity in mouse whole embryo culture. Toxicol Sci 53, 118-26.
R.M. Zucker, E.S. Hunter, III and J.M. Rogers (1999). Apoptosis and morphology of mouse embryos by
    confocal laser scanning microscopy. Methods: A companion to Methods in Enzymology. 18, 473-480.
K.W. Ward, E.H. Rogers and E.S. Hunter, III (1998). Dysmorphogenic effects of a specific protein Kinase C
    inhibitor during neurulation. Reprod. Toxicol., 12(5), 525-534.
Hartig, P.C. and E.S. Hunter, III (1998). Gene Delivery to the neurulating embryo during culture. Teratology.
    58:103-112.
R.M. Zucker, E.S. Hunter, III and J. Rogers (1998). Confocal laser scanning microscopy of embryo apoptosis.
    Cytometry. 33: 348-354.
G.A. Boorman, V. Dellarco, J.K. Dunnick, R.E. Chapin, E.S. Hunter, III, F. Hauchman, H. Gardner, M. Cox,
    R.C. Sills (1999). Drinking water disinfection byproducts: Review and approach to toxicity evaluation.
    Environmental Health Perspectives, 107(Suppl 1): 207-217.
Tabacova, S., E.S. Hunter, III and L. Balabaeva (1997). Potential role for oxidative damage in developmental
    toxicity of arsenic. IN: Arsenic: Exposure and Health Effects. Abernathy, R.L. Calderon and W.R.
    Chappell, Eds, Chapman and Hall, NY, pp. 135-144.
E.S. Hunter, III, E.H. Rogers, J.E. Schmid and A. Richard (1996). Comparative effects of haloacetic acids in
    whole embryo culture. Teratology. 53:352-360.
A.M. Richard and E.S. Hunter, III (1996). Quantitative structure activity relationship for the developmental
    toxicity of haloacetic acids in mammalian whole embryo culture. Teratology. 54:57-64.
E.S. Hunter, III, and T.W. Sadler (1996). Direct effects of cocaine and cocaine metabolites on embryonic
    development in whole embryo culture. Toxicology In vitro. 10:407-414.
Tabacova, S., E.S. Hunter, III and B.C. Gladen (1996). Developmental toxicity of inorganic arsenic in whole
    embryo culture: Oxidation state-, dose-, time-, and gestational age dependence. Tox. Applied Pharm. 138:
    298-307.
E.S. Hunter, III (1996). Alterations of Intermediary Metabolism as a mechanism of abnormal development
    during early organogenesis. In: Handbook of Experimental Pharmacology. Vol 124 I: Drug Toxicity in
    Embryonic Development I. R.J. Kavlock and G.P. Daston, Eds., pp. 371-406.



-------

E.S. Hunter, III and J. A. Tugman (1995). Inhibitors of glycolytic metabolism affect neurulation staged mouse
    conceptuses in vitro. Teratology, 52: 317-323.
Hunter, E.S., III, L.E. Kotch, R.E. Cefalo, and T.W. Sadler (1995). Effects of cocaine administration during
    early organogenesis on prenatal and postnatal development in mice. Fundamental and Applied
    Toxicology. 28: 177-186.
Hunter, E.S., III, J.A. Tugman, K.K. Sulik and T.W. Sadler. (1994). Effects of Short Term Exposure to Ethanol
    on Mouse Embryos in vitro. Toxicology In vitro. 8: 413-421.
Sadler, T.W. and E.S. Hunter, III (1994). Principles of abnormal development: Past, present and future. In:
    Developmental Toxicology, Second Edition. (C.A. Kimmel and J. Buelke-Sam, Eds). Raven Press, New
    York pp. 53-63.
Sadler, T.W., K.M. Deno and E.S. Hunter, III (1993). Effects of altered maternal metabolism during
    gastrulation and neurulation stages of embryogenesis. In: Maternal Nutrition and Pregnancy Outcome.
    (C.L. Keen, A. Bendix, and C.C. Willhite, Eds). Annals NY Acad Sci.
Hunter, E.S., III and T.W. Sadler (1992). The role of the visceral yolk sac in hyperglycemia-induced
    embryopathies in mouse embryos in vitro. Teratology. 45:195-203.
Hunter, E.S., III and T.W. Sadler (1989). Fuel mediated teratogenesis: Altered embryonic glucose metabolism
    as a result of hypoglycemia in vitro. Am J Physiol. 257:E269-E276.
Hunter, E.S., III, and T.W. Sadler (1988). Embryonic metabolism of fetal fuels in whole embryo culture.
    Toxicology in Vitro. 2:163-167.
Sadler, T.W., E.S. Hunter, III, W. Balkan, L. Shum, W.E. Horton, Jr., and R.E. Wynn (1988). Effects of
    maternal diabetes on embryogenesis. Am. J. Perinatol. 5:319-326.
Hunter, E.S., III, W. Balkan, and T.W. Sadler (1988). Improved development of pre-somite mouse embryos in
    whole embryo culture. J. Exp. Zool. 245:264-269.
Hunter, E.S., III and T.W. Sadler (1987). D-(-)-beta-Hydroxybutyrate induced effects on mouse embryos in
    vitro. Teratology. 36: 259-264.
Hahn, V.K.M., E.S. Hunter, III, R.M. Pratt, J. Zendegui, B.C. Lee (1987). Expression of rat transforming
    growth factor alpha mRNA during development occurs predominantly in the maternal decidua. Molec. &
    Cell. Biol. 7: 2335-2343.
Hunter, E.S., III, and T.W. Sadler (1987).  A potential mechanism of DL-beta-hydroxybutyrate induced
    malformations in mouse embryos. Am. J. Physiol., 253: E72-E80.
Sadler, T.W. and E.S. Hunter, III (1987). Hypoglycemia: How little is too much for the embryo?. Am. J. Obstet.
    Gynecol, 157: 190-193.
Hunter, E.S., III, and T.W. Sadler (1987).  Metabolism of D- and DL-beta-hydroxybutyrate by mouse embryos
    in vitro. Metabolism, 36: 558-561.
Sadler, T.W., E.S. Hunter, III, W. Balkan, and R.E. Wynn. (1987). The role of maternal serum factors in
    diabetes-induced embryopathies as studied in whole embryo culture. In: Approaches to Elucidate
    Mechanisms in Teratogenesis. (F. Welsch, ed). Hemisphere Publishing Corp., Washington, pp. 109-
Horton, W.E., Jr., T.W.  Sadler, and E.S. Hunter, III (1985).  Effects of hyperketonemia on mouse embryonic
    and fetal glucose metabolism in vitro. Teratology, 31: 227-233.
Sadler, T.W., W.E. Horton, Jr., and E.S. Hunter, III (1985). Mammalian embryos in culture: A new approach in
    investigating normal and abnormal developmental mechanisms. In:  Developmental Mechanisms: Normal
    and Abnormal. J.W. Lash and L. Saxen, eds. A.R. Liss, Inc., NY, NY, pp. 227-240.
Sadler, T.W., W.E. Horton, Jr., E.S. Hunter, III (1984). Mechanisms of diabetes-induced congenital
    malformations as studied in mammalian embryo culture. In: Diabetes and Pregnancy: Teratology,
    Toxicology and Treatment. C.C. Peterson, K. Furhmann and L. Jovanovic, eds. Praeger Press,
    Philadelphia, PA. pp. 51-71.

-------
        Post-Doc (Last, First, Middle):   Jack, John, R.
                                     BIOGRAPHICAL SKETCH
          Provide the following information for the key personnel and other significant contributors in the order listed on Form Page 2.
                          Follow this format for each person. DO NOT EXCEED FOUR PAGES.

NAME: John Jack
eRA COMMONS USER NAME:
POSITION TITLE: Post-doctoral Research Associate (Mathematician)

EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION                                        DEGREE (if applicable)   YEAR(s)        FIELD OF STUDY
State University of New York at Potsdam                         B.A.                     2004           Mathematics
State University of New York at Potsdam                         M.A.                     2004           Mathematics
Louisiana Tech University                                       Ph.D.                    2009           Computational Analysis and Modeling
Environmental Protection Agency at Research Triangle Park, NC   Post-doc                 2009-Present   Computational Toxicology

A. POSITIONS and HONORS

Research and Professional Experience:

2009-present Postdoctoral Research Associate, National Center for Computational Toxicology, ORD,
             Environmental Protection Agency, Research Triangle Park, NC
2006-2009   Research Assistant, Institute for Micromanufacturing, Louisiana Tech University, Ruston, LA
             (Mentor: Dr. Andrei Paun)
2008-2009   Teaching Assistant, Department of Computer Science, Louisiana Tech University, Ruston, LA
2005-2007   Teaching Assistant, Department of Mathematics, Louisiana Tech University, Ruston, LA
2003-2004   Peer Tutor, Department of Mathematics, SUNY Potsdam, Potsdam, NY
2003         Research Experience for Undergraduates (REU),  Department of Mathematics, SUNY Potsdam,
             Potsdam, NY (Mentor: Dr. Harold Ellingson)
2001         Research Experience for Undergraduates (REU),  Department of Physics, SUNY Potsdam,
             Potsdam, NY (Mentor: Dr. Biman Das)

Professional Societies and Affiliations:
2001-2004   Member of Pi Mu Epsilon

Honors and Awards:
2008-2009   LI Graduate Research Fellow, Louisiana Optical Network Interface (LONI) Institute
2008         Ph.D. Student of the Year, Institute for Micromanufacturing, Louisiana Tech University,  Ruston,
             LA
2007-2008   NSF Fellowship, Louisiana Tech University, Ruston, LA
2006-2008   University of Louisiana System Fellowship, Louisiana Tech University, Ruston, LA
2005-2007   Teaching Assistantship, Department of Mathematics, Louisiana Tech University, Ruston, LA
2003-2004   Vice President,  Pi Mu Epsilon, SUNY Potsdam, Potsdam, NY

Selected invitations at National & International Symposia:
Workshop on Language Theory, The International Conference on Unconventional Computation, 2007,
   Kingston, ON, CA. "Simulating Apoptosis Using Discrete Methods: a Membrane System and a Stochastic
   Approach. International Conference on Unconventional Computation".
The 13th International Meeting  on DNA Computing, 2007, Memphis, TN. "Modeling the Effects of HIV-1 Virions
   and Proteins on Fas-Induced Apoptosis of Infected Cells".

-------

The Joint AMS/MAA Conference, 2002, San Diego, CA. "Trivariate Statistics for Heavily Saturated Optical
   Systems".


Selected Expert Committees/Advisory Panels/Organizing Committees:
2006         Room Chair, LA-MS section MAA meeting


Selected Assistance/Advisory Support to the Agency:
None

B. SELECTED PUBLICATIONS (selected from a total of 64 peer-reviewed).

J. Jack, A. Paun. Discrete Modeling of Biochemical Signaling with Memory Enhancement. LNBI Transactions
   on Computational Systems Biology.
J. Jack, A. Paun, A. Rodriguez-Paton. Effects of HIV-1 Proteins on the Fas-mediated Apoptotic Signaling
   Cascade: a Computational Study of Latent CD4+ T Cell Activation. WMC9. 227-246, 2008.
J. Jack, A. Rodriguez-Paton, O.H. Ibarra, A. Paun.  Discrete Nondeterministic Modeling of the Fas Pathway.
   International Journal of Foundations of Computer Science.  15:5, 1147-1167, 2008.
M. DeCoster, R. Masvekar, S. Maddi,  J. Jack, and J. McNamara. Cellular Morphological and  Biochemical
   Changes During Apoptosis In Vitro: Links to Modeling. LBRN Work-in-progress Seminar. 2008.
J. Jack, F. Romero-Campero, M.  Perez-Jimenez, O.H. Ibarra, A. Paun. Simulating Apoptosis Using Discrete
   Methods: a Membrane System and a Stochastic Approach. International Conference on Unconventional
   Computation. 2007.
J. Jack, A. Paun. Modeling the Effects of HIV-1 Virions and Proteins on Fas-Induced Apoptosis of Infected
   Cells. DNA13. 2007.
A. Allan, M. Dunne, J. Jack, J. Lynd, H. Ellingson.  Classification of the Group of Units in the Gaussian
   Integers modulo n. Pi Mu Epsilon Journal [accepted] (8pp).
B. Das, E. Drake, J. Jack, M. Cianciosa. Higher Order Multivariate Statistics in the Intensity Fluctuations For a
   Heavily Saturated Amplified Spontaneous Emission. The 35th Meeting of the Division of Atomic, Molecular
   and Optical Physics. 2004.
B. Das, E. Drake, J. Jack. Trivariate Characteristics of Intensity Fluctuations for Heavily Saturated Optical
   Systems. Applied Optics. 43, 834-840, 2004.

-------
        Principal Investigator/Program Director (Last, First, Middle):  Jones, Jack W.
  BIOGRAPHICAL SKETCH
   Provide the following information for the key personnel and other significant contributors in the order listed on Form Page 2.
  	Follow this format for each person. DO NOT EXCEED FOUR PAGES.	
  NAME: William Jack Jones
  eRA COMMONS USER NAME:
  POSITION TITLE: Research Microbiologist

  EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
  INSTITUTION AND LOCATION   DEGREE (if applicable)   YEAR(s)   FIELD OF STUDY
  Clemson University         B.S.                     1973      Microbiology
  Clemson University         Ph.D.                    1981      Microbiology

Professional Experience
1991-Present  Research Microbiologist. USEPA, ORD, NERL.
2001-2006     Metabolism and Bioremediation Research Team Leader. U.S. EPA, NERL.
1989-1991     Associate Professor. School of Applied Biology, Georgia Institute of Technology.
1984-1989     Assistant Professor. School of Applied Biology, Georgia Institute of Technology.
1981-1984     Postdoctoral Research Associate. Dept. of Microbiology, University of Illinois.
1976-1981     Research Associate. Dept. of Microbiology, Clemson University.

Selected Awards and Honors
2007, NERL Special Achievement Award
2006, 2005, 2004, 2004, 2003, 2000; EPA Scientific and Technological Achievement Awards
(STAA)
2005, US EPA Computational Toxicology Research Program New Start Competitive Research
Award
2003, Facilitation Impact Award (Southeast Association of Facilitators)
1998, USEPA Bronze Medal for Commendable Service: EPA Bioremediation Team
1998, Technical Review Panel: Dow Chemical Environmental Research
1996, USEPA/ORD Internal Grant Competition Award
1995, Research Award: USEPA/ARCS (Assessment and Remediation of Contaminated
Sediments)
1994-1995, NSF/EPA Environmental Research Grants Review Panel
1994-1998, US DOD (ESTCP) Environmental Research Review Panel
1992-present, National Research Council Research Adviser

Professional  Societies
American Society for Microbiology (1975-present)
Sigma Xi (1990-present)
World Health Organization (WHO) International Program on Chemical Safety (2000-2004)

-------


Invited Lectures/Symposia
Jones, W.J., R.C. Kolanczyk, O. Mekenyan, A. Protzel, G. Dannan, S. Abel, and P.K.
   Schmieder. 2007. Designing Pesticide Metabolic Pathway/Degradate Databases for
   Registrant Submitted Health Effects/Ecological Effects Data. Presentation at BOSC Safe
   Pesticides/Safe Products (SP2) Review, Research Triangle Park, NC.
Chapkanov, A., G. Chankov, S. Temelkov, J. Jones, and O. Mekenyan. 2007. Design and
   Performance of a Xenobiotic Metabolism Database Manager for Metabolic Simulator
   Enhancement and Chemical Risk Analysis. Presentation at SETAC Europe 17th Annual
   Meeting, May 20-24, 2007, Porto, Portugal.
Jones, W.J., O. Mekenyan, R. Kolanczyk, and P. Schmieder. 2007. Simulating Metabolism to
   Enhance Effects Modeling. Presentation at BOSC Safe Pesticides/Safe Products (SP2)
   Review, Research Triangle Park, NC.
Jones, W.J., P.K. Schmieder,  R.C. Kolanczyk, and O. Mekenyan. 2006. Simulating Metabolism
   of Xenobiotic Chemicals as a Predictor of Toxicity. Presented at BOSC Review of the
   Computational  Toxicology Program, Research Triangle Park,  NC.
Jones, W.J., O'Niell, W.L., Mazur, C.S., Kenneke, J.F., and Garrison, A.W. 2003.
   Enantioselective Transformation of Chiral PCBs and Fipronil in Anoxic Sediments.
   Presented at: 23rd International Symposium on  Halogenated and Persistent Organic
   Pollutants, Boston, MA, August 24-29, 2003.
Weber, E.J., T.W. Collette, W. J. Jones, J.F.  Kenneke, C.S. Mazur, L.A. Suarez, C.T. Stevens,
   J.W. Washington, K. Wolfe, G.W. Bailey, and R.S. Parmar. 2003. Modeling Chemical Fate
   and Metabolism for Computational Toxicology.  Presented at EPA Science Forum 2003,
   Washington, DC.

Selected Publications
Jones, W.J., C.S. Mazur, J.F.  Kenneke and A.W. Garrison. 2007. Enantioselective Microbial
   Transformation of the Phenylpyrazole Insecticide Fipronil in Anoxic Sediments. Environ. Sci.
   Technol. 41:8301-8307.
Jarman, J.L., W.J.  Jones, L.A. Howell, and A.W. Garrison. 2005. Application of Capillary
   Electrophoresis to Study the Enantioselective Transformation of Five Chiral Pesticides in
   Aerobic Soil Slurries. Journal of Agricultural and Food Chemistry 53:6175-6182.
Tebes-Stevens, C.L. and W.J. Jones. 2004. Estimation of Microbial Reductive Transformation
   Rates for Chlorinated Benzenes and Phenols Using a QSAR Approach. Environ. Toxicol.
   Chem. 23:1600-1609.
Mazur, C.S., W.J. Jones and C. Tebes-Stevens. 2003. H2 Consumption during the Microbial
   Reductive Dehalogenation of Chlorinated Phenols and Tetrachloroethene. Biodegradation
   14:285-295.
Pakdeesusuk, U., W.J. Jones, C.M. Lee, A.W. Garrison, W.L. O'Niell, D.L. Freedman, J.T.
   Coates, and C.S. Wong. 2003. Changes in Enantiomeric Fractions during  Microbial
   Reductive Dechlorination of PCB132, PCB149, and Aroclor 1254 in Lake Hartwell Sediment
   Microcosms. Environ. Sci. Technol. 37:1100-1107.
Mazur, C.S. and W.J. Jones. 2001. Hydrogen Concentrations  in Sulfate-Reducing  Estuarine
   sediments during PCE dehalogenation. Environ. Sci. Technol. 35:4783-4788.
Jones, W.J. and N.D. Ananyeva.  2001. Correlations between pesticide transformation rate and
   microbial respiration activity in soils of different ecosystems. Biol. Fertil. Soils 33 (6):477-
   483.
Garrison, A.W., V.A. Nzengung, J.K. Avants, J.J. Ellington, W.J. Jones, D. Rennels, and N.L.
   Wolfe. 2000. Phytodegradation of p,p'-DDT and the Enantiomers of o,p'-DDT. Environ. Sci.
   Technol. 34:1663-1670.
Pardue,  J.H., S. Kongara, and W.J. Jones. 1996. Effect of Cadmium on Reductive
   Dechlorination  of Trichloroaniline. Environ. Toxicol. Chem. 15:1083-1088.
Liu, S.M. and W.J. Jones. 1995. Biotransformation of Dichloroaromatic Compounds in
   Nonadapted and Adapted Freshwater Sediment Slurries. Appl. Microbiol. Biotechnol.
   43:725-732.

Presentations
Protzel, A., G. Dannan, R. Kolanczyk, O. Mekenyan, S. Abel, P. Schmieder, and J. Jones. 2006.
   Development of a Structure-Searchable Database for Pesticide Metabolites and
   Environmental Degradates. Presented at the Society of Toxicology Annual Meeting, San
   Diego, CA,  March 5-9, 2006.
Mekenyan, O.G., S.D. Dimitrov, T.S. Pavlov, W.J. Jones, and P.K. Schmieder. 2005.
   Metabolism and Metabolic Activation of Chemicals: In Silico Simulation. Presented at the 3rd
   International Symposium on Computational Methods in Toxicology and Pharmacology
   Integrating Internet Resources, Shanghai, China, October 29-November 1, 2005.
Serafimova,  R., H. Aladjov, R. Kolanczyk, P. Schmieder, Y. Akahori, J. Jones, and O.
   Mekenyan.  2005. QSAR evaluation of ER Binding Affinity of Chemicals and Metabolites.
   Presented at Society of Environmental Toxicology and Chemistry 26th Annual Meeting,
   Baltimore, MD, Nov. 13-17, 2005.
Kolanczyk, R., M. Tapper, B. Nelson, V. Wehinger,  J. Denny, D. Kuehl, B.  Sheedy, C. Mazur, J.
   Kenneke, J. Jones, and P. Schmieder. 2005. Increased Endocrine Activity of Xenobiotic
   Chemicals as Mediated by Metabolic Activation. Presented at Society of Environmental
   Toxicology and Chemistry 26th Annual Meeting, Baltimore, MD.
Mekenyan, O.,  J. Jones, P. Schmieder, S. Kotov, T. Pavlov, and S. Dimitrov. 2005.
   Performance, reliability, and improvement of a tissue-specific metabolic simulator.
   Presented at Society of Environmental Toxicology and Chemistry 26th Annual Meeting,
   Baltimore, MD, Nov. 13-17, 2005.
Garrison, A.W., W.J. Jones, T.E. Wese, B. J. Konwick, M.A. Tapper, and M.K. Morgan. 2003.
   The enantiomers of chiral  pollutants pose different risks. Presented at:  24th Annual Society
   of Environmental Toxicology and Chemistry  Meeting, Austin, TX, November 9-13, 2003.
Jones, W.J., O'Niell, W.L., Mazur, C.S., Kenneke, J.F., and Garrison, A.W. 2003.
   Enantioselective transformation of chiral PCBs and fipronil in anoxic sediments. Presented
   at: 23rd International Symposium on Halogenated and Persistent Organic Pollutants,
   Boston, MA, August 24-29, 2003.
Garrison, A.W., Jones, W.J., Wese, T.E., Washington, J.W., Jarman, J.L., and Avants, J. 2003.
   Observations of enantioselectivity in the fate, persistence and effects of modern pesticides.
   Presented at: 23rd International Symposium on Halogenated Organic Pollutants and
   Persistent Organic Pollutants, Boston, MA, August 24-29, 2003.
Pakdeesusuk, U., C. M. Lee, D.L. Freedman, J.T. Coates, C.S. Wong, W.J. Jones, and  A.W.
   Garrison. 2002. Enantioselectivity in the biodegradation of PCB atropisomers.  Presented at:
   23rd Annual Society of Environmental Toxicology and Chemistry Meeting, Salt Lake City,
   UT, November 16-20, 2002.
Tebes-Stevens, C.L., and W.J. Jones. 2000. QSAR analysis of sorption-corrected rate
   constants for reductive biotransformation of halogenated aromatics. Presented at: 220th
   American Chemical Society National Meeting, Washington, DC, August 20-24, 2000.

-------
         Principal Investigator/Program Director (Last, First, Middle): Judson, Richard
                                           BIOGRAPHICAL SKETCH
NAME: Richard Judson
eRA COMMONS USER NAME:
POSITION TITLE: Bioinformatician

EDUCATION/TRAINING
INSTITUTION AND LOCATION              DEGREE      YEAR(s)      FIELD OF STUDY
Rice University, Houston TX           BA          1981         Chemistry / Physics
Princeton University, Princeton NJ    MA          1984         Chemistry
Princeton University, Princeton NJ    Ph.D.       1988         Chemistry
University of Houston                 Post-Doc    1988-1989    Chemistry / Physics

A. POSITIONS and HONORS

Research and Professional Experience:
2006-Present      Bioinformatician, National Center for Computational Toxicology, USEPA, RTP, NC
2008-Present      Adjunct Professor, Univ. of North Carolina, Chapel Hill, Dept of Environmental Sciences and Engineering
2005-2006        President, GAMA BioConsulting, Guilford CT
2001-2005        Senior Vice President, Chief Scientific Officer, Genaissance Pharmaceuticals, New Haven CT
1999-2001        Senior Vice President, Informatics, Genaissance Pharmaceuticals, New Haven CT
1997-1999        Group Leader, Bioinformatics, CuraGen Corp., New Haven, CT
1990-1996        Senior Member Technical Staff, Sandia National Laboratories, Livermore, CA
1981-1983        CRT Process Engineer, Tektronix, Inc., Beaverton, OR

Honors and Awards
•   EPA Bronze Medal for Commendable Service (2008)
•   EPA / OEI CIO's John Cooper Partnership Award (2008)
•   EPA Web Workgroup Award (2009) for ACToR

Selected invitations at National & International Symposia:
•   Indiana University-Purdue University Indianapolis, Wiley lecture in Computational Chemistry, 10/02, "Informatics From Chem to
    Bio: The Intersection of Science, Engineering, Medicine, and Computing"
•   Princeton University, 10/02 "Open Problems in Genomics, Bioinformatics and Medicine"
•   American Heart Association Scientific Sessions 6/03 "Apolipoprotein E Haplotypes are Associated with Baseline Levels and
    Statin-Induced Decreases in C-Reactive Protein"
•   BIOPHEX 2004 (Toronto), 9/04 "Regulatory Issues in Pharmacogenetics Testing"
•   Michigan State University Department of Radiology, 6th Annual Molecular Imaging Workshop, 7/05 "Pharmacogenomics of
    Statin Response"
•   IBC Diagnostic Biomarkers, 12/05 (Cambridge MA) "Technical, Business and Regulatory Challenges to The Development of
    Drug / Genetic Test Combinations"
•   Pfizer Conference on Long QT Issues in Drug Development (Philadelphia), 9/05 "Using Genetic Testing to Manage Risks in
    Thorough QT Trials"
•   Molecular Diagnostics and Personalized Medicine (Boston), 9/05 "Technical, Business and Regulatory Challenges to the
    Development of Drug / Genetic Test Combinations"
•   EPA Environmental Information Symposium (Savannah), 12/06 "Developing Computer Systems and Databases for High
    Throughput Toxicity Testing Prioritization"
•   TRAC 2007 (Cincinnati), 4/07 "ACToR: Aggregated Computational Toxicology Resource"
•   EPA Science Forum (RTP, NC), 5/07 "Computational Toxicology - Where is the Data?"
•   University of Louisville, Center for Bioinformatics (Louisville KY), 6/07 "Computational Toxicology and Chemical
    Prioritization"

•   International Society for Exposure Analysis Annual meeting (Durham NC), 10/07, "ACToR: Aggregated Computational
    Toxicology Resource"
•   North Carolina State University, School of Veterinary Medicine (Raleigh NC), 1/08 "Applications of Genetic Variation to Human
    and Animal Health".
•   North Carolina State University, Bioinformatics Program (Raleigh NC), 3/08 "Introduction to the EPA ToxCast Program"
•   ebCTC Second Annual Toxicology Symposium (New Brunswick), 3/08 "Overview of the EPA ToxCast Program"
•   North Carolina State University, Department of Environmental Science and Toxicology (Raleigh NC), 10/08 "The EPA ToxCast
    Program"
•   Society for Risk Analysis Annual Meeting (Boston MA) 12/08 "Toxicity Signatures from the ToxCast Program"
•   Society of Toxicology Annual Meeting (Baltimore MD) 3/09 "Pathway-Based Concentration Response Profiles from
    Toxicogenomics Data"
•   ICCA / LRI Annual Meeting (Charleston, SC) 6/09 "Overview of the EPA ToxCast Program"
•   International Implications of the U.S. National Research Council report on Toxicity Testing in the 21st Century (Ottawa, ON),
    6/09 "Overview of the EPA ToxCast Program"

Selected Assistance/Advisory Support to the Agency:
2007 - Present: PI for ACToR (Aggregated Computational Toxicology Resource) Project
2007 - Present: Co-PI for ToxCast Project
2007 - Present: Mentored 1 undergraduate, 3 graduate, and 2 postdoctoral students

Selected Expert Committees/Advisory  Panels/Organizing Committees:
2007-2008: Member of EPA/ORD Genomics Task Force, responsible for data management strategy
2007: Lecturer for Genomics Training Course developed for OPPTS
2007: Session Co-Chair for 2007 EPA Science Forum
2007-Present:  Member of ORD IT Governance Board
2009-Present:  Consultant to the FDA NCTR Scientific Advisory Board
2009: Co-chair of First EPA ToxCast Data Analysis Summit

B. SELECTED PUBLICATIONS (selected from 69 total).
•   R. Judson, J.C. Stephens, A. Windemuth, "Tracking the Causative Genetic Variant: A Gene-based Haplotype Approach",
    Conference Proceedings "Genome Targets to Drug Candidates", London November 29 (1999).
•   P. Uetz, L. Giot, G. Cagney, T. A. Mansfield, R. S. Judson, J. R. Knight, D. Lockshon, V. Narayan, M. Srinivasan, P. Pochart, A.
    Qureshi-Emili, Y. Li, B. Godwin, D. Conover, T. Kalbfleisch, G. Vijayadamodar, M. Yang, M. Johnston, S. Fields, J. Rothberg,
    "A Comprehensive Analysis of Protein-protein Interactions in Saccharomyces Cerevisiae", Nature, Vol. 403 623 (2000).
•   R. Judson, J. C. Stephens, A. Windemuth, "The Predictive Power of Haplotypes in Clinical Response", Pharmacogenomics, Vol.
    1 15 (2000).
•   C. M. Drysdale, D. McGraw, C. B. Stack, J. C. Stephens, R. S. Judson, K. Nandabalan, K. Arnold, G. Ruano, S. Liggett,
    "Complex Promoter and Coding Region, Beta(2)-Adrenergic Receptor Haplotypes Alter Receptor Expression and Predict In Vivo
    Responsiveness", Proc. Nat. Acad. Science, USA, Vol. 97 10483-10488 (2000).
•   J. C. Stephens, J. A. Schneider, D. A. Tanguay, J. Choi, T. Acharya, S. E. Stanley, R. Jiang, C. J. Messer, A. Chew, J.-H. Han, J.
    Duan, J. L. Carr, M. S. Lee, B. Koshy, M. Kumar, G. Zhang, W. R. Newell, A. Windemuth, C. Xu, T. S. Kalbfleisch, S. Shaner,
    K. Arnold, V. Schulz, C. M. Drysdale, K. Nandabalan, R. S. Judson, G. Ruano, G. F. Vovis "Haplotype Variation and Linkage
    Disequilibrium in 313 Human Genes", Science, Vol. 293, 489-493 (2001).
•   R. Judson, B. Salisbury, J. Schneider, A. Windemuth and J. C. Stephens, "How many SNPs does a genome-wide haplotype map
    require?", Pharmacogenomics, Vol.  31-13 (2002).
•   R. Judson, "Using multiple drug exposure levels to optimize power in pharmacogenetic trials", Journal of Clinical Pharmacology,
    Vol. 44 816-824 (2003).
•   R. Jiang, J. Duan, A. Windemuth and J. C. Stephens, R. Judson, C. Xu, "Genome-wide evaluation of public SNP databases",
    Pharmacogenomics, Vol. 4 779-789 (2003).
•   B. Winkelmann, M. Hoffman, M. Nauck, A. Kumar, K. Nandabalan, R. Judson, B. Boehm, A. Tall, G. Ruano, W. Marz,
    "Haplotypes of the  cholesterol ester transfer protein gene predict lipid-modifying response to statin therapy", The
    Pharmacogenomics Journal, Vol. 3 284-296 (2003).
•   A. Windemuth, M.  Kumar, K. Nandabalan, B. Koshy, C. Xu, M. Pungliya, R. Judson "Genome-wide Association of Haplotype
    Markers to Gene Expression Levels", in "The Genome of Homo sapiens"
    (Cold Spring Harbor Symposia on Quantitative Biology LVIII) (2004).
•   R. Judson, C.D. Brain, B. Dain, A. Windemuth, G. Ruano, C. Reed, "New and confirmatory evidence of an association between
    APOE genotype and baseline C-reactive protein in dyslipidemic individuals", Atherosclerosis, Vol. 177, 345-351 (2004).
•   R. Judson, A. Moss, "Pharmacogenomics in drug development: when and how to apply", in Cardiac Safety in clinical research
    and drug development, practical guidelines, J. Morgenroth & I. Gussak, Eds. (Humana Press 2005).
•   M.J. Ackerman, I. Splawski, J.C. Makielski, D.J. Tester, M.L. Will, K.W. Timothy, M.T. Keating, G. Jones, M. Chadha, C.R.
    Burrow, J.C. Stephens, C.X. Xu, R. Judson, M.E. Curran "Spectrum and prevalence of cardiac sodium channel variants among
    Black, White, Asian and Hispanic individuals: Implications for arrhythmogenic susceptibility and Brugada/Long QT syndrome
    genetic testing", Heart Rhythm, Vol.  1, 600-607 (2005).
•   C. Reed, M. Kalnik, K. Rakin, M. Athanasiou, R. Judson, "Restoring Value to Stalled Phase II Compounds: The Case for
    Developing a Novel Compound for Depression Using Pharmacogenetics", Pharmacogenomics, Vol. 6, 95-100 (2005).
•   R. Judson, B. Salisbury, "Technologies for Nutrigenomic Association Studies", in Nutrigenomics: The Future of Integrative
    Metabolism, Eds: B. German, S. Watkins, M. Roberts, (Elsevier Press, 2006).
•   R. Judson, B. Salisbury, C. Reed, M.  Ackerman, "Pharmacogenetics in Thorough QT Trials", Molecular Diagnosis & Therapy,
    Vol. 10, 153-162 (2006).
•   R. Judson, "Pharmacogenetics in Drug Development and Research" in Electrical Diseases of the Heart: Genetics, Mechanisms,
    Treatment, Prevention edited by Gussak, Antzelevitch, Wilde, Friedman, Ackerman and Shen (Springer, 2008).
•   R.J. Kavlock, G. Ankley, J. Blancato, M. Breen, R. Conolly, D. Dix, K. Houck, E. Hubal, R. Judson, J. Rabinowitz, A. Richard,
    R.W. Setzer, I. Shah, D. Villeneuve, E. Weber, "Computational Toxicology- A State of the Science Mini Review",
    Toxicological Sciences, Vol. 103, 14-27 (2008)
•   R.J Kavlock, D. Dix, K. Houck, R. Judson, M. Martin, A. Richard, "ToxCast: Developing Predictive Signatures for Chemical
    Toxicity", AATEX Journal WC6 Proceedings Vol. 14, 623 (2008).
•   A. Richard, C. Yang, R. Judson, "Toxicity Data Informatics: Supporting a New Paradigm for Toxicity Prediction", Toxicology
    Mechanisms and Methods, Vol. 18, 103-118 (2008)
•   C.J. Verzilli, T. Shah, J.P. Casas, J. Chapman, M. Sandhu, S. Debenham, M.S. Boekholdt, K.T. Khaw, N. Wareham, R. Judson,
    E.J. Benjamin, S. Kathiresan, M.G. Larson, J. Rong, R. Sofat, S.E. Humphries, L. Smeeth, G. Cavalleri, J.C. Whittaker, A.D.
    Hingorani, "Bayesian meta analysis of genetic association studies with different sets of markers", Am. J. Hum. Gen. Vol. 82,
    859-872 (2008).
•   R. Judson, F. Elloumi, R.W. Setzer, Z. Li, I. Shah, "A Comparison of Machine Learning Algorithms for Chemical Toxicity
    Classification Using a Simulated Multi-Scale Data Model", BMC Bioinformatics, Vol. 9, 241 (2008)
•   R. Judson, A. Richard, D. Dix, K. Houck, F. Elloumi, M. Martin, T. Cathey, T.R. Transue, R. Spencer, M. Wolf, "ACToR -
    Aggregated Computational Toxicology Resource", Toxicology and Applied Pharmacology, Vol. 233, 7-13 (2008).
•   R. Judson, A. Richard, D.J. Dix, K. Houck, M. Martin, R. Kavlock, V. Dellarco, T. Henry, T. Holderman, P. Sayre, S. Tan, T.
    Carpenter, E. Smith, "The Toxicity Data Landscape for Environmental Chemicals", Environmental  Health Perspectives (2008, in
    press)
•   M.T. Martin, R. Judson, D. Reif, D.J. Dix, R. Kavlock, "Profiling Chemicals Based on Chronic Toxicity Profiles from the U.S.
    EPA ToxRef Database", Environmental Health Perspectives, Vol. 117, 392-399 (2009)
•   M.T. Martin, E. Mendez, D.G. Corum, R.S. Judson, R.J. Kavlock, D. Rotroff, D.J. Dix, "Profiling the Reproductive Toxicity of
    Chemicals from Multigeneration Studies in the Toxicity Reference Database (ToxRefDB)", Toxicological Sciences  (2009, in
    press).
•   T.B. Knudsen, M.T. Martin, R.J. Kavlock, R.S. Judson, D.J. Dix, and A.V. Singh, "Profiling the Activity of Environmental
    Chemicals in Prenatal Developmental Toxicity Studies using the U.S. EPA's ToxRefDB", Neurotoxicology and Teratology, Vol.
    31, 241 (2009)
•   A.W. Knight, S. Little, K. Houck, D. Dix, R. Judson, A. Richard, N. McCarroll, G. Akerman, C. Yang, L. Birrell, R.M.
    Walmsley, "Evaluation of High-throughput Genotoxicity Assays Used in Profiling the US EPA ToxCast™ Chemicals",
    Regulatory Toxicology and Pharmacology (in press 2009)

ISSUED PATENTS
•   U.S. Patent 6,586,183 - Association of Beta 2-Adrenergic Receptor Haplotypes with Drug Response
•   U.S. Patent 6,931,326 - Methods for  Obtaining and Using Haplotype Data
•   U.S. Patent 6,944,767 - Methods and Apparatus for Ensuring the Privacy of Personal Medical Information
•   U.S. Patent 7,058,517 - Methods for Obtaining and Using Haplotype Data (2)
•   U.S. Patent 7,250,258 - CDK Genetic Markers Associated with Galantamine Response

-------
        Principal Investigator/Program Director (Last, First, Middle): Kavlock, Robert J.
                                     BIOGRAPHICAL SKETCH

NAME: Robert J. Kavlock
eRA COMMONS USER NAME: Kavlock
POSITION TITLE: Research Biologist

EDUCATION/TRAINING
INSTITUTION AND LOCATION                   DEGREE    YEAR(s)    FIELD OF STUDY
University of Miami                        B.S.      1969-73    Biology
University of Miami                        Ph.D.     1973-77    Embryology
Federal Executive Institute (Class 321)              2006       Leadership

A. POSITIONS and HONORS

Research and Professional Experience:
2005-        Director, National Center for Computational Toxicology, ORD, USEPA
2004-2005    Special Assistant (Computational Toxicology) to NHEERL Director
1999-2000    Acting Associate Director for Health, NHEERL (June-January)
1989-2004    Director, Reproductive Toxicology Division, NHEERL, USEPA, RTP, NC
1981-1989    Chief, Perinatal Toxicology Branch, DTD, HERL, USEPA, RTP, NC
1979-1981    Res. Biologist, Perinatal Toxicology Branch, DTD, HERL, USEPA, RTP, NC
1977-1979    Research Associate, Dept. of Biology, Univ. of Miami, Coral Gables, FL
Adjunct Associate Professor, Department of Pharmacology, Duke University
Adjunct Assistant Professor, Department of Zoology, NCSU

Professional Societies and Affiliations:
Memberships: Society of Toxicology, including Developmental and Reproductive Toxicology Specialty Section
   and the North Carolina Society of Toxicology; Teratology Society
Editorial Boards: Toxicological Sciences (1994-2000); Teratogenesis, Carcinogenesis and Mutagenesis (to
   2003); Journal of Toxicology and Environmental Health, Part B (current); Birth Defects Research, Part B
   (2003-present); Neurotoxicology and Teratology (2006-present); Environmental Health Perspectives
   (Associate Editor, 2006-present)

Honors and Awards:
US Humane Society North American Alternative Award, 2008; US EPA/ORD Statesman of the Year, 2007; US
EPA Bronze Medals, 2004;  Computational Toxicology Design Team, 1998, for efforts on Harmonized
Reproductive Testing Guidelines; US EPA Science Achievement Award, 1995 for efforts on validation  of
benchmark dose methodology; US EPA Scientific and Technological Achievement Awards: Level I, 1994;
Level II, 1983, 1984, 1984,  1986, 1993; Level  III, 1983, 1984, 1985, 1985, 1987, 1989, 1992, 1993 for various
peer reviewed scientific publications; US EPA Silver Medal, 1985 for development of an in vivo screening
procedure for developmental toxicity; Best Paper of the Year Award, Fundamental and Applied Toxicology,
1995; President, Teratology Society, 2001; President, Reproductive and Developmental Toxicology Specialty
Section, 1997; President, North Carolina Society of Toxicology, 1999

Selected invitations at National & International  Symposia:
ZEBET 20th Year Anniversary, Berlin, October 2009; 7th World Congress on Alternatives to Animals, Rome,
September 2009; National Research Council Symposium on Toxicity Based Risk Assessment, Washington
DC, May 2009;  ILSI Developmental Toxicity - New Directions Workshop, Washington DC, April 2009;
California Institute of Regenerative Medicine Stem Cells in Predictive Toxicology, Berkeley, 2008; Gene
Environmental Interactions in  Reproduction, Malmo, Sweden, Feb 2008; European Chemicals Agency,
October 2007; Duke University SBRP Symposium on HTS Assays, October 2007; 6th World Congress on
Alternatives to Animals in Research, Tokyo, August 2007; American Chemistry Council, Washington, August
2007; 2nd Low Dose Workshop on Low Dose Effects of Environmental Toxicants,  Berlin, April 2007; US EPA
Office of Pesticide Programs, Washington, Feb. 2007; Duke University School of the Environment, Durham, NC,
Jan. 2007; US EPA Science Policy Council, Dec. 2006; National Academy of Science Committee of Risk
Assessment, Washington, Dec 2006; 4th International Academic Conference on Environmental and
Occupational Medicine, Kunming, China, Oct. 2006; US EPA Office of Drinking Water, Sept. 2006; American
Association of Pharmaceutical Sciences, San Antonio, Oct. 2006;  US EPA Regional Science Liaisons, April
2006; ORNL Bioinformatics Summit, April 2006;  Society of Toxicology, San Diego, March 2006; US EPA
Region 6, Dallas, Jan 2006;  European Commission Science Delegation, RTP, Jan 2006; Arizona State
University Workshop on Genetics and Environmental Regulation, Jan. 2005; National Academy of Science
Committee on the Future of Toxicology, Jan. 2005; National Academy of Science Workshop on Sustainability
in the Chemical Industry, Feb. 2005; US EPA National Risk Management Research Laboratory, Cincinnati,
Feb. 2005; US EPA Office of Science Coordination and Policy, Mar 2005; Tox Forum, Aspen, July 2005,
2007; Oak Ridge National Laboratory Ecogenomics Meeting, Knoxville, July 2005; 5th World
Congress on Alternatives to Animals in Research, Berlin, August 2005; Board of Scientific Counselors, Jan,
2004; FDA Science Forum, Washington DC, Jan 2004; US/EU Bilateral Meeting on Chemical Safety,
Charlottesville, VA, Apr 2004; EPA Office of International Affairs United Kingdom  Science Exchange, Aug
2004; National Academy of Sciences Future of Toxicology Committee, Sept 2004; National Toxicology
Program Workshop on Thyroid Toxicity, Washington DC, Apr 2003; US EPA Science Advisory Board,
Washington DC, Sept 2003.

Selected Expert Committees/Advisory Panels/Organizing Committees:
Expert Panel Member, Integrated Testing of Pesticides, Canadian Council of Academies (2009-); Expert,
WHO Working Group on the Health Effects of DDT, Geneva, June 2009; Reviewer, European 7th
Framework Proposals for the Innovative Medicines Initiative, Brussels, February 2009; Co-Chair, Tox21
Working Group (2007-);  NIEHS SBRP Peer Review Panel, Sept 2007; Chair, EPA International Science
Forum on Computational Toxicology, 2007; OECD Molecular Screening Initiative Working Group (2005-present);
WHO/IPCS Working Group on Principles for Evaluating Health Risks to Children, 2003-2006; Chair,
EPA Workshop on a Framework for Computational Toxicology, 2003; Chair, WHO/IPCS and Japan MOE
Workshop on Research Needs for Endocrine Disrupters, 2003; ILSI Workgroup on Human Framework for
Using MOA Information to Evaluate Human Relevance of Animal Toxicity Data, 2002-2004; EPA/NIEHS/ACC
Scientific Frontiers in Developmental Toxicity Risk Assessment, 2002; American Chemistry Council Focal Area
Leader, Long Range Research Initiative, 2002-2005; NTP/NIEHS Endocrine Disrupters Low-Dose Peer
Review, 2000; IPCS/WHO Steering Group for International State-of-Science Assessment of Endocrine
Disrupters, 1997-2002; Reviewer, European Commission Framework Calls, 2001, 2002, 2004, 2007; NIH
ALTX-4 Study Section, Standing Member, 1997-2001; CIIT Science Advisory Committee, 1996-2001; Chair,
NTP Center for Evaluation of Risk to Human Reproduction Expert Panel on Phthalates, 1999-2000 and 2005;
IARC Monograph Working Groups, Volumes 36, 41, 47, 54, 58, 73, and 79; IARC Handbooks of Cancer
Prevention, Volumes 2 and 4.

Selected Assistance/Advisory Support to the Agency:
Co-Chair, EPA Science Policy Council Future of Toxicity Testing Working Group (2007-2009); Chair, US
EPA/ORD Technical Qualifications Review Board for Science and Technology Positions (2007- 2009); Chair,
EPA/ORD Computational Toxicology Design Team (2003) and Implementation Steering Group, 2004-present;
NHEERL Genomics Program Steering Committee, 2001-2002; Endocrine Disrupter Methods Validation
Subcommittee (EPA FACA), 2001-2003; Co-Organizer, Japanese NIES/US EPA  Workshop on EDCs, Tokyo,
February 2000; NHEERL Human Health Research Strategy Implementation Team, 2001-2003;  Chair,
NHEERL Branch Chief Career Ladder Committee, 1997-1998; Chair, ORD Endocrine Disrupter Research
Strategy Committee, 1995-1998; Chair, EPA Workshop to Develop Research Needs for Endocrine Disrupters,
1995; Chair, HERL Communications Issues Committee, 1992; EPA Working Group on Harmonized Testing
Guidelines for Reproductive and Developmental Toxicity, 1991-1999; Co-Chair, HERL (NHEERL) Technical
Qualifications Board, 1989-1997; Co-Chair, ORD RIHRA Topic IV Subcommittee (Biologically Based Dose
Response Models), 1988-1993; Chair, OHR/HERL Mission Statement Committee, 1987; EPA Working Group
on Developmental Toxicity Testing Guidelines, 1984-1985.

B. SELECTED PUBLICATIONS (selected from more than 180 total).
Kramer, MG, Firestone, M, Kavlock,  R and Zenick, H (2009). The Future of Toxicity Testing for Environmental
   Contaminants. Environ. Health Perspectives 117:A283-A284.
Knudsen, TB, Martin, MT, Kavlock, RJ, Judson, RS, Dix, DJ and Singh, AV (2009). Profiling the Activity of
   Environmental Chemicals in Prenatal Developmental Toxicity Studies using the US EPA's ToxRefDB.
   Reproductive Toxicology 28:209-219.
Martin, MT, Mendez, E, Corum, DG,  Judson, RS, Rotroff, DM and Dix, DJ (2009). Profiling the Reproductive
   Toxicology of Chemicals from Multigeneration Studies in ToxRefDB. Toxicol. Sci. 11:181-190.
Martin, MT, Judson, RS, Reif, DM, Kavlock, RJ and Dix, DJ (2009). Profiling Chemicals Based on Chronic
   Toxicity Results from the US EPA ToxRef Database. Environ. Health Perspectives 117:392-399.
Kavlock, R, Austin, CP and Tice, RR (2009). Toxicity Testing in the 21st Century: Implications for Human
   Health Risk Assessment. Risk Analysis 29: 485-487.
Judson, R, Richard, A, Dix, DJ, Martin, M, Kavlock, R, Dellarco, V, Henry, T, Holderman, T, Sayre, P, Tan, S,
   Carpenter, T, and Smith, E (2009). The Toxicity Data Landscape for Environmental Chemicals. Environ.
   Health Perspectives 117:685-695.
Cohen Hubal, EA, Richard, AM, Shah, I, Gallagher, J, Kavlock, R, Blancato, J and Edwards, SW (2008).
   Exposure science and the US EPA National Center for Computational Toxicology. Journal of Exposure
   Science and Environmental Epidemiology,  1-6.
Houck, KA and Kavlock, RJ (2008).  Understanding mechanisms of toxicity: Insights from drug discovery.
   Toxicol. and Appl. Pharm. 227:163-178.
Rogers, JM and RJ Kavlock (2008).  Developmental toxicity.  In: Casarett & Doull's Toxicology: The
   Basic Science of Poisons, 7th ed. CD Klaassen, editor. McGraw-Hill, Inc., New York, NY, 301-331.
Dix, DJ, Houck, KA, Martin, MT, Richard, AM, Setzer, RW and Kavlock, RJ (2007). The ToxCast Program for
   Prioritizing Toxicity Testing of Environmental Chemicals. Toxicol. Sci. 95(1):5-12.
Martin, MT, Brennan, R, Hu, W, Ayanoglu, E, Lau, C, Ren, H, Wood, CR, Corton, JC, Kavlock, RJ and Dix, D.
   (2007). Toxicogenomic Study of Triazole Fungicides and Perfluoroalkyl Acids in Rat Livers Accurately
   Categorizes Chemicals and Identifies Mechanisms of Toxicity. Toxicol. Sci. 97(2):595-613.
Kavlock, R, Barr, D, Boekelheide, K, Breslin, W, Breysse, P, Chapin, R, Gaido, K, Hodgson, E, Marcus, M,
   Shea, K and Williams, P. (2006). NTP-CERHR Expert Panel update on the reproductive and developmental
   toxicity of di(2-ethylhexyl) phthalate. Repro. Toxicol. 22:291-399.
Kavlock, RJ, Ankley, GT, Collette, T, Francis,  E, Hammerstrom, K, Fowle, J, Tilson, H, Schmieder, P, Veith,
   GD, Weber, W, Wolf, DC, and Young, D. (2005).  Computational Toxicology: framework, partnerships and
   program development.  Repro. Tox. 19:281-290.
Kavlock, RJ and Cummings, A (2005).  Mode of Action: Reduction of Testosterone Availability-Molinate-
   induced Inhibition of Spermatogenesis. Crit. Rev. Tox.  35:685-690.
Cummings, A and Kavlock, RJ (2005). A systems biology approach to developmental toxicology. Repro.
   Toxicol.  19:281-290.
Cummings, A and Kavlock, RJ  (2004).  Gene-environment interactions: A review of effects on reproduction and
   development. Critical Reviews in Toxicology 34:461-485.
Rockett, JC, Kavlock, RJ, Lambright, C, Parks, LG, Schmid, JE, Wilson, VS, Wood, C and Dix, DJ
   (2002). DNA arrays to monitor gene expression in rat blood and uterus following 17β-estradiol
   exposure: biomonitoring environmental effects using surrogate tissues. Tox. Sci. 69:49-59.
Damstra T, Barlow, S,  Bergman A,  Kavlock R and Van Der Kraak, G, editors (2002). International
   Programme On Chemical Safety Global Assessment: The State-Of-The-Science Of Endocrine
   Disrupters.  World  Health Organization, Geneva.
Setzer RW, Lau C, Mole ML, Copeland MF, Rogers JM, and  Kavlock RJ. (2001). Toward a biologically based
   dose-response model for developmental toxicity of 5-fluorouracil in the rat: a mathematical construct.
   Toxicol Sci. 59(1):49-58.
Barlow, S, RJ Kavlock, JA Moore, SL Schantz, DM Sheehan, DL Shuey, and JM Lary (1999).  Teratology
   Society Position Paper: The developmental toxicity of endocrine disrupters to humans.  Teratology
   60(6):365-375.	
Reiter, LW, C DeRosa, RJ Kavlock, G Lucier, MJ Mac, J Melillo, RL Melnick, and T Sinks (1998).  The U.S.
   Federal framework for research on endocrine disrupters and an analysis of research programs supported
   during Fiscal Year 1996.  Environ Health Persp 106(3):105-113.
Kavlock, RJ and GP Daston  (1997).  Handbook of Experimental Pharmacology, Vol. 124 Drug Toxicity in
   Embryonic Development. Vol I and II: Advances in Understanding Mechanisms of Birth Defects:
   Morphogenesis and Processes at Risks.  (610 pgs). Springer-Verlag, Heidelberg, Germany, ISBN 3-540-61259-9;
   ISBN 3-540-61261-0.
Cooper, RL and RJ Kavlock  (1997).  Endocrine disrupters and reproductive development: A weight-of-
   evidence overview.  Journal of Endocrinology 152:159-166.
Kavlock, RJ and GT Ankley (1996). A perspective on the risk assessment process for endocrine-disruptive
   effects  on wildlife and human health. Risk Analysis 16(6):731-739.
Kavlock, RJ, GP Daston, C DeRosa, P Fenner-Crisp, LE Gray, S Kaattari, G Lucier, M Luster, MJ Mac, C
   Maczka, R Miller, J Moore, R Rolland, G Scott, DM Sheehan, T Sinks and HA Tilson (1996). Research
   needs for the risk assessment of health and environmental effects of endocrine disrupters: A Report of the
   U.S. EPA sponsored workshop. Environmental Health Perspectives Vol. 104, Supplement 4, pp 715-740.
Kavlock, RJ and RW Setzer (1996).  The road to embryologically based dose-response models.
   Environmental Health Perspectives 104 (Suppl 1):107-121.
Kavlock, RJ, BC Allen,  EM Faustman, CA Kimmel (1995). Dose response assessments for developmental
   toxicity: IV. Benchmark doses for fetal weight changes. J Fund Appl Toxicol 26:211-222.
Faustman, EM, BC Allen, RJ Kavlock, and CA Kimmel (1994). Dose-Response Assessment for
   Developmental Toxicity:  I. Characterization of Data Base and Determination of NOAELs.  Fundamental
   and Applied Toxicology 23:478-486.
Shuey, DL, C Lau, RJ Kavlock, JM Rogers, TR Logsdon, RM Zucker, KH Elstein, MG Narotsky, and RW
   Setzer (1994). Biologically-Based Dose-Response Modeling in Developmental Toxicology:  Biochemical
   and Cellular Sequelae  of 5-Fluorouracil  Exposure in the Rat Fetus.  Toxicol. Appl. Pharm. 126: 129-144.
Rogers, JM, ML Mole, N Chernoff, BD Barbee, CI Turner, TR Logsdon and RJ Kavlock (1993).  The
   Developmental Toxicity of Inhaled Methanol in the CD-1 Mouse, with Application of Quantitative Dose-
   Response Modeling for Estimation of Benchmark Doses. Teratology 47:175-188.
Oglesby, LA, MT Ebron-McCoy, TR Logsdon, F Copeland, PE Beyer and RJ Kavlock (1992).  In Vitro
   Embryotoxicity of a Series of Para-Substituted Phenols: Structure, Activity and Correlation with In Vivo
   Data.  Teratology 45(1):11-33.
Kavlock, RJ, GA Green, GL Kimmel, R Morrissey,  E Owens, JM Rogers, TW Sadler, HF Stack,  MD Waters
   and F Welsch  (1991).  Activity Profiles of Developmental Toxicity:  Design Considerations and Pilot
   Implementation.  Teratology 43:159-185.
Kavlock, RJ (1990).  Structure-activity relationships in the developmental toxicity of substituted phenols:  In
   vivo effects.  Teratology 41(1):43-59.
Kavlock, RJ, R Short and N Chernoff (1987).  Further evaluation of an in vivo teratology screening procedure.
   Teratogenesis, Carcinogenesis, and Mutagenesis 7:7-16.
Kavlock, RJ, N Chernoff and EH Rogers (1985). The effect of acute maternal toxicity on fetal development in
   the mouse. Teratogenesis, Carcinogenesis, and Mutagenesis 5(1):3-13.
Kavlock, RJ and JA Gray (1982).  Evaluation of renal function in neonatal rats. Biol. of the Neonate 41:279-288.
Chernoff, N and RJ Kavlock  (1982). An in vivo screen utilizing pregnant mice. J. of Toxicology and
   Environmental Health  10:541-550.
Gray, LE Jr., RJ Kavlock, N Chernoff, J Ferrell, J McLamb and J Ostby  (1982). Prenatal exposure to the
   herbicide TOK destroys the rodent Harderian gland. Science  215:293-294.

-------
        Principal Investigator/Program Director (Last, First, Middle):   Knudsen, Thomas B.
                                      BIOGRAPHICAL SKETCH
          Provide the following information for the key personnel and other significant contributors in the order listed on Form Page 2.
                           Follow this format for each person. DO NOT EXCEED FOUR PAGES.

NAME
KNUDSEN, Thomas B.
eRA COMMONS USER NAME
TOKNUD01
POSITION TITLE
Developmental Systems Biologist (title 42)
National Center for Computational Toxicology
  EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION                       DEGREE (if applicable)   YEAR(s)   FIELD OF STUDY
Albright College, Reading PA                   B.S.                     1976      Biology
Thomas Jefferson University, Philadelphia PA   Ph.D.                    1981      Anatomy
Children's Hospital, Cincinnati OH             Postdoc                  1981-82   Cell Biology
Emory University, Atlanta GA                   Postdoc                  1982-86   Developmental Biology

A. POSITIONS and HONORS

Research and Professional Experience:
1986-90   Assistant Professor, Dept. of Anatomy, E. Tenn. State University, Johnson City TN
1990-03   Asst./Assc./Full Prof. (tenured), Path. Anat. Cell Biol., Thomas Jefferson University, Philadelphia PA
2003-pres. Editor-in-Chief, Reproductive Toxicology (Elsevier)
2004-07   Professor (tenured), Mol. Cell & Craniofacial Biol., U of Louisville, School of Dentistry, Louisville KY
2004-07   Director, Systems Analysis Laboratory, U of Louisville, Birth Defects Center
2004-pres. Center for Genetics  and Molecular Medicine, U Louisville, Louisville KY
2005-07   Professor of Biochemistry and Molecular Biology (joint appointment), U Louisville
2007      Center for Environ. Genomics & Integrative Biology, U Louisville, Louisville KY
2007      Clinical and Translational Science Institute, Louisville, Biomedical Informatics Group
2007-pres. Adjunct Professor, University of Louisville
2007-pres. Developmental Systems Biologist (title 42), NCCT,  US Environmental Protection Agency, RTP NC

Professional Societies and Affiliations:
Memberships:    American Association for the Advancement  of Science (AAAS), Sigma Xi, Teratology
                 Society, Society of Toxicology, European Teratology Society
Editorial Boards:  Reproductive Toxicology (Elsevier), Editor-in-Chief (2003 - present); Birth Defects Research
                 (Part C), 2002 - present; Developmental Dynamics, 2002 - present; Co-Editor,
                 Developmental Toxicology (Comprehensive Toxicology Series - Elsevier)

Honors and Awards:
Fellowships: Predoctoral trainee, T32 HD07075 (1977-80);  NIH Postdoctoral trainee, F32 HD06212 (1982-85)
Federal grants (PI): NIH/NICHD grant R29 HD24143 (1989-94); NIH/NICHD grant R01 HD30302 (1993-98);
US EPA grant CR 824445 (1995-99); NIH/NIEHS grant R01 ES09120 (1998-01); NIH/NIEHS grant 1 R13
ES012410 (2003); US EPA-NCERQA grant R 827445-01-0 (1999-03); NIH/NIAAA grant R01 AA13205 (2001-08);
NIH/NIEHS grant 1 R13 ES013116 (2004); NIH/NIEHS grant 2 R01 ES09120 (2001-07); NIH/NIEHS
Training grant T32 ES07282 (1998-08); NIH/NIEHS grant 1 R21 ES013821-01 (2005-08)
Special Recognition:  Wilson Publication Award, Teratology Society (2002); University Scholar, U Louisville
(2003-08); Research!Louisville: 3rd place, Innovation in Biotechnology (2004); Distinguished Alumni Award
(2008) Thomas Jefferson University; Keynote Speaker, McGill  University Pharmacology Research Day (2009).

Leadership roles (selected advisory panels, organizing committees, workshops, editorial  boards):
Editor-in-Chief, Reproductive Toxicology (2003 - present); President of the Teratology Society (2007-08);
Chairman, Program Committee, 47th Annual Meeting of the Teratology Society; Council of the Teratology
Society (1999-02 and 2005-09); Scientific Liaison Task Force, Society of Toxicology (2008-12); European
Commission, Expert Panel (FP7); Steering Committee, First International Workshop on Virtual Tissues (EPA,
April 21-23, 2009); Steering Committee, ILSI-HESI DART Workshop on "Developmental Toxicology New
Directions", Leader - Working Group on New Technologies (2009); Co-Organizer, Symposium on  "Gene
Regulatory Networks in Developmental Biology and Computational Toxicology", Teratology Society 2009;
Director, Systems Analysis Laboratory, ULSD Birth Defects Center (2004-07); Workgroup "Consensus Panel
on Renaming the Peripheral Benzodiazepine Receptor (PBR)" (2004-05); Chairman, Committee on
Bioinformatics in Teratology (2002-05); Organizer, symposium on "Systems Biology: a new venue for exploring
mechanisms of developmental toxicity", Society of Toxicology (2004); Organizer, workshop on "Microarray
Data Analysis and Bioinformatics", Teratology Society (2002); Organizer, symposium on "Pluripotent Stem
Cells in and of the  Embryo", Teratology Society (2001); Organizer, symposium on  "Genomics in Birth Defects
Research", Teratology Society (1998); Director, Education Course on "Risk Assessment in Developmental
Toxicology", Teratology Society (1996); FASEB delegate from the Teratology Society (1998-01); NIH Human
Embryology & Development II Study Section (1994-98).

Speaking invitations (2007-09) at National & International Symposia (selected from 80 total):
Genes for low-dose regulation of the embryonic transcriptome (Hormesis - 6th  Int Conf, Amherst 2007)
Computational Toxicology: New Approaches to Improve Environ. Hlth Protection (RIVM, Bilthoven NL 2007)
Genomics as a tool in developmental toxicology (Eurotox, Amsterdam 2007)
Computational Toxicology: New Approaches to Improve Env. Hlth Protection (AIHA-SOT, Louisville 2007)
Computational Framework to Predict Toxicity & Prioritize Testing of Env. Chems. (NCAC-SOT 2007)
Virtual tissues and artificial life simulators: applications in birth defects research (Philadelphia 2008)
Virtual tissues and developmental systems biology (Gordon Research Conference, Andover 2008)
Computational embryology (CASCADE training in Developmental Risk Assessment, Berlin 2008)
Virtual tissue models in developmental toxicity research (SOT, Baltimore 2009)
Virtual tissue models and computational embryology (McGill University, Montreal 2009)
Modeling the Embryo (Gordon Research Conference, New London 2009)
Gene regulatory networks and the underlying biology of developmental toxicology (ETS, Arles FR 2009)

B. SELECTED PUBLICATIONS (70 total).

Chinsky JM, Ramamurthy V, Fanslow WC, Ingolia DE, Blackburn MR, Shaffer  KT, Higley HR, Trentin JJ,
   Rudolph FB, Knudsen TB, and Kellems RE (1990) Developmental expression of adenosine deaminase in
   the upper alimentary tract of mice. Differentiation 42: 172-183.
Airhart MJ, Roberts MA, Knudsen TB and Skalko, RG (1990) Axonal guidance of adenosine deaminase
   immunoreactive primary afferent fibers in developing mouse spinal cord. Brain Res Bull 25: 299-309.
Hong L, Mulholland J, Chinsky JM, Knudsen TB, Kellems RE and Glasser SR (1991) Developmental
   expression of adenosine deaminase during decidualization in the rat uterus. Biol Reprod 43: 83-93.
Knudsen TB, Blackburn MR, Chinsky JM, Airhart MJ and Kellems RE (1991) Ontogeny of adenosine
   deaminase in the mouse decidua and placenta: immunolocalization and embryo transfer studies. Biol.
   Reprod. 43: 171-184.
Knudsen TB, Winters RS, Otey  SK, Blackburn MR, Airhart MJ, Church JK and Skalko RG (1992) Effects of
   (R)-deoxycoformycin (pentostatin) on intrauterine nucleoside catabolism and embryo viability in the
   pregnant mouse. Teratology  45: 91-103.
Blackburn MR, Gao X, Airhart MJ, Skalko RG, Thompson LF and Knudsen TB (1992) Adenosine levels in the
   early postimplantation mouse uterus. Quantitative analysis by HPLC-fluorometric detection and  spatio-
   temporal regulation by 5'-nucleotidase and adenosine deaminase. Dev. Dynam. 194: 155-168.
Gao X, Blackburn MR and Knudsen TB (1994) Activation of apoptosis in early mouse embryos by 2'-
   deoxyadenosine exposure. Teratology 48: 1-12.
Gao X, Knudsen TB, Ibrahim MM and Haldar S (1995)  Bcl-2 relieves deoxyadenylate stress and suppresses
   apoptosis in Pre-B leukemia cells.  Cell Death Different. 2: 69-78.
Puffinbarger NK, Hansen KR, Resta R, Laurent AB, Knudsen TB, Madara JL and Thompson LF (1995)
   Production and characterization of multiple antigenic peptide antibodies to the adenosine A2b receptor. Mol
   Pharmacol 47:  1126-1132.

Ibrahim MM, Weber IT and Knudsen TB (1995) Mutagenesis of human adenosine deaminase to active forms
   that partially resist inhibition by pentostatin. Biochem. Biophys. Res. Commun. 209: 407-416.
Wubah JA, Ibrahim MM, Gao X, Nguyen D, Pisano MM and Knudsen TB (1996) Teratogen-induced eye
   defects mediated by p53-dependent apoptosis. Current Biology 6:60-69.
Resta R, Hooker SW, Laurent AB, Rahman SMJ, Franklin M, Knudsen TB, Nadon ML and Thompson LF
   (1997) Insights into  thymic purine metabolism and adenosine deaminase deficiency revealed by transgenic
   mice overexpressing ecto-5'-nucleotidase (CD73). J Clin Invest 99: 676-683.
Blackburn MR, Knudsen TB and Kellems RE (1997) Genetically engineered mice demonstrate that adenosine
   deaminase is essential for early postimplantation development. Development 124: 3089-97
Knudsen TB (1997)  Genetic and Cellular Pathways in Teratogen-induced Cell Death. In Comprehensive
   Toxicology (Vol. 10), Sipes IG, McQueen CA and Gandolfi AJ (eds.), New York: Pergamon, pp. 529-534.
Knudsen TB and Wubah  JA (1998) Transgenic animal models. Functional analysis of developmental toxicity
   as illustrated with the p53 suppressor model. In Handbook of Developmental Neurotoxicology, Slikker, W.
   Jr. and Chang, L.W. (eds.), San Diego: Academic Press, pp 209-221.
Ibrahim MM, Razmara M,  Nguyen D, Donahue RJ, Wubah JA and Knudsen TB (1998) Altered expression of
   mitochondrial 16S ribosomal RNA in p53-deficient mouse embryos revealed by differential display.
   Biochim Biophys Acta 1403: 254-264.
Blackburn MR, Wubah JA, Thompson LF and Knudsen TB (1999) Transitory expression of the A2b adenosine
   receptor during implantation chamber development. Devel Dynam 216: 127-136.
Knudsen TB (1999)  HPLC-based mRNA differential display. In: Developmental Biology Protocols (vol. II),
   Tuan, R.S. and Lo, C.W. (eds.). Totowa: Humana Press, Inc., pp 337-341.
Knudsen TB (2000)  Mitochondrial transduction of teratogenesis. Teratology 62: 238-239.
Donahue RJ, Razmara  M, Hoek JB and Knudsen TB (2001) Direct influence of the p53 tumor suppressor on
   mitochondrial biogenesis and function. FASEB J 15: 635-644.
Wubah JA, Setzer RW, Lau C, Charlap JH and Knudsen TB (2001)  Exposure-disease continuum for 2-chloro-
   2'-deoxyadenosine, a prototype ocular teratogen. I. Dose-response analysis. Teratology 64: 154-169.
O'Hara MF,  Charlap JH, Craig RC and Knudsen TB (2002) Mitochondrial transduction of ocular teratogenesis
   during methylmercury  exposure. Teratology 65:131-144.
Lau C, Narotsky MG, Lui D, Best D, Setzer RW, Mann PC, Wubah JA and Knudsen  TB (2002) Exposure-
   disease continuum for 2-chloro-2'-deoxyadenosine, a prototype teratogen. II. Induction of lumbar hernia in
   the rat and species  comparison for the teratogenic responses. Teratology 66: 6-18.
Knudsen TB, Charlap JH and Nemeth KA (2003) Microarray applications in developmental toxicology. In:
   Perspectives in Gene Expression. K. Appasani, ed. Eaton Publishing/BioTechniques Press, Westboro MA.
   pp 173-194.
Charlap JC, Donahue RJ and Knudsen TB (2003) Exposure-disease continuum for 2-chloro-2'-
   deoxyadenosine, a prototype ocular teratogen. 3. Intervention with PK11195. Birth Defects Res (A) 67:
   108-115.
O'Hara MF,  Nibbio BJ, Craig RC, Nemeth KR, Charlap JH and Knudsen TB (2003) Mitochondrial
   benzodiazepine receptors regulate oxygen homeostasis in the early mouse embryo. Repro Tox 17: 365-375.
Lee JW, Park J, Jang B and Knudsen TB (2004) Altered Expression of Genes Related to Zinc Homeostasis in
   Early mouse Embryos Exposed to Di-2-ethylhexyl phthalate. Toxicol Lett 152: 1-10.
Knudsen TB and Green ML (2004) Response characteristics of the mitochondrial DNA genome in
   developmental health and disease. Birth Defects Res (C) 72: 313-329.
Singh AV, Knudsen KB and Knudsen TB (2005) Computational systems analysis of  developmental toxicity:
   design, development and implementation of a birth defects systems manager (BDSM). Reprod Tox 19:
   421-439.
Knudsen TB (2005)  How can we use bioinformatics to predict which agents will cause  birth defects?  In:
   Primer in Teratology (B Hales and A Scialli, eds), Teratology Society, Chapter 20, pp. 58-59.
Szabo G, Hoek JB, Darley-Usmar V, Hajnoczky G, Knudsen TB, Mochly-Rosen D, and Zakhari, S (2005) RSA
   2004: Combined basic research satellite symposium - session three: Alcohol and Mitochondrial
   Metabolism: At the crossroads of life and death. Alcoh. Clin. Exp. Res. 29: 1749-1752.
Nemeth KA, Singh AV and Knudsen TB (2005) Searching for biomarkers of developmental toxicity with
   microarrays: normal eye morphogenesis in rodent embryos. Toxicol Appl Pharmacol. 206: 219-228.

Knudsen KB, Singh AV and Knudsen TB (2005) Data input module for Birth Defects Systems Manager.
   Reprod Tox 20: 369-375.
Slikker W Jr, Young JF, Corley RA, Dorman DC, Conolly RB, Knudsen TB, Erstad BL, Luecke RH, Faustman
   EM, Timchalk C and Mattison DR (2005) Improving predictive modeling in pediatric drug development:
   pharmacokinetics, pharmacodynamics and mechanistic modeling. Ann NY Acad Sci 1053: 505-518.
Kinane DF, Shiba H, Stathopoulou PG, Zhao H, Lappin DF, Singh AV, Eskan MA, Beckers S, Weigel S, Alpert
   B and Knudsen TB (2006) Gingival epithelial cells heterozygous for Toll-like receptor 4 polymorphisms
   Asp299Gly and Thr399Ile are hypo-responsive to Porphyromonas gingivalis. Genes & Immun 7: 190-200.
Papadopoulos V, Baraldi M, Guilarte TR, Knudsen TB, Lacapere JJ, Lindemann P, Norenberg MD, Nutt D,
   Poupon MF, Weizman A, Zhang MR and Gavish M. (2006) TspO: New Nomenclature for the peripheral-type
   Benzodiazepine receptor/recognition Site (PBR) based on its structure and molecular function.
   Trends in Pharmacol Sci. 8: 402-409.
Green ML, Singh AV, Zhang Y,  Nemeth KA, Sulik KK and Knudsen TB (2007) Reprogramming of genetic
   networks during initiation of the fetal alcohol syndrome. Devel Dynam 236: 613-631.
Singh AV, Knudsen KB and Knudsen TB (2007) Integrative Analysis of the mouse embryonic transcriptome.
   Bioinformation 1: 24-30.
Singh AV, Rouchka EC, Rempala GA, Bastian CD and Knudsen TB (2007)  Integrative database management
   for mouse development: systems and concepts. Birth Defects Res (C) 81: 1-19.
Calabrese EJ, Bailer J,  Bachmann KA, Bolger PM, Borak J, Cai L, Cedergreen N, Chiueh CC, Cherian MG,
   Clarkson TW, Cook RR, Diamond DM, Doolittle DJ, Dorato MA,  Duke SO, Feinendegen L, Gardner DE,
   Hart  RW, Hastings KL,  Hayes AW, Hoffman GR, Jaworowski Z, Johnson TE, Keller JG, Klaunig JE,
   Knudsen TB, Kozumbo WJ, Lettieri T, Liu S-Z, Maisseu A, Maynard K, Masoro EJ, Mothersill C, Newlin
   DB, Oehme FW, Phalen RF, Philbert MA, Rattan SIS, Riviere JE, Rodricks J, Sapolsky RM, Scott  BR,
   Seymour C, Smith-Sonneborn J, Snow ET, Spear L, Stevenson  DE, Thomas Y, Williams GM and  Mattson
   MP (2007) Biological Stress Response Terminology: Integrating the concepts of adaptive response and
   preconditioning stress within a hormetic dose-response framework. Toxicol Appl Pharmacol 222: 122-128.
Deaciuc IV,  Song Z, Peng X, Barve SS, Song M, He Q, Knudsen TB, Singh AV, and McClain CJ (2008)
   Genome-wide transcriptome expression in the  liver of a mouse model of high carbohydrate diet-induced
   liver steatosis and its significance for the disease. Hepatol International  2: 39-49
Barthold JS, McCahan, Singh AV, Knudsen TB, Si X, Campion L and Akins RE (2008) Altered expression of
   muscle and cytoskeleton-related genes in a rat strain with inherited cryptorchidism. J Androl. 29:352-366.
Datta S,  Turner D, Singh R, Ruest LB, Pierce WM  Jr and Knudsen TB (2008) Fetal Alcohol Syndrome (FAS)
   in C57BL/6 mice detected through proteomics screening of the amniotic fluid. Birth Defects Res (Part A) 82:
   177-186.
Knudsen TB and Kavlock RJ (2008) Comparative bioinformatics and computational toxicology. In:
   Developmental Toxicology 3rd edition. (B Abbott and D Hansen, editors) New York: Taylor and Francis,
   Chapter 12, pp 311-360
Knudsen TB, Martin MT, Kavlock RJ, Judson RS, Dix DJ and Singh AV (2009)  Profiling the Activity of
   Environmental Chemicals in Prenatal Developmental Toxicity Studies using  the U.S. EPA's ToxRefDB.
   Reproductive Toxicol 28: 209-219
Benakanakere MR, Li Q, Eskan MA, Singh AV, Galicia JC, Stathopoulou P, Knudsen TB and Kinane  DF
   (2009) MicroRNA-105 modulates TLR-2 responses in human oral keratinocytes. J Biol Chem (In Press)
Ema M, Ise R, Kato H, Oneda S, Hirose A, Hirata-Koizumi M, Singh AV, Knudsen TB, and Ihara T (2009)
  Fetal malformations and early embryonic gene expression response in cynomolgus monkeys maternally
  exposed to thalidomide (submitted June 2009).
Knudsen TB,  Houck K, Judson RS, Singh AV, Weissman A, Mortensen H, Reif D, Dix DJ, and Kavlock RJ
  (2009) Biochemical activities of 320 ToxCast™ chemicals evaluated across 239 functional targets
  (submitted July 2009).
Judson RS, Houck KA, Kavlock RJ, Knudsen TB, Martin MT, Mortensen HM, Reif DM, Richard AM, Rotroff
  DM, Shah I and Dix DJ (2009) Predictive in vitro screening of environmental chemicals - the ToxCast
  project (submitted August 2009).

-------
        Staff Scientist (Last, First, Middle):   Little, Stephen B.
                                     BIOGRAPHICAL SKETCH
          Provide the following information for the key personnel and other significant contributors in the order listed on Form Page 2.
                          Follow this format for each person. DO NOT EXCEED FOUR PAGES.

NAME
Stephen Blair Little
eRA COMMONS USER NAME
POSITION TITLE
Chemist
  EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION                       DEGREE (if applicable)   YEAR(s)   FIELD OF STUDY
Gardner-Webb University, Boiling Springs, NC   B.S.                     1977      Mathematics
North Carolina State University, Raleigh NC    B.S.                     1981      Biochemistry
North Carolina State University, Raleigh NC    M.A.                     2001      Toxicology
Indiana University, Bloomington, Indiana       Graduate Certificate     2009      Chemical Informatics

A. POSITIONS and HONORS

Research and Professional Experience:
2005-present Computational Chemist, National Center for Computational Toxicology, ORD
1995-2005   Chemist with the Environmental Carcinogenesis Division, NHEERL, US Environmental
             Protection Agency, RTP, NC
1993-1995   Research Scientist, contract support scientist with Integrated Laboratory Systems, Inc. working
             at U.S. EPA, Environmental Carcinogenesis Division, RTP, NC
1984-1993   Research Assistant, contract support scientist with Environmental Health Research and Testing,
             Inc. working at U.S. EPA, Environmental Carcinogenesis Division, RTP, NC
1982 - 1984  Research Technician, BSRC , UNC-CH, Chapel Hill, NC
1981-1982   Research Technician, Cancer Research Center, UNC-CH, Chapel Hill, NC
1980-1981    Research Chemist (GS-5), ACB, U.S. EPA, RTP, NC
1977-1979   Technical Assistant for Acme United Corporation, Fremont, NC

Professional Societies and Affiliations:
1982-Present  American Chemistry
2002 - Present  Society of Toxicology
1988 - Present   Genetics and  Environmental Mutagenesis Society

Honors and Awards:
1995  Time off Award - for special technical assistance
1997  Outstanding Performance Award
1998, 2000   On-The-Spot (OTS) awards
2001  Group cash award.
2001  "S" Special Accomplishment Recognition Award - for significant contribution to the EPA cosponsored
EMS International  Breast Cancer meeting
2003 Scientific and Technological Achievement Award - honorable mention
2004   "S" Award - for serving as Conazole QA TSR technical expert on review team
2006  Time off Award 2006 - group award for establishing NCCT

Selected Expert Committees/Advisory Panels/Organizing Committees:
2003-2005   Genetics and Environmental Mutagenesis Society Scientific Board of Councilors

2007-2009   Genetics and Environmental Mutagenesis Society Scientific Board of Councilors


B. PUBLICATIONS (since 1995)

Lewis-Bevan, L., Little, S.B. and Rabinowitz, J.R.  Quantum Mechanical Studies of the Structure and
Reactivities of the Diol Epoxides of Benzo[c]phenanthrene,  Chemical Research in Toxicology, 8: 499, 1995.

Rabinowitz, J.R., Little, S.B. and Lewis-Bevan, L.  The Effect of Crowding in the Bay/Fjord Region on  the
Structure and Reactivities of Polycyclic Aromatic Hydrocarbons and their metabolites: Quantum Mechanical
Studies,  Polycyclic Aromatic Compounds, 11:237, 1996.

Rabinowitz, J.R., Gifford, E.M. and Little, S.B.  The Interactions between Chlorinated Dioxins and a Positively
Charged Molecular Probe: A New Molecular Interaction Potential, Journal of Computational Chemistry, 19:
673-684, 1998.

Little, S. B., Rabinowitz, J. R., Wei, P. and Yang W. A Comparison of Calculated and Experimental
Geometries for Crowded Polycyclic Aromatic Hydrocarbons and Their Metabolites,  Polycyclic Aromatic
Compounds, 14:53-61, 1999.

Rabinowitz, J.R., Little, S.B. and Brown, K.W. "Why Does 5-Methyl Chrysene Interact with DNA Like Both a
Planar and a Non-Planar Polycyclic Aromatic Hydrocarbon? Quantum Mechanical Studies",  International
Journal of Quantum Chemistry, 88: 99-106. 2002.

Rabinowitz, J.R., Little, S. B., Brown, K.W., Benzo[a]pyrene and benzo[c]phenanthrene: the effect of structure
on the binding of water molecules to the diol epoxides. Chemical Research in Toxicology. 15: 1069-1079.
2002.

Rabinowitz, J. R., Goldsmith, M-R., Little,  S. B., and Pasquinelli, M. A. (2008) Computational Molecular
Modeling for Evaluating the Toxicity of Environmental Chemicals:  Prioritizing Bioassay Requirements.
Environmental Health Perspectives, 116, 573-577.

Knight, AW, Little, S, Houck, K, Dix, D, Judson, R, Richard, A, McCarroll, N, Akerman, G, Yang, C, Birrell, L,
Walmsley, RM, Evaluation of High-throughput Genotoxicity Assays Used in Profiling the US EPA ToxCast™
Chemicals, Regulatory Toxicology and Pharmacology (2009) in press.

Rabinowitz, J; Little, S; Laws, S; Goldsmith, M-R, Molecular Modeling for Screening Environmental Chemicals
for Estrogenicity: Use of the Toxicant-Target Approach, Chemical Research in Toxicology, (2009) accepted for
publication.

-------
        Principal Investigator/Program Director (Last, First, Middle):   Martin, Matthew T.
                                       Biographical Sketch
         Provide the following information for the key personnel and other significant contributors in the order listed on Form Page 2.
                          Follow this format for each person. DO NOT EXCEED FOUR PAGES.

NAME
Matthew T. Martin
eRA COMMONS USER NAME
POSITION TITLE
Biologist
  EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION                       DEGREE (if applicable)   YEAR(s)              FIELD OF STUDY
James Madison University, Harrisonburg, VA     BS                       2003                 Integrated Science and Technology (ISAT)
University of North Carolina at Chapel Hill    MS                       2008                 Environmental Sciences and Engineering
University of North Carolina at Chapel Hill    PhD                      Currently enrolled   Environmental Sciences and Engineering
University of North Carolina at Chapel Hill                             Currently enrolled   Bioinformatics and Computational Biology Training Program

A. POSITIONS and HONORS

Research and Professional Experience:
2005-Present   Biologist,  National Center for Computational Toxicology,  USEPA, NC
2005           Database Analyst. CH2M Hill Inc. Herndon, VA
2003-2005      Environmental Scientist. Versar Inc. Springfield, VA

Professional Societies and Affiliations:
2008-present   Society of Toxicology


Honors and Awards:
2008           Superior Accomplishment Recognition Award (SARA) for development and
                application of the Toxicity Reference Database
2008           SARA for enabling the comprehensive retrospective analysis of legacy toxicity data
                in collaboration with scientists in the Office of Pesticide Programs (OPP)
2007           SARA Office of Pesticide Programs Strategic Vision on developing an Integrated
                and Hypothesis Based Assessment Paradigm
2003           ISAT Outstanding Senior Thesis Award

Selected Expert Committees/Advisory Panels/Organizing Committees:
2008-present OECD Extended One Generation Reproductive Toxicity Study (EOGRTS) Working Group

Selected Assistance/Advisory Support to the Agency:
2006-present Developing ToxRefDB for OPP use
B. PUBLICATIONS (8 Published & 7 Submitted).
 PHS 398/2590 (Rev. 09/04, Reissued 4/2006)
Biographical Sketch Format Page

-------
       Principal Investigator/Program Director (Last, First, Middle):   Martin, Matthew T.

Martin MT, Rotroff DM, Reif DM, Dix DJ, Judson RS, Kavlock RJ, Houck KA. Identifying Toxicity-
   Dependent Nuclear Receptor Activation by Monitoring Transcription Factor Activity (2009),
   manuscript submitted.

Rotroff DM, Beam A, Martin MT, Freeman K, LeCluyse E, Houck K, Judson R, Kavlock R, Dix DJ,
   Ferguson S. Modulation of Xenobiotic Metabolizing Enzyme and Transporter Gene
   Expression in Primary Cultures of Human Hepatocytes by ToxCast Chemicals. (2009) Toxicology
   and Applied Pharmacology, submitted.

Knudsen et al. Biochemical Activities of 309 ToxCast™ Chemicals Evaluated Across 239 Functional
   Targets (2009), manuscript submitted.

Judson RS, Houck KA, Kavlock RJ, Martin MT, Mortensen HM, Reif DM, Richard AM, Rotroff DM,
   Shah I, Dix DJ. Predictive In Vitro Screening of Environmental Chemicals - The ToxCast Project
   (2009), manuscript submitted to EHP.

Shah et al. Human Nuclear Receptor Activity Stratifies Rodent Hepatocarcinogens (2009), manuscript
   submitted.

Martin MT, Rotroff DM, Reif DM, Dix DJ, Judson RS, Kavlock RJ, Houck KA. Identifying Toxicity-
   Dependent Nuclear Receptor Activation by Monitoring Transcription Factor Activity (2009),
   manuscript submitted.

Judson RS, Houck KA, Kavlock RJ, Knudsen TB, Martin MT, Mortensen HM, Reif DM, Richard AM,
   Rotroff DM, Shah I & Dix DJ. High throughput screening of toxicity pathways perturbed by
   environmental chemicals (2009), manuscript submitted.

Martin MT, Judson RS, Reif DM, Kavlock RJ, Dix DJ. Profiling Chemicals Based on  Chronic Toxicity
   Results from the U.S. EPA ToxRef Database.  Environmental Health Perspectives 117:1-8 (2009).

Martin MT, Mendez E, Corum DG, Judson RS,  Kavlock RJ, Rotroff DM and Dix DJ.  Profiling the
   Reproductive Toxicity of Chemicals from Multigeneration Studies in the Toxicity Reference
   Database (ToxRefDB). Toxicol Sci, doi: 10.1093/toxsci/kfp080 [online 10 April 2009] (2009).

Knudsen TB, Martin MT, Kavlock RJ, Judson RS, Dix DJ & Singh AV.  Profiling the Activity of
   Environmental Chemicals in Prenatal Developmental Toxicity Studies using the U.S. EPA's
   ToxRefDB. Reproductive Toxicol 28, in press (2009).

Kavlock RJ, Dix DJ, Houck KA, Judson RS, Martin MT, Richard AM. (2007). ToxCast: Developing
   predictive signatures for chemical toxicity. Alt. Animal Test Experiment. 14, Special Issue,
   623-627.

Judson R, Richard A,  Dix DJ, Houck K, Martin M, Kavlock RJ, Dellarco V, Henry Holderman T, Sayre
   P, Tan S, Carpenter T and Smith E. The Toxicity Data Landscape for Environmental Chemicals.
   Environmental Health Perspectives 2009; In Press: doi:10.1289/ehp.0800168 [Online 22 December
   2008].

-------

Judson RS, Richard AM, Dix DJ, Houck K, Elloumi F, Martin MT, Cathey T, Transue TR, Spencer R,
   Wolf M. 2008b. ACToR - Aggregated Computational Toxicology Resource. Toxicology and
   Applied Pharmacology. doi:10.1016/j.taap.2007.12.037 [Online 11 June 2008].

Martin MT, Brennan RJ, Hu W, Ayanoglu E, Lau C, Ren H, Wood CR, Corton JC, Kavlock RJ, Dix DJ.
   Toxicogenomic study of triazole fungicides and perfluoroalkyl acids in rat livers predicts toxicity
   and categorizes chemicals based on mechanisms of toxicity. Toxicol Sci. 97(2):595-613 (2007).

Dix DJ, Houck KA, Martin MT, Richard AM, Setzer RW and Kavlock RJ. The ToxCast Program for
   Prioritizing Toxicity Testing of Environmental Chemicals. Toxicol Sci. 95(1):5-12 (2007).

-------
        Principal Investigator/Program Director (Last, First, Middle): Mortensen, Holly M.
                                     BIOGRAPHICAL SKETCH
  NAME
  Holly M. Mortensen
  eRA COMMONS USER NAME
   POSITION TITLE

   Research Biologist
  EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION                   DEGREE (if applicable)  YEAR(s)  FIELD OF STUDY
University of Maryland (College Park, MD)  Ph.D.                   2008     Human Genetics
University of Maryland (College Park, MD)  M.S.                    2005     Biology
Stanford University (Stanford, CA)         M.S.                    2001     Anthropological Genetics
University of California (Davis, CA)       B.S.                    1999     Biological Anthropology

A. POSITIONS and HONORS

Research and Professional Experience:
2008-Present Research Biologist, National Center for Computational Toxicology
             Office of Research and Development U.S. Environmental Protection Agency,
             Research Triangle Park, NC
2001-2008    Graduate Research Assistant, Department of Biology,
             University of Maryland, College Park, MD
             (Advisor: Sarah A. Tishkoff)
2006         HHMI Teaching and Learning Fellow: Course Development in Bioinformatics
1999-2001    Graduate Research Assistant, Departments of Genetics and Anthropological Sciences,
             Stanford University, Stanford, CA
             (Advisors: Joanna Mountain, Luca Cavalli-Sforza)
1997-1999   Research Assistant, Department of Biological Anthropology, University of California, Davis, CA
             (Advisor: David Glenn Smith)


Professional Societies and Affiliations:
2008-present Society of Toxicology (SOT)
2005-present American Association for the Advancement of Science (AAAS)
2004-present American Association of Anthropological Genetics (AAAG)
2001-present American Society of Human Genetics (ASHG)
1999-present American Association of Physical Anthropologists (AAPA)

Honors and Awards:
2006         Invited Participant, Wellcome Trust Advanced Course: Working with the HapMap
             Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
2006         Howard Hughes Medical Institute Teaching and Learning Fellowship
             University of Maryland, College Park, MD
2004         AAAG Student Prize for Paper Presentation
             AAPA yearly meeting, Tampa, FL
2004         Invited Participant: Course on Molecular Evolution
             Marine Biological Laboratory, Woods Hole, MA
2001-2008   NSF-IGERT Fellowship in  Hominid Paleobiology
2001-2003   Eugenie Clark Fellowship


-------

             University of Maryland, College Park, MD
2002         Jacob K. Goldhaber Travel Grant
             University of Maryland, College Park, MD
2001         Ford Foundation Graduate Research Grant for Ecological Research
             Stanford University, Stanford, CA
2000         Mary M. Wohlford Fellowship, Morrison Institute for Population and Resource Studies
             Stanford University, Stanford, CA
1999-2001    Foreign Language Area Studies (FLAS) Fellowship, Committee for African Studies
             Stanford University, Stanford, CA
1994-1996    Cal Grant A, California Student Aid Commission, CA


Teaching Experience:
2006         Teaching Assistant, HHMI Course Development in Bioinformatics
             University of Maryland, College Park, MD
2002-2003    Teaching Assistant, Introductory Genetics
             University of Maryland, College Park, MD
2004         Teaching Assistant, Principles of Biology
             University of Maryland, College Park, MD
2000         Teaching Assistant, Laboratory Research Methods in Anthropological Genetics
             Stanford University, Stanford, CA
1997         Undergraduate Teaching Assistant, Human Evolutionary Biology
             University of California, Davis, CA

B. SELECTED PUBLICATIONS
Tishkoff, S.A., Reed, F.A., Friedlaender, F.R., Ehret, C., Ranciaro, A., Froment, A., Hirbo, J.B., Awomoyi, A.A.,
      Bodo, J., Doumbo, O., Ibrahim, M., Juma, A.T., Kotze, M.J., Lema, G., Moore, J.H., Mortensen, H.M.,
      Nyambo, T.B., Omar, S.A., Powell, K., Pretorius, G.S., Smith, M.W., Thera, M.A., Wambebe, C., Weber,
      J.L., Williams, S.M., May 2009, The Genetic Structure and History of Africans and African Americans,
      Science 324 (5930): 1035-1044.
Mortensen, H.M., July 2008, Genetic Variation at the  N-Acetyltransferase (NAT) Genes in Global Human
       Populations, Doctoral Dissertation, Department of Biology, University of Maryland, College Park.
Tishkoff, S. A., Gonder, M.K., Henn, B.M., Mortensen, H.M., Fernandopulle, N., Gignoux, C., Lema, G.,
       Nyambo, T. B., Underhill, P.A., Ramakrishnan, U., Reed, F. A., Mountain, J. L., July 2007, History of
       click-speaking populations of Africa inferred from mtDNA and Y chromosome genetic variation,
       Molecular Biology and Evolution, 24 (10): 2180-2195.
Gonder, M. K., Mortensen, H.M., Reed, F.A., Tishkoff, S.A., 2007, Whole mtDNA Genome Sequence Analysis
       of Ancient African Lineages, Molecular Biology and Evolution 24 (3):757-768.
Tishkoff, S. A., Reed, F. A., Ranciaro, A., Voight, B. F., Babbitt,  C. C., Silverman, J. S., Powell, K.,
       Mortensen, H.  M., Hirbo, J. B., Osman, M., Ibrahim, M., Omar, S. A., Lema, G., Nyambo, T. B., Ghori,
       J., Bumpstead, S.,  Pritchard, J. K., Wray,  G. A.,  and  Deloukas, P., 2007, Convergent adaptation of
       human lactase persistence in Africa and Europe, Nature Genetics 39 (1):31-40.
Knight, A., Underhill, P.A., Mortensen, H.M., Zhivotovsky, L.A., Henn, B.M., Ruhlen, M., Mountain,
       J.L., 2003, African Y Chromosome and mtDNA Divergence Provides Insight into the History of Click
       Languages. Current Biology, 13: 464-473.
Malhi, R.S., Mortensen, H.M., Eshleman, J.A., Kemp, B.M., Lorenz, J.G., Kaestle, F. A., Johnson, J.R.,
       Gorodezky, C., Smith, D.G., 2003, Native American mtDNA Prehistory in the American Southwest.
       American Journal of Physical Anthropology, 120: 108-124.
Mortensen, H.M., 2001, Evidence of African Prehistory: Y Chromosome and mtDNA Analysis of
       Four Linguistically Divergent African Groups (the Hadzabe, Datoga, Iraqw, and Sukuma of
       North-eastern Tanzania). MS Thesis. Department of Anthropological Sciences, Stanford University.

-------

C. SELECTED PRESENTATIONS
Mortensen, H.M. (May 2009) Mapping Human Toxicity and Disease Pathways in ToxCast™. ToxCast Data
   Analysis Summit. US EPA Research Triangle Park, NC. USA. (presentation).
Mortensen, H.M., Dix,  D., Houck, K., Kavlock, R., Judson, R. (May 2009) Using the ToxMiner™ Database
   for Identifying Disease-Gene Associations in the ToxCast™ Dataset. National Academy of Science.
   Symposium on Toxicity Pathway-Based Risk Assessment: Preparing for Paradigm Change. Washington,
   DC. USA (poster).
Houck, K., Mortensen, H.M., Witt, K.L., Menghang, X. (April 2009) Call for Nominations of Quantitative
   High-Throughput Screening Assays from Relevant Human Toxicity Pathways. The Society for Biomolecular
   Screening, Lille, France.
Mortensen, H.M., Dix,  D., Houck, K., Kavlock, R., Shah, I., Judson, R. (March 2009) The ToxCast™ Pathway
   Database: Identifying Toxicity Signatures and Potential Modes of Action from Chemical Screening Data.
   Society of Toxicology, Baltimore, MD, USA (poster).
Mortensen, H.M., Awadalla,  P., Tishkoff, S.A. (October, 2007) The Role of Natural Selection in Shaping
   Genetic Variation at the N-acetyltransferase (NAT) Genes in African and Global Populations. American
   Society of Human Genetics, San Diego, CA,  USA (poster)
Mortensen, H.M., Gonder, M. K., Tishkoff, S.A. (April, 2004) Ancient Migrations and Population
   Expansions in East Africa: Genetic Evidence for Tanzanian Prehistory. American Association of Physical
   Anthropology. Tampa, FL, USA (presentation)
Mortensen, H.M., Gonder, K., Tarazona-Santos, E., Hirbo, J., Powell, K., Knight, A., Mountain, J., Tishkoff,
   S.A., 2003, Genetic History of Hunting and Gathering Populations of Tanzania. American Association of
   Physical Anthropology, Tempe, AZ, USA (poster)
Powell, K.B., Mortensen, H.M., Tishkoff, S.A., 2003, The evolution of Lactase Persistence in African
   Populations. American Association of Physical Anthropology, Tempe, AZ, USA (poster)
Mortensen, H.M., Gonder, M. K., Tarazona-Santos, E., Hirbo, J. B., Mountain, J., Tishkoff, S. A., 2002,
   Genetic history of linguistically diverse Tanzanian populations inferred from mtDNA. American Society of
   Human Genetics yearly meeting, Baltimore, MD, USA (poster)

-------
         Principal Investigator/Program Director (Last, First, Middle): Mundy, William R.
                               BIOGRAPHICAL SKETCH
NAME
William R. Mundy
eRA COMMONS USER NAME
Mundy
                           POSITION TITLE

                           Research Toxicologist
EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION      DEGREE (if applicable)  YEAR(s)  FIELD OF STUDY
University of Massachusetts   B.S.                    1979     Environmental Science
University of Kentucky        M.S.                    1983     Toxicology
University of Kentucky        Ph.D.                   1987     Toxicology

 A. POSITIONS and HONORS

 Research and Professional Experience:
 2009-Present   Research Toxicologist, Integrated Systems Toxicology Division, NHEERL, USEPA
 1990-2009      Research Toxicologist, Neurotoxicology Division, NHEERL, USEPA
 1987-1990      NIH Staff Fellow, National Institute of Environmental Health Sciences
 1981-1987      Research Assistant, Graduate Program in Toxicology, Univ. of Kentucky
 1980-1981      Toxicology Technician, Litton Bionetics Inc., Rockville, MD

 Professional Societies and Affiliations:
 Memberships: American Society for Neurochemistry, International Neurotoxicology Association,
 North Carolina Society for Neuroscience, Society for Neuroscience, Society of Toxicology

 Honors and Awards:
 USEPA Scientific and Technological Achievement Award, 1999; USEPA Scientific and
 Technological Achievement Award, 2000; Neurotoxicology Division SAINT Award, 2002;
 NHEERL Strategic Goal Award: Future Issues, 2003; NHEERL Teamwork Award, 2003;
 USEPA ORD Honor Award: Bronze Medal, 2003; USEPA Scientific and Technological
 Achievement Award (Honorable Mention), 2006

 Selected invitations at National & International Symposia:
 Annual Meeting of the Society of Toxicology, Salt Lake City,  UT, March 9, 2003; Annual
 Summer Meeting of the Toxicology Forum, Aspen, CO, June 14, 2003; Mid-year Advisory Board
 meeting for the Johns Hopkins Center for Alternatives to Animal Testing (CAAT), June 10, 2004;
 ECVAM-CEFIC Workshop on validation of alternative approaches for developmental
 neurotoxicity: models and endpoints, European Commission Joint Research Centre, Ispra, Italy,
 April 19-21, 2005; Duke Center for Drug Discovery, High Content Cellular Imaging Symposium,
 Durham, NC, October 8, 2008; CAAT TestSmart DNT2 meeting, Reston, VA, November 12,
 2008

-------

Selected Expert Committees/Advisory Panels/Organizing Committees:
Advisory Board Member, Johns Hopkins University Center for Alternatives to Animal Testing
(CAAT)

B. SELECTED PUBLICATIONS (selected from more than 65 total).

Inglefield, J.R., Mundy, W.R. and Shafer, T.J.: Inositol 1,4,5-triphosphate receptor-sensitive
   Ca2+ release, store-operated Ca2+ entry, and cAMP responsive element binding protein
   phosphorylation in developing cortical cells following exposure to polychlorinated biphenyls.
   J. Pharmacol. Exp. Ther. 297:762-773, 2001.
Altmann, L., Mundy, W.R., Ward, T.R., Fastabend, A. and Lilienthal, H.: Developmental
   exposure of rats to a reconstituted PCB mixture or Aroclor 1254: Effects on long-term
   potentiation and [3H]MK-801 binding in occipital cortex and hippocampus. Toxicol. Sci.
   61:321-330, 2001.
Inglefield, J.R., Mundy, W.R., Meacham, C.A., and Shafer, T.J.: Identification of
   calcium-dependent and -independent signaling pathways involved in polychlorinated biphenyl-induced
   CREB phosphorylation in developing cortical neurons. Neuroscience 115:559-573, 2002.
Parran, D.K., Barone Jr., S., and Mundy, W.R.: Methylmercury decreases NGF-induced TrkA
   autophosphorylation and neurite outgrowth in PC12 cells. Dev. Brain Res. 141:71-81, 2003.
Parran, D.K., Barone Jr., S., and Mundy, W.R.: Methylmercury inhibits TrkA signaling through
   the ERK1/2 cascade after NGF stimulation of PC12 cells. Dev. Brain Res. 149:53-61, 2004.
Das, K.P., Freudenrich, T.M., and Mundy, W.R.: Assessment of PC12 cell differentiation and
   neurite growth: a comparison of morphological and neurochemical measures. Neurotoxicol.
   Teratol. 26:397-406, 2004.
Mundy, W.R., Freudenrich, T.M., Crofton, K.M., and DeVito, M.J.: Accumulation of PBDE-47 in
   primary cultures of rat neocortical cells. Toxicol. Sci. 82:164-169, 2004.
Meacham, C.A., Freudenrich, T.M., Anderson, W.L., Sui, L., Lyons-Darden, T., Barone Jr., S.,
   Gilbert, M.E., Mundy, W.R., and Shafer, T.J.: Accumulation of methylmercury or
   polychlorinated biphenyls in in vitro models of rat neuronal tissue. Toxicol. Appl. Pharmacol.
   205:177-187, 2005.
Mundy, W.R., and Freudenrich, T.M.: Apoptosis of cerebellar granule cells induced by organotin
   compounds found in drinking water: involvement of MAP kinases. Neurotoxicology 27:71-81,
   2006.
Coecke, S., Goldberg, A.M., Allen, S., Buzanska, L., Calamandrei, G., Crofton, K., Hareng, L.,
   Hartung, T., Knaut, H., Honegger, P., Jacobs, M., Lein, P., Li, A., Mundy, W., Owen, D.,
   Schneider, S., Silbergeld, E., Reum, T., Trnovec, T., Tschudi-Monet, F. and Bal-Price, A.:
   Workgroup report: Incorporating in vitro alternative methods for developmental neurotoxicity
   into international hazard and risk assessment strategies. Environ. Health Perspect.
   115:924-931, 2007.
Viberg, H., Mundy, W. and Eriksson, P.: Neonatal exposure to decabrominated diphenyl ether
   (PBDE 209) results in changes in BDNF, CaMKII and GAP-43, biochemical substrates of
   neuronal survival growth and synaptogenesis. Neurotoxicology 29:152-159, 2008.
Radio, N.M. and Mundy, W.R.: Developmental neurotoxicity testing in vitro: Models for
   assessing chemical effects on neurite outgrowth. Neurotoxicology 29:361-376, 2008.
Mundy, W.R., Robinette, B., Radio, N.M. and Freudenrich, T.M.: Protein biomarkers associated
   with growth and synaptogenesis in a cell culture model of neuronal development. Toxicology
   249:220-229, 2008.
                                        Page 2
                        Previous  I    TOC

-------
Radio, N.M., Breier, J.M., Shafer, T.J. and Mundy, W.R.: Assessment of chemical effects on
   neurite outgrowth in PC12 cells using high content screening. Toxicol. Sci. 105:106-118,
   2008.
Breier, J.M., Radio, N.M., Mundy, W.R. and Shafer, T.J.: Development of a high-throughput
   screening assay for chemical effects on proliferation and viability of immortalized human neural
   progenitor cells. Toxicol. Sci. 105:119-133, 2008.
Harrill, J.A., Li, Z., Wright, F.A., Radio, N.M., Mundy, W.R., Tornero-Velez, R., and Crofton,
   K.M.: Transcriptional response of rat frontal cortex following acute in vivo exposure to the
   pyrethroid insecticides permethrin and deltamethrin. BMC Genomics 9:546, 2008.
Radio, N.M., Freudenrich, T.M., Robinette, B.L., Crofton, K.M., Mundy, W.R.: Comparison of
   PC12 and cerebellar granule cell cultures for evaluating neurite outgrowth using high content
   analysis. Neurotoxicol. Teratol. Xxx:xx-xxx, 2009.

-------
        Principal Investigator/Program Director (Last, First, Middle): Rabinowitz, James R.
                                    BIOGRAPHICAL SKETCH

NAME
James Rabinowitz
eRA COMMONS USER NAME
POSITION TITLE
Research Physicist
  EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION                                              DEGREE (if applicable)  YEAR(s)  FIELD OF STUDY
Alfred University, Alfred, NY                                         B.A.                    1962     Physics
Uppsala University, Uppsala, Sweden                                   Certificate             1969     Solid State Physics, Computational Chemistry and Theoretical Biology
State University of New York at Buffalo                               Ph.D.                   1972     Physics
Institute for Environmental Medicine, NYU Medical Center, Tuxedo, NY  Post Doc                1973     Environmental Medicine
Harvard School of Public Health, Continuing Education, Boston, MA                             1994     Analyzing Risk: Science, Assessment and Management

A. POSITIONS and HONORS

Research and Professional Experience:
1968-1972     Research Associate, Center for Theoretical Biology, State University of New York at Buffalo,
              Buffalo, NY.
1972-1973     Postdoctoral Fellow, Institute of Environmental Medicine, New York University Medical Center,
              Tuxedo, NY.
1973-1974     Guest Scientist, Northeast Radiological Health Laboratory, BRH/HEW/USPS, Winchester, MA.
1973-1977     Associate Research Scientist, Institute of Environmental Medicine, New York University Medical
              Center, Tuxedo, NY.
1977-1980     Research Scientist, Science and Technology Research Center, New York Institute of
              Technology, Dania, FL.
1980-1983     Research Physicist, CBB, EBD, HERL, ORD, EPA, RTP, NC.
1983-1995     Research Physicist, CMB, GTD, HERL, ORD, EPA, RTP, NC.
1991-2000     Lecturer, Molecular Modeling Course, Department of Pharmaceutical Chemistry and Carolina
              Seminars Series, UNC, Chapel Hill, NC.
1995-2001     Research Physicist, BPB, ECD, NHEERL, ORD, EPA, RTP, NC.
2001-2005     Research Physicist, MTB, ECD, NHEERL, ORD, EPA, RTP, NC.
2005-present  Research Physicist, NCCT, ORD, EPA, RTP, NC.

Professional Societies and Affiliations:
American Association for the Advancement of Science
International Society for Quantum Biology and Pharmacology
American Chemical Society; Section on Chemical Toxicology; Section on Computers  in Chemistry

Honors and Awards:
Bronze Medal for Commendable Service, U.S. EPA, for Obtaining High Performance Computer Platforms for
Environmental Research and Risk Assessment, 1992
Scientific and Technology Achievement Award, US EPA, 1985, 1997
Special Acts Award, NHEERL Supercomputing and High Performance Computing, 1997
CIO's John Cooper Partnership Award, Office of Environmental Information, 2008

-------


Selected Invitations at National & International Symposia:
Invited presenter and participant at GE program on alternative methods. Endocrine Disruption, Metabolism and
Skin with the intent to look at new approaches to in vitro toxicity testing and how these approaches can be
used to predict human health consequences. The Center for Alternatives to Animal Testing, Johns Hopkins
Bloomberg School of Public Health, Baltimore, Maryland, 2007.
Invited speaker and participant at the ECVAM Workshop on Molecular Modelling Approaches for Human
Hazard Assessment of Chemicals Feb.20-22, 2006 Ispra Italy.
Invited keynote speaker for the American Chemical Society Symposium -Molecular Modeling in Environmental
Chemistry- sponsored by the Geological Chemistry of the ACS, with additional co-sponsors, Philadelphia, PA,
2004
Speaker at the American Chemical Society Symposium -Computational Toxicology- sponsored by the
Chemical Toxicology Section. Cosponsored by the Computers in Chemistry Section, NY, NY -2003
Invited lecturer at the EURESCO Conference Computational Biophysics: Integrating Theoretical Physics and
Biology, Biophysics  from First Principles EURO Conference: From Electronic to the Mesoscale, European
Science Foundation, San  Feliu de Guixols, Spain -2002
Speaker Molecular Modeling Applications for Environmental Problems, Computers in Chemistry, American
Chemical Society, New Orleans, LA -1996
Speaker Theoretical Calculations in Cancer Research: Progress and Perspectives, International Society of
Quantum Chemistry and Pharmacology, St. Andrews, Scotland -1995.

Selected Expert Committees/Advisory Panels/Organizing Committees:
Chairman session Bioinformatics and Computational Toxicology 48th Annual Meeting of the Society of
Toxicology. 2009.
Organizing Committee International Science Forum on Computational Toxicology 2007
Organizer of American Chemical Society Symposium -Computational Toxicology- sponsored by the
Chemical Toxicology Section. Cosponsored by the Computers in Chemistry Section, NY, NY -2003
Organizer and Chairman Symposium on Molecular Modeling Applications for Environmental Problems,
Computers in Chemistry, American Chemical Society, New Orleans, LA -1996
Organizing Committee and Co-chairman Theoretical Calculations in Cancer Research: Progress and
Perspectives, International Society of Quantum Chemistry and  Pharmacology, St. Andrews, Scotland -1995.
Executive Committee of the International Society for Quantum Biology and Pharmacology. 1999 - 2002
Reviewed research  articles for various journals.
Reviewed Proposals for the Petroleum Research Fund, National Science Foundation and NIOSH  since 1997
Consultant on Research Project at the University of Rhode  Island 2001-2004

Selected Assistance/Advisory Support to the Agency:
Initiated the Agency's participation in the U.S.-Poland Maria Sklodowska-Curie Fund for Research through
consultation with the Agency's Office of International Affairs
Reviewer, Organobromine Waste Review for RCRA, Office of Solid Waste 1997
Member of the Working Group, Hazardous Waste Identification Rule, ORD for OSW. 1997-1998
Member of the Supercomputer Working Group (The name changed to the High Performance Computing
Committee) 1990-present.
Member of the  Endocrine  Disrupter Research Implementation Plan Team 2000 -2003.
Member of the Scientific Office of the Future advisory committee and users group 2004 - 2006.
Member of NERL SP2 Steering Committee  2007

B. SELECTED PUBLICATIONS

   L Lewis-Bevan, SB Little and JR Rabinowitz (1995) Quantum Mechanical Studies of the Three Dimensional
      Structure of the Diol-epoxides of Benzo(c)Phenanthrene. Chemical Research in Toxicology 8, 499-505.
   JR Rabinowitz, SB Little and EM Gifford (1998) The Interactions between Chlorinated Dioxins and a
      Positively Charged Molecular Probe: A  New Molecular Interaction Potential, Journal of Computational
      Chemistry 19, 673-684.

-------

   SB Little, JR Rabinowitz, P Wei. and W Yang (1999) A comparison of calculated and experimental
       geometries for crowded polycyclic aromatic hydrocarbons and their metabolites, Polycyclic Aromatic
       Compounds 14, 53-61.
   DM Marini, ML Shelton, MJ Kohan, EE Hudgens, TE Kleindienst, LM Ball, DB Walsh, JG de Boer,
       L Lewis-Bevan, JR Rabinowitz, LD Claxton, J Lewtas (2000) Mutagenicity in lung of Big Blue mice
       and induction of tandem-base substitutions in Salmonella by the air pollutant peroxyacetyl nitrate
       (PAN): predicted formation of intrastrand cross-links, Mutation Research 457, 41-55.
   JR Rabinowitz, SB Little and KW Brown (2001) Why does 5-methyl chrysene interact with DNA as both a
       planar and nonplanar polycyclic aromatic hydrocarbon, International Journal of Quantum Chemistry 88,
       99-106.
   KW Brown, SB Little, JR Rabinowitz (2002) Benzo[a]pyrene and Benzo[c]phenanthrene: The effect of
       structure on the binding of water molecules to the diol epoxides, Chemical Research in Toxicology 15,
       1069-1079.
   JR Rabinowitz, SB Little, EM Gifford (2004) Molecular Interaction Potentials for the development of
       structure activity relationships,  In Quantitative Structure-Activity Relationships  for Pollution Prevention,
       Toxicological Screening, Risk Assessment and Web Applications, Society of Environmental Toxicology
       and Chemistry, 93 - 104.
   JR Rabinowitz, M.-R Goldsmith, SB Little, and MA Pasquinelli (2008) Computational Molecular Modeling
       for Evaluating the Toxicity  of Environmental Chemicals: Prioritizing Bioassay Requirements.
       Environmental Health  Perspectives, 116, 573-577.
   JR Rabinowitz, SB Little, SC Laws and M-R Goldsmith (2009) Molecular Modeling for Screening
       Environmental Chemicals for Estrogenicity: Use of the Toxicant-Target Approach. Chemical Research in
       Toxicology, accepted for publication.

-------
        Principal Investigator/Program Director (Last, First, Middle):    Reif, David M.
                                    BIOGRAPHICAL SKETCH
   NAME
   David M. Reif
                                       POSITION TITLE
                                       Statistician
EDUCATION/TRAINING
INSTITUTION AND LOCATION                         DEGREE                 YEAR(s)    FIELD OF STUDY
U.S. Environmental Protection Agency (RTP, NC)   Post-doc               2006-2008  Computational Toxicology
Vanderbilt University (Nashville, TN)            Ph.D.                  2002-2006  Human Genetics
Vanderbilt University (Nashville, TN)            M.S.                   2003-2005  Applied Statistics
College of William and Mary (Williamsburg, VA)   B.S. (Monroe Scholar)  1998-2002  Biology

A. POSITIONS and HONORS

Research and Professional Experience:
2008-Present Statistician, National Center for Computational Toxicology,
             U.S. Environmental Protection Agency, Research Triangle Park, NC
2008-Present Visiting Scholar, Department of Statistics,
             North Carolina State University, Raleigh, NC
2006-2008   Biologist (federal post-doc), National Center for Computational Toxicology,
             U.S. Environmental Protection Agency, Research Triangle Park, NC
             (advisor: Elaine Cohen Hubal)
2002-2006   Graduate Research Assistant, Center for Human Genetics Research,
             Vanderbilt University, Nashville, TN
             (advisors: Jason Moore and Jonathan Haines)
1999-2001    Research Assistant, Department of Biology,
             College of William and  Mary, Williamsburg, VA
             (advisor: Patty Zwollo)
Honors and Awards (Selected):
2009         OTS Award, National Center for Computational Toxicology, U.S. EPA
2007         OTS Award, Human Studies Division, U.S. EPA
2005         International Travel Grant, Vanderbilt University, Nashville, TN
2003-2005   NIH Training Grant in Human Genetics, Vanderbilt University, Nashville, TN
2002         Phi Sigma Biology Honors Fraternity, College of William and Mary, Williamsburg, VA
2001         Omicron Delta Kappa Leadership Fraternity, College of William and Mary, Williamsburg, VA
2001         Monroe Scholarship Supplemental Award for International Research, College of William and
             Mary, Williamsburg, VA
2000-2001    Howard Hughes Medical Institute Undergraduate Research Grant, College of William and Mary,
             Williamsburg, VA
Selected Expert Committees/Advisory Panels/Organizing Committees:
2009         Chair, NCCT Seminar Series, Research Triangle Park,  NC, USA.
2008-2009   Program Committee, "Bioinformatics and Computational Biology", Genetic and Evolutionary
             Computation Conference.
2007-present Grant Reviewer, National Science Foundation.
2007         Session Chair, "Statistical and Mathematical Modeling for Community-Based Risk Assessment",
             Community Based Risk Assessment Workshop, National Center for Environmental Research,
             Research Triangle Park, NC, USA.
PHS 398/2590 (Rev. 09/04, Reissued 4/2006)
                               Page 1 (of 4)
(Updated August 2009)

-------
        Principal Investigator/Program Director (Last, First, Middle):   Reif, David M.

2006-present  Reviewer for: Bioinformatics; PLoS Genetics; Pharmacogenomics; BMC Bioinformatics; Human
             Genetics; Journal of Exposure Science and Environmental Epidemiology; Journal of Infectious
             Diseases; Biotechniques; Medical Science Monitor; IEEE/ACM Transactions on Computational
             Biology and Bioinformatics; PLoS One; Journal of Statistical Software
Selected Service/Assistance/Advisory Support to the Agency and the  Scientific Community:
2009         Task Order Contract Officer Representative (TOCOR), ToxCast in vitro screening contracts.
2009         "Gene-Environment Interactions", Science Cafe presented by the North Carolina Museum of
             Science, Raleigh, NC, USA [public seminar].
2009         "Lessons on Modern Toxicology: How Darwin Saw It Coming", EPA Greenversations weblog.
2009         "How our shared genetic history shapes responses to a changing environment", SMART lecture
             series, Panther Creek High School, Cary, NC [seminar].
2008         Career Panel, Durham Public Schools "Stay in School" program, Durham, NC,  USA.
2008         "Human Genetics",  EPA-Shaw University Research Associateship Program, Chapel Hill, NC,
             USA [seminar].
2007         "Genotyping technology and analysis in cancer research", Association for Biomedical Research,
             Chapel Hill, NC, USA [seminar].
2006-present  EPA Speaker's Bureau
Teaching Experience:
2008         Course Director and Lecturer, Introduction to R,
             North Carolina State University,  Raleigh, NC, USA [semester course].
2006         Teaching Assistant & Guest Lecturer, Statistics for Biomedical Researchers,
             Vanderbilt University, Nashville, TN, USA [semester course].
2006         Guest Lecturer, General Biology I & II,
             Nashville State Technical College, Nashville, TN, USA [semester course].

B. SELECTED PUBLICATIONS

Peer-Reviewed:
2009         Sanchez Y., Deener K., Cohen-Hubal E., Reif D.M., Segal D.A. Research Needs for
             Community Based Risk Assessment. Journal of Exposure Science and Environmental
             Epidemiology, 1(10).
2009         Reif D.M., Motsinger A.A., McKinney B.A., Edwards K.M., Chanock S.J., Rock M.T., Crowe Jr.
             J.E., Moore J.H. Integrated Analysis of Genetic and Proteomic Data Identifies Biomarkers
             Associated with Systemic Adverse Events Following Smallpox Vaccination. Genes and
             Immunity, 10(2).
2009         Heidenfelder B.N., Reif D.M., Harkema J., Cohen-Hubal E., Gallagher J., Edwards S.E.
             Comparative microarray analysis and pulmonary morphometric changes in Brown
             Norway rats exposed to ovalbumin and/or concentrated airborne particulates.
             Toxicological Sciences, 108(1). [Cover Article].
2008         Martin M.T., Judson R.S., Reif D.M., Dix D.J. Profiling Chemicals Based on Chronic Toxicity
             Results from the U.S. EPA ToxRef Database. Environmental Health Perspectives, 116(11).
2008         Reif D.M., McKinney B.A., Motsinger A.A., Chanock S.J., Rock M.T., Moore J.H., Crowe Jr. J.E.
             Genetic basis for systemic adverse events following smallpox vaccination. Journal of
             Infectious Diseases, 198(1).
2008         Motsinger A.A., Reif D.M., Fanelli T.J., Ritchie M.D. A Comparison of analytical methods for
             genetic association studies. Genetic Epidemiology, 32(6).
2008         Hardison N.E., Fanelli T.J., Dudek S.M., Reif D.M., Ritchie M.D., Motsinger A.A. A Balanced
             Accuracy Fitness Function Leads to Robust Analysis using Grammatical Evolution
             Neural Networks in the Case of Class Imbalance. Genetic and Evolutionary Computation
             Conference.
2007         Motsinger A.A., Reif D.M. Embracing Complexity: Gene-gene and Gene-Environment
             Interactions. In: Genes, Genomes, and Genomics, vol. 3.

-------

2007         Motsinger A.A., Ritchie M.D., Reif D.M. Novel Methods for Detecting Epistasis in
             Pharmacogenomics Studies. Pharmacogenomics, 8(9).
2007         McKinney B.A., Reif D.M., White B.C., Crowe J.C., Moore J.H. Evaporative cooling feature
             selection for genotypic data involving interactions. Bioinformatics, 23(16).
2007         Reif D.M., Israel M.A., Moore J.H. Exploratory visual analysis of statistical results of
             microarray experiments comparing high and low grade glioma. Cancer Informatics, 2(1).
2007         Motsinger A.A., Reif D.M., Fanelli T.J., Davis A.C., Ritchie M.D. Linkage disequilibrium in
             genetic association studies improves the power of Grammatical Evolution Neural
             Networks. IEEE Symposium on Computational Intelligence in Bioinformatics and
             Computational Biology.
2006         McKinney B.A., Reif D.M., Moore J.H., Crowe Jr. J.E. Cytokine Expression Patterns
             Associated with Systemic Adverse Events Following Smallpox Immunization. Journal of
             Infectious Diseases, 194(4).
2006         Reif D.M., Motsinger A.A., McKinney B.A., Crowe Jr. J.E., Moore J.H. Feature Selection using
             Random Forests for the Integrated Analysis of Multiple Data Types. IEEE Symposium on
             Computational Intelligence in Bioinformatics and Computational Biology.
2006         McKinney B.A., Reif D.M., Ritchie M.D., Moore J.H. Machine learning for detecting gene-
             gene interactions. Applied Bioinformatics, 5(2).
2006         Motsinger A.A., Reif D.M., Dudek S.M., Ritchie M.D. Understanding the Evolutionary
             Process of Grammatical Evolution Neural Networks for Feature Selection in Genetic
             Epidemiology. IEEE Symposium on Computational Intelligence in Bioinformatics and
             Computational Biology.
2006         Reif D.M., Moore J.H. Visual analysis of statistical results from microarray studies of
             human cancer. Oncology Reports, 15(5). [Cover Article].
2005         Wilke R., Reif D.M., Moore J.H. Combinatorial pharmacogenetics. Nature Reviews Drug
             Discovery, 4(11).
2005         White B.C., Gilbert J., Reif D.M., Moore J.H. A statistical comparison of grammatical
             evolution strategies in the domain of human genetics. IEEE Congress on Evolutionary
             Computation, 6(2).
2005         Reif D.M., Dudek S.M., Shaffer C.M., Wang J., Moore J.H. Exploratory Visual Analysis of
             Pharmacogenomic Results. Biocomputing, 9th ed.
2004         Reif D.M., White B.C., Moore J.H. Integrated Analysis of Genetic, Genomic, and Proteomic
             Data. Expert Reviews in Proteomics, 1(1).
2003         Reif D.M., White B.C., Olsen N.J., Aune T.A., Moore J.H. Complex function sets improve
             symbolic discriminant analysis of microarray data. In: Cantu-Paz, E. et al. (eds.) Lecture
             Notes in Computer Science, 2724.
Submitted:
2009         Knudsen T., Houck K., Judson R.S., Singh A.V., Mortensen H.M., Reif D.M., Dix D.J., Kavlock
             R.J. Biochemical Activities of 320 ToxCast Chemicals Evaluated Across 239 Functional
             Targets.
2009         Gallagher J., Reif D.M., Hudgens E., Heidenfelder B.N., Neas L., Williams A., Harkema J.,
             Hester S., Edwards S.E., Cohen-Hubal E. Integration of Exposure, Effects, and
             Susceptibility Data to Improve the Predictive Value of Biomarkers for Asthmatic Children.
2009         Martin M.T., Dix D.J., Judson R.S., Kavlock R.J., Reif D.M., Richard A.M., Rotroff D.M.,
             Makarov S., Romanov S., Medvedev A., Houck K. Assessment of the Impact of
             Environmental Chemicals on Key Transcription Regulators and Correlation to Toxicity
             Endpoints.
2009         Judson R.S., Houck K., Kavlock R.J., Knudsen T., Martin M.T., Mortensen H.M., Reif D.M.,
             Richard A.M., Rotroff D.M., Shah I., Dix D.J. Predictive In Vitro Screening of Environmental
             Chemicals - The ToxCast Project.
2009         Judson R.S., Houck K., Kavlock R.J., Knudsen T., Martin M.T., Reif D.M., Rotroff D.M., Shah I.,
             Dix D.J. Predicting Rodent and Human Chemical Carcinogenicity Using In Vitro Assays.



-------

2009         Rotroff D.M., Ferguson S.S., Beam A.L., Dix D.J., Farmer A., Freeman K.M., Houck K., Judson
             R.S., Reif D.M., LeCluyse E.L. Xenobiotic Metabolizing Enzyme and Transporter Gene
             Expression in Primary Cultures of Human Hepatocytes Modulated by ToxCast Chemicals.
In Preparation:
2009         Reif D.M., Heidenfelder B.N., Gallagher J., Hudgens E., Neas L., Cohen-Hubal E., Edwards
             S.E. Integrating demographic, clinical, and environmental exposure information to
             identify genomic biomarkers associated with subtypes of childhood asthma.
2009         Reif D.M., Cohen-Hubal E., Hudgens E., Heidenfelder B.N., Edwards S.E., Gallagher J.
             Genetic associations with subtypes of childhood asthma.
2009         Williams-Devane C., Gallagher J., Reif D.M., Cohen-Hubal E., Heidenfelder B.N., Harkema J.,
             Edwards S.E. Comparing gene expression patterns in blood and lung tissue of
             immunologically-challenged rats exposed to concentrated airborne particulates.
2009         Elloumi F., Judson R.S., Dix D.J., Shah I., Knudsen T., Reif D.M., Singh A.V., Li Z., Wright F.A.
             Deriving Toxicogenomics Pathway-based Concentration Response Profiles.
2009         Reif D.M., Dix D.J., Houck K., Martin M.T., Judson R.S., Kavlock R.J. Biological profiling of
             endocrine related effects of chemicals using ToxCast.

C. SELECTED PRESENTATIONS

2009         "Summary of Approaches and Predictions", ToxCast Data Analysis Summit, U.S. EPA,
             Research Triangle Park, NC, USA [seminar].
2008         "Data integration adds etiological context to a childhood asthma gene expression study",
             Genetics Department Seminar Series, North Carolina State University, Raleigh, NC, USA
             [invited seminar].
2008         "Integrating multiple data sources to discriminate subtypes of childhood asthma", Visiting
             Pulmonary Scholar Seminar Series, University of North Carolina,  Chapel Hill, NC, USA
             [invited seminar].
2008         "Integrating demographic, clinical, and environmental exposure information to identify genomic
             biomarkers associated with subtypes of childhood asthma",  International Society of Exposure
             Analysis-International Society of Environmental Epidemiology (ISEA-ISEE) Joint Annual
             Meeting, Pasadena, CA, USA [oral presentation].
2008         "Data integration for the Mechanistic Indicators of Childhood Asthma Study", Human Studies
             Division Seminar Series, U.S. EPA, Chapel  Hill, NC, USA  [seminar].
2007         "Detection and characterization of gene-gene and gene-environment interactions in common
             human diseases and complex clinical endpoints", Therapeutic Applications of Computational
             Biology and Chemistry (TACBAC), Wellcome Trust Conference Centre, Cambridge, UK
             [invited seminar].
2006         "Genetic factors associated with adverse events following smallpox vaccination", Vanderbilt
             University Genetics Retreat, The Hermitage, TN, USA [oral presentation].
2006         "Integrated analysis of genetic and proteomic data", Vanderbilt University Genetics Interest
             Group, Nashville, TN, USA [seminar].
2005         "Exploratory Visual Analysis", Pacific Symposium on Biocomputing (PSB), Lihue, HI, USA
             [oral presentation].
2003         "Symbolic Discriminant Analysis of microarray data using complex function sets", Genetic and
             Evolutionary Computation Conference (GECCO), Chicago, IL, USA [oral presentation].

D. WEBSITES

http://epa.gov/ncct/index.html
www.epistasis-list.org
www4.stat.ncsu.edu/~dmreif/Site/ST610A.html

-------
        Principal Investigator/Program Director (Last, First, Middle):   Richard, Ann M.
                                    BIOGRAPHICAL SKETCH
  NAME
  Ann M. Richard
  eRA COMMONS USER NAME
  NA
   POSITION TITLE
   Research Chemist
  EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION                          DEGREE (if applicable)   YEAR(s)   FIELD OF STUDY
State University of New York at Oswego, NY        B.A./B.S.                1978      Math/Chemistry
University of North Carolina at Chapel Hill, NC   Ph.D.                    1983      Physical Chemistry
A. POSITIONS and HONORS

Research and Professional Experience:
2005-present  Principal Investigator, NCCT, US EPA, RTP, NC
1999-2005    Research Chemist, Molecular Toxicology Branch, ECD/NHEERL, US EPA, RTP, NC
2001-2002   Acting Chief, Molecular Toxicology Branch, ECD, NHEERL, US EPA, RTP, NC
1997-1999    Res. Chemist, Biochemistry Pathobiology Branch, ECD, NHEERL, US EPA, RTP, NC
1987-1997    Res. Chemist, Carcinogenesis Metabolism Branch, HERL, US EPA, RTP, NC

Professional Societies and Affiliations:
QSAR & Modeling Society, Genetics & Environmental Mutagen Society

Honors and Awards:
NHEERL Strategic Goal 4: Leadership in the Scientific Community, Plan Team Award, 2000
NHEERL Special Award for Developing Process for Advancing NHEERL Genomics & Proteomics Program
Superior Achievement Cash Awards, 2001, 2002, 2003, 2005, 2006, 2007, 2008, 2009
NHEERL Strategic Goal 5 Award: Future Issues, Genomics, Proteomics & Bioinf. Program, 2003
CompTox Program 1-yr Augmented Funding Award for DSSTox Database Project Expansion, 2004-2005
CompTox Program New-Start Research Award for CEBS/DSSTox Collaboration Project, 2005-2008
ORD Science & Technology Achievement Award, Level III, 2006
ORD Communication Award, 2006
ORD Science & Technology Achievement Awards, Level II, Level III, and Honorable Mention, 2007
EPA Office of Environmental Information Award: Improvements in Computational Toxicology, 2008

Selected invitations at National & International Symposia:
ADMET-2 Conference, San Diego, CA, 2/2005
Soc. of Toxicology, Public Database Resources, New Orleans, LA, 3/2005
ILSI - HESI Developmental Toxicity Database Workgroup Meeting, Washington, DC, 3/2005
Chemoinformatics Symposium, NC Central University, Durham, NC, 4/2005
NSF International Health Advisor Board Meeting, Ann Arbor, MI, 4/2005
National Cancer Institute Workshop on Government Databases, Frederick, MD, 7/2005
American Chemical Soc. Nat. Mtg., International Science Policy Symposium, Washington, DC,  8/2005
Int. Congress of Environ. Mutagens, San Francisco, CA, 9/2005
Data in Life Science Workshop, European Science Agency, Milan, Italy,  11/2005
ISRTP Workshop, Progress & Barriers to Incorporating Alt Tox Methods in the US, Baltimore, MD, 11/2005
Leadscope In Silico Toxicology Consortium, Bethesda, MD, 4/2006
EU Project on the Collection & Evaluation of (Q)SAR Models for Mutag. and Carcinog.; Rome,  Italy, 6/2006
LHASA International Collaborative Group Meeting, Washington DC, 9/2006
EPA STAR Graduate Fellowship Conference, Washington DC, 9/2006
Vanderbilt University, Institute for Chemical Biology, Dept. Seminar, Nashville, TN, 2/2007

-------

Soc. for Biomolecular Sci., Toxicity Profiling Using HTS and HCS Technologies, Montreal, Canada, 4/2007
Teratology Society Annual Meeting, Pittsburgh, PA, 6/2007
Comp.  Methods in Toxicology & Pharmacology: Integrating Internet Resources, Moscow, Russia, 9/2007
Int. Society of Exposure Analysis Annual Meeting, Durham, NC 10/2007
eCheminfo Workshop on Predictive ADMET & Toxicology, Philadelphia, PA, 10/2007
Genetics & Environmental Mutagenesis Society Meeting, Chapel Hill, NC, 10/2007
Health  Canada's New Substances Assessment & Control Bureau, Ottawa, Canada, 11/2007
SCARLET Workshop on In Silico Methods for Carcinogenicity and Mutagenicity, Milan, Italy, 4/2008.
QSAR  & Omics Technologies and Systems Biology, Uppsala, Sweden, 9/2008.
eCheminfo Workshop on Predictive ADMET & Toxicology, Philadelphia, PA, 10/2008.
LHASA Limited Symposium: New Horizons in Toxicity Prediction, Cambridge, UK, 12/2008.
Ohio State University, Guest Lecturer, Molecular Informatics, Columbus,  OH, 3/2009.
Procter & Gamble, Computational Toxicology Seminar Series, Cincinnati, OH, 3/2009.
Joint Research Centre, Inst. Health & Consumer Protection, Ispra, Italy, 5/2009.
US FDA Center for Food Safety & Applied Nutrition, Comp. Tox. Workshop,  Silver Spring, MD, 7/2009.
American Chemical Soc.,  Div. Chem. Inf., Chemical Text Mining & Databases, Wash. DC, 8/2009.
Int. Congress on Environ.  Mutag., New Data Initiatives & Predic. Tox.,  Florence, Italy, 8/2009.

Selected Expert Committees/Advisory Panels/Organizing Committees:
Editorial Board, Mutation Research, 1994-present.
Editorial Board, Chemical  Research in Toxicology, 1999-2002, 2003-2005, 2008-present.
Editorial Board, SAR & QSAR in Environmental Research, 2008-present.
Manuscript reviewer: Bioinformatics, Carcinogenesis, Chem. Res. Tox., Chemico-Biol. Interact.,
   Chemosphere, Environ. Molec. Mutag., Environ. Health Perspec., Environ. Tox. Chem., J. Chem. Info.
   Comp. Sci., J. Amer. Chem. Soc., J. Comp. Chem., Mutat. Res., SAR QSAR Environ. Tox., Sci. Total
   Environ., Toxicology, Tox. Sci., Regul. Tox. Pharmacol.
Advisory committee for Predictive Toxicity Challenge I, Freiburg, Germany, 2001
ILSI-Health Effects Sci.  Inst. Working Group on SAR Toxicity Database, 2002
Organizing Committee, Prediction of Acute Toxicity Workshop, McLean, VA, May 5-7, 2003
OECD  Workgroup on Use & Regulatory Acceptance of QSARs, 2003-present
Organizing Committee, ADMET I Conference, San Diego, Feb. 11-13,  2004
Organizing Committee, International Congress on Environ. Mutagenesis, San Francisco, Sept., 2004
LeadScope LIST Workgroup for Implementation of ToxML standard ontologies, Mar 2004-2006
Review Committee, TERA - Health Canada QSAR Advisor Review  Board, Apr 2005
ILSI Working Group on Prediction of Developmental Toxicity, 2002-present
LHASA VITIC SAR Database Advisory Committee, 2005
National Toxicology Program  High Throughput Screening Assays Workshop, Arlington, VA, 12/05
EU Project on Evaluation  of (Q)SAR Models for Mutagenicity & Carcinogenicity; Rome, Italy, June 22-23, 2006
Developmental Neurotoxicity  Database Workshop, RTP, NC, Nov 14-15, 2006
Organizing Committee, Comp. Methods in Toxicology & Pharmacology, Moscow, Russia, 9/2007

Selected Assistance/Advisory Support to the Agency:
ORD Strategy Goal 5 Team - Leader in Environmental Research, 1999
NHEERL Strategy Goal 5 Team - Leader in Environmental Research, 1999
NHEERL Multi-Year Implementation Planning Committee for Goal 4 - Safe Communities, 2001
NHEERL Genomics and Proteomics Steering Committee, 2001-2002
NHEERL Genomics & Proteomics Committee Member, Bioinformatics Coordinator, 2002-2004
EPA Goal 4 Safe Pesticides/Safe Products Multi-Year Plan Steering Committee, 2003-present
EPA Research  & Science Architecture (RST) Target Workgroup, 2003-present
Chemoinformatics Communities of  Practice Coordinator, 2005-present
EPA Hiring Committee Lead for Senior Title 42 NCCT Bioinformatics Hire, 2006
EPA Science Connector Workgroup, NCCT Representative, 2007
Tox21 - EPA Chemical Working Group Lead, 2009.

-------

B. SELECTED PUBLICATIONS (2004 to present, 26 out of 60 total).
Richard, A.M. DSSTox Website launch: Improving public access to databases for building structure-toxicity
   prediction models. Preclinica, 2:103-108, 2004.
Balu, N., Padgett, W.T., Lambert, G., Swank, A.E., Richard, A.M.,  Nesnow, S. Identification and
   characterization of novel stable deoxyguanosine and deoxyadenosine adducts of benzo[a]pyrene-7,8-
   quinone from reactions at physiological pH. Chem. Res. Toxicol. 17:827-838, 2004.
Kundu, B., Richardson, S.D., Swartz, P.D., Matthews, P.P., Richard, A.M., DeMarini, D.M. Mutagenicity in
   Salmonella of Halonitromethanes: A Recently Recognized Class of Disinfection By-products in Drinking
   Water. Mutat. Res. 562:39-65, 2004.
Kundu, B., Richardson, S.D., Granville, C.A., Shaughnessy, D.T., Hanley, N.M., Swartz, P.D., Richard, A.M.,
   DeMarini, D.M. Comparative Mutagenicity of Halomethanes and Halonitromethanes in Salmonella TA100:
   Structure-Activity Analysis and Mutation Spectra. Mutat. Res. 554:335-350, 2004.
Julian, E., Willhite, C.C., Richard, A.M., DeSesso, J.M. Challenges in Constructing Statistically-Based SAR
   Models for Developmental Toxicity. Birth Defects Research Part A, 70:902-911, 2004.
Granville, C.A., Ross, M.K., Tornero-Valez,  R., Hanley, N.M.,  Grindstaff, R.D., Gold, A., Richard, A.M.,
   Funasaka, K., Tennant, A.H., Kligerman, A.D., Evans, M.V., DeMarini, D. Genotoxicity and metabolism of
   the source-water contaminant 1,1-dichloropropene: Activation  by GSTT1-1. Mutat. Res. 572:98-112,  2005.
Fostel, J., Choi, D., Zwickl, C., Morrison, N., Rashid, A., Hasan, A., Bao, W., Richard, A., Tong, W., Bushel, P.,
   Brown, R., Bruno, M., Cunningham, M.,  Dix, D., Eastin, W., Frade,  C., Garcia, A., Heinloth, A., Irwin, R.,
   Madenspacher, J., Merrick, A., Papoian, T., Paules,  R., Rocca-Serra, P., Sansone, A., Stevens, J., Tomer,
   K., Yang, C., Waters, M. Chemical Effects in Biological Systems - Data Dictionary (CEBS-DD): A
   Compendium of Terms for the Capture and Integration of  Biological Study Design Description,
   Conventional Phenotypes and 'Omics Data. Tox. Sci., 88:585-601, 2005.
Hunter, S.E., Rogers, E., Blanton, M., Richard, A.M., Chernoff, N.  Bromochloro-haloacetic acids: Effects on
   mouse embryos in vitro and QSAR considerations, Birth Defects Research Part A, 21:260-266, 2005.
Yang, C., Richard, A.M., Cross, K.P. The art of data mining the minefields of toxicity databases to link
   chemistry to biology. Curr Comput-Aided Drug Design, 2(2):135-150, 2006.
Richard A.M., Gold, L.S., Nicklaus, M.C. Chemical structure indexing of toxicity data on the internet: Moving
   towards a flat world. Current Opinion in Drug Discovery & Develop., 9(3):314-325, 2006.
Richard A.M.  Future of Predictive Toxicology: An  Expanded View of "Chemical Toxicity" - Future of Toxicology
   Perspective. Chem. Res. Toxicol., 19:1257-1262, 2006.
Dix, D.J., Houck, K.A., Martin, M.T., Richard, A.M., Setzer, W., Kavlock, R.J. The ToxCast  program for
   prioritizing toxicity testing of environmental chemicals. Tox. Sci., 95:5-12, 2007.
Benigni, R., Netzeva,  T.I., Benfenati, E., Bossa, C., Franke, R., Helma, C., Hulzebos, E., Marchant, C.,
   Richard, A., Woo, Y.T., Yang, C. The expanding role of predictive toxicology: an update on the (Q)SAR
   models for mutagens and carcinogens. J.  Environ. Sci. Health C, 25:53-97, 2007.
Benigni, R., Bossa, C., Richard, A.M., Yang, C. A novel approach: Chemical relational databases, and the role
   of the ISSCAN database on assessing chemical carcinogenicity. Annali dell' Inst. Super. di Sanita, 2007.
Richard, A., Yang, C., Judson, R. Toxicity Data Informatics: Supporting a New Paradigm for Toxicity
   Prediction. Tox. Mech. Meth., 18:103-118, 2008.
Yang, C., Arnby, C.H., Arvidson, K., Aveston, S., Benigni, R.,  Benz, R.D.,  Boyer, S., Contrera, J., Dierkes, P.,
   Han, X., Jaworska, J., Kemper, R.A., Kruhlak, N.L., Matthews, E.J., Rathman, J.F., Richard, A.M.
   Understanding Genetic Toxicity through Data  Mining: The Process of Building Knowledge by Integrating
   Multiple Genetic Toxicity Databases, Tox. Mech. Meth., 18:277-295, 2008.
Kavlock, R.J., Ankley, G.,  Blancato,  J.,  Breen, M., Conolly, R., Dix, D.,  Houck, K., Hubal, E., Judson, R.,
   Rabinowitz, J., Richard, A., Setzer,  R.W., Shah, I., Villeneuve, D., Weber, E.  Computational Toxicology-A
   State of the Science Mini Review, Tox. Sci., 103:14-27, 2008.
Zhu, H., Rusyn, I., Richard, A., Tropsha, A. The Use of Cell Viability Assay Data  Improves the Prediction
   Accuracy of Conventional Quantitative Structure Activity Relationship Models of Animal Carcinogenicity,
   Environ. Health Perspec., 116:506-513,  2008.
Judson, R., Richard, A., Dix, D., Houck, K., Elloumi, F., Martin, M., Cathey, T., Transue, T., Spencer, R., Wolf,
   M. ACToR - Aggregated Computational Toxicology Resource, Toxicol. Appl. Pharmacol., 233:7-13, 2008.
Hubal, E., Richard, A., Shah, I., Gallagher, J., Kavlock, R., Blancato, J., Edwards, S.  Exposure Science and
   the US EPA National Center for Computational Toxicology,  J. Expos.  Sci. Environ. Epidem. 1-6, 2008.

-------

Benfenati, E., Benigni, R., DeMarini, D.M., Helma, C., Kirkland, D., Martin, T.M., Mazzatorta, P., Meunier, J.-
   R., Ouedraogo-Arras, G., Richard, A.M., Schilter, B., Schoonen, W.G.E.J., Snyder, W.G.E.J., Yang, C.,
   Young, D.M. Predictive models for carcinogenicity and mutagenicity: frameworks, state-of-the-art and
   perspectives, J. Environ. Sci. Health C, 27:57-90, 2009.
Judson,  R., Richard, A., Dix, D.J., Houck, K., Martin, M., Kavlock, R., Dellarco, V., Henry, T., Holderman, T.,
   Sayre, P., Tan, S., Carpenter, T., Smith, E. The toxicity data landscape for environmental chemicals,
   Environ. Health Perspec., 117:685-695, 2009.
Zhu, H., Ye, L., Richard, A., Golbraikh, A., Wright, F.A., Rusyn, I., Tropsha, A. A novel two-step hierarchical
   quantitative structure-activity relationship modeling work flow for predicting acute toxicity of chemicals in
   rodents, Environ. Health Perspec., 117:1257-1264, 2009.
Williams-Devane, C., Wolf, M.A., Richard, A.M. DSSTox chemical-index files for exposure-related experiments
   in ArrayExpress and Gene Expression Omnibus: Enabling toxico-chemogenomics data linkages,
   Bioinformatics, 25:692-694, 2009.
Williams-Devane, C., Wolf, M.A., Richard, A.M. Towards a public toxicogenomics capability for supporting
   predictive toxicology:  Survey of current resources and chemical indexing of experiments in GEO and
   ArrayExpress, Tox. Sci., 109:358-371, 2009.
Knight, A.W., Little, S., Houck, K., Dix, D., Judson, R., Richard, A., McCarroll, N., Ackerman, G., Yang, C.,
   Birrell, L., Walmsley, R.M. Evaluation of high-throughput genotoxicity assays used in profiling the US EPA
   ToxCast™  Chemicals, Reg. Toxicol. Pharmacol. doi:10.1016/j.yrtph.2009.07.004, 2009.

DSSTOX WEBSITE & PUBLICATIONS
Richard, A.M., EPA DSSTox Website, launched March 2004; most  recent update March 2009:
   https://www.epa.gov/ncct/dsstox/
Richard, A.M., Transue, T., EPA DSSTox Structure-Browser, launched August 2007; v2.0, 2008:
   http://www.epa.gov/dsstox_structurebrowser/
Published DSSTox Data Files: http://www.epa.gov/ncct/dsstox/DataFiles.html (4 of 14 published DSSTox Data
   Files listed below):
     i.   Gold, L.S., Slone, T.H., Williams, C.R., Burch, J.M.,  Stewart, T.W., Swank, A.E., Beidler, J., Richard,
        A.M. (2003) DSSTox Carcinogenic Potency Database Summary Tables for Rats and Mice, Hamsters,
        Dogs, and Non-Human Primates, SDF Files and  Documentation, Last updated 2008:
        CPDBAS_v5d_1547_20Nov2008; http://www.epa.gov/ncct/dsstox/sdf_cpdbas.html
     ii.  Backus, G.S, Wolf, M.A., Burch, J., Richard, A.M. (2006) DSSTox EPA Integrated Risk Information
        System (IRIS) Toxicity Review Data: SDF File and Documentation, Last updated 2008:
        IRISTR_v1b_544_15Feb2008, www.epa.gov/ncct/dsstox/sdf_iristr.html
     iii.  Houck, K., Dix, D., Judson, R., Martin, M., Wolf, M., Kavlock, R., Richard, A.M. (2007) DSSTox EPA
        ToxCast High Throughput Screening Testing Chemicals Structure-Index File: SDF File and
        Documentation, Last updated 2009: TOXCST_v3a_320_12Feb2009,
        http://www.epa.gov/ncct/dsstox/sdf_toxcst.html
     iv.  Williams-Devane, C.R., Wolf, M.A., Richard, A.M. (2008) DSSTox European Bioinformatics Institute
        (EBI) ArrayExpress Repository for Gene Expression Data (ARYEXP and ARYEXP_Aux): SDF Files
        and Documentation, Last Updated: ARYEXP_v2a_958_06Mar2009,
        ARYEXP_Aux_v2a_2556_06Mar2009, www.epa.gov/ncct/dsstox/sdf_aryexp.html
     v.  Williams-Devane, C.R., Wolf, M.A., Richard, A.M. (2008) DSSTox National  Center for Biotechnology
        Information (NCBI) Gene Expression Omnibus (GEO) Series Experiments (GEOGSE and
        GEOGSE_Aux): SDF Files and Documentation, Last Updated: GEOGSE_v2a_1179_09Mar2009,
        GEOGSE_Aux_v2a_2700_09Mar2009, www.epa.gov/ncct/dsstox/sdf_geogse.html
     vi.  S. Laws, J. Kariya, M. Wolf, and A.M. Richard (2009) DSSTox EPA Estrogen Receptor Ki  Binding
        Study (Laws et al.) Database - (KIERBL): SDF File and Documentation, Launch version:
        KIERBL_v1a_278_17Feb2009, www.epa.gov/ncct/dsstox/sdf_kierbl.html

-------
                                   Principal Investigator/Program Director: Rusyn, Ivan
                                        BIOGRAPHICAL SKETCH
          Provide the following information for the key personnel and other significant contributors in the order listed on Form Page 2.
                             Follow this format for each person. DO NOT EXCEED FOUR PAGES.
  NAME
  Ivan Rusyn
  eRA COMMONS USER NAME
  I RUSYN
                                            POSITION TITLE
                                            Associate Professor
  EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION                           DEGREE (if applicable)   YEAR(s)     FIELD OF STUDY
Ukrainian State Med. University, Kiev, Ukraine     M.D. (w. Hons.)          1994        Medicine
Inst. Physiol. Chem. I, University of Dusseldorf   Postdoctoral             1995-1996   Free radical biology
University of North Carolina at Chapel Hill        Ph.D.                    2000        Toxicology
University of North Carolina at Chapel Hill        Postdoctoral             2000-01     DNA damage & repair
Massachusetts Institute of Technology              Postdoctoral             2001-02     Toxicogenomics
Professional Positions:

2002 -        Associate (since 2007) and Assistant (2002-2007) Professor, Department of Environmental Sciences &
              Engineering, University of North Carolina at Chapel Hill
2002 -        Associate (since 2007) and Assistant (2002-2007) Professor, Associate Director (since 2007), Curriculum
              in Toxicology, UNC-Chapel Hill
2005 -        Scientific Co-Director, Carolina Environmental Bioinformatics Research Center
2003 -     Member, Lineberger Comprehensive Cancer Center, UNC-Chapel Hill
2003 -     Member, Bowles Center for Alcohol Studies, UNC-Chapel Hill
2003 -     Member, Center for Environmental Health & Susceptibility, UNC-Chapel Hill
2002 -     Member, Carolina Center for Genome Sciences, UNC-Chapel Hill
2001 - 2002   Res. Assoc., Biological Engineering Division, Massachusetts Inst. of Technology
2000 - 2001   Res. Fellow, Dept. of Environmental Sci. & Engineering, UNC-Chapel Hill
1996 - 2000   Res. Assist., Curriculum in Toxicology, UNC-Chapel Hill
1995 - 1996   Guest Researcher & Fellow of German Academic Exchange Service  (DAAD),
              Institute for Physiological Chemistry I, University of Dusseldorf, Germany
1994 - 1995   Intern, Department of Otolaryngology, Kiev Regional Clinical Hospital, Ukraine

Significant Professional Activities:
2009-present  Member, Standing Committee on Use of Emerging Science for Environmental Health Decisions, Board on
Life Sciences, Board on  Environmental Studies & Toxicology, National Academies, Washington, DC
2008-present  Member, Committee on Tetrachloroethylene, Board on Environmental Studies and Toxicology, National
Research Council of the National Academies, Washington, DC
2008-present  Member, Board on Publications, Society of Toxicology, Reston, VA
2006-2007  Member, Working Group on IARC Monograph Volume 96 on "Alcoholic beverage consumption, acetaldehyde
and urethane" International Agency for Research on Cancer (IARC), Lyon, France
2006-2008  Scientific Program Committee, Society of Toxicology, Reston, VA
2005-2008  Expert Consultant,  12th Report on  Carcinogens, NTP/NIEHS, Research Triangle Park, NC
Honors and Awards:
2008   Achievement Award, Society of Toxicology
2002   Transition to Independent Position Award, NIEHS
2000   Individual Postdoctoral National Research Service Award, NIEHS
2000   Leon & Bertha Golberg Memorial Postdoctoral Fellowship, UNC-Chapel Hill
2000   AACR - Bristol Myers Squibb Oncology Young Investigator Scholar Award
2000   Carl C. Smith Mechanisms Specialty Section Award, Society of Toxicology
2000   Young Investigator Award, Society for Free Radical Research International


-------
1998/99/2001  Young Investigator Award, The Oxygen Society
1995-96    Research Fellowship, German Academic Exchange Service (DAAD)
1994       First Class Honors Diploma, Ukrainian State Medical University, Kiev, Ukraine

Recent Publications (from 82 total):
Gatti,  D.M.,  Harrill, A.M.,  Wright, F.A., Threadgill, D.W., and  Rusyn, I. Replication and narrowing of gene expression
    quantitative trait loci using inbred mice. Mamm Genome In Press.
Harrill, A.M., Watkins,  P.B., Su,  S., Ross, P.K., Harbourt,  D.E., Stylianou, I.M., Boorman, G.A., Russo, M.W.,  Sackler,
    R.S., Harris, S.C., Smith, P.C., Tennant, R., Bogue, M., Paigen, K., Harris, C., Contractor, T., Wiltshire, T., Rusyn, I.,
    and  Threadgill,  D.W.  Mouse  population-guided  resequencing  reveals  that  variants  in  CD44  contribute  to
    acetaminophen-induced liver injury in humans. Genome Res In Press.
Kim,  S.,  Collins,  L.B., Boysen, G.,  Swenberg, J.A.,  Gold,  A.,  Ball,  L.M.,  Bradford, B.U., and  Rusyn,  I. Liquid
    chromatography electrospray ionization tandem mass  spectrometry analysis method for simultaneous detection of
    trichloroacetic  acid,   dichloroacetic  acid,  S-(1,2-dichlorovinyl)glutathione  and  S-(1,2-dichlorovinyl)-L-cysteine.
    Toxicology 262:230-238, 2009.
Zhu, H., Ye, L., Richard, A., Golbraikh, A., Rusyn, I., and Tropsha, A. A novel two-step hierarchical quantitative structure
    activity relationship modeling workflow for predicting acute toxicity of chemicals in rodents. Envr Health Persp
    117:1257-1264, 2009.
Harrill, A.H., Ross, P.K., Threadgill, D.W., and Rusyn, I. Population-based discovery of toxicogenomics biomarkers for
    hepatotoxicity using a  laboratory strain diversity panel. Toxicol Sci 110:235-243, 2009.
Pogribny, I.P., Tryndyak, V.P., Bagnyukova, T.V., Melnyk, S., Montgomery, B., Ross,  S.A., Latendresse, J.R., Rusyn, I.,
    and Beland, F.A. Hepatic epigenetic phenotype predetermines individual susceptibility to hepatic steatosis in mice fed
    a lipogenic methyl-deficient diet. J Hepatol 51:176-186, 2009.
Kim, S., Kim, D.,  Pollack, G., Collins, L., Rusyn, I. Pharmacokinetic analysis  of trichloroethylene metabolism in male
    B6C3F1  mice: Formation and disposition of trichloroacetic acid, dichloroacetic acid, S-(1,2-dichlorovinyl)glutathione
    and S-(1,2-dichlorovinyl)-L-cysteine. Toxicol Appl Pharmacol 238: 90-99, 2009.
Ross,  P.K.,  Woods,  C.G., Bradford,  B.U.,  Kosyk,  O., Gatti, D.M.,  Cunningham, M.L.,  and Rusyn, I.  Time-course
    comparison of xenobiotic activators of CAR and  PPARalpha in mouse liver. Toxicol Appl Pharmacol 235:199-207,
    2009.
Gatti,  D.M.,  Sypa, M., Rusyn,  I., Wright, F.A.,  and Barry,  W.T. SAFEGUI:  Resampling-based  tests of categorical
    significance in gene expression data made easy.  Bioinformatics 25:541-542, 2009.
Gatti,  D.M.,  Shabalin, A.A., Lam, T.C., Wright, F.A.,  Rusyn,  I., and  Nobel,  A.B.  FastMap: Fast eQTL mapping in
    homozygous populations. Bioinformatics 25: 482-489,  2009.
Harrill, A.H.,  and Rusyn, I. Systems  biology and functional genomics approaches for the identification of cellular
    responses to drug toxicity. Expert Opin Drug Metab Toxicol 4:1379-1389, 2008.
Stotts, D., Lee, K., and Rusyn,  I. Supporting computational systems science: Genomic analysis tool federations using
    aspects and AOP.  In: Mandoiu,  I.,  Sunderraman, R., and Zelikovsky A.  (Eds.):  Bioinformatics Research and
    Applications,  Proceedings of the 4th International Symposium ISBRA 2008, Springer Berlin/Heidelberg, LNBI 4983,
    pp. 457-468, 2008.
Tsuchiya, M., Kono, H., Matsuda, M., Fujii, H., and Rusyn,  I. Protective effect of Juzen-taiho-to  on hepatocarcinogenesis
    is mediated through the inhibition of Kupffer cell-induced oxidative stress. Int J Cancer 123:2503-2511, 2008.
Bradford, B.U., O'Connell, T.M., Han, J., Kosyk, O., Shymonyak,  S.,  Ross,  P.K., Winnike, J.,  Kono,  H., and Rusyn,  I.
    Metabolomic profiling  of a modified alcohol liquid diet model for liver injury in the mouse uncovers new markers of
    disease. Toxicol Appl Pharmacol 232: 236-243, 2008.
Pogribny, I.P., Tryndyak, V.P., Boureiko, A., Melnyk, S., Bagnyukova,  T.V., Montgomery, B., and Rusyn, I. Mechanisms
    of peroxisome proliferator-induced DNA hypomethylation in rat liver. Mutat Res 644:17-23, 2008.
Han, J., Danell, R.M., Patel, J.R., Gumerov, D.R., Scarlett, C.O., Speir, J.P.,  Parker, C.E., Rusyn,  I., Zeisel, S., and
    Borchers, C.H. Towards high throughput metabolomics  using ultrahigh field Fourier transform ion cyclotron resonance
    mass spectrometry. Metabolomics 4:128-140, 2008.
Zhu, H., Rusyn, I., Richard, A., and Tropsha  A. The use of cell viability assay data improves the prediction accuracy of
    conventional quantitative structure activity relationship models of animal carcinogenicity. Envr Health Persp
    116:506-513, 2008.
Pogribny,  I.P., Rusyn,  I.  and Beland, F.A. Epigenetic aspects of genotoxic  and non-genotoxic  hepatocarcinogenesis:
    Studies in rodents. Environ Mol Mutagen 49:9-15, 2008.
Rusyn, I., Fry, R.C., Begley, T.J., Klapacz, J., Svensson, J.P., Ambrose, M., and Samson, L.D. Transcriptional networks
    in S. cerevisiae linked to an accumulation of base excision repair intermediates. PLoS ONE 2:e1252, 2007.
Woods, C.G., Kosyk, O., Bradford, B.U., Ross, P.K., Burns, A.M., Cunningham,  M.L.,  Qu, P.,  Ibrahim,  J.G. and Rusyn  I.
    Gene expression  in mouse liver reveals  a temporal shift in molecular pathways that mediate effects of peroxisome
    proliferators. Toxicol Appl Pharmacol 225:267-277, 2007.

-------
Pogribny, I.P., Tryndyak, V.P., Woods, C.G., Witt, S.E., and Rusyn, I. Epigenetic effects of the continuous exposure to
   peroxisome proliferator WY-14,643 in mouse liver are dependent upon peroxisome proliferator activated receptor α.
   Mutat Res 625:62-71, 2007.
Peffer, R., Moggs, J.G., Pastoor, T., Currie, R.A., Wright, J., Millburn, G., Waechter, F., and Rusyn, I. Mouse liver effects
   of Cyproconazole, a triazole fungicide: Role of the constitutive androstane receptor. Toxicol Sci 99:315-325, 2007.
Woods, C.G., Burns, A.M., Bradford, B.U., Ross,  P.K., Kosyk, O., Swenberg, J.A.,  Cunningham, M.L., and Rusyn I. WY-
   14,643-induced cell proliferation and oxidative stress in mouse liver are independent of NADPH oxidase. Toxicol Sci
   98:366-374, 2007.
Beyer, R.P.,  et  al.  Multi-center  study  of  acetaminophen hepatotoxicity  reveals the  critical  importance  of biological
   endpoints in genomic analyses. Toxicol Sci 99:326-337, 2007.
Gatti, D., Maki, A., Chesler, E.J., Kirova, R., Lu, L., Wang, J., Williams, R.W., Perkins, A., Langston, M.A., Threadgill,
   D.W., and Rusyn,  I. Genome-level analysis of genetic regulation  of liver gene expression networks.  Hepatology
   46:548-557, 2007.
Roberts, A., McMillan,  L., Wang, W., Parker, J., Rusyn, I., Threadgill, D. Inferring missing genotypes in large SNP panels
   using fast nearest-neighbor searches over sliding windows. Bioinformatics 23:i401-07, 2007.
Woods, C.G., Vanden Heuvel, J.P., and  Rusyn, I. Genomic profiling  in nuclear receptor-mediated toxicity. Toxicol Pathol
   35:474-494, 2007.
Maki, A., Kono,  H., Gupta, M., Asakawa, M., Suzuki, T., Matsuda,  M., Fujii, H., and Rusyn,  I.  Predictive power of
   biomarkers of oxidative stress and inflammation in patients with hepatitis C virus-associated hepatocellular carcinoma.
   Ann Surg Oncol 14:1182-90, 2007.
Hammond,  L., Albright, C., He, L., Rusyn, I., Watkins,  S.M., Doughman, S.D., Lemasters, J.J.,  and Coleman, R.A.
   Increased oxidative stress is associated with balanced increases in hepatocyte apoptosis and proliferation in glycerol-
   3-phosphate acyltransferase-1 deficient mice.  Exp Mol Pathol 82:210-219, 2007.
Pogribny, I.P., Tryndyak, V.P., Muskhelishvili, L., Rusyn, I., and Ross, S.A. Methyl deficiency, alterations in global histone
   modifications and carcinogenesis. J Nutr 137:216S-222S, 2007.
Yamashina, S., Ikejima, K., Rusyn, I., and Sato, N. Glycine as a potent anti-angiogenic nutrient for tumor growth. J
   Gastroenterol Hepatol 22:S62-S64, 2007.
Roberts, R.A., Ganey,  P.E., Ju, C., Kamendulis,  L.M., Rusyn, I., and Klaunig, J.E. Role of the Kupffer cell  in mediating
   hepatic toxicity and carcinogenesis. Toxicol Sci 96:2-15, 2007.
Woods, C.G., Burns, A.M., Maki, A., Bradford, B.U., Cunningham, M.L., Connor, H.D., Kadiiska, M., Mason, R.P., Peters,
   J.M., and Rusyn, I. Sustained formation of α-(4-pyridyl-1-oxide)-N-tert-butylnitrone radical adducts in mouse liver by
   peroxisome proliferators is dependent upon peroxisome proliferator-activated receptor-α, but not NADPH oxidase.
   Free Radic Biol Med 42:335-342, 2007.
Rusyn, I., Peters, J.M., and Cunningham, M.L. Modes of action and species-specific effects of di-(2-ethylhexyl)phthalate
   in the liver. Crit Rev Toxicol 36:459-479, 2006.
Powell, C.L.,  Kosyk, O.,  Ross,  P.K., Schoonhoven, R.,  Boysen, G., Swenberg, J.A., Heinloth,  A.M., Boorman, G.A.,
   Cunningham, M.L., Paules, R.S., and Rusyn, I. Phenotypic anchoring of acetaminophen-induced oxidative stress with
   gene expression profiles in rat liver. Toxicol Sci 93:213-222, 2006.
Kono, H., Woods, C.G., Maki, A., Connor, H.D.,  Mason,  R.P., Rusyn, I., and  Fujii, H. Electron spin resonance and spin
   trapping technique  provide direct evidence that edaravone prevents acute ischemia-reperfusion  injury of the liver by
   scavenging free radicals. Free Radic Res 40:579-588, 2006.
Powell, C.P., Kosyk, O., Bradford, B.U., Parker, J.S., Lobenhofer, E.K., Denda, A., Uematsu, F., Nakae, D., and Rusyn, I.
   Temporal correlation of pathology and DNA damage with gene expression in a choline deficient model of rat liver
   injury. Hepatology 42:1137-1147, 2005.
Rusyn, I., Asakura, S., Li, Y., Kosyk, O., Koc, H., Nakamura, J., Upton, P.B., and Swenberg, J.A. Effects of ethylene
   oxide and ethylene inhalation on DNA adducts, apurinic/apyrimidinic sites and expression of base excision DNA repair
   genes in rat brain, spleen, and liver. DNA Repair 4:1099-1110, 2005.
Bammler,  T., et al. Standardizing global gene  expression analysis between  laboratories and  across platforms. Nat
   Methods  2:351-356, 2005.
Bradford, B.U., Kono, H., Isayama, F., Kosyk,  O., Wheeler, M.D., Akiyama, T.E., Bleye, L., Krausz, K.W., Gonzalez, F.J.,
   Koop,  D.R., and Rusyn, I. Cytochrome P450 CYP2E1,  but not NADPH oxidase is  required for ethanol-induced
   oxidative DNA damage in rodent liver. Hepatology 41:336-344, 2005.

Ongoing Research Support:

R01 AA016258        8/06 - 7/10                   0.50 academic  month
NIH        $431,307                               1.00 summer month
Metabolomic and toxicogenetic study of ethanol toxicity

-------
The aim of this proposal is to define a "liver toxicity susceptibility state" in mouse liver in response to ethanol by combining
knowledge of toxicology, metabolomics, gene expression profiling and mouse genetics.

R01 ES15241             12/07-11/12                     2.40 academic month
NIH        $190,907
Bioengineering partnership to improve chemical hazard testing paradigms
This proposal will apply an integrative systems approach to: develop a 3D microscale mouse liver tissue bioreactor that
can be applied  to high-throughput  screening of chemicals;  build,  test and validate  a quantitative structure-toxicity
relationship model that takes  into account genetic diversity among individuals; and validate a fiscally sensible in vivo and
in  vitro toxicity screening paradigm for  a class of allylbenzene derivatives  by producing knowledge  anchored on the
genetic variability present within the population.

RD 833825           04/08-03/12                     2.00 academic month
EPA STAR    $817,943                            0.88 summer month
Carolina Center for Computational Toxicology
The Center will develop complex predictive modeling solutions that span from mechanistic- to discovery-based efforts.

RD 832720 (Wright)  9/06 - 8/10                   0.60 academic month
US EPA STAR      $314,659
Carolina Environmental Bioinformatics Research Center
Project 3: Computational Infrastructure for Systems Toxicology (I. Rusyn, PI)
The objective of this proposal is to develop novel analytic and computational methods, create user-friendly tools to
disseminate the methods to the wide toxicology community, and enhance and  advance the field of Computational
Toxicology.

P42 ES005948 (Swenberg) 04/06 - 03/10                     1.50 academic month
NIEHS        $219,825
Environmental exposure and effect of hazardous chemicals: Project #2 (I. Rusyn, PI)
The primary objective of the UNC Superfund Basic Research Program is to advance multidisciplinary research  on
addressing scientific issues that underpin assessment of human risk and the development of improved methods for
remediation of hazardous waste sites.

R01 ES012689 (Swenberg) 2/05 - 1/10                   0.36 academic month
NIH/NIEHS    $231,919
Adducts as Quantitative Markers of Butadiene Mutagenesis
The aim of this project is to study mutagenesis of butadiene and its metabolites in rodents and humans. New biomarkers
of butadiene exposure will be developed, and applied for research on the mechanisms of action.

R21 GM076059 (Tropsha)  6/06 - 5/10                   0.36 academic month
NIH        $207,353
Robust Computational Framework for Predictive ADME-Tox Modeling
This proposal seeks to establish a universally applicable and robust predictive  ADME-Tox modeling framework based on
rigorous Quantitative Structure Activity/Property Relationships (QSAR/QSPR).

-------
          Principal Investigator/Program Director (Last, First, Middle):   Samet, James M.
                                   BIOGRAPHICAL SKETCH
     Provide the following information for the key personnel and other significant contributors in the order listed on Form Page 2.
                        Follow this format for each person. DO NOT EXCEED FOUR PAGES.
NAME
James M. Samet
eRA COMMONS USER NAME
POSITION TITLE
Senior Research Scientist
EDUCATION/TRAINING  (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral
training.)
INSTITUTION AND LOCATION                          DEGREE (if applicable)   YEAR(s)     FIELD OF STUDY
University of Florida                             B.S.                     1985        Microbiology and Cell Science
University of North Carolina at Chapel Hill, NC   M.S.                     1990        Toxicology
University of North Carolina at Chapel Hill, NC   Ph.D.                    1992        Environmental Sciences
Wake Forest University School of Medicine         Post Doc                 1992-1994   Biochemistry
University of North Carolina at Chapel Hill, NC   Post Doc                 1994-1996   Environmental Health
  A. POSITIONS and HONORS

  Research and Professional Experience:
  2007-Present   Senior Principal Investigator, Clinical Research Branch, Human Studies
                Division, NHEERL
  1997-Present   Adjunct Associate Professor, Curriculum in Toxicology, University of North Carolina at
                Chapel Hill
  1997-2007     Principal Investigator, Clinical Research Branch, Human Studies Division, NHEERL
  2001-2002     Acting Chief, Clinical Research Branch, Human Studies Division
  1996-Present   Diplomate, Certified in General Toxicology, The American Board of Toxicology
  1995-1997     Research Associate, Center for Environmental Medicine and Lung Biology,
                University of North Carolina at Chapel Hill

  Selected Awards and Honors:
  2003   EPA Gold Medal for Exceptional Service
  2002   EPA Science and Technology Achievement Award (Level 1)
  1993   Science Policy Fellow American Association for the Advancement of
             Science/Environmental Protection Agency, Office of Health and Environmental
             Assessment, U.S.  EPA, Washington, DC 20460

  B. SELECTED PUBLICATIONS

  1.  Cao,  D.,  Bromberg,  P.A. and  Samet, J.M.  (2009).  Diesel  Particle-Induced  Transcriptional
     Expression Of P21 Involves Activation Of Egfr, Src And Stat3. Am.  J. Respir. Cell. Mol. Biol. In
     press.
  2.  Lenz, A-G, Kim, YM, Hinze-Heyn, H., Karg, E, Lentner, B, Samet, JM, Schulz, H and Maier, KL




-------
    (2009). Effect of zinc oxide particles on cytokine mRNA expression in  alveolar epithelial cells
    exposed at the air-liquid interface. Submitted
3.  Samet, J.M., Rappold, A., Graff,  D.,  Cascio, W.E., Berntsen, J.H., Huang,  Y-C, T., Herbst, M.,
    Bassett, M.,  Montilla,  T., Hazucha, M.J., Bromberg. P.A. and Devlin, R.B.  (2009). Concentrated
    Ambient Ultrafine Particle Exposure Induces Cardiac Changes In Young Healthy Volunteers. Am. J.
    Respir. Crit. Care Med. In Press.
4.  Silbajoris, R., Huang, J.M., Cheng, W.-C., Dailey, L., Tal, T.L., Jaspers, I., Ghio, A.J., Bromberg, P.A.
    and Samet, J.M. Nanodiamond particles induce  il-8  expression through a transcript stabilization
    mechanism in human airway epithelial cells (2009). Nanotoxicology. In Press.
5.  Wu, W., Silbajoris, R.A., Cao, D., Bromberg, P.A., Zhang, Q., Peden, D. B. and Samet, J.M. (2008).
    Regulation of cyclooxygenase-2 expression by cAMP response element and  mRNA stability in  a
    human airway epithelial cell line exposed to zinc. Toxicol. Appl. Pharmacol. In press.
6.  Tal, T.L., Silbajoris, R.A., Bromberg, P.A., Kim, Y. and Samet, J.M. (2008).  Epidermal growth factor
    activation by diesel particles is mediated by tyrosine phosphatase inhibition. Toxicol. Appl. Pharmacol.
    233:382-8.
7.  Wu W,  Silbajoris RA, Cao  D,  Bromberg PA, Zhang Q,  Peden  DB, Samet  JM. Regulation of
    cyclooxygenase-2 expression by cAMP response element and mRNA stability in a human airway
    epithelial cell line exposed to zinc. Toxicol Appl Pharmacol 2008;231 (2):260-266.
8.  Wu, W., Madden, M., Kim, Y., Silbajoris, R., Jaspers, I., Graves, L.M., Bromberg, P.A. and Samet, J.M.
    (2006). Transcriptional and Posttranscriptional Regulation  of COX-2 Expression in Human Airway
    Epithelial Cells Exposed to Zinc Ions. Bioch. Bioph. Res. Commun. In press.
9.  Wang, X., Samet, J. M., and Ghio, A. J. (2006). Asbestos-induced activation of cell signaling pathways
    in human bronchial epithelial cells. Exp Lung Res 32, 229-243.
10. Dewar BJ, Gardner OS, Chen CS, Earp HS, Samet JM,  and Graves LM. (2006) Capacitative
    calcium entry contributes to the differential transactivation  of the epidermal growth factor receptor
    in response to thiazolidinediones. Mol Pharmacol 72:1146-1156, 2007.
11. Cao, D., Bromberg, PA and Samet, JM (2007). Diesel-induced COX-2 expression involves chromatin
    modification  via degradation of HDAC1 and recruitment  of  p300.  Am.  J.  Respir.  Cell.  Mol.
    Biol. 37:232-239.
12. Samet, JM, Graff, D, Berntsen, J, Ghio AJ, Huang T and Devlin, RB. (2007) A comparison of studies
    on the effects of controlled exposure to fine, coarse and ultrafine ambient particulate matter from a
    single location. Inhal. Toxicol. 19(Suppl. 1): 29-32.
13. Cao, D, Tal, TL, Graves, LM, Gilmour, I, Linak, W, Reed, W, Bromberg, PA, and Samet, JM. (2006).
    Diesel exhaust particulates (DEP)-induced activation of Stat3 requires activities of EGFR and Src in
    airway epithelial cells. Am. J. Physiol: Lung Cell. Mol. Biol. 292: L422-L429.
14. Kim, Y.M., Cao, D., Reed, W., Wu, W., Jaspers, I., Tal, T., Bromberg, P.A. and Samet, J.M. (2006).
    Zn2+-induced NF-kB-dependent transcriptional activity involves site-specific P65/REL-A
    phosphorylation. Cellular Signaling.  19:538-546.
15. Kim, Y.M., Reed, W., Wu, W., Bromberg, P.A., Graves, L.M. and Samet, J. M. (2006). Zn2+-induced
    IL-8 expression involves AP-1, JNK, and ERK activities in human airway epithelial cells.
    Am. J. Physiol: Lung Mol. Cell. Biol. 290: L1028-1035.
16. Wang,  X.C.,  Wu,  Y.M., Samet,  J.M.  and Ghio, A.J. (2006). [Expression of phosphorylated
    ERK1/2 induced by crocidolite fibers in BEAS-2B cells].  Zhonghua Lao Dong Wei Sheng Zhi Ye
    Bing Za Zhi. 24:597-600.


-------
        Principal Investigator/Program Director (Last, First, Middle):   Santo Domingo, Jorge W.
                                      BIOGRAPHICAL SKETCH
          Provide the following information for the key personnel and other significant contributors in the order listed on Form Page 2.
                           Follow this format for each person. DO NOT EXCEED FOUR PAGES.
  NAME
  Jorge W. Santo Domingo
  eRA COMMONS USER NAME
    POSITION TITLE
    Microbiologist
  EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION                       DEGREE (if applicable)   YEAR(s)   FIELD OF STUDY
University of Puerto Rico, Rio Piedras, PR     B.S.                     1984      Biology
University of Puerto Rico, Rio Piedras, PR     M.S.                     1987      Biology
Michigan State University, East Lansing, MI    Ph.D.                    1994      Microbiology
Michigan State University, East Lansing, MI    Post Doc                 1995      Molecular Biology
DOE - WSRC, Aiken, SC                          Post Doc                 1998      Molecular Biology
US EPA, Cincinnati, OH                         Post Doc                 2002      Water Microbiology
A. POSITIONS and HONORS

Research and Professional Experience:
8/84 - 4/87        Research Assistant, Microbial Ecology Laboratory, University of Puerto Rico
4/87 - 9/87        Research Associate, Microbial Ecology Laboratory, University of Puerto Rico
7/87 - 7/93        Graduate Research Assistant, Microbiology Department, Michigan State University
10/93 - 10/94      Research Associate, Microbiology Department, Michigan State University
2/95-6/98         DOE Post Doctoral Fellow, Oak Ridge Institute for Science and Education
7/98 - 10/00       EPA Federal Post Doctoral Fellow, ORD-NERL, Cincinnati, OH
11/00-5/09        Microbiologist, US-EPA, Cincinnati, OH
06/00-present      Research Microbiologist, US-EPA, Cincinnati, OH

Selected Awards and Honors (since 2000):
    US EPA On spot award. 2000. Training of MCEARD technicians in molecular biology methods
    US EPA Superior Accomplishment Recognition Award. 2001.  Development of rapid methods to detect
       fecal enterococci in recreational waters
    US EPA Team Award. 2002.  Establishment of a new research program in molecular biology within the
       Microbial Contaminant Control Branch
    US EPA Superior Accomplishment Recognition Award. 2002. Mentoring of WSWRD technician during a
       detail in MCCB
    US EPA Scientific and Technological Achievement Award, Level II. 2003. Technical Information
       Impacting Attainment of Clean Water Act Microbial Water Quality Goals (TMDLs)
    National Risk Management Research Laboratory Honor Award, Support to Agency's  Mission, Microbial
       Source Tracking, 2004
    Excellence in Review Award, Environmental Science and Technology, 2005.
    US EPA Superior Accomplishment Recognition Award. 2006. For patent application entitled "Development
       of Cow-Specific Primers and Identification of Cow-Specific DNA Sequences Using Genome Fragment
       Enrichment"
    Who's Who in Science and Engineering. Marquis Who's Who. 2006-2007
    US EPA Superior Accomplishment Recognition Award. 2007. For work in projects resulting in numerous
       publications
    US EPA Superior Accomplishment Recognition Award. 2007. For research paper entitled "Identification of
       Bacterial DNA Markers for the Detection of Human Fecal Pollution in Water"


-------
   ASM monthly magazine Microbe highlights research paper entitled "Identification of Bacterial DNA
       Markers for the Detection of Human Fecal Pollution in Water", June 2007
   US EPA Scientific and Technological Achievement Award, Level II. 2007. Scientific and Technological
       Achievement in the Field of Genomics.
   US EPA Scientific and Technical Achievement Award, Level II. 2008. Scientific and Technological
       Achievement in the Field of Water Quality Monitoring and Indicators of Fecal Pollution

Other Experience and Professional Memberships:
   Chair of Center of Excellence for Environmental Genomics Committee: 2008 -present
   Member of the Molecular Biology Subcommittee of Standard Methods 9001: 2004-2008
   Member of the Computational Toxicology Steering Committee (EPA): 2003 -2006
   Member of the EPA Genomics Task Force Workgroup (EPA): 2006 - 2007
   Member of WERF Project Steering Committee: 2002 - 2006
   Member of the American Society for Microbiology: 1985 -present
   Associate Editor of the Journal of Environmental Quality: 2002 - present
   Ad hoc reviewer for:  Applied and Environmental Microbiology, Environmental Science and Technology,
   Microbiology Ecology, Microbiology, FEMS Microbiology Ecology, Bioremediation Journal, Water
   Research, Water Environment Research, Journal of Applied Microbiology, Letters in Applied
   Microbiology, Canadian Journal of Microbiology, Transactions of the American Society of Agricultural
   Engineers, Electronic Journal of Biotechnology, NSF, NOAA, USDA, CICEET, SBRI, Gulf of Mexico
   Program 2006 RFP

B. SELECTED PUBLICATIONS

Lu, J., J. W. Santo Domingo, S. Hill, and T. A. Edge. 2009. Microbial Diversity and Host-specific Sequences of
   Canadian Goose Feces. Appl. Envir. Microbiol. In Press
Lamendella, R., J. W. Santo Domingo, A. C. Yannarell, S. Ghosh, G. Di Giovanni, R. I. Mackie, and D. B.
   Oerther. 2009. Evaluation of swine-specific PCR assays used for fecal source tracking and analysis of
   molecular diversity of Bacteroidales-swine specific populations. Appl. Envir. Microbiol. In Press
Lee, Y.-J., M. Molina, J.W. Santo Domingo, J.D. Willis, M. Cyterski, D.M. Endale, and O.C. Shanks.
   2008. A temporal  assessment of cattle fecal pollution in two watersheds using 16S rRNA gene-based and
   metagenome-based assays. Appl. Environ. Microbiol. 74:6839-6847.
Lu, J. and J. W. Santo Domingo. 2008. Turkey fecal microbial community structure and functional gene
   diversity revealed by 16S rRNA gene and metagenomic sequences. J. Microbiol. 46:469-477.
Lu, J., J. W. Santo Domingo, R. Lamendella, T. Edge, and S. Hill. 2008. Phylogenetic diversity and molecular
   detection of gull feces. Appl. Environ. Microbiol. 74: 3969-3976.
Lamendella, R., Santo Domingo J. W., Kelty C, and Oerther DB. 2008. Occurrence of bifidobacteria in feces
   and environmental waters. Appl. Environ. Microbiol. 74:575-584.
Shanks OC, Atikovic  E, Blackwood AD, Lu J, Noble RT, Santo Domingo J.W., Seifring S, Sivaganesan M,
   Haugland RA. 2008. Quantitative PCR for genetic markers of bovine fecal pollution. Appl. Environ.
   Microbiol. 74:745-752.
Santo Domingo, J. W., D.G. Bambic, T.A. Edge, and S. Wuertz. 2007. Quo vadis source tracking? Towards a
   strategic framework for environmental monitoring of fecal pollution. Water Res. 41:3539-3552.
Lu, J., J. W. Santo Domingo, and O.C. Shanks. 2007. Identification of chicken-specific fecal microbial
   sequences using a metagenomic approach. Water Res. 41:3561-3574.
Shanks, O., J. W. Santo Domingo, J. Lu, C.A. Kelty, and J. Graham. 2007. PCR Assays for the identification of
   human fecal pollution in water. Appl. Environ. Microbiol. 73: 2416-2422.

-------
Vogel, J.R., D.M. Stoeckel, R. Lamendella, R.B. Zelt, J.W. Santo Domingo, S.R. Walker, and D.B. Oerther.
   2007. Identifying fecal sources in a selected catchment reach using multiple source-tracking Tools. J.
   Environ. Qual. 36:718-729.
Lamendella, R., J. W. Santo Domingo, D. Oerther, J. Vogel, and D. Stoeckel. 2007. Assessment of fecal
   pollution sources in a small northern-plains watershed using PCR and phylogenetic analyses of
   Bacteroidetes 16S rDNA. FEMS Microbiol. Ecol. 59:651-660.
Revetta, R.P., Santo Domingo, J.W., Kelty, C.A., Humrighouse, B.H., Oerther, D.B., Lamendella, R.,
   Keinanen-Toivola, M., and Williams, M. "Molecular diversity of drinking water microbial communities: a
   phylogenetic approach," Water Environment Federation, Proceedings of Disinfection 2007, Pittsburgh, PA,
   February 4-7, 2007.
Santo Domingo, J. W., Lu, J., Shanks, O., Lamendella, R., Kelty, C. A., and Oerther, D.B. "Development of
   host-specific markers for source tracking using a novel metagenomic approach," Water Environment
   Federation, Proceedings of Disinfection 2007, Pittsburgh, PA, February 4-7, 2007.
Shanks, O., J.  W. Santo Domingo, R. Lamendella, C.A. Kelty, and J. Graham. 2006. Competitive metagenomic
   DNA hybridization identifies host-specific genetic markers in cattle fecal samples. Appl. Environ.
   Microbiol. 72:4054-4060.
Shanks, O., J.  W. Santo Domingo, and J. Graham. 2006. Use of competitive DNA hybridization to identify
   differences in the genomes of two closely related fecal indicator bacteria. J. Microbiol. Methods. 66:321-
   330.
Keinanen-Toivola, M.M., R.P. Revetta, and J. W. Santo Domingo. 2006. Identification of active bacterial
   communities in drinking water biofilms using 16S rRNA-targeted clone libraries. FEMS Microbiol. Letters.
   257:182-188.
Devereux, R., P. Rublee, J. H. Paul, K. G. Field and J. W. Santo Domingo. 2006. Development and
   applications of microbial ecogenomic indicators for monitoring water quality: report of a workshop
   assessing the state of the science, research needs and future directions. Environ. Monit. Assess.  116:459-
   479.
Humrighouse, B. H., Santo Domingo, J.W., Revetta, R.P., Lamendella, R., Kelty, C., and Oerther, D.B.
   "Microbial characterization of drinking water systems receiving groundwater and surface water as the
   primary sources of water," Water Distribution System Analysis, Proceedings of the Annual Meeting,
   Cincinnati, OH, August 27-30, 2006.
Batz, M.B., M.P. Doyle, J.G. Morris, Jr., J. Painter, R. Singh, R.V. Tauxe, M.R. Taylor, D.M.A. Lo Fo Wong,
   F. Angulo, R. Buchanan, H.G. Claycamp, C. Smith DeWaal, J.W. Santo Domingo, K. Field, D. Goldman,
   S. O'Brien, M. Moore, E. Ribot, S. Sundlof, and C. Woteki. 2005. Attributing illness to  food. Emerging
   Infectious Diseases. Available from http://www.cdc.gov/ncidod/EID/vol11no07/04-0634.htm
Williams, M.W., J. W. Santo Domingo, and M.C. Meckes. 2005. Population diversity in model potable water
   biofilms receiving chlorine or chloramines residual. Biofouling J. 21: 279-288.
Dick, L.K., A.E. Bernhard, T.J. Brodeur, J.W. Santo Domingo, J.M. Simpson, S.P. Walters, and K.G. Field.
   2005. Host distributions of uncultivated fecal Bacteroidales reveal genetic markers for fecal source
   identification. Appl. Environ. Microbiol. 71:3184-3191.
Simpson, JM, JW Santo Domingo, DJ Reasoner. 2004. Assessment of equine fecal contamination: the search for
   alternative bacterial source-tracking targets.  FEMS Microbiology Ecology 47:65-75.
Williams, M, J Santo Domingo, M Meckes, C Kelty, H Rochon.  2004. Phylogenetic Diversity of Drinking Water
   Bacteria in a Distribution System Simulator. Journal of Applied Microbiology. In press

Invited oral presentations (last five years)

Swine and avian MST research.  Gulf of Mexico Program MST Research Review. St Pete Beach, FL. February
   2009.
What have we learned after 10 years of MST research:  its importance in marker development and future
   directions. Gulf of Mexico Alliance MST Workshop. Plenary session. St Pete Beach, FL. February 2009.
Monitoring fecal pollution using molecular tools. Department of Biological Sciences, Northern Kentucky
   University. Highland Heights, KY. November 2008
Molecular tools for assessing microbial water quality. Department of Earth and Environmental Engineering,
   Columbia University, New York, NY. April 2008.
Microbial source tracking: it takes two to tango. Department of Microbiology. University of Tennessee.
   Knoxville, TN. April 2008.
Microbial source tracking: a tool box for environmental monitoring. Ohio State University, Ohio Agricultural
   Research and Development Center, Wooster, OH. October 2007.
Introduction to microbial source tracking: a library independent perspective. EPI-Net workshop. Chicago, IL
   September 2007.
Microbial Forensics and Environmental Monitoring of Fecal Pollution: From Phylogenetics to Metagenomics.
   In Environmental Forensics: Microbial Clues to Contamination. ASM Annual Meeting, Toronto, Canada.
   May 2007.
Microbial Source Tracking. Ontario Ministry of the Environment. Toronto, Canada. May 2007.
Fecal Source Tracking:  an essential approach to microbial water quality monitoring.  Department of Crop and
   Soil Sciences. University of Kentucky, Lexington, KY. February 2007.
Determining the Sources of Fecal  Pollution Using Molecular Methods. Allegheny Branch American Society for
   Microbiology (ABASM) meeting. Plenary session. LaTrobe, PA, October 2006.
Microbial Source Tracking: A Necessary Tool for Environmental Monitoring and Risk Assessment. National
   Beaches Conference, Niagara  Falls NY, December 2006
Environmental Monitoring of Polluted Waters Using Source Tracking Molecular Tools: Lessons Learned and
   Future Needs. Water Management Association of Ohio (WMAO) 35th Annual Fall Conference. Columbus,
   OH, November 2006
Application of a Library Independent Method used in the Identification of Fecal Pollution Sources in
   Environmental Waters. Sustainable Beaches Conference. St. Petersburg, FL, October 2005.
Assumptions and Limitations of MST. Microbial Source Tracking Workshop. Sponsored by Water
   Environment Research Foundation and the Metropolitan Water District of Southern California, San
   Antonio, TX, February 2005

-------
        Principal Investigator/Program Director (Last, First, Middle):  Segal, Deborah
                                     BIOGRAPHICAL SKETCH

NAME: Deborah Segal
eRA COMMONS USER NAME: Segal
POSITION TITLE: Environmental Health Scientist

EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION        DEGREE (if applicable)   YEAR(s)   FIELD OF STUDY
George Washington University    B.A.                     1987      Political Communications
Johns Hopkins University        M.H.S.                   2000      Toxicology

A. POSITIONS and HONORS
A. POSITIONS and HONORS

Research and Professional Experience:
2007-        Project Officer, National Center for Environmental Research (NCER), ORD, USEPA
2000-2007    Science Review Administrator, NCER, ORD, USEPA
1997-2000:    Predoctoral Fellow, Johns Hopkins Bloomberg School of Public Health, Division of Toxicology
1992-1997:    Science Communications Officer, American Psychological Association
1990-1992:    Technical Production Editor, APA Books, American Psychological Association
1989-1990:    Assistant Editor, Broadcasting Yearbook, Broadcasting Publications, Inc.
1988-1989:    Editorial Assistant, Broadcasting Yearbook, Broadcasting Publications, Inc.

Professional Societies and Affiliations:
International Society for Exposure Analysis, Member

Honors and Awards:
EPA Bronze Medal for Commendable Service, for Creating and Maintaining the Preeminent Peer Review
Process for Evaluating Environmental Research in the United States, 2006; NIEHS Predoctoral Fellowship
Award, 1997-2000.

Selected Invitations at National & International Symposia:
"EPA's STAR Research Program in Computational Toxicology," presented at the 2nd & 3rd Annual Systems
Toxicology Symposiums:  Piscataway, NJ, 2008, 2009. "Overview of EPA's Workshop on Research Needs for
Community-Based Risk Assessment," presented at the Symposium on Exposure Science for Community-
Based Cumulative Risk Assessment, International Society for Risk Assessment Annual Conference, Durham,
NC, 2007.

Selected Expert Committees/Advisory Panels/Organizing Committees:
Chaired/Organized session at the "Research Approaches to Assessing Public Health Outcomes of Risk
Management Decisions Workshop" (January 2008) featuring the research of new grantees to the
"Environmental Health Outcome Indicators" STAR program. Directed all aspects of "Research Needs for
Community-Based Risk Assessment Workshop" (October 2007)  designed to identify data gaps and prioritize
research needs for community risk assessments that consider interactions between chemical and non-
chemical stressors.  Chaired and Organized session at U.S. EPA STAR Graduate Fellowships Conferences:
"Toxicogenomics" (October 2004) and "Use of Models in the Risk Assessment Process" (October 2006).

Selected Assistance/Advisory Support to the Agency:
Chair of Workgroup on Lab/Center/Office (L/C/O) Involvement in NCER Activities; Cochair of Peer Review
Improvement Committee (Chair, Subcommittee: Peer Review Summaries and FACA); Serves on NCER Policy
Implementation Workgroup; Served on Mode of Action Workgroup.

B. Publication
Sanchez, Deener, Cohen Hubal, Knowlton, Reif, & Segal (2009). "Research needs for community-based risk
assessment: findings from a multi-disciplinary workshop." Journal of Exposure Science and Environmental
Epidemiology, Advance Online Publication, 25, 1-10.

-------
        Principal Investigator/Program Director (Last, First, Middle):   Setzer, Woodrow R.
                                    BIOGRAPHICAL SKETCH
  NAME: R. Woodrow Setzer
  eRA COMMONS USER NAME:
  POSITION TITLE: Mathematical Statistician

  EDUCATION/TRAINING
  INSTITUTION AND LOCATION                           DEGREE (if applicable)   YEAR(s)   FIELD OF STUDY
  University of Chicago, Chicago, Illinois           B.A.                     1974      Mathematics
  SUNY at Stony Brook, Stony Brook, New York         Ph.D.                    1983      Ecology and Evolution
  University of North Carolina, Chapel Hill          Post-doc                 1987      Biostatistics
  National Research Council Fellow, USEPA, RTP, NC   Post-doc                 1989      Biostatistics and Risk Assessment
A. POSITIONS and HONORS
Research and Professional Experience:
2009 - Present
2009 - Present
2005 - Present
2002 - 2005

2000-2009
1993-2002
1989-1993
1987-1989
1984-1987
1984
Adjunct Professor, Department of Biostatistics, North Carolina State University
Mathematical Statistician, NCCT, ORD, EPA
Mathematical Statistician, PKB, ETD, NHEERL, ORD EPA
Adjunct Associate Professor, Department of Biostatistics, School of Public Health,
University of North Carolina at Chapel Hill
Mathematical Statistician, BRSS, NHEERL, ORD
Health Scientist, HERL, ORD EPA
Postdoctoral Fellow, DTD, HERL, ORD EPA
Postdoctoral Fellow, Department of Biostatistics,
School of Public Health, University of North Carolina, Chapel Hill, NC
Lecturer, Department of Ecology and Evolution, State University of New York, Stony Brook,
NY
Professional Societies and Affiliations:
Society for Risk Analysis
Biometrics Society
American Statistical Association

Honors and Awards:
1.  Level I USEPA Science and Technology Achievement Award for BBDR Modeling of the Developmental
   Toxicity of 5-FU, 1994
2.  Level III USEPA Science and Technology Achievement Award for Dose-Response Relationship in Multi-
   stage Carcinogenesis, 1994
3.  Honorable Mention, USEPA Science and Technology Achievement Award for A New Mechanism for the
   Exogenous Mitigation of 5-Fluorouracil-Induced Toxicity, 1997
4.  USEPA Silver Medal for the Organophosphate Cumulative Risk Assessment, 2003
5.  USEPA Bronze Medal for Commendable Service for Development of Benchmark Dose Software, 2004
6.  USEPA Silver Medal for Scientific Workgroups for EPA's Guidelines for Carcinogen Risk Assessment and
   Supplemental Guidance for Assessing Susceptibility from Early-Life Exposure to Carcinogens, 2006
7.  USEPA Bronze Medal for Commendable Service for work with the Malathion RED team, 2006.
8.  Honorable Mention, USEPA Science and Technology Achievement Award for Defining Different
   Populations of Supernumerary Ribs and Assessing their Biological and Regulatory Significance, 2006
9.  USEPA Bronze Medal for Commendable Service for Information Technology Improvement Project Team,
   2007
10. Level III USEPA Science and Technology Achievement Award for Facilitating the Evaluation and
   Utilization of Physiologically Based Pharmacokinetic (PBPK) Models in Risk Assessment, 2007
11. Honorable Mention, USEPA Science and Technology Achievement Award for ToxCast: A Biologically and
   Chemically Based System for EPA Program Offices to Prioritize Toxicity Testing of Chemicals
12. USEPA Bronze Medal for "Successful Completion of a Decade's Work Involving Cutting Edge Science and
   Innovation on the N-Methyl Carbamate Cumulative Risk Assessment", 2008
13. USEPA Bronze Medal for Efforts as Part of a Team Instrumental in Developing and Implementing the
   BMD Methodology for Use in IRIS Assessments, 2008.
14. Award for Exceptional or Outstanding ORD Technical Assistance to the Regions or Program Offices, 2008.
15. Level III USEPA Science and Technology Achievement Award for the Analysis of the Risk Assessment
   Implications of Early-Life Exposure to Carcinogens Considering Mode of Action, 2008.
16. Superior Accomplishment Award from the Office of Pesticide Programs, 2009.

Selected Invitations at National & International Symposia:
   Risk Assessment Using EPA Benchmark Dose Software Version 1.2.  A full day workshop presented (with
J. Gift) at the annual meeting of the Society for Risk Analysis, December 5, 1999
   Calculating and Using Benchmark Doses (BMD).  Federal/State Toxicology and Risk Analysis Committee,
May 21-23, 2001.
   Populations and PK Models. NERL/NHEERL Exposure to Dose Modeling Workshop, Research Triangle
Park, NC, July 10-11, 2001.
   Basic Statistical Analysis of Developmental Toxicity Studies, in Experimental Design and Biostatistics, a
mini-education course at the annual meeting of the Teratology Society, Scottsdale, AZ, June 25, 2002.
   Use of NOAEL, benchmark dose, and other models for human risk assessment of hormonally active
substances. SCOPE/IUPAC International Symposium on Endocrine Active Substances, Yokohama, Japan,
November 17-21, 2002.
   Cumulative Risk Analysis for Organophosphorus Pesticides. Society of Toxicology, Salt Lake City, UT,
March 9-13, 2003.
   Mediating the Meeting between Model and Data: Statistical Issues for PBPK Modeling. International
Workshop on Uncertainty and Variability in Physiologically Based Pharmacokinetic (PBPK) Models, Research
Triangle Park, NC.  October 31 - November 2, 2006.
   International Workshop on Uncertainty and Variability in Physiologically Based Pharmacokinetic Models.
An International Workshop on the Development of Good Modelling Practice for PBPK Models.  Chania, Crete,
Greece. April 26 -  28, 2007.
 Lessons Learned:  Modeling Cancer Data.  ILSI Europe Workshop on the Application of Margin of Exposure
(MoE) Approach to Compounds in Food which are both Genotoxic and Carcinogenic, Rhodes, Greece, October
1-3, 2008.
 Scientific Workshop to Inform EPA's Response to National Academy of Science Comments on the Health
Effects of Dioxin in EPA's 2003 Dioxin Reassessment. Cincinnati, Ohio, 18-20 February, 2009.
 WHO International Workshop on Principles of Characterizing and Applying PBPK Models in Risk
Assessment. Berlin, Germany, 6-9 July, 2009.

Selected Expert Committees/Advisory Panels/Organizing Committees:
Member, Editorial Board, Toxicology Methods, 1994 - 1998
President-Elect, Research Triangle Chapter, Society for Risk Analysis, 2001 - 2002
Chair, Research Triangle  Chapter, Society for Risk Analysis, 2002 - 2003

Affiliate Member of the Biostatistics and Epidemiological Methods Facility Core, University of North Carolina
at Chapel Hill Center for Environmental Health and Susceptibility
ILSI HESI Dose Dependent Transitions in Mechanisms of Toxicity Committee 2002 - 2003.
Invited Participant, WHO/IPCS Author's Workshop on Dose-Response Modeling, Geneva, Switzerland, 2004
Invited Participant, EFSA/WHO International Conference, "Risk Assessment of Compounds that are both
Genotoxic and Carcinogenic" Brussels, Belgium, 16-18 November, 2005
ILSI-Europe Expert Group on the Application of the Margin of Exposure Approach to Genotoxic Carcinogens in
       Food. 2006-2008.
Associate Editor for Journal of Statistical Software, 2007 - present.
Publications Officer, Risk Analysis Specialty Section, American Statistical Association, starting January, 2010.

Selected Assistance/Advisory Support to the Agency:
Planning Committee and epidemiology session Chair, Mn/MMT Workshop held in Research Triangle Park, NC,
       March 12-15, 1991
Co-Chair, Organizing Committee for the First  HERL Symposium: Biological Mechanisms and Quantitative Risk
       Assessment, 1992 - 1993.
IRIS RfD/C Committee, 1994-1995
Chair,  Risk Assessment Forum  Technical Panel, Benchmark  Dose Technical  Guidance Document, 1998 -
       2005
Statistical Consultant/Collaborator with the National Center for Environmental Assessment for Development of
       EPA's Benchmark Dose Software. 1993 - present.
Co-Chair,  International Workshop on Uncertainty and  Variability in  Physiologically Based Pharmacokinetic
       Models (2006 - present; workshop Oct 31  - Nov 2, 2006).
Member, NCEA Statistical Working Group (2005 - present).
Member, ORD Information Technology Improvement Project Working Group (2006).

B. SELECTED PUBLICATIONS (selected from 56 peer-reviewed).

1. Scheerer JB, Xi L, Knapp GW, Setzer RW, Bigbee WL, and Fuscoe JC (1999) Quantification of Illegitimate
   V(D)J Recombinase-Mediated Mutations in Lymphocytes of Newborns and Adults.  Mutation Research,
   431: 291-303.
2. Hurst CH, DeVito MJ, Setzer RW, and Birnbaum LS (2000) Acute Administration of 2,3,7,8-
   Tetrachlorodibenzo-p-dioxin (TCDD)  in Pregnant Long Evans Rats: Association of Measured Tissue
   Concentrations with Developmental Effects. Toxicological Sciences 53: 411-420.
3. Lau C, Andersen ME, Crawford-Brown D,  Kavlock RJ, Kimmel CA, Knudsen TB, Muneoka K, Rogers JM,
   Setzer RW, Smith G, and Tyl R (2000). Evaluation of Biologically Based Dose-Response Modeling for
   Developmental Toxicity: A Workshop Report.  Regulatory Toxicology and Pharmacology, 31: 190-199.
4. DeWoskin RS, Barone S Jr., Clewell  HJ, Setzer RW (2001) Improving the development and use of
   biologically based dose response models  (BBDR) in risk assessment. Human and Ecological Risk
   Assessment, 6:  1091 - 1120.
5. Lau C, Mole ML, Copeland MF, Rogers JM, Kavlock RJ, Shuey DL, Cameron AM, Ellis DH, Logsdon TR,
   Merriman J, and Setzer RW (2001) Toward a biologically based dose-response model for developmental
   toxicity of 5-fluorouracil in the  rat: Acquisition of experimental data. Toxicological Sciences, 59: 37-48.
6. Setzer RW, Lau C, Mole ML,  Copeland FM, Rogers JM, and Kavlock RJ (2001). Toward a biologically-
   based dose-response model for developmental toxicity of 5-fluorouracil in the rat: a mathematical
   construct.  Toxicological Sciences, 59: 49-58.
7. Shaughnessy DT, Setzer RW, and DeMarini DM (2001).  Effect of the antimutagens vanillin and
   cinnamaldehyde on the spontaneous mutation spectra of Salmonella TA104.  Mutation Research, 480-
   481: 55-69.
8. Wubah JA, Setzer RW, and Knudsen TB (2001).  Exposure-disease continuum for 2-chloro-2'-
   deoxyadenosine (2CdA), a prototype ocular teratogen. 1.  Dose-response analysis. Teratology, 64: 154-
   169.
9. Lau C, Narotsky MG, Lui D, Best D, Setzer RW, Mann PG, Wubah JA, and Knudsen, TB (2002).
   Exposure-disease continuum for 2-chloro-2'-deoxyadenosine (2-CdA), a prototype teratogen: Induction of
   lumbar hernia in the rat and species comparisons for the teratogenic responses. Teratology 66: 6-18.
10. Knapp GW, Setzer RW, Fuscoe JC (2003).  Quantitation of aberrant interlocus T-cell receptor
   rearrangements in mouse thymocytes and the effect of the herbicide 2,4-dichlorophenoxyacetic acid.
   Environmental Molecular Mutagenesis, 42: 37-43.
11. Rogers JM, Setzer RW, Branch S, Chernoff N (2004).  Chemically induced supernumerary lumbar ribs in
   CD-1 mice: size distribution and dose response. Birth Defects Research, 71: 17-25.
12. Smialowicz RJ, Burgin DE, Williams WC, Diliberto JJ, Setzer RW, Birnbaum LS (2004). Cyp1A2 is not
   required for 2,3,7,8-tetrachlorodibenzo-p-dioxin-induced immunosuppression.  Toxicology, 197: 15-22.
13. Slikker W, Andersen ME, Bogdanffy MS, Bus JS, Cohen SD, Conolly RB, David RM, Doerrer NG, Dorman
   DC, Gaylor DW, Hattis D, Rogers JM, Setzer RW, Swenberg JA, Wallace K (2004). Dose-dependent
   transitions in mechanisms of toxicity.  Toxicology and Applied Pharmacology, 201: 203 - 225.
14. Slikker W, Andersen ME, Bogdanffy MS, Bus JS, Cohen SD, Conolly RB, David RM, Doerrer NG, Dorman
   DC, Gaylor DW, Hattis D, Rogers JM, Setzer RW, Swenberg JA, Wallace K (2004). Dose-dependent
   transitions in mechanisms of toxicity: case studies. Toxicology and Applied Pharmacology, 201:  226 -
   294.
15. Clark LH, Setzer RW, Barton HA (2004) Framework for evaluation of physiologically-based
   pharmacokinetic models for use in safety or risk assessment. Risk Analysis 24:  1697 - 1717.
16. Barton HA, Cogliano VJ, Flowers L, Valcovic L, Setzer RW, Woodruff TJ (2005). Assessing Susceptibility
   from Early-Life Exposure to Carcinogens. Environmental Health Perspectives, 113:  1125 - 1133.
17. Dix DJ, Houck KA, Martin MT, Richard AM, Setzer RW, Kavlock RJ. (2007). The ToxCast Program for
   Prioritizing Toxicity Testing of Environmental Chemicals. Toxicological Sciences 95:  5 - 12.
18. Barton HA, Chiu WA, Setzer RW, Andersen  ME, Bailer AJ, Bois FY, DeWoskin RS, Hays S, Johanson G,
   Jones N, Loizou G, MacPhail RC, Portier CJ, Spendiff M,  Tan Y-M (2007). Characterizing Uncertainty and
   Variability in Physiologically-based Pharmacokinetic (PBPK)  Models: State of the Science and Needs for
   Research and Implementation.  Toxicological Sciences, 99: 395 - 402.
19. Kavlock RJ, Ankley G, Blancato J, Breen M,  Conolly R, Dix D, Houck K, Hubal E, Judson R, Rabinowitz J,
   Richard A, Setzer RW, Shah I, Villeneuve D, Weber E. (2008).  Computational toxicology - a state of the
   science mini review. Toxicological Sciences 103: 14-27.
20. Judson R, Elloumi F, Setzer RW, Li Z, Shah  I. (2008)  A comparison of machine learning algorithms for
   chemical toxicity classification using a simulated multi-scale data model. BMC Bioinformatics. 2008 May
   19;9:241.
21. Wambaugh JF, Barton HA, Setzer RW.  2009.  Comparing models for PFOA pharmacokinetics using
   Bayesian analysis.  Journal of Pharmacokinetics and Pharmacodynamics 35:  683 - 712.
22. Lou I, Wambaugh JF, Lau C, Hanson RG, Lindstrom AB, Strynar MJ, Zehr RD, Setzer RW, Barton HA.
   2009. Modeling Single and Repeated Dose Pharmacokinetics of PFOA in Mice. Toxicological Sciences
   107: 331 - 341.
23. Rodriguez CE, Setzer RW, Barton HA. 2009. Pharmacokinetic Modeling of Perfluorooctanoic Acid
   During Gestation and Lactation in the Mouse. Reproductive Toxicology, doi:
   10.1016/j.reprotox.2009.02.009

-------
          Principal Investigator/Program Director (Last, First, Middle):   Shah, I
                                      BIOGRAPHICAL SKETCH
                     Provide the following information for the key personnel in the order listed on Form Page 2.
                           Follow this format for each person. DO NOT EXCEED FOUR PAGES.
  NAME: Imran Shah
  POSITION TITLE: Computational Systems Biologist, National Center for Computational Toxicology, ORD, EPA

  EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
  INSTITUTION AND LOCATION                                                   DEGREE (if applicable)   YEAR(s)   FIELD OF STUDY
  Imperial College of Science, Technology and Medicine, London, UK           B.Sc.                    1989      Physics
  George Mason University, Fairfax, Virginia, USA                            M.S.                     1993      Applied and Engineering Physics
  George Mason University, Fairfax, Virginia, USA                            Ph.D.                    1999      Computational Biology
 A. Positions and Honors.

Research and Professional Experience
2006-present Computational Systems Biologist, National Center for Computational Toxicology, Office of
             Research and Development, US Environmental Protection Agency, Research Triangle Park,
             North Carolina.

2004-2006   Head of Computational Systems Biology,  Icoria, Research Triangle Park, North Carolina.

2000-2004   Assistant Professor of Bioinformatics, Department of Preventive Medicine and Biometrics, and
             Department of Pharmacology, School of Medicine, University of Colorado Health Sciences
             Center, Denver, Colorado.

1997-2004   Director of the Doctoral Program in Bioinformatics, School of Medicine, University of Colorado
             Health Sciences Center, Denver, Colorado.

2001-2004   Director of Integrated Informatics, Department of Pharmacology, School of Medicine, University
             of Colorado Health Sciences Center, Denver, Colorado.
2001-2004   Adjunct Assistant Professor of  Computer Science and  Engineering,  University of Colorado,
             Denver, Colorado.
1999-2001   Adjunct Assistant Professor of Computational Sciences & Informatics, School of Computational
             Sciences, George Mason University, Fairfax, Virginia.
1998-2000   Bioinformatics Research Scientist, American Type Culture Collection (ATCC),
             Manassas, Virginia.
1996-1997   Graduate Fellow, School of Computational Sciences
             George Mason University, Fairfax, Virginia.
1995-1996   Bioinformatics   Software  Developer,  The  Institute  for Genomic Research (TIGR), Rockville,
             Maryland.
1993-1994   Software developer, Vision Lab, Department of Computer Science,
             George Mason University, Fairfax, Virginia.
1991-1994   Graduate Research Assistant, School of Computational Sciences
             George Mason University, Fairfax, Virginia.

Professional Societies and Affiliations:
1997-present International Society for Computational Biology (ISCB).
1999-2006   American Association for Artificial Intelligence (AAAI).
2008-present Society of Toxicology (SOT).




Honors
1996-1997   Predoctoral Fellowship,  School of Computational Sciences, George Mason University, Fairfax,
             Virginia.
1992         NASA  Summer Fellowship  for High  Performance Computing, NASA Goddard  Space  Flight
             Center, Greenbelt, Maryland.
2008         Environmental Protection Agency, Office of Research and Development, Bronze Medal
             Award  for Genomics Training.

Selected invitations  at National & International Symposia:
"Inference in High-throughput Biology." 29th Annual Meeting of the Statistical Society of Canada. Vancouver,
Canada,  June 2001.
"Pathway Visualization in Bioinformatics." Computer Graphics and Visualization Techniques for Bioinformatics
and Medical Applications. Center for Computational Biology, University of Colorado, Denver, October 2002.
"Computational Biology with Common Lisp." Bioinformatics Session, International Lisp Conference. San
Francisco, California,  October 2002.
"Computational Inference of Metabolic Pathways." 10th International Meeting on Microbial Genomes. Lake
Arrowhead, California, September 2002.
"Microbial Metabolic Pathway Prediction." Department of Microbiology, School of Medicine, University of
Colorado, Denver, Colorado, September 2002.
"Metabolic Pathway Inference by Heuristic Search." The BioPathways Conference. Edmonton, Canada, August
2002.
"Alcoholism and Gene Expression Arrays." Workshop on Neuroinformatics. Center for Computational Biology,
University of Colorado, Denver, April 2002.
"Elucidating cis-Regulatory Control using Gene Arrays." The First Rocky Mountain Regional Bioinformatics
Meeting, Aspen, Colorado,  December 2003.
"Computational Elucidation of Transcription Control Modules from Gene Expression Arrays." Human Medical
Genetics Program,  University of Colorado Health Sciences Center, May 2003.
"Pathway Elucidation, in silico." Department of Chemical Engineering, The University of Queensland,
Queensland, Australia, July 2003.
"Integrating Biomarkers to Understand Fatty Liver Disease." Metabolic Profiling. Durham, North Carolina,
November 2005.
"Predicting Dose-Response at The System  Level." McKim Conference, Duluth, Minnesota, September 2007.
"Representing Chemical-Induced  Liver Injury for Multiscale Tissue  Modeling." Conference on  Semantics in
Healthcare & Life-Sciences (C-SHALS 2008), Boston,  Massachusetts, February 2008.
"The Virtual Liver Project: Simulating Tissue Injury through Molecular and Cellular Processes." Systems
Biology of the Liver, International Conference on Systems Biology, Goteborg, Sweden, August 2008.
"Simulating Hepatic Tissue Lesions as Virtual Cellular Systems." Society of Toxicology, March 2009.
"The EPA Virtual Liver Project." International Workshop on Virtual Tissues, Research Triangle Park, North
Carolina, April 2009.
"The Virtual Liver: Simulating Hepatic Tissue Lesions as Cellular Systems." Toxicology and Risk Assessment
Conference, Cincinnati, Ohio, April 2009.
"Cell Behaviours and Virtual Tissues." Cell Behaviour Ontology Workshop, National  Institute for General
Medical Sciences, Bethesda, Maryland, May 2009.

Selected Expert Committees/Advisory Panels/Organizing Committees:
2000-2005   Member of the Center for Computational Pharmacology, Program in  Biomolecular Structure
             University of Colorado Health Sciences Center, Denver, Colorado.
2001-2006   National Human Genome Research Institute, National  Institute for Mental Health, National  Heart
             Lung and Blood Institute, National Science Foundation and Department of Energy grant
             application review.
2001-2005   Member of the Center for Computational Biology,  University of Colorado, Denver, Colorado.
2002         Co-organizer of the first workshop on Bioinformatics, Center for Computational Biology,
             University of Colorado, Denver.

2002-2004   Member of BioPAX: An ontology for representing and exchanging biochemical pathways.
2003-2004   Program Committee for the "Innovative Applications of Artificial Intelligence (IAAI-03)"
             Conference, Acapulco, Mexico.
2007         Session Chair, "Modeling Signaling as a Determinant of System Behavior", International Forum
             on Computational Toxicology, EPA,  Research Triangle Park, North Carolina.
2008-2009   Co-chair of the "First International Workshop on Virtual Tissues (v-Tissues 2009)", Research
             Triangle Park, North Carolina.
2007         Environmental Protection Agency, Office of Research and Development, Future of Toxicology
             Working Group.

 B. Selected peer-reviewed publications

Cohen Hubal, E.A., Richard, A. M., Shah, I., Gallagher, J., Kavlock, R., Blancato, J., and Edwards, S.
Exposure science and the U.S. EPA National Center for Computational Toxicology. J Expos Sci Environ
Epidemiol (November 5, 2008).

Judson, R., Elloumi, F., Setzer, R. W., Li, Z., & Shah, I. A comparison of machine learning algorithms for
chemical toxicity classification using a simulated multi-scale data model. BMC Bioinformatics 9, 241 (2008).

Kavlock, R. J., Ankley, G., Blancato, J., Breen, M., Conolly, R., Dix, D., Houck, K., Hubal, E., Judson, R.,
Rabinowitz, J., Richard, A., Setzer, R.W., Shah, I., Villeneuve, D., Weber, E. (2007). Computational toxicology -
a state of the science mini review, Toxicol Sci, Advance Access, Dec 7, 2007.
Lapadat, R., DeBiasi, R.L., Tyler, K.L., Johnson, G.L., and Shah, I. Genes induced by reovirus have a distinct
modular cis-regulatory architecture. Current Genomics, 6(7):501-513, 2005.
McShan, D., Upadhyaya, M. and Shah, I. Symbolic inference of xenobiotic metabolism. Pac. Symp. Biocomp.
9:545-56, 2004.
McShan, D., Upadhyaya, M. and Shah, I. Heuristic Search for Metabolic Engineering: de novo synthesis of
vanillin. Comp. and Chem. Engg., Bioinformatics special issue (in press). 2004.
Hink, R.L., Hokanson, J.E., Shah, I., Long, J.C., Goldman, D., Sikela, J.M. Investigation of DUSP8 and CALCA
in alcohol dependence. Addiction Biology. 8(3):305-312, 2003.
McShan, D., Rao, S. and Shah, I. PathMiner: Predicting metabolic pathways by heuristic search.
Bioinformatics. 19(13): 1692-1698, 2003.
McShan, D., and Shah, I. Distributed Intelligent Agents in Lisp for Bioinformatics (DIAL-B). Agents in
Bioinformatics, Autonomous and Multiagent Systems, 56-59, 2002.
Shah, I. and Hunter, L. Visual management of large scale data mining projects. Pac. Symp. Biocomp. 5:275-
287, 2000.
Shah, I. and Hunter, L. Visualization based on the Enzyme Commission nomenclature. Pac. Symp. on
Biocomp. 3:142-152, 1998.
Shah, I. and Hunter, L. Identification of Divergent Functions in Homologous Proteins by Induction over
Conserved Modules. Intell. Syst. Mol. Biol. 6:157-164, 1998.
Shah, I. and Hunter, L. Predicting Enzyme Function from Sequence: A Systematic Appraisal. Intell. Syst. Mol.
Biol. 5:276-283, 1997.

 C. Research Support.
Principal investigator on "Modeling Metabolic Pathways: A Bioinformatics Approach." Funded by the National
Science Foundation, Department of Energy and Office of Naval Research, from September 1, 2000 to March
22, 2004.
Co-Principal investigator on "Integrated Neuroinformatics Resource for Alcoholism Research." Funded by the
National Institute on Alcohol Abuse and Alcoholism, from August 1, 2001 to March 22, 2004.
Investigator on "Gene Array Technology Center for Alcohol Research." Funded by the National Institute on
Alcohol Abuse and Alcoholism, from April 1, 2001 to March 22, 2004.
Co-Principal investigator on "Application of expression analysis to study disease pathogenesis." Funded by the
National Heart, Lung and Blood Institute, from October 2002 to March 22, 2004.
Principal investigator on "Target Assessment Technology Suite." Funded by the National Institute of Standards
and Technology, Advanced Technology Program, from 2002 to 2006.



-------
                                          BIOGRAPHICAL SKETCH
   Provide the following information for the key personnel and other significant contributors in the order listed on Form
   Page 2. Follow this format for each person. DO NOT EXCEED FOUR PAGES.

NAME
Singh, Amar V.

POSITION TITLE
Scientific Systems Analyst

EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and
include postdoctoral training.)

  INSTITUTION AND LOCATION                      DEGREE        YEAR(s)   FIELD OF STUDY
  Yuvaraja's College, Mysore India              B.S.          1993      Botany/Biochem/Zool.
  University of Mysore, Mysore India            M.S.          1995      Biotechnology
  University of Louisville, Louisville, KY US   Certificate   2006      Project Management
  CCS University, Meerut India                  PhD           2009      Botany (Bioinformatics)

A.  POSITIONS AND HONORS (In chronological order)

     Research and Professional Experience:

     1995-96  Assistant Manager, Jayson's Agritech Pvt. Ltd., Mysore, India
     1996-97  Project Assistant, Central Food Technological Research Institute, Mysore, India
     1997-99  Research Fellow, Department of Biotechnology, University of Mysore, Mysore, India
     1999-00  Clinical Data Reviewer, Clinical Data Management Center, GSK, Bangalore, India
     2000-01   Research Scientist in Bioinformatics, Avestha Gengraine Technologies Pvt. Ltd., India
     2001-02  Group Leader, Bioinformatics, Avestha Gengraine Technologies Pvt. Ltd., India
     2002-03  Manager, Bioinformatics, Mascon Global Ltd., Princeton NJ
     2003-03  Research Asst. Thomas Jefferson University, Philadelphia PA
     2003-05  Research Associate, Systems Analysis Laboratory, U of Louisville Birth Defects Center, Louisville KY
     2005-07  Research Scientist, Birth Defects Center, U of Louisville Birth Defects Center, Louisville KY USA
     2005-07  Bioinformatics Manager,  Systems Analysis Laboratory, U of Louisville, Louisville KY
     2006-07  Operation Manager, Center for Environmental Genomics and Integrative  Biology, UoL Louisville KY
     2007-pres Scientific Systems Analyst, Lockheed Martin Contractor at National Center for Computational Toxicology
         (NCCT) US EPA, RTP Durham NC

     Professional Societies and Affiliations:

     International Society of Computational Biologists (ISCB), Teratology Society, Asia Pacific Bioinformatics Network
     (ApBioNET), African Society for Bioinformatics and Computational Biology (ASBCB), BioClues (Indian Bioinformatics
     Society)

     Honors and Awards:

    1997   Lady Tata Memorial Fellowship (Lady Tata Memorial Trust, Mumbai India).
    1998   Senior Research Fellowship (Central Scientific and Industrial Research, Govt of India New Delhi, India)
    2006   Young Investigator Travel Award, 46th Annual Meeting of the Teratology Society, Tucson,
       Arizona.

    Special Recognition:  Elected to Inbios Management Group, Bioinformatics Society of India (INBIOS) (2002-pres);
    Research!Louisville: 3rd place, Innovation in Biotechnology (2004); Committee Co-Chair, Issues and Protocols in
    Bioinformatics Education, Bioinformatics Society of India (2005-pres); Member, Web Site Committee, Teratology
    Society (2005-2009; Chair of the Committee 2007-2008); Member, Constitution and By-Laws Committee, Teratology
    Society (2009-pres); Core Council Member and Secretary, BioClues; Organizing Committee Member for First Virtual
    Bioinformatics Conference InBix'10 in India.

-------

   SELECTED INVITATIONS AT NATIONAL & INTERNATIONAL SYMPOSIA:

   Invited speaker, "Systems Biology in Understanding Developmental Defects—Efforts and Challenges", First virtual
   conference on "Bioinformatics to Systems Biology" November 16, 2007, jointly organized by Bioinformatics.Org and
   the ISCB Regional Student Group, Denmark

B. SELECTED  PEER-REVIEWED PUBLICATIONS (in chronological order).


    1.   S. Ahuja, S. K. Bagga*, R. Keith, G. G. Nair, A. V. Singh and R. V. S. V. Vadlamudi. (2002) Intellectual
         Property Rights, Indian Journal of Pharmaceutical Sciences AI-PEAR-GP  Discussion of the month, Nov-Dec
         2002  Issue.

    2.   Dr. S. Bagga, A. V. Singh and S. Goswami. (2002) Gene Prediction: A New Frontier in Pharmaceutical
         Research, II Anniversary Chronicle Pharmabiz Specials Dec 26 2002.

    3.   S. K. Bagga*, Laura McCarthy, S. Z. Rahman, K. Jhawar, S. Ahuja, N. Udupa, A. V. Singh and R. V. S. V.
         Vadlamudi (2003). Power Plants: Green Pharmacy, Indian Journal of Pharmaceutical Sciences, May-June
         2003.

    4.   S. K.  Bagga*, A. V. Singh, Vibhav Garg, Sulip Goswami and R. V. S. V. Vadlamudi. (2003) Computer Aided
         versus Wet Lab Drug Discovery, Indian Journal of Pharmaceutical Sciences, Jan-Feb 2003.

    5.   Singh AV, Knudsen KB and Knudsen TB (2005) Computational systems analysis of developmental toxicity:
         design, development and implementation of a birth defects systems manager (BDSM). Reprod. Toxicol.
         19:421-439.

    6.   Nemeth KA, Singh AV and Knudsen TB (2005) Searching for biomarkers  of developmental toxicity with
         microarrays:  normal eye morphogenesis in  rodent embryos. Toxicol Appl Pharmacol 206(2):219-28.

    7.   Knudsen KB, Singh AV and Knudsen TB (2005) Data input module for Birth Defects Systems Manager Reprod.
         Toxicol. 20(3):369-75.

    8.   Kinane DF, Shiba H, Stathopoulou PG, Zhao H, Lappin DF, Singh AV, Eskan MA, Beckers S, Weigel S, Alpert
         B and Knudsen TB (2006) Gingival epithelial cells heterozygous for Toll-like receptor 4 polymorphisms
         Asp299Gly and Thr399Ile are hypo-responsive to Porphyromonas gingivalis. Genes & Immunity
         Apr;7(3):190-200.

    9.   Maia L. Green, Amar V. Singh, Yihzi Zhang, Kimberly A. Nemeth, Kathleen K. Sulik, and Thomas B. Knudsen.
         (2007) Reprogramming of genetic networks During Initiation of the Fetal Alcohol Syndrome. Dev Dyn.
         Feb;236(2):613-31.

    10.  Amar V Singh, Kenneth B Knudsen and Thomas B Knudsen. (2007) Integrative Analysis of the mouse
         embryonic Transcriptome. Bioinformation, 1(10), 406-413.

    11.  Amar V Singh, Eric Rouchka, Greg Rempala, Caleb Bastian and Thomas B Knudsen. (2007) Integrative
         Database Management for Mouse Development: Systems and Concepts Review. Birth Defects Research (Part
         C) 81:1-19.

    12.  Deaciuc IV, Song Z, Peng X, Barve SS, Song M, He Q, Knudsen TB, Singh AV, and McClain CJ (2008)
         Genome-wide transcriptome expression in the liver of a mouse model of high carbohydrate diet-induced liver
         steatosis and its significance for the disease. Hepatol International 2(1):39-49 (March 2008).

    13.  Barthold JS, McCahan, Singh AV, Knudsen TB, Si X, Campion L and Akins RE (2008) Altered expression of
         muscle and cytoskeleton-related genes in a rat strain with inherited cryptorchidism. J. Androl.
         29:352-366.

    14.  Rouchka EC, Phatak AW, and Singh AV (2008) Effect of single nucleotide polymorphisms on Affymetrix®
         match-mismatch probe pairs. Bioinformation 2(9):405-11.

    15.  Benakanakere MR, Li Q, Eskan MA, Singh AV, Zhao J, Galicia JC,  Stathopoulou P, Knudsen TB, Kinane DF
         (2009) Modulation of TLR2 Protein Expression by miR-105 in Human Oral Keratinocytes. J Biol Chem.
         284(34):23107-15

-------

    16.   Knudsen TB, Martin MT, Kavlock RJ, Judson RS, Dix DJ, and Singh AV (2009). Profiling the activity of
          environmental chemicals in prenatal developmental toxicity studies using the U.S. EPA's ToxRefDB. Reprod
          Toxicol. 28(2):209-19
    17.   Thomas B. Knudsen, Keith A. Houck, Richard S. Judson, Amar V. Singh, Arthur Weissman, Holly M.
          Mortensen, David M. Reif, R. Woodrow Setzer, David J. Dix, and Robert J. Kavlock (2009) Biochemical
          Activities of 309 ToxCast™ Chemicals Evaluated Across 239 Functional Targets. (Submitted to Nature Chem.
          Biol)

    18.   Ema M, Ise R, Kato H, Oneda S, Hirose A, Hirata-Koizumi M, Nishida Y, Singh AV, Knudsen
          TB and Ihara T (2009) Fetal malformations and early embryonic gene expression response in
          cynomolgus monkeys maternally exposed to thalidomide. Reprod. Toxicol. (Under Revision)

    19.   Singh AV,  Yang C, Kavlock RJ, and Richard AM (2010) Developmental Toxicology Research Strategies:
          Computational Toxicology. Comprehensive Toxicology, 2nd Edition (editors: GP Daston and TB Knudsen),
          Elsevier: New York (Submitted)

-------
 Principal Investigator/Program Director (Last, First, Middle):
 Tan, Cecilia
                                BIOGRAPHICAL SKETCH

NAME
Tan, Cecilia

POSITION TITLE
Research Physical Scientist
EDUCATION

  INSTITUTION AND LOCATION                                       DEGREE   YEAR(s)   FIELD OF STUDY
  National Cheng Kung University, Tainan, Taiwan                 B.S.     1995      Environmental
  Harvard School of Public Health, Boston, MA                    M.S.     1997      Environmental Health
  University of North Carolina at Chapel Hill, Chapel Hill, NC   Ph.D.    2001      Environmental Sciences and Engineering
  North Carolina State University                                M.B.A.   2009      Business Administration

A. POSITIONS and HONORS

Research and Professional Experience:
2009-        Research Physical Scientist, National Exposure Research Laboratory, ORD,
             USEPA
2003-2009    Associate Director, Center for Human Health Assessment, CIIT Centers for
             Health Research, RTP, NC
2001-2003    Post-doctoral Fellow, CIIT Centers for Health Research, RTP, NC
1997-2001    NIOSH Trainee, The University of North Carolina at Chapel Hill, Chapel Hill, NC
1996-1997    Industrial Hygienist, Massachusetts Institute of Technology, Cambridge, MA
1993-1995    Research Assistant, National Cheng Kung University, Tainan, Taiwan


Selected Awards and Honors:
Risk Assessment Specialty Section Best Abstract Award, Society of Toxicology, New Orleans,
   LA, 2005
Risk Assessment Specialty Section Best Abstract Award, Society of Toxicology, Salt Lake City,
   UT, 2003
National Institute for Occupational Safety and Health pilot and research training grant, 2000
National Institute for Occupational Safety and Health traineeship award, 1998-2001
Department of Education Graduate Assistants in Areas of National Need (GAANN) Fellowship,
   1997-1998
Hazardous Substances Academic Training Program Fellowship,  1996-1997
1994 Chinese Institute of Engineers Student Paper Contest Best Paper Award, Taiwan, 1994
The 5th Academic Essay Contest Best Essay Award, National Cheng Kung University, Taiwan,
   1994

Invited Lectures/Symposia (selected):
Probabilistic reverse dosimetry: using pharmacokinetic modeling to estimate population-scale
   distributions of exposure from biomonitoring data. Society of Risk Analysis Annual Meeting,
   Boston, MA, December 2008.
Application of pharmacokinetic modeling to relate PFOA exposures and blood concentrations in
   human populations. ISEA/ISEE Annual Meeting, Pasadena, CA, October 2008.
Conducting reverse dosimetry with physiologically based pharmacokinetics models to estimate
   population exposure from data collected in populations-scaled biomonitoring studies.
   Society of Risk Analysis Annual Meeting, Baltimore, MD,  December 2006.
Application of pharmacokinetic modeling to estimate PFOA exposures associated with
   measured blood concentrations in human populations. Society of Risk Analysis Annual
   Meeting, Baltimore, MD, December 2006.

-------
Reconstructing human chloroform exposure from biomonitoring data with a physiologically
   based pharmacokinetic model. Society of Risk Analysis Annual Meeting, Orlando, FL,
   December 2005.
Computational modeling of chloroform cytolethality and regenerative proliferation for cancer risk
   assessment. Society of Risk Analysis Annual Meeting, Orlando, FL, December 2005.
The use of computational modeling in systems biology. NAASO Annual Scientific Meeting,
   Vancouver, BC, October 2005.
Use of biologically based computational modeling in mode of action-based risk assessment - an
   example of chloroform. U.S. EPA Workshop  on Optimizing the design and interpretation of
   epidemiologic studies to consider alternative disinfectants of drinking water, Raleigh, NC,
   June 2005.
Scientific challenges: biomonitoring & its relevance to toxicology. Human biomonitoring: an
   ICCA workshop for the global chemical industry. Charles de Gaulle, France, June 2005.
Modeling aggregate exposure and cumulative risk assessment of mixtures of common modes of
   action.  SOT Contemporary concepts in toxicology, Atlanta, GA, February 2005.
The use of physiologically based pharmacokinetic/pharmacodynamic modeling in quantitative
   safety assessment. Predictive ADME/Toxicology  Meeting. San Diego, CA, January 2005.
A physiologically based pharmacokinetic/pharmacodynamic model for N-methyl carbamate
   pesticide carbaryl. The 25th American College of Toxicology Annual Meeting, Palm Springs,
   CA, November 2004.
PBPK modeling to analyze human blood and exhaled breath for chloroform. Forum on
   disinfection by-products - exploring the current science on disinfection by-products,
   Research Triangle Park, NC, September 2004.


PBPK Modeling Courses:
A Course on Physiologically Based Pharmacokinetic (PBPK) Modeling in Drug Development
   and Evaluation. Westin Alexandria,  Alexandria, VA, 6-10 April, 2009.
2009 Society of Toxicology Annual Meeting Continuing Education Course - Characterizing
   variability and uncertainty with physiologically based pharmacokinetic models. Title:
   Variability in exposure and internal dosimetry assessed with  PBPK models. Baltimore, MD,
   15 March, 2009.
A Course on Physiologically Based Pharmacokinetic (PBPK) Modeling and Risk Assessment.
   The  Hamner Institutes for Health Research, Research Triangle Park, NC, 11-15 February,
   2008.
A Short Course on Interpretation of Biomonitoring Data Using Physiologically Based
   Pharmacokinetic (PBPK) Modeling. 2007 Joint ISEA/ISEE Annual Meeting, Durham, NC, 14
   October, 2007.
A Course on Interpretation of Biomonitoring Data Using Physiologically Based Pharmacokinetic
   (PBPK) Modeling. CIIT Centers for Health Research, Research Triangle Park, NC, 25-29
   September, 2006.
A Course on Physiologically Based Pharmacokinetic (PBPK) Modeling and Risk Assessment.
   CIIT Centers for Health Research, Research Triangle Park, NC, 6-10 February, 2006.
A Course on Physiologically Based Pharmacokinetic (PBPK) Modeling and Risk Assessment.
   CIIT Centers for Health Research, Research Triangle Park, NC, 26-30 September, 2005.

Conference session chair:
Interpreting human biomonitoring data in the context of risk assessment: issues and challenges
   2.  Society of Risk Analysis Annual  Meeting,  Baltimore, MD, December 2006.
Forum on disinfection by-products - exploring the current science on disinfection by-products.
   CIIT Centers for Health Research, Research Triangle Park, NC, September 2004.

-------
B. SELECTED PUBLICATIONS
Tan, Y., Clewell, H., Andersen, M. (2008) Time dependencies in perfluorooctylacids disposition
   in rat and monkeys: a kinetic analysis. Toxicol. Lett. 177, 38-47.
Clewell, H., Tan, Y., Campbell, J., Andersen, M. (2008) Quantitative interpretation of human
   biomonitoring data. Toxicol. Appl. Pharmacol. 231, 122-133.
Nong, A., Tan, Y., Krolski, M., Wang, J., Lunchick, C., Conolly, R., Clewell, H. (2008) Bayesian
   calibration of a physiologically based pharmacokinetic/pharmacodynamic model of carbaryl
   cholinesterase inhibition. J. Toxicol. Environ. Health A 71, 1363-1381.
Teeguarden, J., Bogdanffy, M., Covington, T., Tan, Y., Jarabek, A. (2008) A PBPK model for
   evaluating the impact of aldehyde dehydrogenase polymorphisms on comparative rat and
   human nasal tissue acetaldehyde dosimetry. Inhal. Toxicol. 20(4),375-90.
Dorman, D., Struve, M., Wong, B., Gross, E., Parkinson, C., Willson, G., Tan, Y., Campbell, J.,
   Teeguarden, J., Clewell, H., Andersen, M. (2008) Derivation of an inhalation reference
   concentration based upon olfactory neuronal loss in male rats following subchronic
   acetaldehyde inhalation. Inhal Toxicol. 20(3),245-56.
Schroeter, J., Kimbell, J., Gross, E., Wilson, G., Dorman, D., Tan, Y., Clewell, H. (2008)
   Application of physiological computational fluid dynamics models to predict interspecies
   nasal dosimetry of inhaled acrolein. Inhal Toxicol 20(3),227-43.
Das, K., Grey,  B., Zehr, R., Wood, C., Butenhoff, J., Chang, S., Ehresman, D., Tan, Y., Lau, C.
   (2008) Effects of perfluorobutyrate exposure during pregnancy in the mouse. Toxicol Sci
   105(1),173-81.
Chang, S., Das, K., Ehresman, D., Ellefson, M., Gorman, G., Hart, J., Noker, P., Tan, Y., Lieder,
   P., Lau, C., Olsen, G., Butenhoff, J. (2008). Comparative pharmacokinetics of
   perfluorobutyrate in rats, mice, monkeys, and humans and relevance to human exposure via
   drinking water. Toxicol. Sci. 104(1), 40-53.
Hays, S., Aylward, L., LaKind, J., Bartels, M., Barton, H., Boogaard, P., Brunk, C., DiZio, S.,
   Dourson, M., Goldstein, D., Lipscomb, J., Kilpatrick, M., Krewski, D., Krishnan,  K., Nordberg,
   M., Okino, M., Tan, Y., Viau, C.,  Yager, J.  (2008) Guidelines for the derivation of
   Biomonitoring Equivalents: report from the biomonitoring equivalents expert workshop.
   Regul. Toxicol. Pharmacol. 51 (S3), S4-S15.
LaKind, J., Aylward, L.,  Brunk, C., DiZio, S., Dourson,  M., Goldstein, D., Kilpatrick,  M., Krewski,
   D., Bartels, M., Barton,  H., Boogaard, P., Lipscomb, J.,  Krishnan, K., Nordberg, M., Okino,
   M., Tan, Y., Viau, C., Yager, J., Hays, S. (2008). Guidelines for the communication of
   Biomonitoring Equivalents: report from the Biomonitoring Equivalent Expert Workshop.
   Regul. Toxicol. Pharmacol. 51 (S3), S16-S26
Tan, Y., Liao, K., Clewell, H. (2007)  Reverse dosimetry: interpreting trihalomethanes
   biomonitoring data using physiologically based pharmacokinetic modeling. J. Expo Sci
   Environ Epidemiol 17,591-603.
Liao, K., Tan, Y., Clewell, H. (2007)  Development of a screening approach to interpret human
   biomonitoring data on volatile organic compounds: reverse dosimetry on biomonitoring data
   for trichloroethylene. Risk Anal. 27(5), 1223-1236.
Liao, K., Tan, Y., Conolly, R., Borghoff, S., Gargas, M., Andersen, M., Clewell, H. (2007)
   Bayesian estimation of pharmacokinetic and pharmacodynamic parameters in a mode-of-
   action based cancer risk assessment for chloroform. Risk Anal. 27(6), 1535-1551.
Barton, H., Chiu, W., Setzer, W., Andersen, M., Bailer, A., Bois, F., Dewoskin, R., Hays, S.,
   Johanson, G., Jones, N.,  Loizou, G., Macphail, R., Portier, C., Spendiff, M., Tan, Y. (2007)
   Characterizing uncertainty and variability in physiologically based pharmacokinetic models:
   state of the science and needs for research and implementation. Tox. Sci. 99, 395-402.
Tan, Y., Liao, K., Conolly, R., Blount, B., Mason, A., Clewell, H. (2006) Use of a physiologically
   based pharmacokinetic model to identify exposures consistent with  human biomonitoring
   data for chloroform. J. Toxicol. Environ.  Health 69, 1727-56.

-------

Andersen, M., Clewell, H., Tan, Y., Butenhoff, J., Olsen, G. (2006) Pharmacokinetic modeling of
   saturable, renal resorption of perfluoroalkylacids in monkeys - probing the determinants of
   long plasma half-lives. Toxicol. 227, 156-164.
Tan, Y., Butterworth,  B., Gargas, M., Conolly, R. (2003) Biologically motivated computational
   modeling of chloroform cytolethality and regenerative cellular proliferation. Tox. Sci. 75, 192-
   200.
Tan, Y., Flynn, M., Buller, T. (2002) Field evaluation of models for predicting worker exposure
   during spray painting. Ann. Occup.  Hyg. 46(1), 103-112.
Tan, Y. and  Flynn, M. (2002) Methods for estimating the transfer efficiency of a compressed air
   spray gun. Appl. Occup. Environ. Hyg. 17(1), 39-46.
Tan, Y. and  Flynn, M. (2000) Experimental evaluation of a mathematical model for predicting
   transfer efficiency of a high volume - low pressure air spray gun. Appl. Occup. Environ. Hyg.
   15(10), 785-793.
Tan, Y., DiBerardinis, L., Smith, T. (1999) Exposure assessment of laboratory students. Appl.
   Occup. Environ. Hyg. 14(8), 530-8.

Book Chapters
Qiang, Z., Tan, Y., Bhattacharya, S., Andersen, M. (2008) Computational systems biology
   modeling of dosimetry and cellular response pathways. Drug Efficacy, Safety, and Biologics
   Discovery - Emerging Technologies and Tools. Ed. Ekins and Xu. John Wiley & Sons, Inc.,
   Hoboken, NJ.
Tan, Y. and  Clewell, H. (2008) Probabilistic reverse dosimetry modeling for interpreting
   biomonitoring  data (submitted to Ed. Andersen and Krishnan).
Tan, Y., Yang, Y., Andersen, M., Clewell, H. (2008). Exposure science: Pharmacokinetic
   modeling (submitted to Ed. Meliker, J.).

-------
       Principal Investigator/Program Director (Last, First, Middle):   Tice, Raymond R.
                                     BIOGRAPHICAL SKETCH
      NAME
      Raymond R. Tice

      POSITION TITLE
      Chief, Biomolecular Screening Branch
      National Toxicology Program
      National Institute of Environmental Health Sciences

      EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral
      training.)

      INSTITUTION AND LOCATION     DEGREE    YEAR(s)   FIELD OF STUDY
      Johns Hopkins University     Ph.D.     1976      Biology

      Raymond R. Tice, Ph.D. is Chief of the NTP Biomolecular Screening Branch (BSB). The BSB is
      responsible for coordinating the NTP High Throughput Screening (HTS) Initiative and plays a
      key role in the efforts of the Tox21 Community, which is an outgrowth of a 2008 Memorandum
      of Understanding between the NTP, the NIH Chemical Genomics Center, and EPA's National
      Center for Computational Toxicology to collaborate on the research, development, validation,
      and translation of new and innovative test methods that characterize key steps in toxicity
      pathways.

      Tice has served as President of the  Environmental Mutagen Society and as Vice-President of
      the  International Association of Environmental Mutagen Societies. He is the recipient of NIH
      Director's Group Awards for activities associated with the NIH Molecular Libraries Initiative and
      with the development of the ICCVAM Five-Year Plan (2008-2012). In late 2008, along with
      Christopher Austin, Ph.D., of the NIH Chemical Genomics Center and Robert Kavlock, Ph.D., of
      EPA's National Center for Computational Toxicology, Tice received the North American
      Alternative Award from the Humane Society of the United States for "outstanding scientific
      contributions to the advancement of viable alternatives to animal testing."

      During his career,  he has served on over 50 international expert panels and committees that are
      primarily related to genetic toxicology and more recently to validation of alternative test methods.
      He has published 130 scientific papers and book chapters, edited four symposia proceedings
      and contributed to 23 electronic review publications, in support of the NTP chemical nomination
      process, and to 35  NICEATM-ICCVAM publications. Tice is a member of the editorial boards of
      Mutation Research and  Environmental and Molecular Mutagenesis.

      Tice received his Ph.D.  in  biology in 1976 from Johns Hopkins University in Baltimore,
      Maryland. He was employed by the  Medical Department at Brookhaven  National Laboratory,
      Upton, New York from 1976 to 1988, and by Integrated Laboratory Systems, Inc., Durham,
      North Carolina from 1988 to 2005, where his last position was as Senior Vice-President for
      Research and Development. He joined NIEHS in 2005 as the Deputy Director of the NTP
      Interagency Center for the Evaluation of Alternative Toxicological Methods (NICEATM) and was
      promoted to Chief of the Biomolecular Screening Branch in 2008.

      Selected Publications

      1. Parham F, Austin C, Southall N, Xia M, Tice R, Portier C. Dose-response modeling of high-
      throughput screening data. Submitted to Journal of Biomolecular Screening.

-------


2. Xia M, Huang R, Sun Y, Semenza GL, Aldred SF, Witt KL, Inglese J, Tice RR, Austin CP.
Identification of chemical compounds that induce HIF-1α activity. Toxicological Sciences 2009;
doi:10.1093/toxsci/kfp123.
3. Kavlock RJ, Austin CP, Tice RR. Toxicity Testing in the 21st Century: Implications for Human
Health Risk Assessment, Risk Analysis 2009; 29(4):485-487.
4. Witt KL, Livanos E, Kissling GE, Torous DK, Caspary W, Tice RR, Recio L.  Comparison of
flow cytometry- and microscopy-based methods for measuring micronucleated reticulocyte
frequencies in rodents treated with nongenotoxic and genotoxic chemicals. Mutation Res 2008;
649:101-113.
5. Xia M, Huang R, Witt KL, Southall N, Fostel J,  Cho M-H,  Jadhav A, Smith CS, Inglese J,
Portier CJ, Tice RR, Austin CP. Compound cytotoxicity profiling using quantitative high-
throughput screening. Environ Health Perspect 2007; 116:284-291.
6. Balls M, Amcoff P, Bremer S, Casati S, Coecke S, Clothier R, Combes R, Corvi R, Curren R,
Eskes C, Fentem J, Gribaldo L, Haider M,  Hartung T, Hoffmann S, Schectman L, Scott L,
Spielmann H, Stokes W, Tice R, Wagner D, Zuang V. The principles of weight of evidence
validation of test methods and testing strategies. The report and recommendations of ECVAM
workshop 58. Altern Lab Anim. 2006; 34(6):603-20.
7. Burlinson B, Tice RR, Speit G, Agurell E, Brendler-Schwaab SY, Collins AR, Escobar P,
Honma M, Kumaravel TS, Nakajima  M, Sasaki YF, Thybaud V, Uno Y, Vasquez M,  Hartmann
A. Fourth International Workgroup on Genotoxicity Testing: Results of the in vivo Comet Assay
Workgroup. Mutation Res 2006; 627:31-35.
8. Manson J, Brabec MJ, Buelke-Sam J, Carlson GP, Chapin RE, Favor JB, Fischer LJ, Hattis
D, Lees  PS, Perreault-Darney S, Rutledge J, Smith TJ, Tice RR, and Working P. NTP-CERHR
expert panel report on the reproductive and developmental  toxicity of acrylamide. Birth  Defects
Res B Dev Reprod Toxicol 2005; 74:17-113.
9. Witt KL, Tice RR, Wolfe G, Bishop JB. Genetic damage detected in CD-1 mouse  pups
exposed perinatally to 3'-azido-3'-deoxythymidine or dideoxyinosine via maternal dosing,
nursing,  and direct gavage. II: Effects of the individual agents compared to combination
treatment. Environ Mol  Mutagen. 2004; 44:321-328.
10.  Hartmann A, Agurell E, Beevers C, Brendler-Schwaab S, Burlinson B, Clay P, Collins A,
Smith A, Speit G, Thybaud V, Tice RR. Recommendations for conducting the  in vivo alkaline
Comet assay. Mutagenesis 2003; 18:45-51.
11.  Tice RR, Agurell E, Anderson D,  Burlinson B, Hartmann A, Kobayashi H, Miyamae  Y, Rojas
E, Ryu JC, Sasaki Y. The single cell gel/comet assay: guidelines for in vitro and in vivo  genetic
toxicology testing. Environ Mol Mutagen 2000; 35:206-221.

-------
        Principal Investigator/Program Director (Last, First, Middle): Wambaugh, John F.
                                      BIOGRAPHICAL SKETCH
  NAME
  John F. Wambaugh
                                    POSITION TITLE
                                    Physical Scientist
  EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION                       DEGREE            YEAR(s)      FIELD OF STUDY
                                               (if applicable)
University of Michigan, Ann Arbor, MI          B.S.              1995-1999    Physics
Georgia Institute of Technology, Atlanta, GA   M.S.              1999-2001    Physics
Duke University, Durham, NC                    M.S.              2001-2005    Computer Science
Duke University, Durham, NC                    Ph.D.             2001-2006    Physics
NCCT, US EPA, Research Triangle Park, NC                         2006-2008    Statistical Analysis of
                                                                              Biological Models
A. POSITIONS and HONORS
Research and Professional Experience:
2008-present   Physical Scientist, National Center for Computational Toxicology, US EPA, RTP, NC
2006-2008      Postdoctoral Physicist, National Center for Computational Toxicology, US EPA, RTP, NC
               (mentors: Hugh Barton and Woodrow Setzer)
2002-2006      Research Assistant, Department of Physics, Duke University, Durham, NC
               (advisor: Robert Behringer)
2001-2003      Teaching Assistant, Department of Physics, Duke University, Durham, NC
2001           Visiting Student, Center for Nonlinear Science, Los Alamos National Laboratory, Los Alamos,
               NM (advisors: Charles Reichhardt, Cynthia Olson-Reichhardt)
1999-2001      Teaching Assistant, School of Physics, Georgia Institute of Technology, Atlanta, GA
Professional Societies and Affiliations:
2007-present   Member, Sigma Xi, Duke University Chapter
2006-present   Member, Society of Toxicology: Biological Modeling and Risk Assessment Specialty Sections,
               North Carolina Society of Toxicology
2001-present   Member, American Physical Society: Division of Fluid Dynamics, Statistical and Nonlinear
               Physics Topical Group, Forum on Graduate Student Affairs
1995-present   Member, American Association of Physics Teachers
Honors and Awards:

2006   Dynamics Days Student Travel Award,
       Dynamics Days 2006 Conference in Baltimore, MD
2005   Duke University Graduate School Student Travel Award,
       American Physical Society, Division of Fluid Dynamics Meeting in Chicago, IL
2004   Duke University Graduate School Student Travel Award,
       American Physical Society, Division of Fluid Dynamics Meeting in Seattle, WA
2003   Duke University Graduate School Student Travel Award,
       American Physical Society, Division of Fluid Dynamics Meeting in East Rutherford, NJ
Updated 11/13/2007

-------

B. SELECTED PUBLICATIONS (selected from 14 total).
   1.  "Modeling Single and Repeated Dose Pharmacokinetics of PFOA in Mice," I. Lou, J.F. Wambaugh, C.
       Lau, R.H. Hanson, A.B. Lindstrom, M.J. Strynar, R.D. Zehr, and H.A. Barton, Toxicological Sciences
       107, 331-341 (2009)
   2.  "Comparing Models for PFOA Pharmacokinetics Using Bayesian Analysis," J.F. Wambaugh,  H.A.
       Barton, and R.W. Setzer, Journal of Pharmacokinetics and Pharmacodynamics 35, 683-712 (2008)
   3.  Wambaugh, J.F., Majmudar, T.S., Tighe, B.P., Socolar,  J.E.S. and Behringer, R.P., "Experimental
       observation of spatial scaling in dense granular matter," in preparation
   4.  Wambaugh, J.F., Hartley, R.R., and Behringer,  R.P., "Force networks and elasticity in granular silos,"
       arXiv: 0801.3387
   5.  Wambaugh, J.F., Matthews, J.V., Gremaud, P.A. and Behringer, R.P.,  "Response to perturbations in
       granular flow," Physical Review E 76, 051303 (2007)
   6.  Wambaugh, J.F., "Graph Percolation as an Analog to Granular Force Networks," cond-mat/0603314,
       (2006)
   7.  Mort, P., McKenzie, K., Wambaugh, J.F., and Behringer, R.P., "Granular flow through an Orifice -
       Effect of granule size and shape distributions," Proceedings of Fifth World Congress on Particle
       Technology (2006)
   8.  Wambaugh, J.F. and Behringer, R.P., "Asymmetry-induced circulation  in granular hopper flows,"
       Powders & Grains 2005, pages 915-918 (2005).
   9.  Wambaugh, J.F., Marchesoni, F. and Nori,  F., "Shear and Loading in Channels: Oscillatory Shearing
       and Edge Currents of Superconducting Vortices,"  Physical Review B 67, 144515 (2003)
   10. Wambaugh, J.F., Reichhardt, C., and Olson, C.J., "Ratchet-Induced Segregation and Transport of
       Non-Spherical Grains," Physical Review E 65, 031308 (2002)
   11. Wambaugh, J.F., Reichhardt, C., Olson, C.J., Marchesoni, F., and Nori, F., "Superconducting Fluxon
       Pumps and Lenses,"  Physical Review Letters 83, 5106 (1999)

C. PRESENTATIONS WITH ABSTRACTS
   1.  "Examining Models for the Pharmacokinetics of Perfluorooctanoic acids", Annual Meeting of the Society
       of Toxicology in Baltimore, Maryland,  March 2009 (talk)
   2.  "Enhancing the Modeling of PFOA Pharmacokinetics with Bayesian Analysis," Annual Meeting of the
       Society of Toxicology in Seattle, Washington, March 2008 (poster)
   3.  "Assessing Uncertainty in the Toxicology of PFOA," International  Science Forum on Computational
       Toxicology, Research Triangle Park, North  Carolina, May 2007 (poster)
   4.  "Bayesian Analysis of Parameters for Pharmacokinetic Models," Annual Meeting of the Society of
       Toxicology in Charlotte, North Carolina, March 2007 (poster)
   5.  "Joint Analysis of PFOA Plasma Concentration and Excretion Data," Perfluoroalkyl Acids and Related
       Chemistries, Society of Toxicology Workshop in Arlington, Virginia, February 2007 (poster)
   6.  "Spatial Distribution of Forces within Granular Materials," APS Division of Fluid Dynamics Annual
       Meeting in Tampa, Florida, November 2006 (talk)
   7.  "Impact of Particle Elasticity on Granular Force Networks," APS March Meeting in Baltimore, Maryland,
       March  2006 (talk)
   8.  "Square Amplitude Granular Waves," Dynamics Days 2006 Meeting in Bethesda, Maryland, January
       2006 (talk)
   9.  "Sensitivity of Granular Hopper Flows to Boundary Conditions," APS Division of Fluid Dynamics Annual
       Meeting in Chicago,  Illinois, November 2005 (talk)
   10. "Asymmetry-induced  circulation in granular hopper flows," Powders and Grains 2005, Stuttgart,
       Germany, June 2005 (poster)
   11. "Circulation in Asymmetric Granular Hoppers," APS Division of Fluid Dynamics Annual Meeting in
       Seattle, Washington,  November 2004 (talk)
   12. "Observed Deviations from Janssen Model  in Granular Silos," Dynamics Days 2004 Meeting in Chapel
       Hill, North Carolina, January 2004 (poster)
   13. "Elastic Effects in Granular Pressure Profiles," APS Division of Fluid Dynamics Annual Meeting in East
       Rutherford, New Jersey, November 2003 (talk)
   14. "Asymmetry-Induced Circulation in Conical  Granular Flows," APS Division of Fluid Dynamics Annual
       Meeting in East Rutherford, New Jersey, November 2003 (talk)
   15. "Ratchet-Induced Segregation and Transport of Non-Spherical Grains," APS March Meeting in
       Indianapolis, Indiana, March 2002 (talk)
   16. "A New System for Accessing Transfer Function Coefficients for an Architectural Computer-Aided
       Thermal Optimization Tool," Fifth International Building Performance Simulation Association Meeting,
       Prague, Czech Republic, September 1997 (poster)

D. INVITED PRESENTATIONS
   1.  "Biological Modeling for Toxicology at the US Environmental Protection Agency," Department of
       Physics and Chemistry, Coastal Carolina University, Conway, South Carolina, April 2009

E. WORKSHOPS
   1.  "PFAA Days II," Environmental Protection Agency, Research Triangle Park, North Carolina, June
       2008
   2.  "Perfluoroalkyl Acids Research Planning Workshop," Environmental Protection Agency, Research
       Triangle Park, North Carolina, August 2007
   3.  "Perfluoroalkyl Acids and Related Chemistries: Toxicokinetics and Mode-of-Action," Society of
       Toxicology, Arlington, Virginia, February 2007
   4.  "Uncertainty and Variability in Pharmacokinetic Models",  Environmental Protection Agency, Research
       Triangle Park, North Carolina, October 2006
   5.  "Multiscale Model Development and Control Design Workshop on Fluctuations and Continuum
       Equations for Granular Flow", Statistical and Applied Mathematical Sciences Institute, Research
       Triangle Park, North Carolina, April 2004

F. CONTINUING EDUCATION
   1.  "Characterizing Variability and Uncertainty with Physiologically-Based  Pharmacokinetic Models,"
       Society of Toxicology, Baltimore, Maryland, March 2009
   2.  "Characterizing Modes-of-Action and Their Relevance in Assessing Human Health Risks," Society of
       Toxicology, Baltimore, Maryland, March 2009
   3.  "Beginning and Intermediate Modeling and Simulation Techniques using acslX, with Applications to
       Computational Biology," Aegis Software, Research Triangle Park, North Carolina, May 2008
   4.  "Dose-Response Modeling for Occupational and Environmental Risk Assessment," Society of
       Toxicology, Seattle, Washington, March 2008
   5.  "Use of Data for Development of Uncertainty Factors in Non-Cancer Risk Assessment," Society of
       Toxicology, Seattle, Washington, March 2008
   6.  "Physiologically Based Pharmacokinetic Modeling for Risk Assessment Applications," Society of
       Toxicology, Charlotte, North Carolina, March 2007
   7.  "Fundamentals of Human Health Risk Assessment with a Case Study  Approach," Society of
       Toxicology, Charlotte, North Carolina, March 2007
   8.  "Interpretation of Biomonitoring Data Using Physiologically Based Pharmacokinetic Modeling," CIIT,
       Center for Human Health Assessment, Research Triangle Park,  North Carolina, September 2006
Updated 11/25/2008

-------
        Principal Investigator/Program Director (Last, First, Middle): Welsh, William James
                                     BIOGRAPHICAL SKETCH
          Provide the following information for the key personnel and other significant contributors in the order listed on Form Page 2.
                          Follow this format for each person.  DO NOT EXCEED FOUR PAGES.
  NAME
  William J. Welsh
  eRA COMMONS USER NAME
  Welsh04
   POSITION TITLE
   Norman H. Edelman Professor in Bioinformatics &
   Molecular Design, Department of Pharmacology
  EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION                    DEGREE            YEAR(s)    FIELD OF STUDY
                                            (if applicable)
St. Joseph's University, Phila, PA          B.S.              1969       Chemistry
University of Pennsylvania, Phila, PA       Ph.D.             1975       Physical Chemistry
University of Cincinnati, Cincinnati, OH    Postdoc           1979-82    Comput Biophys Chem
A. Positions and Honors

Positions and Employment
1975-79      Research Chemist, Procter & Gamble Co. (Cinti., OH)
1979-86      Professor of Chemistry, College of Mount St. Joseph
     Research Associate Professor of Chemistry, University of Cincinnati
1986-90      Assistant Professor of Chemistry, Univ. of Missouri-St. Louis (UM-St. Louis)
1990-95      Associate Professor, UM-St. Louis
1995-2001    Professor, UM-St. Louis
2001-        Professor, Dept. of Pharmacology, UMDNJ-RWJMS
2001-        Director, UMDNJ Informatics Institute
2003-        Norman H. Edelman Chaired Professorship, UMDNJ-RWJMS
2005-        Director, UMDNJ Environmental Bioinformatics & Computational Toxicology Center

Other Experience and Professional Memberships
1985         Extramural Research Associate, National Institutes of Health, Bethesda, MD
1991-1999   Member, Editorial Board, Journal of Computational and Theoretical Polymer Science
1997-2001    Associate Director, UM-St. Louis Ctr. for  Molecular Electronics
1999-2001    Associate Director, University of Missouri Bioinformatics Center
1999-2001    Director, UM-St. Louis Center for Cheminformatics
2000-2003   Member, Editorial Advisory Bd, Journal of Chemical Information and Computer Sciences
2001-        Member, Cancer Institute of New Jersey
2002-        Member of Graduate Faculty, UMDNJ-Rutgers U.  Environ. & Occup. Health Sci. Inst. (EOHSI)
2002-        Member, New Jersey Center for Biomaterials
2003-        Member of Graduate Faculty, Medicinal Chemistry, School of Pharmacy, Rutgers Univ.
2003-        Editorial Board, Journal of Molecular Graphics & Modelling
2003-        Editorial Board, Chemical Research in Toxicology  Journal
2005-        Editorial Board, Cancer Informatics Journal

Honors
1985         Teacher of the Year Award, College of Mount St. Joseph, Ohio
1998         St. Louis Award, St. Louis Section of the  American Chemical Society
2001         Entrepreneur  of the Year Award, University of Missouri
2001         Chancellor's Award for Research and Creativity, UM-St. Louis
2003         Norman H. Edelman Endowed  Professorship in Bioinformatics, UMDNJ-RWJMS
2004         John C. Krantz, Jr. Lectureship Award, University of the Sciences in Philadelphia (USP).
2009         Honorary Society Plenary Lectureship, Georgia State University, Atlanta, GA

B. Selected peer-reviewed publications in  chronological order (from over 400 publications)
Tumor-targeted bioconjugate based delivery of camptothecin: design, synthesis and in vitro evaluation,
Paranjpe PV, Chen Y, Kholodovych V, Welsh WJ, Stein S, Sinko PJ. J Control Release 100, 275-92 (2004).

-------

Structural model of the Plasmodium CDK, Pfmrk, a novel target for malaria therapeutics. Peng Y, Keenan SM,
Welsh WJ. J Mol Graph Model. 24(1):72-80 (2005).
Rational inhibitor design and iterative screening for identification of plasmodial cyclin dependent kinase
inhibitors, Keenan SM, Geyer JA, Welsh WJ, Prigge ST, Waters NC. Comb Chem High Throughput Screen
8(1):27-38 (2005).
Discovery of novel triazole-based opioid receptor antagonists. Zhang Q, Keenan SM, Peng Y, Nair AC, Yu SJ,
Howells RD, Welsh WJ. J Med Chem. 49(14):4044-7 (2006).
Discovery of broad spectrum protein kinase inhibitors to probe the malarial cyclin dependent protein kinase
Pfmrk. Woodard CL,  Keenan SM, Gerena L, Welsh WJ, Geyer JA, Waters NC. Bioorg Med Chem Lett.
17(17):4961-6 (2007).
Highly Potent Triazole-Based Tubulin Polymerization Inhibitors. Zhang Q, Peng Y, Wang XI,  Keenan SM,
Arora S, Welsh WJ. J. Med. Chem. 50, 749-754 (2007).
Shape signatures: new descriptors for predicting cardiotoxicity in silico. Chekmarev DS, Kholodovych V,
Balakin KV, Ivanenkov Y, Ekins S, Welsh WJ. Chem Res Toxicol. 21(6):1304-14 (2008).
New predictive models for blood-brain barrier permeability of drug-like molecules. Kortagere S, Chekmarev D,
Welsh WJ, Ekins S. Pharm Res. 25(8):1836-45 (2008).
Novel microtubule polymerization inhibitor with potent antiproliferative and antitumor activity.  Arora S, Wang XI,
Keenan SM, Andaya C, Zhang Q,  Peng Y, Welsh WJ. Cancer Research. 69(5): 1910-5 (2009).
Specific interactions between the viral coreceptor CXCR4 and the biguanide-based compound NB325 mediate
inhibition of human immunodeficiency virus type 1 infection. Thakkar N,  Pirrone V, Passic S,  Zhu W,
Kholodovych V, Welsh WJ, Rando RF, Labib ME, Wigdahl B, Krebs FC. Antimicrob Agents Chemother.
53(2):631-8 (2009).
Evaluations of the trans-sulfuration pathway in multiple liver toxicity studies. Schnackenberg  LK, Chen M, Sun
J, Holland RD, Dragan Y, Tong W, Welsh W, Beger RD. Toxicol Appl Pharmacol. 235(1):25-32 (2009).
Hybrid scoring and classification approaches to predict human pregnane X receptor activators. Kortagere S,
Chekmarev D, Welsh WJ, Ekins S. Pharm  Res. 26(4):1001-11  (2009).
The major human pregnane X receptor (PXR) splice variant, PXR.2, exhibits significantly diminished ligand-
activated transcriptional regulation. Lin YS, Yasuda K, Assem M, Cline C, Barber J, Li CW, Kholodovych V, Ai
N, Chen JD, Welsh WJ, Ekins S, Schuetz EG. Drug Metab Dispos. 37(6): 1295-304 (2009).
ebTrack: an environmental bioinformatics system built upon ArrayTrack. Chen M, Martin J, Fang H, Isukapalli
S, Georgopoulos  PG, Welsh WJ, Tong W.  BMC Proc. 3 Suppl 2:S5 (2009).
Structure-activity relations of nanolipoblockers with the atherogenic domain of human macrophage scavenger
receptor A. Plourde NM, Kortagere S, Welsh W, Moghe PV. Biomacromolecules. 10(6):1381-91 (2009).
Understanding nuclear receptors using computational methods. Ai N, Krasowski MD, Welsh WJ, Ekins S. Drug
Discov Today 14(9-10):486-94 (2009).
NetCSSP: web application for predicting chameleon sequences and amyloid fibril formation.  Kim C, Choi J,
Lee SJ, Welsh WJ, Yoon S. Nucleic Acids Res. 37, W469-73 (2009).
Predicting inhibitors of acetylcholinesterase by regression and classification machine learning approaches with
combinations of molecular descriptors. Chekmarev D, Kholodovych V, Kortagere S, Welsh WJ, Ekins S. Pharm
Res. 26(9):2216-24 (2009).
Application of Screening Methods, Shape Signatures and Engineered Biosensors in Early Drug  Discovery
Process. Hartman I, Gillies AR, Arora S, Andaya C, Royapet N, Welsh WJ, Wood DW, Zauhar RJ. Pharm Res.
Jul 22 (2009). [Epub ahead of print]
Novel delta opioid receptor agonists exhibit differential stimulation of signaling pathways. Peng Y, Zhang Q,
Arora S, Keenan SM, Kortagere S, Wannemacher KM, Howells RD, Welsh WJ. Bioorg Med Chem. Jul 9
(2009). [Epub ahead of print]

-------
                                      BIOGRAPHICAL SKETCH
NAME
Fred A. Wright
eRA COMMONS USER NAME
Fred_Wright
POSITION TITLE
Professor
   EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, and include postdoctoral training.)
INSTITUTION AND LOCATION       DEGREE            YEAR(s)    FIELD OF STUDY
                               (if applicable)
SUNY at Buffalo, New York      B.A.              1989       Psychology and Statistics
The University of Chicago      PhD               1994       Statistics
 Positions and Employment
1994-1997         Assistant Adjunct Professor, Family & Preventive Medicine, University of California, San
                  Diego.
1997-Jan 2002     Assistant Professor, Division of Human Cancer Genetics, The Ohio State University,
                  Columbus, OH.
Feb 2002-2008     Associate Professor, Department of Biostatistics, University of North Carolina, Chapel Hill.
Jul 2008-present  Professor, Department of Biostatistics, University of North Carolina, Chapel Hill.
Honors and Awards
    2004  Elected to Delta Omega Public Health Honor Society
          Phi Beta Kappa
Other Experience and Professional Memberships
Professional memberships in American Association for the Advancement of Science, American Society of
Human Genetics, American Statistical Association.
Selected peer-reviewed publications (in chronological order)
Becker, LB, Han B, Meyer P, Wright FA, Rhodes K, Smith D, Barrett J: Racial differences in the incidence of
   cardiac arrest and subsequent survival. New England Journal of Medicine, 329: 600-606, 1993.
Kong A and Wright F: Asymptotic theory for gene mapping. Proceedings of the National Academy of
   Sciences, USA, 91: 9705-9709, 1994.
Takiyyuddin MA, Parmer RJ, Kailasam MT, Cervenka JH, Kennedy B, Ziegler M, Lin MC, Li J, Grim CE,
   Wright FA, O'Connor DT: Chromogranin A in human hypertension: influence of heredity. Hypertension, 26:
   213-220, 1995.
Winqvist R, Hampton G, Mannermaa A, Blanco G, Alavaikko M, Kiviniemi H, Taskinen PJ, Evans G, Wright FA,
   Newsham I, Cavenee W: Loss of heterozygosity for chromosome 11 in primary human breast tumors is
   associated with poor survival after metastasis. Cancer Research, 55: 2660-2664, 1995.
Paulson TG, Wright FA, Parker BA,  Russak V, Wahl GM: Microsatellite instability correlates with  reduced
   survival and poor disease prognosis in breast cancer. Cancer Research, 56:4021-4026, 1996.
Wright FA: The phenotypic difference discards sibpair QTL linkage information. American Journal of Human
   Genetics, 60: 740-742, 1997.
Wright FA and Kong A: Linkage mapping in experimental crosses: the robustness of single-gene models.
   Genetics, 146: 417-425, 1997.
Rock CL, Flatt S, Wright FA, Faerber S, Newman V, Kealey S, Pierce JP: Responsiveness of carotenoids to a
   high-vegetable diet intervention to prevent breast cancer recurrence. Cancer Epidemiology, Biomarkers,
   and Prevention, 6: 617-623, 1997.
Pierce JP, Faerber S, Wright FA,  Rock CL, Newman V, Flatt S,  Kealey S,  Hryniuk W: Feasibility of a
   randomized trial of a high-vegetable diet to prevent breast cancer. Nutrition and Cancer, 28:282-288, 1997.
Rock CL, Newman V, Flatt SW, Faerber SF, Wright FA, Pierce JP: Nutrient intakes from foods and
   dietary supplements in women at risk for breast cancer recurrence. Nutrition and Cancer, 29:122-139,
   1997.
Dao TT, Kailasam MT, Parmer  RJ, Le HV, LeVerge RL,  Kennedy BP, Ziegler MG,  Insel PA, Wright FA,
   O'Connor DT:  Expression of altered alpha-2 adrenergic phenotypic traits in normotensive humans at
   genetic risk of hereditary (essential) hypertension. Journal of Hypertension, 16: 779-792, 1998.
Newman V, Rock CL, Faerber S,  Flatt SW, Wright FA, Pierce JP: Dietary supplement use by women at risk
   for breast cancer recurrence. The Women's Healthy  Eating and  Living Study Group.  Journal of the
   American Dietetic Association, 98: 285-292, 1998.




-------
Rayburn K, Martinez R, Escobedo M, Wright FA, Farias M: Glycemic effects of various species of nopal
   (opuntia sp.) in type 2 diabetes mellitus. Texas Journal of Rural Health, 16: 68-76, 1998.
Hryniuk W, Frei E, Wright FA: A single scale for comparing dose-intensity of all chemotherapy regimens in
   breast cancer: Summation dose-intensity. Journal of Clinical Oncology, 16: 3137-3147, 1998.
De La Chapelle A, Wright FA: Linkage disequilibrium mapping in isolated populations: The example of Finland
   revisited. Proceedings of the National Academy of Sciences, USA, 95: 12416-12423, 1998.
Sadler GR, Thomas AG, Dhanjal SK, Gebrekristos B, Wright FA: Breast cancer screening adherence in
   African-American women - Black cosmetologists promoting health. Cancer, 83:1836-1839, 1998.
Fierer J, Walls L, Wright F, Kirkland TN: Genes influencing resistance to Coccidioides immitis and the interleukin-10
   response map to chromosomes 4 and 6 in mice. Infect Immun, 67:2916-2919,1999.
Wright, FA, O'Connor, DT, Yoneda, LU, Kutey, G, Roberts, E, Berry, C, Weber, JL, Timberlake, D, Schlager, G:
   Genome scan for blood pressure loci in mice. Hypertension, 34:625-630,1999.
Lin S, Irwin ME, Wright FA: A multiple locus analysis of the COGA data set. Genetic Epidemiology 17 (Suppl
   1): S229-234, 1999.
O'Connor DT, Takiyyuddin  MA, Printz MP, Dinh TO, Barbosa JA, Rozansky DJ, Mahata SK, Wu H, Kennedy
   BP, Ziegler MG, Wright FA, Schlager G, Parmer RJ: Catecholamine storage vesicle protein expression in
   genetic hypertension. Blood Pressure 8: 285-295, 1999.
Costello JF, Fruhwald MC, Smiraglia DJ, Rush LJ, Robertson GP, Gao X, Wright FA, Feramisco JD,
   Peltomaki P, Lang JC, Schuller DE, Yu L, Bloomfield CD, Caligiuri MA, Yates A, Nishikawa R, Huang H-J
   S, Petrelli NJ, Zhang X, O'Dorisio MS, Held WA, Cavenee WK, Plass C: Aberrant CpG island methylation
   has non-random and tumor type-specific patterns. Nature Genetics 24:132-138, 2000.
Hoffman HM, Wright FA, Broide DH, Wanderer AA, Kolodner RD: Identification of a locus on chromosome
   1q44 for Familial Cold Urticaria. American Journal of Human Genetics 66:1693-1698, 2000.
Borrego S, Ruiz A, Saez ME, Gimm O, Gao X, Lopez-Alonso M, Wright FA, Antinolo G, Eng C: RET
   genotypes comprising specific haplotypes of polymorphic variants predispose to isolated Hirschsprung
   disease.  Journal of Medical Genetics 37: 572-578, 2000.
Desai DC, Lockman JC, Chadwick RB, Gao X, Percesepe A, Evans GR, Miyaki M, Yuen ST, Radice P,  Maher
   ER, Wright FA, de la Chapelle A: Recurrent germline mutation in MSH2 arises frequently de novo.
   Journal of Medical Genetics 37: 646-652,  2000.
Lin S, Cheng R, Wright FA: Genetic crossover interference in the human genome. Annals of Human Genetics,
   65:79-93,2001.
Virtaneva KI, Wright FA, Tanner SM, Yuan B, Lemon WJ, Caligiuri MA, Bloomfield CD, de la Chapelle A,
   Krahe R: Gene expression profiling reveals fundamental biological differences in AML with isolated trisomy
   8 and normal cytogenetics. Proceedings of the National Academy of Sciences USA, 98:1124-1129, 2001.
Huang J,  Kuismanen SA, Liu T, Chadwick RB, Johnson CK, Stevens MW, Richards SK, Meek JE, Gao X,
   Wright FA, Mecklin JP, Jarvinen HJ, Gronberg  H, Bisgaard ML, Lindblom A, Peltomaki P: MSH6 and
   MSH3 are rarely involved in genetic predisposition to non-polypotic colon cancer. Cancer Research, 61:
   1619-1623,2001.
Zhuo D, Zhao WD, Wright FA, Yuan H-Y, Wang J-P, Sears R, Baer T, Kwon D-H, Gordon D, Gibbs S, Dai D, Yang Q,
   Spitzner J, Krahe R, Stredney D, Stutz A, Yuan B: Assembly, annotation and integration of UniGene
   clusters into the human genome  draft. Genome Research, 11: 904-918, 2001.
Wang D, Cheng  R, Gao X,  Lin S, Wright FA: Transformation of sibpair values for the Haseman-Elston
   method.  American Journal of Human Genetics 68:1238-1249, 2001.
Rush LJ, Dai Z, Smiraglia DJ, Gao X, Wright FA, Fruhwald M, Costello JF, Held WA, Yu L, Krahe R, Kolitz JE,
   Bloomfield CD, Caligiuri MA,  Plass C: Novel methylation targets in de novo acute myeloid leukemia with
   prevalence of chromosome 11 loci. Blood, 97:3226-33, 2001.
Wright FA, Lemon WJ, Zhao WD, Sears R, Zhuo D, Wang J-P, Yang H-Y, Baer T, Stredney D, Spitzner J,
   Stutz A, Krahe R, Yuan B: A draft annotation and overview of the human genome. Genome Biology, 2:
   research0025.1-0025.18, 2001.
Fruhwald MC, O'Dorisio SM, Smith L, Dai Z, Wright FA, Paulus W, Jurgens H,  Plass C: Hypermethylation as a
   potential prognostic factor and a  clue to a better understanding of the molecular pathogenesis of
   medulloblastoma - results of a genomewide methylation scan. Klinische Padiatrie, 213: 1-7, 2001.
Smiraglia DJ, Rush LJ, Fruhwald MC, Dai Z, Held, WA, Costello JF, Lang JC, Eng C, Li B, Wright FA, Caligiuri
   MA, Plass C: Excessive CpG island hypermethylation in cancer cell lines versus primary human
   malignancies.  Human Molecular Genetics, 10: 1413-1419, 2001.
Fruhwald MC, O'Dorisio SM, Dai Z, Tanner SM, Balster DA, Gao X, Wright  FA, Plass C: Aberrant promoter
   methylation of novel rather than known methylation targets is a common abnormality in medulloblastomas
   - Implications for tumor biology and potential clinical utility. Oncogene 20: 5033-5042, 2001.
Dai Z, Lakshmanan RR, Zhu W-G, Smiraglia DJ, Rush LJ, Fruhwald MC, Brena RM, Li B, Wright FA, Ross P,
   Otterson GA, Plass C: Global Methylation Profiling of Lung Cancer Identifies Novel Methylated Genes.
   Neoplasia, 3: 314-323, 2001.


-------
Wu L, Saavedra HI, Timmers C, Sang L, Nuckolls F, Nevins JR, Wright FA, Robinson ML, Leone G: The E2F1,
    E2F2, and E2F3 transcription activators are essential for cellular proliferation. Nature, 414: 457-62, 2001.
Huang Y, Prasad M, Lemon WJ, Hampel H, Wright FA, Kornacker K, LiVolsi V, Frankel W, Kloos RT, Eng C,
    Pellegata N, de la Chapelle A: Gene expression in papillary thyroid carcinoma reveals highly consistent
    profiles. Proc Natl Acad Sci USA, 98:15044-9, 2001.
Lemon WJ, Palatini JJ, Krahe R, Wright FA.  Theoretical and experimental comparisons of gene expression
    indexes for oligonucleotide arrays. Bioinformatics. 11:1470-1476, 2002.
Pierce JP,  Faerber S, Wright FA, Rock CL, Newman  V, Flatt SW, Kealey S, Jones V, Caan BJ, Gold EB, Haan
    M, Hollenbach KA, Jones L, Marshall JR,  Ritenbaugh C, Stefanick ML, Thomson C, Wasserman L,
    Natarajan L, Gilpin E: A randomized trial of the effect of a plant-based dietary pattern on additional breast
    cancer events and survival: The Women's Healthy Eating and Living (WHEL) Study. Control Clin Trials.
    23:728-756, 2002
Yoon H, Liyanarachchi S, Wright FA, Davuluri R, Lockman JC, de la Chapelle A, Pellegata NS:  Gene
    expression  profiling of isogenic cells with different TP53 gene dosage reveals numerous genes that are
    affected by TP53 dosage and  identifies CSPG2 as a direct target of p53. Proc Natl Acad Sci USA.
    99:15632-15637,2002
Borrego S, Wright FA,  Fernandez RM, Williams N, Lopez-Alonso M, Davuluri R, Antinolo G, Eng C:  A
    founding locus within the RET proto-oncogene may account for a large proportion of apparently sporadic
    Hirschsprung disease and a subset of cases of sporadic medullary thyroid carcinoma. Am J Hum Genet.
    72:88-100,2003.
Tanner SM, Aminoff M, Wright FA, Liyanarachchi S, Kuronen  M, Saarinen A, Massika O, Mandel H, Broch H,
    de la Chapelle A: Amnionless, essential for mouse gastrulation, is mutated in recessive hereditary
    megaloblastic anemia.  Nat Genet 33:426-429, 2003.
Cheng R, Ma JZ, Wright FA, Lin S, Gao X, Wang D, Elston RC, Li MD: Nonparametric disequilibrium mapping
    of functional sites using haplotypes of multiple tightly linked single-nucleotide polymorphism (SNP)
    markers. Genetics, 164:1175-1187, 2003.
Wright FA: Information perspectives of Haseman-Elston regression. Hum Hered, 55:132-142. 2003.
Miller BJ, Wang D, Krahe R, Wright FA: Pooled analysis of loss of heterozygosity in breast cancer: a genome
    scan provides comparative evidence for multiple tumor suppressors and identifies novel candidate regions.
    Am J Hum Genet, 73:748-767, 2003.
Bachinski LL, Udd B, Meola G, Sansone V, Bassez G, Eymard B, Thornton CA, Moxley RT, Harper PS,
    Rogers MT, Jurkat-Rott K, Lehmann-Horn F, Wieser T, Gamez J, Navarro C, Bottani A,  Kohler A, Shriver
    MD,  Sallinen R, Wessman M, Zhang S, Wright FA,  Krahe  R: Confirmation of the type 2 myotonic
    dystrophy (CCTG)n expansion mutation in patients with proximal myotonic myopathy/proximal myotonic
    dystrophy of different European origins: a single shared haplotype indicates an ancestral founder effect.
    Am J Hum Genet, 73:835-48, 2003.
Wang D, Lauria M, Yuan B, Wright FA: Mega Weaver: A Simple Iterative Approach for BAC Consensus
    Assembly. In Proc. Second Asia-Pacific Bioinformatics Conference (APBC2004), Dunedin, New Zealand.
    CRPIT, 29. Chen, Y.-P. P., Ed. ACS, 2004.
Hu J, Yin G, Morris JS, Zhang L, Wright FA (2004). Entropy and survival-based weights to combine Affymetrix
    array types in the analysis of differential expression and survival. Methods of Microarray Data Analysis
    IV, Critical Assessment of Microarray Data Analysis  (CAMDA),  eds. J.S. Shoemaker and S.M. Lin, 95-108,
    2004.
Graham MR, Virtaneva K, Porcella SF, Barry  WT, Gowen BB, Johnson CR, Wright FA,  Musser JM: Group A
    Streptococcus transcriptome dynamics during growth in human blood reveals bacterial adaptive and
    survival strategies. Amer J Path, 166: 455-465, 2005.
Barry WT,  Nobel AB, Wright FA: Significance analysis of functional categories in gene expression studies: a
    structured permutation  approach. Bioinformatics, 21:1943-1949, 2005.
Hu J, Zou F, Wright FA: Practical FDR-based sample size calculations in microarray experiments.
    Bioinformatics, 21:3264-3272, 2005.
Drumm ML, Konstan MW, Schluchter MD, Handler A, Pace R,  Zou F, Zariwala M, Fargo D,  Xu A, Dunn JM,
    Darrah RJ, Dorfman R, Sandford AJ, Corey M, Zielenski J, Durie P, Goddard K, Yankaskas JR, Wright
    FA, Knowles MR; Gene Modifier Study Group. Genetic modifiers of lung disease in cystic fibrosis. N Engl J
    Med. 353:1443-1453, 2005.
Hu J, Wright FA, and Zou  F: Estimation of Expression Indexes for Oligonucleotide Arrays Using the Singular
    Value Decomposition. Journal of the American Statistical Association, 101:41-50, 2006.
Graham MR, Virtaneva K, Porcella SF, Gardner DJ, Long RD,  Welty DM, Barry WT, Johnson CA, Parkins LD,
    Wright FA, Musser JM. Analysis of the transcriptome of group A Streptococcus in mouse soft tissue
    infection. Am J Pathol.  169:927-42, 2006.


-------
Nadler JJ, Zou F, Huang H, Moy SS, Lauder JM, Crawley JN, Threadgill DW, Wright FA, Magnuson TR. Large
   scale gene expression differences across brain regions and inbred strains correlates with a behavioral
   phenotype. Genetics, 174:1229-1236, 2006.
Sterrett A, Wright FA: Inferring the Location of Tumor Suppressor Genes by Modeling Frequency of Allelic
   Loss. Biometrics, 63:33-40, 2007.
Hu J, Wright FA: Assessing differential gene expression with small sample sizes in oligonucleotide arrays
   using a mean-variance model.  Biometrics, 63:41-9, 2007.
Wright FA, Huang H, Guan X, Gamiel K, Jeffries C, Barry WT, Pardo-Manuel F, Sullivan PF, Wilhelmsen KC,
   Zou F: Simulating association studies: a data-based resampling method for candidate regions or whole
   genome scans. Bioinformatics, 23:2581-2588, 2007.
Huang H, Zou F, Wright FA: Bayesian analysis of frequency of allelic loss data. Journal of the American
   Statistical Association, 102(480): p. 1245-1253, 2007.
Barry WT, Nobel AB, Wright FA: A statistical framework for testing functional categories in microarray data.
   Annals of Applied Statistics, 2(1): 286-315, 2008.
Lee S, Sullivan PF, Zou F, Wright FA: Comment on a Simple and Improved Correction for Population
   Stratification.  American Journal of Human Genetics, 82(2): 524-526, 2008
Ghosh A, Zou F, Wright FA: Estimating  Odds Ratios in Genome Scans: An Approximate Conditional
   Likelihood Approach. American Journal of Human Genetics, 82(5): p. 1064-74, 2008
Harrill JA, Li Z, Wright FA, Radio NM, Mundy WR, Tornero-Velez R, Crofton KM.  Transcriptional response of
   rat frontal cortex following acute in vivo exposure to the pyrethroid insecticides permethrin and
   deltamethrin. BMC Genomics, 9(1):546, 2008.
Gatti DM, Shabalin AA, Lam TC, Wright FA, Rusyn I, Nobel AB. FastMap: Fast eQTL mapping in homozygous
   populations. Bioinformatics, 25(4): 482-489, 2009.
Sullivan  PF, Lin D, Tzeng JY, van den Oord E, Perkins D, Stroup TS, Wagner M, Lee S, Wright FA, Zou F, Liu
   W, Downing AM,  Lieberman J, Close SL. Genomewide association for schizophrenia in the CATIE study:
   results of stage 1. Molecular Psychiatry,  13(6):570-584, 2008.
Zou F, Nie L, Wright FA,  Sen PK: A robust QTL mapping procedure. Journal of Statistical Planning and
   Inference, 139(3): 978-989, 2009
Gatti DM, Sypa M, Rusyn I, Wright  FA, Barry WT. SAFEGUI: resampling-based tests of categorical
   significance in gene expression data made easy. Bioinformatics, 25(4): 541-542, 2009
Li Z, Wright FA, Royland J.  Age-dependent variability in gene expression in male Fischer 344 rat retina.
   Toxicol Sci. 107(1):281-92, 2009.
Zhu H, Ye L, Richard A, Golbraikh A, Wright FA, Rusyn I, Tropsha A. A novel two-step hierarchical
   quantitative structure-activity relationship modeling work flow for predicting acute toxicity of chemicals in
   rodents. Environ Health Perspect, 117(8): 1257-64, 2009.
Gatti DM, Harrill  AH, Wright  FA,  Threadgill DW, Rusyn  I.  Replication and narrowing of gene expression
   quantitative trait loci using inbred mice. Mamm Genome. 2009 Jul 17. [Epub ahead of print]
Blackman SM, Hsu  S,  Ritter  SE, Naughton KM,  Wright FA, Drumm ML, Knowles MR,  Cutting  GR.  A
   susceptibility gene for type 2 diabetes  confers substantial risk  for diabetes complicating cystic fibrosis.
   Diabetologia, 52(9):1858-65, 2009 September.
Sun W, Wright FA, Tang Z, Nordgard SH, Loo PV, Yu T, Kristensen VN, Perou  CM. Integrated study  of copy
   number states and genotype calls using high-density SNP arrays. Nucleic Acids Res., 2009 Jul 6. [Epub
   ahead of print]
Byrnes A, Jacks A, Dahlman-Wright K,  Evengard B, Wright FA, Pedersen NL, Sullivan PF. Gene expression
   in  peripheral  blood leukocytes in monozygotic twins discordant  for chronic fatigue:  no evidence of a
   biomarker. PLoS One, 4(6):e5805, 2009.
Taylor-Cousar JL, Zariwala MA, Burch LH, Pace RG, Drumm ML, Galloway H, Fan H, Weston BW, Wright FA,
   Knowles  MR; Gene Modifier Study Group.  Histo-blood group gene polymorphisms as potential  genetic
   modifiers of infection and cystic fibrosis lung disease severity. PLoS One, 4(1):e4270, 2009.


-------
   Scientific Leadership Roles
Name: Elaine Cohen Hubal

Activity Type | Organization | Dates of Service
Co-Chair | International Council of Chemical Associations Long Range Research Initiative (ICCA-LRI) workshop: Connecting Innovations in Biological, Exposure and Risk Sciences: Better Information for Better Decisions. Charleston, SC | June 2009
Editorial Board | Journal of Exposure Science and Environmental Epidemiology | January 2007-Present
Member | National Children's Study Data Access Committee | 2008-Present
World Health Organization Temporary Adviser | Plan the IPCS international workshop on "Identifying Important Life Stages for Monitoring and Assessing Risks from Exposures to Environmental Contaminants." | 2009-Present
Program Planning Committee | US EPA/ICCA meeting on Public Health Applications of Human Biomonitoring. Chair plenary session: International Perspectives. Research Triangle Park, NC | September 24-25, 2007
Co-Chair | International Society of Exposure Science (formerly ISEA) 2009 Annual Meeting, Minneapolis, MN | 2009
Program Planning Committee (and Chair) | The International Society of Exposure Analysis Annual Meeting. Chair symposium: Computational Toxicology. Durham, NC | October 14-18, 2007
Member | ILSI Health and Environmental Sciences Institute, Sensitive Subpopulations Working Group | 2006-2009
Member | ILSI Health and Environmental Sciences Institute, Biomonitoring Working Group | 2004-Present
Chair | Exposure Science for Screening Prioritizing and Toxicity Testing Community of Practice (ExpoCoP) | June 2008-Present

-------
   Scientific Leadership Roles
Name: Elaine Cohen Hubal (cont.)

Activity Type | Organization | Dates of Service
Member, program planning committee | US EPA Workshop on Research Needs for Community-Based Risk Assessment. Session organizer/chair: Data needs and measurement methods for CBRA. RTP, NC | October 18-19, 2007
Member | US EPA Risk Assessment Forum | June 2004-2009

Name: Jimena Davis

Activity Type | Organization | Dates of Service
President | EPA-RTP Networking and Leadership Training Organization (NLTO) | 2009-Present
Member | Society for Industrial and Applied Mathematics | 2003-Present
Member | American Mathematical Society | 2003-Present

Name: David Dix

Activity Type | Organization | Dates of Service
Organizing Committee | FDA Microarray Quality Control Project | 2005-2008
Editorial Board | Toxicological Sciences | 2005-Present
Chair | Multiple symposium sessions at successive SOT annual meetings | 2003-Present
Adjunct Assistant Professor | Dept. Environmental and Molecular Toxicology, North Carolina State University | 2001-2008
Member | Society of Toxicology | 2001-Present

-------
   Scientific Leadership Roles
Name: David Dix (cont.)

Activity Type | Organization | Dates of Service
Co-Chair | First EPA ToxCast Data Analysis Summit | 2009
Organizing Committee | International Council of Chemical Associations Long Range Research Initiative (ICCA-LRI) workshop: Twenty-First Century Approaches to Toxicity Testing, Biomonitoring, and Risk Assessment. Amsterdam, The Netherlands | 2008
Member | OECD Extended One Generation Reproductive Toxicity Study (EOGRTS) Working Group | 2008-Present
Member | OECD Molecular Screening Project Working Group | 2006-Present
Adjunct Assoc. Professor | Dept. Environmental Sciences and Engineering, School of Public Health, Univ. of North Carolina at Chapel Hill | 2008-Present
Member | EU CarcinoGenomics Scientific Advisory Board | 2007-Present
Editorial Board | Systems Biology in Reproductive Medicine | 2007-Present
Session Co-Chair | 2007 EPA Science Forum | 2007

Name: Keith Houck

Activity Type | Organization | Dates of Service
Member | NIH Roadmap RFA on Assay Development for High Throughput Molecular Screening Grant Review Panel | 2008
Lecturer | North Carolina Central University, The Brite Center, Dept. of Pharmaceutical Sciences | 2007-Present

-------
   Scientific Leadership Roles
Name: Keith Houck (cont.)

Activity Type | Organization | Dates of Service
Member | American Association for the Advancement of Science | 1992-Present
Member: Conference Committee | Society of Biomolecular Sciences | 2001-Present
Specialty Section: Nanotoxicology | Society of Toxicology | 2007-Present
Co-chair | Society of Biomolecular Sciences Regional Meeting, RTP, NC | 2010
Review Panel | NIH Nanomaterial Grand Opportunity Grant Review Panel | 2009
Review Panel | NIH ARRA Challenge Grant Review | 2009
Member | Clinical Chemistry and Clinical Toxicology Devices Panel of the Medical Devices Advisory Cmte, Ctr for Devices and Radiological Health, FDA | 2008-Present

Name: Richard Judson

Activity Type | Organization | Dates of Service
Member | EPA/ORD Genomics Task Force, responsible for data management strategy | 2007-2008
Lecturer | Genomics Training Course developed for OPPTS | 2007

-------
   Scientific Leadership Roles
Name: Richard Judson (cont.)

Activity Type | Organization | Dates of Service
Session Co-Chair | 2007 EPA Science Forum | 2007
Member | ORD IT Governance Board | 2007-Present
Member | Tox21 Workgroup | 2008-Present
Adjunct Assistant Professor | UNC Dept. of Environmental Sciences and Engineering | 2008-Present
Consultant | FDA NCTR Scientific Advisory Board | 2009-Present
Co-chair | First EPA ToxCast Data Analysis Summit | 2009

Name: Robert Kavlock

Activity Type | Organization | Dates of Service
Chair | EPA International Science Forum on Computational Toxicology | 2007
Reviewer | NIEHS SBRP Peer Review Panel | September 2007

-------
                                Scientific Leadership Roles
Name: Robert Kavlock (cont.)

Activity Type | Organization | Dates of Service
Member (and Chair) | OECD Molecular Screening Initiative Working Group | 2005-Present
Chair | S/T Technical Qualifications Review Board | 2008-Present
Chair | Managing Chemical Risks Integrated Multidisciplinary Research Working Group | 2009-
Reviewer | European 7th Framework Proposals for the Innovative Medicines Initiative, Brussels | February 2009
Member | Society of Toxicology, including Developmental and Reproductive Toxicology Specialty Section and the North Carolina Society of Toxicology | Current
Member | Teratology Society | Current
Expert Panel Member | Integrated Testing of Pesticides, Canadian Council of Academies | 2009-Present
Expert | WHO Working Group on the Health Effects of DDT, Geneva | June 2009
Co-Chair | Tox21 Working Group | 2007-Present
Editorial Board | Journal of Toxicology and Environmental Health, Part B | Current

-------
                                Scientific Leadership Roles
Name: Robert Kavlock (cont.)

Activity Type | Organization | Dates of Service
Editorial Board | Neurotoxicology and Teratology | 2006-Present
Associate Editor | Environmental Health Perspectives | 2006-Present
Editorial Board | Birth Defects Research, Part B | 2003-Present
Organizing Committee | World Congress on Alternatives to Animals, Nos. 6 (2007) and 7 (2009) | 2007 and 2009
Organizing Committee | ILSI New Directions in Developmental Toxicity | 2009

Name: Thomas Knudsen

Activity Type | Organization | Dates of Service
Editorial Board, Editor-in-Chief | Reproductive Toxicology (Elsevier) | 2003-Present
Editorial Board | Birth Defects Research (Part C) | 2002-Present
Editorial Board | Developmental Dynamics | 2002-Present
Editorial Board, Co-Editor | Developmental Toxicology (Comprehensive Toxicology Series - Elsevier) | 2002-Present
President | Teratology Society | 2007-08

-------
   Scientific Leadership Roles
Name: Thomas Knudsen (cont.)

Activity Type | Organization | Dates of Service
Chairman, Program Committee | 47th Annual Meeting of the Teratology Society; Council of the Teratology Society | 1999-02 and 2005-09
Scientific Liaison Task Force | Society of Toxicology | 2008-12
European Commission | Expert Panel (FP7) | 2009
Steering Committee | First International Workshop on Virtual Tissues (EPA) | April 21-23, 2009
Steering Committee | ILSI-HESI DART Workshop on "Developmental Toxicology New Directions", Leader - Working Group on New Technologies | 2009
Co-Organizer | Symposium on "Gene Regulatory Networks in Developmental Biology and Computational Toxicology", Teratology Society | 2009

Name: Stephen Little

Activity Type | Organization | Dates of Service
Member | American Chemistry | 1982-Present
Member | Society of Toxicology | 2002-Present
Member | Genetics and Environmental Mutagenesis Society | 1988-Present
Board of Directors, Councilor | Genetics and Environmental Mutagenesis Society | 2007-Present

-------
   Scientific Leadership Roles
Name: Matthew Martin

Activity Type | Organization | Dates of Service
Member | OECD Extended One Generation Reproductive Toxicity Study (EOGRTS) Working Group | 2008-Present

Name: James Rabinowitz

Activity Type | Organization | Dates of Service
Member | American Association for the Advancement of Science | Current
Member | International Society for Quantum Biology and Pharmacology | Current
Chairman (SOT) | Bioinformatics and Computational Toxicology, 48th Annual Meeting of the Society of Toxicology | 2009
Member | American Chemical Society; Section on Chemical Toxicology; Section on Computers in Chemistry | Current

Name: David Reif

Activity Type | Organization | Dates of Service
Chair | NCCT Seminar Series, Research Triangle Park, NC, USA | 2009
Member | NHEERL Data Analysis Working Group | 2007-Present
Program Committee | "Bioinformatics and Computational Biology", Genetic and Evolutionary Computation Conference | 2008-2009
Grant Reviewer | National Science Foundation | 2007-Present
Course Director and Lecturer | Introduction to R, North Carolina State University, Raleigh, NC, USA [semester course] | 2008

-------
   Scientific Leadership Roles
Name: Ann Richard

Activity Type | Organization | Dates of Service
Consultant | LeadScope LIST Workgroup, for Implementation of ToxML standard ontologies | March 2004-Present
Consultant | ILSI Working group on Prediction of Developmental Toxicity | 2002-Present
Editorial Board | SAR and QSAR in Environmental Research | 2008-Present
OpenTox Consortium | Advisory role | 2008-Present
Editorial Board | Mutation Research | 1994-Present
Editorial Board | Chemical Research in Toxicology and SAR & QSAR in Environmental Research | 2008-2010
Organizing Committee | Computational Methods in Toxicology and Pharmacology: Integrating Internet Resources, Moscow, Russia | September 2007
NCCT Representative | EPA Science Connector Workgroup | 2007
EPA Lead | Tox21 EPA Chemical Working Group | 2009

-------
   Scientific Leadership Roles
Name: R. Woodrow Setzer

Activity Type | Organization | Dates of Service
Adjunct Assoc. Professor, Dept. of Biostatistics | UNC Dept. Biostatistics, School of Public Health, Chapel Hill, NC | 2000-2009
Adjunct Professor | Department of Biostatistics, North Carolina State University | 2009-Present
Statistical Consultant/Collaborator | National Center for Environmental Assessment for Development of EPA's Benchmark Dose Software | 1993-Present
Member (Past Chair), Workgroup drafting Technical Guidance for Benchmark Dose Analysis | Risk Assessment Forum | 2005-Present
Member | ILSI-Europe Expert Group on the Application of the Margin of Exposure Approach to Genotoxic Carcinogens in Food | 2006-2008
Member | Risk Assessment Forum's Point of Departure Workgroup | 2004-Present
Member | NCEA Statistical Working Group | 2005-Present
Associate Editor | Journal of Statistical Software | 2005-Present
Publication Officer | Risk Assessment Specialty Section, American Statistical Association | 2010-2012

-------
   Scientific Leadership Roles
Name: Imran Shah

Activity Type | Organization | Dates of Service
Member | International Society for Computational Biology (ISCB) | 1997-Present
Member | Society of Toxicology | 2008-Present
Co-Chair | First International Workshop on Virtual Tissues (v-Tissues 2009), RTP, NC | 2008-2009
Session Chair | "Modeling Signaling as a Determinant of System Behavior", International Forum on Computational Toxicology, EPA, Research Triangle Park, North Carolina | 2007
Member | EPA/ORD, Future of Toxicology Working Group | 2007

Name: John Wambaugh

Activity Type | Organization | Dates of Service
Member | Sigma Xi, Duke University Chapter | 2007-Present
Member | Society of Toxicology: Biological Modeling and Risk Assessment Specialty Sections, North Carolina Society of Toxicology | 2006-Present
Member | American Physical Society: Division of Fluid Dynamics, Statistical and Nonlinear Physics Topical Group, Forum on Graduate Student Affairs | 2001-Present
Member | American Association of Physics Teachers | 1995-Present

-------
        NCCT Mentoring
Name

Adebowale Adenji
Andrew Beam
Michael Breen
Miyuki Breen
Kelly Chandler
Jimena Davis
Robert DeWoskin
Peter Egeghy
Fathi Elloumi
Ramon Garcia
Michael-Rock
Goldsmith
Mentor

E. Cohen Hubal
R. Judson
R. Conolly
R. Conolly
T. Knudsen (NCCT) &
S. Hunter (NHEERL)
R. W. Setzer (NCCT)
&
Rogelio Tornero-Velez
(NERL)
T. Knudsen
E. Cohen Hubal
R. Judson
R.W. Setzer
J. Rabinowitz
Program
Predoctoral

B.S. Computer
Science,
Computer
Engineering, &
Electrical
Engineering at
NCSU

Masters in
Biomathematics,
North Carolina
State University





Ph.D.,
Biostatistics,
University of
North Carolina,
Chapel Hill

Postdoctoral


Ph.D., Biomedical
Engineering, Case
Western Reserve
University, OH

Ph.D., Molecular
Physiology &
Biophysics, Vanderbilt
University, TN
Ph.D., Computational
Mathematics, North
Carolina State
University, NC


Ph.D., Software
Engineering,
University of Tunisia

Ph.D., Chemistry,
Duke University, NC
Other
ORD Regional
Scientist Program





NCCT Fellow
NCCT Fellow



Tenure

06/06 - 09/06
08/09 - current

12/05 -current
09/09 - current
08/08 - current
07/09 - current
06/09 - current
04/07-01/09
02/05 - 08/09
01/06-01/07
Current Position

Region 7

Research Physical
Scientist in EPA/NERL



Toxicologist in
EPA/NCEA
Research Environmental
Health Scientist in
EPA/NERL
UNC

Research Physical
Scientist in EPA/NERL

-------
         NCCT Mentoring
Name

Amber Goetz
John Jack
Nicole Kleinstreuer
Holly Mortensen
Melissa Pasquinelli
Jason Pirone
David Reif
Chester Rodriguez
Daniel Rotroff
Nisha Schuler Snipes
Mentor

D. Dix
I. Shah
R. Conolly
R. Judson (NCCT) &
Sue Euling (NCEA) &
Mitchell Kostich
(NHEERL)
J. Rabinowitz
I. Shah
E. Cohen Hubal
H. Barton (NCCT) &
R.W. Setzer (NCCT)
D. Dix
T. Knudsen
Program
Predoctoral
Ph.D.,
Environmental
and Molecular
Toxicology,
NCSU







B.S. Biological
Sciences, North
Carolina State
University, NC

Postdoctoral

Ph.D., Computational
Analysis and
Modeling, Louisiana
Tech University, LA
Ph.D., Centre for
Bioengineering,
University of
Canterbury, NZ
Ph.D., Human
Genetics, University of
Maryland, MD
Ph.D., Theoretical
Chemistry, Carnegie
Mellon University, PA
Ph.D.,
Biomathematics and
Toxicology, North
Carolina State
University, NC
Ph.D., Human
Genetics, Vanderbilt
University, TN
Ph.D., Pharmacology,
University of
California, LA, CA

Ph.D. Philosophy, Cell
and Cancer Biology,
University of
Cincinnati, OH
Other










Tenure

08/05 - 06/07
07/09 - current
08/09 - current
09/08 - current
09/04 - 07/06
01/08-05/09
09/06-11/08
01/06-09/09
08/09 - current
09/09 - current
Current Position

Syngenta Crop
Protection, Greensboro,
NC



NCSU/College of Textiles
Faculty
UNC-CH/Applied
Mathematics Program
Statistician in EPA/NCCT
Toxicologist in
EPA/OPP/HED (DC)



-------
         NCCT Mentoring
Name

Rogelio Tornero-Velez
Beena Vallanat
John Wambaugh
Clarlynda Williams-
Devane
Michael Zager
Yuchao (Maggie) Zhao
Mentor

J. Blancato
I. Shah
H. Barton (NCCT) &
R.W. Setzer (NCCT)
A. Richard
H. Barton
R. Conolly
Program
Predoctoral



Ph.D.,
Bioinformatics,
North Carolina
State University


Postdoctoral
Ph.D., Environmental
Sciences, University
of North Carolina, NC

Ph.D., Physics, Duke
University, NC

Ph.D., Applied
Mathematics, North
Carolina State
University, NC
Ph.D., Environmental
Engineering, North
Carolina State
University, NC
Other

NCCT Fellow




Tenure

1/05-3/05
06/08-10/08
07/06-11/08
10/03-12/08
03/05-12/05
03/06 - 02/08
Current Position

Physical Scientist in
EPA/NERL
NHEERL
Physical Scientist in
EPA/NCCT
Post-Doc in
EPA/NHEERL
Pfizer, Inc. in San Diego,
CA
California EPA

-------
UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
           COMPUTATIONAL TOXICOLOGY
      ROTATIONAL FELLOWSHIP PROGRAM
                 Office of Research and Development



             National Center for Computational Toxicology



                    Research Triangle Park, NC
                          April 28, 2008

-------
       Computational Toxicology Rotational Fellowship Program
                            Participants to Date

Ms. Beena Vallanat from the National Health and Environmental Effects Research Laboratory
(NHEERL) was selected as our first Computational Toxicology Rotational Fellow. This Fellowship
Program is intended to help translate the technologies and approaches being developed within the
NCCT to other parts of the Agency. Beena began her four-month fellowship on September 15, 2008
and worked primarily with NCCT's Dr. Imran Shah. Her goal was to gain a greater understanding of
computational approaches for integrating disparate data streams to elucidate toxicologic
processes in risk assessment. Specifically, she analyzed published DEHP time-course gene
expression data in mice to computationally infer transcriptional networks linked to cell
proliferation, a key event in non-genotoxic hepatocarcinogenesis. The concordance of the
regulatory network models will be evaluated using archived data sets and through experimental
validation of predicted transcription factors and microRNA. This project provided useful
information about key genetic-regulatory events following DEHP exposure that precede cell
proliferation, and it provided a novel strategy for analyzing expression profiles for risk
assessment. It was a great opportunity for interaction, collaboration, and facilitation of the
work being conducted by NHEERL's Toxicogenomics Core and the NCCT Virtual Liver project.
Drs. Egeghy and DeWoskin are currently on scientist rotations of four to six months in the
Computational Toxicology Rotational Fellowship Program. Dr. Peter Egeghy comes to NCCT from the
National Exposure Research Laboratory, and Dr. Robert DeWoskin comes from the National Center for
Environmental Assessment.
Dr. Egeghy began his rotation on June 1, 2009 and will be working with Dr. Cohen Hubal on the
ExpoCast™ Program. ExpoCast™ is being initiated to ensure the required exposure science and
computational tools are ready to address global needs for rapid characterization of exposure
potential arising from the manufacture and use of tens of thousands of chemicals. An important
early component of ExpoCast™ will be to consider how best to consolidate and link human exposure
data for chemical prioritization and toxicity testing. Dr. Egeghy will help identify high
priority human exposure data resources for initial chemical indexing in collaboration with ACToR
and DSSTox, lead development of standards for exposure data representation, and direct initial
implementation of these standards for the most critical data. As a fellow with the NCCT,
Dr. Egeghy will foster the cross-ORD collaborations required to facilitate progress on these
activities.

-------
Dr. DeWoskin began his rotation on July 6, 2009 and will be working on an integrated
multidisciplinary project with Dr. Knudsen to develop an agent-based model that simulates
cellular changes leading to pattern disruption in limb formation, as part of EPA's Virtual
Embryo project (http://www.epa.gov/ncct/v-Embryo/). This research fellowship will provide
research experience in newer technologies for predictive modeling of developmental toxicity, as
well as training in other complementary areas of NCCT research, including the application of
systems-based in vitro assays and machine-learning algorithms to derive cell signaling networks
and build functional models for pathway-based risk assessment. As a fellow with the NCCT,
Dr. DeWoskin will foster cross-ORD collaborations and advance NCEA's understanding of the latest
data and methods for quantitative characterization of the Mode of Action.

-------
Introduction

      The Office of Research and Development's (ORD) National Center for
Computational Toxicology (NCCT) coordinates and implements EPA's research in the
field of computational toxicology.  Within the source-to-outcome framework, the NCCT
conducts and sponsors  research to provide models for fate and transport of chemicals,
environmental exposures to humans and wildlife, delivery of the chemical to the target
site of toxicity, molecular and cellular pathways of toxicity, and  ultimately systems level
understanding of biological processes and their perturbation.  For priority setting
activities, the NCCT helps establish and distribute databases of high quality toxicological
information, utilizes high throughput screening tools for understanding the potential to
interfere with toxicity pathways across chemicals and chemical classes, develops
systems level models of underlying biology to predict toxicity at the organ  level, and
formulates structure activity models on important toxicity pathways. To improve
quantitative risk assessment, it applies newly developed methods and tools to
understanding determinants of susceptibility, interspecies differences, dose
extrapolation, and risks of exposure to mixtures.  NCCT employees also serve as
scientific reviewers and advisors in providing technical assistance in the broad area of
computational toxicology, to other Laboratories and Centers in ORD, EPA Program
Offices,  Regions  and the States.  NCCT communicates the results of its efforts through
peer reviewed publications, consultations, presentations, databases, publicly available
computational models, training sessions, and web sites. The NCCT also serves as a source of
training in computational toxicology by offering seminars, mini-courses, symposia, and staff
details.  To expand upon its training mission and further facilitate a more in-depth
understanding of computational toxicology, NCCT has developed this Computational Toxicology
Rotational Fellowship Program (CTRFP).

      This rotational fellowship program allows the temporary assignment of scientists
from other EPA organizations to NCCT.  This will enhance the work of EPA and build
relationships and collaborations to better equip EPA in addressing the difficult challenges
of toxicology in the 21st Century.  In addition, this program will enhance the personal
satisfaction and professional development of those EPA employees  who are involved in
the program. Candidates participating in this program will be detailed to unclassified

-------
developmental assignments for up to 1 year in duration, and are expected to return to
their home organization with enhanced skills in computational toxicology.

Purpose


      The CTRFP will provide  EPA scientists the opportunity to expand their knowledge
and experience and enhance their professional growth while promoting cross-
organization experiences that broaden employee understanding of ORD's Computational
Toxicology Program. In addition, the program will assist in developing a motivated,
flexible, and agile workforce equipped to meet the complex environmental challenges
facing the Agency now and in the future.

      The fellowships will be structured to meet the goals identified by program
participants.  Every effort will be made to ensure that assignment activities enhance or
build each participant's portfolio of skills and competencies related to the area of
computational toxicology.
      There will be 1 or 2 fellowship positions available at any given time; the final
number depends on the negotiated costs for each fellowship.  The CTRFP provides
rotational opportunities for permanent EPA employees in grades GS-12 through GS-15
and ST.  The program formally recruits and competitively selects candidates for
participation.  To gain full benefit from the program, participants must fulfill the
fellowship at the NCCT location in RTP, NC.


Program Features

1. Participant Eligibility
      A.  All permanent EPA employees in grades GS-12 through GS-15 and ST may
apply to participate in CTRFP,  provided they have been in their current positions for at

-------
least one year and have received favorable (i.e. fully successful, exceeds expectations or
outstanding) performance ratings.
       B.  Each selected fellow should have an Individual Development Plan (IDP) in
place that identifies CTRFP as a developmental  activity, and the competencies or skills
he/she wishes to develop through program participation.
       C.  Employees must receive approval of their first- and second-level supervisors
to participate in the CTRFP.

2. Selection Process and Procedures
       A.  Participation in the CTRFP is open to all EPA organizations and will be
administered and managed by the NCCT Program Manager.
       B.  NCCT will use the attached template (Appendix A) for announcing and
selecting candidates for CTRFP rotational opportunities, and will ensure that all eligible
employees receive fair consideration for program selection.
       C.  Interested and eligible EPA employees will submit complete application
materials (see specifics in Appendix A) for program  consideration.
       D.  The Deputy Director of NCCT will convene an evaluation panel comprised of
senior ORD employees/managers, including NCCT employees.  The evaluation panel is
responsible for making recommendation(s) to the NCCT Director.

3. No Grade/Series Changes in Fellowship
       A.  CTRFP is a developmental program designed to build and/or enhance
employee skills and competencies, not to provide promotional opportunities for
employees.
       B.  CTRFP candidates will be  detailed to  unclassified fellowships/projects vs.
specific positions; therefore, employee job series and grade levels will not change.
       C.  Employees will return to their positions of record in home organizations
following CTRFP rotations.

4. Duration of Fellowships
       Fellowships under this program are intended to be 4 months to a maximum of 1
year in total duration, depending on the fellowship project plan. The project will be
implemented in 120-day increments.  Details will be extended or terminated to effect
the agreed-upon total duration.

-------
5. Documentation
   A.  CTRFP assignments will be effected and documented on SF-52s, "Requests for
       Personnel Action." Upon notification of selection for the CTRFP, the employee's
       home organization will complete the SF-52s, obtain the signatures of the
       employee's first- and second-level supervisors, and route them to the employee's
       servicing personnel office. A copy of all SF-52s will be sent to the CTRFP
       Manager in NCCT.
   B.  CTRFP assignments will be officially documented as 120-day "details," and may
       be extended up to a maximum of 1 year.
   C.  At the completion of CTRFP rotations, home offices will  process "termination of
       detail" actions via an SF-52.
   D.  An assignee's official position of record, including title, occupational series, and
       grade level, will not change as  a result of participation in CTRFP.

6. Performance Management
       According to EPA's Performance Appraisal and  Recognition System (PARS)
training manual, supervisors  are to develop summary ratings for EPA employees  on
detail assignments for 120 days  or more, requiring the establishment of specific
performance criteria based on the essential duties and responsibilities of assignments.
The NCCT supervisor will:
   A.  Establish performance plans for CTRFP assignees with critical elements (CEs)
       based on the essential duties and responsibilities associated with CTRFP
       fellowships or projects.
   B.  Communicate performance  expectations to CTRFP assignees within 30 days of
       the effective date of assignments.
   C.  Complete performance evaluations with assignees at the conclusion of CTRFP
       rotations.
   D.  Provide written evaluations with summary ratings to home supervisors (also
       referred to as supervisors of record) and assignees at the end of the rotation period.
   E.  Home supervisors will consider CTRFP summary ratings in determining overall
       ratings at the conclusion of the rating period.

7. Program Funding
      A.  FTE will continue to be covered by the fellow's home office; however, a portion
(up to a maximum of 50%) of the PC&B costs may be paid by NCCT. This is a
negotiable item.
      B.  A portion  of the travel and training expenses to and from fellowships will be
paid by each assignee's home office. NCCT is willing to pay a portion of the travel and
training expenses and will  negotiate the amount with the selected fellow's home office.
      C.  All travel  and training required by NCCT during the fellowship will be paid by
NCCT.  Any travel and training required by the fellow's home organization will be paid by
the home organization.


Roles and Responsibilities

1. Home Office Supervisors
      Home office supervisors provide important coaching, guidance, feedback, and
support to assignees.  In addition to responsibilities set forth in this guidance, home
office supervisors should:
   A. Write a letter of recommendation for the candidate to include in the application
      package;
   B. Assist candidates in building specific, measurable individual development plans
      (IDPs) that set forth the expectations of both the participating office and the
      candidate, as well as training/education to support skill advancement;
   C. Discuss learning experiences upon assignment  completion and identify lessons
      learned; and
   D. Include summary ratings for CTRFP assignees in determining overall performance
      ratings.

2. NCCT Supervisor
      The NCCT supervisor, like home supervisors, provides important instruction,
guidance, and feedback to CTRFP assignees. The success of a rotational experience for
both the candidate and the NCCT is, to a great extent, a function of each understanding
the other's expectations.  The NCCT supervisor will:
   A. Assist in preparing rotational agreements that set forth expectations of both the
      participating  office and the candidate;
   B.  Provide an in-depth orientation on the organization, its structure, and office
       protocol;
   C.  Provide regular positive and constructive feedback on performance and task
       completion;
   D.  Provide office space as well as the supplies, computers and other tools and
       equipment needed to be successful during the rotation. Upon termination of the
       rotation, all space, supplies, equipment, etc. provided  by NCCT will be retained
       by NCCT.
   E.  Ensure the establishment of PARS plans, monitor performance, and provide
       written evaluations with summary ratings at the conclusion of the rotation.

3. CTRFP Assignees
   A.  Define personal development objectives within IDPs.
   B.  Meet with home office supervisors to discuss how fellowships will support IDP
       objectives.
   C.  Prepare application materials including resumes/curriculum vitae and statements
       of interest that address: the knowledge, skills and abilities they will contribute
       during the fellowship; the desired goals and accomplishments they seek from the
       fellowship and expect to bring back to their home organization; and the possible
       opportunities for future collaborations utilizing the field of computational
       toxicology.
   D.  Present a final seminar on the experience prior to completion of the detail.
   E.  Write narrative summaries of rotational experiences at the conclusion of the
       rotation and provide them to both home and rotation supervisors.


Rotation Agreements

1. Rotation agreements will be developed in conjunction with the NCCT
supervisor after the participants are selected, and will include the following
information:
   A.  The time frame for the rotation;
   B.  Funding arrangements and anticipated trips home;
   C. Project needs for supplies, computers and other expenses;
   D. A plan and timeline for the specific tasks to be performed or skills to be
      developed during the rotation;
   E. Estimated time and topic for the candidate's seminar upon completion of the rotation.

2. Agreements should be signed by each assignee, his/her home office
supervisor, and the NCCT supervisor, and a copy of the agreement should be
provided to the CTRFP Program Manager in NCCT.

Travel Information

      For general Agency travel-related information, please see: "On the Way with
EPA: A Reference Guide for Travel," published by the EPA Office of the Comptroller, April
1999. This program meets the criteria of a rotational assignment; therefore,  after
approval by the employee's Training Officer, training/expense funds (rather than travel)
may be used to pay for the per diem costs associated with this program.

1. Timeframe for Rotation Planning
      Candidates must submit all application materials to the CTRFP Program Manager
by the announced deadline.  Failure to make these arrangements in a timely manner
may delay the start of the employee's rotation.

2. Travel Authorizations
      Prior to travel, each assignee is to prepare a Travel Authorization (TA) to cover a
4-month (120-day) rotation, plus additional TAs for each extension up to a maximum
of 1 year, according to the final agreement.

3. Estimating  Rotation Expenses
      To calculate the costs of a fellowship outside of the assignee's geographical
location, the following must be included: round-trip airfare/train fare/POV expenses
for the appropriate number of round trips, per diem (at 55% of the location's daily
allowance), and lodging (at a maximum of 55%). The round-trip airfare/train fare/POV
expenses must be paid from the travel ceiling and are not included in the example
below. The 55% lodging and per diem rate is based on what the Agency allows for
120-day rotational details. The example below shows how the 55% lodging and per diem
(meals and incidental expenses) rate works for a candidate doing a rotation to RTP,
based on 2008 per diem rates: per diem allowance = $49/day; maximum full lodging
(based on hotel rates) = $97/day.

               ESTIMATED TRAINING/EXPENSE COSTS
                  (TRAVEL FUNDS NOT INCLUDED)

EXPENSE FUNDS                              4 MONTHS     6 MONTHS      1 YEAR
Candidate MI&E allowance
  $49 x 0.55 x # days
  + $36 x 0.55 x 2 days ($39.60)           $3,273.60    $4,837.42     $9,674.84
Candidate lodging maximum
  $97 x 0.55 x # days                      $6,402       $9,603        $19,206
TOTAL                                      $9,675.60    $14,440.42    $28,880.84
(We hope to lower this figure substantially by using nice, furnished apartments with all
utilities and local phone included.)
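
The arithmetic behind the table above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and constant names are hypothetical, the rates are the 2008 RTP figures quoted in the text ($49/day MI&E, $97/day lodging, 55% allowed), and the extra $36 x 0.55 x 2-day MI&E term is copied from the table as-is, since the text does not state its rationale. A 4-month rotation is taken as 120 days, and for that case the sketch reproduces the 4-month column. The MI&E figures in the longer columns may rest on a different basis, so they are not checked here. Travel funds (airfare/train fare/POV) are excluded, as in the table.

```python
from decimal import Decimal

# Rates quoted in the text (2008 per diem rates for RTP).
ALLOWED_FRACTION = Decimal("0.55")   # share of the daily rate allowed for rotational details
MIE_PER_DAY = Decimal("49")          # meals and incidental expenses, dollars per day
LODGING_PER_DAY = Decimal("97")      # maximum full lodging, dollars per day

# Extra MI&E term shown in the table ($36 x 0.55 x 2 days = $39.60).
# Assumption: applied once per rotation; the source does not explain it.
EXTRA_MIE_PER_DAY = Decimal("36")
EXTRA_MIE_DAYS = 2


def estimate_expense_costs(days: int) -> dict:
    """Estimate training/expense costs (travel funds not included)."""
    mie = (MIE_PER_DAY * ALLOWED_FRACTION * days
           + EXTRA_MIE_PER_DAY * ALLOWED_FRACTION * EXTRA_MIE_DAYS)
    lodging = LODGING_PER_DAY * ALLOWED_FRACTION * days
    return {"mie": mie, "lodging": lodging, "total": mie + lodging}


if __name__ == "__main__":
    costs = estimate_expense_costs(120)  # 4-month (120-day) rotation
    for label, amount in costs.items():
        print(f"{label:8s} ${amount:,.2f}")
    # 4-month column: mie $3,273.60, lodging $6,402.00, total $9,675.60
```

Decimal arithmetic is used rather than floats so that dollar amounts compare exactly with the published figures.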

4. Rotation Housing
      All housing  arrangements must be coordinated with and approved by NCCT.  The
NCCT administrative staff will assist the candidate in locating the best available housing
within a reasonable distance of the NCCT-RTP facility.
      A. Federal  travel policy does not permit EPA to pay any lodging costs for
candidates who choose to stay with relatives or other federal/EPA employees during
their rotations. (We can still pay for incidentals and per diem, however.)
      B. To be reimbursed for lodging, candidates must submit a rental certificate or
official receipt from a housing  complex or landlord.

5. Travel Vouchers
      A. Travel vouchers are the mechanism used to claim reimbursement for travel
expenses. Each assignee will submit travel vouchers on a monthly basis, for the
duration of the rotation.
      B. Requests for advance payments for rotational costs can be submitted on day
one of a rotational assignment. Advance payments can include air fare, lodging, per
diem, subsistence, and transportation for the first month of a rotation;  however,
vouchers must be submitted monthly so that reimbursement of authorized travel
expenses can be paid.
      C. Candidates should complete Direct Deposit Forms to ensure that travel
voucher reimbursements go directly into checking/savings accounts rather than being
sent to home addresses.
      D. The maximum travel reimbursement for assignees who choose to drive their
private vehicles to RTP, NC is limited to the amount of the least expensive round-trip
government airfare from their home office to RTP, NC.

-------
                               APPENDIX B
         UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
         PREPARE ON YOUR HOME ORGANIZATION'S LETTERHEAD

                  MEMORANDUM OF UNDERSTANDING
DATE:
SUBJECT:  Application for Computational Toxicology Rotational Fellowship Program
            (CTRFP)

FROM:     (Applicant's Name, Organization)
THRU:     (Applicant's First-line Supervisor's Name, Organization)
THRU:     (Applicant's Second-line or Division Director's Name)
TO:       Karen Dean
          CTRF Program Manager
          NCCT (MD-B-205-01)

      This memorandum and its attachments provide the information required for my
application to the CTRFP. Below are my signature and those of my first- and second-level
supervisors, which certify the accuracy of the information provided below and signify our
understanding of the terms and commitments being made as part of this application.

Applicant Information:
Organization and Mail Code:
Position Title:
Grade/Series:
Time in Current Position:
Office Telephone Number:
Travel Preparer's Name & Phone #:







-------
FUNDING AND FTE

1.  If the applicant named above is accepted to the CTRFP, we are aware of and commit to
   cover a portion of their travel/training expenses to and from the rotational assignment, as
   well as for those costs during the rotation. Our portion of the travel and expense costs
   will be negotiated with NCCT, but the maximum amount to be covered by NCCT is 50%
   of the total costs.
2.  We understand that the FTE will continue to be charged to the candidate's home
   organization. However, PC&B costs are negotiable and NCCT may cover up to 50% of
   the total costs while on the rotation.
3.  All travel and training required by NCCT during the rotational assignment will be paid by
   NCCT.
4.  In the rare event that the home organization requires the candidate to travel or be trained
   during the rotational assignment, the home organization will pay for those expenses.

PROGRAM FEATURES

1.      CTRFP is a developmental program designed to build and/or enhance employee skills
       and competencies in computational toxicology, not to provide promotional
       opportunities for employees.
2.      CTRFP candidates will be  detailed to unclassified rotational assignments/projects vs.
       specific positions; therefore, employee job series and grade levels will not change.
3.      Employees will return to their positions of record in home organizations following
       CTRFP rotations.
4.      Rotational assignments under this program will be a minimum of 120 days, up to a
       maximum of 1 year, depending on the projects undertaken during the rotation.
5.      For candidates requiring temporary travel, home visits may be funded, as agreed to in
       advance by the assignee, NCCT supervisor and home supervisor.
6.      For candidates requiring temporary housing, NCCT staff will assist in finding suitable
       lodging within per diem while on the rotation.

PERFORMANCE MANAGEMENT

       We certify that the applicant's latest performance rating was favorable (e.g., fully
successful, exceeds expectations, or outstanding).

       According to EPA's Performance Appraisal and Recognition System (PARS) training
manual, the NCCT supervisor will develop summary ratings for EPA employees on detail
assignments for 120 days or more, requiring the establishment of specific performance
criteria based on  the essential duties and responsibilities of assignments.

-------
APPLICANT & MANAGEMENT CERTIFICATION
We certify and agree to the terms and conditions stated in this MOU.
 Applicant                                   Signature/Date
 First-Line Supervisor                       Signature/Date
 Second-Line Supervisor/Division Director    Signature/Date

-------
                                  APPENDIX C

                          Candidate's Statement of Interest
              Computational Toxicology Rotational Fellowship Program

Describe how the CTRFP aligns with what you hope to achieve at EPA over the next few
years. Please limit to no more than 1 page. Please use Times Roman 12 point font.

Address the following elements in your description:

1.     The knowledge, skills and abilities you will contribute during the fellowship.

2.     The desired goals and accomplishments you seek from the fellowship and what you
expect to bring back to your home organization.

3.     The possible opportunities to utilize the field of computational toxicology in your
work once you return to your home  organization, including future collaborations.

-------
                          APPENDIX D

 COMPUTATIONAL TOXICOLOGY ROTATIONAL FELLOWSHIP PROGRAM

                 DOCUMENTATION REQUIREMENTS
                          CHECKLIST
I. APPLICATION MATERIALS TO BE SUBMITTED BY CANDIDATE:

Required Items                                                      Completed
Application MOU signed by candidate & managers (see Appendix B)     ____
Candidate's Statement of Interest (see Appendix C)                  ____
Candidate's Biosketch, CV or Resume                                 ____
Candidate's Supervisor's Recommendation                             ____

II. SELECTED CANDIDATE'S HOME ORGANIZATION PREPARES:

Required Items                                                      Completed
SF-52s for duration of assignment, including termination SF-52
  (provide copy to NCCT)                                            ____
IDP which includes CTRFP                                            ____
Travel Authorizations for CTRFP
  (for joint funding, done in conjunction with NCCT)                ____

III. SELECTED CANDIDATE AND NCCT SUPERVISOR PREPARE:

Required Items                                                      Completed
Rotation Agreement                                                  ____
PARS plans                                                          ____
PARS summary ratings at end of assignment                           ____

IV. UPON COMPLETION OF ROTATION, SELECTED CANDIDATE PREPARES:

Required Items                                                      Completed
Rotational Assessment                                               ____
Exit Seminar                                                        ____



-------
  UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
                 ROTATIONAL FELLOWSHIP PROGRAM
What is this program and how is it implemented?
   - temporary assignment of scientists from other EPA organizations to ORD/NCCT
   - 1 or 2 fellowships ongoing at the same time
   - duration is 4 months to a maximum of 1 year, depending on the project
   - employee is detailed to unclassified PD and stays at current permanent grade
   - must be geographically located in RTP, NC during the rotation
   - return to home organization upon completion of rotation

What is the purpose of this program?
   - promote cross-organization experiences
   - broaden employee understanding of ORD's Computational Toxicology Program
   - build relationships and collaborations to better equip EPA in addressing the
     difficult challenges of toxicology in the 21st Century
   - enhance the work of EPA

Who is eligible?
   - permanent EPA employees in grades GS-12 through GS-15 and ST (not open to EPA
     post-docs and other term or temporary employees)
   - in current position for at least one year
   - have received favorable (i.e., fully successful, exceeds expectations, or outstanding)
     performance ratings
   - approval by first- and second-level supervisors required

Who funds the candidate while on the rotation?
   - FTE will continue to be covered by the fellow's home organization
   - a portion (up to a maximum of 50%) of the PC&B costs may be negotiated with
     and paid by NCCT
   - a portion of the travel and training expenses to and from fellowships may be
     negotiated with NCCT and the costs may be shared by assignee's home office and
     NCCT

How do interested employees apply?
   - receive approval of their first- and second-level supervisors
   - complete application materials, including: statement of interest, supervisor's
     recommendation, resume/biosketch/CV, and application MOU

What is the timeline of events?
   - application due date is June 13, 2008
   - interviews completed by July 15, 2008
   - selection(s) made by August 1, 2008
   - rotation(s) begin by September 14, 2008

-------
EPA Communities of Practice Presentation Documents


Chemical Prioritization

A Tiered Approach for the Use of Non-Testing Methods in the Regulatory Assessment of
Chemicals.  Dr. Andrew Worth, Systems Toxicology Unit, Institute for Health & Consumer
Protection, Joint Research Centre, European Commission. 06/25/2009.
http://www.epa.gov/ncct/practice_community/Worth_CPCP%20presentation.pdf

The U.S. EPA's 2006 Inventory Update Reporting (IUR)  Data on Chemical Substances. Dr.
Susan Sharkey, USEPA - OPPT. 05/28/2009.
http://www.epa.gov/ncct/practice_community/IUR_Overview_CPCP_28may2009.pdf

Is ChAMP a Winning Strategy? Dr. Cal Baier-Anderson,  Environmental Defense Fund.
04/22/2009.
http://www.epa.gov/ncct/practice_community/CPCP%20april2009%20CBA%20ChAMP%20RBP%20Eval%20for%20ToxCast%2009.pdf

The ToxCast 320 Chemical Library in Cultures of Primary Human Hepatocytes: qNPAs as
Windows into Chemical-Induced Hepatocyte Biology. Dr. Stephen S. Ferguson, CellzDirect.
11/18/2008.
http://www.epa.gov/ncct/practice_community/CZD-EPA%20ToxCast%20320%20CPCP%20presentation-11-20-08.pdf

The ToxCast Chemical Universe and ACToR (Aggregated Computational Toxicology
Resource). Dr. Richard Judson, USEPA-NCCT. 10/17/2008.
http://www.epa.gov/ncct/practice_community/Judson%20CPCP%20Landscape%20Oct%20200Srev.ppt

Screening for Chemical  Effects on Neuronal Proliferation and Neurite Outgrowth Using High-
Content/High-Throughput Microscopy.  Dr. Joseph M. Breier, Curriculum in Toxicology,
University of North Carolina at Chapel Hill. 09/25/2008.
http://www.epa.gov/ncct/practice_community/CPCP-9-25-08%20Final.pdf

Chemical Modulation of Gap Junctional Intercellular Communication in Toxicology. Dr. James E.
Trosko, Center for Integrative Toxicology, Food Safety Toxicology Center, Dept.
Pediatrics/Human Development, College of Human Medicine, Michigan State  University.
08/25/2008.
http://www.epa.gov/ncct/practice_community/Trosko_lecture_2008.pdf

Solidus Bioscience's MetaChip Technology for High-Throughput In Vitro Assessment of
Chemical and Drug Candidate Toxicity. Solidus. 07/27/2008.
http://www.epa.gov/ncct/practice_community/Solidus_Biosciences_Toxcast_July_08.pdf

U.S. EPA Use of QSAR and Category Approaches in Profiling Hazards of Industrial Chemicals.
Dr. Tala Henry, USEPA- OPPT. 06/11/2008.
http://www.epa.gov/ncct/practice_community/USEPA_Use%20of_QSAR_and_Category_Approaches_Jun08.pdf

-------
Overview of the Contaminant Candidate List 3. Dr. Thomas Carpenter, USEPA-OGWDW.
05/22/2008.
http://www.epa.gov/ncct/practice_community/CCL3_Community_of%20Practice_052208.pdf

Screening Chemicals in Commerce to Identify Possible Persistent and Bioaccumulative
Chemicals: New Results and Future Work. Presentation by Ted Smith from EPA's Great Lakes
National Program Office.  Dr. Edwin Smith, USEPA - Region 5. 04/24/2008.
http://www.epa.gov/ncct/practice_community/Smith_CPCP_presentation_24apr2008.pdf

Chemical Prioritization and Risk Assessment in the 21st Century - A Highly Personal
Perspective.  Dr. Melvin Andersen, The Hamner Institutes for Health Sciences. 03/27/2008.
http://www.epa.gov/ncct/practice_community/Andersen_EPA_CPCP_27mar2008.pdf

Opportunities for Collaboration on In Vitro Testing Proposal. HESI Representatives; Dr. Jiri
Aubrecht (Pfizer), Dr. Albert Fornace (Georgetown University), Dr. Robert Schiestl, (UCLA),
Syril Pettit, M.E.M. 02/28/2008.
http://www.epa.gov/ncct/practice_community/vitro_testing.pdf

BioSeek - ToxCast Phase I Project Update.  Dr. Ellen Berg, BioSeek, Inc. 01/24/2008.
http://www.epa.gov/ncct/practice_community/BioSeek_ToxCast_Summary_24Jan08.pdf


Exposure Science

Holistic Mass Balance Modeling Approach for Chemical Screening and Priority Setting. Jon
Arnot, University of Toronto Scarborough.  07/14/2009.
http://www.epa.gov/ncct/practice_community/exposure_science/MassBalanceMethodsforChemicalScreening-ExpoCoP.pdf

Connecting Environment, Biology, and Behavior for Human Exposure and Risk Assessment:
Integrative Modeling Approaches. Dr.  Panos G. Georgopoulos, Rutgers University. 05/05/2009.
http://www.epa.gov/ncct/practice_community/exposure_science/050509_Panos.pdf

GExFRAME A Web-Based Framework for Accessing Global Consumer Exposure Data,
Scenarios, and Models. Dr. Muhilan Pandian, infoscientific. 04/14/2009.
http://www.epa.gov/ncct/practice_community/exposure_science/041409_Pandian.pdf

Gene Expression Profiles: Biomarkers of Inter-Individual Susceptibility to Environmental Agents
And Indicators of Exposure.  Dr. Rebecca Fry, University of North Carolina at Chapel Hill.
03/10/2009.
http://www.epa.gov/ncct/practice_community/exposure_science/031009_Fry.pdf

Biomonitoring Equivalents as Screening Tools for Interpretation of Human Biomonitoring Data.
Sean M. Hays, M.S., M.S. and Lesa L. Aylward,  M.S.  02/10/2009.
http://www.epa.gov/ncct/practice_community/exposure_science/021009_Hays.pdf

Multimedia Multipathway Modeling of Emissions to Impacts: screening with USEtox and
advanced spatial modeling with IMPACT.  Dr. Olivier Jolliet, iMod-Impact and Risk Modeling,
School of Public Health, EHS, University of Michigan.  01/13/2009.
http://www.epa.gov/ncct/practice_community/exposure_science/011309_Jolliet.pdf

-------
Prioritization of HPV Chemicals under the Chemical Assessment and Management Program
(ChAMP). Drs. Nhan Nguyen and Cathy Fehrenbacher, USEPA-OPPT. 12/08/2008.
http://www.epa.gov/ncct/practice_community/exposure_science/120808_Fehrenbacher.pdf

Assessing the Exposure-Dose-Toxicity Relationship within the EPA's ToxCast Program. Dr.
Russell Thomas, The Hamner Institutes for Health Sciences. 11/04/2008.
http://www.epa.gov/ncct/practice_community/exposure_science/110408_Thomas.pdf

Chemical Exposure Priority Setting Tool (CEPST). Dr. Mike Jayjock, The Lifeline Group, Inc.
10/07/2008.
http://www.epa.gov/ncct/practice_community/exposure_science/100708_Jayjock.pdf

Characterizing Exposure to Indoor VOCs and SVOCs using Simple Mass-Transfer Models. Dr.
John Little, Virginia Polytechnic Institute and State University. 10/07/2008.
http://www.epa.gov/ncct/practice_community/exposure_science/100708_Little.pdf

European Centre for Ecotoxicology and Toxicology of Chemicals Targeted Risk Assessment
(ECETOC TRA) Tool. Rosemary Zaleski and Chris Money on behalf of the ECETOC TRA Task
Force.  09/09/2008.
http://www.epa.gov/ncct/practice_community/exposure_science/090908_Zaleski.pdf

TOXICO-CHEMINFORMATICS: DSSTox and Chemical Structure Annotation for improved data
access.  Dr. Ann Richard, USEPA-NCCT. 08/12/2008.
http://www.epa.gov/ncct/practice_community/exposure_science/081208_Richard.pdf

Considering Exposure in Priority Setting Categorization of the Domestic Substances List under
the Canadian  Environmental  Protection Act (CEPA). Dr. Bette Meek, McLaughlin Centre
University of Ottawa. 07/08/2008.
http://www.epa.gov/ncct/practice_community/exposure_science/070808_Meek.pdf

Short Term Exposure Prioritization Needs for ToxCast™. Drs. Elaine Cohen Hubal and Richard
Judson, USEPA-NCCT. 06/17/2008.
http://www.epa.gov/ncct/practice_community/exposure_science/ToxCast_Exposure_Info_Needs.pdf

Chemical Selection for ToxCast: EPA's Program for Predicting Toxicity and Prioritizing
Chemical Testing.  Dr. Richard Judson, USEPA-NCCT. 05/27/2008.
http://www.epa.gov/ncct/practice_community/exposure_science/052708Judson.pdf

-------
NCCT PARTNERSHIP AGREEMENTS


Material Transfer Agreements (MTAs):

      For ToxCast Data Generation:

             BASF SE and its affiliate Metanomics GmbH, Berlin, DE - 05/08
             Biolog, Inc., Hayward, CA - 02/09
             CellzDirect Inc, Durham, NC - 02/08
             Centronix Corp., Manchester, UK - 01/08
             Iconix & Affymetrix Inc., Santa Clara, CA - 06/06
             Imperial College of Science, Technology and Medicine, London, UK - 11/08
             Invitrogen Corp., Madison, WI - 02/08
             Solidus Biosciences Inc., Troy, NY - 01/08
             WatchFrog S.A., Evry, FR - 10/08
             Zygogen, LLC, Atlanta, GA - 07/08

      For Clinical Data Sharing:

             Pfizer Inc., New York, NY - 04/09

      For ToxCast/ToxRef  Data Sharing:

             Cogenics Inc., Morrisville, NC - 09/07
             Gene Logic Inc, Gaithersburg, MD - 10/07
             Genedata Inc, Lexington, MA - 04/09
             GeneGo Inc, St. Joseph, MI - 06/08
             Germany Federal Institute (BfR) for Risk Assessment, Berlin, DE - 02/08
             National Institute for Public Health and the Environment (RIVM), Bilthoven,
                   NL - 12/08
             SimBioSys Inc., Toronto, ON - 12/07
             U.S. EPA, National Center of Environmental Assessment (NCEA), Washington,
                   DC - 03/09
             U.S. EPA, National Exposure Research Laboratory (NERL), RTP, NC - 02/09
             U.S. EPA, National Exposure Research Laboratory, Athens, GA - 04/09
             U.S. EPA, National Health & Environmental Effects Research Laboratory
                   (NHEERL), RTP, NC - 04/08
             U.S. EPA, Office of Pollution Prevention & Toxics, Risk Assessment Division
                   (OPPT), Washington, DC - 03/09

      For ToxCast Data Analysis:

             Advanced Chemistry Development, Toronto, ON - 02/09
             Albert-Ludwigs-Universität Freiburg, Freiburg, GE - 03/09
             BioSeek Inc., South San Francisco, CA - 04/09
             Bull & Associates, Inc., Springfield, VA - 03/09
             Cambridge Cell Networks Ltd, Cambridge, UK - 05/08
             Cellumen Inc., Pittsburgh, PA - 03/09
             Department of Chemical & Biomolecular Engineering, The Ohio State University,
                   Columbus, OH - 02/09
Department of Pharmacology, University of Medicine and Dentistry of New
       Jersey-Robert Wood Johnson Medical School, Piscataway, NJ - 03/08
Douglas Connect GmbH, Project Coordinator of OpenTox, Zeiningen, CH - 02/09
Drexel University College of Medicine, Philadelphia, PA - 04/09
Exponent, Inc. Health Practice, Philadelphia, PA-04/09
FMC Corporation, Princeton, NJ - 04/09
Food Standards Agency, London,  UK- 03/09
Fraunhofer Institute of Toxicology and Experimental Medicine (ITEM), Hannover,
       GE - 03/09
Helmholtz Zentrum Munchen (GmbH), Neuherbert, GE - 02/09
Ideaconsult Ltd, Sofia, BG - 03/09
In Silico Toxicology, Basel, CH - 03/09
Institute of Biomedical Chemistry of Russian Academy of Medical Sciences,
       Moscow, RU - 03/09
Istituto di Ricerche Farmacologiche "Mario Negri," Milano, IT - 06/09
Istituto Superiore DI Sanita, Rome, IT -  03/09
Jawaharlal NEHRU University, New Delhi, IN -03/09
Leadscope Inc., Columbus, OH - 04/09
Lhasa Limited, Leeds, UK-05/08
Louisiana Tech University, Ruston, LA -  03/09
Max Planck Institute for Molecular Genetics, Berlin, DE - 02/08
Michigan State University, Lansing, Ml - 06/08
National Center for Toxicological Research, Jefferson, AR - 02/09
National Institute of Advanced Industrial  Science & Technology (AIST), Ibaraki,
       JP-02/09
National Technical University of Athens,  Athens, GR - 04/09
North Carolina State University, Raleigh, NC - 04/08
NovaScreen Biosciences Corp., Hanover, MD - 02/09
OpenTox Consortium, David Gallagher, Beaverton, OR - 03/09
Princeton University, Princeton, NJ - 05/08
RegeneMed Inc., San Diego, CA - 04/09
SABiosciences Corp., Frederick, MD - 02/09
Saint-Petersburg State Polytechnical University, Saint-Petersburg, RU - 02/09
SAS Institute Inc., Cary, NC - 03/09
Seascape Learning Co. Pvt. Ltd, New Delhi, IN - 04/09
Simulations Plus Inc., Lancaster, CA - 03/09
Summit Toxicology, LLP, Lyons, CO - 02/09
Syngenta Crop Protection Inc., Greensboro, NC - 03/09
Technische Universität München Dept of Informatics, Garching, GE - 03/09
The Dow Chemical Co., Midland, MI - 02/09
The Institute of Biomedical Sciences, East China Normal University, Shanghai,
       CN - 02/09
Toxicogenomic Informatics & Solutions, LLC, Lansing, MI - 03/08
U.S. Food & Drug Administration, Office of Food Additive Safety, Center for Food
       Safety and Applied Nutrition, College Park, MD - 02/09
University of Insubria, Varese, IT - 03/09
University of Kansas, Lawrence, KS - 03/09
University of North Carolina at Chapel Hill, Chapel Hill, NC - 08/08
University of North Carolina School of Global Public Health, Chapel Hill, NC -
       03/09

-------
Memoranda of Understanding (MOUs)

      BRITE Institute Center of Excellence, North Carolina Central University, Durham, NC -
            03/08
      National Institute of Environmental Health Sciences/National Toxicology Program, RTP,
            NC & National Institutes of Health Chemical Genomics Center, Bethesda, MD -
            01/08
      The Hamner Institutes for Health Sciences, RTP, NC - 10/07
      U.S. Army Center for Environmental Health Research, Ft Detrick, MD - 04/08
      University of Cincinnati, Reading, OH - 04/08

Cooperative Research and Development Agreements (CRADAs)

      Illumina Inc, San Diego, CA - 12/07
      L'Oréal, Paris, France - 09/08

Interagency Agreements (IAGs)

      Department of Health & Human Services-NIEHS Div of Intramural Research, RTP, NC -
            08/05 [Funds In]
      Department of Health & Human Services-NIH Chemical Genomics Center, Bethesda,
            MD - 12/06 [Funds Out]

-------
NCCT Bibliography 2005 - 2009

2009
1.  Ankley GT, Bencic DC, Breen MS, Collette TW, Conolly RB, Denslow ND, Edwards SW,
   Ekman DR, Garcia-Reyero N, Jensen KM,  Lazorchak JM, Martinovic D, Miller DH, Perkins
   EJ, Orlando EF, Villeneuve DL, Wang RL, Watanabe KH. Endocrine Disrupting Chemicals
   In Fish: Developing Exposure Indicators and Predictive Models of Effects Based On
   Mechanism of Action. Aquatic Toxicology 92(3): 168-78 (2009).

2.  Barrier M, Dix DJ, Mirkes PE. Inducible 70  kDa Heat Shock Proteins Protect Embryos from
   Teratogen-induced Exencephaly: Analysis  Using Hspa1a/a1b Knockout Mice. Birth Defects
   Res A Clin Mol Teratol 28; 85(8):732-740. (2009).

3.  Benakanakere MR, Li Q, Eskan MA, Singh AV, Zhao J, Galicia JC, Stathopoulou P,
   Knudsen TB, Kinane DF. Modulation of TLR2 Protein Expression by Mir-105 in Human Oral
   Keratinocytes. J Biol Chem. 284(34):23107-15 (2009).

4.  Benfenati E, Benigni R, Demarini DM, Helma C, Kirkland D, Martin TM, Mazzatorta P,
   Ouedraogo-Arras G, Richard AM, Schilter B, Schoonen WG, Snyder RD, and Yang C.
   Predictive Models For Carcinogenicity And Mutagenicity: Frameworks, State-of-the-Art, and
   Perspectives. Journal of Environmental Science and Health. Part C, Environmental
   Carcinogenesis Reviews. 27(2):57-90, (2009).

5.  Goetz AK and Dix DJ. Mode of Action for Reproductive and Hepatic Toxicity Inferred From A
   Genomic Study of Triazole Antifungals. Toxicological Sciences. Society of Toxicology,
   110(2):449-62, (2009).

6.  Goetz AK and Dix DJ. Toxicogenomic Effects Common to Triazole Antifungals and
   Conserved Between Rats and Humans. Toxicology and Applied Pharmacology, 238(1):80-9,
   (2009).

7.  Heidenfelder BL, Reif DM, Harkema JR, Cohen Hubal EA, Hudgens EE, Bramble LA,
   Wagner JG, Morishita M, Keeler GJ, Edwards SW, Gallagher JE. Comparative Microarray
   Analysis and Pulmonary Changes in Brown Norway  Rats Exposed To Ovalbumin and
   Concentrated Air Particulates. Toxicol Sci.  108(1), 207-221. (2009).

8.  Judson R, Richard A, Dix DJ, Houck K, Martin  M, Kavlock R, Dellarco V, Henry T,
   Holderman T,  Sayre P, Tan S, Carpenter T, Smith E. The Toxicity Data Landscape for
   Environmental Chemicals. Environmental Health Perspectives, 117(5):685-95,
   (2009).

9.  Kavlock RJ, Austin CP, and Tice RR. Toxicity Testing in the 21st Century: Implications for
   Human Health Risk Assessment. Risk Analysis, 29(4):485-7; discussion 492-7 (2009).

10. Knudsen TB, Martin MT, Kavlock RJ, Judson RS, Dix DJ, Singh AV. Profiling The Activity Of
   Environmental Chemicals In Prenatal Developmental Toxicity Studies Using The U.S. EPA's
   ToxRefDB. Reprod Toxicol. 28(2):209-19 (2009).

11. Kramer MG, Firestone M, Kavlock R, and Zenick H. The Future of Toxicity Testing For
   Environmental Contaminants. Environ. Health Perspect, 117(7):A283-A284. (2009).

-------
12. Lou I, Wambaugh JF, Lau C, Hanson RG, Lindstrom AB, Strynar MJ, Zehr RD, Setzer RW,
   and Barton HA. Modeling Single and Repeated Dose Pharmacokinetics of PFOA in Mice.
   Toxicological Sciences, 107(2):331-41, (2009).

13. Martin MT, Judson RS, Reif DM, Kavlock RJ, Dix DJ. Profiling Chemicals Based On Chronic
   Toxicity Results From The U.S. EPA ToxRef Database. Environ Health Perspect.
   117(3):392-9. (2009).

14. Martin MT, Mendez E, Corum DG, Judson RS, Kavlock RJ, Rotroff DM, Dix DJ. Profiling
   The Reproductive Toxicity of Chemicals From Multigeneration Studies In The Toxicity
   Reference Database. Toxicol Sci. 110(1): 181-90. (2009).

15. Reif DM, Motsinger AA, McKinney BA, Edwards KM, Chanock SJ, Rock MT, Crowe JE Jr,
   Moore JH. Integrated Analysis Of Genetic And Proteomic Data Identifies Biomarkers
   Associated With Systemic Adverse Events Following Smallpox Vaccination. Genes and
   Immunity, 10(2). (2009).

16. Rodriguez CE, Setzer RW, and Barton HA. Pharmacokinetic Modeling of Perfluorooctanoic
   Acid During Gestation And Lactation In The Mouse. Reproductive Toxicology, (3-4):373-86,
   (2009).

17. Sheldon LS and Cohen Hubal EA. Exposure as Part of a Systems Approach for Assessing
   Risk.  Environ Health Perspect 117(8): 119-1194 (2009).

18. Thompson CM, Johns DO, Sonawane B, Barton HA, Hattis D, Tardif R, and Krishnan K.
   Database for Physiologically Based Pharmacokinetic (PBPK) Modeling: Physiological
   Parameters for Healthy and Health-Impaired Elderly. Journal of Toxicology and
   Environmental Health - Part B: Critical Reviews, 12(1):1-12, (2009).

19. Williams-Devane CR, Wolf MA, and Richard AM. DSSTox Chemical-Index Files for
   Exposure-Related Experiments in Arrayexpress and Gene Expression Omnibus:  Enabling
   Toxico-Chemogenomics Data Linkages. Bioinformatics, 25(5):692-694, (2009).

20. Williams-Devane CR, Wolf MA, and Richard AM. Toward A Public Toxicogenomics
   Capability For Supporting Predictive Toxicology: Survey Of Current Resources And
   Chemical Indexing Of Experiments In GEO And ArrayExpress. Toxicological Sciences,
   109(2):358-371, (2009).

21. Xu Y, Cohen-Hubal EA, Clausen PA, and Little JC. Predicting Residential Exposure To
   Phthalate Plasticizer Emitted From Vinyl Flooring - A Mechanistic Analysis. Environmental
   Science & Technology, 43(7):2374-80,  (2009).

22. Zhu H, Ye L, Richard AM, Golbraikh A, Wright FA, Rusyn I, and Tropsha A. A Novel
   Two-Step Hierarchical Quantitative Structure Activity Relationship Modeling Workflow for
   Predicting Acute Toxicity of Chemicals  in Rodents. Environmental Health Perspectives,
   117:1257-1264, (2009).

-------
In Press

Cohen Hubal EA, Richard AM, Shah I, Gallagher J, Kavlock R, Blancato J, Edwards SW.
Exposure Science and the U.S. EPA National Center for Computational Toxicology. J Expo Sci
Environ Epidemiol. 2008 Nov 5. [Epub ahead of print].

Cohen Hubal EA. Biologically-Relevant Exposure Science for 21st Century Toxicity Testing.
Toxicol. Sci., 2009 July 14. [Epub ahead of print] In press. doi:10.1093/toxsci/kfp159.

Ema M, Ise R, Kato H, Oneda S, Hirose A, Hirata-Koizumi M, Nishida Y, Singh AV,
Knudsen TB, and Ihara T. (2009) Fetal Malformations And Early Embryonic Gene Expression
Response In Cynomolgus Monkeys Maternally Exposed To Thalidomide. Reprod Toxicol. In press.

Goetz, AK, Rockett JC, Ren H, Thillainadarajah I, and Dix DJ. (2009) Inhibition of Rat and
Human Steroidogenesis By Triazole Antifungals. Systems Biology in Reproductive Medicine, In
press.

Houck KA, Dix DJ, Judson RS, Kavlock RJ, Yang J, Berg EL. Profiling Bioactivity of The
ToxCast Chemical Library Using BioMAP Primary Human Cell Systems. J Biomolec Screen.
(2009) In press.

Knight AW, Little S, Houck K, Dix D, Judson R, Richard A, McCarroll N, Akerman G, Yang C,
Birrell L, Walmsley RM. Evaluation of High-Throughput Genotoxicity Assays Used in Profiling
The US EPA ToxCast Chemicals. Regulatory Toxicology and Pharmacology (2009) In press.

Rabinowitz JR, Little SB, Laws SC, Goldsmith MR. Molecular Modeling for Screening
Environmental Chemicals For Estrogenicity: Use of The Toxicant-Target Approach. Chemical
Research In Toxicology, 2009 Aug 31. [Epub ahead of print] In press.

Sanchez YA, Deener K, Hubal EC, Knowlton C, Reif D,  Segal D. Research needs for
community-based risk assessment: findings from a multi-disciplinary workshop. J Expo Sci
Environ Epidem. 2009 Feb 25 [Epub ahead of print] In press.

2008
1. Aylward LL, Barton HA, and Hays SM. Biomonitoring Equivalents (BE) Dossier for Toluene
   (CAS No. 108-88-3). Regulatory Toxicology and Pharmacology, 51(3 Suppl):S27-36, (2008).

2. Barthold JS, McCahan SM, Singh AV, Knudsen TB, Si X, Campion L, and Akins RE. Altered
   Expression of Muscle And Cytoskeleton-Related Genes In A Rat Strain With Inherited
   Cryptorchidism. J Androl. 29(3):352-366. (2008).

3. Benigni R, Bossa C, Richard AM, and Yang C. A Novel Approach: Chemical Relational
   Databases, and the Role of the ISSCAN Database on Assessing Chemical Carcinogenicity.
   Annali dell'Istituto Superiore di Sanità 44(1):48-56, (2008).

4. Cohen-Hubal EA, Nishioka MG, Ivancic WA, Morara M, and Egeghy PP. Comparing Surface
   Residue Transfer Efficiencies to Hands Using Polar and Non-Polar Fluorescent Tracers.
   Environmental Science & Technology. American Chemical Society, Washington, DC,
   42(3):934-9, (2008).

-------
5.  Cohen Hubal EA, Moya J, Selevan SG. A Lifestage Approach to Assessing Children's
   Exposure. Developmental and Reproductive Toxicology. Birth Defects Res (Part B)
   83(6):522-529. (2008).

6.  Datta S, Turner D, Singh R, Ruest LB, Pierce WM Jr And Knudsen TB. Fetal Alcohol
   Syndrome (FAS) in C57BL/6 Mice Detected Through Proteomics Screening of the Amniotic
   Fluid. Birth Defects Res (Part A) 82(4):177-186. (2008).

7.  Deaciuc IV, Song Z, Peng X, Barve SS, Song M, He Q, Knudsen TB, Singh AV, and McClain
   CJ. Genome-Wide Transcriptome Expression In The Liver of A Mouse Model of High
   Carbohydrate Diet-Induced Liver Steatosis And Its Significance For The Disease. Hepatol
   International, 2(1): 39-49 (2008).

8.  Hardison NE, Fanelli TJ, Dudek SM, Reif DM, Ritchie MD, Motsinger AA. A Balanced
   Accuracy Fitness Function Leads To Robust Analysis Using Grammatical Evolution Neural
   Networks In The Case Of Class Imbalance. Genetic and Evolutionary Computation
   Conference. (2008).

9.  Harris LA and Barton HA. Comparing Single and Repeated Dosimetry Data for
   Perfluorooctane Sulfonate in Rats. Toxicology Letters, 181(3):148-156, (2008).

10. Houck KA and Kavlock RJ. Understanding Mechanisms of Toxicity: Insights from  Drug
   Discovery. Toxicol and Appl. Pharm. 227(2): 163-178. (2008).

11. Judson R, Richard A, Dix D, Houck K, Elloumi F, Martin M, Cathey T, Transue TR, Spencer
   R, Wolf M. ACToR - Aggregated Computational Toxicology Resource. Toxicol Appl Pharmacol.
   15;233(1):7-13. (2008).

12. Judson R. Pharmacogenetics in Drug Development and Research in Electrical Diseases of
   the Heart: Genetics, Mechanisms, Treatment,  Prevention Edited By Gussak, Antzelevitch,
   Wilde,  Friedman, Ackerman and Shen (Springer, 2008).

13. Judson R, Elloumi F, Setzer RW,  Li Z, and Shah I. A Comparison of Machine Learning
   Algorithms For Chemical Toxicity Classification Using A Simulated Multi-Scale Data Model.
   BMC Bioinformatics.  19(9):241, (2008).

14. Kavlock RJ, Ankley G, Blancato J, Breen M, Conolly R,  Dix D, Houck K, Hubal E, Judson R,
   Rabinowitz J,  Richard A, Setzer RW, Shah I, Villeneuve D, and Weber E. Computational
   Toxicology: A State of the Science Mini Review. Toxicological Sciences 103(1), 14-27.
   (2008).

15. Knaak JB, Dary CC, Okino MS, Power FW, Zhang X, Thompson CB, Tornero-Velez R,
   and Blancato JN. Parameters for Carbamate Pesticide QSAR and PBPK/PD Models for
   Human Risk Assessment. Environmental Contamination and Toxicology. Springer-Verlag,
   New York, NY, 193:53-210, (2008).

16. Knudsen TB and Kavlock RJ. Comparative Bioinformatics and  Computational Toxicology. In:
   Developmental Toxicology 3rd Edition. (B Abbott and D Hansen, Editors) New York: Taylor
   and Francis. Chapter 12, pp 311-360 (2008).

-------
17. Loizou G, Spendiff M, Barton HA, Bessems J, Bois FY, d'Yvoire MB, Buist H, Clewell HJ
   3rd, Meek B, Gundert-Remy U, Goerlitz G, and Schmitt W. Development Of Good Modelling
   Practice For Physiologically Based Pharmacokinetic Models For Use In Risk Assessment:
   The First Steps. Regulatory Toxicology and Pharmacology, 50(3):400-411, (2008).

18. Motsinger AA, Reif DM, Fanelli TJ, Ritchie MD. A Comparison of Analytical Methods for
   Genetic Association Studies. Genetic Epidemiology, 32(6). (2008).

19. Nong A, Tan YM, Krolski ME, Wang J, Lunchick C, Conolly RB, and Clewell HJ 3rd.
   Bayesian Calibration of A Physiologically Based Pharmacokinetic/Pharmacodynamic Model
   of Carbaryl Cholinesterase Inhibition. J. Toxicol. Environ. Health 71, 1363-1381. (2008).

20. Rabinowitz JR, Goldsmith MR, Little SB, and Pasquinelli MA. Computational Molecular
   Modeling For Evaluating The Toxicity Of Environmental Chemicals: Prioritizing Bioassay
   Requirements. Environmental Health Perspectives, 116(5), 573-577. (2008).

21. Reif DM, McKinney BA, Motsinger AA, Chanock SJ, Rock MT, Moore JH, Crowe JE Jr.
   Genetic Basis for Systemic Adverse Events Following Smallpox Vaccination. Journal of
   Infectious Diseases,  198(1). (2008).

22. Richard AM, Yang C, and Judson R. Toxicity Data Informatics: Supporting a  New Paradigm
   for Toxicity Prediction. Toxicology Mechanisms and Methods. 18(2  & 3):103-118, (2008).

23. Rodriguez CE, Sobol Z, Schiestl RH. 9,10-Phenanthraquinone Induces DNA Deletions and
   Forward Mutations Via Oxidative Mechanisms In The Yeast Saccharomyces Cerevisiae,
   Toxicology In Vitro 22(2):296-300 (2008).

24. Rogers JM and Kavlock RJ. Developmental Toxicity. In: Casarett & Doull's
   Toxicology: The Basic Science of Poisons, 7th Ed. CD Klaassen, Editor. McGraw-Hill,
   Inc., New York, NY, 301-331. (2008).

25. Rouchka EC, Phatak AW, and Singh AV. Effect of Single Nucleotide Polymorphisms On
   Affymetrix® Match-Mismatch Probe Pairs. Bioinformation 2(9):405-11. (2008).

26. Thompson CM, Sonawane B, Barton HA, Dewoskin RS, Lipscomb JC, Schlosser P, Chiu
   W, and Krishnan K. Approaches for Applications of Physiologically Based Pharmacokinetic
   Models in Risk Assessment. Journal of Toxicology and Environmental Health - Part B:
   Critical Reviews, 11(7):519-47, (2008).

27. Verzilli C, Shah T, Casas JP, Chapman J, Sandhu M, Debenham SL,  Boekholdt MS, Khaw
   KT, Wareham NJ, Judson R, Benjamin EJ, Kathiresan S, Larson MG, Rong J, Sofat R,
   Humphries SE, Smeeth L, Cavalleri G, Whittaker JC, Hingorani AD. Bayesian
   Meta Analysis of Genetic Association Studies with Different Sets of Markers. Am.
   J. Hum. Gen. 82(4):859-872 (2008).

28. Wambaugh JF, Barton HA, and Setzer RW. Comparing Models for Perfluorooctanoic Acid
   Pharmacokinetics Using Bayesian Analysis. Journal of Pharmacokinetics and
   Pharmacodynamics,  35(6):683-712, (2008).

-------
29. Yang C, Hasselgren CH, Boyer S, Arvidson K, Aveston S, Diekes P, Benigni R, Benz RD,
   Contrera J, Kruhlak NL, Matthews EJ, Han X, Jaworska J, Kemper RA, Rathman JF, and
   Richard AM. Understanding Genetic Toxicity through Data Mining: The Process of Building
   Knowledge by Integrating Multiple Genetic Toxicity Databases. Toxicology Mechanisms and
   Methods. 18(2 & 3):277-295, (2008).

30. Yoon M and Barton HA. Predicting Maternal Rat And Pup Exposures: How Different Are
   They? Toxicological Sciences. Society of Toxicology, 102(1):15-32, (2008).

31. Zhu H, Rusyn I, Richard AM, and Tropsha A. Use Of Cell Viability Assay Data Improves The
   Prediction  Accuracy of Conventional Quantitative Structure-Activity Relationship Models of
   Animal Carcinogenicity. Environmental Health Perspectives, 116(4):506-513, (2008).

2007
1.  Barton HA, Chiu WA, Setzer RW, Andersen ME, Bailer AJ, Bois FY, Dewoskin RS, Hays S,
   Johanson  G, Jones N, Loizou G, Macphail RC, Portier C, Spendiff M, and Tan YM.
   Characterizing Uncertainty And Variability In PBPK Models: State of The Science And
   Needs For Research And Implementation. Toxicological Sciences. Society of Toxicology,
   99(2):395-402, (2007).

2.  Benigni R, Netzeva TI, Benfenati E, Bossa C, Franke R, Helma C, Hulzebos E, Marchant C,
   Richard A, Woo YT, Yang C. The Expanding Role of Predictive Toxicology: An Update on
   the (Q)SAR Models For Mutagens and Carcinogens. J. Environ. Sci. Health C, 25:53-97,
   (2007).

3.  Blancato JN, Evans MV, Power FW, and Caldwell JC. Development and Use of PBPK
   Modeling and the Impact of Metabolism on Variability in Dose Metrics for the Risk
   Assessment of Methyl Tertiary Butyl Ether (MTBE). Journal of Environmental Science  and
   Health, 1:29-51, (2007).

4.  Breen MS, Villeneuve DL, Breen M, Ankley GT, and Conolly RB. Mechanistic Computational
   Model of Ovarian Steroidogenesis to  Predict Biochemical Responses to Endocrine Active
   Compounds. Annals Biomed. Engineering, 35(6), 970-981. (2007).

5.  Cherney DP, Ekman DR, Dix DJ, Collette TW. Raman  Spectroscopy-Based Metabolomics
   For Differentiating Exposures To Triazole Fungicides Using Rat Urine. Anal Chem
   79(19):7324-32. (2007).

6.  Chiu W, Barton HA, Dewoskin RS, Schlosser P, Thompson CM, Sonawane B, Lipscomb
   JC, and Krishnan K. Evaluation of Physiologically Based Pharmacokinetic Models for Use in
   Risk Assessment.  Journal of Applied  Toxicology, 27(3):218-237, (2007).

7.  Conolly R, Blancato JN. Development and Use of PBPK Modeling and the Impact of
   Metabolism on Variability In Dose Metrics for The Risk Assessment Of Methyl Tertiary Butyl
   Ether (MTBE), Journal of Environmental  Protection Science, 1: 29-51,  (2007).

8.  Conolly RB, and Thomas RS.  Biologically Motivated Approaches to Extrapolation From
   High To Low Doses And The Advent Of Systems Biology: The Road To Toxicological Safety
   Assessment. Human and Ecological Risk Assessment 13(1), 52-56. (2007).

-------
9.  Cummings AM, Stoker TE, and Kavlock RJ. Gender-Based Differences in Endocrine and
   Reproductive Toxicity. Environmental Research, 104(1):96-107, (2007).

10. deFur PL, Evans GW, Cohen Hubal EA, Kyle AD, Morello-Frosch RA, Williams D.
   Vulnerability as a Function of Individual and Group Resources In Cumulative Risk
   Assessment. Environ Health Perspect 115(5):817-824. (2007).

11. Dix DJ, Houck KA, Martin MT, Richard AM, Setzer RW, and Kavlock RJ. The ToxCast
   Program For Prioritizing Toxicity Testing of Environmental Chemicals. Toxicol. Sci.,
   95(1):5-12. (2007).

12. Firestone M, Moya J, Cohen Hubal E,  Zartarian V, Xue J. Identifying Childhood Age Groups
   for Exposure Assessments Monitoring. Risk Analysis 27(3): 701-714. (2007).

13. Goetz AK, Ren H, Schmid JE, Blystone CR, Thillainadarajah I, Best DS, Nichols HP, Strader
   LF, Wolf DC, Narotsky MG, Rockett JC, Dix DJ. Disruption of Testosterone Homeostasis as
   a Mode of Action  for the Reproductive Toxicity of Triazole Fungicides in the Male Rat.
   Toxicol Sci 95(1):227-39. (2007).

14. Green ML, Singh  AV, Zhang Y, Nemeth KA, Sulik KK,  and Knudsen TB. Reprogramming Of
   Genetic Networks During Initiation of the Fetal Alcohol Syndrome. Dev Dyn. 236(2):613-31.
   (2007).

15. Hilborn ED, Carmichael WW, Soares CM, Yuan M, Servaites JC, Barton HA, and Azevedo
   SM. Serologic Evaluation of Human Microcystin Exposure. Environmental Toxicology,
   22(5):459-463, (2007).

16. Kavlock RJ, Dix DJ, Houck KA, Judson RS, Martin MT, Richard AM. ToxCast: Developing
   Predictive Signatures for Chemical Toxicity. Alt. Animal Test Experiment. 14, Special Issue,
   623-627. (2007).

17. Kim SJ, Dix DJ, Thompson KE, Murrell RN, Schmid JE, Gallagher JE, Rockett JC.  Effects
   of Storage, RNA Extraction, GeneChip Type, and Donor Sex On Gene Expression Profiling
   of Human Whole Blood. Clin Chem 53(6):1038-45. (2007).

18. Liao KH, Tan YM, Conolly RB, Borghoff SJ, Gargas ML, Andersen ME, and Clewell HJ 3rd.
   Bayesian Estimation of Pharmacokinetic and  Pharmacodynamic Parameters in a Mode-of-
   Action-Based Cancer Risk Assessment for Chloroform. Risk Anal. 27(6), 1535-1551. (2007).

19. Martin MT, Brennan RJ, Hu W, Ayanoglu E, Lau C, Ren H, Wood CR, Corton JC, Kavlock
   RJ, Dix DJ. Toxicogenomic Study of Triazole  Fungicides And Perfluoroalkyl Acids  In Rat
   Livers Predicts Toxicity and Categorizes Chemicals Based On Mechanisms of Toxicity.
   Toxicol Sci 97(2):595-613. (2007).

20. McKinney BA, Reif DM, White BC, Crowe JC, Moore JH. Evaporative Cooling Feature
   Selection For Genotypic Data Involving Interactions. Bioinformatics, 23(16). (2007).

21. Motsinger AA, Reif DM.  Embracing Complexity: Gene-Gene and Gene-Environment
   Interactions.  In: Genes, Genomes, and Genomics, Vol. 3. (2007).

-------
22. Motsinger AA, Ritchie MD, Reif DM. Novel Methods for Detecting Epistasis in
   Pharmacogenomics Studies. Pharmacogenomics, 8(9). (2007).

23. Motsinger AA, Reif DM, Fanelli TJ, Davis AC, Ritchie MD. Linkage Disequilibrium In Genetic
   Association Studies Improves The Power Of Grammatical Evolution Neural Networks. IEEE
   Symposium on Computational Intelligence In Bioinformatics and Computational Biology
   (2007).

24. Platts AE, Dix DJ, Chemes HE, Thompson KE, Goodrich R, Rockett JC, Rawe VY, Quintana
   S, Diamond MP, Strader LF, Krawetz SA. Success And Failure In Human Spermatogenesis
   As Revealed By Teratozoospermic RNAs. Hum Mol Genet 16(7):763-73. (2007).

25. Power F, Blancato JN. Malathion Exposure During Lice Treatment: Use of Exposure
   Related Dose Estimating Model (ERDEM) and Factors Relating To The Evaluation Of Risk,
   U.S. Environmental Protection Agency, Washington, DC, EPA/600/R-07/023 (NTIS PB2007-
   106971), (2007).

26. Reif DM, Israel MA, Moore JH. Exploratory Visual Analysis of Statistical Results of
   Microarray Experiments Comparing High and Low Grade Glioma. Cancer Informatics, 2(1).
   (2007).

27. Rhomberg LR, Baetcke K, Blancato J, Bus J, Cohen S, Conolly R, Dixit R, Doe J, Ekelman
   K, Fenner-Crisp P, Harvey P, Hattis D, Jacobs A, Jacobson-Kram D, Lewandowski T,
   Liteplo R, Pelkonen O, Rice J, Somers D, Turturro A, West W, and Olin S. Issues in the
   Design and Interpretation of Chronic Toxicity and Carcinogenicity Studies in Rodents:
   Approaches To Dose Selection. Crit. Rev. Toxicol. 37(9): 729-837. (2007).

28. Rodriguez CE, Mahle  DA, Gearhart JM, Mattie DR, Lipscomb JC, Cook RS, and Barton HA.
   Predicting Age-Appropriate Pharmacokinetics Of Six Volatile Organic Compounds In The
   Rat Utilizing Physiologically Based Pharmacokinetic Modeling. Toxicological Sciences.
   Society of Toxicology, 98(1):43-56, (2007).

29. Ryan PB, Burke TA, Cohen Hubal EA, Cura JJ, McKone TE. Using Biomarkers to Inform
   Cumulative Risk Assessment. Environ Health Perspect 115:833-84 (2007)

30. Singh AV, Knudsen KB and Knudsen TB. Integrative Analysis of the Mouse Embryonic
   Transcriptome. Bioinformation, 1(10), 406-413. (2007).

31. Singh AV, Rouchka E, Rempala G, Bastian C and Knudsen TB. Integrative Database
   Management for Mouse Development: Systems and Concepts  Review. Birth Defects
   Research (Part C) 81:1-19. (2007).

32. Vahter M, Gochfeld M, Casati B, Thiruchelvam M, Falk-Filippson A, Kavlock R, Marafante E,
   and Cory-Slechta D. Implications of Gender  Differences for Human Health Risk Assessment
   and Toxicology. Environmental Research, 104(1):70-84, (2007).

33. Wambaugh JF, Matthews JV, Gremaud PA, and Behringer RP. Response to Perturbations
   in Granular Flow. Physical Review E 76, 051303 (2007).

-------
34. Yoon M, Madden MC, and Barton HA. Extrahepatic Metabolism by CYP2E1 in PBPK
   Modeling of Lipophilic Volatile Organic Chemicals: Impacts on Metabolic Parameter
   Estimation and Prediction of Dose Metrics. Journal of Toxicology and Environmental Health,
   70(18):1527-1541, (2007).

2006
1.  Barone S Jr, RE Brown, S Euling, E Cohen Hubal, CA Kimmel, S Makris, J Moya, SG
   Selevan, B Sonawane, T Thomas, C Thompson. Vision General De La Evaluacion Del
   Riesgo En Salud Infantil Empleando Un Enfoque Por Etapas De Desarrollo [Overview Of A
   Life Stage Approach To Children's Health Risk Assessment] Acta Toxicologica Argentina,
   14(Suplemento) 7-10. (2006).

2.  Barton HA, Tang J, Sey YM, Stanko JP, Murrell RN, Rockett JC, Dix DJ. Metabolism Of
   Myclobutanil And Triadimefon By Human And Rat Cytochrome P450 Enzymes And Liver
   Microsomes. Xenobiotica 36(9):793-806. (2006).

3.  Barton HA, Tang J, Sey YM, Stanko JP, Murrell RN, Rockett JC, Dix DJ. Metabolism of
   Myclobutanil and Triadimefon by Human and Rat Cytochrome P450 Enzymes and Liver
   Microsomes. Xenobiotica, 36(09):793-806, (2006).

4.  Barton HA, Pastoor TP, Baetcke K, Chambers JE, Diliberto J, Doerrer NG, Driver JH,
   Hastings CE, Iyengar S, Krieger R, Stahl B, Timchalk C. The Acquisition and Application of
   Absorption, Distribution, Metabolism,  and Excretion (ADME) Data in Agricultural Chemical
   Safety Assessments. Critical  Reviews in Toxicology, 36(1):9-35, (2006).

5.  Birnbaum LS and Cohen-Hubal EA. Polybrominated Diphenyl  Ethers: A Case Study for
   Using Biomonitoring Data to Address Risk Assessment Questions. Environmental Health
   Perspectives, 114(11):1770-1775, (2006).

6.  Blancato JN. Exposure Related Dose Estimating Model (ERDEM): A Physiologically-Based
   Pharmacokinetic And Pharmacodynamic (PBPK/PD) Model for Assessing Human Exposure
   and Risk, EPA Report- U.S. Environmental Protection Agency, Washington, DC,
   EPA/600/R-06/061 (NTIS  PB2006-114712), (2006).

7.  Blancato JN. Computational Environmental Toxicology. McGraw-Hill Yearbook of Science
   and Technology, Chapter 1, pp 72-75 (2006).

8.  Carmichael NG, Barton HA, Boobis AR, Cooper RL, Dellarco VL, Doerrer NG, Fenner-Crisp
   PA, Doe JE, Lamb JC 4th, and Pastoor TP. Agricultural Chemical Safety Assessment: A
   Multisector Approach to the Modernization of Human Safety Requirements.  Critical Reviews
   in Toxicology, 36(1):1-7, (2006).

9.  Cohen Hubal EA, Egeghy PP, Leovic KW, Akland GG. Measuring Potential  Dermal Transfer
   of a Pesticide to Children in a Child Care Center. Environ Health Perspect 114(2):264-269.
   (2006).

10. Cohen Hubal EA. Uso De Los Datos De Biomonitoreo Para Informar Sobre La Evaluacion
   De La Exposicion Infantil [Using Biomonitoring Data To Inform Exposure Assessment In
   Children] Acta Toxicologica Argentina [Journal Of The Argentine Society Of Toxicology].
   14(Suplemento) 17-19. (2006).

-------
11. Denslow ND, Colbourne JK, Dix DJ, Freedman JH, Helbing CC, Kennedy S, Williams PL.
   Selection of Surrogate Animal Species For Comparative Toxicogenomics. In: Emerging
   Molecular and Computational Approaches for Cross-Species Extrapolations. Eds. W
   Benson and R Di Giulio. CRC Press, Florida. (2006).

12. Dix DJ, Gallagher K, Benson WH, Groskinsky BL, McClintock JT, Dearfield KL, Farland WH.
   A Framework for the Use of Genomics Data at the EPA. Nat Biotechnol 24(9):1108-11.
   (2006).

13. Goetz AK, Bao W, Ren H, Schmid JE, Tully DB, Wood C, Rockett JC, Narotsky MG, Sun G,
   Lambert GR, Thai SF, Wolf DC, Nesnow S, Dix DJ. Gene Expression Profiling In The Liver of
   CD-1 Mice To Characterize The Hepatotoxicity of Triazole Fungicides. Toxicol Appl
   Pharmacol 215(3):274-84. (2006).

14. Hunter SE, Rogers E, Blanton M, Richard AM, Chernoff N. Bromochloro-Haloacetic Acids:
   Effects on Mouse Embryos In Vitro And QSAR Considerations, Birth Defects Research Part
   A, 21(3):260-266, (2006).

15. Kavlock RJ, Barr D, Boekelheide K, Breslin W, Breysse P, Chapin R, Gaido K, Hodgson E,
   Marcus M, Shea K, and Williams P. NTP Center For The Evaluation of Risks To Human
   Reproduction: Expert Panel Update on the Reproductive and Developmental Toxicity of
   Di(2-Ethylhexyl) Phthalate. Reproductive Toxicology, 22(3):291-399, (2006).

16. Kim SJ, Dix DJ, Thompson KE, Murrell RN, Schmid JE, Gallagher JE, Rockett JC. Gene
   Expression In Head Hair Follicles Plucked From Men And Women. Ann Clin Lab Sci
   36(2): 115-26. (2006).

17. Kim YK, Suarez J, Hu Y, McDonough PM, Boer C, Dix DJ, Dillmann WH. Deletion of the
   Inducible 70-kDa Heat Shock Protein Genes In Mice Impairs Cardiac Contractile Function
   and Calcium Handling Associated With Hypertrophy. Circulation 113(22):2589-97. (2006).

18. Luderer U, Collins TF, Daston GP, Fischer LJ,  Gray RH, Mirer FE, Olshan AF, Setzer RW,
   Treinen KA, Vermeulen R. NTP-CERHR Expert Panel Report on the Reproductive and
   Developmental Toxicity of Styrene, Birth Defects Research (Part B) 77(2):110-193 (2006).

19. Potter LK, Zager MG, and Barton HA. A Mathematical Model for the Androgenic Regulation
   of the Prostate in Intact and Castrate Adult Male Rats. American Journal of Physiology.
   American Physiological Society, 291(5):E952-E964, (2006).

20. Richard AM, Gold LS, Nicklaus MC. Chemical Structure Indexing Of Toxicity Data on the
   Internet: Moving Towards a Flat World. Current Opinion in Drug Discovery & Development,
   9(3):314-325, (2006).

21. Richard AM. The Future Of Toxicology-Predictive Toxicology: An Expanded View Of
   Chemical Toxicity. 10 Chemical Research in Toxicology. American Chemical Society,
   Washington, DC, 9(September): 1257-1262, (2006).

22. Rockett JC, Narotsky MG, Thompson KE, Thillainadarajah I, Blystone CR, Goetz AK, Ren
   H, Best DS, Murrell RN, Nichols HP, Schmid JE, Wolf DC, and Dix DJ. Effect of Conazole
   Fungicides on Reproductive Development in the Female Rat. Reproductive Toxicology,
   22(4):647-658, (2006).

-------
23. Shi L et al, MAQC Consortium. The Microarray Quality Control (MAQC) Project Shows Inter-
   and Intraplatform Reproducibility of Gene Expression Measurements. Nat Biotechnol
   24(9): 1151-61. (2006).

24. Sun G, Thai SF, Lambert GR, Wolf DC, Tully DB, Goetz AK, George MH, Grindstaff RD, Dix
   DJ, Nesnow S. Fluconazole-Induced Hepatic Cytochrome P450 Gene Expression And
   Enzymatic Activities In Rats And Mice. Toxicology Letters, 164(1):44-53, (2006).

25. Tan YM, Liao KH, Conolly RB, Blount BC, Mason AM, and Clewell HJ. Use of A
   Physiologically Based Pharmacokinetic Model to Identify Exposures Consistent With Human
   Biomonitoring Data for Chloroform. J. Toxicol. Environ. Health, Part A, 69(18), 1727-1756.
   (2006).

27. Tolson JK, Dix DJ, Voellmy RW, Roberts SM. Increased Hepatotoxicity of Acetaminophen in
   Hsp70i Knockout Mice, Toxicol Appl Pharmacol. 2006 N 1; 210(1-2):157-62 (2006).

28. Tully DB, Bao W, Goetz AK, Blystone CR, Ren H, Schmid JE, Strader LF, Wood CR, Best
   DS, Narotsky MG, Wolf DC, Rockett JC, Dix DJ. Gene Expression Profiling In Liver and
   Testis of Rats to Characterize the Toxicity of Triazole Fungicides. Sciencedirect Elsevier
   (Ed.), Toxicology and Applied Pharmacology, 215(3):260-273, (2006).

29. Wambaugh, JF, Graph Percolation As An Analog To Granular Force Networks. Cond-
   Mat/0603314, (2006).

30. Yang C, Richard AM, Cross KP. The Art of Data Mining the Minefields of Toxicity Databases
   to Link Chemistry to Biology. Curr Comput-Aided Drug Design, 2(2):135-150, (2006).

31. Yoon, M., Madden MC, and Barton HA. Developmental Expression of Aldehyde
   Dehydrogenase in Rat: A Comparison of Liver and Lung Development. Toxicological
   Sciences. Society of Toxicology, 89(2):386-398,  (2006).

32. Zhang Q, Andersen ME, and Conolly RB. Binary Gene Induction and Protein Expression in
   Individual Cells. Theor. Biol. Med. Modelling 5; 3:18, (2006)

2005
1.  Andersen ME, Dennison JE, Thomas RE, and Conolly RB. New Directions In Incidence-
   Dose Modeling. Trends Biotechnol. 23(3):122-127. (2005).

2.  Bao W, Schmid JE, Goetz AK, Ren H, Dix DJ. A Database For Tracking Toxicogenomic
   Samples And Procedures. Reproductive Toxicology 19(3):411-419. (2005).

3.  Barton HA, Cogliano VJ, Flowers L, Valcovic L, Setzer RW, Woodruff TJ. Assessing
   Susceptibility from Early-Life Exposure to Carcinogens. Environmental Health Perspectives,
   113(9): 1125-1133, (2005).

4.  Barton HA, Computational Pharmacokinetics During Developmental Windows of
   Susceptibility. Journal of Toxicology and Environmental Health - Part A: Current Issues,
   68(11-12):889-900, (2005).

-------
5.  Cohen Hubal EA, Suggs JC, Nishioka MG, Ivancic WA, Characterizing Residue Transfer
   Efficiencies Using a Fluorescent Imaging Technique, Journal of Exposure Analysis and
   Environmental Epidemiology. 15(3):261-270. (2005).

6.  Conolly RB, Gaylor DW, and Lutz WK. Population Variability in Biological Adaptive
   Responses to DNA Damage and the Shapes of Carcinogen Dose-Response Curves.
   Toxicol. Appl. Pharmacol. 207(2 suppl):570-75. (2005).

7.  Cummings A, and Kavlock R. A Systems Biology Approach to Developmental Toxicology.
   Repro. Toxicol, 19(3):281-290. (2005).

8.  Firestone M, Cohen Hubal EA. Guidance On Selecting Age Groups For Monitoring And
   Assessing Childhood Exposures To Environmental Contaminants, EPA Report - Risk
   Assessment Forum, U.S. Environmental Protection Agency, Washington, DC.  EPA/630/P-
   03/003F. (2005).

9.  Fostel J, Choi D, Zwickl C, Morrison N, Rashid A,  Hasan A, Bao W, Richard A, Tong W,
   Bushel PR, Brown R, Bruno M, Cunningham ML, Dix D, Eastin W, Frade C, Garcia A,
   Heinloth A, Irwin R, Madenspacher J, Merrick BA, Papoian T, Paules R, Rocca-Serra P,
   Sansone AS, Stevens J, Tomer K, Yang C, Waters M.  Chemical Effects In  Biological
   Systems-Data Dictionary (CEBS-DD): A Compendium of Terms For The Capture and
   Integration of Biological Study Design Description, Conventional Phenotypes, and 'Omics
   Data. Tox. Sci., 88(2):585-601. (2005).

10. Granville CA, Ross MK, Tornero-Velez R, Hanley  NM,  Grindstaff RD, Gold  A, Richard AM,
   Funasaka  K, Tennant AH, Kligerman AD, Evans MV, DeMarini D. Genotoxicity And
   Metabolism of the Source-Water Contaminant 1,1-Dichloropropene: Activation By Gstt1-1.
   Mutat. Res. 572(1-2):98-112, (2005).

11. Kavlock R, Ankley GT, Collette T, Francis E, Hammerstrom K, Fowle J, Tilson H, Toth G,
   Schmieder K, Veith GD, Weber E, Wolf E, Wolf DC, and Young D. Computational
   Toxicology: Framework, Partnerships, And Program Development. Reproductive Toxicology.
   19(3):265-280, (2005).

12. Kavlock R, and Cummings A. Mode of Action: Inhibition of Androgen Receptor Function-
   Vinclozolin-Induced Malformations in Reproductive Development. Critical Reviews in
   Toxicology, 35(8-9):721-726, (2005).

13. Kavlock, R. J. And A. M. Cummings. Mode of Action: Reduction of Testosterone Availability-
   Molinate-Induced Inhibition Of Spermatogenesis. Critical Reviews in Toxicology. CRC Press
   LLC, Boca Raton, FL, 35(8-9):685-690, (2005).

14. Lutz WK, Gaylor DW, Conolly RB, and Lutz RW. Nonlinearity and Thresholds in Dose-
   Response Relationships for Carcinogenicity Due To Sampling Variation, Logarithmic Dose
   Scaling, or Small Differences in Individual Susceptibility. Toxicol. Appl. Pharmacol. 207(2
   suppl):565-69. (2005).

15. Ostermeier GC, Goodrich RJ, Diamond MP, Dix DJ, Krawetz SA. Toward Using Stable
   Spermatozoal RNAs For Prognostic Assessment Of Male Factor Fertility. Fertility and
   Sterility, 83(6): 1687-94. (2005).

-------
16. Seed J, Carney EW, Corley RA, Crofton KM, Desesso JM, Foster PM, Kavlock RJ, Kimmel
   G, Klaunig J, Meek ME, Preston RJ, Slikker W Jr, Tabacova S, Williams GM, Wiltse J,
   Zoeller RT, Fenner-Crisp P, and Patton D. Overview: Using Mode of Action and Life Stage
   Information To Evaluate the Human Relevance of Animal Toxicity Data. Critical Reviews in
   Toxicology, 35(8-9):663-672, (2005).

17. Teeguarden JG, Waechter JM Jr., Clewell HJ 3rd, Covington TR, and Barton HA. Evaluation
   of Oral and Intravenous Route Pharmacokinetics, Plasma Protein Binding and Uterine
   Tissue Dose Metrics of BPA: A Physiologically Based Pharmacokinetic Approach.
   Toxicological Sciences, 85(2):823-838, (2005).

18. Teeguarden JG, Deisinger PJ, Poet TS, English JC, Faber WD, Barton HA, Corley RA, and
   Clewell HJ 3rd. Derivation Of A Human Equivalent Concentration For N-Butanol Using A
   Physiologically Based Pharmacokinetic Model For N-Butyl Acetate and Metabolites N-
   Butanol And  N-Butyric Acid. Toxicological Sciences. 85(1):429-446, (2005).

19. Tully DB, Luft JC, Rockett JC, Ren H, Schmid JE, Wood CR, Dix DJ. Reproductive And
   Genomic Effects In Testes From Mice Exposed To The Water Disinfectant Byproduct
   Bromochloroacetic Acid. Reproductive Toxicology 19(3):353-366.  (2005).

-------
BMC Bioinformatics
Research article

A comparison of machine learning algorithms for chemical toxicity
classification using a simulated multi-scale data model
Richard Judson*1, Fathi Elloumi1, R Woodrow Setzer1, Zhen Li2 and
Imran Shah1

Address: 1National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Research
Triangle Park, North Carolina 27711, USA and 2Dept of Biostatistics, University of North Carolina, Chapel Hill, 3126 McGavran-Greenberg Hall,
CB #7420, Chapel Hill, NC 27599-7420, USA
Email: Richard Judson* - judson.richard@epa.gov; Fathi Elloumi - elloumi.fathi@epa.gov; R Woodrow Setzer - setzer.woodrow@epa.gov;
Zhen Li - zli@bios.unc.edu; Imran Shah - shah.imran@epa.gov
* Corresponding author
Published: 19 May 2008
Received: 22 January 2008
Accepted: 19 May 2008
BMC Bioinformatics 2008, 9:241 doi:10.1186/1471-2105-9-241

This article is available from: http://www.biomedcentral.com/1471-2105/9/241

© 2008 Judson et al; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0),
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
            Abstract
            Background: Bioactivity profiling using high-throughput in vitro assays can reduce the cost and
            time required for toxicological screening of environmental chemicals and can also reduce the need
            for animal testing. Several public efforts are aimed at discovering patterns or classifiers in high-
            dimensional bioactivity space that predict tissue, organ or whole animal toxicological endpoints.
            Supervised machine learning is a powerful approach to discover combinatorial relationships in
            complex in vitro/in vivo datasets. We present a novel model to simulate complex chemical-
            toxicology data sets and use this model to evaluate the relative performance of different machine
            learning (ML) methods.
            Results:  The  classification  performance  of Artificial Neural  Networks (ANN), K-Nearest
            Neighbors (KNN), Linear Discriminant Analysis (LDA), Naive Bayes (NB), Recursive Partitioning
            and Regression Trees (RPART), and Support Vector Machines (SVM) in the presence and absence
            of filter-based feature selection was analyzed using K-way cross-validation testing and independent
            validation on simulated in vitro assay data sets with varying levels of model complexity, number of
            irrelevant features and measurement noise.  While the prediction accuracy of all ML methods
            decreased as non-causal (irrelevant) features were added, some ML methods performed better
            than others. In the limit of using a large number of features, ANN and SVM were always in the top
            performing set of methods while RPART and KNN (k = 5) were always in the poorest performing
            set. The addition of measurement noise and irrelevant features decreased the classification
            accuracy of all ML methods, with LDA suffering the greatest performance degradation. LDA
            performance is especially sensitive to the use of feature selection. Filter-based feature selection
            generally improved performance, most strikingly for LDA.
            Conclusion: We have developed a novel simulation model to evaluate machine learning methods
            for the analysis of data sets in which in vitro bioassay data is being used to predict in vivo chemical
            toxicology. From our analysis, we can recommend that several ML methods, most notably SVM and
            ANN, are good candidates for use in real world applications in this area.
-------
Background
A daunting challenge faced by environmental regulators
in the U.S. and other countries is the requirement that
they evaluate the potential toxicity of a large number of
unique chemicals that are currently in common use (in
the range of 10,000-30,000) but for which little toxicol-
ogy information is available. The time and cost required
for traditional toxicity testing approaches, coupled with
the desire to reduce animal use, are driving the search for
new toxicity prediction methods [1-3]. Several efforts are
starting to address this information gap by using relatively
inexpensive, high throughput  screening approaches in
order to link chemical and biological space [1,4-21]. The
U.S. EPA is carrying out one  such large screening and pri-
oritization experiment, called ToxCast, whose goal is to
develop predictive signatures or classifiers that can accu-
rately predict whether a given chemical will  or will  not
cause particular toxicities [4]. This program is investigat-
ing a variety of chemically-induced  toxicity endpoints
including developmental and reproductive toxicity, neu-
rotoxicity and cancer. The initial training set being used
comes from a collection of ~300 pesticide active ingredi-
ents for which complete rodent toxicology profiles have
been compiled. This set of chemicals will be tested in sev-
eral hundred in vitro assays.

The goal of screening  and prioritization projects is to dis-
cover patterns or signatures in the set of high throughput
in  vitro assays (high throughput screening or HTS, high
content screening or HCS, and genomics) that are strongly
correlated with tissue, organ or whole animal toxicologi-
cal endpoints. One begins with chemicals for which toxi-
cology data is available (training chemicals) and develops
and validates predictive classification tools.  Supervised
machine  learning (ML)  approaches  can be  used  to
develop empirical models  that accurately classify the tox-
icological endpoints from large-scale  in vitro assay data
sets. This approach is similar to QSAR (quantitative struc-
ture activity relationship), which uses inexpensive calcu-
lated chemical descriptors to classify a variety of chemical
phenotypes, including toxicity. By analogy, one could use
the term QBAR (for quantitative bio-assay/activity rela-
tionship) to describe the use  of in vitro biological assays to
predict chemical activity. The QBAR strategy we describe
here is also related to biomarker discovery from large-
scale -omic data that  is used to predict on- or off-target
pharmacology in drug development, or to discover accu-
rate surrogates for disease state or disease progression.

The QBAR in vitro toxicology prioritization approach faces
a number of inter-related  biological and computational
challenges. First, there may be multiple molecular targets
and mechanisms by which a chemical can trigger a biolog-
ical response. Assuming that these alternative biological
mechanisms of action are  represented in the data, multi-
    ple techniques (including ML methods) may be required
    to  discover the underlying relationships between bio-
    assays and endpoint activity. Second, our present under-
    standing  of biological mechanisms of toxicity (often
    referred to as toxicity pathways) is relatively limited, so
    that one cannot a priori determine which of a set of assays
    will be relevant to a given toxicity phenotype. As a conse-
    quence, the relevant features may be missing from the
    data set and (potentially many) irrelevant features may be
    included. Here, by relevant features we mean data from
    assays that measure processes causally linked to the end-
    point of interest. By extension, irrelevant features include
    data from assays not causally linked to the endpoint. The
    presence of multiple irrelevant assays or features must be
    effectively managed by ML methods. Third, due to the
    high cost of performing the required in vivo studies, there
    are limited numbers of chemicals for which high quality
    toxicology data is available, and typically only a small
    fraction of these will clearly demonstrate the toxic effect
    being studied. The small numbers of examples and unbal-
    anced distribution of positive and negative instances for a
    toxicological endpoint can limit the ability of ML meth-
    ods to accurately generalize. In order to develop effective
    QBAR models of toxicity, these issues must be considered
    in the ML strategy.

    Four critical issues for evaluating the performance of ML
    methods  on complex  datasets are:  (1) the data set or
    model; (2) the  set of  algorithms evaluated; (3)  the
    method that is used to assess the accuracy of the classifica-
    tion algorithm; and (4) the method that is used for feature
    selection. In order to address the first issue, it was neces-
    sary to develop a model of chemical toxicity that captured
    the key points of the information flow in a biological sys-
    tem. The mathematical model we use is based on the fol-
    lowing ideas.

    1. There are multiple biological steps connecting the ini-
    tial interaction of a molecule with its principal target(s)
    and the emergence of a toxic phenotype. The molecular
    interaction can trigger molecular pathways, which when
    activated may lead to the differential activation of more
    complex  cellular processes.  Once  enough cells  are
    affected, a tissue or organ level phenotype can emerge.

    2. There will often be multiple mechanisms that give rise
    to the same phenotype, and this multiplicity of causal
    mechanisms likely exists at all levels of biological organi-
    zation. Multiple molecular interactions can lead to a sin-
    gle pathway being differentially regulated. Up-regulation
    of multiple pathways can lead to the expression of the
    same cellular phenotype. This process continues through
    the  levels of tissue, organ and whole animal. One can
    think of the chain of causation between molecular triggers
and endpoints as a many-branched tree, potentially with
feedback from higher to lower levels of organization.

3. The number of assays one needs to measure is large,
given our relative lack of knowledge of the underlying
mechanism linking direct chemical interactions with toxic
endpoints.

4. The number of example chemicals for which detailed
toxicology information is available is relatively limited
due to the high cost of generating the data. In most cases,
if a chemical is known to be significantly toxic, further
development and testing is halted, so it is unusual to have
complete, multi-endpoint toxicity data on molecules that
are toxic for any given mode. A corollary is that the
number of positive examples for any given toxicity end-
point will be very limited, rarely making up more than
10% of all cases. This will limit the power to find true
associations between assays and endpoints. A related issue
is that most publicly available data sets that one can use
for toxicology modeling are heavily biased toward posi-
tive or toxic chemicals, because much less public effort is
put into performing extensive studies on chemicals that
are negative examples. The ToxCast data set is addressing
this selection bias by gathering complete data from a set
of chemicals without regard to their ultimate toxicity.

5. The available toxicity endpoint data tends to be categor-
ical rather than quantitative. This is due to the nature of
the in vivo experiments used to evaluate chemical toxicity.
Typically, too few animals are tested under any given con-
dition to pinpoint the lowest effective dose or the rate of
phenotypic toxicity at a particular dose. Instead, if a toxic
effect is seen  at a rate statistically above that seen with a
negative control, the chemical will be classified as causing
that toxicity.

We have developed a simple simulation model which
takes into account these ideas. Here we motivate the struc-
ture of the model, while the Methods and Results sections
provide details. We will illustrate the ideas behind our
model with the multiple known pathways that can lead to
rodent liver tumors. Several nuclear receptors, including
CAR (constitutive androstane receptor), PXR (pregnane-X
receptor)  and AHR  (aryl hydrocarbon receptor), when
activated by a xenobiotic, can upregulate a common set of
Phase I, Phase II and Phase III metabolizing enzyme path-
ways [22-24]. Each of these pathways can, when continu-
ally activated, lead to cellular phenotypes that include cell
proliferation, hypertrophy and cell death. A second, paral-
lel route is activated by compounds that bind to PPARα
(peroxisome proliferator-activated receptor α) and lead to
cellular hypertrophy and cellular proliferation [24,25]. In
a third mechanism, chemicals can directly interact with
DNA, causing the activation of DNA damage repair path-
    ways, which can in turn lead to cell death and cellular pro-
    liferation.  All  three  of these cellular  phenotypes  are
    potential precursors to liver tumors [26]. This collection
    of interconnected direct molecular targets, target-induced
    pathways, cellular or tissue phenotypes, and their connec-
    tions to the endpoint of liver tumors are illustrated in Fig-
    ure 1.

    Our model also assumes that a given chemical can interact
    with  multiple  molecular targets. It is well known that
    many drug compounds interact with multiple targets, as
    reflected in the phenomenon of off-target toxicity. Rele-
    vant  to the pathways shown in Figure  1, Moore et al
    showed that there  are compounds that simultaneously
    activate both CAR and  PXR pathways [27]. Preliminary
    data from the ToxCast program allows us to quantify the
    magnitude of this multi-target effect.  From a set of 183
    biochemical targets (primarily receptors, enzymes and ion
    channels), the 320 ToxCast chemicals [28] (mostly pesti-
    cides) were active against an average of 4.2 targets with a
    maximum of 35, a minimum of 0 and a standard devia-
    tion of 5.8.

    The connections shown in Figure 1 are not deterministic
    but instead depend on multiple factors including  the
    strength and duration of the initial chemical-target inter-
    action. Some pathways are more likely than  others to lead
    to the manifestation of particular cellular processes, and
    some cellular processes are more likely than others to lead
    to liver tumors. Based on this, one could assign a proba-
    bility or strength to each arrow in Figure 1. The probabil-
    ity that a given chemical will cause liver  tumors is then a
    complex integral over the individual step-to-step proba-
    bilities,  modulated by the target interaction strengths for
    the particular chemical.
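
As a minimal sketch of how such step-to-step probabilities could combine, the following Python fragment assumes (as an illustrative simplification, not the model used in this paper) that each arrow fires independently, that the probability of one route is the product of its arrows, and that alternative routes combine as a noisy-OR. The function names and all numeric values are hypothetical.

```python
from math import prod


def path_probability(step_probs):
    """Probability that every step along one causal route fires.

    Assumes the steps are independent (an illustrative simplification).
    """
    return prod(step_probs)


def endpoint_probability(paths):
    """Noisy-OR: the endpoint appears if at least one route completes.

    Assumes the routes are independent of one another.
    """
    p_no_route_completes = prod(1.0 - path_probability(p) for p in paths)
    return 1.0 - p_no_route_completes


# Hypothetical arrow strengths for two routes:
# target -> pathway, pathway -> cellular process, cellular process -> endpoint
example_paths = [
    [0.8, 0.5, 0.2],
    [0.3, 0.6, 0.4],
]

print(endpoint_probability(example_paths))
```

Routes that share a node are not truly independent, so for a network like the one in Figure 1 the noisy-OR is only an approximation of the full calculation described above.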

    There is a vast literature on the evaluation of the perform-
    ance of different ML methods, but for the present applica-
    tion the literature on the analysis of microarray
    genomics data sets and on QSAR applications is most
    relevant. Here we describe a pair of representative studies.
    Ancona et al. [29] used three algorithms (Weighted Voting
    Algorithm (WVM), Regularized Least Squares (RLS), Sup-
    port Vector Machine (SVM)) to classify  microarray sam-
    ples   as either  tumor or normal. They examined  the
    number of training examples that would be required to
    find a robust classifier.  In their example, SVM and RLS
    outperformed WVM.  Statnikov et al.  studied all  of the
    major classification issues in the context of multi-category
    classification using microarray data in cancer diagnosis
    [30].  They compared multi-category SVM (MC-SVM), k-
    nearest neighbors (KNN) and several artificial neural net-
    work (ANN) implementations and showed that MC-SVM
    was  far superior to the other algorithms they  tested in
    their application.

-------
[Figure 1 image not reproduced. Recoverable labels: Nuclear DNA; Direct Molecular Targets; Molecular Pathways / Processes; Cellular / Tissue Processes; Tissue / Organ Endpoint.]
Figure 1
Connections between molecular targets, pathways, cellular processes and endpoints. This is illustrated for 5 molecular targets
(nuclear DNA, and the nuclear receptors CAR, PXR, AHR and PPARα), three molecular pathways, and three cellular pheno-
types, with liver tumors being the final endpoint. The connections have differing strengths or probabilities and are modulated
by the collection of interactions of a given chemical with the molecular targets.
The literature on machine learning methods in QSAR is
equally vast and extends back for 15 years or more. Much
of this work (like much of QSAR in general) is focused on
the (relatively easy) task of predicting activity  against
molecular targets.  A representative approach to target
interaction prediction is the paper by Burbridge et al. com-
paring SVM to several other algorithms for the prediction
of binding to dihydrofolate reductase [31]. Lepp et al per-
formed a similar study  that showed SVM performed well
in finding predictive QSAR models  for a  series of 21
molecular targets [32]. The recent state of the science for
predicting whole animal toxicity using ML and QSAR
methods was reviewed by Helma and Kramer [33],
Benigni and Giuliani [34] and by Toivonen et al. [35].
They describe the outcome of an experiment (the Predic-
tive Toxicology Challenge) in which 17 groups submitted
111 models using a training set of 509 NTP compounds
for which mouse carcinogenicity data was available. The
goal was to predict the carcinogenicity of a set of 185 test
compounds. Only 5 of the 111 models performed better
than random guessing and the highest positive predictive
value for these was 55%, and this model had a false posi-
tive rate of 37%. These 5 models[36] include rule-based
methods using chemical fragments plus calculated physi-
cochemical properties,  a decision tree  model, and  one
using a voting scheme  across several standard ML meth-
    ods. It is difficult to draw many conclusions about the per-
    formance of ML methods from this exercise, which failed
    to produce significantly predictive methods. The authors
    of these reviews speculate that the cause is a combination
    of toxicity data being too noisy, the training  and test
    chemical spaces being too large,  and  structure  based
    approaches being inadequate to predict phenotypes as
    complex as whole animal toxicity.

    One of the key issues in systematically comparing the per-
    formance of ML methods is that of estimating accuracy in
    an unbiased way. For example, Ntzani and Ioannidis [37]
    report that many of the early studies using microarray data
    to classify tumor samples  did not perform appropriate
    cross validation, which has led to inflated predictions of
    classification accuracy. This observation prompted our
    use of independent validation sets.  Molinaro  et  al.
    showed that 10-fold cross validation performed well  for
    assessing accuracy of genomics classifiers [38]. Leave one
    out cross-validation (LOOCV) typically performed some-
    what better, but had a significantly higher computational
    cost. This was assessed by Molinaro et al.  in the context of
    using linear discriminant analysis (LDA), ANN, diagonal
    discriminant classifiers (DDA),  classification and regres-
    sion trees (CART) and ensemble classifiers. The Molinaro
    study data set (300 samples and 750 independent variables),
which used simulated genomics data, was similar in
size to the present work. Baldi, et al. [39] have systemati-
cally addressed the issue of ML performance metrics. They
describe a number of accuracy metrics including the bal-
anced accuracy or Q-score we use in this paper. The Q-
score is the average of the sensitivity and specificity. This
is most useful in the case where the classification variable
is dichotomous and where the number of positive and
negative cases in a training set is not well balanced. They
also  emphasize that the actual prediction accuracy is
related to the similarity of the training and test set.
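
The Q-score described above can be written as Q = (sensitivity + specificity) / 2. The short Python sketch below, with hypothetical function and variable names, computes it from 0/1 labels and illustrates why it is more informative than raw accuracy when positives make up only 10% of cases. It assumes that both classes are present in the true labels.

```python
def q_score(y_true, y_pred):
    """Balanced accuracy: the mean of sensitivity and specificity.

    Labels are 1 (positive) or 0 (negative). Assumes y_true contains
    at least one positive and one negative case.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def accuracy(y_true, y_pred):
    """Fraction of cases classified correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical data: 10 positives, 90 negatives, and a trivial
# classifier that calls every chemical negative.
y_true = [1] * 10 + [0] * 90
always_negative = [0] * 100

print(accuracy(y_true, always_negative))  # 0.9
print(q_score(y_true, always_negative))   # 0.5
```

The trivial classifier looks strong by accuracy but scores no better than chance on the Q-score, which is the behavior one wants from a metric on unbalanced data.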

Finally, Sima and Dougherty examined the issue of find-
ing an optimal subset of features with which  to train a
classification algorithm [40]. They compare sequential
floating forward search (SFFS) [41] and T-test feature selec-
tion. This latter can fail when variables are only predictive
when they act together. These authors' basic conclusion is
that  there are optimal subsets of features, but that poor
classification performance can be due to either a failure to
find an optimal subset or to the inability of any subset to
allow accurate classification. This study examined SVM,
KNN (n = 3)  and LDA as classification algorithms. These
authors suggest that automated feature selection methods
have inherent limitations and that one should use biolog-
ically-based selection when possible. Baker and Kramer
used the nearest centroids rule to select small subsets of
genes that could be used as robust classifiers from genom-
ics data sets [42]. Kohavi assessed the behavior of cross
validation methods to assess classifier accuracy for the
C4.5 and Naive Bayes algorithms [43]. This author con-
cludes  that k-fold cross validation with k = 10 provides a
good estimate of classification accuracy balanced against
modest computational requirements.
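
The failure of univariate filters when variables are only predictive together can be illustrated with a toy data set. The sketch below is an illustrative assumption rather than an analysis from any of the cited studies: two binary features determine the label through an exclusive-or, so each feature alone has a t statistic of zero while the pair classifies perfectly. Welch's form of the t statistic is used, and all names are hypothetical.

```python
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic (unequal variances)."""
    standard_error = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / standard_error


# Balanced toy data: 100 copies of each (x1, x2) combination.
# The label is x1 XOR x2, so neither feature is informative on its own.
rows = [(x1, x2, x1 ^ x2)
        for x1 in (0, 1) for x2 in (0, 1) for _ in range(100)]


def univariate_t_scores(data):
    """t statistic of each feature between the positive and negative class."""
    scores = {}
    for j, name in enumerate(("x1", "x2")):
        positives = [row[j] for row in data if row[2] == 1]
        negatives = [row[j] for row in data if row[2] == 0]
        scores[name] = welch_t(positives, negatives)
    return scores


scores = univariate_t_scores(rows)

# A rule that uses both features together recovers the label exactly.
joint_accuracy = mean(1 if (x1 ^ x2) == y else 0 for x1, x2, y in rows)

print(scores, joint_accuracy)
```

A filter that ranks features by |t| would discard both x1 and x2 here, even though together they carry all of the information about the label.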

In summary, the goal of the analyses we present is to eval-
uate a machine learning approach to develop classifiers of
in vivo toxicity using in vitro assay data. In order to develop
an appropriate ML strategy, we generate simulated QBAR
data using a  mathematical model whose structure and
parameters are  motivated  by an idealized biological
response to chemical exposure based on  the  following
concepts: (a)  chemicals interact with multiple molecular
targets; (b) exposure to chemicals can stimulate multiple
pathways that lead to the same toxicological  endpoint;
and (c) there are multiple levels of biological organization
between the direct molecular interaction and the "apical"
endpoint. Additional parameters for generating simulated
data include model complexity, the level of noise in the
features, the number of chemicals to be screened and the
number of irrelevant features. We focus on the special case
where there is a large imbalance between the fraction of
positive and negative examples, which is found to  be the
case from our toxicological data [44]. The performance of
ML methods is analyzed as a function of these parameters.
    Results
    We evaluated the performance of different ML methods
    on simulated data sets generated by a biologically moti-
    vated analytic model. Data sets were simulated based on
    two levels of complexity; the number of irrelevant assays
    or input features in the data (data not causally connected
    with the endpoint being predicted); the number of chem-
    icals or instances; and the presence or absence of measure-
    ment noise in the data.  In all cases, all of the relevant
    features (causal for the endpoint being predicted) were
    included in the data set.

    The network depictions of the simulation models S1 (less
    complex) and S2 (more complex) are shown in Figures
    2 and 3. These networks closely resemble the one shown in
    Figure 1, which models the connections leading from direct
    molecular interactions with DNA and a variety  of nuclear
    receptors and to liver tumors. Structurally, the simulation
    models are feed-forward networks that causally  link direct
    molecular interactions (M-nodes) with a final organism-
    level toxicity endpoint, by way of two levels of intervening
    biological processes. Direct molecular interactions trigger
    pathway processes (P-nodes), which in turn trigger cellular
    processes (C-nodes). Only if the cellular processes are acti-
    vated to a sufficient level is the final endpoint manifested.
    Of equal importance is the fact that many assays will be
    measured that are not causally  linked to the  endpoint.
    These irrelevant nodes are termed R-nodes for random. Our
    simulations typically include many more R than M nodes
    or  features.  Rules  for  linking molecular interaction
    strengths to the endpoint are described in the Methods sec-
    tion. The essential points for the present discussion are that
    a given chemical can interact with one or more input nodes
    and that the spectrum of input interactions uniquely deter-
    mines the value of the endpoint.

    The performance of LDA (Linear Discriminant  Analysis),
    KNN  (k-Nearest  Neighbors),  SVM   (Support  Vector
    Machines), ANN (Artificial Neural Networks), NB (Naive
    Bayes) and RPART (Recursive Partitioning and Regression
    Trees) was evaluated both with and without filter-based
    feature selection, using 10-way cross-validation testing, as
    well as validation  with independent data sets  which
    included 300 instances. For each set of conditions (ML
    method, model, number of features, number  of chemi-
    cals, inclusion of measurement noise, and the presence or
    absence of filter-based feature selection), training was car-
    ried out on 10 independent samples drawn from a simu-
    lated data  set of 10,000 chemicals. For all evaluations,
    10% of the chemicals were positive and 90% were nega-
    tive for the endpoint being predicted. As mentioned pre-
    viously, this imbalance between positive and negative
    examples reflects the situation with the data sets we  are
    modeling in which the adverse phenotypes being studied
    are rare. Predicted performance was evaluated using K-fold
    cross-validation with K = 10. For each of the 10 samples,
    we recorded the number of true positives (TP), false
    positives (FP), true negatives (TN) and false negatives
    (FN), sensitivity and specificity and the balanced accuracy
    or Q-score, which is the average of the sensitivity and
    specificity. To independently test the performance of the ML
    method, an independent validation set was drawn from
    the simulated data set and evaluated with the classification
    models for each of the 10 training sets. The results
    (TP, FP, TN, FN, sensitivity, specificity, Q-score) from
    these 10 data sets were also saved. The approach is
    outlined in Figure 4.

BMC Bioinformatics 2008, 9:241  http://www.biomedcentral.com/1471-2105/9/241

Figure 2
Model S1. The "M" nodes represent assays that measure direct molecular interactions with a chemical. These interactions can
activate pathways ("P" nodes) which can in turn activate cellular processes ("C" nodes). Finally, the activation of cellular
processes can lead to the presence of an organ or organism-level endpoint. For Model S1, an additional 300 random or "R" nodes
were included in the input set of features, so that a total of 308 features are examined. Numerical values shown along edges
are values of w_ik used in Equation 1.

Figure 3
Model S2. All symbols are as described in Figure 1. There are a total of 24 "M" nodes plus 300 "R" nodes for a total of 324
features to be examined.

[Figure 4 diagram: Biological Model -> Simulated Data -> Cross Validation Sample (K = 10 splits, mean Q for the 10 splits) and Independent Validation Sample; 10 iterations per ML method and condition set give the mean and SD of Q(X-val) and Q(I-val).]
Figure 4
Schematic view of the learning method employed. A large simulated data set is created from the model. From this large data
pool, multiple independent samples are drawn and either used for cross validation training and validation (X-val) (left hand
branch) or independent model validation (I-val) (right hand branch). For cross validation training, we use standard K-fold cross
validation with K = 10. The cross validation performance is the average of the 10 partitions. The classification model ("fit") used
in the right hand, independent validation branch is constructed using the entire data set for the left hand branch. For each
classifier and each set of conditions, a total of 10 samples are drawn for the cross validation and 10 for the independent validation
processes. From this collection of results, we derive means and standard deviations for the balanced accuracy or Q-score.

The overall performance results of the different ML methods
for the independent validation tests are shown in Figure 5.
All results in this figure are calculated using Model
S1 (Figure 2) for the case where the training and validation
sets contained 300 chemicals or instances. Each panel
shows the Q-score trend for the ML methods as a function
of the number of features included. Horizontal lines are
drawn at Q = 0.9, which is a point that guarantees at least
80% sensitivity and specificity, and at Q = 0.5, which
occurs when sensitivity = 0 (all cases are predicted to be
negative for the endpoint). The far left point is the case
where only the causal features are used. Error bars (± 1
SD) are given for the LDA results to provide an estimate of
the level of variation. The other methods showed similar
levels of variation. The figure shows the Q-score curves as
a function of increasing number of irrelevant input features
in four blocks. In each block, each curve shows the
Q-score for one ML method beginning with just the causal
features (N_feature = 8) and then increasing the number of
irrelevant features until N_feature = 308. In the first block, the
curves generally show a decrease in performance going
from N_feature = 8 to N_feature = 308, which means that the
accuracy of all learning methods generally decreased as
irrelevant features were added.
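
The two reference lines follow directly from the definition of the Q-score. The sketch below is illustrative only (the function names are hypothetical and do not come from the paper). It computes sensitivity, specificity and Q from confusion-matrix counts, and shows that, because neither component can exceed 1, Q = 0.9 implies both are at least 0.8, while a classifier that labels everything negative scores Q = 0.5.

```python
def q_score(tp, fp, tn, fn):
    """Return (sensitivity, specificity, Q), where Q is their average."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity, (sensitivity + specificity) / 2.0


def min_component_given_q(q):
    """Smallest sensitivity (or specificity) compatible with a given Q.

    Q = (sens + spec) / 2 and each component is at most 1, so the
    other component must be at least 2 * Q - 1.
    """
    return max(0.0, 2.0 * q - 1.0)


# 300 chemicals with 10% positives: 30 positive, 270 negative.
# A classifier that predicts "negative" for every chemical:
all_negative = q_score(tp=0, fp=0, tn=270, fn=30)

# Lower bound on sensitivity and specificity at the Q = 0.9 line:
bound_at_09 = min_component_given_q(0.9)
```

Here `all_negative` has sensitivity 0, specificity 1 and Q = 0.5, and `bound_at_09` is 0.8 up to floating-point rounding.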

[Figure 5 (plot not recoverable): panel title "Independent Validation"]

The response of different ML methods to the addition of
noise varied: LDA and ANN performed the best with only
causal features, while with the maximum number of irrelevant
features ANN, NB and SVM performed the best and
LDA the worst, at least in the absence of feature selection.
With the exception of LDA, the performance of different
ML methods stabilized after around 100 irrelevant features.
With the maximum number of irrelevant features
the classification accuracy of KNN and RPART was intermediate
between that of the highest group (ANN, SVM,
NB) and the lowest (LDA).

The second block from the left shows the classification
accuracy of the ML methods without feature selection but
with the addition of measurement noise. With no irrelevant
features the classification accuracy of all ML methods
was significantly lower than in the absence of noise, as
expected. LDA showed the same maximum negative performance
trend with the addition of irrelevant features.
The main difference from the previous case (no noise) was
that the performance of KNN (k = 3) was close to that of
ANN, NB and SVM as the number of irrelevant features
increased. As before, RPART and KNN (k = 5) did not perform
well. In general, the classification performance of
LDA degraded the most with addition of noise while other
methods remained more stable.

The third block from the left shows the classification accuracy
of the ML methods with filter-based feature selection
(T-test) in the absence of noise. Comparing the performance
of the ML methods with the first block (no noise, no
feature selection), most ML methods performed better
with feature selection but their overall ranking was the
same. The exception was LDA, which showed the greatest
improvement in performance, tied with SVM and ANN
with the greatest Q-score. Feature selection also decreased
the overall variability in classification performance
between the different ML methods.

The fourth and final block represents the performance
results for the ML methods with noise and the use of T-test
feature selection. Compared with block 2, where feature
selection was  not  used,  the  performance of most ML
methods increases slightly.  LDA showed a significant
increase in performance. Compared with block 3, the per-
formance of all techniques was significantly lower when
irrelevant features were added. Overall, LDA, NB, SVM,
ANN and KNN (k = 3) were quite stable, i.e. their performance
did not vary tremendously with the addition of noise
and irrelevant features.

An alternate way to examine the data is to fix the number
of features and look at trends as a function of number of
chemicals sampled. These curves (not shown) display the
expected trends that as the number of chemicals increases,
there is a corresponding improvement in performance.
The effects of the variant conditions are basically the same
as has already been shown.

Table 1 summarizes the results for both models S1 and S2
for the limiting case where all 300 irrelevant features are
included. For all results, 300 chemicals were  used. The
table is organized into 4 blocks, the same as in Figure 5,
but the rows within each block are sorted by decreasing
values of Q-score. Values of sensitivity, specificity or Q-
score > 0.8 are bolded. Rows shaded in gray have Q-score
values less than the best Q-score in that block minus one
standard deviation for the best performing method. From
this table, one can see that specificity is typically high and
that sensitivity is typically low. With a small number of
positive cases, a safe classification  scheme is to assume
that most cases will be negative. The ML methods chiefly
differ by their ability to correctly predict the positive cases,
which is reflected in the sensitivity. In all cases, KNN (k =
5) and RPART perform poorly relative to the best ML
method. In the absence of feature selection, LDA also per-
forms poorly. SVM and ANN are always among the best
performers. NB and KNN (k = 3) are intermediate in per-
formance  robustness (i.e. relative lack of sensitivity to
added noise and number of irrelevant features). The
trends for model S2 are not significantly different from
those for the simpler model S1. The addition of measurement
noise significantly degraded the performance of all
ML methods, and this degradation  is mainly reflected in
    poorer sensitivity, i.e. the ability to correctly predict posi-
    tive cases.

    Discussion
    Developing predictive classifiers for complex biological
    data sets is a challenging problem because there are gener-
    ally more features than instances (curse of dimensional-
    ity);  the classification variable and  input features are
    noisy;  and there are many irrelevant features (i.e.  ones
    that are measured but which have no causal connection to
    the value of the classification variable). We have devel-
    oped a test bed for representing biologically motivated
    models and have used it to provide insight into the rela-
    tive classification performance of different ML methods.
    Though true in vitro biological systems are more complex
    and  dynamic  than our model,  our approach provides
    empirical insight into the relative performance of different
    learning methods as a function of the absence and pres-
    ence of experimental noise and the number of features. In
    particular, we have focused on the situation which is com-
    mon in toxicology data sets, namely where there is an
    imbalance between the number of positive  and negative
    examples.

    We find several main trends from our simulated data by
    systematically analyzing different ML  methods  on the
    same testing, training and validation data. First, most ML
    methods perform well in the presence of a small number
    of causal features, but most show significant degradation
    in performance as irrelevant features are added, which is
    well-known [45]. Second, all ML methods perform better
    with filter-based feature selection as irrelevant features are
    added. Third, the performance depends upon noise in the
    input features. While most ML methods perform well in
    the absence of noise, some are more stable than others.
    Fourth, in the presence  of noisy and irrelevant features,
    and  with feature selection, most ML methods perform
    similarly, with the exceptions of RPART and KNN (k = 5)
    which performed significantly worse. The models (Figures
    2 and 3) resemble generalized artificial neural networks,
    leading one to suspect that ANN methods should perform
    well. In general this is true, although (see Figure 5) other
    methods always performed at least as well.

    We found that the accuracy  predicted using k-fold cross
    validation was statistically indistinguishable  from that
    seen with an independent validation set except in the case
    of KNN (k = 3 or 5) with no feature selection. In this case,
    the k-fold cross validation predicted a higher accuracy
    than was  seen with independent validation. This is the
    only situation where we detected over-fitting using the
    training data.  This phenomenon disappeared when we
    tested  KNN against a more  balanced data  set in which
    there were equal numbers of positive and negative examples.
    All other parameters were unchanged. Issues arising
    from unbalanced data sets have been previously analyzed.
    Japkowicz et al. found that classifier performance of
    imbalanced datasets depends on the degree of class imbalance,
    the complexity of the data, the overall size of the
    training set and the classifier involved [46]. Sun et al. also
    observed that given a fixed degree of imbalance, the sample
    size plays a crucial role in determining the "goodness"
    of a classification model [47]. The KNN method is sensitive
    to imbalanced training data [48,49], and the class distribution
    of our simulation data is highly skewed with the
    positive to negative rate of 1:9, thus the sample size very
    likely explains the different performance between the
    training and validation sets.

Table 1: Performance (mean and SD) of the ML methods.

Model  Learner    Noise  Feature Selection  Sens  SD(Sens)  Spec  SD(Spec)  Q     SD(Q)
S1     ANN        0      None               0.71  0.12      0.96  0.038     0.84  0.068
S1     SVM        0      None               0.68  0.076     0.99  0.0063    0.83  0.039
S2     SVM        0      None               0.66  0.086     0.99  0.0095    0.82  0.041
S2     NB         0      None               0.63  0.13      0.98  0.0072    0.81  0.063
S1     NB         0      None               0.62  0.096     0.99  0.0049    0.8   0.047
S1     KNN(k=3)   0      None               0.58  0.13      0.98  0.013     0.78  0.067
S2     ANN        0      None               0.56  0.19      0.98  0.008     0.77  0.091
S1     CART       0      None               0.56  0.13      0.96  0.018     0.76  0.065
S2     KNN(k=3)   0      None               0.47  0.16      0.98  0.014     0.73  0.077
S2     LDA        0      None               0.7   0.11      0.76  0.051     0.73  0.051
S1     KNN(k=5)   0      None               0.45  0.13      0.98  0.014     0.72  0.064
S1     LDA        0      None               0.66  0.13      0.69  0.048     0.67  0.079
S2     KNN(k=5)   0      None               0.34  0.1       0.99  0.016     0.66  0.049
S2     CART       0      None               NA    NA        NA    NA        NA    NA

S1     SVM        0      T-test             0.75  0.089     0.98  0.011     0.87  0.043
S1     LDA        0      T-test             0.74  0.11      0.99  0.0057    0.86  0.052
S1     ANN        0      T-test             0.73  0.13      0.97  0.023     0.85  0.055
S2     LDA        0      T-test             0.67  0.11      0.98  0.009     0.83  0.054
S2     ANN        0      T-test             0.7   0.11      0.96  0.013     0.83  0.051
S2     NB         0      T-test             0.65  0.13      0.98  0.014     0.81  0.064
S2     SVM        0      T-test             0.64  0.11      0.98  0.0074    0.81  0.057
S1     NB         0      T-test             0.61  0.078     0.97  0.012     0.79  0.039
S1     KNN(k=3)   0      T-test             0.55  0.16      0.99  0.008     0.77  0.077
S2     KNN(k=3)   0      T-test             0.52  0.17      0.99  0.0044    0.76  0.082
S2     CART       0      T-test             0.58  0.12      0.94  0.025     0.76  0.061
S1     CART       0      T-test             0.54  0.17      0.96  0.038     0.75  0.07
S1     KNN(k=5)   0      T-test             0.45  0.12      1     0.0035    0.73  0.062
S2     KNN(k=5)   0      T-test             0.37  0.13      0.99  0.0056    0.68  0.066

S1     KNN(k=3)   2      None               0.54  0.11      0.98  0.011     0.76  0.057
S1     ANN        2      None               0.53  0.13      0.97  0.019     0.75  0.066
S1     NB         2      None               0.48  0.13      0.99  0.0054    0.74  0.067
S2     KNN(k=3)   2      None               0.49  0.13      0.98  0.0091    0.74  0.065
S2     ANN        2      None               0.44  0.11      0.98  0.016     0.71  0.049
S2     NB         2      None               0.43  0.062     0.99  0.0056    0.71  0.029
S1     SVM        2      None               0.41  0.09      1     0.0031    0.7   0.045
S1     CART       2      None               0.4   0.12      0.94  0.041     0.67  0.053
S2     KNN(k=5)   2      None               0.33  0.087     0.99  0.0066    0.66  0.043
S2     CART       2      None               0.34  0.17      0.96  0.03      0.65  0.08
S1     KNN(k=5)   2      None               0.3   0.11      0.99  0.0085    0.65  0.054
S2     SVM        2      None               0.3   0.079     1     0.0039    0.65  0.039
S2     LDA        2      None               0.59  0.069     0.72  0.05      0.65  0.038
S1     LDA        2      None               0.6   0.12      0.64  0.037     0.62  0.068

S1     LDA        2      T-test             0.55  0.11      0.98  0.0078    0.77  0.055
S1     SVM        2      T-test             0.5   0.11      0.99  0.004     0.75  0.053
S2     NB         2      T-test             0.53  0.084     0.98  0.01      0.75  0.042
S1     NB         2      T-test             0.52  0.083     0.98  0.0088    0.75  0.042
S2     LDA        2      T-test             0.52  0.087     0.97  0.011     0.75  0.042
S2     SVM        2      T-test             0.48  0.1       0.99  0.0056    0.74  0.052
S1     ANN        2      T-test             0.51  0.1       0.97  0.015     0.74  0.045
S2     KNN(k=3)   2      T-test             0.48  0.14      0.99  0.0083    0.73  0.07
S1     KNN(k=3)   2      T-test             0.48  0.1       0.98  0.01      0.73  0.051
S2     ANN        2      T-test             0.48  0.1       0.96  0.01      0.72  0.049
S1     KNN(k=5)   2      T-test             0.36  0.095     1     0.0049    0.68  0.047
S2     KNN(k=5)   2      T-test             0.32  0.047     0.99  0.0036    0.66  0.023
S1     CART       2      T-test             0.35  0.14      0.95  0.023     0.65  0.071
S2     CART       2      T-test             0.25  0.093     0.96  0.027     0.6   0.044

 This data is compiled for the special case where 300 chemicals were used, as a function of model, feature selection and level of measurement noise. The
 results are organized into 4 blocks, corresponding to the 4 blocks in Figure 5. Within a block, rows are ordered by decreasing values of Q-Score. The
 results give the average sensitivity, specificity and Q-score along with their corresponding standard deviations. All ML methods were trained using 300
 chemicals. The values come from 10 independent validation runs with unique samples of 300 chemicals. Values of sensitivity, specificity and Q-score >
 0.8 are bolded. Rows where the Q-score is less than that of the best Q-score in the block minus one standard deviation for the best row are shaded.

[Figure 6 plot: panels "S1 - Deterministic Assays" and "S2 - Deterministic Assays"; x-axis: Component 1]
Figure 6
Distribution of chemicals in feature/assay space for models S1
and S2. The data is projected into 2 dimensions using
multidimensional scaling. Chemicals that are negative for the
endpoint are indicated by black circles, and chemicals positive for
the endpoint are represented by red crosses. For ease of
visualization only a randomly selected set of 500 chemicals are
shown.

One of the important limitations of this work is that the
performance of classifiers is biased by our model of chem-
ical-induced bioactivity and toxicity. We assume a static
deterministic model of a biological system without feed-
back. An important aspect of the  chemical simulation
model  is the use of multiple chemical classes, each  of
which contains a collection of chemicals that behave sim-
ilarly (as measured  by their  molecular interaction spec-
trum).  As  described in the methods section, a chemical
[...]
ing, all-positive clusters is very obvious when all features
are included.

We have focused on the performance of single classifiers,
but voting methods  which combine the predictions of
multiple individual methods have been used. Statnikov et
al. studied ensemble  classifiers or voting schemes, which
attempt to combine  multiple sub-optimal classifiers to
improve overall accuracy. That paper evaluated the utility
of selecting very small subsets of genes (as few as 25 out
of > 15,000) for classification. This has the effect of greatly
reducing the danger of over-fitting from small numbers of
samples. Additionally, these authors demonstrated how
to evaluate the comparative performance of different algo-
rithms using permutation testing. Two conclusions  from
the Statnikov et al. work on cancer diagnosis using micro-
array  data are relevant to the present study.  First, they
observe that SVM methods outperformed KNN and ANN.
Our findings show that the relative rankings of these 3
methods are a complex function of the number of irrelevant
features, the level of noise and the use (or not) of feature
selection. Second, the authors observed that the
ensemble  classification methods tended to do worse than
single methods. Although we did not evaluate  the per-
formance  of ensemble based classification, our results
(Table 1 or Figure 5) do not suggest that voting would lead
to a decrease in performance, as long as the voting rule
was that the chemical was labeled positive if any method
predicted it to be positive.
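
The any-positive voting rule mentioned above can be sketched as follows. This is a minimal illustration with made-up predictions, not an evaluation performed in this work, and `any_positive_vote` is a hypothetical helper name.

```python
def any_positive_vote(predictions):
    """Combine per-classifier 0/1 predictions with an any-positive rule.

    predictions: list of equal-length lists, one list per classifier.
    A chemical is labeled positive (1) if at least one classifier
    predicts it to be positive, and negative (0) otherwise.
    """
    return [1 if any(votes) else 0 for votes in zip(*predictions)]


# Made-up predictions from three classifiers for four chemicals.
svm_pred = [1, 0, 0, 0]
ann_pred = [0, 0, 1, 0]
knn_pred = [0, 0, 0, 0]

combined = any_positive_vote([svm_pred, ann_pred, knn_pred])
```

Because the combined positive set is the union of the individual positive sets, this rule cannot lower sensitivity relative to any single member, but it also cannot raise specificity.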

The present work limited the number of chemicals to 300
and features to 300, which corresponds to the number of
chemicals and assays  we are using in the first phase of the
ToxCast program. Despite  the relatively small  size of the
data set, we were able to evaluate key issues in supervised
learning from  noisy and irrelevant data.  We  plan  to
expand the number of features and instances in future
work  as we gain  additional insights from experimental
data. Additionally, we intend to more fully explore the use
of dimensionality reduction (e.g. through correlation
analysis of closely related features), feature selection and
classifier ensembles in future work.

Conclusion
The prediction of chemical toxicity is a significant  chal-
lenge in both the environmental and drug development
arenas.  Gold standard in vivo toxicology experiments in
rodents and other species are very expensive and often do
not directly provide  mechanism of action information.
The alternative, which has been widely pursued in the
pharmaceutical industry, is to screen compounds using
in vitro or cell based assays and to use the results of
these  assays to prioritize compounds for further efficacy
and safety testing. These in vitro screening techniques are
now being introduced in a significant way into the world
    of environmental chemical safety assessment. Here, there
    are unique challenges due to the modest amount of in vivo
    toxicology data that can be used to  develop  screening
    models, and due to the broad chemical space covered by
    environmental chemicals whose  toxicology  is  poorly
    characterized. The EPA is carrying out a significant screen-
    ing and prioritization program called ToxCast,  whose
    eventual aim is to screen a large fraction of the commonly
    used environmental chemicals and to prioritize a  subset
    of these for more detailed testing. The present analysis
    provides a novel simulation model of the linkage between
    direct chemical-target interactions and toxicity endpoints,
    and uses this model to develop guidelines for using ML
    algorithms to discover significant associations between in
    vitro screening data and in vivo toxicology.

    We find several main trends from our simulated data set
    by systematically analyzing different ML methods  on the
    same testing, training and validation data. First, most ML
    methods perform well in the presence of a small number
    of causal features, but most show significant degradation
    in performance as irrelevant features are added, which is
    well-known [45]. Second, all ML methods perform better
    with filter-based feature selection as irrelevant features are
    added. Third, while most ML methods perform well in the
    absence of measurement noise, some are more stable than
    others. Fourth, in the presence of noisy and irrelevant fea-
    tures, and with feature selection, most ML methods per-
    form  similarly well,  with the main exceptions  being
    RPART and KNN which underperformed the other meth-
    ods.

    Methods
    Simulation Models
    We use two models of the networks connecting direct
    molecular interactions with a test chemical and the pres-
    ence or absence of a toxic endpoint. Direct molecular
    interactions determine values of the M assays in the mod-
    els. These interactions can trigger pathway processes (P-
    nodes), which can in turn trigger cellular events (C-
    nodes), which can finally lead to the expression of a toxic
    endpoint. In addition to the M nodes, there are a large and
    variable number of random or R nodes with which a
    chemical can interact. Throughout the paper, we refer to
    the M and R nodes as causal and irrelevant nodes or features,
    respectively. A simulated chemical is uniquely characterized
    by its spectrum of activity for the direct
    molecular interaction assays (M + R nodes). The value of
    the i-th M (or R) assay for chemical c is given by M_i(c) and
    is randomly generated from a gamma distribution (shape
    = 3/2, rate = 0.5, ~95% of values are between 0 and 8).
    This is the type of distribution one could see for -log(k)
    where k is a binding or inhibition constant for a molecule
    interacting with a protein target. Figure 8 shows the
    distribution of values for the M and R assays or features.

[Figure 8 plot; x-axis: Assay Value]
Figure 8
Distribution from which the M and R assay values are drawn.
This is a gamma distribution with shape = 3/2 and rate = 0.5.

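
As a quick check on the stated distribution, the following sketch draws assay values with Python's standard library and estimates the fraction that falls between 0 and 8. It is illustrative only: the function name, seed and sample size are arbitrary choices, not part of the original simulation. Note that `random.gammavariate` takes a shape and a scale, so the rate of 0.5 is converted to a scale of 2.

```python
import random

SHAPE = 1.5          # shape = 3/2, as stated in the text
RATE = 0.5           # rate, as stated in the text
SCALE = 1.0 / RATE   # gammavariate(alpha, beta) expects beta = scale


def draw_assay_values(n, seed=0):
    """Draw n simulated assay values from Gamma(shape=3/2, rate=0.5)."""
    rng = random.Random(seed)
    return [rng.gammavariate(SHAPE, SCALE) for _ in range(n)]


values = draw_assay_values(100_000)

# Fraction of draws in [0, 8]; the text quotes roughly 95%.
frac_below_8 = sum(v <= 8.0 for v in values) / len(values)

# The theoretical mean of this distribution is shape / rate = 3.
mean_value = sum(values) / len(values)
```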
The model guarantees that if two molecules have the same
spectrum of direct physical interactions, they will exhibit
the same downstream biology, including whether or not
they cause the endpoint to be activated. By altering the
interaction strength connecting nodes in the model, one
can simulate differing degrees of coupling between multi-
ple molecular targets and the downstream processes they
control.

These networks simulate the ability for an endpoint to be
triggered by multiple independent mechanisms. In model
S1, there are 2 major mechanisms, driven by the independent
cellular processes C1 and C2 (see Figure 2). A collection
of chemicals may contain some substances that
trigger the endpoint through one mechanism and some
through the  other. Some chemicals may trigger both. This
interplay  of multiple  mechanisms  is characteristic of
many toxicological and disease processes and will allow
us to evaluate the ability of classification algorithms to
identify multiple paths from input to  output in a biologi-
cal system.

For all P and C nodes, values are calculated using weights
for the edges leading into a node plus the values of the
parents:
    where X_i^L(c) is the value for node N_i^L(c) in level L ∈ [M,
    R, P, C, Endpoint] for chemical c, and w_ik and w_ijk are
    weights for the linear and quadratic interaction terms. The
    quadratic term in Equation 1 simulates the presence of
    cooperativity between upstream processes that is necessary
    to trigger downstream processes. In order to test
    binary classification algorithms, we assign chemicals to
    the positive (1) class if the value of X_i(c) for the endpoint
    node is in the top 2% of the distribution, and to the negative
    (0) class otherwise. The weight values w_ik and w_ijk are
    either 1.0 or 0.1 and are assigned sequentially through the
    network using the repeating series (1.0, 1.0, 0.1, 0.1, 0.1).
    For the simulations, 2 different model networks were
    used, called S1 and S2. The networks are shown in Figures
    2 and 3. Model S1 has 2 parents for each node. Model S2
    has 4 C-level parents of the endpoint, 3 P-level parents for
    each C node and 2 M-level parents for each P node. Note
    that for S2, certain M-level molecular interactions can trigger
    more than one of the major mechanisms. Figure 2 displays
    the values of the weights used for the linear portion
    of the model. Both networks contained a total of 400
    input layer nodes or molecular assays (S1: 8 M+392 R; S2:
    24 M+374 R), although the simulations only made use of
    up to 300 R nodes.
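
As a rough illustration of how a node value could be computed from its parents, the sketch below assumes one linear term per parent and one quadratic term per pair of parents, i.e. X_i = sum_k w_ik X_k + sum_{j<k} w_ijk X_j X_k. The pairing convention (j < k), the particular weights passed in, and all function names are assumptions made for illustration. Only the weight values 1.0 and 0.1, the two-parent structure of S1 and the top 2% cutoff come from the text.

```python
from itertools import combinations


def node_value(parent_values, linear_w, quad_w):
    """Linear plus pairwise-quadratic combination of parent values.

    linear_w: one weight per parent (assumed w_ik).
    quad_w: dict mapping parent index pairs (j, k), j < k, to a
            weight (assumed w_ijk); missing pairs contribute nothing.
    """
    linear = sum(w * x for w, x in zip(linear_w, parent_values))
    quadratic = sum(
        quad_w.get((j, k), 0.0) * parent_values[j] * parent_values[k]
        for j, k in combinations(range(len(parent_values)), 2)
    )
    return linear + quadratic


def propagate_two_parent_network(m_values, linear_w=(1.0, 0.1), quad_w=0.1):
    """Propagate 8 M values through P (4), C (2) and endpoint (1) levels.

    Every node has two parents, as in model S1. The same hypothetical
    weights are reused at every node for simplicity.
    """
    level = list(m_values)
    while len(level) > 1:
        level = [
            node_value(level[i:i + 2], linear_w, {(0, 1): quad_w})
            for i in range(0, len(level), 2)
        ]
    return level[0]


def label_top_fraction(endpoint_values, fraction=0.02):
    """Label the top `fraction` of endpoint values as positive (1)."""
    n_pos = max(1, int(round(fraction * len(endpoint_values))))
    threshold = sorted(endpoint_values, reverse=True)[n_pos - 1]
    return [1 if v >= threshold else 0 for v in endpoint_values]


endpoint_for_unit_inputs = propagate_two_parent_network([1.0] * 8)
```

The quadratic term only contributes when both parents are active, which is one simple way to express the cooperativity described above.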

    Simulation  Data Sets
    For each model (S1 and S2), a set of 100,000 chemicals
    was created with 2% being assigned to the positive endpoint
    class. The chemicals were not generated completely
    randomly, but were instead created from 500 chemical
    classes, each with 200 examples. To create a class, a first
    example (the exemplar) was randomly generated (M and R
    assays drawn from the gamma distribution) and then the
    other examples were created from the exemplar by randomly
    adding normally distributed variation (SD = 1) to each M and R
    assay. The chemical class value (1...500) was retained with
    each chemical. From this large set of chemicals, a sampling
    population was created by drawing 10,000 chemicals
    from the larger set, but enriching the fraction of
    positive cases to 10%. This represents a very broad universe
    of chemicals.
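
The class-based generation and the enrichment step can be sketched as below. This is a small-scale illustration under stated assumptions, not the original simulation code: the function names are hypothetical, the demo sizes are far smaller than the 500 classes of 200 chemicals described above, and values are left unclipped after the normal variation is added because the text does not say how negative values were handled.

```python
import random


def make_chemical_classes(n_classes, n_per_class, n_features, seed=0):
    """Return a list of (class_id, feature_vector) tuples.

    Each class starts from an exemplar drawn from Gamma(shape=3/2,
    rate=0.5); the remaining members add N(0, SD=1) variation to
    every assay value of the exemplar.
    """
    rng = random.Random(seed)
    chemicals = []
    for class_id in range(1, n_classes + 1):
        exemplar = [rng.gammavariate(1.5, 2.0) for _ in range(n_features)]
        chemicals.append((class_id, exemplar))
        for _ in range(n_per_class - 1):
            variant = [x + rng.gauss(0.0, 1.0) for x in exemplar]
            chemicals.append((class_id, variant))
    return chemicals


def enriched_sample_indices(labels, n_total, positive_fraction, seed=0):
    """Draw indices so that `positive_fraction` of the sample is positive."""
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    n_pos = int(round(n_total * positive_fraction))
    return rng.sample(positives, n_pos) + rng.sample(negatives, n_total - n_pos)


# Small demo: 5 classes of 4 chemicals with 3 assays each.
demo_chemicals = make_chemical_classes(n_classes=5, n_per_class=4, n_features=3)

# Demo labels with 2% positives, enriched to 10% in a sample of 100.
demo_labels = [1] * 20 + [0] * 980
demo_sample = enriched_sample_indices(demo_labels, n_total=100, positive_fraction=0.10)
```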

    From the set of 10,000 chemicals, multiple samples were
    drawn and used in the classification training and testing
    process. The only data given to  the classification algo-
    rithms are  the values for the M and R assays or features
    and the endpoint classification. A sample was character-
    ized by the following variables:

    1. Model (S1, S2)
BMC Bioinformatics 2008, 9:241
               http://www.biomedcentral.com/1471-2105/9/241
Table 2: Classification or ML methods used, along with reference to the R library used.

 ML Method   Description                                                     Library
 KNN         K-nearest neighbors (N = 3, 5)                                  MLInterfaces [50]
 NB          Naive Bayes                                                     e1071 [51]
 LDA         Linear Discriminant Analysis                                    MLInterfaces [50]
 SVM         Support Vector Machine (kernel = radial, cost = 100)            e1071 [51]
 ANN         Artificial Neural Networks (size = 10, range = 0.5,             e1071 [51]
             decay = 0.0001, maxit = 200, MaxNWts = 10000)
 RPART       Recursive Partitioning and Regression Trees                     e1071 [51]
             (method = class, cp = 0, usesurrogate = 2)

2. The number of chemicals (50,100,200,300)

3. The number of random or irrelevant features (R nodes)
(50,100,200,300)

4. Whether or not measurement noise was added to the
original M and R assay values. If so, normally distributed
noise (SD = 2) was added to each assay's value.
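
The four sample variables listed above define a small grid of sample configurations. The sketch below enumerates that grid and applies the optional measurement noise. The dictionary keys and function names are hypothetical, chosen only for illustration.

```python
import random
from itertools import product

MODELS = ("S1", "S2")
N_CHEMICALS = (50, 100, 200, 300)
N_RANDOM_FEATURES = (50, 100, 200, 300)
NOISE_SD = (0.0, 2.0)  # 0 means no measurement noise was added

def sample_configurations():
    """Enumerate every combination of the four sample variables."""
    return [
        {"model": m, "n_chemicals": n, "n_random_features": r, "noise_sd": s}
        for m, n, r, s in product(MODELS, N_CHEMICALS, N_RANDOM_FEATURES, NOISE_SD)
    ]

def add_measurement_noise(assay_values, noise_sd, rng):
    """Add normally distributed noise to each assay value (no-op if SD is 0)."""
    if noise_sd == 0:
        return list(assay_values)
    return [v + rng.gauss(0.0, noise_sd) for v in assay_values]
```

The grid has 2 x 4 x 4 x 2 = 64 sample configurations, before ML method and feature selection mode are taken into account.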

Classification Methodology
Each classification algorithm or ML method was evaluated
using the balanced accuracy or Q-score [39], which is the
average of the sensitivity and specificity for prediction.
This is a useful metric in the present situation because the
fraction of positive cases is small and the Q-score gives
equal weight to  the accuracy of predicting positive and
negative cases. In each sample, the fraction of chemicals
that is positive for the endpoint is small (10%), so a good
first approximation would be to predict that all chemicals
will be negative for the  endpoint. The  Q-score  for this
default prediction is 0.5, whereas a perfect prediction will
score  1.0.
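
The behaviour of the Q-score on imbalanced data can be checked with a short sketch. The function name is hypothetical, and returning 0 for an undefined sensitivity or specificity (a class with no members) is an assumption.

```python
def q_score(y_true, y_pred):
    """Balanced accuracy: the mean of sensitivity and specificity."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return 0.5 * (sensitivity + specificity)

# A sample with 10% positives, as in the text.
y_true = [1] * 10 + [0] * 90
all_negative = [0] * 100
```

Here `q_score(y_true, all_negative)` is 0.5 (sensitivity 0, specificity 1), even though the plain accuracy of the same prediction is 0.9, which is why the Q-score is the more informative metric when positives are rare.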

Each ML method was evaluated against a set of 10 samples
or training sets, each using k-fold cross validation, with k
= 10 [43]. The model that was produced from each of the
training samples was evaluated against a separate valida-
tion sample. The training and validation samples were
drawn from the same distribution. We calculated distribu-
tions of Q-score for both the training samples (the results
of the k-fold cross validation) and the validation samples.
We call these the "predicted" and "true" Q-scores, respectively.
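
The distinction between the cross-validated Q-score on a training sample and the Q-score on a separate validation sample can be sketched as follows. Several choices here are assumptions rather than the authors' procedure: the folds are stratified by class, predictions are pooled across folds before scoring, and a toy nearest-centroid classifier stands in for the ML methods of Table 2.

```python
import random

def q_score(y_true, y_pred):
    """Balanced accuracy: the mean of sensitivity and specificity."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    sensitivity = sum(1 for p in pos if p == 1) / len(pos) if pos else 0.0
    specificity = sum(1 for p in neg if p == 0) / len(neg) if neg else 0.0
    return 0.5 * (sensitivity + specificity)

def stratified_folds(y, k, rng):
    """Split indices into k folds, dealing each class out in turn (assumption)."""
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        members = [i for i, t in enumerate(y) if t == label]
        rng.shuffle(members)
        for j, i in enumerate(members):
            folds[j % k].append(i)
    return folds

def fit_centroids(X, y):
    """Toy stand-in classifier: per-class mean of every feature.
    Requires at least one training example of each class."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict_centroid(centroids, x):
    """Predict the class whose centroid is nearest in squared distance."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist2(centroids[label]))

def predicted_q(X, y, k, rng):
    """Q-score from k-fold cross validation on the training sample."""
    y_pred = [None] * len(y)
    for held_out in stratified_folds(y, k, rng):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            y_pred[i] = predict_centroid(model, X[i])
    return q_score(y, y_pred)

def true_q(X_train, y_train, X_valid, y_valid):
    """Q-score of the model trained on the full training sample,
    evaluated on a separate validation sample."""
    model = fit_centroids(X_train, y_train)
    return q_score(y_valid, [predict_centroid(model, x) for x in X_valid])
```

Comparing the distributions of `predicted_q` and `true_q` over many samples shows how optimistic or pessimistic the cross-validation estimate is for a given method.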

For each sample set described above, we evaluated per-
formance for a series of ML methods with no feature selec-
tion and with T-test filter feature selection. In the latter
case, the best 20% of features were selected, with a mini-
mum number of 8. (Note that the features (M-nodes) are
not strictly normally distributed, but  are instead drawn
from  a gamma distribution overlaid with normally dis-
tributed variation.) To manage the large number of indi-
vidual runs, a simple MySQL database was created with 2
tables called queue and result. The queue table contains all
run parameters and the  result table holds all of the  rele-
    vant results. The relevant parameters in the queue table
    are [model (S1, S2), measurement noise (0/2), number of
    features, number of chemicals, ML method, feature selec-
    tion mode (none or T-test)]. In all cases, the fraction of
    positive cases in the sample was 10%. Figure 4 illustrates
    the overall approach.
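
The T-test filter described above (keep the best 20% of features, but never fewer than 8) can be sketched as below. The details are assumptions: the sketch uses Welch's form of the t statistic, ranks features by its absolute value, and rounds the 20% up to a whole number. The function names are hypothetical.

```python
import math
from statistics import mean, variance

def t_statistic(a, b):
    """Welch t statistic for two groups (each needs at least 2 values)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return 0.0 if se == 0 else (mean(a) - mean(b)) / se

def select_features(X, y, fraction=0.2, minimum=8):
    """Return the indices of the features kept by the filter."""
    n_features = len(X[0])
    positives = [row for row, t in zip(X, y) if t == 1]
    negatives = [row for row, t in zip(X, y) if t == 0]
    scores = [
        abs(t_statistic([r[j] for r in positives], [r[j] for r in negatives]))
        for j in range(n_features)
    ]
    # Best 20% of features, with a floor of `minimum` (capped at n_features).
    n_keep = min(n_features, max(minimum, math.ceil(fraction * n_features)))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:n_keep])
```

With 50 features the filter keeps 10, while with 20 features the floor of 8 applies instead of 20% (which would be 4).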

    Classification Algorithms/ML methods
    Table 2 lists the ML methods that were evaluated, along
    with any non-default parameters. Parameters for each of
    the machine learning methods were tuned so that the per-
    formance (Q-score) was acceptable (> 0.9) when tested
    against model S1 when the ML method was presented
    with all of the true features, no irrelevant features, and
    no noise added to the features. Default parameters
    were used for KNN, NB, LDA and RPART. For SVM,
    the cost function was varied over the range from 1 to 1000
    and a value of 100 was selected. ANN was the only
    method requiring significant tuning. Approximately 20
    combinations of the parameters listed in Table 1 were
    tested prior to arriving at an acceptable set. All code was
    written in R (version 2.5.1) using the MLInterfaces imple-
    mentation of all ML methods. The code was parallelized
    using snow and Rmpi and run on a Linux workstation
    cluster and an SGI Altix 4700.

    List of abbreviations
    The following abbreviations are used in the manuscript:
    ML: Machine Learning; KNN: k-Nearest Neighbors; NB:
    Naive  Bayes; LDA:  Linear Discriminant Analysis; SVM:
    Support Vector Machine; ANN: Artificial Neural Network;
    CART: Classification and Regression Trees; RPART: Recur-
    sive Partitioning  and  Regression Trees;  HTS:  High
    Throughput Screening; HCS: High Content Screening.

    Authors' contributions
    RJ, IS, WS, ZL, FE participated in the design of the experi-
    ment, in the design of the analysis strategy, in the formu-
    lation  of the conclusions and in implementation of the
    analysis  software. In addition, RJ developed the simula-
    tion model and its software implementation and per-
    formed the analysis runs.  IS developed the bulk of the
    final analysis software. RJ and IS drafted the manuscript.
    All authors read and approved the final manuscript.
Acknowledgements
The authors wish to thank Edward Anderson and Govind Gawdi for help
with software configuration and code parallelization. Disclaimer: This man-
uscript has been reviewed by the U.S. EPA's National Center for Compu-
tational Toxicology and approved for publication. Approval does not signify
that the contents necessarily reflect the views and policies of the agency,
nor does mention of trade names or commercial products constitute
endorsement or recommendation for use.


References
1.    Bhogal N, Grindon C, Combes R, Balls M: Toxicity testing: creat-
     ing a revolution based on new technologies. Trends Biotechnol
     2005, 23:299-307.
2.    Directive 2003/15/EC of the European Parliament and of the
     Council of 27 February 2003 amending Council Directive 76/
     768/EEC on the approximation of the laws of Member States
     relating to cosmetic products [http://ec.europa.eu/enterprise/
     cosmetics/html/consolidated_dir.htm]
3.    REACH [http://ec.europa.eu/environment/chemicals/reach/
     reach_intro.htm]
4.    Dix DJ, Houck KA, Martin MT, Richard AM, Setzer RW, Kavlock RJ:
     The ToxCast program for prioritizing toxicity testing of
     environmental chemicals. Toxicol Sci 2007, 95:5-12.
5.    Inglese J, Auld DS, Jadhav A, Johnson RL, Simeonov A, Yasgar A,
     Zheng W, Austin CP: Quantitative high-throughput screening:
     a titration-based approach that efficiently identifies biologi-
     cal activities in large chemical libraries. Proc Natl Acad Sci USA
     2006, 103:11473-11478.
6.    Lamb J, Crawford ED, Peck D, Modell JW, Blat IC, Wrobel MJ, Lerner
     J, Brunet JP, Subramanian A, Ross KM, Reich M, Hieronymus H, Wei
     G, Armstrong SA, Haggarty SJ, Clemons PA, Wei R, Carr SA, Lander
     ES, Golub TR: The Connectivity Map: using gene-expression
     signatures to connect small molecules, genes, and disease.
     Science 2006, 313:1929-1935.
7.    Strausberg RL, Schreiber SL: From knowing to controlling: a
     path from genomics to drugs using small molecule probes.
     Science 2003, 300:294-295.
8.    Fliri AF, Loging WT, Thadeio PF, Volkmann RA: Biological spectra
     analysis: Linking biological activity profiles to molecular
     structure. Proc Natl Acad Sci USA 2005, 102:261-266.
9.    Austin CP, Brady LS, Insel TR, Collins FS: NIH Molecular Libraries
     Initiative. Science 2004, 306:1138-1139.
10.   Bredel M, Jacoby E: Chemogenomics: an emerging strategy for
     rapid target and drug discovery. Nat Rev Genet 2004, 5:262-275.
11.   Klekota J, Brauner E, Roth FP, Schreiber SL: Using high-through-
     put screening data to discriminate compounds with single-
     target effects from those with side effects. J Chem Inf Model
     2006, 46:1549-1562.
12.   Kikkawa R, Fujikawa M, Yamamoto T, Hamada Y, Yamada H, Horii I:
     In vivo hepatotoxicity study of rats in comparison with in
     vitro hepatotoxicity screening system. J Toxicol Sci 2006,
     31:23-34.
13.   Fliri AF, Loging WT, Thadeio PF, Volkmann RA: Analysis of drug-
     induced effect patterns to link structure and side effects of
     medicines. Nat Chem Biol 2005, 1:389-397.
14.   Melnick JS, Janes J, Kim S, Chang JY, Sipes DG, Gunderson D, Jarnes
     L, Matzen JT, Garcia ME, Hood TL, Beigi R, Xia G, Harig RA, Asatryan
     H, Yan SF, Zhou Y, Gu XJ, Saadat A, Zhou V, King FJ, Shaw CM, Su
     AI, Downs R, Gray NS, Schultz PG, Warmuth M, Caldwell JS: An effi-
     cient rapid system for profiling the cellular activities of
     molecular libraries. Proc Natl Acad Sci USA 2006, 103:3153-3158.
15.   O'Brien PJ, Irwin W, Diaz D, Howard-Cofield E, Krejsa CM, Slaughter
     MR, Gao B, Kaludercic N, Angeline A, Bernardi P, Brain P, Hougham
     C: High concordance of drug-induced human hepatotoxicity
     with in vitro cytotoxicity measured in a novel cell-based
     model using high content screening. Arch Toxicol 2006,
     80:580-604.
16.   Scherf U, Ross DT, Waltham M, Smith LH, Lee JK, Tanabe L, Kohn
     KW, Reinhold WC, Myers TG, Andrews DT, Scudiero DA, Eisen MB,
     Sausville EA, Pommier Y, Botstein D, Brown PO, Weinstein JN: A
     gene expression database for the molecular pharmacology
     of cancer. Nat Genet 2000, 24:236-244.
17.   Smith SC, Delaney JS, Robinson MP, Rice MJ: Targeting chemical
     inputs and optimising HTS for agrochemical discovery. Comb
     Chem High Throughput Screen 2005, 8:577-587.
18.   Tietjen K, Drewes M, Stenzel K: High throughput screening in
     agrochemical research. Comb Chem High Throughput Screen 2005,
     8:589-594.
19.   Walum E, Hedander J, Garberg P: Research perspectives for pre-
     screening alternatives to animal experimentation: On the
     relevance of cytotoxicity measurements, barrier passage
     determinations and high throughput screening in vitro to
     select potentially hazardous compounds in large sets of
     chemicals. Toxicol Appl Pharmacol 2005, 207:393-397.
20.   Paolini GV, Shapland RH, van Hoorn WP, Mason JS, Hopkins AL: Glo-
     bal mapping of pharmacological space. Nat Biotechnol 2006,
     24:805-815.
21.   Krewski D, DAcosta J, Anderson M, Anderson H, JB III, Boekelheide
     K, Brent R, Charnley G, Cheung V, Green S, Kelsey K, Kervliet N, Li
     A, McCray L, Meyer O, Patterson DR, Pennie W, Scala R, Solomon
     G, Stephens M, J Yager J, Zeize L: Toxicity Testing in the Twenty-first Cen-
     tury: A Vision and a Strategy. Washington D.C.: National Academies
     Press; 2007.
22.   Wang H, LeCluyse EL: Role of orphan nuclear receptors in the
     regulation of drug-metabolising enzymes. Clin Pharmacokinet
     2003, 42:1331-1357.
23.   Okey AB: An aryl hydrocarbon receptor odyssey to the
     shores of toxicology: the Deichmann Lecture, International
     Congress of Toxicology-XI. Toxicol Sci 2007, 98:5-38.
24.   Heuvel JP Vanden, Thompson JT, Frame SR, Gillies PJ: Differential
     activation of nuclear receptors by perfluorinated fatty acid
     analogs and natural fatty acids: a comparison of human,
     mouse, and rat peroxisome proliferator-activated receptor-
     alpha, -beta, and -gamma, liver X receptor-beta, and retin-
     oid X receptor-alpha. Toxicol Sci 2006, 92:476-489.
25.   McMillian M, Nie AY, Parker JB, Leone A, Kemmerer M, Bryant S,
     Herlich J, Yieh L, Bittner A, Liu X, Wan J, Johnson MD: Inverse gene
     expression patterns for macrophage activating hepatotoxi-
     cants and peroxisome proliferators in rat liver. Biochem Phar-
     macol 2004, 67:2141-2165.
26.   Williams GM, Iatropoulos MJ: Alteration of liver cell function
     and proliferation: differentiation between adaptation and
     toxicity. Toxicol Pathol 2002, 30:41-53.
27.   Moore LB, Parks DJ, Jones SA, Bledsoe RK, Consler TG, Stimmel JB,
     Goodwin B, Liddle C, Blanchard SG, Willson TM, Collins JL, Kliewer
     SA: Orphan nuclear receptors constitutive androstane
     receptor and pregnane X receptor share xenobiotic and
     steroid ligands. J Biol Chem 2000, 275:15122-15127.
28.   ToxCast [http://www.epa.gov/ncct/toxcast]
29.   Ancona N, Maglietta R, Piepoli A, D'Addabbo A, Cotugno R, Savino
     M, Liuni S, Carella M, Pesole G, Perri F: On the statistical assess-
     ment of classifiers using DNA microarray data. BMC Bioinfor-
     matics 2006, 7:387-401.
30.   Statnikov A, Aliferis CF, Tsamardinos I, Hardin D, Levy S: A compre-
     hensive evaluation of multicategory classification methods
     for microarray gene expression cancer diagnosis. Bioinformat-
     ics 2005, 21:631-643.
31.   Burbridge R, Trotter M, Buxton B, Holden S: Drug design by
     machine learning: support vector machines for pharmaceu-
     tical data analysis. Computers & Chemistry 2001, 26:5-14.
32.   Lepp Z, Kinoshita T, Chuman H: Screening for new antidepres-
     sant leads of multiple activities by support vector machines.
     J Chem Inf Model 2006, 46:158-167.
33.   Helma C, Kramer S: A survey of the Predictive Toxicology
     Challenge 2000-2001. Bioinformatics 2003, 19:1179-1182.
34.   Benigni R, Giuliani A: Putting the Predictive Toxicology Chal-
     lenge into perspective: reflections on the results. Bioinformat-
     ics 2003, 19:1194-1200.
35.   Toivonen H, Srinivasan A, King RD, Kramer S, Helma C: Statistical
     evaluation of the Predictive Toxicology Challenge 2000-
     2001. Bioinformatics 2003, 19:1183-1193.
36.   The Predictive Toxicology Challenge (PTC) for 2000-2001
     [http://www.predictive-toxicology.org/ptc/#ROC]
37.   Ntzani EE, Ioannidis JP: Predictive ability of DNA microarrays
     for cancer outcomes and correlates: an empirical assess-
     ment. Lancet 2003, 362:1439-1444.
38.   Molinaro AM, Simon R, Pfeiffer RM: Prediction error estimation:
     a comparison of resampling methods. Bioinformatics 2005,
     21:3301-3307.
39.   Baldi P, Brunak S, Chauvin Y, Andersen CA, Nielsen H: Assessing
     the accuracy of prediction algorithms for classification: an
     overview. Bioinformatics 2000, 16:412-424.
40.   Sima C, Dougherty ER: What should be expected from feature
     selection in small-sample settings. Bioinformatics 2006,
     22:2430-2436.
41.   Pudil P: Floating Search Methods in Feature Selection. Pattern
     Recognition Letters 1994, 15:1119-1125.
42.   Baker SG, Kramer BS: Identifying genes that contribute most to
     good classification in microarrays. BMC Bioinformatics 2006,
     7:407-414.
43.   Kohavi R: A Study of Cross Validation and Bootstrap for
     Accuracy Estimation and Model Selection. International Joint
     Conference on Artificial Intelligence; Montreal. IJCAI 1995. Unpaged
44.   Martin MT, Houck KA, McLaurin K, Richard AM, Dix DJ: Linking
     Regulatory Toxicological Information on Environmental
     Chemicals with High-Throughput Screening (HTS) and
     Genomic Data. The Toxicologist CD - An official Journal of the Society
     of Toxicology 2007, 96:219-220.
45.   Almuallim H, Dietterich TG: Learning With Many Irrelevant
     Features. Proceedings of the Ninth National Conference on Artificial
     Intelligence 1991:547-552.
46.   Japkowicz N, Stephen S: The class imbalance problem: A sys-
     tematic study. Intelligent Data Analysis 2002, 6:429-450.
47.   Sun Y, Kamel MS, Wong AKC, Wang Y: Cost-sensitive boosting
     for classification of imbalanced data. Pattern Recognition 2007,
     40:3358-3378.
48.   Zhang J, Mani I: kNN Approach to Unbalanced Data Distribu-
     tions: A Case Study involving Information Extraction. ICML
     2003.
49.   Li LH, T TM, Huang D: Extracting Location Names from Chi-
     nese Texts Based on SVM and KNN. 2005 IEEE International
     Conference on Natural Language Processing And Knowledge Engineering
     2005, 10:371-375.
50.   MLInterfaces: towards uniform behavior of machine learn-
     ing tools in R [http://bioconductor.org/packages/1.8/bioc/vignettes/
     MLInterfaces/inst/doc/MLInterfaces.pdf]
51.   The e1071 package [http://cran.r-project.org/web/packages/
     e1071/e1071.pdf]
-------
Ann Ist Super Sanità 2008 | Vol. 44, No. 1: 48-56
         A novel approach: chemical relational databases,
         and the role of the ISSCAN database
         on assessing chemical carcinogenicity

         Romualdo Benigni(a), Cecilia Bossa(a), Ann M. Richard(b) and Chihae Yang(c)
         (a)Dipartimento di Ambiente e Connessa Prevenzione Primaria, Istituto Superiore di Sanità, Rome, Italy
         (b)National Center for Computational Toxicology, US Environmental Protection Agency, Research
         Triangle Park, North Carolina, USA
         (c)LeadScope Inc., Columbus, Ohio, USA
         Summary. Mutagenicity and carcinogenicity databases are crucial resources for toxicologists and
         regulators involved in chemicals risk assessment. Until recently, existing public toxicity databases
         have been constructed primarily as "look-up-tables" of existing data, and most often did not contain
         chemical structures. Concepts and technologies originated from the structure-activity relationships
         science have provided powerful tools to create new types of databases, where the effective linkage of
         chemical toxicity with chemical structure can facilitate and greatly enhance data gathering and hy-
         pothesis generation, by permitting: a) exploration across both chemical and biological domains; and
         b) structure-searchability through the data. This paper reviews the main public databases, together
         with the progress in the field of chemical relational databases, and presents the ISSCAN database
         on experimental chemical carcinogens.
         Key words: database, mutagenicity, carcinogenicity, chemical structure.

         Riassunto (An innovative approach: chemical relational databases and the role of the ISSCAN
         database in the assessment of chemical carcinogenesis). Carcinogenesis and mutagenesis databases
         are essential for chemical risk assessment. Until now these have essentially taken the form of static
         tables, but progress in the field of structure-activity relationships has made it possible to create new
         types of databases, in which linking toxicological data to chemical structure allows searches across
         the chemical and biological domains to be connected and the data to be explored from a structural
         point of view. This article presents the main public databases together with developments in the new
         chemical relational databases, and describes the ISSCAN database on chemical carcinogens.
         Parole chiave (key words): databases, mutagenesis, carcinogenesis, chemical structure.
           INTRODUCTION
           Currently, the public has access to a variety of
         databases containing mutagenicity and carcinogenicity
         data. These resources are crucial for the toxicologists
         and regulators involved in the risk assessment of
         chemicals, who need access to all the relevant
         literature and the capability to search across toxicity
         databases using both biological and chemical criteria.
         In this field, rapid progress has taken place both in
         terms of initiatives and technological innovation. In
         particular, public Internet resources to support
         biological and toxicological activity evaluation of
         chemicals have expanded greatly and are ushering in a
         new era of public information access and data mining in
         support of toxicity assessment.
           In the context of the recent dramatic changes in
         regulations and regulatory needs worldwide, the
         progress in toxicological databases and in database
         technology is particularly timely and provides an
         indispensable tool for regulatory implementation.
         Increasing demands and expectations are being placed on
         predictive toxicology in support of the new European
         REACH legislation and other pieces of legislation
         worldwide [1], and a need is emerging for more
         structured organization and harnessing of legacy
         toxicity data, and for maximal utilization of these
         data [2]. Until now, the assessment of chemical risk in
         the European Union (EU) has been largely based on
         traditional toxicology. However, legislative, societal
         and practical realities (too many chemicals, too few
         resources) have created new inducements and
         opportunities to encourage use and acceptance of
         "alternative" approaches, which can substantially
         reduce the need for experimental toxicological testing.
           Address for correspondence: Romualdo Benigni, Dipartimento di Ambiente e Connessa Prevenzione Primaria, Istituto
           Superiore di Sanità, Viale Regina Elena 299, Rome, Italy. E-mail: romualdo.benigni@iss.it.
  In 2003, the European Commission (EC) adopted a
legislative proposal for a new chemical management
system  called REACH (Registration, Evaluation
and Authorisation of Chemicals).  Article  13(1) of
the legal text of the draft REACH  regulation states
that [3]: "Information  on intrinsic properties of
substances  may be generated by means other than
tests, in particular through the use  of qualitative or
quantitative structure-activity relationship models
or from information from structurally  related sub-
stances (grouping or read-across), provided that the
conditions set out in Annex XI are  met".
  REACH is expected to introduce a dramatic change in
the present EU regulatory schemes [4]. It will provide a
basis for the use of structure-activity relationship mod-
els, together with other "non-testing" approaches, for
predicting the environmental and toxicological proper-
ties of chemicals, in the interests of time-effectiveness,
cost-effectiveness and animal welfare. According to
an assessment carried out by the European Chemicals
Bureau (ECB), the in vivo mutagenicity studies, closely
followed by carcinogenicity studies, place the highest
demand on test-related resources [5,6].
  In particular, the science of the relationships between
chemical structure and the biological activity of mol-
ecules is expected to play a new role and support three
distinct  activities: category  formation, "read-across",
and (Quantitative) Structure-Activity Relationships
((Q)SAR). A chemical category is a group of chemi-
cals whose physicochemical and human health and/or
environmental toxicological properties are likely to be
similar or follow a regular pattern as a result of struc-
tural similarity. If this similarity is recognized with suf-
ficient evidence, all the chemicals in the category can be
considered (and regulated) in the same way.  Another
approach to fill data gaps is read-across. In the read-
across approach, endpoint information (e.g., carcino-
genicity) for one chemical is used to predict the same
endpoint for another chemical, which is considered to
be "similar" in some way (usually on the basis of struc-
tural similarity). Regarding the third approach, the sci-
entific foundation  of (Q)SAR  models lies in physical
organic  chemistry, where features of a chemical and
its properties are used to estimate chemical behaviour
and activity solely from the knowledge  of chemical
structure. (Q)SAR modeling has been widely used  in
pharmacology, toxicology and physical chemistry [7],
and its capabilities and limitations are relatively well
understood [8-10]. Regarding  the use of (Q)SAR, a
recent project supported by the European Chemicals
Bureau (ECB) surveyed the models  for mutagenicity
and carcinogenicity in the public domain: the results
are summarized in [4] and [11].
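
As an illustration of the read-across idea described above, the sketch below fills a data gap for a target chemical from its most similar analogue. It is a toy example under stated assumptions: the text does not prescribe a similarity measure, so the Tanimoto coefficient over sets of structural features is used here as one common choice; the feature labels, analogue names, endpoint values and the 0.7 similarity threshold are all placeholders, not data from any real database.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) similarity between two sets of structural features."""
    a, b = set(features_a), set(features_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def read_across(target_features, analogues, min_similarity=0.7):
    """Return (name, similarity, endpoint) for the most similar analogue,
    or None if no analogue reaches the similarity threshold.
    `analogues` is a list of (name, feature_set, endpoint_value) tuples."""
    best = None
    for name, features, endpoint in analogues:
        similarity = tanimoto(target_features, features)
        if similarity >= min_similarity and (best is None or similarity > best[1]):
            best = (name, similarity, endpoint)
    return best

# Placeholder data: abstract feature keys and made-up endpoint calls.
target = {"f1", "f2", "f3", "f4"}
analogues = [
    ("analogue_A", {"f1", "f2", "f3", "f4", "f5"}, 1),  # similarity 4/5
    ("analogue_B", {"f1", "f9"}, 0),                     # similarity 1/5
]
```

Returning `None` when no analogue is similar enough reflects the requirement that similarity be supported by sufficient evidence before an endpoint value is carried over.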
  The extensive use of estimation techniques such as
(Q)SARs, read-across and grouping of chemicals,
where appropriate and in a suitably  constrained
context, has the potential to effect  huge reductions
in use of animals for modeled toxicity endpoints. At
the same time, all these approaches need to be fed
by adequate amounts of good quality data and da-
tabases.
 DATABASES OF CHEMICAL
 MUTAGENS AND CARCINOGENS
 IN THE PUBLIC DOMAIN
 Among the sources of freely available toxicity data
on chemical substances, one of the
principal resources is the TOXNET database of the
National Library of Medicine (NLM) (http://toxnet.
nlm.nih.gov/). TOXNET is a cluster of different da-
tabases, collecting information on toxicology, haz-
ardous chemicals, environmental health, and toxic
releases. From the website, it is possible to search
across and within the databases by several identifi-
ers, such as chemical name, CAS (Chemical Abstracts
Service) number, molecular formula, classification
code, locator code, and structure or substructure
(with the ChemIDplus protocol). Among the
TOXNET databases, the Chemical Carcinogenesis
Research Information System (CCRIS) and the
GENE-TOX databases deal specifically with muta-
genicity and carcinogenicity data.
 CCRIS contains over 8000 chemical records with
animal carcinogenicity, mutagenicity,  tumor promo-
tion, and tumor inhibition test results provided by
the National Cancer  Institute (NCI). Test results
have been  reviewed by experts and all the  records
are written in a standardized textual format.
 GENE-TOX was  developed by the US Environ-
mental Protection Agency (USEPA) and contains
genetic toxicology (mutagenicity) test data, result-
ing from expert peer review of the open scientific
literature, on over 3000 chemicals. The GENE-TOX
program was established within EPA  to select assay
systems for evaluation, review data in the scientific
literature,  and recommend proper testing protocols
and evaluation procedures for these systems.
 Another repository of experimental carcinogenicity
data available on the web is the Carcinogenic Potency
Database (CPDB) (http://potency.berkeley.edu/cpdb.
html). This database collects the  results from over
6000 chronic, long-term  animal cancer bioassays on
over 1500 chemicals published in the general literature
through 1997 and by the National Cancer Institute/
National Toxicology Program through 1998. CPDB
is  organized alphabetically by chemical name. All
experiments of a chemical are listed under the name
of the test agent; for each experiment, information
is included on test animals, features of experimental
protocol, and carcinogenicity results in detail, includ-
ing literature citation. CPDB is downloadable in pdf,
xls or txt formats, and searchable by chemical name,
CAS number, or author. Most recently, chemical-spe-
cific summary data pages have been provided on the
CPDB  website to make these data more  accessible
through chemical or structure searching (see, e.g., the
result of a search on  acetaldehyde: http://potency.ber-
keley.edu/chempages/ACETALDEHYDE.html).
 The US National Toxicology Program (NTP) makes
available on the web (http://ntp.niehs.nih.gov/) data
from more than 500 long-term toxicology and carcino-
genesis bioassays collected by the NTP and its pred-
ecessor, the National Cancer Institute's Carcinogenesis
         Testing Program, and organized in a database at the
         National Institute of Environmental Health Sciences
         (NIEHS). These data can be accessed as technical
         reports; the user can browse them  directly or make
         text searches (by chemical name or CAS number, for
         example), or download the reports in pdf  format. In
         addition, detailed experimental study data, to the level
         of individual animal observations, are  housed in  an
         Oracle NTP on-line database, with limited  searchable
         access to detailed data on thousands of experiments
         provided to the public on the NTP website.
           To enhance their structure-searchability and use in
         modeling  applications, both the CPDB and  the on-
         line NTP database have been "chemically-indexed"
         by the USEPA's National Center for Computational
         Toxicology DSSTox (Distributed Structure-Searchable
         Toxicity) database project (www.epa.gov/ncct/dsstox/),
         which emphasizes quality procedures for accurate and
         consistent chemical structure  annotation  of  toxico-
         logical experiments. Chemical structures  and summary
         mutagenicity and carcinogenicity data have been  pub-
         lished for the entire CPDB inventory (www.epa.gov/
         ncct/dsstox/sdf_cpdbas.html; recently updated), along
         with the URL address locating the  specific chemical
         data webpage on the CPDB website provided for each
         indexed chemical substance. Chemical structures and
         indicators of data availability (1 = yes, 0 = no) have also
         been provided for the entire chemical inventory of the
         online NTP database, for each of the 4 main NTP study
         areas  (Developmental, Immunological,  Genetox, and
         Chronic Cancer Bioassays)  (see below for more infor-
         mation on the DSSTox project).
           From the International Agency for Research on Can-
         cer (IARC) website it is possible to  access  the IARC
         Monographs on the Evaluation of Carcinogenic Risks
         to Humans (www-cie.iarc.fr/). These documents publish
         independent assessments, by international experts, of
         the carcinogenic risks to humans posed by a variety
         of agents, mixtures and exposures. The
         Monographs are searchable by key word, CAS number,
         synonym or chemical name.
           Recently, the National Center for Biotechnology
         Information (NCBI) has created, through the PubChem
         project (http://pubchem.ncbi.nlm.nih.gov), a very
         useful tool that is expanding access to a wide range
         of toxicological databases, as well as to other public
         biological activity databases available on the web.
         PubChem is a public information system
         (tightly integrated into the  cluster   of  biological
         and literature databases hosted at NCBI, such  as
         PubMed http://www.ncbi.nih.gov/entrez/query.fcgi)
         that links chemical identifiers (such as chemical
         name, CAS number and chemical structures) to bio-
         logical activity knowledge of substances. It  should
         be remarked that PubChem is not an independently
         curated database, but rather a user-depositor system
         that aggregates standardized data from many sourc-
         es, providing a tool to interrogate  databases in the
         public domain in the US (including both toxicologi-
         cal and biomedical ones).  The PubChem interfaces
         provide extensive query capabilities on textual and
                                                   numeric information, as well as a comprehensive set
                                                   of structure-based query methodologies. PubChem
                                                   was originally created to house all the bioassay data
                                                   of the NIH Molecular Libraries Initiative Screening
                                                   program, whose goal is to process hundreds of thou-
                                                   sands of chemicals through up to several thousands
                                                   of high-throughput bioassay  screens, using chem-
                                                   istry to probe  biology at the  fundamental cellular
                                                   and protein receptor level
                                                   (http://nihroadmap.nih.gov/molecularlibraries/). PubChem has expanded,
                                                   however, as a user-depositor public data repository,
                                                   housing large amounts of public bioassay data, in-
                                                   cluding the NLM TOXNET and USEPA DSSTox
                                                   inventories. PubChem has also significantly expand-
                                                   ed its tools and capabilities for analyzing chemicals
                                                   across bioactivity space, through summary activity
                                                   assignments (active or inactive, or a binned range
                                                   of activities).
                                                     Recent reviews [12-14] have surveyed the current status
                                                   of public toxicity databases in terms of their diverse
                                                   content and structure, and provide a useful comple-
                                                   ment to the information summarized above.
                                                    NEW NEEDS AND NEW TOOLS:
                                                    CHEMICAL RELATIONAL DATABASES
                                                    Until recently, many public toxicity databases
                                                   were constructed primarily as "look-up tables" of
                                                   existing data and, most often, did not contain
                                                   chemical structures. These databases typically
                                                   rely on chemical names (usually common or
                                                   commercial names) and CAS numbers, which are
                                                   non-unique and commercially registered and,
                                                   therefore, unsuitable as a unique, public identifier.
                                                   In addition, the organization of the data often
                                                   follows that of the paper literature and does not
                                                   lend itself easily to informatics implementation.
                                                    Recently, concepts and computer techniques that
                                                   originated in the science of structure-activity
                                                   relationships have provided powerful tools for
                                                   creating new types of databases, in which the ability
                                                   to retrieve data is greatly improved in both qualitative
                                                   and quantitative terms. In fact, whereas the indexing
                                                   (identifier) elements in traditional databases, such as
                                                   names and CAS numbers, are non-unique, prone to
                                                   errors and devoid of intrinsic information, chemical
                                                   structure as a chemical identifier has universally un-
                                                   derstood meaning and scientific relevance. Chemical
                                                   structure and chemical concepts (e.g., reactive func-
                                                   tional groups, acidity, hydrophobicity, electrophilic
                                                   reactivity, free radical formation) provide a common
                                                   language and framework for exploring the similar-
                                                   ity among chemicals and  the underlying chemical
                                                   reactivity bases for diverse toxicological outcomes.
                                                   Hence, chemical structure  should be considered  an
                                                   essential identifier  and scientifically useful metric
                                                   for chemical toxicity databases. Effective linkage of
                                                   chemical toxicity data with chemical structure infor-
                                                   mation can facilitate and greatly enhance data gath-
                                                   ering and hypothesis generation in conjunction with
                                                   (Q)SAR modeling efforts [15].
-------
                                                                                    ISSCAN DATABASE
  Thus, a crucial task is to collect and standardize
portions of the existing knowledge in a way that
allows: a) exploration across both chemical and
biological domains; and b) structure-searchability
through the data. These characteristics can be obtained
when chemical structures and toxicity data are
incorporated into what is termed a Chemical Relational
Database (CRD). A CRD is a special type of relational
database whose main informational unit is a chemical
structure and whose fields are attributes or data
associated with that chemical structure.
  In order to be accessed with a CRD application,
the information has to be stored in specialized file
formats. Among them, the Structure Data File (SDF)
format has become the most widely used public
standard for the exchange of structure/data
information on chemicals. SDF files are simple text
files that adhere to a strict format for representing
multiple chemical structure records and associated
data fields. Each record in the file is composed of
a "structure" section, where the 2D or 3D structure
of the molecule is represented in MOLfile format,
and a second section composed of numerical or text
data fields (Figure 1). Hence, SDF files are very
versatile: they can accommodate many types of data,
are easily edited and manipulated by programming
scripts, and can be easily ported to other types of
standard formats, such as the mark-up languages
XML and CML (for further information on issues
related to chemical annotation, see [4, 15]).
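  The two-part record layout described above can be illustrated with a short script. The following is a minimal sketch in Python, not a full SDF reader: it assumes V2000-style records in which the structure section ends with an "M  END" line, data fields are introduced by "> <FieldName>" headers, and records are separated by "$$$$". The sample record, its coordinates, and the function name parse_sdf are invented for illustration.

```python
from typing import Dict, List, Tuple

# Hand-written record for illustration only; the coordinates and field values
# are invented and do not come from any published DSSTox or ISSCAN file.
SAMPLE_SDF = """\
methanol
  hypothetical example

  2  1  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.4000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0
M  END
> <Substance_ID>
1

> <ChemName>
methanol

$$$$
"""


def parse_sdf(text: str) -> List[Tuple[str, Dict[str, str]]]:
    """Return one (molfile_block, data_fields) pair per SDF record."""
    records: List[Tuple[str, Dict[str, str]]] = []
    mol_lines: List[str] = []
    fields: Dict[str, str] = {}
    in_data = False   # False while reading the structure section
    current = None    # name of the data field being read, if any
    for line in text.splitlines():
        if line.strip() == "$$$$":            # record delimiter
            records.append(("\n".join(mol_lines), fields))
            mol_lines, fields, in_data, current = [], {}, False, None
        elif not in_data:
            mol_lines.append(line)
            if line.startswith("M  END"):     # end of the MOLfile block
                in_data = True
        elif line.startswith(">") and "<" in line:
            # Field header such as "> <ChemName>": keep the text inside <...>
            current = line[line.index("<") + 1:line.rindex(">")]
            fields[current] = ""
        elif current is not None and line.strip():
            # Value line; multi-line values are joined with newlines
            previous = fields[current]
            fields[current] = f"{previous}\n{line}" if previous else line
        else:
            current = None                    # a blank line closes the field
    return records


records = parse_sdf(SAMPLE_SDF)
molblock, data = records[0]
print(data)  # {'Substance_ID': '1', 'ChemName': 'methanol'}
```

  Because the format is line-oriented text, a few dozen lines are enough to separate structures from data; interpreting the structure section itself (atoms, bonds, stereochemistry) is the task of a dedicated CRD or cheminformatics application.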
  SEARCHING CAPABILITY
  OF CRD DATABASES
  Simple yet useful searches can be performed
with widely available informatics tools, such as XLS
(Excel-readable) files of chemicals with annotated
toxicity and/or properties, where it is possible, e.g.,
to retrieve substances within a predefined range of
toxicity values. However, coding the chemical structure
in the SDF file allows one to perform considerably more
complex searches by using specialized CRD software
programs: most commercially available CRD applications
provide substructure and functional group search
features, different algorithms for searching compounds
chemically similar to query ones (similarity search),
and text and data field search functions (for
information on commercial and public software
applications, see the DSSTox website:
www.epa.gov/ncct/dsstox/SDFViewerBrowserCRDs.html).
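  The simple "look-up table" search mentioned above, retrieving substances within a predefined range of toxicity values, needs no structure handling at all. The sketch below assumes the rows have already been read from a spreadsheet export into dictionaries; the chemical names and values are invented, and the field name TD50_Rat and the "ND" marker are borrowed from the ISSCAN record layout purely for illustration.

```python
from typing import Dict, List

# Hypothetical rows, as they might be read from a spreadsheet export.
ROWS = [
    {"ChemName": "chemical A", "TD50_Rat": "0.5"},
    {"ChemName": "chemical B", "TD50_Rat": "12.0"},
    {"ChemName": "chemical C", "TD50_Rat": "ND"},   # no numeric value
    {"ChemName": "chemical D", "TD50_Rat": "150"},
]


def in_range(rows: List[Dict[str, str]], field: str,
             low: float, high: float) -> List[Dict[str, str]]:
    """Keep the rows whose numeric `field` lies within [low, high]."""
    hits = []
    for row in rows:
        try:
            value = float(row[field])
        except (KeyError, ValueError):
            continue  # field missing or not numeric: skip the row
        if low <= value <= high:
            hits.append(row)
    return hits


selected = in_range(ROWS, "TD50_Rat", 1.0, 100.0)
print([row["ChemName"] for row in selected])  # ['chemical B']
```

  A search of this kind treats each chemical as an opaque label; the structure-based queries discussed next are what a CRD application adds on top of it.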
  When the SDF file is imported into a CRD ap-
plication, it is possible to do structure/text/data rela-
tional searching across records in the database. All
these operations are collectively termed "data min-
ing" [14]. In Figure 2, as an example the substructure
searching results using aniline as query structure are
depicted. The result of the search consists  of all
chemicals in a database (i.e., an SDF file) contain-
ing aniline as basic  substructure. In this way, it is
possible to identify subsets of chemicals according
to any structural query (i.e.,  functional group,  or
molecular substructure).
Fig. 1 | A sample SDF file, containing both structural (top) and data (bottom) information. In the figure, the MOLfile block (atom coordinates and bond table, ending in "M END") is labelled "2D Structure" and the tagged fields that follow it are labelled "Text and data fields".
-------
Fig. 2 | Example of substructure searching in a database of diverse chemicals. All the chemicals including aniline as a substructure are highlighted. The search was performed with the program LeadScope (LeadScope Inc., Ohio).
  Another very useful feature, available when visual
analytic tools are added, is the possibility of
characterizing a database by its component functional
groups or chemical classes. An example of this
capability, obtained by applying a CRD application to
the SDF file, is presented in Figure 3. The figure
shows the chemicals in the database divided into
chemical classes, with the frequency of each class. In
addition, it is possible to color each class bar to
show the relative abundance, within the class, of the
chemicals active and inactive for some selected
property (e.g., carcinogenicity). Visualizing the
retrieved data makes the results of the query easier
and quicker to understand.
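  The class-frequency view described above amounts to a cross-tabulation of chemical class against activity. The following sketch assumes that each chemical has already been assigned its chemical classes (in practice, by the substructure perception of a CRD application); the records, class names, and function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Hypothetical records: (chemical classes present, activity call).
RECORDS: List[Tuple[Set[str], str]] = [
    ({"amine", "halide"}, "active"),
    ({"amine"}, "inactive"),
    ({"ether"}, "inactive"),
    ({"amine", "ether"}, "active"),
]


def class_profile(records: List[Tuple[Set[str], str]]) -> Dict[str, Counter]:
    """Count active and inactive chemicals within each chemical class."""
    profile: Dict[str, Counter] = {}
    for classes, call in records:
        for cls in classes:
            profile.setdefault(cls, Counter())[call] += 1
    return profile


def text_bars(profile: Dict[str, Counter]) -> List[str]:
    """Render one bar per class: '#' for actives, '.' for inactives."""
    # Most populated classes first; ties are broken alphabetically.
    ordered = sorted(profile, key=lambda c: (-sum(profile[c].values()), c))
    lines = []
    for cls in ordered:
        counts = profile[cls]
        bar = "#" * counts["active"] + "." * counts["inactive"]
        lines.append(f"{cls:<8}{bar} ({sum(counts.values())})")
    return lines


profile = class_profile(RECORDS)
for line in text_bars(profile):
    print(line)
```

  A chemical that belongs to several classes is counted once in each of them, so the class totals can exceed the number of chemicals in the database.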
  The above data mining capabilities can be expand-
ed to perform more complex searches, by formulat-
ing queries where  specific  combinations  of struc-
tures, data and text (i.e.,  "chemical profiles") are
searched for in the database at the same time.
  Another crucial  operation that can be performed
on structural databases is that of calculating chemi-
cal similarity between pairs of chemicals [16]. Based
on the structural motifs in common to two chemi-
cals, the degree of similarity can be  quantified on,
e.g.,  a 0 to 1 scale, and the resulting similarity value
can be used as supporting evidence in the process of
identifying categories of similar chemicals.
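  One widely used way to express similarity on a 0 to 1 scale is the Tanimoto (Jaccard) coefficient: the number of structural features two chemicals share, divided by the number of features present in either. The sketch below uses it as an assumed example; it is not necessarily the measure used in [16] or in any particular CRD application, and the feature sets are hand-assigned for illustration rather than computed from structures.

```python
from typing import Set


def tanimoto(a: Set[str], b: Set[str]) -> float:
    """Shared features divided by all features present in either set."""
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty feature sets
    return len(a & b) / len(union)


# Hand-assigned, illustrative feature sets (not computed fingerprints).
aniline = {"aromatic ring", "primary amine"}
toluidine = {"aromatic ring", "primary amine", "methyl"}
ethanol = {"hydroxyl", "ethyl"}

print(round(tanimoto(aniline, toluidine), 2))  # 0.67
print(tanimoto(aniline, ethanol))              # 0.0
```

  The value obtained depends entirely on which structural features are enumerated, which is one reason such a score serves as supporting evidence rather than as a definition of a chemical category.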
  A more sophisticated use of data mining approaches
allowed by modern CRD applications is the identi-
fication of one or more common structural patterns
among groups  of chemicals with  similar character-
istics or profiles (e.g., toxicity). Such patterns, when
identified, can be used as predictive models to estimate
the toxicity of other chemicals with similar structural
patterns [14].


  THE DSSTOX DATABASE PROJECT
  In view of the powerful opportunities provided
by CRD technology, a major problem is that of
transforming the available databases according to
the new standards. Considerable progress is
represented by PubChem, which allows the user to browse
through the US public databases individually and
collectively according to structural criteria. However,
even though this  design permits a user to explore
and download all  or portions of the available infor-
mation, there is no quality review  of the structural
inventory of PubChem in relation to bioassay data,
which come from a large number of user-depositors
-------
or sources with various levels of quality review ap-
plied to their data; hence, it is largely a "user-beware"
public resource. New initiatives are now being devel-
oped to address this concern in the world of toxicity
data. An example of a project designed to provide the
user with self-contained data files that can be readily
incorporated into CRDs and used freely is the
Distributed Structure-Searchable Toxicity (DSSTox)
Database Network, which is a project of the USEPA
(www.epa.gov/comptox/).
  A primary objective of the DSSTox website
(www.epa.gov/ncct/dsstox) is to serve as a central
community forum for publishing standard-format,
structure-annotated chemical toxicity data files for
open-access, public use, and for use in  CRD appli-
cations. DSSTox efforts include the careful quality
annotation of chemical structures, standardization
and documentation of toxicity data in collaboration
with toxicity data experts, and open public access to
toxicity databases.
  In the initial phase, data files were not structure-
searchable on the DSSTox website itself, but the
data files could be downloaded in their entirety and
freely used. Since September 2007, a DSSTox
structure-browser offered on the DSSTox website has
allowed structure/substructure/similarity-searching
through all DSSTox data file content. It can also be
accessed from the websites of off-site collaborators
(e.g., CPDB, EPA IRIS, NTP) to search either local
content (e.g., just the content of the originator's
website) or the broader DSSTox inventory; external
links to PubChem are soon to be added.
  At present, the  DSSTox data file cluster includes
six separate databases: CPDBAS - Carcinogenic Potency
Project Summary Tables (Source, LS Gold,
Carcinogenic Potency Project, UC Berkeley); DBPCAN
- EPA Disinfection By-products Carcinogenicity Es-
timates Database  (Source, YT Woo, USEPA, Office
of Pollution Prevention & Toxics); EPAFHM - EPA
Fathead Minnow Acute Toxicity Database (Source,
C. Russom, USEPA, Mid-Continental Ecology Di-
vision-Duluth); NCTRER -  FDA NCTR Estrogen
Receptor Binding Database (Source,  Weida Tong
and Hong  Fang,  National Center for Toxicological
Research, Jefferson, Arkansas); FDAMDD  - FDA
Maximum Recommended Daily Dose (Source, Edwin
Matthews and R. Daniel Benz, US FDA, Rockville,
MD), and  the  newest data file, IRISTR (Source,
USEPA's Integrated Risk Information System Toxicity
Reviews), which includes 34  toxicity-related  content
fields. Additionally, the DSSTox file inventory includes
2 structure-locator  files, HPVCSI  (USEPA's High
Production Volume Challenge Program) and NTPBSI
Fig. 3 | Example of classification of the chemicals in a database by chemical classes. The analysis was performed with the program LeadScope (LeadScope Inc., Ohio).
-------
(National Toxicology Program Bioassay)  contain-
ing URL addresses to chemical-specific data pages,
and 2 structure-index files containing only a chemi-
cal structure listing, NTPHTS (National Toxicology
Program High-Throughput Screening) and TOXCST
(EPA's National Center for Computational Toxicology
ToxCast testing program).
  Each DSSTox database is published as a separate
and distinct module that adheres to standard
conventions in SDF data file format, file names,
chemical structure fields, and minimum documentation
requirements. Together with the SDF file, DSSTox
provides an MS Excel-readable file (.xls) reporting
the non-structural data, and an Acrobat-readable
file (.pdf) that displays the traditional graphical
representation of the chemicals. In addition, the
DSSTox website provides a detailed guide to the use
of the files, and rich documentation on the entire
subject of databases and related concepts [12, 17]. The
collected DSSTox published inventory contains over
six thousand unique chemical substances relevant to
toxicology and can be merged for structure-search-
ing, or ported into CRD applications.
  THE ISSCAN DATABASE
  ON CHEMICAL CARCINOGENS
  As pointed out  above, currently the public has
access to a variety of toxicity databases; however,
these publicly available  data may not be immedi-
ately suitable for use. One  general issue is that of
data quality, both from a chemical and biological
perspective. Beyond its most obvious meaning (data
"must" be of good quality,  otherwise any inference
based on them is simply devoid of any value), there
are more subtle problems linked to this issue. For ex-
ample, for each chemical the CCRIS (as well as the
CPDB) reports all the available experimental results.
There are cases where more than one experiment,
with contradictory results, exists for a given
chemical. There are also cases where the experimental
protocols differ to a large extent. In all these cases,
the database user has to employ her/his expert
judgement to make an activity assignment. Linked to
the data quality issue is that of data
standardization, which can become extremely critical
for some more formalized applications, such as
QSAR analyses [9]. These approaches need highly
summarized representations of the activity of the
chemicals (i.e., a single number for the potency of
the active compounds; a dichotomous classification
into actives/inactives). But the large public
databases often do not meet these modeling requirements.
One example is  the NTP on-line database that in-
cludes high-level detail on animal bioassays and ge-
netic toxicity experiments for several thousands of
chemicals, respectively, but  which does not provide
ready access  to  data for the entire chemical study
inventory, relational access to particular slices of the
data,  or aggregate summarizations of the data ac-
cording to the requirements of QSAR modeling.
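  The situation described above, in which several experiments on the same chemical disagree, can at least be detected automatically, even though resolving it requires expert judgement. The sketch below is a hypothetical illustration, not the procedure followed by any of the databases discussed here: it collapses per-experiment results into a single call where all experiments agree and flags the remaining chemicals for review. The identifiers, result labels, and function name are invented.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical per-experiment results: (chemical identifier, outcome).
EXPERIMENTS: List[Tuple[str, str]] = [
    ("chem-1", "positive"),
    ("chem-1", "negative"),
    ("chem-2", "positive"),
    ("chem-2", "positive"),
    ("chem-3", "negative"),
]


def summarize_calls(experiments: List[Tuple[str, str]]) -> Dict[str, str]:
    """Give one call per chemical, or flag it when experiments disagree."""
    outcomes = defaultdict(set)
    for chemical, result in experiments:
        outcomes[chemical].add(result)
    summary = {}
    for chemical, results in outcomes.items():
        if len(results) > 1:
            summary[chemical] = "needs expert review"   # contradictory data
        else:
            summary[chemical] = next(iter(results))     # unanimous outcome
    return summary


print(summarize_calls(EXPERIMENTS))
# {'chem-1': 'needs expert review', 'chem-2': 'positive', 'chem-3': 'negative'}
```

  Only the unanimous cases yield a dichotomous value directly usable for QSAR modeling; the flagged cases are those for which differences in protocol and study quality have to be weighed by a toxicologist.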
  To alleviate the above problems, a new database on
chemical carcinogens, called ISSCAN ("Chemical
carcinogens: structures and experimental data"), has
been built at the Istituto Superiore di Sanità (ISS).
The data can be freely downloaded from the ISS website:
www.iss.it/ampp/dati/cont.php?id=233&lang=l&tipo=7
or from the DSSTox site:
www.epa.gov/ncct/dsstox/sdf_isscan_external.html.
  The ISSCAN database contains information on
chemical compounds tested with the long-term
carcinogenicity bioassay on rodents (rat, mouse). The
specific characteristics of the ISSCAN database with
respect to other databases should be emphasized. First,
the ISSCAN initiative is aimed at providing the
scientific and regulatory community with carcinogenicity
calls that have been re-checked, in order to ensure the
quality of the data. The data were cross-checked against
the different sources of information available;
contradictions were resolved by going back to the
original papers, and results based on insufficient
protocols were not included. Second, the biological data
(carcinogenicity and Salmonella mutagenicity) were coded
in numerical terms that can be used directly for QSAR
analyses. Being QSAR-ready eliminates the intermediate
step of data transformation, which is often
problematic for the QSAR practitioner without
specific toxicological expertise.
  The general structure of the database is inspired
by that of DSSTox. The ISSCAN database is
composed  of standard chemical data fields,  such
as 2D structure,  chemical name and synonyms,
CAS registry number, molecular weight, chemical
formula and  SMILES notation, together with bio-
logical data fields: carcinogenic potency  in rat and
mouse,  mutagenicity in  Salmonella typhimurium
(Ames test), carcinogenicity results in the  four ex-
perimental  groups most commonly used for the can-
cer bioassay,  carcinogenicity results from the NTP
experimentation (when available), overall  carcino-
genicity, together with the source of carcinogenicity
data. Figure 4 displays the information reported by
ISSCAN for a representative chemical.
  From the website  it is possible to download four
different files:
  1) an SDF file containing chemical structures to-
    gether with chemical and biological data;
  2) a PDF file with a detailed explanation and
    guidance for use;
  3) a PDF file with 2D chemical structures of the
    substances;
  4) an XLS file of the data.
  At present, the second updated version of ISSCAN
is available, including 890 chemicals tested for ro-
dent carcinogenicity (the  main primary sources of
data are the NTP, CPDB, CCRIS, and IARC
repositories). We plan to complete the evaluation
of the remaining chemicals by the year 2008.
Since the SDF file cannot be read by users without
specialized  software applications, it is also our plan
to make available on our website a tool suitable for
simple analyses.
-------
  It should be emphasized that this type of project
(ISSCAN) is not in opposition to other databases
(e.g., CCRIS, CPDB) that follow the philosophy of
reporting vast amounts of data at different
hierarchical levels, including contradictory evidence
where it exists. In contrast and complementary to
these efforts, the ISSCAN initiative is aimed at pro-
viding the end-user with information that is revised
and  re-organized for a specific  aim,  whereas the
above databases have the important role of keeping
track of all the available information. Even when
the knowledge contribution  of  portions  of such
databases  looks very minor (e.g., data from experi-
ments with few animals and old protocols), this - in
a different context - may turn out to be very useful
for, e.g., planning further studies.


  CONCLUSIONS
  The key to rapid progress in exploiting chemical
toxicity databases is to combine information
technology with the chemical structure as the
identifier of the molecules. This permits an enormous
range of operations (e.g., retrieving chemicals
or chemical classes, describing the content of data-
bases, finding  similar chemicals, crossing biological
and  chemical  interrogations,  etc.) that other  more
classical databases cannot allow.  In the foreseeable
future, this trend will become even more pervasive:
a clear demonstration of this trend is the creation by
NCBI of the chemically-interrogable PubChem da-
tabase fully integrated  with the traditional, textual
PubMed  (http://www.ncbi.nlm.nih.gov/sites/entrez)
repository of biomedical information.  At the same
time, there is a proliferation of new tools aimed at
Fig. 4 | Information reported by ISSCAN for a representative chemical. Fields shown: Formula, FW, Substance ID, SAL, TD50_Rat, TD50_Mouse, Canc, MolWeight, ChemName, Reference, SMILE, CAS, Synonyms, Rat_Male_Canc, Rat_Female_Canc, Mouse_Male_Canc, Mouse_Female_Canc, Rat_Male_NTP, Rat_Female_NTP, Mouse_Male_NTP, Mouse_Female_NTP.
-------
           animal toxicity on a chosen set of compounds: the
           standardization of data and CRD-accessibility will
           be a necessary requirement in order to fully exploit
           the value  of these data (for more information, see:
           www.epa.gov/ncct/toxcast/).


           Acknowledgements
           This work was partially funded by EU FP6 Contract n.
           037017 OSIRIS "Optimized strategies for risk assessment of
           industrial chemicals through Integration of non-test and test
           information".

                                                              Disclaimer
                                                              This manuscript does not necessarily reflect the views and policies
                                                              of the USEPA, nor does mention of trade names or commercial
                                                              products constitute endorsement or recommendation for use.

                                                              Submitted on invitation.
                                                              Accepted on 16 December 2007.
           References
             1.  Organisation for Economic Co-operation and Development.
                Report on the Regulatory Uses and Applications in OECD
                Member Countries of (Quantitative) Structure-Activity Re-
                lationship ((Q)SAR) Models in the Assessment of New and
                Existing Chemicals. OECD Series on Testing and
                Assessment. Paris: OECD; 2006. (ENV Monograph No. 58).
             2.  Richard AM. Future of predictive toxicology. An expanded
                view of "chemical toxicity", future of toxicology
                perspective. Chem Res Toxicol 2006;19:1257-62.
             3.  Commission of the European Communities. Proposal concern-
                ing the registration, evaluation, authorisation and restriction of
                chemicals (REACH). (COM(2003)644Final). Bruxelles: EU;
                2003.
             4.  Benigni R, Netzeva TI, Benfenati E, Bossa  C, Franke R,
                Helma C, Hulzebos E, Marchant C, Richard A, Woo Y-T,
                Yang C. The expanding role of predictive toxicology: an
                update on the (Q)SAR models for mutagens and carcinogens.
                J Environ Sci Health C 2007;25:53-97.
             5.  Pedersen F, de Bruijn J, Munn SJ, Van Leeuwen K.
                Assessment of additional testing needs under REACH. Effects
                of (Q)SARs, risk based testing and voluntary industry
                initiatives. Ispra: Joint Research Centre; 2003. (JRC Report
                EUR 20863 EN).
             6.  Van der Jagt K, Munn SJ, Torslov J, de Bruijn J. Alternative
                approaches can reduce the use of test animals under REACH.
                Addendum to the Report "Assessment of additional testing
                needs under REACH. Effects of (Q)SARs, risk based
                testing and voluntary industry initiatives". Ispra: Joint
                Research Centre; 2004. (JRC Report EUR 21405 EN).
             7.  Hansch C, Leo A. Exploring QSAR 1. Fundamentals and ap-
                plications in chemistry andbiology. Washington DC: American
                Chemical Society; 1995.
             8.  Hansch C,  Hoekman D, Leo A, Weininger D, Selassie CD.
                Chem-bioinformatics: comparative QSAR at the interface be-
                tween chemistry and biology. Chem Rev 2002;102:783-812.
             9.  Franke R,  Gruska  A. General introduction to QSAR. In:
                                                              10.
                                                              11
    Benigni R (Ed.). Quantitative structure-activity relationhsip
    (QSAR) models of mutagens and carcinogens.  Boca Raton:
    CRC Press; 2003. p. 1-40.
    Benigni R. Structure-activity relationship studies of chemi-
    cal mutagens and carcinogens: mechanistic investigations
    and prediction approaches. Chem Rev 2005;105:1767-800.
    Benigni R, Bossa C, Netzeva TI, Worth  AP. Collection and
    evaluation of ( Q)SAR models for mutagenicity and carcino-
    genicity.  Office for the Official Publications of the European
    Communities. EUR - Scientific and Technical Research Series.
    Luxenbourg; 2007. (EUR 22772 EN). Available from: http://
    ecb.jrc.it/documents/QSAR/EUR_22772_EN.pdf; last  vis-
    ited 21/11/2007.
                                                              12
    Richard AM, Williams CR. Public sources of mutagenicity
    and carcinogenicity data: use in structure-activity relation-
    ship models. In:  Benigni R (Ed.). Quantitative Structure-
    Activity Relationship (QSAR) models of mutagens and car-
    cinogens. Boca Raton: CRC Press; 2003. p. 145-74.
    Yang C, Benz RD, Cheeseman MA. Landscape of current
    toxicity databases and database standards. Curr Opinion Drug
    Discov Develop 2006;9:124-33.
    Yang C, Richard AM, Cross KP. The art of data mining the
    minefields of toxicity databases to link chemistry to biology.
    Curr Comput Aid Drug Des2006;2:l35-50.
    Richard AM, Gold LS, Nicklaus MC. Chemical structure in-
    dexing of toxicity data on the Internet: moving toward a flat
    world. Curr Opinion Drug Discov Develop 2006;9:314-25.
    Gallegos Saliner A. Mini-review on chemical similarity and pre-
    diction of toxicity.  Curr Comput Aid Drug Des2006;2:l05-22.
17.  Richard AM. DSSTox web site launch: improving public ac-
    cess to databases for  building structure-toxicity prediction
    models. Preclinica 2004;2:103-8.
18.  Dix DJ, Houck KA, Martin MT, Richard AM,  Setzer MW,
    Kavlock RJ. The ToxCast program for  prioritizing toxicity
    testing of environmental chemicals. Toxicol Sci2007;95:5-l2.
                                                              13
                                                              14
                                                              15
                                                              16

-------
                                                                                                                    Research


A Novel  Two-Step Hierarchical Quantitative Structure-Activity Relationship
Modeling Work Flow for Predicting Acute Toxicity of Chemicals  in  Rodents
Hao Zhu,1 Lin Ye,1 Ann Richard,2 Alexander Golbraikh,1 Fred A. Wright,3 Ivan Rusyn,4* and Alexander Tropsha1*
1Laboratory for Molecular Modeling, Division of Medicinal Chemistry and Natural Products, School of Pharmacy, University of North
Carolina at Chapel Hill, Chapel Hill, North Carolina, USA; 2National Center for Computational Toxicology, Office of Research and
Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, USA; 3Department of Biostatistics,
and 4Department of Environmental Sciences and Engineering, School of Public Health, University of North Carolina at Chapel Hill,
Chapel Hill, North Carolina, USA
 BACKGROUND: Accurate prediction of in vivo toxicity from in vitro testing is a challenging problem.
 Large public-private consortia have been formed with the goal of improving chemical safety
 assessment by the means of high-throughput screening.
 OBJECTIVE: A wealth of available biological data requires new computational approaches to link
 chemical structure, in vitro data, and potential adverse health effects.
 METHODS AND RESULTS: A database containing experimental cytotoxicity values for in vitro
 half-maximal inhibitory concentration (IC50) and in vivo rodent median lethal dose (LD50) for more
 than 300 chemicals was compiled by Zentralstelle zur Erfassung und Bewertung von Ersatz- und
 Ergaenzungsmethoden zum Tierversuch (ZEBET; National Center for Documentation and
 Evaluation of Alternative Methods to Animal Experiments). The application of conventional
 quantitative structure-activity relationship (QSAR) modeling approaches to predict mouse or rat
 acute LD50 values from chemical descriptors of ZEBET compounds yielded no statistically
 significant models. The analysis of these data showed no significant correlation between IC50 and LD50.
 However, a linear IC50 versus LD50 correlation could be established for a fraction of compounds.
 To capitalize on this observation, we developed a novel two-step modeling approach as follows.
 First, all chemicals are partitioned into two groups based on the relationship between IC50 and
 LD50 values: One group comprises compounds with linear IC50 versus LD50 relationships, and
 another group comprises the remaining compounds. Second, we built conventional binary
 classification QSAR models to predict the group affiliation based on chemical descriptors only. Third,
 we developed k-nearest neighbor continuous QSAR models for each subclass to predict LD50 values
 from chemical descriptors. All models were extensively validated using special protocols.
 CONCLUSIONS: The novelty of this modeling approach is that it uses the relationships between in vivo
 and in vitro data only to inform the initial construction of the hierarchical two-step QSAR models.
 Models resulting from this approach employ chemical descriptors only for external prediction of
 acute rodent toxicity.
 KEY WORDS: acute toxicity, computational toxicology, IC50, LD50, LOAEL, NOAEL, QSAR.
 Environ Health Perspect 117:1257-1264 (2009). doi:10.1289/ehp.0800471 available via
 http://dx.doi.org/ [Online 3 April 2009]
Development  of accurate and predictive
in vitro toxicity testing methods that could
be used as alternatives for lengthy and costly
in vivo experiments has long been an elusive
goal for both industry and regulatory agencies
(National Research Council 2007). New, bold
research programs were recently established at
the National Toxicology Program (Xia et al.
2008) and the U.S. Environmental Protection
Agency (U.S. EPA) (Dix et al. 2007) and
coordinated at the interagency level by the
U.S. government (Collins et al. 2008) to
address this important challenge in a system-
atic way. The overall goal of these initiatives
is to explore a diverse array of in vitro toxicity
assays, such as cell-based and cell-free high-
throughput  screening (HTS) techniques,  as
well as toxicogenomic technologies, to evaluate
the toxic potential of chemicals and prioritize
candidates for animal testing.  However, the
utility of in vitro data as indicators of in vivo
effects will be fully realized only if rigorous
correlation between the toxicity of chemi-
cals  in vitro and in vivo can be established
(National Research Council 2007; Rabinowitz
et al. 2008).
   Many previous studies have indicated that
the correlation between the in vitro toxicity
results and animal toxicity test data (e.g., acute,
subacute, subchronic, and chronic rodent tox-
icity test results) is generally poor. Most nota-
bly,  in 2001, the Interagency Coordinating
Committee on the Validation of Alternative
Methods (ICCVAM) hosted a workshop to
assess the relationship between cytotoxicity
and  rodent acute toxicity for > 300 diverse
compounds; the data were compiled by the
Zentralstelle zur Erfassung und Bewertung
von Ersatz- und Ergaenzungsmethoden zum
Tierversuch (ZEBET; the National Center for
Documentation and Evaluation of Alternative
Methods to Animal Experiments) [ICCVAM
and National Toxicology Program Interagency
Center for the Evaluation  of Alternative
Toxicological Methods (NICEATM) 2001].
It was concluded that there is no clear cor-
relation between cytotoxicity [half-maximal
inhibitory concentration (IC50)] and acute
toxicity [median lethal dose (LD50)] data in
rodents. Similarly, poor correlation was found
between in vitro cytotoxicity and in vivo
rodent carcinogenicity, even when a diverse
set of in vitro end points from HTS was used
(Xia et al. 2008; Zhu et al. 2008).
    Cheminformatics approaches such as quan-
titative structure-activity relationship (QSAR)
modeling have been widely used in toxicology
(Dearden 2003; Johnson et al. 2004). Several
software packages, such as Toxicity Prediction
by Komputer Assisted Technology (TOPKAT)
(Venkatapathy et al. 2004) and Multiple
Computer-Automated Structure Evaluation
(MultiCASE) (Matthews et al. 2006), have
been  developed and actively used by both
industry and regulatory agencies. However,
existing modeling tools generally do not achieve
good external accuracy of prediction  for com-
pounds not used in model development, and
few QSAR models have been successful in pre-
dicting in  vivo toxicity end points for diverse
sets of environmental compounds (Benigni
et al. 2007; Stouch et al. 2003).

Address correspondence to A. Tropsha, 327 Beard
Hall, University of North Carolina, Chapel  Hill,
NC 27599-7360 USA. Telephone: (919) 966-2955.
Fax: (919) 966-0204. E-mail: alex_tropsha@unc.edu
  *These authors contributed equally to this work.
  Supplemental Material is available online
(doi:10.1289/ehp.0800471.S1 via http://dx.doi.org/)
  We  thank  T. Martin [U.S.  Environmental
Protection Agency (EPA)], and J.  Strickland and
M. Jackson (ILS, Inc., Durham, NC) for providing
some of the data used in this study. We also thank
W. Setzer (U.S. EPA) for his  interest in this study
and valuable comments on the moving M-regression
method.
  This work was supported, in part, by grants from
the National Institutes of Health (GM076059 and
ES005948) and the U.S. EPA (RD83272001  and
RD83382501).
  The research described in this article has not been
subjected to each funding agency's peer review
and policy review and therefore does not neces-
sarily reflect their views, and no official endorse-
ment should be inferred. The  manuscript has been
reviewed by the U.S. EPA National Center for
Computational Toxicology and approved for pub-
lication. Approval does not signify that the con-
tents necessarily reflect the views and policies of the
agency, nor does mention of trade  names or com-
mercial products constitute endorsement or recom-
mendation for use.
  The authors declare they  have  no competing
financial interests.
  Received 26 November 2008; accepted 3 April 2009.
Environmental Health Perspectives • VOLUME 117 | NUMBER 8 | August 2009

-------
Zhu et al.
   There are several possible reasons that previ-
ous attempts to establish relationships between
in vitro and in vivo toxicity data were largely
ineffective. These include, among other fac-
tors,  inadequate attention paid to the chemical
diversity of the compounds used for screening
and modeling and, consequently, unjustified
confidence in the ability of models to extrapo-
late significantly outside the chemistry space
of the training set. Furthermore, the conven-
tional QSAR modeling efforts have been dis-
connected from the growing efforts to employ
in vitro screening (i.e., HTS data) to predict
in vivo outcomes. Recently, we have proposed
the use of hybrid chemical-biological descriptors,
that is, a combination of conventional
chemical descriptors with HTS profile data
regarded as biological descriptors. We have
demonstrated that these hybrid descriptors
afford QSAR models with significantly higher
accuracy  of prediction of rodent carcino-
genicity versus models using chemical descrip-
tors alone, and much higher  accuracy versus
models that used biological in vitro data alone
(Zhu et al. 2008).
   These  recent studies suggest that the
explicit consideration  of chemical structure (in
the form of chemical  descriptors) along with
in vitro assay data could potentially account
for discrepancies between in vitro and  in vivo
results  and produce more accurate predictive
models of in  vivo toxicity. To validate this
hypothesis further, in this study we used the
ZEBET data  set (ICCVAM and NICEATM
2001) for which previous attempts to establish
the direct in vitro/in vivo correlation proved
largely unsuccessful (Freidig et al. 2007). We
have observed that chemicals can be partitioned
into two classes based  on comparison between
cytotoxicity and acute toxicity data: a) those for
which the linear in vitro (IC50)/in vivo (LD50)
correlation could be demonstrated and b) those
that correlate poorly. Furthermore, and  of cen-
tral importance for applying our models to the
external set of chemicals for which no in vitro
data exist, we have built binary QSAR models
that could discriminate between compounds in
these two classes with reasonable accuracy based
on their chemical features alone.  Finally, we
have established rigorous and externally  predic-
tive class-specific QSAR models of rodent acute
toxicity measured by  LD50 values. We show
that a two-step hierarchical QSAR modeling
work flow, in which compounds are first assigned
to a class using binary QSAR models and
their LD50 values are then predicted using
class-specific continuous QSAR models, affords
accurate prediction of LD50 values for compounds
not included in the training set. In addition, we
show that this two-step model's statistical pre-
diction accuracy compares favorably with cur-
rently available commercial toxicity predictors.
Our studies suggest that the two-step  QSAR
modeling work flow can improve performance
of predictive acute toxicity models for diverse
organic compounds and aid in prioritizing
compounds for rodent toxicity testing.
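
The two-step work flow described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`nearest`, `predict_two_step`), the toy one-descriptor data, and the use of a plain majority vote and an unweighted neighbor mean are all hypothetical simplifications of the binary and class-specific kNN QSAR models.

```python
import numpy as np

def nearest(X_train, x_new, k):
    """Indices of the k training compounds closest to x_new (Euclidean)."""
    d = np.linalg.norm(X_train - x_new, axis=1)
    return np.argsort(d)[:k]

def predict_two_step(X, cls, y, x_new, k=3):
    """Step 1: assign class 1 or 2 from descriptors; step 2: class-specific estimate."""
    # Step 1: binary classification by majority vote among the k nearest neighbors.
    votes = cls[nearest(X, x_new, k)]
    assigned = 1 if np.sum(votes == 1) > len(votes) / 2 else 2
    # Step 2: predict the activity using only training compounds of that class.
    members = cls == assigned
    X_c, y_c = X[members], y[members]
    nn = nearest(X_c, x_new, min(k, len(y_c)))
    return assigned, float(y_c[nn].mean())

# Hypothetical toy data: one descriptor, two well-separated classes.
X = np.array([[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]])
cls = np.array([1, 1, 1, 2, 2, 2])
y = np.array([0.0, 0.2, 0.4, 2.0, 2.2, 2.4])   # e.g., log(1/LD50)
result_low = predict_two_step(X, cls, y, np.array([0.05]))
result_high = predict_two_step(X, cls, y, np.array([1.15]))
```

The point of the design is that no in vitro measurement is needed for the new compound: both steps use chemical descriptors only.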

Materials and Methods
Data sets. The ZEBET database consists of
data for 361 chemicals compiled from litera-
ture studies and published in a consolidated
ICCVAM report (ICCVAM and NICEATM
2001). Every compound in this data set  has
at least one cytotoxicity result (IC50) and at
least one type of rodent acute toxicity value
(rat  or mouse  LD50). We defined ZEBET
criteria to select cytotoxicity data for this data
set as follows: a) at least two different IC50
values were available, either from different
cell types or from different cytotoxicity end
points; b) cytotoxicity data were obtained
with mammalian cells; c)  cytotoxicity data
obtained with hepatocytes were not accept-
able; and  d)  chemical exposure  time  in
the  cytotoxicity tests was at  least 16  hr.
Furthermore, only the results obtained from
the following cytotoxicity tests were accepted:
a) cell proliferation  measured by cell num-
ber, protein, DNA  content, DNA synthe-
sis, or colony formation; b) cell viability and
metabolic  indicators, including metabolic
inhibition test (MIT-24),  3-(4,5-dimethyl-
thiazol-2-yl)-2,5-diphenyltetrazolium
bromide (MTT) assay, 3-(4,5-dimethyl-
thiazol-2-yl)-5-(3-carboxymethoxyphenyl)-2-
(4-sulfophenyl)-2H-tetrazolium (MTS) assay,
and sodium 3,3-(1-[(phenylamino)carbonyl]-3,4-tetrazolium)-
bis(4-methoxy-6-nitro)benzene
sulfonic acid hydrate (XTTC); c) cell
viability and membrane indicators, including
neutral red uptake, trypan blue exclusion,  cell
attachment, and cell  detachment; and d) dif-
ferentiation indicators.
    For the  purpose  of this work,  we
curated the data set to select the subset of
organic  compounds  and excluded inorganic
and organometallic  compounds, as well as
compound mixtures, because conventional
chemical descriptors used  in QSAR studies
could not be computed in  these cases. There
were 254 and 235 compounds that had rat
or mouse LD50 (millimole/kilogram-body
weight/day) values,  respectively. Only  LD50
values published in the Registry of Toxic
Effects  of  Chemical Substances (RTECS)
(Norager et al. 1978; Ruden  and Hansson
2003) were used. The distributions of log(1/LD50)
values of ZEBET compounds, with
the exception  of a single outlier, were from
-2.61 to 2.30  for the rat and from -2.50 to
2.19 for the mouse. We considered one compound,
2,3,7,8-tetrachlorodibenzo-p-dioxin
(CAS 1746-01-6), an activity outlier because
its log(1/LD50) value was -4.21 for the rat,
which deviated significantly from the activ-
ity range of the data set. After excluding this
single outlier, the data sets used for modeling
consisted of 253 compounds for the rat and
235 compounds for the mouse. An additional
set of 115 compounds with complete data
(both LD50 and IC50) for the rat was recently
released by ICCVAM, which we used for vali-
dation (referred to as the ICCVAM data set).
[For raw data, see  Supplemental Material,
Table 1 (doi:10.1289/ehp.0800471.S1).]
   The  data  on rat chronic lowest observed
adverse effect levels (LOAELs) and rat chronic
no observed adverse effect levels (NOAELs)
were  compiled from an internal low-dose
toxicity data set established in our labora-
tory [see  Supplemental Material,  Table 2
(doi:10.1289/ehp.0800471.S1)]. These data
include a combination of multiple toxicity
phenotypes, such as liver toxicity and kidney
toxicity. Compared with the ZEBET data set,
42 unique compounds have both rat LOAEL
and in vitro IC50 values, and 41 compounds
have both NOAEL and in vitro IC50 values.
Because of limited availability of LOAEL and
NOAEL data,  we used these two data sets
only to  illustrate the data partitioning algo-
rithm and did  not build any QSAR models
for them.
   QSAR modeling approaches. We used
the k-nearest neighbor (kNN) QSAR modeling
approach that has been developed in our
group (Zheng and Tropsha 2000). In brief,
the method is based on the kNN principle and
the variable selection procedure. It employs
the leave-one-out cross-validation procedure
(LOO-CV) and a simulated-annealing algorithm
for the variable selection. The procedure
starts with the random selection of a
predefined number of descriptors from all
descriptors. If k > 1, the estimated activities
$\hat{y}_i$ of compounds excluded by the LOO
procedure are calculated using the following
formula:

\[ \hat{y}_i = \frac{\sum_{j=1}^{k} w_{ij}\, y_j}{k - 1}, \qquad [1] \]

where $y_j$ is the activity of the jth compound.
We define weights $w_{ij}$ as

\[ w_{ij} = 1 - \frac{d_{ij}}{\sum_{j'=1}^{k} d_{ij'}}, \qquad [2] \]

where $d_{ij}$ is the Euclidean distance between
compound i and its jth nearest neighbor.
Further details of the algorithms and work
flow are provided elsewhere  (Medina-Franco
et al. 2005; Ng et al. 2004; Shen et al. 2002;
Zheng and Tropsha 2000).
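
A minimal sketch of the leave-one-out kNN estimate may help make the procedure concrete. It assumes distance-based weights of the form w_ij = 1 - d_ij / (sum of the k neighbor distances), so that the weights sum to k - 1 and the weighted sum is divided by k - 1; this form is an assumption made for illustration, and the function name `knn_loo_predictions` and the toy data are hypothetical. Variable selection by simulated annealing is not shown.

```python
import numpy as np

def knn_loo_predictions(X, y, k):
    """Leave-one-out kNN estimates with distance-based weights (assumed form)."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                      # exclude compound i itself (LOO)
        nn = np.argsort(d)[:k]             # its k nearest neighbors
        if k == 1:
            preds[i] = y[nn[0]]            # single neighbor: copy its activity
            continue
        total = d[nn].sum()
        if total == 0.0:
            w = np.full(k, (k - 1) / k)    # identical neighbors: equal weights
        else:
            w = 1.0 - d[nn] / total        # closer neighbors get larger weights
        preds[i] = np.dot(w, y[nn]) / (k - 1)   # weights sum to k - 1
    return preds

# Hypothetical toy data: four compounds, one descriptor.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, 2.0, 3.0])
loo = knn_loo_predictions(X, y, k=2)
```

With k = 2, each estimate is a weighted average of two neighbors in which the nearer neighbor receives the larger weight.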
   We developed rat and mouse LD50
QSAR models for ZEBET compounds using
DRAGON chemical descriptors (DRAGON

-------
                                                                              Two-step QSAR for acute rodent toxicity prediction
for Windows, version 5.4; Talete s.r.l.,
Milan,  Italy). Before model construction,
23 compounds with rat LD50 results and 24
compounds with mouse LD50 results were
selected at random to serve as external vali-
dation sets. The remaining 230 rat and 211
mouse compounds were used as modeling
sets, and each was divided multiple times
into training/test sets using the sphere exclu-
sion approach (Golbraikh  et al. 2003). We
characterized the statistical significance of the
models with the standard LOO-CV R2 (q2)
for the training sets and the conventional R2
for the  test sets when modeling  real values
(i.e., continuous QSAR). For classification
modeling, we used correct classification rates
expressed as a fractional value between 0 and
1. The model acceptability cutoff values of
the LOO-CV accuracy of the training sets
and the prediction accuracy for test sets were
both set to 0.65  for classification models. For
continuous models, the acceptability thresholds
for LOO-CV regression q2 for the training
sets and R2 values for the test sets were
both set at 0.5. Models that did not meet
both training and test set cutoff criteria were
discarded.
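
The acceptance rule above is simple enough to express directly. The sketch below is illustrative only: the helper names are hypothetical, and it assumes that both q2 and the test-set R2 are computed as 1 - SS_res/SS_tot and that a model must exceed the cutoff strictly.

```python
import numpy as np

def determination_coefficient(y_obs, y_pred):
    """1 - SS_res / SS_tot; applied to LOO predictions it gives q2 (assumed definition)."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_obs - y_pred) ** 2)
    ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def accept_continuous(q2_train, r2_test, cutoff=0.5):
    """Keep a continuous model only if both training q2 and test R2 pass."""
    return q2_train > cutoff and r2_test > cutoff

def accept_classifier(acc_train_loo, acc_test, cutoff=0.65):
    """Keep a classification model only if both correct classification rates pass."""
    return acc_train_loo > cutoff and acc_test > cutoff

# Hypothetical example values.
r2_example = determination_coefficient([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8])
```

Requiring both criteria guards against models that fit the training set well but fail on held-out compounds.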
    Moving M-regression for data partitioning.
We used a novel approach related to a class
of M-regression methods (Andersen 2007),
which we termed "moving M-regression," to
select compounds for which there is a strong
correlation between IC50 and LD50 values
(class 1). The approach is a variant of the least
squares regression that takes into account
only those data points contained within a
band around the regression line $y_{regr} = ax + b$.
For each y, only the points within the
interval $[y - d_1, y + d_2]$ are candidates for class
1, whereas points outside of this band are
excluded from class 1. If the line y = ax + b
is moved, some new points will enter the
band, whereas some other points will leave
it, which may result in a higher regression R2;
this also explains why we use the term "moving
M-regression." For each point
$\{x_i, y_i\}$, $i = 1, \ldots, n$, we define
the moving M-regression inclusion function as

\[ \eta(x_i, y_i) = \begin{cases} 1, & \text{if } y_i \in [a x_i + b - d_1,\; a x_i + b + d_2], \\ 0, & \text{otherwise.} \end{cases} \qquad [3] \]
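    Thus, the moving M-regression line can be
found by minimizing the following expression:

\[ F(a,b) = \sum_{i=1}^{n} \eta(x_i, y_i)\,(y_i - a x_i - b)^2. \qquad [4] \]

    Function F is not differentiable at all points
$(x_i, y_i)$ such that $y_i = a x_i + b - d_1$ and $y_i = a x_i + b + d_2$.
For practical purposes, we approximate
$\eta(x_i, y_i)$ by sums of two sigmoid functions:

\[ \eta(x_i, y_i) \approx \frac{1}{1 + \exp[-p_1 (y_i - a x_i - b + d_1)]} + \frac{1}{1 + \exp[p_2 (y_i - a x_i - b - d_2)]} - 1, \qquad [5] \]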
where $p_1$ and $p_2$ are large (~100) positive
parameters. Indeed, as $p_1$ and $p_2$ approach
infinity, the expression on the right side
of Equation 5 approaches the right side of
Equation 3. Small approximation errors in
the vicinity of points $\{a x_i + b - d_1, y_i\}$ and
$\{a x_i + b + d_2, y_i\}$ approach zero as both $p_1$ and $p_2$
approach infinity. It is as if the data points are
gradually included within, or excluded from,
the band when the regression line is moving.
Finally, replacing $\eta(x_i, y_i)$ in Equation 4 by its
sigmoid approximation (Equation 5), we obtain

\[ F(a,b) = \sum_{i=1}^{n} \left\{ \frac{1}{1 + \exp[-p_1 (y_i - a x_i - b + d_1)]} + \frac{1}{1 + \exp[p_2 (y_i - a x_i - b - d_2)]} - 1 \right\} (y_i - a x_i - b)^2. \qquad [6] \]

    To optimize F(a, b), the following system
of equations is to be solved:

\[ \frac{\partial F}{\partial a} = 0, \qquad \frac{\partial F}{\partial b} = 0. \qquad [7] \]

    Equations 7 are nonlinear, so depending
on the data set and parameters $p_1$, $p_2$, $d_1$, and
$d_2$, they can have multiple solutions (a, b).
    In these studies, $x_i$ and $y_i$ were the in vitro
log(1/IC50) and in vivo log(1/LD50) values,
respectively, for a data set of compounds
under study. Instead of using Equation 6,
we determined the compounds that belong
to class 1 by maximizing the number of data
points within the band. With this correction,
our target function takes the form

\[ F^{*}(a,b) = \sum_{i=1}^{n} \left\{ \frac{1}{1 + \exp[-p_1 (y_i - a x_i - b + d_1)]} + \frac{1}{1 + \exp[p_2 (y_i - a x_i - b - d_2)]} - 1 \right\}. \qquad [8] \]

    To obtain the baseline toxicity regression
(see Results), we opted to minimize the number
of outliers below the regression line. For
this purpose, we added additional terms for
the lower border of the band weighted by an
arbitrary parameter $\alpha$. Thus, we minimized
the following target function:

\[ F^{**}(a,b) = \ldots \qquad [9] \]

    The initial point (a, b) for minimization
of F** was selected manually. $p_1$ and $p_2$ were
equal to 100, $d_1$ and $d_2$ were equal to 0.4,
and $\alpha$ was equal to 1. To optimize Equation
9, the system of Equations 7 in which F is
replaced by F** should be solved. Figure 1
summarizes the data analytical work flow that
we employed in this study for rodent acute
toxicity modeling.
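
The band idea behind moving M-regression can be illustrated with a short sketch. Several things here are assumptions rather than the authors' procedure: the smooth inclusion function is written as the sum of two sigmoids minus one (a form that tends to the hard indicator as p1 and p2 grow), and a coarse grid search over (a, b) is swapped in for solving the nonlinear system, with a hypothetical grid and toy data. The objective maximized is the smoothed count of points inside the band; the additional lower-border term weighted by the parameter described above is not reproduced.

```python
import numpy as np

def inclusion_hard(x, y, a, b, d1, d2):
    """1 for points inside the band [a*x + b - d1, a*x + b + d2], else 0."""
    r = y - (a * x + b)
    return ((r >= -d1) & (r <= d2)).astype(float)

def inclusion_smooth(x, y, a, b, d1, d2, p1=100.0, p2=100.0):
    """Sigmoid approximation of the band indicator (assumed form)."""
    r = y - (a * x + b)
    # Clip exponents so np.exp never overflows for points far from the band.
    lower = 1.0 / (1.0 + np.exp(np.clip(-p1 * (r + d1), -500, 500)))
    upper = 1.0 / (1.0 + np.exp(np.clip(p2 * (r - d2), -500, 500)))
    return lower + upper - 1.0

def fit_band(x, y, d1=0.4, d2=0.4, a_grid=None, b_grid=None):
    """Grid search for the line that places the most points inside the band."""
    a_grid = np.linspace(0.0, 1.0, 11) if a_grid is None else a_grid
    b_grid = np.linspace(-2.0, 2.0, 41) if b_grid is None else b_grid
    best_score, best_a, best_b = -np.inf, None, None
    for a in a_grid:
        for b in b_grid:
            score = inclusion_smooth(x, y, a, b, d1, d2).sum()
            if score > best_score:
                best_score, best_a, best_b = score, a, b
    return best_a, best_b, best_score

# Toy data: nine points on y = 0.5x - 1 plus three points far above that line.
x = np.concatenate([np.linspace(-2.0, 2.0, 9), [-1.0, 0.0, 1.0]])
y = np.concatenate([0.5 * np.linspace(-2.0, 2.0, 9) - 1.0, [2.0, 2.5, 3.0]])
a_fit, b_fit, n_in_band = fit_band(x, y)
in_band = inclusion_hard(x, y, a_fit, b_fit, 0.4, 0.4)
```

On this toy set the fitted band captures the nine collinear points and leaves the three high points outside, which mirrors the intended split into a linearly correlated group and a remainder.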
                      Model validation. We validated training
                   set models by evaluating their external predic-
                   tive power on the test sets as described above.
Figure 1. The work flow of the two-step kNN QSAR LD50 modeling. [Flowchart: whole in vitro/in vivo data set; generate regression line between IC50 and LD50 using moving regression method; compounds above the line; compounds on the line.]

-------
Furthermore, a 5-fold external CV analysis
was performed for the original ZEBET data
set: the data  set was randomly split into five
equal-size subsets  of compounds and  five
independent sets of calculations were con-
ducted each time using 80% of the whole data
set as a modeling set and the remaining 20%
compounds as a test set. In addition, robust-
ness of QSAR models was verified using a
Y-randomization (randomization of response)
approach as  follows. We randomly divided
the modeling set compounds into class 1  and
class 2 subsets and developed kNN QSAR
LD50 models for each subset using the same
protocol and the same cutoff criteria (q2 and
R2 > 0.5) as for compounds in classes 1 and
2 that we generated  by means of the moving
regression. The purpose of this was to see if
statistically significant QSAR models could
be obtained for any random division  of the
original data  into two classes. Independently,
we applied the test  to compounds in unique
classes 1 and 2 by randomizing their
LD50 values and redeveloping training set
models. Both Y-randomization tests were
repeated 10 times.
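
The validation protocol above (5-fold external cross-validation plus two Y-randomization tests) can be sketched as follows. The helper names, the fixed random seeds, and the choice of class sizes in the random division are assumptions made for illustration; model building itself is omitted.

```python
import numpy as np

def five_fold_splits(n, n_folds=5, seed=0):
    """Yield (modeling, test) index arrays; each fold serves once as the external test set."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    for i in range(n_folds):
        test = folds[i]
        modeling = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        yield modeling, test

def randomize_response(y, seed=0):
    """Y-randomization: shuffle activities while keeping descriptors fixed."""
    return np.random.default_rng(seed).permutation(y)

def random_class_division(n, n_class1, seed=0):
    """Randomly assign n_class1 compounds to class 1 and the rest to class 2."""
    labels = np.full(n, 2)
    chosen = np.random.default_rng(seed).choice(n, size=n_class1, replace=False)
    labels[chosen] = 1
    return labels

# Example with 230 modeling compounds, 137 of them placed in class 1 at random.
splits = list(five_fold_splits(230))
labels = random_class_division(230, 137)
shuffled = randomize_response(np.arange(10.0))
```

If models built on shuffled activities or on random class divisions pass the same cutoffs as the real models, the real models cannot be distinguished from chance correlations.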
Results
Failure of conventional QSAR modeling of
rodent acute toxicity. The modeling set includ-
ing 230 compounds with known rat LD50
data was partitioned into 32 training and test
sets and the conventional kNN QSAR modeling
approach was applied to all training sets
as detailed in "Materials and Methods." We
characterized each training set model by its
q2 value; for the five best training set models,
these values ranged between 0.5 and 0.57.
These five models were used for predicting
LD50 values for the respective external validation
set (23 compounds). However, for each
of these models the R2 value for this external
set was < 0.5. When we used other types
of in-house or commercial QSAR methods
(e.g., support vector machine or partial least
squares) and other types of descriptors (e.g.,
MolConnZ descriptors or molecular operating
environment descriptors), we obtained no sta-
tistically significant predictive QSAR models
(data not shown). Likewise, modeling of the
mouse data set  (211  modeling compounds
and 24  external validation compounds) was
unsuccessful (data not shown). This  negative
Figure 2. The identification of the baseline correlation between cytotoxicity (IC50) and various types of in vivo
toxicity testing results. (A) Rat LD50. (B) Mouse LD50. (C) Rat LOAEL. (D) Rat NOAEL. C1, class 1; C2, class 2.
[Horizontal axis: log(1/IC50).]

Table 1. The results of data partitioning for the compounds with rat LD50, mouse LD50, rat chronic LOAEL,
and rat chronic NOAEL data in ZEBET data set using cytotoxicity IC50 values.

Model                      No. of C1 compounds   C1 ratio (%)   No. of C2 compounds   C2 ratio (%)
Rat LD50 (original set)    137                   60             93                    40
Mouse LD50                 119                   56             92                    44
Rat LOAEL                  21                    49             21                    51
Rat NOAEL                  19                    46             22                    54
Rat LD50 (full data set)   258                   61             167                   39

Abbreviations: C1, class 1; C2, class 2.
result corroborates the well-known inability of
conventional QSAR modeling approaches to
arrive at statistically significant and externally
predictive models of in vivo toxicity.
    Data partitioning using the  moving
M-regression approach.  It is well known that
in vitro cytotoxicity correlates poorly with in vivo
toxicity end points when any relatively large set
of compounds is considered.  The ZEBET data
set is no exception; cytotoxicity (IC50) correlates
with acute toxicity (LD50) for only a fraction
of the compounds in either the rat (Figure 2A)
or mouse (Figure 2B) data sets. Most of the
compounds are more toxic in vivo than in vitro.
Similar patterns could be found  between cyto-
toxicity and other in vivo toxicity end points,  for
example, rat chronic LOAEL and rat  chronic
NOAEL (Figures 2C,D).
    To devise a mathematical means for identifying
compounds with strong in vitro/in vivo
correlation,  we extended concepts that have
been previously employed in calculating the
"baseline regression" that correlated the aquatic
toxicity of (some) chemicals with the logarithm
of the n-octanol/water partition coefficient
(log P) (Klopman et al. 1999, 2000;
Mayer and Reichenberg 2006). Here, we have
developed a novel approach, termed "moving
M-regression," to identify a subset of compounds
with strong IC50 versus LD50 correlation.
Using this method, we have partitioned
compounds in  the  modeling  set into two
classes: class  1, compounds with acute toxicity
that linearly correlate with cytotoxicity; and
class 2, compounds with acute toxicity that do
not correlate well with cytotoxicity, with these
points positioned above the regression line.
    This analysis for the rat ZEBET data set resulted in 122 compounds
assigned to class 1, that is, within the linear regression correlation band
between LD50 and IC50 values. The points corresponding to 93 out of 108
remaining compounds are located above the regression line band and are
classified as class 2, whereas 15 compounds fall below the regression line
(Figure 2A). Although these compounds are likely to be activity outliers, in
the absence of an objective rationale for their outlier status, we merged
them into class 1 to obtain the highest coverage of the resulting models and
to provide a more realistic measure of external predictivity. Figure 2A and
Equation 10 show the correlation between the LD50 and IC50 values of the
resulting 137 class 1 compounds:
$$\log(1/\mathrm{LD}_{50}) = -1.1 + 0.4 \times \log(1/\mathrm{IC}_{50}) \tag{10}$$
Abbreviations: C1, Class 1; C2, Class 2.
with R2 = 0.74, SE = 0.36, and n = 137. We also applied this approach to
analyze the relationship between the in vitro IC50 and other in vivo toxicity
data, including mouse LD50, rat chronic LOAEL, and rat chronic NOAEL
VOLUME 117 | NUMBER 8 | August 2009 • Environmental Health Perspectives

-------
                                                                               Two-step QSAR for acute rodent toxicity prediction
(Table 1). The same trend was found for all data sets, that is, in all cases
the data were partitioned into two classes: a) points on the baseline and
b) points off the baseline (Table 1, Figure 2). We found the ratio of class 1
to class 2 compounds to be similar for each of the four in vitro/in vivo
toxicity data sets. This result further supports the generality of the
"moving M-regression" approach.
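
The partitioning rule described above can be sketched in a few lines. This is a minimal illustration, not the authors' moving M-regression procedure: it takes the coefficients of Equation 10 as given and assumes a hypothetical band half-width of 2 x SE (0.72 log units), because the actual band definition is in "Materials and Methods" and is not reproduced here. The function names `baseline` and `assign_class` are invented for the example, and IC50 and LD50 are assumed to be in the units used to fit Equation 10.

```python
import math

# Coefficients of Equation 10 as printed in the text (137 class 1 compounds).
INTERCEPT = -1.1
SLOPE = 0.4
# Hypothetical band half-width in log units (assumed: 2 x SE, SE = 0.36).
# The real band is defined in the paper's "Materials and Methods".
BAND_HALF_WIDTH = 0.72


def baseline(ic50):
    """Baseline log(1/LD50) expected from cytotoxicity via Equation 10."""
    return INTERCEPT + SLOPE * math.log10(1.0 / ic50)


def assign_class(ic50, ld50, half_width=BAND_HALF_WIDTH):
    """Return 2 for points above the band, otherwise 1.

    Points below the band are merged into class 1, as described in the text.
    """
    residual = math.log10(1.0 / ld50) - baseline(ic50)
    return 2 if residual > half_width else 1


print(assign_class(ic50=1.0, ld50=10 ** 1.1))  # a point on the baseline
print(assign_class(ic50=1.0, ld50=0.1))        # far more toxic in vivo
```

Treating only positive residuals as class 2 mirrors the observation that most off-baseline compounds are more toxic in vivo than in vitro.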
    Hierarchical QSAR modeling of the partitioned rodent toxicity data. Using
class assignments from the data partitioning described above, we employed a
two-step QSAR approach (Figure 1) for a) classification modeling (i.e.,
establishing that compounds assigned to classes 1 and 2 based on their
biological activity data could be subdivided into the same classes based on
their chemical structure), and b) predictive continuous modeling for all
compounds in each class (i.e., estimation of the LD50 based on chemical
structure, not IC50 data). For ZEBET rat data, we generated three modeling
sets: set 1, 230 compounds (137 class 1 vs. 93 class 2) for classification
modeling; set 2, 137 class 1 compounds; and set 3, 93 class 2 compounds for
developing two continuous rat LD50 models. The analysis of these three data
sets resulted in 252 classification models, as well as 1,207 continuous LD50
models for class 1 compounds and 40 continuous LD50 models for class 2
compounds that satisfied the statistical significance threshold criteria.
Table 2 lists the statistical figures of merit for the best models obtained
for these three modeling sets.
    To demonstrate that these QSAR models have significant external
prediction accuracy, we have employed several concurrent approaches for model
validation. First, following our general model validation work flow (Tropsha
and Golbraikh 2007), we used 23 compounds excluded randomly from the entire
data set as an external validation set. The following two-step prediction
protocol for external compounds was used: a) kNN classification models were
used to assign compounds to class 1 or class 2; and b) depending on the
outcome, the respective class-specific continuous QSAR model was employed to
predict the LD50 value for each compound. The results demonstrate that the
overall accuracy of prediction for this external set is reasonably good. In
the first step, the classification model had 65% prediction accuracy (the
fraction of correctly identified class 1 and class 2 compounds). In the
second step, we obtained R2 = 0.70, mean absolute error (MAE) = 0.39, and
prediction coverage (i.e., the fraction of the external set compounds within
the applicability domains of the models) of 74% for the external test set
when combining the predictions for class 1 and class 2 compounds.
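
The two-step protocol above can be illustrated with a bare-bones nearest-neighbor sketch. It is an assumption-laden toy: the descriptors and activities are invented, Euclidean distance is assumed, and the variable selection and applicability-domain checks that the study's kNN QSAR models rely on are omitted. All function names are hypothetical.

```python
import math
from collections import Counter


def nearest(query, training, k):
    """Return the k (descriptors, value) pairs closest to `query`."""
    return sorted(training, key=lambda item: math.dist(query, item[0]))[:k]


def classify(query, labeled, k=1):
    """Step 1: majority class among the k nearest neighbours."""
    votes = Counter(label for _, label in nearest(query, labeled, k))
    return votes.most_common(1)[0][0]


def predict_activity(query, training, k=2):
    """Step 2: mean activity of the k nearest neighbours in one class."""
    neighbours = nearest(query, training, k)
    return sum(activity for _, activity in neighbours) / len(neighbours)


def two_step_predict(query, labeled, class_sets, k_cls=1, k_reg=2):
    """Assign a class, then apply that class's continuous model."""
    cls = classify(query, labeled, k_cls)
    return cls, predict_activity(query, class_sets[cls], k_reg)


# Toy descriptors and log(1/LD50) values, invented for illustration only.
labeled = [((0.0, 0.0), 1), ((0.1, 0.0), 1), ((1.0, 1.0), 2), ((1.1, 1.0), 2)]
class_sets = {
    1: [((0.0, 0.0), -1.0), ((0.1, 0.0), -0.8)],
    2: [((1.0, 1.0), 0.5), ((1.1, 1.0), 0.7)],
}
print(two_step_predict((0.05, 0.0), labeled, class_sets))
```

Because step 2 depends on the class chosen in step 1, a misclassified compound is scored by the wrong continuous model, which is why the classification accuracy matters for the combined statistics.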
    Second, we performed a 5-fold external CV analysis to test the robustness
of the modeling outcome using 253 rat ZEBET compounds. The data set was
randomly split into five equal-size subsets of compounds and the modeling
procedure was repeated five times, using each subset as a test set and the
remaining four subsets as the training set, as detailed in "Materials and
Methods." The statistical results of this exercise were as follows:
slope_regr = 0.45 ± 0.01, R2_regr = 0.71 ± 0.04, R2_ext = 0.55 ± 0.05,
MAE = 0.44 ± 0.04, coverage = 73 ± 3%.
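
A minimal sketch of the splitting step in such a 5-fold external CV follows. The helper names are hypothetical, the random seed is arbitrary, and the "mean ± SD" summary assumes the sample standard deviation, which the text does not specify. With 253 compounds the folds cannot be exactly equal, so their sizes differ by at most one.

```python
import random
from statistics import mean, stdev


def k_fold_splits(n_items, n_folds=5, seed=0):
    """Yield (train, test) index lists; each item is tested exactly once."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def summarize(per_fold_values):
    """Format per-fold statistics as 'mean ± SD' (sample SD assumed)."""
    return f"{mean(per_fold_values):.2f} ± {stdev(per_fold_values):.2f}"


splits = list(k_fold_splits(253))
print([len(test) for _, test in splits])
```

Each statistic (slope, R2, MAE, coverage) would be computed once per fold on the held-out subset and then passed to `summarize`.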
    Third, we performed Y-randomization tests to establish whether our models
are statistically robust. Random partitioning of the compounds into two
classes (10 times) produced only three (for class 1) and 28 (for class 2)
models that satisfied the criteria of q2/R2 > 0.5, compared with 1,207 and
40 models, respectively, for "moving M-regression"-assisted partitioning.
Randomizing LD50 data generated no model with q2/R2 > 0.5 for class 1 and
class 2 compounds.
    Fourth, we performed additional Y-randomization analyses by randomly
moving or rotating the correlation line (including negative correlation) and
redefining compounds into classes 1 and 2. The randomly assigned class 1 and
class 2 sets were used to develop QSAR LD50 models individually and the
procedure was repeated 10 times. We found that at most, a very small number
(< 7) of acceptable (q2 > 0.5, R2 > 0.5) models could be developed.
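
The idea behind a Y-randomization test can be sketched as follows: shuffle the activities, rebuild the model, and check that the cross-validated q2 collapses. The sketch assumes a one-dimensional descriptor and a plain leave-one-out kNN regressor on invented data; it is not the modeling pipeline used in the study, and `loo_q2` and `y_randomization` are hypothetical helper names.

```python
import random


def loo_q2(xs, ys, k=2):
    """Leave-one-out q2 for a kNN regressor on one-dimensional descriptors."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        others = [(abs(x - xj), yj)
                  for j, (xj, yj) in enumerate(zip(xs, ys)) if j != i]
        neighbours = sorted(others)[:k]
        prediction = sum(yj for _, yj in neighbours) / k
        press += (y - prediction) ** 2
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / total


def y_randomization(xs, ys, rounds=10, seed=0):
    """Recompute q2 after shuffling the activities `rounds` times."""
    rng = random.Random(seed)
    scores = []
    for _ in range(rounds):
        shuffled = ys[:]
        rng.shuffle(shuffled)
        scores.append(loo_q2(xs, shuffled, k=2))
    return scores


# Invented data in which activity tracks the descriptor perfectly.
xs = [float(i) for i in range(20)]
ys = xs[:]
print(round(loo_q2(xs, ys), 3), [round(q, 2) for q in y_randomization(xs, ys)])
```

A model that keeps a high q2 after shuffling would suggest chance correlation rather than a real structure-activity relationship.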
    Similar modeling results were obtained using the ZEBET mouse LD50 data.
After partitioning 211 modeling set compounds into 119 class 1 compounds and
92 class 2 compounds, we developed 843 classification models for class 1
versus class 2, 236 continuous LD50 models for class 1 compounds, and 356
models for class 2 compounds. A two-step prediction protocol for evaluation
of the 24 external compounds resulted in similarly good external prediction
accuracy: R2 = 0.69, MAE = 0.42, and prediction coverage of 54%.
    As a true external validation challenge, we have used our model to make
predictions for the 115 compounds with rat LD50 data in the new ICCVAM data
set. We compiled this data set after we finished the development of the
above-described QSAR LD50 models, so it could be viewed as a true "blind"
validation test. The statistical parameters of the prediction results for
these compounds were R2 = 0.57, MAE = 0.48, and prediction coverage of 70%.
Although somewhat less accurate than the results of the previous external
prediction, this validation reinforces the statistical significance and
utility of the model.
    Y-randomization tests were also performed for the mouse LD50 data set.
Similar to the rat data, after 10 random assignments of compounds into the
two classes, we developed, at most, 4 (for class 1) and 38 (for class 2)
models (q2/R2 > 0.5), compared with 843 and 236 models, respectively, when we
used a classification model. Randomization of LD50 values produced no
significant models.
    Stability of the in vitro and in vivo moving M-regression parameters.
Because the regression correlation between in vitro (IC50) and in vivo (LD50)
data is required to classify the modeling set compounds and, subsequently, to
create the kNN classification models, this linear correlation is an essential
factor to determine the robustness of our final models. Hence, the slope of
the correlation and associated correlation coefficient (R2) should remain
stable when new compounds are added into the modeling set. To validate this
Table 2. Statistical information for the five most statistically significant kNN QSAR models based on three
modeling sets.

Model   N-training   Pred-training   N-test   Pred-test   NNN
The best kNN classification model for 137 class 1 versus 93 class 2 compounds
1       173          0.84            55       0.73        1
2       147          0.86            74       0.70        1
3       193          0.83            37       0.73        1
4       165          0.86            59       0.70        1
5       173          0.81            55       0.75        1
The best kNN continuous model for 137 class 1 compounds
1       103          0.66            34       0.81        3
2       103          0.73            34       0.71        2
3       111          0.71            26       0.74        3
4       115          0.65            22       0.79        5
5       77           0.73            60       0.71        2
The best kNN continuous model for 93 class 2 compounds
1       80           0.61            13       0.84        2
2       77           0.67            16       0.77        1
3       80           0.69            13       0.74        1
4       80           0.65            13       0.76        2
5       79           0.63            14       0.78        2

Abbreviations: NNN, number of the nearest neighbors used for prediction; N-test, number of compounds in the test set;
N-training, number of compounds in the training set; Pred-test, the overall predictivity of the test set (correct
classification rate for classification models, R2 for continuous models); Pred-training, the overall predictivity of
the training set (correct classification rate for classification models, q2 for continuous models).

-------
Zhu et al.
supposition, we compiled all available ZEBET and ICCVAM compounds with rat
LD50 data to create a new modeling set, including the original modeling set
(230 compounds), the external validation set (23 compounds), and additional
data (115 compounds). We also included the compounds previously not used for
modeling (inorganic, organometallic, and mixtures) because we used no
chemical descriptors in this validation. Using the moving M-regression
approach for all 425 compounds with IC50 and LD50 values, the resulting
in vitro/in vivo correlation parameters are similar to those obtained from
our original modeling set in Equation 10:
$$\log(1/\mathrm{LD}_{50}) = -1.1 + 0.36 \times \log(1/\mathrm{IC}_{50}) \tag{11}$$
with R2 = 0.71, SE = 0.37, and n = 258. The proportions of class 1 and
class 2 compounds and outliers among these 425 compounds were also comparable
to those of the original modeling set of 230 compounds (Table 1). We conclude
that adding new compounds into the modeling set, which should be important to
improve the final model by enriching its chemical and biological diversity,
does not affect the in vitro/in vivo regression statistics.
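
As a quick worked check of how close the two regressions are, the sketch below evaluates Equations 10 and 11 over an assumed range of log(1/IC50) values. The range of -2 to 2 is chosen for illustration only and is not taken from the study. The two lines share an intercept and differ only in slope (0.40 versus 0.36), so the gap grows linearly with the magnitude of log(1/IC50).

```python
def eq10(log_inv_ic50):
    """Equation 10: original modeling set (n = 137)."""
    return -1.1 + 0.4 * log_inv_ic50


def eq11(log_inv_ic50):
    """Equation 11: expanded data set (n = 258)."""
    return -1.1 + 0.36 * log_inv_ic50


# Illustrative grid of log(1/IC50) values (assumed range, not from the paper).
grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
max_gap = max(abs(eq10(x) - eq11(x)) for x in grid)
print(f"largest difference on the grid: {max_gap:.2f} log units")
```

On this grid the largest difference is 0.04 x 2 = 0.08 log units, well below the reported standard errors of 0.36 and 0.37.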
    Comparison between the two-step hierar-
chical LD50 QSAR model and TOPKAT. We
compared the performance of our modeling
approach with that of TOPKAT software, ver-
sion 6.1 (Accelrys 2009; Enslein 1988). Two
types of comparison were considered. First,
we have analyzed 27 of the 115  ICCVAM
compounds that have been used neither for
building our model nor in the TOPKAT
Figure 3. The correlation between experimental and predicted LD50 values for
27 external compounds within the applicability domain (A) using TOPKAT and
(B) using the two-step model developed in this study. [Plots not reproduced;
x-axis: experimental log(1/LD50).]
LD50 training set. Figure 3 shows the correlation between the experimental
and predicted LD50 values obtained from our model versus TOPKAT. The R2 and
MAE of TOPKAT were 0.16 and 0.78, respectively, for all 27 compounds,
considerably worse than the statistics obtained for the same data set with
our model (R2 = 0.64, MAE = 0.38). For seven compounds that were outside of
the applicability domain for our model, the R2 and MAE using TOPKAT were 0.60
and 0.50, respectively, whereas our model produced values of 0.86 and 0.29,
respectively (Table 3).
    Second, we have used our models to predict the acute toxicity of
compounds in the RTECS (Norager et al. 1978) data set (data were kindly
provided by Todd Martin from the U.S. EPA), which contains approximately
7,000 compounds with rat LD50 data. We removed compounds that we found within
the ZEBET data set, as well as inorganic compounds and mixtures. This
procedure produced a library of 4,003 compounds spanning a diverse chemical
space of organic molecules for which experimental rat LD50 data are available.
    Because the size of the RTECS library is much larger than that of our
original modeling set, we drew from our experience in using QSAR models for
virtual screening (Oloff et al. 2005) and narrowed the model applicability
domain. Consequently, predictions were made only for compounds that had
greater than 70% confidence level in assigning them to either class 1 or
class 2 in step 1 of our work flow (i.e., we required that > 70% of all QSAR
models meeting our acceptability domain criteria would predict a compound in
the same class). We determined that there were 1,562 compounds (out of 4,003)
that were not included in the training set of the TOPKAT rat LD50 model and
for which predictions could be made based on the aforementioned criteria. The
TOPKAT model predicted LD50 values for these compounds with R2 = 0.16 and
MAE = 0.78 (Figure 4, Table 3). The same parameters for the two-step QSAR
model were 0.26 and 0.65, respectively. After implementing the applicability
domain filter, we made predictions for 965 RTECS chemicals. The TOPKAT model
had parameters of R2 = 0.22 and MAE = 0.65; the same parameters for the
two-step model were 0.33 and 0.54, respectively (Table 3), which is better
than or comparable to the prediction accuracy of various commercial QSAR
modeling packages (Moore et al. 2003), although there is room for improvement.
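
The consensus filter described above (more than 70% of the models must agree on the class) could be sketched as follows. The function name, the vote representation, and the handling of undecided compounds are assumptions made for illustration; only the 70% and 90% cutoffs come from the text.

```python
def consensus_class(votes, threshold=0.70):
    """Return (label, confidence) from per-model class votes (1 or 2).

    The label is None when the leading class does not exceed `threshold`,
    meaning no prediction is made for that compound.
    """
    fraction_class1 = votes.count(1) / len(votes)
    confidence = max(fraction_class1, 1.0 - fraction_class1)
    label = 1 if fraction_class1 >= 0.5 else 2
    return (label if confidence > threshold else None), confidence


# Hypothetical votes from ten classification models for three compounds.
print(consensus_class([1] * 8 + [2] * 2))                  # agreement above 70%
print(consensus_class([1] * 6 + [2] * 4))                  # too ambiguous
print(consensus_class([1] * 8 + [2] * 2, threshold=0.90))  # stricter cutoff
```

Raising the threshold trades coverage for reliability: fewer compounds receive a prediction, but those that do are assigned to a class with stronger agreement.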
    It should be noted that the prediction accuracy of the two-step model can
be improved by applying stricter criteria in the classification step. For
instance, a 90% cutoff for correct class prediction results in prediction
model statistics of R2 = 0.62 and MAE = 0.42, but the coverage of the model
diminishes considerably to include 101 compounds (Table 3). The performance
of TOPKAT for the same 101 compounds is poor: R2 = 0.26 and MAE = 0.66.
Considering that the TOPKAT LD50 training set contains many more compounds
(~6,000) than the training set used to develop the two-step model (~200), it
is noteworthy that higher prediction accuracy can be achieved using our
modeling approach for a much larger data set. Furthermore, our approach
outperforms TOPKAT consistently over a range of error thresholds either for
965 RTECS compounds or for 101 RTECS compounds (Figure 4). In addition, we
used the Wilcoxon test to calculate the p-values for the differences in MAEs
obtained using the two methods. Both for the whole set (965 compounds) and
for the reduced set (101 compounds), the improvement achieved by our method,
compared with TOPKAT, is statistically significant (p < 0.005).
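
The comparison over a range of error thresholds (the kind of curve shown in Figure 4) rests on two simple quantities: the MAE and the fraction of compounds predicted within a given error. A minimal sketch on invented numbers follows; the function names are hypothetical and the data are not from the study.

```python
def mae(observed, predicted):
    """Mean absolute error between observed and predicted activities."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def fraction_within(observed, predicted, thresholds):
    """Fraction of compounds whose absolute error is <= each threshold."""
    errors = [abs(o - p) for o, p in zip(observed, predicted)]
    return [sum(e <= t for e in errors) / len(errors) for t in thresholds]


# Invented log(1/LD50) values for four compounds.
observed = [0.0, 1.0, 2.0, 3.0]
predicted = [0.1, 1.5, 1.0, 3.0]
print(mae(observed, predicted))
print(fraction_within(observed, predicted, [0.25, 0.5, 1.0]))
```

Plotting `fraction_within` for two methods on the same thresholds shows whether one method is better across the whole error range rather than only on average.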
    One obvious reason that the prediction accuracy of our models for RTECS
compounds is lower than that obtained from the external validation set of
ICCVAM data is the difference in "activity" ranges of compounds in these two
data sets. For example, the activity (log 1/activity, in millimolar units) of
ZEBET compounds ranges from -2.61 to 2.30, whereas the activity range of
RTECS compounds is considerably larger, from -3.34 to 4.21. It should be
stressed that the kNN method used in our study cannot extrapolate
Table 3. Comparison between TOPKAT and the two-step model prediction of the external compounds.

                 Two-step model                    TOPKAT
Measure          No applicability  With applicability  No applicability  With applicability
                 domain            domain              domain            domain
Prediction of 27 new ZEBET compounds
  R2             0.64              0.86                0.16              0.60
  MAE            0.38              0.29                0.78              0.50
  Coverage (%)   100               67                  100               67
Prediction of 1,562 RTECS compounds with 70% confidence level
  R2             0.26              0.33                0.19              0.22
  MAE            0.65              0.54                0.76              0.65
  Coverage (%)   100               62                  100               62
Prediction of 1,562 RTECS compounds with 90% confidence level
  R2             0.42              0.62                0.19              0.26
  MAE            0.60              0.42                0.84              0.66
  Coverage (%)   12                6                   12                6

-------
in the activity space because external compound activity is predicted by
averaging the activities of nearest-neighbor compounds in the training set as
described in "Materials and Methods." The MAE for the prediction of RTECS
compounds that have experimental activity above 2 or below -2 is 1.14 log
units. On the other hand, the MAE for the prediction for RTECS compounds that
have experimental activity between -2 and 2 is considerably lower, 0.52 log
units. The likely explanation for the better performance of our models in the
latter range is that more than 90% of our modeling set compounds have rat
LD50 activity in the same range, between -2 and 2. Increasing the diversity
and activity range of compounds in the modeling set should significantly
improve the prediction accuracy of our models.
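
The inability of a neighbor-averaging method to extrapolate can be checked directly: a mean of training activities can never fall outside the training range. In the sketch below only the end points (-2.61 and 2.30, the ZEBET range) and the most active RTECS value (4.21) come from the text; the intermediate training values and the function name are invented.

```python
from itertools import combinations

# End points from the text; the three intermediate values are hypothetical.
training_activities = [-2.61, -1.0, 0.0, 1.2, 2.30]


def prediction_range(activities, k):
    """Smallest and largest value a k-neighbour average can take."""
    means = [sum(combo) / k for combo in combinations(activities, k)]
    return min(means), max(means)


for k in (1, 2, 3):
    low, high = prediction_range(training_activities, k)
    # Best possible error for a compound with true activity 4.21.
    print(k, round(low, 2), round(high, 2), round(4.21 - high, 2))
```

Whatever neighbors are selected, a compound with activity 4.21 is underpredicted by at least 4.21 - 2.30 = 1.91 log units, consistent with the larger MAE reported for compounds outside the -2 to 2 range.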

Discussion
The  conventional wisdom in mechanistic and
regulatory toxicology is that predictions of the
in vivo toxicity end points from in vitro meas-
ures, even within the same species, are difficult.
However, an approximate linear correlation
between in vitro IC50 and rodent LD50, two of
the widely acceptable benchmark parameters
used for regulatory purposes, can be established
for  a significant  fraction of the compounds.
Indeed, we confirmed this notion by quantitative
analysis of the IC50/LD50 relationships and
devised an objective, computational means to
partition compounds into two groups: those
having good linear fit within a defined band, or
those falling outside the band and exhibiting, for
the most part, higher in vivo than in vitro toxic-
ity. Our hypothesis to explain this observation is
that, whereas cytotoxicity assays can reflect some
of the toxicity mechanisms resulting in  adverse
health effects at  the whole-animal level, the
in vitro tests cannot fully reproduce the complex
mechanisms of the in vivo toxicity. For example,
it is well known that many compounds  are not
toxicants themselves but have metabolites that
are toxic. We argue that the two-step  predic-
tion model based on chemical descriptors only
Figure 4. Fraction of compounds versus prediction errors obtained by the
two-step rat LD50 model, TOPKAT, and random sampling for 965 and 101 RTECS
compounds. [Plot not reproduced; x-axis: prediction errors; series: two-step
model (965 RTECS), two-step model (101 RTECS), TOPKAT (965 RTECS), TOPKAT
(101 RTECS), random sampling.]
that we developed in our studies also assists
in identification of the compound subset that
may act directly (i.e., without being biotrans-
formed) and through mechanisms likely to
be predictive of the potential in vivo effects. A
similar argument was presented previously in
ecotoxicity research where log P was found to
be a mechanistically relevant predictor (Verhaar
et al. 2000).
    To further substantiate this argument,
we considered the top 10 chemical fragment
descriptors that were used most frequently in
statistically significant QSAR models, that is,
descriptors with the highest discriminatory
power [see Supplemental Material, Table 3
(doi:10.1289/ehp.0800471.SI)]. It is noteworthy
that the "aromatic primary amine,"
"hydrazine," and "sulfonamide" moieties,
found within compounds that are known to
be  toxic both in vitro and  in vivo (Alaejos
et al. 2008; Carr et al. 1993; Toth 1988),
were found predominantly in compounds
of class 1.  On the other hand, "pyrrolidine"
and "aromatic tertiary amine"  moieties,
which require biotransformation  (Domagala
1994), were predictors for  class 2. We have
also demonstrated that this objective  divi-
sion of the data set into two major groups
affords robust hierarchical QSAR models, an
assertion further supported by successive chal-
lenges to the models with external data sets,
CV, and randomization of data.
    The approach advocated in this study for
biologically informed partitioning of structure-
activity relationship  data differs from conven-
tional  cheminformatics clustering approaches.
Traditional methods partition compounds into
multiple subgroups based on their chemical
structure properties only (i.e., chemical descrip-
tors). The underlying reasoning for chemically
based  clustering is that similar structures are
expected to have similar biological properties
and mechanisms of activity.  However,  it is a
well-known limitation of structure-activity
relationships that the absence or presence of a
functional group or other minor change of the
chemical structure may result in a large change
of biological activity (Maggiora 2006). In our
studies, the conventional chemical structure-
based clustering method did not yield any sta-
tistically meaningful models, either global or
local. The distribution of pairwise chemical
similarities for all compounds within the modeling
sets (class 1 vs. class 2) of rat LD50 values
using DRAGON descriptors is very similar
(data not shown). This observation reconfirms
that chemical clustering would not have par-
titioned  compounds in a way similar to the
biological data-based partitioning.

Conclusions
Although the cytotoxicity data generally show weak correlation with rodent
acute toxicity, we have demonstrated that these data can be used to inform
and improve QSAR modeling of in vivo acute toxicity. We have developed a
novel two-step kNN QSAR modeling approach that affords a successful
prediction of acute toxicity (LD50) values from chemical structure for both
rats and mice. Furthermore, we predicted LD50 values for external compounds
with accuracy exceeding that of previously published QSAR models developed
with the commercial (TOPKAT) software. It should be stressed that although
in vitro cytotoxicity data have been used to establish the rules for
partitioning most compounds into two classes, the ultimate models, both
classification and continuous, employ chemical descriptors only. This vital
feature of our approach makes it possible to achieve accurate predictions of
rodent acute toxicity directly from chemical structure alone, even bypassing
the need for in vitro studies of new compounds. We believe that this
biological-data-based partitioning approach using in vitro toxicity data for
the modeling set only, coupled with subsequent chemical-structure-based
classification and continuous QSAR modeling techniques, holds promise for
modeling other complex in vivo toxicity end points. This approach charts a
future course for combining in vitro screening methods and QSAR modeling to
prioritize chemicals for in vivo animal toxicity testing.

                 REFERENCES

Accelrys. 2009. Predictive Toxicology - DS TOPKAT. Available:
    http://accelrys.com/products/discovery-studio/toxicology/
    [accessed 30 June 2009].
Alaejos MS, Pino V, Afonso AM. 2008. Metabolism and toxicol-
    ogy of heterocyclic aromatic amines when consumed
    in diet: influence of the genetic susceptibility to develop
    human cancer. A review. Food Res Int 41:327-340.
Andersen M. 2007. Modern Methods for Robust Regression.
   Thousand Oaks, CA:Corwin Press.
Benigni R, Netzeva TI, Benfenati E, Bossa C, Franke R, Helma C,
    et al. 2007. The expanding role of predictive toxicology: an
    update on the (Q)SAR models for mutagens and carcino-
    gens. J Environ Sci Health C Environ Carcinog Ecotoxicol
    Rev 25:53-97.
Carr A, Tindall B, Penny R, Cooper DA. 1993. In vitro cytotoxicity
    as a marker of hypersensitivity to sulphamethoxazole in
    patients with HIV. Clin Exp Immunol 94:21-25.
Collins FS, Gray GM, Bucher JR. 2008. Toxicology. Transforming
    environmental health protection. Science 319:906-907.
Dearden JC. 2003. In silico prediction of drug toxicity. J Comput
    Aided Mol Des 17:119-127.
Dix DJ, Houck KA,  Martin MT, Richard  AM, Setzer RW,
    Kavlock RJ. 2007. The ToxCast program for prioritizing toxic-
    ity testing of environmental chemicals. Toxicol Sci 95:5-12.
Domagala JM. 1994. Structure-activity and structure-side-effect
    relationships for the quinolone antibacterials. J Antimicrob
    Chemother 33:685-706.
Enslein K. 1988. An overview of structure-activity relationships
    as an alternative to testing in animals for carcinogenicity,
    mutagenicity, dermal and eye irritation, and acute oral
   toxicity. Toxicol Ind Health 4:479-498.
Freidig AP, Dekkers S, Verwei M, Zvinavashe E, Bessems JG,
   van de Sandt JJ. 2007. Development of a QSAR for worst
    case estimates of acute toxicity of chemically reactive
    compounds. Toxicol Lett 170:214-222.
Golbraikh A, Shen M, Xiao Z, Xiao YD, Lee KH, Tropsha A. 2003.
    Rational  selection of training and test sets for the develop-
    ment of validated QSAR models. J Comput Aided Mol Des
    17:241-253.
ICCVAM and NICEATM. 2001. Report  of the International
   Workshop on In Vitro Methods for Assessing Acute

-------
    Systemic Toxicity. Interagency Coordinating Committee
    on the Validation of Alternative Methods and National
    Toxicology Program Interagency Center for the Evaluation
    of Alternative Toxicological Methods Report 01-4499
    Bethesda, MD:National Institutes of Health.
Johnson DE,  Smith DA, Park BK. 2004.  Linking toxicity and
    chemistry: think globally, but act locally? Curr Opin Drug
    Discov Devel 7:33-35.
Klopman G, Saiakhov R, Rosenkranz HS. 2000. Multiple computer-
    automated structure evaluation  study of aquatic toxicity II.
    Fathead minnow. Environ Toxicol Chem 19:441-447.
Klopman G, Saiakhov R, Rosenkranz HS,  Hermens JLM. 1999.
    Multiple Computer-Automated  Structure Evaluation pro-
    gram study of aquatic toxicity 1: guppy. Environ Toxicol
    Chem 18:2497-2505.
Maggiora GM. 2006. On outliers and activity cliffs—why QSAR
    often disappoints. J Chem Inf Model  46:1535; doi: 10.1021/
    ci700332k [Online 28 December 2007].
Matthews EJ, Kruhlak NL, Cimino MC, Benz RD, Contrera JF.
    2006. An analysis of genetic toxicity, reproductive
    and developmental toxicity, and carcinogenicity data:
    II. Identification of genotoxicants, reprotoxicants, and car-
    cinogens using in silico methods. Regul Toxicol Pharmacol
    44:97-110.
Mayer P, Reichenberg F. 2006. Can highly hydrophobic organic
    substances cause aquatic baseline toxicity and can they
    contribute to mixture toxicity? Environ  Toxicol Chem
    25(10):2639-2644.
Medina-Franco JL, Golbraikh A, Oloff S, Castillo R, Tropsha A.
    2005. Quantitative structure-activity relationship analysis
    of pyridinone HIV-1 reverse transcriptase inhibitors using
    the k nearest neighbor method and QSAR-based database
    mining. J Comput Aided Mol Des 19:229-242.
Moore DR, Breton RL, MacDonald DB. 2003. A comparison of
    model performance for six quantitative structure-activity
    relationship packages that predict acute toxicity to fish.
    Environ Toxicol Chem 22:1799-1809.
National Research  Council. 2007. Toxicity Testing in  the 21st
    Century: A Vision and a Strategy. Washington, DC:National
    Academies Press.
Ng C, Xiao Y, Putnam W, Lum B, Tropsha A. 2004. Quantitative
    structure-pharmacokinetic parameters relationships
    (QSPKR) analysis of antimicrobial agents in humans using
    simulated annealing k-nearest-neighbor and partial least-
    square analysis methods. J Pharm Sci 93:2535-2544.
Norager O, Town WG, Petrie JH. 1978. Analysis of the registry
    of toxic effects of chemical substances (RTECS) files
    and conversion of the data in these files for input to the
    environmental chemicals data and information network
    (ECDIN). J Chem Inf Comput Sci 18:134-140.
Oloff S, Mailman RB, Tropsha A. 2005. Application of validated
    QSAR models of D1 dopaminergic antagonists for data-
    base mining. J Med Chem 48:7322-7332.
Rabinowitz JR, Goldsmith MR, Little SB, Pasquinelli MA. 2008.
    Computational molecular modeling for evaluating the tox-
    icity of environmental chemicals: prioritizing bioassay
    requirements. Environ Health Perspect 116:573-577.
Ruden C, Hansson SO.  2003. How accurate are the European
    Union's classifications of chemical substances. Toxicol
    Lett 144:159-172.
Shen M, LeTiran A, Xiao Y, Golbraikh  A, Kohn H, Tropsha A.
    2002. Quantitative structure-activity relationship  analysis
    of functionalized amino acid anticonvulsant agents using
    k nearest neighbor and simulated annealing PLS methods.
    J Med Chem 45:2811-2823.
Stouch TR, Kenyon JR, Johnson SR, Chen XQ, Doweyko A, Li Y.
    2003. In silico ADME/Tox: why models fail. J Comput Aided
    Mol Des 17:83-92.
Toth B. 1988. Toxicities of hydrazines: a review. In Vivo 2:209-242.
Tropsha A, Golbraikh A. 2007. Predictive QSAR modeling work-
    flow, model applicability domains, and virtual  screening.
    Curr Pharm Des 13:3494-3504.
Venkatapathy R, Moudgal CJ, Bruce RM. 2004. Assessment of
    the oral  rat chronic lowest observed adverse effect level
    model in TOPKAT, a QSAR software package for toxicity
    prediction. J  Chem Inf Comput Sci 44:1623-1629.
Verhaar HJ, Solbe J, Speksnijder J, van Leeuwen  CJ,
    Hermens JL. 2000. Classifying environmental pollutants:
    part 3. External validation of the classification system.
    Chemosphere 40:875-883.
Xia M, Huang R, Witt KL, Southall N, Fostel J, Cho MH, et al.
    2008.  Compound cytotoxicity profiling using quantita-
    tive high-throughput screening. Environ Health Perspect
    116:284-291.
Zheng W, Tropsha A. 2000. Novel variable selection quantita-
    tive structure-property relationship approach based on
    the k-nearest-neighbor principle. J Chem Inf Comput Sci
    40:185-194.
Zhu H, Rusyn I, Richard A, Tropsha A. 2008. Use of cell viability
    assay data improves the prediction accuracy of conventional
    quantitative structure-activity relationship models of animal
    carcinogenicity. Environ Health Perspect 116:506-513.
                                  VOLUME 117 | NUMBER 8 | August 2009 • Environmental Health Perspectives

-------
                                            Toxicology and Applied Pharmacology 233 (2008) 7-13
                                               Contents lists available at ScienceDirect
                                  Toxicology and Applied  Pharmacology
                                    journal homepage:  www.elsevier.com/locate/ytaap
ACToR —  Aggregated Computational Toxicology Resource

Richard Judson a,*, Ann Richard a, David Dix a, Keith Houck a, Fathi Elloumi a, Matthew Martin a,
Tommy Cathey b, Thomas R. Transue b, Richard Spencer b, Maritja Wolf b
a National Center for Computational Toxicology, U.S. Environmental Protection Agency, 109 T.W. Alexander Drive, Research Triangle Park, NC 27711, USA
b Lockheed Martin, A Contractor to the U.S. Environmental Protection Agency, 109 T.W. Alexander Drive, Research Triangle Park, NC 27711, USA
ARTICLE   INFO

Article history:
Available online 11 July 2008

Keywords:
ToxCast
ACToR
Database
HTS
Screening
Prioritization
                                         ABSTRACT
ACToR (Aggregated Computational Toxicology Resource) is a database and set of software applications that
bring into one central location many types and sources of data on environmental chemicals. Currently, the
ACToR chemical  database contains information  on  chemical structure, in vitro bioassays and in vivo
toxicology assays derived from more than 150 sources including the U.S. Environmental Protection Agency
(EPA), Centers for Disease Control (CDC), U.S.  Food and Drug Administration (FDA), National Institutes of
Health (NIH), state agencies, corresponding government agencies in Canada, Europe and Japan, universities,
the World Health Organization (WHO) and non-governmental organizations (NGOs). At the EPA National
Center for Computational Toxicology, ACToR helps manage large data sets being used in a high-throughput
environmental chemical screening and prioritization program called ToxCast™.
                                                                   Published by Elsevier Inc.
Introduction

   Computational Toxicology is an emerging field that aims to use
modern computational and molecular biology techniques to under-
stand and  predict chemical toxicity. A  particular area where this
approach is being applied is in chemical screening and prioritization.
In the U.S., there are an estimated 30,000 unique chemicals in wide
commercial use (>1 t/year) (Muir and Howard, 2006), and only a
relatively small subset of these has been sufficiently well character-
ized for their potential to cause human  or ecological  toxicity to
support regulatory action. This "data gap" is well documented (EPA,
1998; Allanou et al., 1999; Birnbaum et al., 2003; Guth et al., 2005;
Applegate  and Baer,  2006;  Krewski  et al., 2007).  The  standard
approach to determine a chemical's toxicity profile involves perform-
ing in vivo studies on rodents and  other species,  and can take
2-3 years and cost millions of dollars  per  chemical. Clearly, this
strategy is neither practical nor viable for evaluating tens of thousands
of chemicals; hence, the  large inventories  of existing chemicals for
which little or no test data are available. An alternative approach is to
attempt to cover much larger regions of chemical space by employing
more efficient in  vitro methods.  One  strategy applies  relatively
inexpensive and rapid high-throughput screening (HTS) assays to a
large set of chemicals, followed by the use of these results to prioritize
a much smaller subset of chemicals for more detailed analysis. The
 * This work was reviewed by EPA and approved for publication but does not
necessarily reflect official agency policy.
 * Corresponding author. Fax: +1 919 541 3085.
   E-mail address: judson.richard@epa.gov (R. Judson).

0041-008X/$ - see front matter. Published by Elsevier Inc.
doi:10.1016/j.taap.2007.12.037
                          "prioritization  score" for a chemical would be based  on derived
                          signatures,  or  patterns  extracted from  the HTS  data, which are
                          predictive of  particular effects or modes of chemical toxicity.
                          Chemicals of known toxicity comprise the reference or  training set
                          that is used to develop and validate predictive signatures. HTS assays
                          that yield data for the predictive signatures would then be run on
                          chemicals of unknown toxicity (the test chemicals), and a prioritiza-
                          tion score for those chemicals would be produced. The U.S. EPA has
                          made a significant investment in this approach through the recent
                          launch of the ToxCast™ research program (Dix et al., 2007). ToxCast is
                          screening hundreds,  and  eventually thousands  of environmental
                          chemicals using hundreds of HTS assays towards the two goals of
                          developing predictive toxicity signatures,  and using these signatures
                          to prioritize chemicals for further testing. In this EPA context, the term
                          "environmental chemicals" refers primarily  to  industrial chemicals
                          and pesticides  used or produced in large enough quantities to pose
                          significant potential for human or ecological exposure.
                             There  are multiple computational aspects to this approach. First,
                          some of the screening assays themselves may be  computational (in
                          silico). Second, a robust database and data analysis infrastructure are
                          required to manage the large data volumes produced by a large-scale
                          HTS program. Third, one needs high quality in vivo toxicology data on
                          as large and diverse a group of chemicals as possible in order to develop
                          and validate the predictive signatures. Currently, such toxicity data are
                          available  from a  number of sources, but  these data  are widely
                          dispersed and often not sufficiently annotated or fully accessible for
                          computational  use.
                             To support the EPA's ToxCast screening and prioritization effort, as
                          well as other EPA programs, we are developing a system called ACToR,

-------
for Aggregated Computational Toxicology Resource. ACToR is a set of
linked databases and software applications that bring together many
types and sources of data on environmental chemicals into one central
location. Currently, the ACToR chemical and assay databases contain
information on chemical structure,  in vitro bioassays and in vivo
toxicology assays derived from more than 150 sources including the
EPA, CDC, FDA, NIH, state agencies, corresponding government
agencies in Canada, Europe and Japan, universities, the World Health
Organization and NGOs. An important set of data collections comes
from the DSSTox project (Distributed Structure-Searchable Toxicity)
(Richard and Williams,  2002) at the EPA which produces curated
collections of chemical structures with corresponding assay data. The
design of ACToR has followed that of the NIH PubChem Project in
many respects, but has been generalized to allow for the broader types
of data  that  are of interest to toxicologists and  environmental
regulators. The current ACToR  web  interface is also  designed to
meet the needs of scientists focused on the study of chemical toxicity.
   This paper briefly outlines the design of the ACToR database and
the types of data it contains, and will illustrate its utility in the context
        of developing training and validation data sets for chemical screening
        and prioritization projects.

        Materials and methods

        Organization of the database.   The current version of ACToR is focused
        mainly on capturing information on chemicals and assays of chemical-
        biological effects. Plans are underway to extend this to capture relevant
        genomic and biological pathway information. The organizing
        principles for the design of the  chemical/assay system are largely
        derived from the PubChem  project, which  is capturing chemical
        structure and HTS information on millions of chemicals in its role as
        the main data repository for  the NIH Molecular Libraries Roadmap
        (Austin et al., 2004). The main organizing principle of PubChem centers
        on  the three main types of  data  that are  catalogued: substances
        indexed by substance identifier (SID), compounds (i.e., chemical
        structures) indexed by compound identifier (CID), and bioassays
        indexed by assay identifier (AID). A PubChem substance is a single
        chemical entity submitted by one data source and often corresponds to
[Fig. 1 diagram: entities of the ACToR schema (all labeled "from ACTOR") and their key attributes]
  Generic Chemical: ACTOR_GCID (INTEGER), NAME (TEXT), CASRN (TEXT)
  Substance:        ACTOR_SID (INTEGER), NAME (TEXT), CASRN (TEXT)
  Compound:         ACTOR_CID (INTEGER), NAME (TEXT), SMILES (TEXT), MOLFILE (TEXT)
  Data Collection:  ACTOR_DCID (INTEGER), NAME (TEXT)
  Assay:            ACTOR_AID (INTEGER), NAME (TEXT)
  Assay Component:  ACTOR_ACID (INTEGER), NAME (TEXT)
  Assay Result:     ACTOR_RSID (INTEGER), VALUE (REAL)
  Phenotype:        ACTOR_PID (INTEGER), NAME (TEXT)
Fig. 1. Schematic, entity-relationship (ER) diagram of the ACToR database schema showing key relationships between substances, compounds, generic chemicals, data collections and
assays. The annotations on the connecting lines are the range of number of entities for each relationship. For instance, a substance can have 0 or 1 generic chemicals (indicated by 0..1),
while a generic chemical can have 1 to many substances (indicated by 1..*). The zero/one-to-many relationships are implemented through separate tables (not shown). The actual
schema contains 44 tables.

-------
the physical substance on which some experiment was performed. A
compound is a generic chemical entity that corresponds to a unique
chemical structure. Since a substance is defined as being both data
source and experiment-specific, many substances (SIDs) may map to a
single  compound  (CID). A bioassay, indexed by AID, represents a
specific type of test data associated with one or more substances.
   In ACToR, these ideas are generalized  somewhat, although the
model is close enough such that all data from PubChem can be easily
loaded into ACToR. In ACToR, a substance is  similarly defined as a
unique chemical from a single "data collection" (see below) and is
minimally  characterized by a data collection-specific  SID and a
chemical name. Most often, the substance  will also have synonyms,
a CAS (Chemical Abstracts  Service) registry  number (CASRN)  and
multiple other parameters.  A compound always has an associated
chemical structure and  a data collection-specific CID, in addition to
optional parameters derived directly from chemical structure, such as
SMILES and InChI representations and a molecular weight. Note that
since ACToR is  in essence a  "super-aggregator",  pulling  in large
external data collections such as PubChem, it also stores the source-
labeled CIDs from each independent collection (e.g., PubChem CID,
DSSTox CID). The  data  collection-specific SIDs and  CIDs  are  called
SOURCE_NAME_SID and SOURCE_NAME_CID and are alphanumeric
strings of the form PUBCHEM_1234 or SIDS_2345. Additionally, ACToR
internally uses sequentially generated unique numeric SIDs and CIDs.
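
The two identifier layers described above can be sketched as follows. This is a minimal illustration only: the function names and the in-memory counter are hypothetical, and the actual ACToR loaders are written in Perl against MySQL.

```python
import itertools

# Hypothetical stand-in for the sequentially generated internal numeric IDs.
_internal_ids = itertools.count(1)


def make_source_id(source_name: str, local_id: str) -> str:
    """Build a source-labeled identifier such as 'PUBCHEM_1234'."""
    return f"{source_name.upper()}_{local_id}"


def split_source_id(source_id: str) -> tuple:
    """Split on the last underscore, so source names may contain underscores."""
    source_name, _, local_id = source_id.rpartition("_")
    if not source_name or not local_id:
        raise ValueError(f"not a source-labeled identifier: {source_id!r}")
    return source_name, local_id


def next_internal_id() -> int:
    """Return the next internal numeric identifier (assumed to start at 1)."""
    return next(_internal_ids)
```

Keeping the source-labeled string alongside the internal number lets a record be traced back to its original data collection.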
   Data on chemicals across data collections are aggregated using the
concept of a generic chemical, which for this purpose takes the place
of the compound in PubChem. The vast  majority of chemicals in
PubChem  have defined structures, while  many  environmental
chemicals  are complex, and often undefined, mixtures. However,
most environmental chemicals, along with  their related toxicity data
are indexed by a more discriminating CAS  registry number (CASRN)
rather than by chemical structure. Because of this, ACToR aggregates
information based  on CASRN.  A generic chemical is defined by a
CASRN, a preferred name, an ACToR CID and a unique generic chemical
identifier or  GCID. All data on all substances sharing a particular
CASRN are attached to the corresponding  generic chemical.  An
advantage of using CASRN is that different numbers will be assigned
to a pure substance versus a mixture of isomeric forms or a mixture of
unrelated compounds.  All of these cases, however,  may  share a
common compound  PubChem CID and  representative  structure.
Disadvantages of using  CASRN include the  fact that they are not
always available or unique for a given substance (e.g. CASRN can be
retired and replaced), they do not typically distinguish to the level of
compound purity grade (e.g., analytical vs. technical grade), and they
                              are tied to a non-public registry system (Chemical Abstracts Service
                              (CAS) SciFinder). Nonetheless, CASRN are sufficiently general to serve
                              as the basis for aggregation. Because only a small fraction of PubChem
                              substance records contain a CASRN, we perform a second level of
                              aggregation and  pull  in  all PubChem substances that  share  the
                              structure or  PubChem  CID associated  with a  particular GCID.
                              Currently, the  two main sources of chemical structure data in ACToR
                              are EPA DSSTox and PubChem. Because DSSTox structures are quality
                              reviewed, hand curated,  and reconciled  with chemical name  and
                              CASRN, they  are  always preferred over  structures automatically
                              generated  and provided  by disparate sources in  PubChem. Fig. 1
                              illustrates the basic relationships between substances, compounds,
                              generic chemicals and assays, which are described next.
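
The two-level aggregation described above can be sketched in a few lines. This is a simplified illustration, not the ACToR implementation: the class names, field names and the `aggregate` function are assumptions. Substances are first grouped by CASRN, and substances that lack a CASRN are then attached to every generic chemical that shares their structure identifier.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Substance:
    sid: str                     # source-labeled substance ID
    name: str
    casrn: Optional[str] = None  # often missing for PubChem records
    cid: Optional[str] = None    # structure identifier, if any


@dataclass
class GenericChemical:
    gcid: int
    casrn: str
    cids: set = field(default_factory=set)
    substances: list = field(default_factory=list)


def aggregate(substances):
    """Group substances into generic chemicals keyed by CASRN."""
    by_casrn = {}
    # Level 1: every substance that carries a CASRN joins that generic chemical.
    for s in substances:
        if s.casrn:
            gc = by_casrn.setdefault(
                s.casrn, GenericChemical(gcid=len(by_casrn) + 1, casrn=s.casrn))
            gc.substances.append(s)
            if s.cid:
                gc.cids.add(s.cid)
    # Level 2: substances without a CASRN are pulled in through a shared structure.
    cid_to_gcs = defaultdict(list)
    for gc in by_casrn.values():
        for cid in gc.cids:
            cid_to_gcs[cid].append(gc)
    for s in substances:
        if not s.casrn and s.cid in cid_to_gcs:
            for gc in cid_to_gcs[s.cid]:
                gc.substances.append(s)
    return by_casrn
```

Because one structure can correspond to several CASRNs (for example a pure substance and an isomeric mixture), the sketch attaches a structure-matched substance to each matching generic chemical; whether ACToR resolves such cases differently is not stated in the text.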
                                 In ACToR, an assay is a collection of data values associated with a
                              set of substances and can  be represented in a rectangular matrix. An
                              assay is associated with an AID, a name, a category,  and one or more
                              phenotypes. Examples of assay categories  are listed in Table 1  and
                              reflect our focus  on chemical toxicity and its origin in detailed
                              molecular biological interactions. As one can see, the concept of an
                              assay as implemented in ACToR is purposely broad so as to capture any
                              information potentially  relevant  to  understanding  toxicity  and
                              evaluating risk for environmental chemicals. An assay also can have
                              one or more components, which correspond to the columns of the
                              rectangular data  matrix.  Each  component is defined by  an assay
                              component identifier  (ACID), the corresponding  AID, a  name,  a
                              description, units (when applicable) and a data type (FLOAT, INTEGER,
                              CATEGORICAL, TEXT, BOOLEAN, URL). The actual data values are called
                              assay results and are linked to the assay, the assay component and the
                              original data-collection-specific substance.
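
The assay, assay component and assay result relationship can be illustrated with a reduced schema. The sketch below is an assumption-laden miniature, not the real 44-table MySQL schema: it uses SQLite so that it runs standalone, the table and column names are simplified, and the inserted values are illustrative only. The `assay_matrix` function rebuilds the rectangular matrix (substances as rows, components as columns) from the stored results.

```python
import sqlite3

SCHEMA = """
CREATE TABLE assay (
    aid      INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    category TEXT
);
CREATE TABLE assay_component (
    acid      INTEGER PRIMARY KEY,
    aid       INTEGER NOT NULL REFERENCES assay(aid),
    name      TEXT NOT NULL,
    units     TEXT,
    data_type TEXT
);
CREATE TABLE assay_result (
    rsid  INTEGER PRIMARY KEY,
    aid   INTEGER NOT NULL REFERENCES assay(aid),
    acid  INTEGER NOT NULL REFERENCES assay_component(acid),
    sid   TEXT NOT NULL,
    value REAL
);
"""


def build_demo_db():
    """Create an in-memory database holding one small, made-up assay."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO assay VALUES (1, 'demo assay', 'Physicochemical')")
    conn.executemany(
        "INSERT INTO assay_component VALUES (?, 1, ?, ?, 'FLOAT')",
        [(10, "molecular_weight", "g/mol"), (11, "boiling_point", "degC")])
    conn.executemany(
        "INSERT INTO assay_result (aid, acid, sid, value) VALUES (1, ?, ?, ?)",
        [(10, "DEMO_1", 78.11), (11, "DEMO_1", 80.1), (10, "DEMO_2", 92.14)])
    return conn


def assay_matrix(conn, aid):
    """Return (column names, {sid: [one value per component, None if missing]})."""
    comps = conn.execute(
        "SELECT acid, name FROM assay_component WHERE aid = ? ORDER BY acid",
        (aid,)).fetchall()
    col_index = {acid: i for i, (acid, _) in enumerate(comps)}
    rows = {}
    results = conn.execute(
        "SELECT sid, acid, value FROM assay_result WHERE aid = ? ORDER BY sid",
        (aid,))
    for sid, acid, value in results:
        rows.setdefault(sid, [None] * len(comps))[col_index[acid]] = value
    return [name for _, name in comps], rows
```

Storing one row per (substance, component) pair keeps sparse assays compact, and missing cells show up as `None` when the matrix is rebuilt.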
                                 Because ACToR is intended to support  hazard identification  and
                              risk assessment, assays can be labeled by a series of "phenotypes" for
                              which they contain information. The set of phenotypes implemented
                              in ACToR spans the traditional toxicology study areas: general
                              chemical hazard, acute toxicity, subchronic toxicity, chronic toxicity,
                              carcinogenicity, developmental toxicity, reproductive toxicity, neuro-
                              toxicity, developmental neurotoxicity, immunotoxicity, dermal toxi-
                              city, respiratory toxicity, genotoxicity and ecotoxicity.

                              Data sources.  ACToR is importing data from a large  number of public
                              sources (currently >150), which are referred to as data collections. A
                              data collection will usually include a set of substances and may have
                              corresponding compounds (chemical structures) and  one or more
                              assays. The largest source of data currently in ACToR in  terms of
                              substances and assay data  points is  PubChem, which is itself a
Table 1
Categories of assays in ACToR
Assay category                               Description                                                   Examples
Physicochemical                              Physical and chemical properties (in vitro and/or in silico)  Molecular weight, logP, boiling point
Biochemical                                  Biochemical (non-cell-based) (in vitro and/or in silico)      Enzyme inhibition or receptor binding constants
Genomics                                     Gene expression values or signatures                          Result of in vitro or in vivo microarray analysis
Cellular                                     Cell-based assay                                              Cell culture cytotoxicity
Tissue                                       Tissue slice assays                                           Tissue slice cytotoxicity
In vivo toxicology (tabular primary)         Tabulated results from primary animal-based studies           Clinical chemistry, histopathology, developmental
                                             of chemical effect                                            and reproductive assays
In vivo toxicology (study listing primary)   Primary studies are available but have not been tabulated     Clinical chemistry, histopathology, developmental
                                                                                                           and reproductive assays
In vivo toxicology (tabular secondary)       Tabulated data from secondary sources of in vivo              Clinical chemistry, histopathology, developmental
                                             toxicology studies                                            and reproductive assays
In vivo toxicology (summary calls)           Derived summary determinations of risk                        Chemicals determined to pose a defined risk
                                                                                                           of human cancer
In vivo toxicology (summary report via URL)  Links to text reports on the web for which specific data      Reports from EPA Integrated Risk Information System (IRIS),
                                             values are not directly accessible in tabular form            National Toxicology Program (NTP)
Regulatory                                   Listings of chemicals that fall under specific environmental  U.S. Toxic Substances Control Act (TSCA)
                                             laws or government mandates
Chemical category                            Listing of structural or use categories, often intended       Phthalate
                                             for prioritization efforts

-------
Table 2
Summary statistics for the ACToR database
Data collections                          232
Source-specific substances            964,083
Compounds (chemical structures)       404,196
Generic chemicals                     504,871
Generic chemicals with structure      390,379
Assays                                  1,592
Assay components                       10,733
Assay results                       6,118,231
Assay results are individual data points for a single substance and a single assay
component. The numbers only include substances having CAS registry numbers. A
much larger number of substances, compounds and assay results are included from
PubChem, but are not currently indexed as generic chemicals.

compilation of multiple data sources (57 of which have data included
in ACToR). Most assay data in PubChem comes from HTS assays run by
the Molecular Libraries Screening Centers Network (MLSCN) (Austin
et al.,  2004)  on  compounds from the  Molecular Libraries Small
Molecule  Repository (MLSMR).  However, the vast majority  of
chemicals in PubChem have no assay data and come from collections
of molecular structures  from chemical manufacturer catalogs (e.g.,
SIGMA) or virtual screening libraries (e.g., ZINC).
   The balance of the data collections within ACToR pertains  more
specifically to environmental  chemicals. These collections are from
the US EPA, CDC, FDA, NIH, equivalent agencies in Europe, Japan and
Canada, the World Health Organization, universities and several states
and NGOs. Some of these specific sources are described below.
   To be included in ACToR, a data collection must meet several
criteria. First, it has to be publicly available with no restrictions on
redistribution. An  important goal of the ACToR project is to create a
widely usable, freely distributable, open-source system. Any conclu-
sions drawn from these data  should be  subject to  independent
confirmation, which is made possible by this open-source data model.
Second, the collection should contain information on environmen-
tally-relevant chemicals. We have not to this point included a number
of data  sets focused exclusively on pharmaceutical compounds,
although toxicological information on these compounds is potentially
informative. Third, if a source consists of a web-accessible database,
we require an index of the chemicals in the  database in order to link
that web resource  back into ACToR. Several web databases that allow
local searching by name or CASRN do not provide full access or a list of
the chemicals needed for indexing; hence, these are not currently
included in the database. However, there are a number of important
data collections without publicly available indexes, so the ACToR user
interface  provides URL  links to  allow the  user  to search these
databases on a chemical-by-chemical basis  based on CASRN and/or
name.  TOXNET  and  its  component  databases are the main  data
collections in this  category. In addition to compiling data from other
databases, selected tabular information from the primary toxicology
literature is also being captured  in ACToR.

Software  aspects.     The  ACToR database  is  implemented using
MySQL. Software to preprocess  and  load data is written in Perl and
the web  interfaces are written in Java. The use of 100% open-source
software will allow the entire system to be easily distributed to other
interested groups.

Search and browsing.    The current version of the database allows
one to browse by data collection or assay, and to search chemicals by
name and structure using a chemical drawing applet and standard
chemical similarity algorithms.
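
The text does not say which similarity algorithm the structure search uses. As one common example of a "standard" measure, the sketch below computes the Tanimoto coefficient on fingerprint bit sets and ranks a small library against a query. The function names, the threshold default and the idea that fingerprints are held as Python sets are all assumptions for illustration.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient: shared 'on' bits divided by bits set in either."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def rank_by_similarity(query: set, library: dict, threshold: float = 0.5):
    """Return (name, score) pairs scoring at or above threshold, best first."""
    scored = ((name, tanimoto(query, fp)) for name, fp in library.items())
    hits = [hit for hit in scored if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

In practice the fingerprints would come from a cheminformatics toolkit; they are plain sets of bit positions here so that the example stays self-contained.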

Results

   Table  2 gives summary statistics on the current composition of the
database. As already mentioned,  the vast majority of substances come
from PubChem, although the overlap of that set with chemicals of
environmental interest is relatively small. ACToR contains all sub-
stances, compounds and assay results from PubChem, but the table
only gives counts for chemicals that can be indexed by CAS registry
number, which yields just over 500,000 unique or generic chemicals.
   To illustrate the utility of ACToR, we show how the aggregated data
can be used to evaluate sets of chemicals for use in developing and
validating toxicology signatures  for a screening and prioritization
approach. This approach  is more fully explored elsewhere (Judson
et al., in preparation). Our focus is on environmental chemicals having
sufficiently high production  and use volumes such that  there is
potential for  human and ecological exposure. The sets of chemicals
used are summarized in Table 3.  Because of overlaps between these
lists, the current total number of generic  chemicals considered is
11,139. This  exact  number  will fluctuate  over time  because  the
chemicals included  in the lists periodically change  due to altered
use-patterns, introduction of new chemicals, and discontinuation of
use of others.  On average,  each of these chemical substances  has
information derived from 2-3 sources, although some chemicals are
found in a dozen or more data collections. Of these chemicals, 7512
have an associated chemical structure and CID assigned. Of the
chemicals without  a  structure, many  are mixtures or complex
substances (e.g., mica,  milk, mink oil and molasses, all of which are
pesticide inert ingredients).
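
The effect of list overlap can be checked with simple arithmetic on the per-list counts reported in Table 3 and the totals quoted above. The short labels used as dictionary keys are shorthand introduced here, not names from the database.

```python
# Per-list chemical counts as reported in Table 3.
list_sizes = {
    "HPV": 2810,
    "IUR": 5375,
    "Pesticide active ingredients": 3476,
    "Pesticide inert ingredients": 3850,
    "TRI": 577,
    "Drinking water contaminants": 120,
    "EDC": 73,
}

unique_generic_chemicals = 11139  # total after merging the lists (from the text)
with_structure = 7512             # chemicals with a structure and CID (from the text)

# Sum of list memberships, counting a chemical once per list it appears on.
total_memberships = sum(list_sizes.values())

# Memberships that duplicate a chemical already counted on another list.
redundant_memberships = total_memberships - unique_generic_chemicals

# Share of the merged set that has an assigned chemical structure.
structure_fraction = with_structure / unique_generic_chemicals
```

The seven lists contain 16,281 entries in total, so 5,142 entries repeat a chemical found on another list, and about 67% of the 11,139 merged chemicals have an assigned structure.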
   The primary in vivo toxicology assays (either tabulated or not) are
those derived from National Toxicology Program (NTP) studies and from
ToxRefDB. Most of the data currently in ToxRefDB, a
component of the larger ACToR system, are summary results of
primary toxicology studies submitted to  the EPA on pesticide active
ingredients (Martin  et al.,  2007).  Typically these data have been
extracted from EPA Office of Pesticide Programs (OPP) evaluations of
studies  based  on EPA Office  of Prevention, Pesticides and Toxic
Substances (OPPTS) harmonized test guidelines (http://www.epa.gov/
opptsfrs/home/guidelin.htm). ToxRefDB captures details of study design
and dose series data from the areas of histopathology, clinical chemistry,
hematology, gross anatomy, pathology (neoplastic and non-neoplastic),
urinalysis and mortality. Data are aggregated at the level of animal
treatment group (dose and time). Summary data from ToxRefDB are
entered into the ACToR assay tables. NTP primary tabular data are also
being entered into ToxRefDB for chemicals not covered by OPPTS
sources. The DSSTox program has indexed all of the studies in the NTP
database by chemical and study type, and this index is in ACToR.
   The secondary in vivo toxicology study data are derived from Risk-
Based Concentrations (RBC); WHO Classifications of Pesticide
Table 3
Sources of lists of environmental chemicals
Data collection                                                            Chemicals
HPV: High Production Volume chemicals produced or imported in                   2810
 quantities > 1 M lb/year
 http://www.epa.gov/hpv/pubs/update/hpvchmlt.htm
IUR: Inventory Update Rule, chemicals produced or imported in                   5375
 quantities > 10,000 lb/year (2002 list) (also referred to as MPVs or
 medium production volume chemicals)
 http://www.epa.gov/oppt/iur/tools/data/2002-comp-chem-records.htm
Pesticide active ingredients including anti-microbials and food-use             3476
 pesticides
 http://www.epa.gov/pesticides/factsheets/registration.htm
Pesticide inert ingredients (or "other ingredients")                            3850
 http://www.epa.gov/opprd001/inerts/lists.html
TRI: Toxic Release Inventory provides reports of toxic chemical                  577
 releases and other waste management activities
 http://www.epa.gov/tri/
Drinking water chemical contaminants, disinfection by-products,                  120
 and chemical contaminant candidates
 http://www.epa.gov/safewater/ccl/index.html
EDC: Draft list of chemicals considered for endocrine disruption                  73
 screening
 http://www.epa.gov/endo/pubs/edspoverview/index.htm

-------
Table 4
URLs for sources of data described in this article
Data source                                                             URL
CDC Agency for Toxic Substances and Disease Registry (ATSDR)            http://www.atsdr.cdc.gov/toxfaq.html
California EPA Proposition 65                                           http://www.oehha.ca.gov/prop65/prop65_list/Newlist.html
Cancer Potency Database (DSSTox)                                        http://potency.berkeley.edu, http://www.epa.gov/ncct/dsstox/sdf_cpdbas.html
Chemical Abstracts Service (CAS) SciFinder                              http://www.cas.org/
Center for the Evaluation of Risks to Human Reproduction (CERHR)        http://cerhr.niehs.nih.gov/chemicals/index.html
CERCLA Priority List of Hazardous Substances                            http://www.atsdr.cdc.gov/cercla/05list.html
eChemPortal                                                             http://webnet3.oecd.org/echemportal/ParticipatingDb.aspx
DrugBank                                                                http://redpoll.pharmacy.ualberta.ca/drugbank
EPA Disinfection By-products Database (DSSTox)                          http://www.epa.gov/ncct/dsstox/sdf_dbpcan.html
EPA Fathead Minnow Database (DSSTox)                                    http://www.epa.gov/ncct/dsstox/sdf_epafhm.html
EPA HPV Challenge Program                                               http://www.epa.gov/hpv/
EPA HPV Information System                                              http://www.epa.gov/ncct/dsstox/sdf_hpvcsi.html
EPA Integrated Risk Information System (IRIS)                           http://www.epa.gov/iris, http://www.epa.gov/ncct/dsstox/sdf_iristr.html
EPA Pesticide Fact Sheets (Conventional Chemicals)                      http://www.epa.gov/opprd001/factsheets
EPA Office of Pesticides (OPP) Inert (other) Pesticide Ingredients      http://www.epa.gov/opprd001/inerts/lists.html
EPA Risk-Based Concentrations (RBC)                                     http://www.epa.gov/reg3hwmd/risk/human/index.htm
EPA ToxCast Program                                                     http://www.epa.gov/comptox/toxcast/
European Substances Information System (ESIS)                           http://ecb.jrc.it/esis
EXTOXNET Pesticide Information Profiles                                 http://extoxnet.orst.edu
FDA Everything Added to Food in the United States                       http://vm.cfsan.fda.gov/~dms/eafus.html
FDA Maximum Daily Dose Database                                         http://www.epa.gov/ncct/dsstox/sdf_fdamdd.html
Health Canada Priority Substance Lists                                  http://www.hc-sc.gc.ca/ewh-semt/contaminants/existsub/categor/_result_substance/index_e.html
INCHEM Concise International Chemical Assessment Documents              http://www.inchem.org/pages/cicads.html
INCHEM Environmental Health Criteria Monographs                         http://www.inchem.org/pages/ehc.html
INCHEM International Agency for Research on Cancer (IARC)               http://www.inchem.org/pages/iarc.html
ITER TERA Risk Assessments                                              http://www.tera.org
Ministry of Health Labor and Welfare (Japan) Risk Assessments           http://wwwdb.mhlw.go.jp/ginc/html/dbl.html
Molecular Libraries Small Molecule Repository (MLSMR)                   http://mlsmr.glpg.com/MLSMR_HomePage/project.html
National Toxicology Program (NTP)                                       http://ntp.niehs.nih.gov/, http://www.epa.gov/ncct/dsstox/sdf_ntpbsi.html
NIH Molecular Libraries Roadmap                                         http://nihroadmap.nih.gov/molecularlibraries/
NTP 11th Report on Carcinogens (RoC)                                    http://ntp.niehs.nih.gov/ntpweb/index.cfm?objectid=035E5806-F735-FE81-FF769DFE5509AF0A
OECD Screening Information Data Sets (SIDS) for High Volume Chemicals   http://www.chem.unep.ch/irptc/sids/OECDSIDS/indexcasnumb.htm
PubChem                                                                 http://pubchem.ncbi.nlm.nih.gov
TOXNET                                                                  http://toxnet.nlm.nih.gov
WHO Classifications of Pesticide Hazard                                 http://www.inchem.org/documents/pds/pdsother/class.pdf
Hazards; the Cancer Potency Database (CPDB) (Richard et al., 2006);
the EPA Fathead Minnow Database (Russom et al., 2007); the FDA
Maximum Daily Dose Database (Matthews et al., 2004); and IRIS
(Integrated Risk Information System) (Richard et al., 2007). With the
exceptions of RBC and the WHO pesticide data, the data for these sets
are taken from the DSSTox database. Web site URLs for all of the data
sources used for this analysis are given in Table 4.
   The category of in vivo toxicology (summary calls) describes
sources where experts have reviewed the toxicology literature and
have made a definitive statement about a particular chemical and
endpoint, for instance labeling a chemical as a proven human
carcinogen. Data sources for this category are California EPA
Determination of Cancer and Developmental Risks (Proposition 65);
CERCLA Priority List of Hazardous Substances; the FDA Everything
Added to Food in the United States List (EAFUS); Health Canada
Priority Substance Lists; EPA OPP Inert (other) Pesticide Ingredients
categories; NTP 11th Report on Carcinogens (RoC); and the Disinfection
By-products Database (Woo et al., 2007).
   Data sources included under Toxicity Summary Reports on  the
Web are Cancer Potency Database; National Toxicology Program (NTP)
reports (Burch et al., 2007); IRIS; EPA HPV Information System; CDC
Agency for Toxic Substances and Disease Registry (ATSDR); Center for
the Evaluation of Risks to Human Reproduction (CERHR); DrugBank;
EPA Pesticide Fact  Sheets; EXTOXNET Pesticide Information Profiles;
INCHEM  Concise  International Chemical  Assessment  Documents
(CICAD); INCHEM Environmental Health Criteria Monographs (EHC);
INCHEM International Agency for  Research on Cancer (IARC); ITER
TERA Risk Assessments; Ministry of Health Labor and Welfare (Japan)
Risk Assessments; NTP 11th Report on Carcinogens (RoC); OECD
Screening Information Data Sets (SIDS) for High Volume Chemicals;
and ESIS (European chemical Substances Information System) and its
subsets ESIS HPV, ESIS LPV (low production volume), ESIS PBT
(Persistent Bioaccumulating Toxins), and ESIS ORATS (Online European
Risk Assessment Tracking System). Note that some sources of
data are included in multiple assay categories.
   Table 5 summarizes the amount of toxicology information that we
have currently captured in ACToR, selected by assay categories for the
set of 11,139 environmental chemicals being analyzed. About half of
these chemicals have some publicly available toxicology data within
the sets of information we have currently compiled. Primary in vivo
toxicology data is available for 1447 chemicals (13%), and secondary in
vivo toxicology data is available for a total of 1405 chemicals (13%). A
total of 5205 chemicals (47%) have one or more summary in vivo
toxicity calls or determinations, which are derived by experts who
have curated data from the primary scientific literature. Finally, 5244
chemicals (47%) have one or more summary text reports on chemical
toxicity available on the web. However, many of these (especially from
the ESIS LPV list) simply state that no hazard or toxicology information
is available for that chemical. We emphasize once again that these are
conservative numbers as there are still large collections of data yet to
Table 5
(Number of chemicals) x (number of assays) in ACToR for the 11,139 environmental
chemicals being analyzed

Assays                                            0     1     2     3     4     5    >5
In vivo toxicology (study listing primary)     9692   686   341    85    42    51   242
In vivo toxicology (tabular secondary)         9734   861   240   163    91    43     7
In vivo toxicology (summary calls)             5934  2776   950   419   258   184   618
In vivo toxicology (summary report via URL)    5895  2706  1121   573   344   203   297
Table 6
Number of the 308 ToxCast Phase 1 chemicals for which data is captured in ACToR from
primary guideline studies or from IRIS assessments for key areas of toxicology

Phenotype                   ToxRefDB   IRIS   NTP   Total   % Coverage
Acute toxicology                   0    126     0     126           41
Subchronic toxicology            235      0     0     235           77
Chronic toxicology               274    126    38     291           95
Carcinogenicity                  263    126    38     285           93
Developmental toxicology         274    126     4     290           95
Reproductive toxicology          251    126     0     Til           91
Immuno-toxicology                  0    126     1     126           41
Genotoxicity                       0    126    60     147           78
Neuro-toxicology                   0    126     0     126           41

be compiled and loaded into the database. The bottom line, though, is
that there  is little detailed in vivo toxicology information  for the
majority of these environmental chemicals.
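
The coverage figures quoted for Table 5 can be rechecked directly from the table rows. The following is a minimal sketch, not part of the ACToR software: the counts are transcribed from Table 5, and the dictionary keys and the function name `coverage` are illustrative choices made here.

```python
# Counts transcribed from Table 5: chemicals with 0, 1, 2, 3, 4, 5 and >5 assays.
TOTAL_CHEMICALS = 11139

TABLE5 = {
    "primary": [9692, 686, 341, 85, 42, 51, 242],
    "secondary": [9734, 861, 240, 163, 91, 43, 7],
    "summary calls": [5934, 2776, 950, 419, 258, 184, 618],
    "summary report": [5895, 2706, 1121, 573, 344, 203, 297],
}


def coverage(counts, total=TOTAL_CHEMICALS):
    """Return (chemicals with at least one assay, percent of total, rounded)."""
    # Each row should account for every chemical exactly once.
    assert sum(counts) == total, "row does not sum to the chemical total"
    with_data = total - counts[0]  # everything outside the "0 assays" column
    return with_data, round(100 * with_data / total)


for category, counts in TABLE5.items():
    n, pct = coverage(counts)
    print(f"{category}: {n} chemicals ({pct}%)")
```

Each row sums to 11,139, and the resulting counts (1447, 1405, 5205 and 5244) match the numbers given in the text.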
   The EPA ToxCast program is a major driver of the development of
the ACToR system. The goal of ToxCast is to develop and test methods
for chemical screening and prioritization by linking the results of in
vitro assays to in vivo toxicity data (Dix et al., 2007). The most rigorous
chemical toxicity testing data are derived from whole animal human
health guideline studies, many of  which are  being captured in
ToxRefDB and ACToR, initially for a set of 308 unique chemicals that
are being used  in Phase 1 of ToxCast. OPP-required guideline studies
and NTP studies are the primary source of the in vivo data that will be
used in ToxCast. A secondary source is the data from IRIS assessments.
Table 6 shows the number of chemicals in the ToxCast Phase 1 set of
308 that have data from OPP guideline studies, from NTP or from IRIS
for a set of key areas  including acute  toxicity, subchronic toxicity,
chronic toxicity, carcinogenicity, developmental toxicity, reproductive
toxicity, immunotoxicity, neurotoxicity  and genotoxicity. As one can
see, there is currently good coverage for many of these areas, but
several will require searches through other ACToR-catalogued data
collections in order to build the complete analysis data set.
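
In Table 6 the Total column is smaller than the sum of the per-source counts (for chronic toxicology, 274 + 126 + 38 exceeds 291), which is consistent with each chemical being counted once even when several sources cover it. The sketch below illustrates that kind of union-based coverage count. It is an assumption about how such a total could be computed, not the authors' procedure, and the chemical identifiers are made-up placeholders.

```python
def union_coverage(sources, n_chemicals):
    """Count chemicals covered by at least one source and the percent coverage.

    sources: dict mapping a source name to a set of chemical identifiers.
    n_chemicals: size of the full chemical set (308 for ToxCast Phase 1).
    """
    covered = set().union(*sources.values())  # each chemical counted once
    return len(covered), 100 * len(covered) / n_chemicals


# Hypothetical toy data: 5 chemicals, three partially overlapping sources.
toy_sources = {
    "ToxRefDB": {"chem1", "chem2", "chem3"},
    "IRIS": {"chem2", "chem4"},
    "NTP": {"chem3"},
}

count, pct = union_coverage(toy_sources, n_chemicals=5)
print(count, pct)
```

In the toy data the three sources list six entries in total but cover only four distinct chemicals, so the coverage is 80%.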

Discussion

   This  paper  briefly describes  ACToR (Aggregated Computational
Toxicology Resource), which is a set of linked  databases and analysis
tools  that aggregate a large  number of data sets  of relevance to
environmental chemicals and toxicology. The utility of the system was
illustrated with  an example showing the  amount of data available from
multiple sources that can be used for developing training and validation
sets for high-throughput chemical screening and prioritization efforts.
   ACToR is not alone in its goal of aggregating large sets of chemical
structure and assay  data. PubChem is  the largest  effort currently
available, with  information on more than 10 M unique chemical
compounds. PubChem currently focuses on aggregating data  from in
vitro  HTS assays as  the primary  data repository  for the MLSCN.
PubChem allows more generalized types of assay data to be submitted
and displayed, but their query engine is not tailored to the types of
custom toxicology-based queries needed for our purposes. However,
their underlying data model maps easily into the ACToR application
and serves as a  useful model for our internal data organization. This
has allowed us to import all of the PubChem data and easily integrate
it with other data sources. Another important comparison is with
TOXNET, which is a collection of multiple data sources covering many
aspects of chemical toxicity. TOXNET has a common search engine that
allows the user to easily find data from multiple sources. However, it is
a closed system that does not allow a user to pull together datasets
that are useful for computational purposes. One unique aspect of the
ACToR system is that it pulls together the data from PubChem
(focused on chemical structure and HTS in vitro assay data) and
TOXNET (focused on in vivo toxicology data) and combines them in a way
that can be used for computational analysis. We are in the process of
extracting selected tabular data from TOXNET to include directly in
the ACToR database.
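
The idea of combining tabular data from several sources into one computable record per chemical can be sketched as follows. This is only an illustration under stated assumptions: the paper does not describe the ACToR schema here, so the use of a CAS registry number as the join key, the field names, and the function `aggregate_by_casrn` are all hypothetical.

```python
from collections import defaultdict


def aggregate_by_casrn(source_tables):
    """Merge per-source rows into one record per chemical.

    source_tables: dict mapping a source name to a list of row dicts,
    each with a "casrn" key (assumed join key) plus arbitrary data fields.
    Field names are prefixed with the source so provenance is kept.
    """
    merged = defaultdict(dict)
    for source, rows in source_tables.items():
        for row in rows:
            for field, value in row.items():
                if field != "casrn":
                    merged[row["casrn"]][f"{source}.{field}"] = value
    return dict(merged)


# Hypothetical placeholder rows; identifiers and values are not real data.
tables = {
    "in_vitro_source": [{"casrn": "CAS-A", "assay_hits": 3}],
    "in_vivo_source": [
        {"casrn": "CAS-A", "chronic_study": True},
        {"casrn": "CAS-B", "chronic_study": False},
    ],
}

records = aggregate_by_casrn(tables)
print(records["CAS-A"])
```

Prefixing each field with its source name keeps the origin of every value visible after merging, which matters when the same endpoint is reported by more than one source.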
     eChemPortal is an Organization for Economic Co-operation and
  Development (OECD) effort very similar to ACToR. It is aggregating
  information on HPVs and pesticides, among others. eChemPortal
  currently contains links to 7 large database systems, some of which
  contain what in ACToR are multiple individual databases (e.g.,
  INCHEM contains 11 individual databases). Unlike eChemPortal,
  which provides links to web pages for the component databases,
  ACToR extracts tabular data from the individual sources and makes it
  searchable in an aggregated fashion. A system called Vitic is being
  developed as a collaboration between IUCLID and a number of
  pharmaceutical companies with the goal of being an international
  toxicology information center (Judson et al., 2005). Finally, the
  European substances Information System or ESIS provides links to a
  number of databases including EPA HPV, IUCLID and EINECS
  (European INventory of Existing Commercial chemical Substances).
  The CEBS (Chemical Effects in Biological Systems) project at the NIEHS
  is constructing a multi-domain information repository to hold the
  detailed results and summaries of in vivo and in vitro toxicology
  experiments (Waters et al., 2003).
     We  have made use of several  reviews  of the  toxicology data
  landscape to select data collections to be included in ACToR. Yang et al.
  have recently published two such reviews (Yang et  al., 2006a; Yang
  et al., 2006b).  In 2001 and 2002,  a pair of review collections  was
  published surveying the landscape of toxicity data available on the
  internet (Brinkhuis, 2001; Poore et al., 2001; Felsot, 2002; Junghans
  et al.,  2002;  Patterson et  al.,  2002;  Polifka and Faustman,  2002;
  Russom, 2002;  Winter, 2002; Wolfgang and Johnson, 2002; Young,
  2002; Richard and Williams, 2003).
     We foresee  several key  uses for ACToR. One is the derivation of
  training and validation data sets  for ToxCast  and other chemical
  screening and prioritization efforts. The second is to serve as a unique
  resource for researchers developing fully computational models linking
  chemical structure with in  vitro and in vivo assays.  Third, this large
  structure-searchable database can be a valuable resource for reviewers
  within the EPA and other regulatory agencies who are examining new
  chemicals submitted for marketing approval. Reviewers can use the
  system to search for structural analogs of the novel compounds and, if
  available, easily locate potentially informative in vitro bioassay and in
  vivo toxicology  data on related compounds. This can, in turn, inform
  their decisions on the novel chemicals under review.
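
A structural analog search of the kind described above is commonly implemented as a similarity ranking over chemical fingerprints. The sketch below uses the Tanimoto coefficient on sets of fingerprint bit positions. The text does not state which similarity method ACToR uses, so this is a generic illustration, and the fingerprints, compound names and threshold are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def find_analogs(query_fp, library, threshold=0.5):
    """Return (name, similarity) pairs at or above threshold, best first."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprints: each set lists the bit positions that are on.
library = {
    "compound_A": {1, 2, 3, 5},
    "compound_B": {7, 8},
    "compound_C": {1, 2, 3, 4},
}

analogs = find_analogs({1, 2, 3, 4}, library)
print(analogs)
```

A reviewer-facing tool would then look up the in vitro and in vivo records attached to each returned analog.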
     ACToR is a  rapidly evolving system.  Future developments  will
  involve bringing in additional  sources of information; extraction of
  tabular data  from on-line text documents linked to  chemicals;
  addition of more curated chemical structures; and the construction
  of a more flexible query  and data export interface. Additionally,
  this system will allow the construction of workflow processes for
  prioritization   of data  capture,  quality control and  chemical
  prioritization scoring.  The system  is currently being used  at the
  EPA to  support the  ToxCast chemical screening and prioritization
  program. We are working  towards  public release of the system in
  2008.

  Conflict of interest disclosure  statement
  The authors declare that they have no conflicts of interest.

  References

Allanou, R., Hansen, B., van der Bilt, Y., 1999. Public availability of data on EU high
    production volume chemicals. http://ecb.jrc.it/documents/Existing-Chemicals/
    PUBLIC_AVAILABILITY_OF_DATA/.
Applegate, J., Baer, K., 2006. Strategies for closing the data gap. http://ecb.jrc.it/
    documents/Existing-Chemicals/PUBLIC_AVAILABILITY_OF_DATA/.
Austin, C.P., Brady, L.S., Insel, T.R., Collins, F.S., 2004. NIH molecular libraries initiative.
    Science 306, 1138-1139.
Birnbaum, L.S., Staskal, D.F., Diliberto, J.J., 2003. Health effects of polybrominated
    dibenzo-p-dioxins (PBDDs) and dibenzofurans (PBDFs). Environ. Int. 29, 855-860.
Brinkhuis, R.P., 2001. Toxicology information from US government agencies. Toxicology
    157, 25-49.
Burch, J., Eastin, W.C., Bowden, B., Wolf, M.A., Richard, A.M., 2007. DSSTox national
    toxicology program bioassay on-line database structure-index locator file: SDF file
    and documentation. www.epa.gov/ncct/dsstox/.
Dix, D.J., Houck, K.A., Martin, M.T., Richard, A.M., Setzer, R.W., Kavlock, R.J., 2007. The
    ToxCast program for prioritizing toxicity testing of environmental chemicals.
    Toxicol. Sci. 95, 5-12.
EPA, 1998. Chemical hazard data availability study. Office of pollution prevention and
    toxics. http://www.epa.gov/hpv/pubs/general/hazchem.pdf.
Felsot, A.S., 2002. WEB resources for pesticide toxicology, environmental chemistry, and
    policy: a utilitarian perspective. Toxicology 173, 153-166.
Guth, J., Denison, R., Sass, J., 2005. Background paper for reform no. 5 of the
    Louisville charter for safer chemicals: require comprehensive safety data for all
    chemicals. http://www.louisvillecharter.org/downloads/CharterBkgrdPaper5.pdf.
Judson, P.N., Cooke, P.A., Doerrer, N.G., Greene, N., Hanzlik, R.P., Hardy, C., Hartmann, A.,
    Hinchliffe, D., Holder, J., Muller, L., Steger-Hartmann, T., Rothfuss, A., Smith, M.,
    Thomas, K., Vessey, J.D., Zeiger, E., 2005. Towards the creation of an international
    toxicology information centre. Toxicology 213, 117-128.
Judson, R., Dix, D.J., Houck, K., Richard, A.M., Martin, M.T., Kavlock, R.J., Dellarco, V.,
    Holderman, T., Tan, S., Carpenter, T., Smith, E., In preparation. The toxicity data
    landscape for environmental chemicals.
Junghans, T.B., Sevin, I.E., Ionin, B., Seifried, H., 2002. Cancer information resources:
    digital and online sources. Toxicology 173, 13-34.
Krewski, D.D., Acosta, J., Anderson, M., Anderson, H., III, J.B., Boekelheide, K., Brent, R.,
    Charnley, G., Cheung, V., Green, S., Kelsey, K., Kerkvliet, N., Li, A., McCray, L., Meyer, O.,
    Patterson, D.R., Pennie, W., Scala, R., Solomon, G., Stephens, M., Yager, J., Zeise, L.,
    2007. Toxicity testing in the twenty-first century: a vision and a strategy. National
    Academies Press, Washington D.C.
Martin, M.T., Houck, K.A., McLaurin, K., Richard, A.M., Dix, D.J., 2007. Linking regulatory
    toxicological information on environmental chemicals with high-throughput
    screening (HTS) and genomic data. The Toxicologist CD - An official Journal of the
    Society of Toxicology 96, 219-220.
Matthews, E.J., Kruhlak, N.L., Weaver, J.L., Benz, R.D., Contrera, J.F., 2004. Assessment of
    the health effects of chemicals in humans: II. Construction of an adverse effects
    database for QSAR modeling. Curr. Drug Discov. Technol. 1, 243-254.
Muir, D.C., Howard, P.H., 2006. Are there other persistent organic pollutants? A
    challenge for environmental chemists. Environ. Sci. Technol. 40, 7157-7166.
Patterson, J., Hakkinen, P.J., Wullenweber, A.E., 2002. Human health risk assessment:
    selected Internet and world wide web resources. Toxicology 173, 123-143.
Polifka, J.E., Faustman, E.M., 2002. Developmental toxicity: web resources for evaluating
    risk in humans. Toxicology 173, 35-65.
Poore, L.M., King, G., Stefanik, K., 2001. Toxicology information resources at the
    Environmental Protection Agency. Toxicology 157, 11-23.
Richard, A.M., Williams, C.R., 2002. Distributed structure-searchable toxicity (DSSTox)
    public database network: a proposal. Mutat. Res. 499, 27-52.
Richard, A., Williams, C., 2003. Public sources of mutagenicity and carcinogenicity data:
    use in structure-activity relationship models. In: Benigni, R. (Ed.), QSARs of
    mutagens and carcinogens. CRC Press, New York, pp. 145-173.
Richard, A.M., Gold, L.S., Nicklaus, M.C., 2006. Chemical structure indexing of toxicity
    data on the internet: moving toward a flat world. Curr. Opin. Drug Discov. Devel. 9,
    314-325.
Richard, A.M., M.A., W, J., B., 2007. DSSTox EPA Integrated Risk Information System
    (IRIS) toxicity review data: SDF file and documentation.
Russom, C.L., 2002. Mining environmental toxicology information: web resources.
    Toxicology 173, 75-88.
Russom, C.L., Williams, C.R., Stewart, T.W., Swank, A.E., Richard, A.M., 2007. DSSTox EPA
    Fathead Minnow Acute Toxicity Database (EPAFHM): SDF files and documentation.
    www.epa.gov/ncct/dsstox/.
Waters, M., Boorman, G., Bushel, P., Cunningham, M., Irwin, R., Merrick, A., Olden, K.,
    Paules, R., Selkirk, J., Stasiewicz, S., Weis, B., Van Houten, B., Walker, N., Tennant, R.,
    2003. Systems toxicology and the Chemical Effects in Biological Systems (CEBS)
    knowledge base. EHP Toxicogenomics 111, 15-28.
Winter, C.K., 2002. Electronic information resources for food toxicology. Toxicology 173,
    89-96.
Wolfgang, G.H., Johnson, D.E., 2002. Web resources for drug toxicity. Toxicology 173,
    67-74.
Woo, Y.T., Williams, C.R., Fields, N., Richard, A.M., 2007. DSSTox EPA Water Disinfection
    By-Products with Carcinogenicity Estimates Database (DBPCAN): SDF files and
    documentation.
Yang, C., Benz, R.D., Cheeseman, M.A., 2006a. Landscape of current toxicity databases
    and database standards. Curr. Opin. Drug Discov. Devel. 9, 124-133.
Yang, C., Richard, A.M., Cross, K.P., 2006b. The art of data mining the minefields of
    toxicity databases to link chemistry to biology. Curr. Comput.-Aided Drug Des. 2,
    135-150.
Young, R.R., 2002. Genetic toxicology: web resources. Toxicology 173, 103-121.

-------
Journal of Andrology, Vol. 29, No. 3, May/June 2008
Copyright © American Society of Andrology
Altered Expression of Muscle- and Cytoskeleton-Related Genes in
a Rat Strain With Inherited Cryptorchidism

JULIA S. BARTHOLD,* SUZANNE M. MCCAHAN,* AMAR V. SINGH,† THOMAS B. KNUDSEN,†
XIAOLI SI,* LIAM CAMPION,* AND ROBERT E. AKINS*
From the *Nemours Biomedical Research and Division of Urology, A.I. duPont Hospital for Children, Wilmington,
Delaware; and the †National Center for Computational Toxicology, US Environmental Protection Agency, Research
Triangle Park, North Carolina.
ABSTRACT: Development of the fetal gubernaculum is a prerequisite
for testicular descent and dependent on insulin-like 3 and
androgen, but knowledge of downstream effectors is limited. We
analyzed transcript profiles in gubernaculum and testis to address
changes occurring during normal and abnormal testicular descent in
Long Evans wild-type (wt) and cryptorchid (orl) fetuses. Total RNA
from male wt and orl gubernacula (gestational days [GD] 18-20), wt
female gubernacula (GD18), and testis (GD17 and 19) was
hybridized to Affymetrix GeneChips. Statistical analysis of temporal,
gender, and strain-specific differences in gene expression was
performed with the use of linear models analysis with empirical
Bayes statistics and analysis of variance (gubernaculum) and linear
analysis (testis). Overrepresented common gene ontology functional
categories and pathways were identified in groups of differentially
expressed genes with the Database for Annotation, Visualization,
and Integrated Discovery. Transcript profiles were dynamic in wt
males between GD18-19 and GD20, comparatively static in orl
GD18-20 gubernaculum, and similar in wt and orl testis. Functional
analysis of differentially expressed genes in wt and orl gubernaculum
identified categories related to metabolism, cellular biogenesis, small
GTPase-mediated signal transduction, cytoskeleton, muscle
development, and insulin signaling. Genes involved in androgen receptor
signaling, regulated by androgens, or both were overrepresented in
differentially expressed gubernaculum and testis gene groups.
Quantitative reverse transcription polymerase chain reaction
(RT-PCR) confirmed differential expression of genes related to muscle
development, including Myog, Tnnt2, Fst, Igf1, Igfbp5, Id2, and
Msx1. These data suggest that the orl mutation results in a primary
gubernacular defect that affects muscle development and cytoskeletal
function and might alter androgen-regulated pathways.
  Key words: Gubernaculum, undescended testis, gene expression
profiling, fetus.
  J Androl 2008;29:352-366
  Cryptorchidism, or undescended testis, is one of the
most common congenital anomalies in humans,
occurring in 2%-3% of all boys (Barthold and
Gonzalez, 2003). The cause of nonsyndromic cryptorchidism,
in most cases, is unknown; however, the
prevalence of sporadic and familial nonsyndromic
cryptorchidism supports multifactorial susceptibility
on the basis of contributions from specific genetic loci
interacting with environmental factors, which could
include endocrine-disrupting chemicals having antiandrogenic,
estrogenic, or both effects (Mahood et al,
2006). Genetic contributions to cryptorchidism are not
well understood, but the anomaly might be inherited in
as many as 25% of cases, with autosomal dominant
inheritance being the most common pattern and the
mean heritability in first-degree male relatives calculated
to be .67 (Czeizel et al, 1981; Elert et al, 2003).

  Supported by NIH grant P20 RR-020173-01. The authors have
nothing to disclose.
  Correspondence to: Dr Julia Barthold, Division of Urology, A.I.
duPont Hospital for Children, 1600 Rockland Rd, Wilmington, DE
19803 (e-mail: jbarthol@nemours.org).
  Received for publication August 30, 2007; accepted for publication
December 17, 2007.
  DOI: 10.2164/jandrol.107.003970

  Completion of testicular descent in mammals is
dependent on the gubernaculum, which is an appendage
of the anterior abdominal wall comprising a core
of mesenchymal cells with associated extracellular
matrix and localized striated muscle (Radhakrishnan
et al, 1979; Costa et al, 2002). In the rat fetus, the
gubernaculum is visible at gestational day 14 (GD14)
in both sexes (Radhakrishnan et al, 1979). The female
gubernaculum contains both mesenchymal and poorly
organized muscle cells, and further growth fails to
occur after GD16. In males, the gubernaculum enlarges
after GD16, increases dramatically in size between
GD18 and 20, then becomes exteriorized by everting
into an extra-abdominal location around the time of
birth (GD22). The mesenchymal portion of the
gubernaculum disappears, leaving an outer layer of
muscle, which persists as a sac of cremaster muscle and
surrounds the scrotal testis. Eversion of the
gubernaculum-cremaster complex occurs rapidly, but the
mechanisms controlling its development and motility are
poorly understood.




-------
Barthold et al   •   Gene Expression in Fetal Gubernaculum and Testis of Cryptorchid Rats
  In vitro studies of gubernacular development and
phenotypic analysis of cryptorchid genetic mouse
models suggest that the testis is required for proper
development of the ipsilateral gubernaculum and
implicate secretion of the Leydig cell hormones
insulin-like factor 3 (Insl3) and, to a lesser degree,
testosterone (Emmen et al, 2000) in gubernacular
development. Targeted deletion of either Insl3 or Rxfp2
is associated with high intra-abdominal testes in
homozygous male mice and delayed testicular descent in
heterozygotes (Zimmermann et al, 1999; Overbeek et al,
2001). Development of the fetal gubernaculum is
feminized in homozygous Insl3/Rxfp2 mutants (Tomiyama et
al, 2003). By contrast, mice and rats with spontaneous
androgen receptor defects or that have been exposed to
the antiandrogen flutamide (Spencer et al, 1991) show a
milder phenotype. Although a model of testicular descent
separates INSL3- and androgen-dependent phases into
distinct events (Hutson and Hasthorpe, 2005), both
hormones stimulate proliferation of fetal gubernacular
cells. Moreover, generalized expression of both the
INSL3 receptor RXFP2 (relaxin/insulin-like family
peptide receptor 2, also known as LGR8 or GREAT) and
the androgen receptor is present in the fetal gubernaculum
(Emmen et al, 2000; Scott et al, 2005). Canonical
Insl3/Rxfp2 signaling involves the cAMP/protein kinase A
(PKA) pathway via activation of the cAMP response
element (CRE; Halls et al, 2005), but information
regarding downstream effectors is limited.
  The Long Evans orl rat strain is an inbred colony at
high risk for spontaneous cryptorchidism (Mouhadjer et
al, 1989). Approximately two-thirds of offspring are
affected, and up to 75% of cases occur unilaterally, with
the left side more frequently affected (unpublished
observations); overall, approximately 35%^tt% of
testes fail to descend (Barthold et al, 2006). The orl
gubernaculum is reduced in size between GD18 and 20,
but the testis descends normally during this time
(Barthold et al, 2006). By the first day of life, however,
normal eversion fails to occur in about half of orl
gubernacula, and subsequent aberrant lateral migration
occurs with final localization of the ipsilateral testis in
the superficial inguinal pouch, anterior to the rectus
muscle. This is a unique animal model of cryptorchidism
in that the phenotype is similar to that seen most
commonly in the human population.
  Because of the complexity of genetic pathways and
their interactions that are known in the various models of
cryptorchidism, a comprehensive screen of transcript
profiles is useful to address the changes associated with
the gubernaculum and testis in orl rats. In this study, we
use microarray analysis to study gene expression in the
developing fetal gubernaculum and testis. Our data
indicate that expression of genes involved in energy
pathways and in the functionally related categories of
muscle development, cytoskeleton organization and
biogenesis, and small GTPase-mediated signal transduction
is altered in normal and orl fetuses during prenatal
growth of the gubernaculum. By contrast, we found fewer
strain-specific differences in fetal testicular gene expression,
suggesting that genetic variants with gubernaculum-specific
effects predispose orl rats to cryptorchidism.

Table 1. Samples used for microarray analysis

Sample            Gestational Age, d   Side           Litter No.
Gubernaculum
  wt
    W18a-f (a)    18                   Right + left   1
    W18a-d        18                   Left           2
    W19a-d        19                   Left           3
    WL20a-d       20                   Left           4
    WR20a-c       20                   Right          4
  orl
    O18a-d        18                   Left           5
    O19a-d        19                   Left           6
    OL20a-d       20                   Left           7
    OR20a-c       20                   Right          7
Testis
  wt
    W17a-e        17                                  8
    W19a-e        19                                  3
  orl
    O17a-e        17                                  9
    O19a-e        19                                  6

Abbreviations: wt, Long Evans wild-type strain; orl, Long Evans
cryptorchid strain.
(a) Female (all others male).

Materials and Methods

Animals
Breeding colonies of orl and wt rats were maintained in a reverse
light cycle room following protocols approved by the Institutional
Animal Care and Use Committee. Female estrus cycles
were assessed with vaginal smears, and animals were mated in
the afternoon to generate timed pregnancies, which were
identified by visualization of sperm in vaginal smears the
following day, defined as GD1. Pregnant females were
euthanatized via CO2 inhalation during the late morning of
GD17-20. The caudal half of each fetus was immediately
collected in RNAlater (Applied Biosystems, Foster City,
California) and stored at 4°C for at least 24 hours to facilitate
microdissection. Fetal testes and gubernacula were separated by
microdissection, and samples were collected as noted in Table 1.
For GD18-19, left-sided samples were used because of a higher
incidence of left cryptorchidism observed in the orl strain
(unpublished observations) and the possibility of intrinsic
left-right asymmetry in males. At GD20, both left and right
gubernacula were removed from 3 fetuses. In females, left and
right gubernacula from the same fetus were pooled.




-------
RNA Extraction
Total RNA was purified from single gubernacula or testes with
the RNeasy Mini Kit (Qiagen, Valencia, California) and the
RNase-free DNase Set (Qiagen). RNA was quantified on the
basis of A260 with the use of an ND-1000 Ultraviolet-visible
spectrophotometer (NanoDrop Technologies, Wilmington,
Delaware). Overall integrity of the total RNA was verified with
a 2100 Bioanalyzer (Agilent Technologies, Santa Clara,
California) before processing for microarrays to assure
consistency across samples.
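
Quantification from A260 is usually done with a fixed conversion factor. The sketch below assumes the conventional value of 40 µg/mL of RNA per absorbance unit at a 1 cm path length, which is not stated in the text. The function names and the example reading are hypothetical and serve only to show how an input volume for a labeling reaction could be derived.

```python
# Conventional conversion (assumed here): 1 A260 unit of RNA ~ 40 ug/mL at 1 cm path.
RNA_UG_PER_ML_PER_A260 = 40.0


def rna_concentration_ng_per_ul(a260, dilution_factor=1.0):
    """Estimate RNA concentration; ug/mL is numerically equal to ng/uL."""
    return a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor


def volume_for_mass_ul(mass_ng, conc_ng_per_ul):
    """Volume of sample (uL) needed to deliver a target RNA mass (ng)."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return mass_ng / conc_ng_per_ul


# Hypothetical reading: A260 = 0.25 on an undiluted gubernaculum sample.
conc = rna_concentration_ng_per_ul(0.25)
volume = volume_for_mass_ul(30, conc)  # 30 ng input, as used for gubernacula
print(conc, volume)
```

With these illustrative numbers the sample is about 10 ng/µL, so roughly 3 µL would supply the 30 ng input used for the two-cycle labeling of gubernaculum RNA.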

Microarray Sample Processing
RNA samples (Table 1) were assessed with Affymetrix Rat
Expression Array 230A (Affymetrix, Santa Clara, California).
This microarray contains 15 866 probe sets representing
approximately 10 500 genes and 2700 ESTs. Some genes are
represented by more than 1 probe set. Whereas NCBI Entrez
Gene lists approximately 38 000 genes for Rattus norvegicus, the
230A GeneChip interrogates nearly one-third of known rat
genes. For testes, 1 µg of total RNA from single organs was
labeled with the One-Cycle cDNA Synthesis Kit (Affymetrix).
This involved cDNA synthesis followed by in vitro transcription
with T7-RNA polymerase and biotinylated nucleotide. Because
of the smaller yield of RNA from gubernacula, 30 ng of total
RNA from single organs was amplified and labeled with the
GeneChip Two-Cycle cDNA Synthesis Kit (Affymetrix). After
cDNA synthesis and in vitro transcription with T7-RNA
polymerase, the resulting cRNA was used as a template for a
second round of cDNA synthesis, which was followed by in
vitro transcription in the presence of biotinylated nucleotide.
Biotinylated cRNA was hybridized to 230A GeneChips. Arrays
were washed, stained with streptavidin phycoerythrin conjugate,
and scanned at DuPont Haskell Laboratory for Health and
Environmental Sciences in a Hybridization Oven 640
(Affymetrix), GeneChip Fluidics Station (Affymetrix), and GeneArray
Scanner (Affymetrix) with Affymetrix protocols and reagents.
Standard Affymetrix quality control measures were consistent
across all hybridizations of the same tissue type.

Analysis of Microarray Data
As a measure of the quality of hybridization, the raw and
normalized probe intensity distributions for each GeneChip
were determined with histogram  plots within the AffylmGUI
interface  for the limma package of Bioconductor  (Wettenhall
et al, 2006). Representations of 5' and 3' regions of transcripts
in the labeled cRNA were examined with the Affy package of
Bioconductor to  verify  consistency within  tissue  types.
Expression values were calculated with the MAS 5 algorithm
from raw probe intensities using GCOS (Affymetrix). Expres-
sion values were also calculated with the GC robust multiarray
average (GC-RMA) algorithm (Wu et  al, 2004) from raw
probe intensities within AffylmGUI. All further analyses were
performed  with  the   GC-RMA expression   values  unless
otherwise noted. Global gene expression  patterns and overall
variability  between  samples were  examined by  principal
component analysis (PCA), which  was performed  in  MeV
version 3.1 (Saeed et al,  2003).  Two methods were used to
identify differentially expressed  genes: 1)  the LIMMA linear
models approach with the empirical Bayes  statistic (B & 3) and
the multiple testing adjustment method of Holm, used within
AffylmGUI (referred to as linear analysis), and 2) calculation of
MAS5 expression ratios using GD18 wt as a reference
denominator followed by the scaling of log2-transformed values
to a median of 0.0 and standard deviation of 0.50 with statistical
analysis in GeneSpring version 7.2 using analysis of variance
with a Benjamini and Hochberg false discovery rate of .001
(referred to as reference denominator method). Differentially
expressed genes were filtered for a GC-RMA average expression
value of greater than or equal to 50 in at least 1 sample group in
the analysis and then separated into groups according to the
log2 ratio of average expression values for groups of samples.
Some groups of genes were further separated with K-means
clustering in MeV. Plots of expression profiles were created with
the statistical package R (http://www.r-project.org/). Gene
groups with mean expression levels as well as the entire data
set are available via Accession number GSE7755 at the
Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo).
Groups of differentially expressed probe  sets were examined for
statistical overrepresentation of Gene Ontology (GO) Biological
Function categories and biological pathways as defined in the
Kyoto  Encyclopedia  of Genes and Genomes (KEGG; http://
www.genome.jp/kegg/)  with the Database for Annotation,
Visualization, and Integrated Discovery (DAVID) 2007 (Dennis
et al, 2003; http://david.abcc.ncifcrf.gov/).
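
The two multiple-testing adjustments named above (Holm within the linear analysis, Benjamini and Hochberg within the reference denominator method) can be illustrated with a minimal sketch. This is not the AffylmGUI or GeneSpring implementation; the function names and the example P values are hypothetical and serve only to show how each adjustment rescales a list of raw P values.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted P values (family-wise error control)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Smallest P value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted


def benjamini_hochberg_adjust(p_values):
    """Benjamini-Hochberg step-up adjusted P values (false discovery rate)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m - 1, -1, -1):
        idx = order[rank]
        # The P value with ascending rank r (1-based) is scaled by m / r.
        candidate = min(1.0, p_values[idx] * m / (rank + 1))
        running_min = min(running_min, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values for four probe sets (not data from this study).
example_p = [0.01, 0.04, 0.03, 0.005]
holm = holm_adjust(example_p)
bh = benjamini_hochberg_adjust(example_p)
```

In this sketch the Holm values are never smaller than the Benjamini and Hochberg values, which reflects the stricter family-wise control of the former.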

Real-Time Reverse Transcription Polymerase
Chain Reaction
Real-time reverse transcription polymerase chain reaction
(RT-PCR) was used to validate trends in selected array-derived data
from gubernaculum. cDNA was synthesized from 150 ng of total
RNA (n > 6 samples per group) with the High-Capacity cDNA
Archive Kit (Applied Biosystems). Amplifications were
performed in triplicate using TaqMan Gene Expression Assays (see
Table 2 for details) and TaqMan Universal PCR Master Mix in
an ABI Prism 7900HT. Levels of target mRNA expression were
determined by the 2^-ΔΔCT method (Livak and Schmittgen, 2001)
with tripeptidyl peptidase 2 (Tpp2) as control and total rat
embryonic RNA (Agilent Technologies) as calibrator.
Nonparametric statistical analyses of differences between strains were
performed in SPSS (version 14.0; SPSS Inc) as indicated.
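
As a minimal sketch of the 2^-ΔΔCT calculation described above, the following assumes that triplicate CT values are averaged before the control gene (Tpp2 here) and the calibrator sample are subtracted. The function name and all CT values are hypothetical illustrations, not measurements from this study.

```python
from statistics import fmean


def relative_expression(ct_target, ct_control, ct_target_cal, ct_control_cal):
    """Return 2^-ddCT for one sample relative to a calibrator.

    Each argument is a list of replicate CT values (for example, triplicates).
    """
    # dCT normalizes the target gene to the control gene within a sample.
    delta_ct_sample = fmean(ct_target) - fmean(ct_control)
    delta_ct_calibrator = fmean(ct_target_cal) - fmean(ct_control_cal)
    # ddCT compares the sample with the calibrator.
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical triplicates: the target crosses threshold 2 cycles earlier
# (relative to control) in the sample than in the calibrator.
fold = relative_expression(
    ct_target=[24.0, 24.2, 23.8],
    ct_control=[20.0, 20.0, 20.0],
    ct_target_cal=[26.0, 26.0, 26.0],
    ct_control_cal=[20.0, 20.0, 20.0],
)
```

A ΔΔCT of -2 corresponds to an approximately 4-fold higher relative expression, under the usual assumption that amplification efficiency is close to 100% for both assays.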
Global Gene Expression
PCA analysis using all probe sets on the RAE 230A array
was performed to examine global trends in gene
expression for gubernaculum and testis samples
(Figure 1). Gubernaculum samples (Table 1) clustered into 4
main groups: 1) GD18 females, 2) GD18-19 wt males, 3)
GD20 wt males, and 4) 13 of the 15 orl samples. The 2
remaining orl samples, O18a and O19d, clustered with
GD20 and GD18-19 wt males, respectively. These
samples were considered outliers and were excluded from




-------
Barthold et al   •   Gene Expression in Fetal Gubernaculum and Testis of Cryptorchid Rats
Table 2. Real-time reverse transcription polymerase chain reaction assays

ABI Assay       Gene Symbol   Gene Name
Rn00432087_m1   Bmp4          Bone morphogenetic protein 4
Rn00673944_m1   Des           Desmin
Rn00518185_m1   Dusp6         Dual-specificity phosphatase 6
Rn01494289_m1   Fst           Follistatin
Rn01495280_m1   Id2           Inhibitor of DNA binding 2
Rn00710306_m1   Igf1          Insulin-like growth factor 1
Rn00563116_m1   Igfbp5        Insulin-like growth factor-binding protein 5
Rn00667535_m1   Msx1          Homeobox, msh-like 1
Rn00567418_m1   Myog          Myogenin
Rn01399583_m1   Nfkb1         Nuclear factor of kappa light chain gene enhancer in B-cells 1, p105
Rn01438455_m1   Olfm1         Olfactomedin 1
Rn01483694_m1   Tnnt2         Troponin T2, cardiac
Rn01437410_m1   Tpp2          Tripeptidyl peptidase II
Rn00584577_m1   Wnt4          Wingless-related MMTV integration site 4

Abbreviation: MMTV, mouse mammary tumor virus.

further analysis. Linear analysis of left vs right GD20
samples failed to reveal significant differences; however,
further analyses were limited to left-sided samples. Testis
samples clustered into GD17 and 19 groups with
intermixing of the 2 strains at each time point. After
initial global data analysis, our overall strategy was to 1)
identify expression profiles of developmentally regulated
genes in a defined window of gestation in wt males, 2)
identify genes differentially expressed between strains,
and 3) analyze these groups of genes functionally.

Expression Profiles in Normal and Cryptorchid Strains
Gubernaculum—We studied changes in gene expression
across normal development of the  gubernaculum by
comparison of the  3 male wt groups (12 samples). The
gene expression profile in GD18-19 samples was
remarkably similar, with only 11 differentially expressed
genes identified. By contrast, comparison of GD18 and
20 wt samples returned 1023 probe sets that were
differentially regulated, suggesting a major switch in
gene expression between GD18-19 and GD20. With the
use of K-means clustering, we further classified these
genes into 2 lists with declining expression (n = 371) or
increasing expression (n = 652) after GD18 (Figure 2).
The expression profiles show marked sexual dimorphism
of these genes at GD18, suggesting that they
participate in male-specific gubernacular development.
Surprisingly, by GD20, the expression profile in normal
males approximates that of GD18 females, as we
observed in PCA analysis (Figure 1).
Figure 1. Principal component analysis (PCA) of all gubernacular (A) and testicular (B) samples for all probe sets on the Affymetrix Rat 230A
GeneChip. Black and gray dots represent wild-type (wt) and Cryptorchid (orl) samples, respectively. Labels refer to samples within adjacent
dotted circles except as noted. (A) For gubernaculum, all gestational day (GD)18 and 19 wt (W) samples and all but 2 orl (O) samples were
noted to cluster together. The 2 outlier orl samples cluster between GD20 wt and GD18 female (F) samples (gray dots), which are closely
aligned. (B) For testis, GD17 and 19 samples cluster together with no clear separation between strains.

-------
Figure 2. Expression profiles of genes differentially expressed
between  GD18 and 20 in wt gubernaculum (n = 1023). Genes are
classified on the basis of K-means  clustering: (A) Increasing
expression between GD18 and 20 (n = 652) and (B) decreasing
expression between GD18 and 20 (n = 371). Error bars show mean
Z-scores ±  SD for probe sets in  each subgroup.  Vertical  lines
separate the samples by gender (F indicates wt female; W, wt male)
and gestational age in days (18-20).
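
The Z-scores plotted in the expression profiles can be illustrated with a short sketch. It assumes that each probe set is standardized across samples by subtracting its mean and dividing by its sample standard deviation; whether the original plots used the sample or the population standard deviation is not stated, and the function name and values below are hypothetical.

```python
from statistics import mean, stdev


def z_scores(expression_values):
    """Standardize one probe set's expression values across samples."""
    mu = mean(expression_values)
    sd = stdev(expression_values)  # sample standard deviation (assumption)
    return [(value - mu) / sd for value in expression_values]


# Hypothetical expression values for one probe set in three samples.
profile = z_scores([1.0, 2.0, 3.0])
```

After standardization, probe sets with very different absolute expression levels can be averaged within a cluster, which is what allows a single mean ± SD profile per subgroup.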
  Because of minimal differences between GD18 and 19,
we analyzed strain-specific differences at GD18 and
20 only. Linear analysis identified 2401 probe sets that
were differentially expressed between wt and orl males at
these time points. The reference denominator method,
which takes into account all time-, gender-, and
strain-specific differences in expression relative to GD18,
returned 3707 probe sets. After filtering for low
expression, these lists were combined to generate a final
list of 3589 probe sets associated with wt versus orl
differences.
  Log2 ratio of expression values was used to divide the
final list into groups of genes with either higher
(orl-high, n = 1681) or lower (orl-low, n = 1908) expression
in orl relative to wt samples. With the use of K-means
clustering, we identified 3 major expression profiles in
each of these groups (Figure 3). The observed pattern
suggests little change in expression of differentially
regulated genes in orl fetuses between GD18 and 20.
Interestingly, a tendency toward reciprocal patterns of
female, wt, and orl expression is seen in the
corresponding clusters from each group. Two clusters (cluster 1 of
each group) show the greatest differences between
normal males and females at GD18. It is noteworthy
that the corresponding orl expression profiles in these
clusters are feminized.
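
The division into orl-high and orl-low groups by log2 ratio can be sketched as follows. This is an illustration only: the function name and the mean expression values are hypothetical, and the handling of a ratio of exactly 0 is an assumption.

```python
import math


def classify_by_log2_ratio(mean_orl, mean_wt):
    """Return the group label and log2(orl/wt) for one probe set."""
    log2_ratio = math.log2(mean_orl / mean_wt)
    # Positive ratios mean higher expression in orl than in wt.
    label = "orl-high" if log2_ratio > 0 else "orl-low"
    return label, log2_ratio


# Hypothetical mean expression values for two probe sets.
high = classify_by_log2_ratio(mean_orl=200.0, mean_wt=50.0)
low = classify_by_log2_ratio(mean_orl=50.0, mean_wt=200.0)

# The two group sizes reported in the text account for the full final list.
assert 1681 + 1908 == 3589
```

A log2 ratio of 2 corresponds to 4-fold higher expression in orl, and a ratio of -2 to 4-fold lower expression.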
  Testis—Comparable linear analysis of GD17 and 19
wt samples identified fewer (n = 818) differentially
regulated genes at these 2 time points in testis compared
with gubernaculum. More significantly, few genes were
differentially expressed in the testicular samples of wt
compared with orl fetuses. Linear analysis of strain
differences at GD17 or 19 yielded only 349 differentially
expressed probe sets when combining the 2 lists. Of
these, expression was higher in wt males in 248 sets and
lower in 101.

                                          Functional Annotation  of Differentially Expressed Genes
                                          We performed functional analysis of groups of genes
                                          using DAVID and  analyzed common  GO  Biological
                                          Process annotations.  Two groups of probe  sets  were
                                          analyzed  separately: genes differentially  expressed  be-
Figure 3. Expression profiles of differentially expressed gene groups in orl and wt gubernaculum are shown: (A) Genes with higher expression
in orl (orl-high) and (B) genes with lower expression in orl (orl-low). Profiles were generated by K-means clustering of all samples in the 2 gene
groups, and the 3 major resultant clusters for each gene group are shown. Error bars show mean Z-scores ± SD for probe sets in orl-high
clusters 1 (n = 577), 2 (n = 502), and 3 (n = 448) and for orl-low clusters 1 (n = 604), 2 (n = 851), and 3 (n = 451). Vertical lines separate the
samples by gender and strain (F indicates wt female; O, orl male; W, wt male) and gestational age in days (18-20).

-------
Table 3. Selected functional gene ontology (GO) biological process annotations represented by differentially expressed genes^a

                                                Gubernaculum                Testis
GO Biological Process                           No. of Genes  P Value       No. of Genes  P Value
Metabolism                                      1205          3.8 × 10^-54
Cell organization and biogenesis                392           1.2 × 10^-27
Biosynthesis                                    298           4.7 × 10^-22
Cytoskeleton organization and biogenesis        104           3.4 × 10^-10
Cell cycle                                      141           2.3 × 10^-9
Localization                                    502           4.4 × 10^-8
Transport                                       433           1.8 × 10^-7
Muscle development                              44            2.0 × 10^-6   10            4.9 × 10^-4
Generation of precursor metabolites and energy  126           5.7 × 10^-6
Actin filament-based process                    48            1.6 × 10^-5
Small GTPase-mediated signal transduction       63            2.1 × 10^-5
Cell division                                   34            2.6 × 10^-5
Muscle contraction                              41            6.7 × 10^-5
Apoptosis                                       121           1.3 × 10^-4
Phosphorylation                                 122           1.6 × 10^-4
Transcription from RNA polymerase II promoter   127           2.8 × 10^-4
Growth                                          53            4.0 × 10^-4   11            2.7 × 10^-3
Cellular morphogenesis                          91            5.9 × 10^-4   14            2.1 × 10^-2
Development                                     379           6.3 × 10^-3   60            3.3 × 10^-5

a Modified Fisher exact P values are shown for selected overrepresented GO biological process terms identified by analysis of differentially
 expressed genes in fetal gubernaculum (n = 3428 DAVID IDs) and testis (n = 347 DAVID IDs) with the use of Database for Annotation,
 Visualization, and Integrated Discovery (DAVID; http://niaid.abcc.ncifcrf.gov/home.jsp).
tween wt and orl in gubernaculum (n = 3589; converted
to 3428 unique DAVID IDs) and the group differentially
expressed between wt and orl in testis (n = 349,
converted to 347 DAVID IDs). The analyses from each
group were compared for recurring functional themes.
  Selected nonredundant categories are shown in
Table 3, and expression data for selected genes in these
groups are shown in Table 4. Multiple categories related
to general physiologic processes such as metabolism and
biosynthesis were identified. When analyzing all genes
differentially expressed between wt and orl, small GTPase
signal transduction was the most significantly represented
signaling pathway in GO. We also identified categories
related to small GTPase signaling, including muscle
development and cytoskeletal organization and biogenesis.
These data are consistent with the known morphological
changes occurring in the gubernaculum during
this time frame, including significant growth and
maturation of muscle (Radhakrishnan et al, 1979; Cain
et al, 1995). Comparatively few GO annotations were
Table 4. Selected genes differentially expressed in wt and orl fetal gubernaculum^a

Probe Set ID    Gene Symbol           Log2(orl/wt)  Gene Title^b
Muscle development
1368725_at      Jag1                  3.96          Jagged 1
1368302_at      Msx1                  2.44          Homeobox, msh-like 1
1387232_at      Bmp4                  1.9           Bone morphogenetic protein 4
1374904_at      Six1                  1.88          Sine oculis homeobox homolog 1 (Drosophila)
1367652_at      Igfbp3                1.94          Insulin-like growth factor-binding protein 3
1375518_at      Ttn                   1.36          Titin
1388185_at      Rb1                   -0.81         Retinoblastoma 1
1386993_at      Myh7                  -1.01         Myosin, heavy polypeptide 7, cardiac muscle, beta
1387181_at      Myf6                  -1.08         Myogenic factor 6
1388335_at      Tagln2                -1.14         Transgelin 2
1398248_s_at    Myh6                  -1.2          Myosin, heavy polypeptide 6, cardiac muscle, alpha
1368310_at      Myog                  -1.4          Myogenin
1367600_at      Des                   -1.63         Desmin
1369928_at      Acta1                 -1.7          Actin, alpha 1, skeletal muscle^c




-------
Table 4. Continued

Probe Set ID    Gene Symbol           Log2(orl/wt)  Gene Title^b
1372569_at      Fhl3_predicted        -1.99         Four and a half LIM domains 3 (predicted)
1369375_a_at    Capn3                 -1.29         Calpain 3
1367570_at      Tagln                 -2.4          Transgelin
1388298_at      Myl9_predicted        -2.52         Myosin, light polypeptide 9, regulatory (predicted)
1387348_at      Igfbp5                -3.21         Insulin-like growth factor-binding protein 5
1367628_at      Lgals1                -4.62         Lectin, galactose-binding, soluble 1
Muscle contraction
1367617_at      Aldoa                 -1.15         Aldolase A
1367592_at      Tnnt2                 -1.79         Troponin T2, cardiac
1370857_at      Acta2                 -2.32         Smooth muscle alpha-actin
1367572_at      Myl3                  -2.33         Myosin, light polypeptide 3
1370198_at      Trdn                  -2.35         Triadin
1368838_at      Tpm4                  -2.54         Tropomyosin 4
1368724_a_at    Tpm1                  -2.57         Tropomyosin 1, alpha
1371239_s_at    Tpm3                  -2.85         Tropomyosin 3, gamma
1387787_at      Myl2                  -3.25         Myosin, light polypeptide 2
Cytoskeleton organization and biogenesis
1367654_at      Fath                  4.31          Fat tumor suppressor homolog (Drosophila)
1387080_at      Cspg6                 2.2           Chondroitin sulfate proteoglycan 6
1372692_at      Tnk2                  2.09          Tyrosine kinase, non-receptor, 2
1370875_at      Vil2                  1.98          Villin 2
1387227_at      Waspip                1.51          Wiskott-Aldrich syndrome protein-interacting protein
1383822_at      Bicd2                 1.32          Bicaudal D homolog 2 (Drosophila)
1368893_at      Cap2                  -1.62         CAP, adenylate cyclase-associated protein, 2 (yeast)
1371885_at      Ckap1_predicted       -1.98         Cytoskeleton-associated protein 1 (predicted)
1375881_at      Dstn                  -2.04         Destrin
1367605_at      Pfn1                  -2.1          Profilin 1
1399105_at      Bin3                  -2.22         Bridging integrator 3
1374523_at      Pgls_predicted        -2.26         6-phosphogluconolactonase (predicted)
1388460_at      Capg                  -2.26         Capping protein (actin filament), gelsolin-like
1370184_at      Cfl1                  -2.31         Cofilin 1, non-muscle
1398900_at      Dctn3_predicted       -2.75         Dynactin 3 (predicted)
Small GTPase-mediated signal transduction
1373215_at      Abr_predicted         2.78          Active BCR-related gene (predicted)
1389292_at      Rab18                 2.38          RAB18, member RAS oncogene family
1367475_at      Cdc42                 2.12          Cell division cycle 42
1374239_at      Farp2_predicted       1.71          FERM, RhoGEF and pleckstrin domain protein 2 (predicted)
1388892_at      Rab2b                 1.5           RAB2B, member RAS oncogene family
1388730_at      Cdc42ep4_predicted    1.28          CDC42 effector protein (Rho GTPase-binding) 4 (predicted)
1368217_at      Ralbp1                1.27          RalA-binding protein 1
1389710_at      Sos1                  0.96          Son of sevenless homolog 1 (Drosophila)^c
1371255_at      Hras                  -1.04         Harvey rat sarcoma viral (v-Ha-ras) oncogene homolog
1368096_at      Rab7l1                -1.58         RAB7, member RAS oncogene family-like 1
1372521_at      Rnd2                  -1.84         Rho family GTPase 2
1372513_at      Rac1                  -1.88         Ras-related C3 botulinum toxin substrate 1
1398838_at      Rab7                  -2            RAB7, member RAS oncogene family
1377061_at      RICS_predicted        -2.07         RhoGAP involved in beta-catenin-N-cadherin and NMDA receptor signaling (predicted)
1373881_at      Arhgdib               -2.15         Rho, GDP dissociation inhibitor (GDI) beta
1370168_at      Ywhaq                 -2.16         Tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, theta polypeptide
1368041_at      Synj2bp               -2.23         Synaptojanin 2 binding protein
1370130_at      RhoA                  -3.05         Ras homolog gene family, member A
1388729_at      Rras_predicted        -3.08         Harvey rat sarcoma oncogene, subgroup R (predicted)
Development
1375532_at      Id2                   5.15          Inhibitor of DNA binding 2^c
1368641_at      Wnt4                  3.2           Wingless-related MMTV integration site 4
1377064_at      Dusp6                 2.75          Dual-specificity phosphatase 6




-------

Table 4. Continued

Probe Set ID    Gene Symbol           Log2(orl/wt)  Gene Title^b
1369008_a_at    Olfm1                 2.57          Olfactomedin 1
1388856_at      Kitl                  2.31          Kit ligand
1387843_at      Fst                   2.09          Follistatin
1370221_at      Wisp1                 2.04          WNT1 inducible signaling pathway protein 1
1390119_at      Sfrp2                 1.96          Secreted frizzled-related protein 2
1376755_at      Rarb                  1.93          Retinoic acid receptor, beta
1370747_at      Fgf9                  1.57          Fibroblast growth factor 9
1372447_at      Fgfr1                 1.55          Fibroblast growth factor receptor 1^d
1370968_at      Nfkb1                 1.53          Nuclear factor of kappa light chain gene enhancer in B-cells 1, p105
1373829_at      Fgfr2                 1.26          Fibroblast growth factor receptor 2^d
1368395_at      Gpc3                  1.11          Glypican 3^d
1370224_at      Stat3                 1.06          Signal transducer and activator of transcription 3
1375043_at      Fos                   0.9           FBJ murine osteosarcoma viral oncogene homolog
1367712_at      Timp1                 -1.07         Tissue inhibitor of metalloproteinase 1
1388154_at      E2f5                  -1.29         E2F transcription factor 5
1389403_at      Bmp7                  -1.81         Bone morphogenetic protein 7
1386940_at      Timp2                 -2.18         Tissue inhibitor of metalloproteinase 2
Insulin signaling pathway
1376779_at      Foxo1a                1             Forkhead box O1A
1386950_at      Ppp1cb                0.72          Protein phosphatase 1, catalytic subunit, beta isoform
1398799_at      Eif4e                 -0.77         Eukaryotic translation initiation factor 4E
1367573_at      Rps6                  -1.52         Ribosomal protein S6
1368116_a_at    Rps6kb1               -2.33         Ribosomal protein S6 kinase, polypeptide 1
1386888_at      Eif4ebp1              -2.61         Eukaryotic translation initiation factor 4E binding protein 1
Focal adhesion
1369955_at      Col5a1                2.34          Procollagen, type V, alpha 1
1370267_at      Gsk3b                 1.88          Glycogen synthase kinase 3 beta
1372905_at      Vcl_predicted         1.68          Vinculin (predicted)
1370333_a_at    Igf1                  1.38          Insulin-like growth factor 1
1367760_at      Map2k1                1.15          Mitogen-activated protein kinase 1
1370427_at      Pdgfa                 1.12          Platelet-derived growth factor, alpha
1370155_at      Col1a2                1.03          Procollagen, type I, alpha 2
1389723_at      Pik3r4_predicted      1.01          Phosphoinositide-3-kinase, regulatory subunit 4, p150 (predicted)
1383075_at      Ccnd1                 0.88          Cyclin D1
1388138_at      Thbs4                 0.73          Thrombospondin 4
1371664_at      Pxn                   0.62          Paxillin
1387777_at      Ilk                   -0.93         Integrin-linked kinase
1386863_at      Ppp1ca                -1.5          Protein phosphatase 1, catalytic subunit, alpha isoform
1368385_a_at    Grb2                  -1.68         Growth factor receptor-bound protein 2
1398836_s_at    Actb                  -2.76         Actin, beta
1387346_at      Itgb1                 -2.97         Integrin beta 1 (fibronectin receptor beta)
Other
1390638_at      Rgd1560587_predicted  2.46          Similar to Eph receptor A4 (predicted)^e
1372964_at      Arid5b_predicted      2.26          AT-rich interactive domain 5B (MRF1-like) (predicted)^c,e
1390355_at      Ryr1                  1.85          Ryanodine receptor^d
1368509_at      Bbs2                  -1.18         Bardet-Biedl syndrome 2 homolog (human)^d
1389670_at      Rgd1566402_predicted  -2.1          Similar to homeobox protein A10 (predicted)^e

Abbreviations: GAP, GTPase-activating protein; GDP, guanosine diphosphate; MMTV, mouse mammary tumor virus; NMDA,
N-methyl-D-aspartate.
a Greatest difference in Affymetrix 230A probe set mean expression levels between orl and wt samples at gestational day 18 or 20 expressed
  as log2(orl/wt).
b Bold gene titles indicate that real-time reverse transcription polymerase chain reaction also was performed.
c Probe set with E annotation.
d Gene associated with human cryptorchidism according to Online Mendelian Inheritance in Man (OMIM; http://www.ncbi.nlm.nih.gov/omim/).
e Gene associated with cryptorchidism in mice.




-------
Table 5. Selected Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways represented by differentially expressed genes^a

                                              Gubernaculum                Testis
KEGG Pathway                                  No. of Genes  P Value       No. of Genes  P Value
Ribosome                                      41            1.8 × 10^-17
Oxidative phosphorylation                     48            4.1 × 10^-9
Proteasome                                    19            1.9 × 10^-9
ATP synthesis                                 19            7.5 × 10^-6
Valine, leucine, and isoleucine degradation   17            4.2 × 10^-4   5             9.6 × 10^-3
Glycolysis/gluconeogenesis                    19            4.0 × 10^-3
Fatty acid metabolism                         16            4.2 × 10^-3
Insulin signaling pathway                     35            1.7 × 10^-2
Focal adhesion                                49            5.0 × 10^-2   12            6.2 × 10^-3

a Modified Fisher exact P values are shown for selected overrepresented KEGG pathway terms identified by analysis of differentially expressed
 genes in fetal gubernaculum (n = 3428 DAVID IDs) and testis (n = 347 DAVID IDs) with the use of DAVID
 (http://niaid.abcc.ncifcrf.gov/home.jsp).

identified in the differentially expressed testis gene group.
However, these include muscle development, which might
indicate selective effects of the mutation in myoid cells.
  With the use of DAVID, we identified overrepresented
KEGG pathways for the 2 groups of genes differentially
expressed in wt and orl gubernaculum and testis. Selected
pathways are shown in Table 5, with the most genes
associated with focal adhesion in both gubernaculum and
testis. We separately analyzed overrepresentation of 755
known androgen-regulated and androgen-signaling pathway
genes (http://www.netpath.org/pathways?path_id=NetPath_2;
Bolton et al, 2007) not included in DAVID
with a Fisher's exact test patterned after EASE
(Expression Analysis Systematic Explorer) methodology
(Hosack et al, 2003). Of 622 present on the microarray, 199 (P
= .039) and 30 (P = .004) androgen-associated genes are
differentially expressed in gubernaculum and testis,
respectively. Together, the functional category and
pathway analyses suggest altered regulation of related
processes and pathways linked to energy and metabolism,
muscle and cytoskeleton organization, and altered
expression of androgen-regulated genes.
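
A Fisher's exact test patterned after EASE can be sketched as an upper-tail hypergeometric probability computed after one gene in the category is removed from the list, which makes the score more conservative than the plain test. The sketch below is one reading of that approach and is not necessarily the calculation used in this study; the function names and the small counts are hypothetical, chosen so the result can be checked by hand.

```python
from math import comb


def fisher_upper_tail(list_hits, list_total, pop_hits, pop_total):
    """P(X >= list_hits) for a hypergeometric draw of list_total genes
    from pop_total genes, of which pop_hits belong to the category."""
    denominator = comb(pop_total, list_total)
    max_hits = min(list_total, pop_hits)
    numerator = sum(
        comb(pop_hits, k) * comb(pop_total - pop_hits, list_total - k)
        for k in range(list_hits, max_hits + 1)
    )
    return numerator / denominator


def ease_score(list_hits, list_total, pop_hits, pop_total):
    """Conservative variant: drop one category gene from the list first
    (assumed reading of the EASE penalty)."""
    if list_hits == 0:
        return 1.0
    return fisher_upper_tail(list_hits - 1, list_total - 1, pop_hits, pop_total)


# Hypothetical toy counts: 3 of 4 listed genes fall in a category that
# contains 4 of the 10 genes on the array.
plain_p = fisher_upper_tail(3, 4, 4, 10)
ease_p = ease_score(3, 4, 4, 10)
```

With these toy counts the plain upper-tail probability is 25/210 and the penalized score is 40/120, so the penalized value is the larger and more conservative of the two.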

Expression of Genes Linked to Cryptorchidism or
Testicular Descent
We analyzed the expression patterns of candidate genes
annotated on the 230A chip. Hoxa10, Epha4, and
Arid5b, genes associated with cryptorchidism in mice
with spontaneous or targeted mutations
(http://www.informatics.jax.org/), are present in the list of
differentially expressed genes (Table 4). Of the previously
reported tyrosine kinases expressed in the fetal
GD16.5 mouse gubernaculum (Verma-Kurvari and
Parada, 2004), several failed to show significant
gubernacular expression during the interval studied
(Rous sarcoma virus [c-Src], spleen tyrosine kinase
[Syk], v-Erb erythroblastic leukemia viral oncogene
homolog 4 [Erbb4], and Eph receptor B4 [Ephb4]) using
the microarray methodology. Others were expressed
during this time frame at comparable levels in females
and both male strains (platelet-derived growth factor
receptor alpha [Pdgfra], insulin-like growth factor 1
receptor [Igf1r], c-src tyrosine kinase [Csk], kinase insert
domain protein receptor [Kdr, also known as Vegfr2 or
Flk1], and v-abl Abelson murine leukemia viral
oncogene homolog 1 [Abl1; data not shown]). Expression of
protein tyrosine kinase 2 (Ptk2, encoding FAK or focal
adhesion kinase) is sexually dimorphic at GD18 and
increases significantly in wt males between GD18 and 20
but is not differentially expressed between strains. None
of the functional annotation analyses of testicular gene
expression identified patterns suggestive of altered
hormone synthesis in orl fetal testis, and representative
Leydig cell-specific genes, including probe sets for
isoforms of Cyp11a1, Cyp17a1, Hsd3b, and Hsd17b,
were highly, and not differentially, expressed in
both strains (data not shown). Insl3 expression was
higher in orl fetal GD17 and 19 testis, but the differences
were not significant on the basis of our linear analysis.

                                                        Comparison of Differentially Expressed  Gubernaculum
                                                        and Testis Genes
Of the 349 testis genes differentially expressed between
the 2 strains, 117 were also differentially expressed in wt
compared with orl gubernaculum. Few of these genes
were down- or up-regulated in both testis and
gubernaculum of orl fetuses. These include insulin-like growth
factor-binding protein 5, interferon-induced
transmembrane proteins 1 and 3, osteoglycin, and
calcium/calmodulin-dependent protein kinase II delta (lower
expression in orl) and Epha4 (higher expression in orl).
Many genes encoding ECM proteins, including several




-------
Figure 4. Expression profiles of ribosomal genes in gubernaculum
and testis. Mean Z-scores ± SD for 62 probe sets in  (A) male
gubernaculum and (B) testis are shown. Vertical lines separate the
samples by strain (O indicates orl; W, wt) and gestational age in
days (18-20).
procollagens,  laminins, basigin,  matrix Gla protein,
chondroitin sulfate proteoglycan 2, and spondin 1, show
reduced expression in orl testis.
  We directly compared mean normalized expression of
62 ribosomal and mitochondrial ribosomal genes in
gubernaculum and testis (Figure  4).  In contrast to the
highly differential expression in gubernaculum, expres-
sion  levels of these  genes are  comparable  in  testis
samples from the 2 strains. Absolute expression  levels
between  testis  and  gubernaculum  were not  directly
comparable because of the differences in the protocols
used (ie,  single compared  with double amplification of
RNA  samples). However,  these data together  with
global expression data suggest that  altered expression
profiles are much more prominent in orl gubernaculum
than in testis. These data  suggest that altered  signaling
in the orl strain has  a more  profound effect on  gene
expression in fetal gubernaculum  than testis.
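
The per-sample summary plotted in Figure 4 can be illustrated with a minimal Python sketch. It assumes that each probe set is standardized across samples (using the sample SD) and that the resulting Z-scores are then averaged over probe sets within each sample; the probe names and expression values below are hypothetical, and the exact normalization used in the study is not stated in this section.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize one probe set's expression values across samples."""
    mu = mean(values)
    sd = stdev(values)  # sample SD; an assumption, not confirmed by the text
    return [(v - mu) / sd for v in values]


def summarize(expr):
    """Return (mean Z, SD of Z) per sample across all probe sets.

    expr maps a probe-set ID to a list of expression values,
    one value per sample, in the same sample order for every probe set.
    """
    z_by_probe = [z_scores(vals) for vals in expr.values()]
    per_sample = zip(*z_by_probe)  # regroup Z-scores by sample
    return [(mean(col), stdev(col)) for col in per_sample]


# Hypothetical data: 3 probe sets measured in 4 samples.
expr = {
    "probe_a": [8.0, 8.2, 9.0, 9.2],
    "probe_b": [6.0, 6.1, 7.0, 7.1],
    "probe_c": [10.0, 10.4, 11.0, 11.4],
}
summary = summarize(expr)
```

Because Z-scores are unitless, this kind of summary allows profiles to be compared within each tissue even though absolute expression levels in testis and gubernaculum are not directly comparable.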

Validation of Array-Derived Expression Profiles
To determine whether the expression profiles  obtained
from  the microarrays were consistent with the relative
amounts of mRNA present in parallel samples,  real-time
RT-PCR validation was carried out. Expression levels of
selected  genes  in  specific  pathways and  functional
groups (Table 4, bold) were analyzed. We identified 2
candidate control genes, tripeptidyl peptidase  2 (Tpp2)
and lumican, with  mean GC-RMA expression  levels
showing minimal variation across all groups; real-time
RT-PCR results showed differences in raw Ct values of
less than 0.5 for both genes with more consistency seen
in Tpp2 expression across samples (data not shown). We
analyzed target genes related to Tgfβ/Wnt/Hedgehog
(Bmp4, Id2, Msx1, Wnt4, Fst), MAPK (Dusp6, Nfkb1),
and insulin-like growth factor signaling (Igf1, Igfbp5);
neurogenesis (Olfm1); and myogenesis (Myog, Tnnt2,
Des) in gubernacula, testes (6-12 samples/group from
2-3 litters), or both relative to Tpp2. Compared with the
microarray data, we identified similar expression patterns
for most wt and orl samples at the 2 time points,
and differences in mRNA levels by RT-PCR were
statistically significant for many of these genes at GD20
but not GD18 (Figure 5). The most significant differences
between strains were noted for gubernacular genes
associated with neuromuscular development (Olfm1,
Msx1, Myog, Tnnt2, and Id2).
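
Relative quantification against a control gene such as Tpp2 is commonly computed with the 2^(-ΔΔCt) method (Livak and Schmittgen, 2001, listed in the references). The Python sketch below assumes that method and an amplification efficiency near 100%; the function names and Ct values are hypothetical and are not data from this study.

```python
def delta_ct(ct_target, ct_reference):
    """Normalize a target gene's Ct to the reference (control) gene's Ct."""
    return ct_target - ct_reference


def fold_change(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression of test vs control group by 2^(-ddCt)."""
    ddct = (delta_ct(ct_target_test, ct_ref_test)
            - delta_ct(ct_target_ctrl, ct_ref_ctrl))
    return 2.0 ** (-ddct)


# Hypothetical Ct values: target gene and reference gene (e.g., Tpp2)
# in a test group (e.g., orl) and a control group (e.g., wt).
fc = fold_change(ct_target_test=26.0, ct_ref_test=22.0,
                 ct_target_ctrl=24.0, ct_ref_ctrl=22.0)
# ddCt = (26 - 22) - (24 - 22) = 2, so fc = 2^-2 = 0.25
```

A fold change below 1 indicates lower expression in the test group than in the control group after normalization to the reference gene.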


Discussion

We characterized transcript profiles of fetal gubernaculum
and testis in an animal model of inherited
cryptorchidism and a wild-type strain to identify genetic
pathways that are activated during rapid growth of fetal
gubernaculum. Global analysis of samples suggests
delayed, feminized, or both patterns in the mutant orl
gubernaculum compared with the wt strain. We observe
2 major trajectories in the normal rat gubernaculum
between GD18 and 20 and sexually dimorphic expression
of these genes at GD18, with little change in gene
expression between GD18 and 19 in males. Functional
analysis of the normal pattern of gene expression is
consistent with growth, cellular proliferation, and
muscle development that are known to occur in the rat
gubernaculum during this time frame. The dynamic
changes resolve at GD20 to a level of expression similar
to that of the GD18 female, suggesting completion of
male-specific development. Similarly, by GD20, expression
of Rxfp2 (Insl3 receptor) mRNA is markedly
diminished (Barthold et al, 2006) and antiandrogen
exposure does not prevent testicular descent (Spencer et
al, 1991), suggesting that the critical phase of Insl3 and
androgen stimulation of the gubernaculum occurs prior
to this time.
  By contrast, GD18-20 gene expression in orl fetuses is
substantially less varied, with significant overlap in the
function of genes that are more highly expressed in orl
males and in females, suggesting that loss of
male-specific signaling is already present at GD18. The
observation that strain-specific expression profiling of
fetal testis shows relatively few differences also supports
a model of cryptorchidism  in the orl strain in which

-------
                    Journal of Andrology   •   May/June 2008
Figure 5. Panels: Wnt4, Fst, Bmp4, Olfm1, Msx1, Igf1, Nfkb1, Dusp6,
Igfbp5, Des, and Myog. Array and RT-PCR expression in wt (W) and
orl (O) samples at GD18 and GD20.

-------
delayed or incomplete development of the gubernaculum
is a major contributing factor. Our functional
analysis suggests that many general pathways related to
metabolism, energy, and growth are altered in the orl
gubernaculum, consistent with the decreased size of the
fetal orl gubernaculum (Barthold et al, 2006). We also
identified several specific, related pathways and biological
processes represented by differentially expressed
gubernacular genes, including small GTPase signal
transduction, focal adhesion, actin filament-based process,
cytoskeleton organization and biogenesis, and
muscle development. Although no database of
androgen-regulated genes in the fetal gubernaculum exists, we
observed that genes regulated by androgens, involved in
androgen receptor signaling in other cell types, or both
were overrepresented in our lists of differentially
expressed genes from both testis and gubernaculum.
These data suggest that androgen receptor signaling
may be altered in the orl fetus, although additional
studies are needed.
  The respective roles of Insl3 and androgen in
cell-specific development of the gubernaculum remain
undefined. In vitro, Rxfp2 activation increases cAMP
production via the stimulatory G-protein Gαs, activates
the CRE reporter, and, in cooperation with testosterone,
stimulates proliferation of fetal gubernacular cells
(Emmen et al, 2000; Halls et al, 2007). Rxfp2 is one of many
G-protein-coupled receptors that activate cAMP/PKA,
a response that regulates multiple developmental
processes, including neurogenesis and myogenesis (Lonze
and Ginty, 2002; Chen et al, 2005). However, little is
known of the downstream effects of cAMP/PKA
signaling in fetal gubernaculum beyond cellular
proliferation. In vivo, hyperplasia and extracellular matrix
production in the fetal gubernaculum are followed by
maturation of muscle precursors that become
peripherally oriented to form the striated cremaster muscle
(Radhakrishnan et al, 1979; Wensing, 1986). Cultured
GD17 rat gubernacula enlarge in response to synthetic
androgen R1881 without Insl3 but contain poorly
organized myosin-positive cells within the mesenchymal
core. However, when cultured with testis, they contain a
defined outer layer of muscle (Emmen et al, 2000).
Marked atrophy of the fetal gubernaculum with loss of
the inner mesenchymal core is characteristic of both
Insl3 and Rxfp2 null mice (Kubota et al, 2001). These
data suggest that Insl3 may play a role in the regulation
of both matrix remodeling and muscle development
within the gubernaculum.
  Other rodent data support a role for Insl3/Rxfp2 and
additional candidate genes in regulation of myogenesis
during development of the gubernaculum. Rxfp2 mRNA
is present throughout the gubernaculum at GD16 in rat
(Scott et al, 2005), but by GD19, binding sites for Insl3
are localized to the outer muscle layer (McKinnell et al,
2005), whereas the androgen receptor continues to be
expressed in both mesenchyme and muscle (Staub et al,
2005). A cell-specific developmental role for androgens in
the gubernaculum is not clear; however, after prenatal
exposure to the antiandrogen flutamide prior to GD17,
both mesenchymal and muscular compartments of the
GD20 rat gubernaculum are reduced in size (Cain et al,
1995) and embryonic muscle isoforms persist in adult
cremaster muscle (Tobe et al, 2002). Hoxa10 transcripts
are also expressed throughout the GD15.5 gubernaculum,
and histological studies of the postnatal cremaster
muscle in Hoxa10 (-/-) males show disordered
myogenesis (Satokata et al, 1995).
  Expression patterns of specific genes that are involved
in muscle development, contraction, or both (Table 4;
Figure 5) support our global functional analysis results
and suggest that terminal differentiation of muscle is
delayed or disrupted in orl gubernaculum. Expression
levels of Myog and Myf6, myogenic regulatory factors
that control later stages of muscle differentiation
(Sartorelli and Caretti, 2005), are reduced, whereas
several genes that are down-regulated during or inhibit
terminal differentiation of muscle (Melnikova et al,
1999; Ohkawa et al, 2006), including Igf1, Id2, Msx1,
and representatives of the fibroblast growth factor
family, show increased expression in the orl fetal
gubernaculum. We also identified altered expression of
several genes that promote skeletal muscle development,
including Igfbp5, Ilk, Bmp7, and Fst (Huang et al, 2000;
Amthor et al, 2002). Expression of Rps6kb1, Eif4e, and
Eif4ebp1 is reduced in orl gubernaculum. These genes
are effectors of insulin and mammalian target of
rapamycin (mTOR) signaling that regulate protein
synthesis and cell size (Ruvinsky and Meyuhas, 2006);
mTOR signaling is also critical for myoblast fusion
(Park and Chen, 2005). Reduced expression of these
genes is consistent with the global reduction in protein
synthesis, as well as the reduced expression of
muscle-specific genes that we observed in orl gubernaculum,
and with previous microarray data showing increased
expression of energy and metabolism genes and
decreased expression of genes involved in DNA replication
and transcription during skeletal myotube maturation
(Park and Chen, 2005).
  In addition to muscle-specific genes, we identified
altered expression of genes related to small GTPase
signal transduction, cytoskeleton organization and
biogenesis, and focal adhesion. The Rho GTPases
encode proteins that are responsive to
G-protein-coupled receptor and receptor tyrosine kinase signaling
and are critical for cytoskeletal reorganization, cell
motility, axon guidance, and myogenesis (Kjoller and
Hall, 1999; Bishop and Hall, 2000; Bryan et al, 2005).




-------
Several, including RhoA, Rac1, Cdc42, and RhoC, are
differentially expressed between strains (Table 4). Focal
adhesions are sites of cell attachment to the extracellular
matrix comprising integrins, cytoskeletal proteins, and
signaling molecules (Sastry and Burridge, 2000).
Possible roles for focal adhesion signaling in the developing
gubernaculum include regulation of myoblast
maturation (Clemente et al, 2005), migration (Mitra et al,
2005), or both; formation of costameres (Z-bands
anchoring myofibrils to the sarcolemma) (Quach and
Rando, 2006); and axon pathfinding (Robles and
Gomez, 2006). Expression of the mRNA for several
genes that participate in focal adhesion signaling,
including Ptk2, Kdr (Flk1), Src (v-src), and Csk (Sastry
and Burridge, 2000; Mitra et al, 2005) is present in the
GD16.5 mouse gubernaculum (Verma-Kurvari and
Parada, 2004). Csk encodes a tyrosine kinase linked to
focal adhesion turnover and regulation of the actin
cytoskeleton (McGarrigle et al, 2006), and the
corresponding protein is localized to both mesenchymal and
muscle layers in GD16.5 mouse gubernaculum. Ilk, a
key component of integrin-mediated signaling that plays
a role in the switch from myogenic proliferation to
differentiation (Huang et al, 2000), is expressed at lower
levels in orl gubernaculum.
   Although the genetic basis for human cryptorchidism
remains largely unknown, review of gene defects
associated with cryptorchidism as compiled in Online
Mendelian Inheritance in Man (OMIM) supports our
present data. Several syndromes that include
cryptorchidism are linked to genes that participate in small
GTPase signaling (SOS1, KRAS, FGD1), actin
cytoskeleton regulation (FLNA, FLNB), muscle
development (ACTA1), or muscle contraction (RYR1).
Expression of some of these genes is altered in orl
gubernaculum (Table 4). In humans, the gubernaculum
is composed primarily of mesenchyme, but striated
muscle is present within its distal portion in addition to
the surrounding cremaster muscle (Tayakkanonta, 1963;
Barteczko and Jacob, 2000; Costa et al, 2002), whose
role, if any, in testicular descent is unclear. It is notable,
however, that cryptorchidism is present in multiple
forms of congenital myopathy (OMIM) and is also
present in males with Prune-Belly syndrome (Jennings,
2000) and at a higher frequency in males with abdominal
wall defects (Kaplan et al, 1986). Moreover, altered
structure and function of the cremaster muscle have been
reported in cryptorchid boys (Tanyel et al, 2000). These
observations taken in combination with our present
study suggest that muscle patterning might play a more
important role in development and function of the
gubernaculum than previously recognized.
   Limitations of this study include  our analysis of
tissue-specific compared with cell-specific  gene  expres-
sion and the requirement for amplification of gubernac-
ulum  but not  testis.  Although  the  amplification,
analysis, or both could be  biased toward a particular
cell type, we  have been  unable to  identify any  clear
differences in cellular composition of wt compared with
orl gubernaculum  using  cell-specific  immunostaining
(data  not shown).  Because  of differences in  RNA
processing, we  avoided  direct comparisons of  gene
expression in  testis  and  gubernaculum.  Also,  the
possibility exists that  early transcriptosomal changes
associated with male-specific gubernacular development
were missed because gene expression is already sexually
dimorphic in wt at GD18. Therefore, although we can
identify global expression profiles that reflect develop-
ment of wt and orl gubernacula, we cannot determine
whether differences in  gene  expression are the cause or
result  of abnormal development. Moreover,  because
testicular descent does not occur until the postnatal
period in the rat, we are unable to determine at the fetal
stage  which   gubernacula  (less than   half) will  be
associated with  cryptorchid  testes.  Despite this, the
expression profiles of individual orl samples are highly
similar and markedly different from wt in gubernaculum
but not testis and therefore  likely phenotype-specific as
opposed  to  strain-specific.  The  basis  for reduced
penetrance of the phenotype remains unknown at this
time but might be related to a dosage effect determined
by environmental factors, modifying loci, or both. To
date,  we have no evidence that maternal- or paternal-
specific factors  determine phenotype because the fre-
quency of cryptorchidism in offspring does not  appear
to be related to paternal phenotype or identity of the
dam (unpublished observations).
  Analysis of gene expression in fetal tissues of wild-
type and cryptorchid  orl mutant rats suggests that a
primary gubernacular  defect that directly or indirectly
affects muscle function might  predispose to cryptorchi-
dism  in the  affected  strain.  Further studies  will  be
necessary to elucidate  the mechanism of gubernacular
dysfunction in the orl rat.
References
Amthor H, Christ B, Rashid-Doubell F, Kemp CH, Lang E, Patel K.
   Follistatin regulates bone morphogenetic protein-7 (BMP-7)
   activity to stimulate embryonic muscle growth. Dev Biol.
   2002;243:115-127.
Barteczko KJ, Jacob MI. The testicular descent in human. Origin,
   development and fate of the gubernaculum Hunteri, processus
   vaginalis peritonei, and gonadal ligaments. Adv Anat Embryol Cell
   Biol. 2000;156:III-X, 1-98.
Barthold JS, Gonzalez R. The epidemiology of congenital
   cryptorchidism, testicular ascent and orchiopexy. J Urol.
   2003;170:2396-2401.




-------
Barthold JS, Si X, Stabley D, Sol-Church K, Campion L, McCahan SM.
   Failure of shortening and inversion of the perinatal gubernaculum in
   the cryptorchid Long-Evans orl rat. J Urol. 2006;176:1612-1617.
Bishop AL, Hall A. Rho GTPases and their effector proteins.
   Biochem J. 2000;348(2):241-255.
Bolton EC, So AY, Chaivorapol C, Haqq CM, Li H, Yamamoto KR.
   Cell- and gene-specific regulation of primary target genes by the
   androgen receptor. Genes Dev. 2007;21:2005-2017.
Bryan BA, Li D, Wu X, Liu M. The Rho family of small GTPases:
   crucial regulators of skeletal myogenesis. Cell Mol Life Sci.
   2005;62:1547-1555.
Cain MP, Kramer SA, Tindall DJ, Husmann DA. Flutamide-induced
   cryptorchidism in the rat is associated with altered gubernacular
   morphology. Urology. 1995;46:553-558.
Chen AE, Ginty DD, Fan CM. Protein kinase A signalling via CREB
   controls myogenesis induced by Wnt proteins. Nature.
   2005;433:317-322.
Clemente CF, Corat MA, Saad ST, Franchini KG. Differentiation of
   C2C12 myoblasts is critically regulated by FAK signaling.
   Am J Physiol. 2005;289:862R-870R.
Costa WS, Sampaio FJ, Favorito LA, Cardoso LE. Testicular
   migration: remodeling of connective tissue and muscle cells in
   human gubernaculum testis. J Urol. 2002;167:2171-2176.
Czeizel A, Erodi E, Toth J. Genetics of undescended testis. J Urol.
   1981;126:528-529.
Dennis G Jr, Sherman BT, Hosack DA, Yang J, Gao W, Lane HC,
   Lempicki RA. DAVID: Database for Annotation, Visualization,
   and Integrated Discovery. Genome Biol. 2003;4:P3.
Elert A, Jahn K, Heidenreich A, Hofmann R. Der familiare
   Leistenhoden [The familial undescended testis]. Klin Padiatr.
   2003;215:40-45.
Emmen JM, McLuskey A, Adham IM, Engel W, Grootegoed JA,
   Brinkmann AO. Hormonal control of gubernaculum development
   during testis descent: gubernaculum outgrowth in vitro requires
   both insulin-like factor and androgen. Endocrinology.
   2000;141:4720-4727.
Halls M, Bathgate R, Roche P, Summers R. Signaling pathways of the
   LGR7 and LGR8 receptors determined by reporter genes. Ann N Y
   Acad Sci. 2005;1041:292-295.
Halls ML, Bathgate RA, Summers RJ. Comparison of signaling
   pathways activated by the relaxin family peptide receptors, RXFP1
   and RXFP2, using reporter genes. J Pharmacol Exp Ther.
   2007;320:281-290.
Hosack DA, Dennis G Jr, Sherman BT, Lane HC, Lempicki RA.
   Identifying biological themes within lists of genes with EASE.
   Genome Biol. 2003;4:70R.
Huang Y, Li J, Zhang Y, Wu C. The roles of integrin-linked kinase in
   the regulation of myogenic differentiation. J Cell Biol.
   2000;150:861-872.
Hutson JM, Hasthorpe S. Abnormalities of testicular descent. Cell
   Tissue Res. 2005;322:155-158.
Jennings RW. Prune belly syndrome. Semin Pediatr Surg.
   2000;9:115-120.
Kaplan LM, Koyle MA, Kaplan GW, Farrer JH, Rajfer J. Association
   between abdominal wall defects and cryptorchidism. J Urol.
   1986;136:645-647.
Kjoller L, Hall A. Signaling to Rho GTPases. Exp Cell Res.
   1999;253:166-179.
Kubota Y, Nef S, Farmer PJ, Temelcos C, Parada LF, Hutson JM.
   Leydig insulin-like hormone, gubernacular development and
   testicular descent. J Urol. 2001;165:1673-1675.
Livak KJ, Schmittgen TD. Analysis of relative gene expression data
   using real-time quantitative PCR and the 2(-Delta Delta C(T))
   Method. Methods (San Diego). 2001;25:402-408.
Lonze BE, Ginty DD. Function and regulation of CREB family
   transcription factors in the nervous system. Neuron.
   2002;35:605-623.
Mahood IK, McKinnell C, Walker M, Hallmark N, Scott H, Fisher
   JS, Rivas A, Hartung S, Ivell R, Mason JI, Sharpe RM. Cellular
   origins of testicular dysgenesis in rats exposed in utero to
   di(n-butyl) phthalate. Int J Androl. 2006;29:148-154, (discussion)
   181-185.
McGarrigle D, Shan D, Yang S, Huang XY. Role of tyrosine kinase
   Csk in G protein-coupled receptor- and receptor tyrosine
   kinase-induced fibroblast cell migration. J Biol Chem.
   2006;281:10583-10588.
McKinnell C, Sharpe RM, Mahood K, Hallmark N, Scott H, Ivell R,
   Staub C, Jegou B, Haag F, Koch-Nolte F, Hartung S. Expression
   of insulin-like factor 3 protein in the rat testis during fetal and
   postnatal development and in relation to cryptorchidism induced
   by in utero exposure to di (n-butyl) phthalate. Endocrinology.
   2005;146:4536-4544.
Melnikova IN, Bounpheng M, Schatteman GC, Gilliam D, Christy
   BA. Differential biological activities of mammalian Id proteins in
   muscle cells. Exp Cell Res. 1999;247:94-104.
Mitra SK, Hanson DA, Schlaepfer DD. Focal adhesion kinase: in
   command and control of cell motility. Nat Rev Mol Cell Biol.
   2005;6:56-68.
Mouhadjer N, Pointis G, Malassine A, Bedin M. Testicular steroid
   sulfatase in a cryptorchid rat strain. J Steroid Biochem.
   1989;34:555-558.
Ohkawa Y, Marfella CG, Imbalzano AN. Skeletal muscle specification
   by myogenin and Mef2D via the SWI/SNF ATPase Brg1. EMBO
   (Eur Mol Biol Organ) J. 2006;25:490-501.
Overbeek PA, Gorlov IP, Sutherland RW, Houston JB, Harrison WR,
   Boettger-Tong HL, Bishop CE, Agoulnik AI. A transgenic
   insertion causing cryptorchidism in mice. Genesis. 2001;30:26-35.
Park IH, Chen J. Mammalian target of rapamycin (mTOR) signaling
   is required for a late-stage fusion process during skeletal myotube
   maturation. J Biol Chem. 2005;280:32009-32017.
Quach NL, Rando TA. Focal adhesion kinase is essential for
   costamerogenesis in cultured skeletal muscle cells. Dev Biol.
   2006;293:38-52.
Radhakrishnan J, Morikawa Y, Donahoe PK, Hendren WH.
   Observations on the gubernaculum during descent of the testis.
   Investig Urol. 1979;16:365-368.
Robles E, Gomez TM. Focal adhesion kinase signaling at sites of
   integrin-mediated adhesion controls axon pathfinding. Nat
   Neurosci. 2006;9:1274-1283.
Ruvinsky I, Meyuhas O. Ribosomal protein S6 phosphorylation: from
   protein synthesis to cell size. Trends Biochem Sci. 2006;31:342-348.
Saeed AI, Sharov V, White J, Li J, Liang W, Bhagabati N, Braisted J,
   Klapa M, Currier T, Thiagarajan M, Sturn A, Snuffin M,
   Rezantsev A, Popov D, Ryltsov A, Kostukovich E, Borisovsky I,
   Liu Z, Vinsavich A, Trush V, Quackenbush J. TM4: a free,
   open-source system for microarray data management and analysis.
   BioTechniques. 2003;34:374-378.
Sartorelli V, Caretti G. Mechanisms underlying the transcriptional
   regulation of skeletal myogenesis. Curr Opin Genet Dev.
   2005;15:528-535.
Sastry SK, Burridge K. Focal adhesions: a nexus for intracellular
   signaling and cytoskeletal dynamics. Exp Cell Res. 2000;261:25-36.
Satokata I, Benson G, Maas R. Sexually dimorphic sterility
   phenotypes in Hoxa10-deficient mice. Nature. 1995;374:460-463.
Scott DJ, Fu P, Shen PJ, Gundlach A, Layfield S, Riesewijk A,
   Tomiyama H, Hutson JM, Tregear GW, Bathgate RA.
   Characterization of the rat INSL3 receptor. Ann N Y Acad Sci.
   2005;1041:13-16.




-------
Spencer JR, Torrado T, Sanchez RS, Vaughan ED Jr,
   Imperato-McGinley J. Effects of flutamide and finasteride on rat
   testicular descent. Endocrinology. 1991;129:741-748.
Staub C, Rauch M, Ferriere F, Trepos M, Dorval-Coiffec I, Saunders
   PT, Cobellis G, Flouriot G, Saligaut C, Jegou B. Expression of
   estrogen receptor ESR1 and its 46-kDa variant in the
   gubernaculum testis. Biol Reprod. 2005;73:703-712.
Tanyel FC, Erdem S, Altunay H, Ergun L, Ozcan Z, Alabay B,
   Buyukpamukcu N, Tan E. Distribution and morphometry of fiber
   types in cremaster muscles of boys with inguinal hernia or
   undescended testis. Pathol Res Pract. 2000;196:613-617.
Tayakkanonta K. The gubernaculum testis and its nerve supply.
   Aust N Z J Surg. 1963;33:61-67.
Tobe T, Toyota N, Matsuno Y, Komiyama M, Adachi T, Ito H, Mori
   C. Embryonic myosin heavy chain and troponin T isoforms remain
   in the cremaster muscle of adult rat cryptorchidism induced with
   flutamide. Arch Histol Cytol. 2002;65:279-290.
Tomiyama H, Hutson JM, Truong A, Agoulnik AI. Transabdominal
   testicular descent is disrupted in mice with deletion of insulinlike
   factor 3 receptor. J Pediatr Surg. 2003;38:1793-1798.
Verma-Kurvari S, Parada LF. Identification of tyrosine kinases
   expressed in the male mouse gubernaculum during development.
   Dev Dyn. 2004;230:660-665.
Wensing CJ. Testicular descent in the rat and a comparison of this
   process in the rat with that in the pig. Anat Rec. 1986;214:154-160.
Wettenhall JM, Simpson KM, Satterley K, Smyth GK. affylmGUI: a
   graphical user interface for linear modeling of single channel
   microarray data. Bioinformatics (Oxf). 2006;22:897-899.
Wu Z, Irizarry RA, Gentleman R, Martinez-Murillo F, Spencer F. A
   model-based background adjustment for oligonucleotide
   expression arrays. J Am Stat Assoc. 2004;99:909-917.
Zimmermann S, Steding G, Emmen JM, Brinkmann AO, Nayernia K,
   Holstein AF, Engel W, Adham IM. Targeted disruption of the Insl3
   gene causes bilateral cryptorchidism. Mol Endocrinol. 1999;13:681-691.




-------
Publisher Taylor & Francis
Journal of Toxicology and Environmental Health, Part B
Publication details, including instructions for authors and subscription information:
http://www.informaworld.com/smpp/title~content=t713667286


Approaches for Applications of Physiologically Based Pharmacokinetic Models
in Risk Assessment
Chad M. Thompson a; Babasaheb Sonawane a; Hugh A. Barton b; Robert S. DeWoskin c; John C. Lipscomb d;
Paul Schlosser a; Weihsueh A. Chiu a; Kannan Krishnan e
a National Center for Environmental Assessment, Office of Research and Development, U.S. Environmental
Protection Agency, Washington, DC, USA
b National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental
Protection Agency, Research Triangle Park, North Carolina, USA
c National Center for Environmental Assessment, Office of Research and Development, U.S. Environmental
Protection Agency, Research Triangle Park, North Carolina, USA
d National Center for Environmental Assessment, Office of Research and Development, U.S. Environmental
Protection Agency, Cincinnati, Ohio, USA
e Groupe de recherche interdisciplinaire en santé et Département de santé environnementale et santé
au travail, Université de Montréal, Montréal, Canada

Online Publication Date: 01 August 2008
To cite this Article: Thompson, Chad M., Sonawane, Babasaheb, Barton, Hugh A., DeWoskin, Robert S., Lipscomb, John C.,
Schlosser, Paul, Chiu, Weihsueh A. and Krishnan, Kannan (2008) 'Approaches for Applications of Physiologically Based
Pharmacokinetic Models in Risk Assessment', Journal of Toxicology and Environmental Health, Part B, 11:7, 519-547
To link to this Article: DOI: 10.1080/10937400701724337
URL: http://dx.doi.org/10.1080/10937400701724337

-------
Journal of Toxicology and Environmental Health, Part B, 11:519-547, 2008
Copyright © Taylor & Francis Group, LLC
ISSN: 1093-7404 print /1521-6950 online
DOI: 10.1080/10937400701724337
           APPROACHES FOR APPLICATIONS OF PHYSIOLOGICALLY BASED
           PHARMACOKINETIC MODELS IN RISK ASSESSMENT

           Chad M. Thompson1, Babasaheb Sonawane1, Hugh A. Barton2, Robert S. DeWoskin3,
           John C. Lipscomb4, Paul Schlosser1, Weihsueh A. Chiu1, and Kannan Krishnan5
           1 National Center for Environmental Assessment, Office of Research and Development,
           U.S. Environmental Protection Agency, Washington, DC, USA
           2 National Center for Computational Toxicology, Office of Research and Development,
           U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, USA
           3 National Center for Environmental Assessment, Office of Research and Development, U.S.
           Environmental Protection Agency, Research Triangle Park, North Carolina, USA
           4 National Center for Environmental Assessment, Office of Research and Development,
           U.S. Environmental Protection Agency, Cincinnati, Ohio, USA
           5 Groupe de recherche interdisciplinaire en santé et Département de santé environnementale
           et santé au travail, Université de Montréal, Montréal, Canada
           Physiologically based pharmacokinetic (PBPK) models are particularly useful for simulating exposures to environmental
           toxicants for which, unlike pharmaceuticals, there is often little or no human data available to estimate the internal dose
           of a putative toxic moiety in a target tissue or an appropriate surrogate. This article reviews the current state of knowl-
           edge and approaches for application of PBPK models in the process of deriving reference dose, reference concentration,
           and cancer risk estimates. Examples drawn from previous U.S. Environmental  Protection Agency (EPA) risk assessments
           and human health risk assessments in peer-reviewed literature illustrate the ways and means of using PBPK models to
           quantify the pharmacokinetic component of the interspecies and intraspecies uncertainty factors as well as to conduct
           route to route, high dose to low dose and duration extrapolations. The choice of the appropriate dose metric is key to the
           use of the PBPK models for the various applications in risk assessment. Issues  related to whether uncertainty factors are
           most appropriately applied before or after  derivation of human equivalent dose (or concentration)  continue to be
           explored. Scientific progress in the understanding of life stage and genetic differences in dosimetry and their impacts on
           variability in susceptibility, as well as ongoing development of analytical methods to characterize uncertainty in PBPK
           models, will make their use in risk assessment increasingly  likely. As such,  it is anticipated that when PBPK models are
           used to express adverse tissue responses in terms  of the internal target tissue dose of the toxic moiety rather than the
           external concentration, the scientific basis of, and confidence in, risk assessments will be enhanced.
    Improving the scientific basis for human health  risk and safety assessments is an ongoing concern
for regulatory agencies. Often, the requisite data  in humans for directly assessing health risks are not
available or are limited; therefore, developing risk and safety estimates requires extrapolation(s)  across
species, exposure  routes, durations, and exposure levels. By  utilizing physiological, biochemical, and
physicochemical data, physiologically based pharmacokinetic (PBPK) models can perform these extrap-
olations using scientifically grounded principles, as well as characterize the variability  and uncertainty
therein. Although data-intensive, PBPK models have the added benefit of affording predictions of inter-
nal dosimetry  in  humans that otherwise cannot be  measured directly without  potentially harming
patients or study participants. This article reviews the current state of knowledge  and approaches for
application of PBPK models in cancer and noncancer risk assessment. Use of these  models for perform-
ing the extrapolations often needed in risk assessment is discussed generally,  followed by descriptions

    This article is an edited excerpt from the final report "Approaches for the Application of Physiologically Based Pharmacokinetic Models
and Supporting Data in Risk Assessment" (U.S. EPA, 2006a), and was originally funded by the U.S. Environmental Protection Agency under
contract RFQ-DC-03-00328. For more information about this report, please see http://cfpub.epa.gov/ncea/cfm/recordisplay.cfm?deid=157668.
    The National Center for Environmental Assessment has reviewed and approved this article for publication. Such approval does not
signify that the contents reflect the views or policy  of the U.S. Environmental Protection Agency, nor does mention of trade names con-
stitute endorsement or recommendation for use.
    Address correspondence to Kannan Krishnan, DSEST, 2375 Chemin de la Cote Ste Catherine Room 4105, Universite de Montreal,
Montreal, PQ, Canada H3T1A8. E-mail: kannan.krishnan@umontreal.ca


-------
520                                                                     C. M. THOMPSON ET AL
of the specific applications of PBPK models in reference concentration (RfC) derivation, reference dose
(RfD) derivation, and cancer risk assessment, along with illustrative examples from chemical-specific
human health risk assessments from  regulatory agencies and peer-reviewed literature. Other applica-
tions such as (1) use of biomonitoring data to infer exposure, (2) evaluation of chemical mixtures, and
(3) linking of PBPK and pharmacodynamic modeling are also discussed. This article primarily focuses on
practices at the U.S. Environmental Protection Agency (EPA); however, PBPK models are used in regu-
latory assessments in Europe and North America  (International Workshop on the Development  of
Good Modeling Practice for PBPK Models, Chania, Greece, April 2007), and thus  the approaches
herein may be applicable to the broader risk assessment community.


    RATIONALE FOR USING DOSIMETRY MODELS IN RISK ASSESSMENT

    Frequently, questions have been raised concerning the interpretations of toxicological studies,
including health risk assessments. These questions may revolve around observations of toxicities at
high doses that are not apparent at  lower doses, or findings in one species that are not observed  in
another species. In other cases, toxicity is observed in several species but at different exposure concentrations
or exposure doses. Historically, questions about how best to apply information from
experimental animal toxicity studies or high occupational exposures to protecting public health have
led to a growing recognition that pharmacokinetic analyses, particularly pharmacokinetic models
that estimate dosimetry, are valuable tools to help provide some answers.

    Advantages of Dosimetry Models
    Pharmacokinetics  involves the study of the time course of the concentrations or amounts of a
parent chemical or metabolite(s) in  biological fluids, tissues and excreta, as well as the construction
of mathematical models to interpret such data (Wagner, 1981; Benet et al.,  1996; Renwick, 2001).
The time course of the concentration of a chemical or its metabolite in  biota is determined by the
rate and extent of absorption, distribution, metabolism, and excretion  (ADME). The pharmacoki-
netics, or ADME, of a substance determines the delivered dose or the amount of chemical  available
for interaction in the tissue(s) of interest. Relating adverse response(s) observed in biota to an appro-
priate measure  of delivered  dose (e.g., concentration  of the toxic chemical in the target tissue)
rather than administered dose or exposure concentration is likely to  improve the characterization  of
many dose-response relationships (Clewell et al., 2002a; Slikker et al., 2004).
    Various modeling approaches are  used to characterize exposures  and  the  resulting delivered
doses. These approaches reflect differences in chemical and  physical characteristics (e.g.,  stable  or
reactive gases, particulate matter,  lipophilic  organics,  water-soluble compounds), differences  in
pharmacokinetic properties, and the ability of compounds to  produce contact site or systemic toxic
effects (U.S. EPA, 1994; Andersen & Jarabek, 2001; Overton, 2001; U.S. EPA, 2004).
    Exposure to many drugs and toxicants occurs via the oral route, resulting in systemic effects
that have been analyzed using relatively simple (e.g., one- and two-compartment) pharmacokinetic
models  (O'Flaherty, 1981; Renwick, 2001). Generally, these models consist of a central compart-
ment that  represents the whole body when  distribution  occurs nearly  instantaneously  (one-
compartment model)  or  plasma  when an additional  compartment (two-compartment model) is
necessary to  describe a slower distribution phase such  as  sequestration  into fat. Compartment
models  help  characterize the kinetic behavior of a  chemical and are useful for  deriving values
describing distribution in the body and clearance from the plasma (e.g., half-life).
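    The one-compartment case described above can be sketched as follows. This is a minimal illustration, assuming an intravenous bolus dose and first-order elimination; the function names and parameter values are hypothetical and are not taken from the article.

```python
import math

def one_compartment_conc(dose_mg, volume_l, k_elim_per_h, t_h):
    """Plasma concentration (mg/L) at time t_h after an IV bolus,
    assuming instantaneous distribution and first-order elimination."""
    return (dose_mg / volume_l) * math.exp(-k_elim_per_h * t_h)

def half_life_h(k_elim_per_h):
    """Elimination half-life (h) for a first-order rate constant."""
    return math.log(2) / k_elim_per_h

def clearance_l_per_h(volume_l, k_elim_per_h):
    """Clearance (L/h) as volume of distribution times rate constant."""
    return volume_l * k_elim_per_h

# Hypothetical values, for illustration only.
dose, vd, k = 100.0, 50.0, 0.1
t_half = half_life_h(k)
c_at_half_life = one_compartment_conc(dose, vd, k, t_half)
```

    With these assumed values the initial concentration is dose/volume, and the concentration one half-life later is half of that, which is the kind of summary value a compartment analysis is used to derive.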
    The values derived from a compartment model analysis may only apply to the conditions of the
study from which the experimental  data were obtained. However, these models have expanded  to
include  PBPK models that contain multiple compartments and explicit mathematical descriptions of
physiological processes and tissues most likely to affect chemical disposition (e.g., absorption from
the gut or lung, cardiac output, metabolism in the liver, renal  clearance). Such models more closely
represent the biological determinants of a chemical's disposition in  the  body and predict the
internal  dose that would  result from different exposure regimens—including exposure conditions
for which no data are available or ethically obtainable.
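    To make the idea of explicit physiological compartments concrete, the sketch below integrates a deliberately small flow-limited model (blood, liver with saturable metabolism, and rest of body) after a bolus dose into blood. It is an illustrative assumption, not a model from the article: the structure, the parameter values, and the simple Euler integration are all hypothetical choices.

```python
def simulate_pbpk(dose_mg, t_end_h, dt_h=0.001):
    """Euler integration of a three-compartment, flow-limited PBPK sketch.
    Returns the amount (mg) in each compartment and the amount metabolized."""
    # Hypothetical parameter values, for illustration only.
    q_liver, q_rest = 90.0, 270.0               # blood flows (L/h)
    v_blood, v_liver, v_rest = 5.0, 1.8, 60.0   # compartment volumes (L)
    p_liver, p_rest = 2.0, 1.5                  # tissue:blood partition coefficients
    vmax, km = 50.0, 1.0                        # saturable metabolism (mg/h, mg/L)

    a_blood, a_liver, a_rest, a_met = dose_mg, 0.0, 0.0, 0.0
    for _ in range(int(round(t_end_h / dt_h))):
        c_blood = a_blood / v_blood
        # Venous concentration leaving each tissue (flow-limited assumption).
        cv_liver = a_liver / (v_liver * p_liver)
        cv_rest = a_rest / (v_rest * p_rest)
        rate_met = vmax * cv_liver / (km + cv_liver)
        flux_liver = q_liver * (c_blood - cv_liver)
        flux_rest = q_rest * (c_blood - cv_rest)
        a_blood -= (flux_liver + flux_rest) * dt_h
        a_liver += (flux_liver - rate_met) * dt_h
        a_rest += flux_rest * dt_h
        a_met += rate_met * dt_h
    return {"blood": a_blood, "liver": a_liver,
            "rest": a_rest, "metabolized": a_met}

result = simulate_pbpk(dose_mg=10.0, t_end_h=1.0)
```

    Because every flux leaving one compartment enters another, the amounts in the returned dictionary sum to the administered dose, which is a basic mass-balance check worth applying to any such model.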

-------
APPLICATIONS OF PBPK MODELS IN RISK ASSESSMENT
    Clearly, the need to predict the behavior of chemicals in exposed organisms is a driving force
behind  PBPK model development. Early PBPK models were developed to predict the behavior of
volatile  anesthetics, including compounds now used exclusively as industrial chemicals (Krishnan &
Andersen, 2001). The general principles developed in  these early PBPK modeling efforts for system-
ically distributed compounds are also applicable to other chemical classes. For example,  the respi-
ratory tract is a frequent site of both  exposure and toxicity, and it has been a particular focus for a
range of modeling approaches, including those developed to simulate the kinetics of gases of vari-
ous reactivities and solubilities, as well as particulate matter (PM) (U.S. EPA, 1994). More recently,
the kinetics of reactive  gases  and PM within  the  respiratory  tract have  been simulated  using
advanced approaches such as two-dimensional and three-dimensional computational fluid dynamics
(CFD) modeling (Kimbell et al., 1993; Overton, 2001; Martonen et al., 2001; U.S. EPA, 2004).
    The  role of metabolism is  another significant factor  in the  development of PBPK models.
Saturable metabolism often results in  nonlinear relationships between the level of administered dose
and the levels of the internal dose for a parent compound or metabolite. In combination with  other
physiological and chemical events, the resulting administered dose-response relationship can quickly
become difficult to resolve with simple analytical tools. PBPK models provide a reliable means to
account for interactions and nonlinearities among multiple processes and to provide insight into
whether the parent chemical or the  metabolite is the toxic moiety leading to adverse effects. As
depicted in Figure 1, dose-response  relationships that appear unclear or confusing at the adminis-
tered dose level may become more understandable when expressed on the basis of internal dose of
the chemical. The major advantage of constructing dose-response relationships on the basis of inter-
nal  or delivered dose is that it provides a stronger biological basis for conducting extrapolations and
for comparing responses across studies, species, routes, and dose levels (Clewell & Andersen, 1985;
Andersen et al., 1987; Aylward et al.,  1996; Benignus  et al., 1998; Melnick & Kohn, 2000).
    [Figure 1 plots. Panel A: response versus exposure concentration (ppm), with linear, Hill, and Weibull curve fits. Panel B: response versus rate metabolized (mg/hr).]
FIGURE 1. Relationship between the exposure concentration and adverse response for a hypothetical chemical. (A) Hypothetical exam-
ple of a chemical for which the correlation between dose and response is weak or complex, along with equally plausible curve fits (linear,
Hill, and Weibull). This dose-response relationship is improved when it is based on an appropriate measure of internal dose (B).
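    The nonlinearity introduced by saturable metabolism can be illustrated with a Michaelis-Menten rate expression, a common way to describe saturable kinetics. The sketch below is an assumption for illustration only; the parameter values are hypothetical and do not correspond to the chemical in Figure 1.

```python
def rate_metabolized(conc_mg_per_l, vmax_mg_per_h, km_mg_per_l):
    """Michaelis-Menten rate of metabolism (mg/h) at a given concentration."""
    return vmax_mg_per_h * conc_mg_per_l / (km_mg_per_l + conc_mg_per_l)

# Hypothetical kinetic constants, for illustration only.
VMAX, KM = 50.0, 1.0

# Effect of doubling the concentration well below and well above Km.
low_ratio = rate_metabolized(0.02, VMAX, KM) / rate_metabolized(0.01, VMAX, KM)
high_ratio = rate_metabolized(20.0, VMAX, KM) / rate_metabolized(10.0, VMAX, KM)
```

    Well below Km, doubling the concentration nearly doubles the rate metabolized; well above Km, the same doubling changes the rate only slightly. A response driven by the amount metabolized can therefore look irregular against exposure concentration yet orderly against the internal dose metric.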

-------
   Advantages of Using PBPK Models in Risk Assessment
   Regulatory agencies such as the U.S. EPA derive dose-response values based on the current
understanding of a dose-response relationship. Reference values for  noncancer effects corre-
spond to an estimate of a daily exposure to the human population (including sensitive subgroups)
that is likely to be without an appreciable risk of deleterious effects during  a lifetime. The refer-
ence values developed at the U.S. EPA include RfC for chronic inhalation exposures and RfD for
chronic oral exposures. For chronic oral and inhalation cancer risk assessments with an unknown
or a linear mode of action (MOA) (e.g.,  mutagenic carcinogens), U.S. EPA typically develops unit
risk estimates, a probability of developing cancer over a lifetime per unit of exposure, including
the cancer slope factor (CSF) for oral exposures and the inhalation unit risk (IUR). The underlying
assumption in these derivations  is that the typical human exposure concentrations or applied
doses of a parent  chemical result in internal exposures to the putative toxic  form of the chemical
in a target organ  that will be less than or equal to a level that is not associated  with significant
adverse responses during a lifetime (noncarcinogens) or that yields a likely risk at  or below the
estimated lifetime risk (carcinogens).
   Although a key factor in the induction of adverse effects is the presence of the toxic form of a
chemical in the target organ, it is rare that data are available on the time course of the toxic moiety
in the target tissue(s)  in humans. Even in animal studies, it is more practical to obtain measures of
blood, plasma, and urinary concentrations of toxic chemicals and their metabolites than the actual
toxic moiety level  in the relevant tissue. Pharmacokinetic models can therefore be used to estimate
the tissue concentration of toxic substances.
   Among the compartmental pharmacokinetic models, PBPK models are generally the most
useful for conducting  extrapolations needed to derive  reference values because they model
the underlying physiological and chemical processes that determine chemical disposition, and
may be used  to  predict  target organ  concentrations  for  relevant  exposure scenarios
(Himmelstein  &  Lutz, 1979;  Rowland,  1985; Leung,  1991;  Andersen, 1995;  Krishnan  &
Andersen, 2001, 2007; Reddy et al., 2005). Figure 2 depicts the process of PBPK model
development. By simulating kinetics and estimating dose metrics  of chemicals, PBPK models
make explicit  and  perhaps reduce the  uncertainty  related  to  interspecies,  intraspecies,
route-to-route,  duration, and high-dose to  low-dose extrapolations needed  to derive  RfC,
RfD, and cancer  unit risk estimates. The following sections discuss how the PBPK models are
used in  health risk assessment.
    [Figure 2 flow chart: Simulation, then Comparison to Data; if not OK, Refine Model and simulate again; if OK, Application in Risk Assessment.]
FIGURE 2. Basic flow chart of PBPK model development. Note that physiological and biochemical constants used in deterministic mod-
eling could be replaced with statistical values when using probabilistic modeling approaches.

-------
   APPLICATION OF PBPK MODELS IN RISK ASSESSMENT

   This article focuses on applications of PBPK models in risk assessment and provides literature
references to guide the reader interested in further information. An extensive  listing of more than
1000 references relevant to PBPK modeling for environmental chemicals is provided as an online
Appendix to "Approaches for the Application of Physiologically Based Pharmacokinetic Models and
Supporting Data in Risk Assessment" (U.S. EPA, 2006a).

   Pharmacokinetic Model and Data Needs
   The quantitative dose-response assessment portion  of the risk assessment process can be used
to determine a point of departure (POD) for one or more of the most sensitive critical effects. The
POD is the dose-response point that marks the beginning of a low-dose extrapolation, and it can be
the no-observed-adverse-effect level (NOAEL) or the lowest-observed-adverse-effect level (LOAEL)
for an observed  incidence or the  lower bound on dose for an estimated incidence or change in
response level from a benchmark dose (BMD) analysis. Frequently, the POD is an external exposure
concentration (or administered dose) that relates to the observed  responses in laboratory or field
studies. Infrequently, the POD is an internal dose metric (e.g., blood concentration) in a toxicity
study that is also designed to collect appropriate pharmacokinetic measurements. Generally,
however, if an assessment intends to use an internal dose metric  such as blood concentration, a
PBPK model will be needed to estimate the internal dose(s) from  the external exposure or doses
(including the POD) that were used in a toxicity study.
   PBPK models are often intended to estimate target tissue dose in species and under exposure
conditions for which little or no data exist. Thus, if a complete pharmacokinetic data set were avail-
able, then there may be no need to develop a PBPK model. Such an optimal  pharmacokinetic data
set for risk assessment would consist of the time-course data on the most appropriate dose metric
associated with exposure scenarios and doses used in the critical studies chosen for the assessment
(e.g., animal bioassays or human clinical and epidemiological studies) and relevant human expo-
sure  conditions. An example of such a dose metric is the concentration of a toxic metabolite in tar-
get tissue over a 24-h period in the test species and humans. This  information  would be obtained
for the window of exposure and route and scenario of exposure associated with  the critical study, as
well  as for the window of susceptibility, appropriate route, and exposure scenarios in humans.
   In almost all cases, however, the optimal data set is not available, and often  the available animal
pharmacokinetic data may be limited. In the absence of experimental kinetic data on the biologically
active form of a chemical  in target tissues, data on blood concentration of the parent chemical, urinary
metabolite  levels, or fraction  absorbed may be used as  a surrogate for the tissue levels. These and
other subsets of pharmacokinetic data are used to develop a PBPK model to estimate the level of the
toxic moiety of interest, and the uncertainty in those estimates can be formally characterized.

   Choosing PBPK Models Appropriate for Risk Assessment
   PBPK models are most often  used in risk assessments to estimate tissue and  blood concentra-
tions of  a toxic  moiety (parent chemical or metabolite) resulting from the dosing regimens in the
animal toxicity or human studies that are the basis for deriving dose-response values (e.g., RfC, RfD,
CSFs). Figure  3 depicts four basic criteria for selecting appropriate PBPK models. The first criterion,
though appearing self-evident, is quite fundamental because not all models available in the scien-
tific literature are parameterized for the specific species and  lifestage that were used in the critical
toxicological study under consideration in dose-response analysis. For example,  a PBPK model for a
particular compound may be available for rats, yet the critical study in a dose-response analysis is in
mice. Similarly,  the second criterion relates to the routes of exposure that can  be simulated by an
existing  PBPK model and those of interest for a health risk assessment. The third criterion is that a
model chosen for risk assessment applications would be able to provide simulations of the tissue
dose of  the toxic moiety or an appropriate dose metric related to the MOA,  exposure scenarios,
and routes associated with the critical study and likely human exposures. When  a PBPK model does
not adequately  meet these three criteria, additional data collection and model development  may

-------
    [Figure 3 flow chart. Decision boxes, in order: (1) Is the PBPK model available for the test species and humans? (2) Are the parameters for simulating relevant routes available? (3) Does the model simulate dose metrics of relevance to risk assessment? (4) Has the model been evaluated and peer-reviewed? Side box: Experimental Data Collection.]
FIGURE 3.  Flow chart for selecting PBPK models appropriate for risk assessment. Four basic criteria for model use in risk assessment are
shown in diamond shape boxes. A detailed discussion of model evaluation is described in U.S. EPA (2006a) and Chiu et al. (2007).


be necessary. Finally, the fourth criterion is that PBPK models  intended for use in risk assessment
need  to be peer-reviewed; otherwise, efforts may be directed toward this end. Approaches for eval-
uating PBPK models are discussed in greater detail elsewhere (Clark et al., 2004; U.S.  EPA, 2006a;
Chiu et al., 2007).
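    The four selection criteria can be read as a simple decision sequence. The sketch below is one hypothetical way to encode it; the class, field, and function names are illustrative assumptions and are not part of any EPA tool or of the article.

```python
from dataclasses import dataclass

@dataclass
class ModelStatus:
    """Answers to the four screening questions (hypothetical structure)."""
    covers_test_species_and_humans: bool
    covers_relevant_routes: bool
    simulates_relevant_dose_metrics: bool
    evaluated_and_peer_reviewed: bool

def next_step(status):
    """Return the action implied by the first unmet criterion."""
    first_three_met = (status.covers_test_species_and_humans
                       and status.covers_relevant_routes
                       and status.simulates_relevant_dose_metrics)
    if not first_three_met:
        return "collect additional data and develop the model further"
    if not status.evaluated_and_peer_reviewed:
        return "evaluate and peer-review the model"
    return "use the model in the risk assessment"

# Example: a rat-only model when the critical study is in mice.
rat_only = ModelStatus(False, True, True, True)
action = next_step(rat_only)
```

    Ordering the checks this way mirrors the text: gaps in species, route, or dose-metric coverage send the assessor back to data collection and model development before peer review is considered.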

    Evaluation of Dose Metrics for PBPK Model-Based Assessments
    The relationship between blood and tissue concentrations and the response in exposed organisms has long been
recognized in pharmacology (Wagner, 1981). As mentioned previously, in pharmacokinetics, the
target tissue dose that most closely relates to an adverse response is often referred to as the internal
"dose  metric" (Andersen & Dennison, 2001). Dose metrics used in risk assessment  applications,

-------
ideally,  reflect the biologically  active form  of the chemical  (parent chemical, metabolites,  or
adducts), its level (concentration or amount),  duration of internal exposure (instantaneous, daily,
lifetime, or a specific developmental period),  and intensity (peak, average, or integral), as well as
the appropriate biological matrix (e.g., blood, target tissue, surrogate tissues). When using PBPK
models in risk assessment, the basic data needed are: (1) a POD for a critical effect from a toxicity
study, (2) a peer-reviewed PBPK model for the relevant animal  test species and humans, and (3) a
dose metric that is reliably predicted by the PBPK model, is appropriate for the risk assessment, and
is supported  by the  MOA (if known). The methods and challenges  associated with  the first data
need (identification of critical  effects and PODs for an assessment) remain the same regardless of
whether one  uses PBPK models or not; approaches for identifying PODs are found elsewhere (U.S.
EPA, 1994, 2005a). The criteria and issues associated with the evaluation of PBPK models for use in
risk assessment are discussed in  detail elsewhere (U.S.  EPA, 2006a; Chiu  et al., 2007).  It is  worth
noting, however, that although an animal model is frequently needed, if sufficient pharmacokinetic
data are available for the toxic moiety of interest in an animal species,  then only a  human PBPK
model may be necessary to predict internal dose in humans. The third data need, identification of
the appropriate dose metric, is the subject of the remainder of this subsection.
    The dose metric, or the appropriate form of potential toxic moiety most closely associated with
the response,  varies from chemical to chemical, depending on  the MOA and critical effect. It has
two basic properties: the moiety and the measure thereof (Table 1). The dose metric for PBPK-
based risk assessment is chosen following the identification of the potential toxic moiety and eval-
uation  of the relationship  with the endpoint  of  concern.  Useful frameworks for evaluating
hypothesized  MOAs are included in Guidelines for Carcinogen  Risk Assessment (U.S. EPA, 2005a)
and A  Framework for Assessing Health Risks  of Environmental Exposures to Children (U.S. EPA,
2006b). Although the framework specifically deals with  carcinogens, the concepts are  broadly
applicable to  noncancer MOAs.  The framework provides useful discussion  related  to  evaluating
multiple MOAs (particularly over dose ranges) and for assessing relevance to humans.  Furthermore,
available data on closely related chemicals may be used to infer the likely toxic moiety. Similarly,
the toxicity data for various exposure routes and modes of administration may be compared to infer
the potential toxic moiety (IPCS, 2005).
    After the identification of the potential toxic moiety, the appropriate measure of tissue exposure
to the toxic moiety is  selected (Figure 4). For example, peak concentration was  related to some
neurotoxic effects of  solvents  (Bushnell,  1997; Benignus  et  al.,  1998;  Pierce  et al.,   1998;
MacDonald et al., 2002), such as the correlation of concentration of trichloroethylene  at the time of
                              TABLE 1. Examples of dose metrics for exploring
                              dose-response relationships
                              Potential moieties
                              Parent compound
                               Peak concentration
                               Average concentration
                               Amount or quantity
                               AUC (integral)
                              Metabolite
                               Peak concentration
                               Average concentration
                               Amount or quantity
                               Rate of production
                               Cumulative rate of amount formed/time/L tissue
                               AUC (integral)
                              Miscellaneous
                               Receptor occupancy (extent or duration)
                               Macromolecular adduct formation
                               Depletion of cofactors

-------

    [Figure 4 diagram. Dose metrics (parent or metabolite): maximum, Cmax (mg/L tissue); integrated, AUC (mg/L tissue x time); average, mg formed per unit time (e.g., mg/hr/L tissue); integrated, mg formed over time specified (e.g., mg/hr/L tissue). Evaluate correlations to toxic responses.]
                      FIGURE 4. Examples of dose metrics for risk assessment application.


toxicity testing with observed effects on behavioral and visual functions (Boyes et al., 2000). For tet-
rachlorodibenzodioxin, tissue concentrations of the chemical measured during a critical period of
gestation were reported to predict the  intensity of developmental responses (Hurst et al., 2000).
The gender-specific genotoxic effects of benzene in  mice are related to differences in  the rate of
oxidative metabolism (Kenyon et al., 1996).
    For chronic effects of chemicals, the integrated concentration of the toxic form of chemical in
target tissue over time (i.e., the area under the concentration curve or AUC), typically determined
as the daily average, is often considered a reasonable dose metric (Andersen et al., 1987;  Collins,
1987; Voisin et al., 1990; Clewell et al., 2002a). For carcinogens producing reactive intermediates,
the amount of metabolite produced per unit time and the amount of metabolite in target tissue over
a period of time (e.g., mg metabolite/L tissue during 24 h) were used as dose metrics (Andersen et al.,
1987; Andersen & Dennison,  2001). For developmental effects, the dose surrogate is defined in the
context of the window of exposure for a particular gestational event (Welsch et al., 1995). Although
the AUC and rate of metabolite formation figure among the most commonly investigated dose met-
rics,  other surrogates of tissue exposure may  also be appropriate for risk  assessment purposes,
depending on  the chemical and its  MOA (e.g.,  maximal concentration [Cmax] of the toxic  moiety,
duration and extent of receptor occupancy, macromolecular adduct formation,  or depletion of glu-
tathione) (Clewell et al.,  2002a). Table 2 lists dose metrics that have been  used in a  number of
PBPK models published in the peer-reviewed literature. Although these models and dose metrics were
used for cancer and noncancer risk assessments reported in the scientific literature, they have not
necessarily been used in risk assessments carried out by regulatory agencies.
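    Several of the dose metrics discussed above (Cmax, AUC, and the time-averaged concentration) can be computed directly from a simulated or measured concentration time course. The sketch below uses the trapezoidal rule; the function name and the sample data are hypothetical and serve only to illustrate the arithmetic.

```python
def dose_metrics(times_h, conc_mg_per_l):
    """Return (Cmax, AUC, time-averaged concentration) for a time course.
    AUC is computed with the trapezoidal rule over the sampled interval."""
    cmax = max(conc_mg_per_l)
    auc = 0.0
    for i in range(1, len(times_h)):
        width = times_h[i] - times_h[i - 1]
        auc += 0.5 * (conc_mg_per_l[i] + conc_mg_per_l[i - 1]) * width
    average = auc / (times_h[-1] - times_h[0])
    return cmax, auc, average

# Hypothetical 24-h tissue concentration time course (mg/L tissue).
times = [0.0, 6.0, 12.0, 24.0]
concs = [0.0, 4.0, 2.0, 0.0]
cmax, auc, average = dose_metrics(times, concs)
# cmax = 4.0 mg/L, auc = 42.0 mg/L x h, average = 1.75 mg/L
```

    Two exposure scenarios with the same AUC can have very different peak concentrations, which is one reason the choice among these metrics depends on the MOA.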
   When the appropriate dose metric cannot  be  readily identified, evaluation of the relationship
with the endpoint of concern  may be undertaken with each of the dose metrics to identify the one
that exhibits the best association (Andersen et al., 1987; Kirman et al., 2000). This becomes partic-
ularly important when there are limited or confusing data on the plausible  MOA of the chemical. At
a minimum, the appropriate dose metric can be identified as the one that demonstrates a consis-
tent  relationship with positive and negative  responses observed at various dose levels, routes,  and
scenarios in a given  species. In other words, the level  of the dose  metric would be lower for expo-
sure conditions that elicit no effect and higher for conditions that elicit toxic responses, regardless of




-------
TABLE 2. Dose metrics used in published PBPK models for assessing risk

Chemical                   Endpoint                 Dose metric                                      Reference

Acrylonitrile              Brain tumors             Peak metabolite concentration in target tissue   Kirman et al. (2000)
Bromotrifluoromethane      Cardiac sensitization    Concentration of parent chemical at the end      Vinegar and Jepson (1996)
                                                    of exposure
Butoxyethanol (2-)         Forestomach lesions      Concentration of butoxyethanol/butoxyacetic      Poet et al. (2003)
                           and tumors               acid in forestomach
Chloroform                 Liver cancer             Amount of metabolites covalently bound to        Reitz et al. (1990a)
                                                    biological macromolecules/L liver per day;
                                                    % cell kill/day
                           Hepatic effects and      Maximal rate of metabolism per unit kidney       Meek et al. (2002)
                           kidney tumor             cortex volume
Chloropentafluorobenzene   Liver toxicity           AUC of parent chemical in liver                  Clewell and Jarnot (1994)
1,4-Dioxane                Liver tumors             Time-weighted average concentration in liver     Leung and Paustenbach (1990)
                                                    over lifetime
                                                    Liver AUC                                        Reitz et al. (1990b)
Ethyl acrylate             Forestomach tumors       Tissue-specific glutathione depletion            Frederick et al. (1992)
Ethylene glycol ethers     Developmental toxicity   Peak concentration and average daily AUC of the  Sweeney et al. (2001)
                                                    alkoxyacetic acid (metabolite) in blood
Formaldehyde               Cancer                   DNA-protein cross-links                          Schlosser et al. (2003);
                                                                                                     Casanova et al. (1996)
Heptafluoropropane         Cardiac sensitization    Concentration of parent chemical at the end      Vinegar and Jepson (1996)
                                                    of exposure
Isopropanol                Neurobehavioral effects  Peak blood concentration                         Gentry et al. (2002)
                           Developmental/           AUC of isopropanol and its metabolite (acetone)  Gentry et al. (2002)
                           reproductive effects
Methoxyacetic acid         Developmental effects    AUC of parent chemical (gestational day 11)      Clarke et al. (1993, 1992)
                                                    Maximal concentration of parent chemical         Welsch et al. (1995)
                                                    (gestational day 8)
Methyl chloroform          Hepatic effects          Area under the liver concentration vs. time      Reitz et al. (1988a)
                                                    curve
Methylmercury              Neurological effects     Fetal brain concentrations                       Gearhart et al. (1995)
Methyl methacrylate        Nasal lesions            Amount metabolized/time/volume nasal tissue      Andersen et al. (2002, 1999)
Methylene chloride         Cancer                   Rate of glutathione transferase metabolites      Andersen et al. (1987)
                                                    produced/L liver/time
Styrene                    Lung tumors (mouse)      Steady-state concentration of ring oxidation     Cruzan et al. (2002)
                                                    metabolite mediated by CYP2F
Tetrachlorodibenzodioxin   Biochemical responses    Body burden                                      Kim et al. (2002)
                           Cancer risk              Time-weighted receptor occupancy                 Andersen et al. (1993)
                                                    Up/down regulation of receptor occupancy         Portier et al. (1993)
                                                    Fraction of cells induced                        Conolly and Andersen (1997)
Toluene                    Behavioral effects       Brain concentrations at the time of testing      Van Asperen et al. (2003)
Trichloroethylene          Renal toxicity           Metabolite production/L kidney/day               Barton and Clewell (2000)
                           Neurotoxicity            Blood concentration of metabolite                Barton and Clewell (2000)
                                                    (trichloroethanol)
                           Cancer (liver, lung, and Amount metabolized/kg/day; AUC for               Clewell et al. (2000);
                           kidney)                  trichloroacetic acid or dichloroacetic           Fisher and Allen (1993)
                                                    acid/L plasma; production of thioacetylating
                                                    intermediate from dichlorovinylcysteine in
                                                    kidney
Vinyl acetate              Olfactory degeneration   Intracellular pH change associated with the      Bogdanffy et al. (2001, 1999)
                           and tumor development    production of acetic acid; proton
                                                    concentration in olfactory tissue
Vinyl chloride             Angiosarcoma             mg metabolized/L liver; mg metabolite            Clewell et al. (2001);
                                                    produced/L liver/day                             Reitz et al. (1996b)

-------
528                                                                     C. M. THOMPSON ET AL
           TABLE 3. Relationship between tumor prevalence and dichloromethane metabolites(a)

           Exposure (ppm)        Microsomal pathway(b)       GST(c) pathway(b)       Tumor number

           0                     0                           0                       6
           2000                  3575                        851                     33
           4000                  3701                        1811                    83

            (a) Adapted from Andersen et al. (1987); methylene chloride dose response in female mice.
            (b) mg metabolized/L tissue/d via each pathway.
            (c) Glutathione S-transferase.
the route and exposure scenario (Clewell et al., 2002a). For example, Andersen et al. (1987) used a
PBPK model for dichloromethane (DCM) to examine which putative metabolites are responsible for
inducing liver and lung tumors in mice exposed to 2000 or 4000 ppm, 6 h/d, 5 d/wk for lifetime. In
brief, DCM undergoes oxidation  by microsomal cytochrome  P-450 enzymes or conjugation to
glutathione by glutathione S-transferase (GST). These authors designed a mouse PBPK model to cal-
culate the tissue dose of metabolites arising from exposure scenarios comparable to those used in
the relevant cancer bioassay study. The predicted amounts of metabolites formed by each pathway
were compared  to the observed tumor incidence.  Because the parent chemical was nonreactive,
Andersen  et al.  (1987) considered  it an unlikely  candidate responsible for the tumorigenicity.
Hence,  only the relationship  between the tissue exposure to its metabolites and tumor incidence
was examined (Table 3). Although the dose  metric based on cytochrome P-450 oxidative pathway
varied little between 2000 and 4000 ppm, the flux through the GST pathway rose with  increasing
dose of  DCM and corresponded well with the degree of DCM-induced liver tumors at these expo-
sure concentrations. Similar results were obtained for lung tumors (data not shown).
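
The comparison described above can be checked with simple arithmetic on the Table 3 values. The following is a minimal sketch, not part of the original analysis; the variable and function names are illustrative only. It computes the fold change between the 2000 and 4000 ppm groups for each pathway and for tumor number.

```python
# Values taken from Table 3 (2000 and 4000 ppm exposure groups).
# Dose metrics are in mg metabolized/L tissue/d.
microsomal = [3575, 3701]
gst = [851, 1811]
tumors = [33, 83]


def fold_change(values):
    """Ratio of the value at the higher exposure to the value at the lower one."""
    return values[1] / values[0]


microsomal_fold = fold_change(microsomal)
gst_fold = fold_change(gst)
tumor_fold = fold_change(tumors)

print(f"Microsomal pathway: {microsomal_fold:.2f}-fold")
print(f"GST pathway:        {gst_fold:.2f}-fold")
print(f"Tumor number:       {tumor_fold:.2f}-fold")
```

The microsomal dose metric is nearly flat across the two exposures, whereas the GST dose metric and the tumor number both roughly double, which is the pattern the text describes.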
    Where there is an inadequate basis for prioritizing one dose metric over another, some suggest
using the most conservative one (i.e., the dose metric estimating the highest risk or lowest acceptable
exposure level) to be health protective (Clewell et al., 2002a). For more on dose metric selection,
see Clewell et al. (2002a) and U.S. EPA (2006a).

    Overview  of Extrapolations Possible With PBPK Models
    The scientific literature (or database) for a specific chemical  is often incomplete, thus  extrapola-
tions  are frequently required  in risk assessment. Generally, there are five types of extrapolations
possible with PBPK models. Interspecies extrapolations are most common, as toxicity data are most
often derived in  experimental animals. Because animal exposures are often performed in the con-
text of work days and work weeks, the exposure patterns frequently utilized in animal studies need
to be adjusted for duration, particularly when deriving dose-response values for continuous expo-
sure.  Route extrapolations are often  advantageous in risk assessment, as high-quality studies for all
routes of interest to human exposure are rarely available in the scientific literature. Similarly, high-
dose to  low-dose extrapolations are necessary to account for the fact that experimental assays often
employ  exposure doses high enough to elicit a response over more than one dose or concentration.
Finally, intraspecies extrapolations may be carried out to  extrapolate findings in  one group (usually
healthy  adult animals or humans) to other susceptible populations.
    Interspecies Extrapolation  The application of PBPK models for interspecies extrapolation of
tissue dosimetry (Rowland,  1985) is  performed in several steps. First,  a model for the species in a
critical toxicity study is developed. The  next step involves using species-specific or allometrically
scaled physiological parameters in  the model and replacement of the chemical-specific parameters
(e.g.,  metabolic rates, protein  binding constants) with appropriate estimates for the species of inter-
est (e.g., humans) for the risk assessment. In this approach, the qualitative determinants of pharma-
cokinetics are  considered to be invariant among the various mammalian species unless qualitative
differences between species have been identified. In this case, differences would be factored into
the existing structure of PBPK models (e.g., if different metabolic pathways existed among species)
but, obviously, data describing these species differences are required.

-------
    For conducting interspecies extrapolation of pharmacokinetic behavior of a chemical, quanti-
tative estimates of model parameter values (e.g.,  physiological parameters, partition coefficients,
and metabolic rate constants) in the second species are required. Physiological parameters are typ-
ically available from primary or secondary literature (Brown et al.,  1997). The tissue:air partition
coefficients of chemicals appear to be relatively constant across species, whereas blood:air parti-
tion  coefficients show some species-dependent variability. Therefore,  the tissue:blood partition
coefficients for human PBPK models can be calculated by dividing the rodent tissue:air partition
coefficients by the human blood:air partition values (Krishnan & Andersen, 2001). The tissue:air
and  blood:air partition coefficients for volatile organic chemicals may also be predicted using
appropriate data on the content of lipids and water in human tissues and blood (Poulin & Krishnan,
1996a, 1996b).
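
The partition coefficient calculation described above reduces to a single division. The sketch below illustrates it; the function name and the numerical values are hypothetical and are not taken from any cited study.

```python
def human_tissue_blood_pc(rodent_tissue_air, human_blood_air):
    """Human tissue:blood partition coefficient.

    Assumes, as stated in the text, that the tissue:air coefficient is
    roughly constant across species, so the rodent value can stand in
    for the human one.
    """
    if human_blood_air <= 0:
        raise ValueError("blood:air partition coefficient must be positive")
    return rodent_tissue_air / human_blood_air


# Hypothetical inputs chosen only for illustration.
liver_blood = human_tissue_blood_pc(rodent_tissue_air=40.0, human_blood_air=16.0)
print(liver_blood)  # 2.5
```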
    Kinetic constants for metabolizing enzymes do not necessarily follow any readily predictable
pattern, making  the interspecies  extrapolation  of xenobiotic metabolism difficult.  Therefore, the
metabolic rate constants for xenobiotics are best obtained  in the species of interest. In vivo
approaches for determining metabolic  rate  constants are not always feasible  for application in
humans.  The alternative is to obtain such data under in vitro conditions (Lipscomb et al.,  1998,
2003). A parallelogram approach  may be used to predict values for the human PBPK model on the
basis of metabolic rate constants  obtained in vivo in  rodents as well as in  vitro using rodent and
human tissue fractions  (Reitz et al., 1988b;  Lipscomb et al., 1998). Alternatively, for chemicals
metabolized by enzymes conserved across species, Vmax has been scaled to the 0.75 power of body
weight and Km assumed  invariant. This approach  may be useful as a crude approximation,  in the
absence of direct measurements  of metabolic parameters. Once the human PBPK model  is con-
structed and  parameterized for a particular  chemical, its dosimetric predictions should be com-
pared with human  data (if available)  to assess  whether  the  model  adequately simulates the
pharmacokinetics of the chemical in humans.
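
The two scaling options mentioned above can be written compactly. This is a minimal sketch under stated assumptions: the function names are illustrative, the parallelogram relation is taken in its simplest proportional form (human in vivo is to human in vitro as rodent in vivo is to rodent in vitro), and the example inputs are hypothetical.

```python
def scale_vmax(vmax_animal, bw_animal_kg, bw_human_kg, exponent=0.75):
    """Allometric scaling of Vmax to body weight raised to the given power."""
    return vmax_animal * (bw_human_kg / bw_animal_kg) ** exponent


def parallelogram_human_in_vivo(rodent_in_vivo, rodent_in_vitro, human_in_vitro):
    """Estimate a human in vivo rate constant from the other three corners.

    Assumes the in vivo/in vitro ratio is the same in both species.
    """
    return rodent_in_vivo * (human_in_vitro / rodent_in_vitro)


# Hypothetical inputs: a 0.25-kg rat and a 70-kg human.
vmax_human = scale_vmax(vmax_animal=4.0, bw_animal_kg=0.25, bw_human_kg=70.0)
print(f"Scaled human Vmax: {vmax_human:.1f} (same units as the rat value)")
```

Km is left unchanged in this approach, consistent with the assumption in the text that it is invariant across species.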
   An example of rat-to-human extrapolation of the kinetics of toluene using a PBPK model is pre-
sented in Figure 5A. Here the structure of the PBPK model developed in rats was kept unchanged,
but the species-specific parameters were determined  either by  scaling  or  experimentally,  as
described earlier. This reparameterized  model was then able to predict accurately the kinetics of
toluene in humans (see Tardif et al., 1997). Whenever the human data for a particular chemical are
not available for evaluation purposes, a corollary approach using human data on similar chemicals
may be attempted (Jarabek et al.,  1994).
   There are some instances where PBPK models may be used for interspecies extrapolation of
toxicity studies without the need of an animal PBPK model.  For example, an RfC for methanol was
proposed (Starr & Festa, 2003) using a mouse developmental toxicity study  (Rogers  et al., 1993) in
which blood  methanol levels were also reported. By using  the blood methanol level at the POD
from the mouse study, a previously published human methanol  PBPK model (Bouchard et al.,
2001) was used  to predict the inhalation concentration associated with the same internal  blood
methanol level in humans. This example highlights the advantage afforded  by toxicity studies that
also  include pharmacokinetic measurements. It should be noted, however, that the utility of this
specific example is dependent on whether methanol, as opposed to a  methanol metabolite, is a
suitable dose  metric for methanol-induced toxicity.
    Route-to-Route Extrapolation  The extrapolation  of the kinetic behavior of a  chemical from
one exposure route to another is  performed  by including appropriate equations to  represent each
exposure pathway. For simulating the intravenous administration of a chemical, a single input rep-
resenting the dose administered to the animal is included in the equation for mixed venous concen-
tration. Oral gavage of a chemical dissolved in a carrier solvent may be modeled using approaches
ranging from simple zero- or first-order absorption to multicompartment, multiabsorption process
models (Roth et al., 1993), and dermal absorption has been modeled by including  a diffusion-lim-
ited  compartment to represent skin as a portal of entry (Krishnan  & Andersen, 2001). After the
equations describing the route-specific entry of  chemicals into systemic circulation are included in
the PBPK model, it is possible to conduct extrapolations of pharmacokinetics and dose metrics. This
approach is illustrated in Figure 5B for oral-to-inhalation extrapolation of the kinetics of chloroform

-------
[FIGURE 5 graphic not reproduced. Panels: (A) Interspecies (Rat, Human; x-axis: Time (hours)); (B) Route (Oral, Inhalation; x-axis: Time (hours)); (C) Duration (x-axis: Time (hours)); (D) Low-Dose (x-axis: Exposure Concentration (ppm)).]
FIGURE 5. Examples of extrapolations afforded by PBPK models. (A) Interspecies extrapolation: Exposure of humans to 17.7 ppm toluene
for 24 h results in the same AUC (3.8 mg/L/h) as exposure of rats to 50 ppm toluene for 6 h. It should be noted that if the pharmacokinetics
in humans were identical to rats, then an equivalent 24-h exposure would have yielded 12.5 ppm (i.e., 24 h × 12.5 ppm is
identical to 6 h × 50 ppm). Based on Tardif et al. (1997). (B) Route: Oral dose (1 mg/kg) and inhalation dose (83 ppm, 4 h) of chloroform
result in same AUC. Based on Corley et al. (1990). (C) Duration: Inhalation exposure of rats to 9.7 ppm toluene for 24 h results in an
AUC (2.4 mg/L/h) equivalent to 50 ppm toluene for 4 h. Based on Tardif et al. (1997). (D) Low dose: Top panel, 12-h AUC of toluene,
indicates apparently linear kinetics. Bottom panel, 12-h AUC of a metabolite of toluene under the same exposure conditions, indicates
that the kinetics of toluene are nonlinear. Based on Tardif et al. (1997).

-------
in  rats. Accordingly, 4-h inhalation exposure to 83.4 ppm chloroform is equal to an oral dose of 1
mg/kg, as determined with PBPK models on the basis of equivalent dose metric (i.e., parent chemi-
cal AUC in blood). Note that the peak concentrations differ approximately by 10-fold; thus, if peak
concentration was thought to be the appropriate dose metric, higher inhalation exposures would be
required to produce the same peak as a 1 mg/kg oral dose.
    Duration Adjustment   On the basis of equivalent dose metric, duration-adjusted exposure
values are obtained with PBPK models (Andersen et al., 1987; Bruckner et al.,  2004; Simmons et
al., 2005). For example, if the appropriate dose metric of a chemical were the AUC, it would first
be determined for the exposure duration of the critical study using the PBPK model; the
atmospheric concentration for a continuous exposure (e.g., a day, a window of exposure, or a
lifetime) yielding the same AUC would then be determined by iterative simulation. Figure 5C depicts an example
of 4-h to  24-h extrapolation of the pharmacokinetics of toluene in rats, based on equivalent 24-h
AUC (2.4 mg/L/h). According to the modeling results, rats exposed to 50 ppm toluene for 4 h or 9.7
ppm for 24 h receive equivalent doses (i.e., AUC). It should be noted that extrapolations across long
durations may not be warranted, as lifestage changes and pharmacodynamic adaptations (e.g., sen-
sitization and desensitization) may be operational (Clewell et al., 2002a).
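
The iterative simulation step can be illustrated with a deliberately simple stand-in for a PBPK model. The sketch below is an assumption-laden toy: a single compartment with first-order uptake and saturable elimination, with hypothetical parameter values, integrated by Euler steps. It is not the toluene model of Tardif et al. (1997). Bisection is used to find the 24-h concentration that reproduces the AUC from a 4-h exposure to 50 ppm.

```python
def auc_24h(conc_ppm, exposure_h, dt=0.01):
    """AUC (mg*h/L) over 24 h for a toy one-compartment model.

    All parameter values are hypothetical and chosen for illustration.
    """
    k_uptake = 0.05  # uptake, mg/L per h per ppm (hypothetical)
    vmax = 1.0       # maximal elimination rate, mg/L/h (hypothetical)
    km = 0.5         # Michaelis constant, mg/L (hypothetical)
    conc = 0.0
    auc = 0.0
    for step in range(int(round(24.0 / dt))):
        t = step * dt
        uptake = k_uptake * conc_ppm if t < exposure_h else 0.0
        rate = uptake - vmax * conc / (km + conc)
        new_conc = max(conc + rate * dt, 0.0)
        auc += 0.5 * (conc + new_conc) * dt  # trapezoidal rule
        conc = new_conc
    return auc


def equivalent_continuous_conc(target_auc, low=0.0, high=50.0, tol=1e-6):
    """Bisection on the 24-h exposure concentration that yields target_auc."""
    while high - low > tol:
        mid = 0.5 * (low + high)
        if auc_24h(mid, 24.0) < target_auc:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


target = auc_24h(50.0, 4.0)
c_equiv = equivalent_continuous_conc(target)
print(f"24-h concentration with the same AUC: {c_equiv:.2f} ppm")
```

Bisection works here because the AUC increases monotonically with exposure concentration; with a real PBPK model the same search would simply call the model in place of `auc_24h`.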
    High-Dose to Low-Dose Extrapolation  PBPK models facilitate high-dose to low-dose extrap-
olation of tissue dosimetry by accounting for the dose dependency of relevant processes (e.g., satu-
rable  metabolism,  enzyme  induction, enzyme inactivation, protein  binding, and depletion of
glutathione reserves) (Clewell & Andersen, 1987). The description of metabolism in PBPK models
has frequently included a capacity-limited metabolic process that becomes saturated at high doses.
Nonlinearities arising from mechanisms other than saturable  metabolism, such as enzyme  induc-
tion, enzyme inactivation, depletion of glutathione reserves, and binding to macromolecules, can
also be described with PBPK models (Clewell & Andersen, 1987; Krishnan et al., 1992). A PBPK
model intended for use in  high-dose  to low-dose extrapolation  needs equations and parameters
describing dose-dependent  phenomena if they occur in the  range of interest for the  assessment.
Because the  determinants of nonlinear behavior may not be identical across species and age groups
(e.g.,  maximal velocity for metabolism, glutathione concentrations), these parameters are required
for each species/age group. During the conduct of high-dose to low-dose extrapolation, no change
in  the parameters of the PBPK model should be required except for the administered dose or expo-
sure concentration.
    An example of high-dose to low-dose extrapolation is presented in  Figure 5D. In this  figure,
both the  blood AUC and the  amount metabolized over a period of time (12 h) are plotted  as a
function of exposure concentrations of toluene. For conducting high-dose to low-dose simulation in
this particular example, only the numerical value of the exposure concentration (which is an input
parameter for the PBPK model) was changed during every model run. All other model parameters
remained the same. The model simulations shown in  Figure  5D indicate the nonlinear nature of
blood AUC as well as the amount of toluene metabolized per unit time in the exposure concentra-
tion range simulated. In  such  cases, the high-dose to  low-dose extrapolation of tissue dosimetry
should not be conducted assuming linearity but, rather, should be performed using tools such as the
PBPK models.
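
The reason linear extrapolation fails under saturable metabolism can be seen from the Michaelis-Menten rate expression alone. The following sketch uses hypothetical Vmax and Km values and is meant only to illustrate the capacity-limited behavior described above.

```python
def metabolism_rate(conc, vmax, km):
    """Capacity-limited (Michaelis-Menten) rate of metabolism."""
    return vmax * conc / (km + conc)


VMAX = 10.0  # hypothetical maximal rate
KM = 1.0     # hypothetical Michaelis constant

# Doubling a low concentration nearly doubles the rate (near-linear range).
low_ratio = metabolism_rate(0.02, VMAX, KM) / metabolism_rate(0.01, VMAX, KM)

# Doubling a high concentration barely changes the rate (saturated range).
high_ratio = metabolism_rate(200.0, VMAX, KM) / metabolism_rate(100.0, VMAX, KM)

print(f"Rate ratio on doubling a low concentration:  {low_ratio:.3f}")
print(f"Rate ratio on doubling a high concentration: {high_ratio:.3f}")
```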

    Estimating Intraspecies Variability
    The magnitude of interindividual variability in internal dose may be assessed using PBPK mod-
els for RfC and RfD derivations. For this purpose,  population distributions of parameters, particu-
larly those relating to physiology and metabolizing enzymes (i.e., genetic polymorphisms), are
specified in a Monte Carlo approach, such that the  PBPK model output corresponds to distributions
of dose metric in a population.  Using the Monte Carlo approach, repeated computations based on
inputs selected at random from statistical distributions for each input parameter (e.g., physiological
parameters,  enzyme content/activity with or without the consideration of polymorphism) are con-
ducted to provide a statistical distribution of the output, i.e., tissue dose. Using the information on
the dose metric corresponding to a high percentile (e.g., 95th) and the 50th percentile, the magni-
tude of interindividual variability (pharmacokinetic component) is computed for risk assessment

-------
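[FIGURE 6 graphic not reproduced. The plot shows a distribution of a dose metric (x-axis: Concentration) with the 50th and 95th percentiles marked and the interval between them labeled as interindividual variability.]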

FIGURE 6. Estimation of interindividual variability. In this example, interindividual variability describes the variation between the 50th
(median) and 95th percentile values of a dose metric simulated with a probabilistic PBPK model.


purposes (Figure 6). An important challenge for implementing  this approach is  to adequately
describe codependencies of parameters (e.g., tissue volume and  blood flow); assuming indepen-
dence of parameters will overestimate the population variability. Another challenge is to adequately
characterize population distributions reflecting the range of factors important to pharmacokinetics,
including genetics, age, lifestyle, health,  and nutritional status.
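
The Monte Carlo procedure can be sketched with a single varying parameter. In the example below, the "model" is only a steady-state relation (dose metric = intake/clearance), clearance is assumed lognormal, and every numerical value is hypothetical. A real application would sample many parameters of a full PBPK model and would need to represent the codependencies noted above.

```python
import math
import random


def simulate_dose_metric(n, seed=0):
    """Draw n simulated individuals and return their dose metrics."""
    rng = random.Random(seed)
    intake = 1.0  # mg/h, identical for everyone (hypothetical)
    metrics = []
    for _ in range(n):
        # Clearance in L/h; median 10, log-scale SD 0.3 (hypothetical).
        clearance = rng.lognormvariate(math.log(10.0), 0.3)
        metrics.append(intake / clearance)
    return metrics


def percentile(values, p):
    """Nearest-rank percentile of a list of values."""
    ordered = sorted(values)
    index = round(p / 100 * (len(ordered) - 1))
    return ordered[index]


metrics = simulate_dose_metric(20000)
p50 = percentile(metrics, 50)
p95 = percentile(metrics, 95)
variability = p95 / p50
print(f"95th/50th percentile ratio: {variability:.2f}")
```

The ratio of the 95th to the 50th percentile is the quantity illustrated in Figure 6 as the pharmacokinetic component of interindividual variability.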
    Although past efforts largely have characterized the impact of the distributions of parameters in
the adult population, variability analyses also need to address different lifestages (e.g., pregnancy,
children, geriatric). Age-specific changes in physiology, tissue composition, and metabolic activity
were incorporated into the same model structure used for adults (O'Flaherty, 1994; Corley et al.,
2003; Nong et al., 2006). Published examples of modeling different ages describe predictions for a
range of chemicals with different properties (Clewell et al., 2002b, 2004; Sarangapani et al., 2003;
Ginsberg et al., 2004).  However, some lifestages, notably pregnancy and lactation, require different
model structures (i.e., describing the mother and the offspring) (Gentry et al., 2003, 2004; Corley et
al., 2003). Characterization of population variability across ages and lifestages as well as adult vari-
ability is an ongoing area of development (U.S. EPA, 2006c).

    Applications of PBPK Models in RfC and RfD Derivation
    The applications of PBPK models in  RfD and RfC derivations are very similar (U.S. EPA, 2006a)
and therefore the  present article describes only the approaches for applying PBPK  models in  RfC
derivation.  An RfC value corresponds to  an estimate (with uncertainty spanning perhaps an order of
magnitude) of continuous inhalation exposure (mg/m3) for a human population, including sensitive
subgroups,  that is likely to be without an appreciable risk of deleterious noncancer effects during a
lifetime (U.S. EPA, 1994). Notationally, RfC is defined as:
                                        RfC = POD[HEC]/UF

    where  POD[HEC]  is the POD  (NOAEL, LOAEL,  or BMC) dosimetrically  adjusted to a human
equivalent concentration (HEC), and UF represents uncertainty factors to account for the extrapola-
tions associated with the POD (i.e., interspecies differences  in sensitivity, human intraspecies vari-
ability, subchronic-to-chronic extrapolation, LOAEL-to-NOAEL extrapolation, and incompleteness
of database).
    The starting point for an RfC derivation is the identification of the POD for the critical effect in a
key study. Subsequent steps involve (a) adjustment for the difference in duration between experi-
mental exposure (e.g.,  6 h) and expected  human  exposure (24 h), (b) calculation of the HEC, and
(c) application of uncertainty factors (UFs). Briefly  described  in this section are the default methods
often employed by the U.S. EPA in performing the aforementioned steps in RfC derivation, and the
potential benefits afforded by PBPK models in performing the same steps.

-------
    RfC Derivation: Determining a Point of Departure  Points  of departure  are derived  from
human epidemiological studies or animal toxicological studies. A POD for RfC derivation cannot be
identified or established with only pharmacokinetic data or PBPK models in the absence of dose-
response data.  Integrated pharmacokinetic-pharmacodynamic  models (Gearhart et al.,  1990,
1994; Timchalk et al., 2002) may be capable of predicting response and thus estimating a POD  in
the future, and are an area of ongoing research.
    Typically, the POD used in the RfC process would be inhalation route specific, and would cor-
respond to exposure concentrations in an experimental or field study (NOAEL, LOAEL) or would be
derived from statistical analysis of dose-response data to obtain a BMC or its lower confidence
limit (95th percentile), the BMCL, associated with a specified response level (generally in the range of 1
to 10% above background; e.g., BMCL10%) (U.S. EPA, 1994, 2000a).
    RfC Derivation: Route-to-Route Extrapolation   When information on  the POD is  available
only for a noninhalation route of exposure  (e.g., oral route), route-to-route extrapolation may be
conducted (Pauluhn, 2003). Historically, the NOAEL (mg/kg/d) associated with an oral exposure
route was converted to milligrams per day and then to the equivalent inhaled concentration on the
basis of human breathing rate and body weight. Data on the route-specific fraction absorbed, when
available, are used to determine the equivalent inhalation concentration on the basis of equivalent
absorbed doses (U.S. EPA, 1999a). Such simplistic approaches, however, assume that the rates  of
ADME and tissue dosimetry of chemicals are the same for a given total dose, regardless of the expo-
sure route and intake rate. These approaches neglect the route-specific differences in pharmacoki-
netics,  such  as first-pass clearance.  First-pass clearance arises when chemicals  undergo extensive
metabolism in tissues at portals of entry; this may include the intestines and liver for orally absorbed
compounds  or the lungs  for inhaled compounds (Benet et al.,  1996). Therefore, route-to-route
extrapolation using a more complete pharmacokinetic modeling approach, as previously described
in the Route-to-Route Extrapolation section, is preferable.
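
For reference, the historical default conversion described above amounts to the following arithmetic. This is a sketch only: the function name is illustrative, and the default body weight (70 kg), breathing rate (20 m3/d), and absorption fractions are assumptions that would be replaced by assessment-specific values.

```python
def oral_noael_to_inhalation_conc(
    noael_mg_per_kg_day,
    body_weight_kg=70.0,         # assumed default adult body weight
    breathing_rate_m3_day=20.0,  # assumed default adult breathing rate
    oral_absorbed_fraction=1.0,
    inhaled_absorbed_fraction=1.0,
):
    """Equivalent inhaled concentration (mg/m3) on an absorbed-dose basis."""
    absorbed_mg_per_day = noael_mg_per_kg_day * body_weight_kg * oral_absorbed_fraction
    return absorbed_mg_per_day / (breathing_rate_m3_day * inhaled_absorbed_fraction)


print(oral_noael_to_inhalation_conc(1.0))  # 3.5
```

The calculation treats a given absorbed dose as equivalent regardless of route, which is precisely the assumption that ignores first-pass clearance.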
    RfC Derivation: Duration Adjustment   An RfC addresses continuous exposure of human
populations, so the POD used in its derivation should correspond to 24-h/d exposures (U.S. EPA,
1994). PODs are frequently obtained from animal exposures or occupational exposures that occur
for 6 to 8 h/d, 5 d/wk, so an adjustment to a continuous 24-h exposure, resulting in a lower
concentration for continuous exposures, is conducted on the basis of hours per day and days per week
(i.e., 6/24 × 5/7) (U.S. EPA, 2002). This simple adjustment assumes that "Haber's Rule" applies, i.e.,
that for a given chemical C × t = k, where C and t are the concentration (mass per unit volume) and
time needed (at that concentration) to produce some adverse effect, and k is a constant associated
with that adverse effect. This approach hypothesizes that doubling the concentration will halve the
time needed to produce a comparable effect level. In pharmacokinetics, the integral of concentration
over the exposure-response time frame of interest (C × t for a constant concentration) is also referred
to as the AUC. If the AUC is not the dose metric most associated with the adverse effect, or if various
C × t = k regimens do not result in a comparable effect level, then Haber's Rule is not applicable
(U.S. EPA, 2002). For example, when data indicate that a given toxicity is more dependent on
concentration than on duration (time), this adjustment would not be used. If the appropriate measure of
internal dose is uncertain, the U.S. EPA uses adjustment to a continuous inhalation exposure based on
the C × t relationship as a matter of health-protective policy (U.S. EPA, 2002). For additional insights
into Haber's Rule (as one in a family of power functions) and its use in risk assessment, the reader is
referred elsewhere (Miller et al., 2000).
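
The default adjustment based on hours per day and days per week is a simple proportional scaling. The sketch below uses an illustrative function name; the 50 ppm, 6 h/d example mirrors the arithmetic given in the Figure 5 caption (6 h × 50 ppm equals 24 h × 12.5 ppm).

```python
def duration_adjusted_conc(conc, hours_per_day, days_per_week=7.0):
    """Continuous-exposure equivalent under the C x t = k assumption."""
    return conc * (hours_per_day / 24.0) * (days_per_week / 7.0)


daily_only = duration_adjusted_conc(50.0, 6.0)       # 6 h/d, every day
workweek = duration_adjusted_conc(50.0, 6.0, 5.0)    # 6 h/d, 5 d/wk
print(f"{daily_only:.2f} ppm, {workweek:.2f} ppm")
```

For comparison, the PBPK-based example in Figure 5C gives 9.7 ppm for a 4-h, 50 ppm exposure of rats, rather than the 8.3 ppm that this proportional rule would give for the same scenario.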
    PBPK models may be used to estimate the value of a proposed internal dose metric that
would result from various administered doses (Jarabek et al., 1994; U.S. EPA, 2002). PBPK models
often do not address pharmacodynamic events and assume that these events do not alter the
kinetics for within-day exposures (<24 h). Consistent with U.S. EPA policy (U.S. EPA, 2002), the
dose metric  of a chemical for the exposure scenario of the critical study is initially determined
using the PBPK model (e.g., 6 h/d, 5 d/wk);  then the  atmospheric concentration for a continuous
exposure (24 h/d) during  a lifetime  or a particular window of exposure that yields the same dose
metric is determined by iterative  simulation  (see example presented earlier in the Duration
Adjustment section).

-------
    RfC Derivation: Dosimetric Adjustment Factor (Interspecies Extrapolation)   In the RfC pro-
cess, a dosimetric adjustment factor (DAF) is applied to the duration-adjusted POD to account for
pharmacokinetic differences between test species and humans to derive HEC (U.S. EPA, 1994). The
DAF depends on the nature of the  inhaled toxicant and the MOA as well as the endpoint (local
effects vs. systemic effects). Dosimetry data, if available, in the test animals and humans (including
deposition data, region-specific dosimetry, blood concentration of systemic toxicants) are used to
estimate the DAF. In the absence of such data,  knowledge of critical parameters or mathematical
models in the test species and humans is useful in estimating the DAF.
    For highly reactive  or water-soluble  gases that do  not significantly  accumulate in blood (e.g.,
hydrogen fluoride, chlorine, formaldehyde, volatile organic esters), the DAF is derived from estimates
of the delivery of chemical to different regions of the respiratory tract, based on regional mass trans-
fer coefficients and differences in surface area and ventilation rates (U.S. EPA,  1994). For poorly
water-soluble gases that produce remote effects (e.g., xylene,  toluene,  styrene), PBPK models are
identified as the preferred approach. Absent a PBPK model,  the DAF is  determined on the basis of
the ratio of blood:air partition coefficients in animals and humans (U.S. EPA, 1994). For gases that
are water soluble with some blood accumulation (e.g., acetone, ethyl acetate, ozone, sulfur diox-
ide, propanol, isoamyl alcohol) and that have the potential for both  respiratory and  remote effects,
some combination of the above approaches may be used.
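
For the case in which no PBPK model is available for a poorly water-soluble gas, the partition-coefficient-based DAF can be sketched as follows. The direction of the ratio (animal over human) and all numerical values are assumptions made for illustration; the RfC guidance (U.S. EPA, 1994) should be consulted for the actual procedure and any limits placed on the ratio.

```python
def hec_from_partition_ratio(pod_adj, blood_air_animal, blood_air_human):
    """Human equivalent concentration from a duration-adjusted POD.

    The DAF is taken here as the animal-to-human ratio of blood:air
    partition coefficients (assumed direction, see lead-in).
    """
    if blood_air_human <= 0:
        raise ValueError("blood:air partition coefficient must be positive")
    daf = blood_air_animal / blood_air_human
    return pod_adj * daf


# Hypothetical inputs: POD in mg/m3 and unitless partition coefficients.
hec = hec_from_partition_ratio(pod_adj=10.0, blood_air_animal=18.0, blood_air_human=12.0)
print(hec)  # 15.0
```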
    An alternative to the use of DAFs, discussed in the RfC guidance (U.S. EPA, 1994), is to employ
more elaborate or chemical-specific  models to make interspecies extrapolations. Various  computa-
tional tools are available to determine the uptake and deposition of gases and PM in nasal pathways
and the respiratory tract (Kimbell et al., 1993; Jarabek et al., 1994; Asgharian et al., 1995; Bush et al.,
1998; Iran et al., 1999; Hanna et al., 2001; Bogdanffy and Sarangapani, 2003; U.S. EPA, 2004). PBPK
models are frequently used for systemically distributed gases and vapors, but in conjunction with other
models (e.g., CFD), they may be used for locally acting gases with contact site effects. A limitation of
DAFs is that they do not account for metabolism of gases to more reactive  moieties, so PBPK modeling
approaches would clearly be preferable for these compounds if adequate data are available.
    RfC Derivation: Application of Uncertainty Factors  The uncertainty  and variability factors
(UFs) used in RfC and RfD  derivation account for extrapolations:  (1) from test animals to humans
(interspecies, UFA), (2)  for variability within the human population to  protect the most sensitive
population (intraspecies variability, UFH), (3) across duration of exposure (subchronic to chronic,
UFS), (4) from LOAEL to NOAEL (UFL), and (5) for poor-quality or missing data in the scientific literature
(database deficiency, UFD) (Jarabek et al., 1994; U.S. EPA, 1994). The product of all UFs
generally should not exceed 3000 (U.S. EPA, 2002); otherwise, there may not be sufficient data for
quantitative analysis. If the NOAEL for a chemical with an adequate database was identified in a
chronic study, only the  UFA and UFH are  used in  the assessment. The conventional default value for
UFA of 10 is  used in RfC and RfD derivation as an approximation of cross-species scaling resulting in
equivalent effects. Similarly, the default value for UFH  of 10 is presumed adequate to account for
variability in the human kinetic and dynamic processes following exposure and to protect poten-
tially sensitive human subpopulations. While the  incorporation of data in UF will reduce the uncer-
tainty in the extrapolation, the data may  not support a  reduction of the  uncertainty factor value, as
the data may demonstrate differences greater than or less than the default value.
    The values for UFA and UFH arising from historical use and science policy are supported by
empirical  information  for  pharmacokinetics and  pharmacodynamics  (e.g., isoenzyme levels,
enzyme activity levels, tissue volumes, breathing rates, cell proliferation rates) (Dorne et al., 2001a,
2001b, 2002). Extrapolations across species or estimates of interindividual variability (e.g., differences
arising from genetic polymorphisms), however, are best done on the basis of chemical-
specific determinants of disposition and effects. Initially, evaluation of various specific determinants
of interspecies differences or human variability is useful, but simple pooling of these determinants
without accounting for covariance or nonlinear interactions may lead  to unrealistic  estimates for
either UFA or UFH (Lipscomb,  2004). The net impact of various determinants on the UFA and UFH is
more properly evaluated within  the integrated  and physiologically based  context of a PBPK or
biologically based dose-response (BBDR) model.
    When data are available to go beyond default uncertainty values, these UFs are subdivided into
their toxicokinetic and  toxicodynamic components (IPCS, 2005; U.S. EPA, 2005b). The World
Health Organization's International Programme on  Chemical Safety  (IPCS)  produced guidance on
the development of chemical-specific adjustment factors (CSAFs) (IPCS, 2005). Although the princi-
ples of using chemical-specific data in developing values for UFs have long been  endorsed by the
U.S. EPA (U.S. EPA, 1994), and many of the guiding principles in the IPCS document are also com-
ponents of the U.S. EPA risk assessment approach, the U.S. EPA does not use CSAFs per se, due in
part to differences in calculation methods. For instance, the U.S. EPA often separates the pharmacokinetic
and pharmacodynamic components of the UFA equally (i.e., 10^0.5 or 3.16, generally
rounded to 3 each), whereas the IPCS advocates 10^0.6 (4.0) and 10^0.4 (2.5), respectively (IPCS,
2005; U.S. EPA, 2005c).
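
    The arithmetic behind the two subdivision conventions can be checked with a few lines of code. This
is only a sketch; the variable names are ours and are not taken from either guidance document.

```python
import math

ufa_default = 10.0

# U.S. EPA convention: equal half-log split of the default factor.
epa_pk = epa_pd = math.sqrt(ufa_default)  # 10^0.5

# IPCS convention: unequal split into toxicokinetic and toxicodynamic parts.
ipcs_tk = ufa_default ** 0.6
ipcs_td = ufa_default ** 0.4

print(f"EPA:  {epa_pk:.2f} x {epa_pd:.2f} = {epa_pk * epa_pd:.1f}")
print(f"IPCS: {ipcs_tk:.2f} x {ipcs_td:.2f} = {ipcs_tk * ipcs_td:.1f}")
```

    Both conventions recover the default factor of 10 before rounding. Once each half-log subfactor is
rounded to 3, the product of the two rounded values is 9 rather than 10.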
    When sufficient chemical-specific data are available for PBPK  modeling, such models are useful
for characterizing the magnitude of the pharmacokinetic component of the UFA as well as the UFH
used in the RfC and RfD processes. When using PBPK models to adjust for pharmacokinetic differ-
ences between species,  a factor of 3 (one-half order of magnitude) is generally retained to account
for remaining uncertainties (U.S. EPA, 1994, 2003; Jarabek, 1995a; Clewell et al., 2002a). How-
ever, chemical-specific information on the pharmacodynamic aspect of inter- and intraspecies dif-
ferences may inform a further reduction or increase of these UFs from default values.  It should be
recognized that PBPK and BBDR models are not currently suitable for predicting the magnitude of
LOAEL-NOAEL, subchronic-chronic, or database UFs, although research in these  areas is ongoing.
In addition, issues related to whether UFs are most appropriately  applied before or after derivation
of a HEC or human equivalent dose continue to be explored (Clewell et al., 2002a).

    Applications of PBPK Models in Cancer Risk Assessment
    The dose-response assessment portion of cancer risk assessment may vary, depending on MOA
considerations. A CSF is based on a linear extrapolation from the  POD (i.e., high-dose to low-dose
extrapolation), or a nonlinear analysis may be applied (U.S. EPA, 2005a). Either approach may also
require  interspecies or  route-to-route extrapolations for the POD. An integrated  PBPK-BBDR
model could likely improve the characterization of a chemical carcinogen's dose-response relation-
ships (e.g., a PBPK model coupled to  a clonal expansion and progression model); however,  most
such coupled models are still in the developmental  stage.  PBPK models improve estimation of the
internal dose metric for a chemical carcinogen and play  an important role in making explicit or
reducing the uncertainties associated with the high-dose to low-dose, interspecies, route to route,
and intraspecies extrapolations used in the cancer risk assessment process.
    Cancer Risk Assessment: High-Dose to Low-Dose Extrapolation   The oral CSF or the IUR is
determined by modeling the relationship between the cancer response and the administered  dose
or exposure concentration (U.S. EPA, 2005a). According to the cancer guidelines  outlined by  U.S.
EPA, either a nonlinear  or linear extrapolation from the POD is conducted, as appropriate for the
MOA of the carcinogen  (U.S. EPA, 2005a). The use of internal dose or delivered dose in such  anal-
ysis has been encouraged.
    Because high doses of chemicals are often administered in rodent cancer bioassays, the number
of tumors observed in such studies is not always directly proportional to the exposure dose. Thus,
the dose-response relationships may appear complex, in part due to  nonlinearity in the pharmaco-
kinetic processes occurring at high exposure doses. In other words, the target tissue dose of the
toxic moiety may be disproportional to the administered doses used in animal bioassays. Therefore,
dose-response analysis based on an appropriate dose metric may result in linearization of the relationship
(Andersen et al., 1987; Clewell et al., 2002a). The slope factor derived using the dose metric-response
curve has units of (dose metric)^-1. For nonlinear analyses, a POD based on an applied
or external exposure can be converted to  an equivalent internal dose metric and subsequently run
to predict the corresponding applied or external exposure dose in humans.
    Cancer Risk Assessment: Interspecies Extrapolation   For gases and PM,  the default proce-
dure for interspecies extrapolation involves  the derivation of an  HEC  (U.S.  EPA, 1994; Jarabek,
1995a, 1995b).  For oral exposures, when a PBPK model is not available, the U.S. EPA performed
interspecies scaling of doses according to body mass raised to the three-fourths power (BW^0.75) (U.S.
EPA, 2002, 2005a). This procedure presumes that equal doses in these units (i.e., in mg/kg^0.75/d),
when administered daily over a lifetime, will result in equal lifetime cancer risks across mammalian
species. The three-fourths power scaling relationship (sometimes called Kleiber's law, from his original
proposition in a 1932 article) is generally attributed to differences in rates of basal metabolism.
There remains considerable dissent as to the generality of the BW^0.75 scaling factor, the underlying
biological rationale, and the value of the exponent (e.g., some proponents advocate a BW^0.67
scaling based on surface area differences), particularly for toxicological effects of xenobiotic
chemicals in contrast to endogenous anabolic and catabolic processes (Agutter & Wheatley, 2004).
Nonetheless, BW^0.75 scaling remains the current U.S. EPA default approach for oral cancer assessments
(U.S. EPA, 2005a).
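
    To make the scaling concrete: if doses expressed in mg/kg^0.75/d are assumed equal across species,
then a dose D in mg/kg-d satisfies D_animal x BW_animal^0.25 = D_human x BW_human^0.25. The sketch below
applies that relationship. The function name, body weights, and dose are hypothetical values chosen
for illustration only.

```python
def scale_oral_dose(animal_dose_mg_per_kg_d: float,
                    bw_animal_kg: float,
                    bw_human_kg: float,
                    exponent: float = 0.75) -> float:
    """Scale an animal dose (mg/kg-d) to a human dose (mg/kg-d).

    Equal doses in mg/kg^exponent/d in both species imply
        D_h = D_a * (BW_a / BW_h) ** (1 - exponent).
    """
    return animal_dose_mg_per_kg_d * (bw_animal_kg / bw_human_kg) ** (1.0 - exponent)


# Hypothetical inputs: 0.25-kg rat, 70-kg human, 10 mg/kg-d animal dose.
hed = scale_oral_dose(10.0, bw_animal_kg=0.25, bw_human_kg=70.0)
hed_surface_area = scale_oral_dose(10.0, 0.25, 70.0, exponent=0.67)
print(f"BW^0.75 scaling: {hed:.2f} mg/kg-d")
print(f"BW^0.67 scaling: {hed_surface_area:.2f} mg/kg-d")
```

    Exposing the exponent as a parameter makes it easy to see how sensitive the scaled dose is to the
choice between the 0.75 and 0.67 conventions discussed above.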
    The nature and slope of the dose-response relationship for carcinogens may not be identical in
test species  and humans  due to pharmacokinetic and pharmacodynamic differences (Monro,
1994). If appropriate data  are available in both the test species and humans (e.g., tissue or  blood
concentrations), then interspecies extrapolations of an equivalent carcinogenic or safe dose may be
conducted. In the absence of a complete data set, PBPK models provide a means to  characterize
the relationship between the applied dose and  the internal  dose of a carcinogen in the species of
interest for subsequent extrapolation to humans (Andersen et al., 1987).
    Cancer Risk Assessment: Intraspecies Extrapolation   Cancer risk assessments have not gen-
erally considered intraspecies variability in pharmacokinetics or pharmacodynamics; however,  use
of upper bounds on a maximum  likelihood estimate (MLE)  is  generally thought to be conservative
and thus protective of susceptible populations. One advantage of using PBPK models is that  this
assumption can be tested when sufficient data are available to link risk to an identifiable pharmacokinetic
outcome or dose metric. These models also describe the impact of variations in pharmacokinetic
determinants such as polymorphisms in xenobiotic metabolizing enzymes. For instance,
interindividual variability in the disposition of methylene chloride, including polymorphisms in
glutathione transferase theta, was shown to affect cancer risk (Portier & Kaplan, 1989;
Casanova et al., 1996; el-Masri et al., 1999; Jonsson & Johanson, 2001).
    PBPK models are useful in evaluating differences in chemical disposition among adults and chil-
dren (Clewell et al., 2002b, 2004; Gentry et al., 2003; Price et al., 2003; Ginsberg et al., 2004). In
this regard, the susceptibility of children might be assessed by examining whether the dose metric is
expected to  be higher in the young. In addition, recent U.S. EPA guidance suggests that additional
adjustment factors (age-dependent adjustment factors, ADAFs) to the cancer slope or unit risk value
be considered to account for enhanced susceptibility in early life (i.e., to neonates and young chil-
dren) from exposure to carcinogens exhibiting a mutagenic MOA (U.S. EPA, 2005b).  It should be
noted that ADAFs generally account for increased susceptibility related to adult/child differences in
pharmacodynamic  processes such as cell proliferation  rates  and  numbers of cells with proliferative
potential (U.S.  EPA, 2005b);  thus even if pharmacokinetic differences are  accounted for  when
extrapolating from a POD,  these adjustment factors would still likely be made for chemicals with a
mutagenic MOA unless, perhaps, the  carcinogenicity data were derived from young animals or
humans. Furthermore, when assessing the less-than-lifetime exposures occurring in childhood,  the
guidelines stipulate consideration  of adult-children  differences in key exposure factors (e.g., skin
surface area, drinking water ingestion rates) (U.S. EPA,  2005b).
    Cancer Risk Assessment: Route Extrapolation  As with RfC and  RfD derivation, PBPK mod-
els can  facilitate the conduct of route-to-route  extrapolation  by accounting for  the route-specific
rate and magnitude of absorption, first-pass effect,  and metabolism (Clewell &  Andersen, 2004).
The slope factor or the POD associated with one exposure route may be translated  into applied
dose for another exposure route by simulating the tissue dose of toxic moiety associated with  the
exposures by each route (Gerrity et al., 1990; U.S. EPA, 2000b).

    Brief Summary of Select Examples of PBPK Model Use in Risk Assessment
    Regulatory agencies in the United States, Canada, and Europe are increasingly applying PBPK mod-
els in risk assessments, though there are far more models published in the peer-reviewed literature.
Upon evaluation,  some published models are found adequate for risk  assessment applications,
while in other cases the published models reflect research efforts that did not necessarily address
the species, exposure routes, target tissues, or other factors that would be required for risk assess-
ment applications (DeWoskin et al., 2007). Examples from Canada and  Europe in which models
were evaluated and deemed adequate or inadequate for risk assessment purposes are described in
a publication from the International Workshop on the Development of Good Modeling Practice for
PBPK Models (Loizou et al., 2008). Described next are two examples where PBPK models have
been used by the U.S. EPA; more detailed examples are described elsewhere (U.S. EPA, 2006a;
DeWoskin et al., 2007).
    Example of Reference Value Derivation and Application of Uncertainty Factors  The RfD der-
ivation for  ethylene glycol  monobutyl  ether  (EGBE) compared four  dose-response  modeling
approaches: (1) a default approach, (2) a PBPK modeling approach, (3) a BMD analysis, and (4) a
combined PBPK/BMD  approach.  Each approach necessitated the application of different UFs in
deriving the RfD values (Table 4). More about this assessment can be found elsewhere (U.S. EPA,
1999b; DeWoskin et al., 2007), but it is briefly described here to demonstrate the application of
PBPK modeling in reference value derivation,  and more broadly to illustrate how various dose-
response modeling approaches can influence  UFs applied in such derivations.
    Two critical effects observed from EGBE exposure are hemolysis and hepatocellular toxicity. In
rats, the primary toxic metabolite responsible for these effects is 2-butoxyacetic acid (BAA); the
peak concentration of this  metabolite in blood (i.e., Cmax BAA) was identified as an appropriate
dose metric. In each modeling approach, the  UFD and UFS were set to 1. A value of 10 was chosen
for the UFH due to lack of data on human variability in EGBE sensitivity. For interspecies extrapola-
tion, in vivo and in vitro studies  indicate that,  pharmacodynamically, humans are less sensitive than
rats to the hematologic effects of EGBE; therefore the nonpharmacokinetic portion of the UFA was
also set to unity.
    All other UFs were impacted by the choice of modeling approach. In  the standard default and
BMD approaches, the UFA was set to 3 because no pharmacokinetic adjustments were considered
in developing a POD[HEC]. In contrast, the PBPK and combined PBPK/BMD approaches estimated a
POD[HEC] based on the dose metric BAA; thus with  interspecies differences in  pharmacokinetics
accounted for, the UFA was set to 1. With regard to extrapolating from a LOAEL to a NOAEL, UFL was
set to 3 for the default and PBPK approaches  because data indicated that the LOAEL was very near
the threshold  level for the critical effects of concern. In contrast, the UFL was set to 1 for the BMD
analyses because of the minimal and precursive nature (cell swelling) of the critical effect,  and  the
fact that a BMD05 for a minimally adverse effect is typically deemed to be equivalent to a NOAEL
for continuous data sets.
    In the default approach, the  rat LOAEL  for hepatocellular cytoplasmic changes was chosen
because it provided the more conservative RfD among the data amenable to  default approaches.
The derivation of this RfD is shown in Table 4. In the PBPK modeling approach, the RfD is based on
hemolysis, and a LOAEL of 59 mg/kg-d was chosen  as the POD from which to estimate Cmax BAA in
blood using the PBPK model developed  by Corley et al. (1994, 1997). The model predicted Cmax
BAA in rat blood of 103 µM. Subsequently, the human PBPK model simulations indicated that 7.6
mg/kg-day EGBE in drinking water would yield the same Cmax BAA of 103 µM in humans. This
human equivalent dose (HED) of 7.6 mg/kg-d was divided by the UFs in Table 4.
    BMD analysis yielded a BMD05 of 49 mg/kg-d as the POD. In a combined approach (BMD/PBPK),
the rat PBPK model was exercised to simulate blood concentrations of BAA resulting from
each of the exposure concentrations in the toxicity study. This BMD analysis yielded a BMDL05 of
64 µM, which corresponds to an HED of 5.1 mg/kg-d as the POD. Derivations of the RfD values
from both of these BMD approaches are shown in Table 4.

          TABLE 4. Comparison of RfD derivations for EGBE

                         Default            PBPK            BMD          BMD/PBPK
POD                      55 mg/kg-d         7.6 mg/kg-d     49 mg/kg-d   5.1 mg/kg-d
UFH, UFA, UFL            10, 3, 3           10, 1, 3        10, 3, 1     10, 1, 1
RfD                      0.6 mg/kg-d        0.3 mg/kg-d     2 mg/kg-d    0.5 mg/kg-d

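
    The RfD values in Table 4 follow from dividing each POD by the product of the applicable UFs. The
sketch below repeats that arithmetic using the PODs and UFs from the table. Rounding to one significant
figure is our assumption; it happens to reproduce the tabulated values but is not stated in the text.

```python
import math


def round_sig(x: float, sig: int = 1) -> float:
    """Round x to `sig` significant figures."""
    return round(x, sig - 1 - math.floor(math.log10(abs(x))))


# (POD in mg/kg-d, (UFH, UFA, UFL)) as listed in Table 4.
approaches = {
    "Default":  (55.0, (10, 3, 3)),
    "PBPK":     (7.6,  (10, 1, 3)),
    "BMD":      (49.0, (10, 3, 1)),
    "BMD/PBPK": (5.1,  (10, 1, 1)),
}

rfd = {}
for name, (pod, ufs) in approaches.items():
    composite_uf = math.prod(ufs)  # product of the individual UFs
    rfd[name] = round_sig(pod / composite_uf)
    print(f"{name:9s} POD={pod:5.1f}  UF={composite_uf:3d}  RfD={rfd[name]:g} mg/kg-d")
```

    The PODs span roughly a factor of 10 (5.1 to 55 mg/kg-d), whereas the resulting RfDs span a
narrower range (0.3 to 2 mg/kg-d) because the composite UFs differ across approaches.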
    It is important to note in  this example that the POD for the various approaches differed by
about  an  order of  magnitude depending on  whether a PBPK  model was used. However, after
applying the various UFs, the RfD values were in relatively closer agreement. This result should not
necessarily be  interpreted as coalescence around a "correct answer"  because  each POD  was
reduced by different UF values—each somewhat  subjective in  nature. Thus, which RfD value to
choose should not arise from the modeling approach per se, but rather the judicious use of data to
support a particular RfD value (PBPK/BMD  approach in this case).  Perhaps  most importantly, an
analysis of which modeling approach is truly best  will not be  possible until further methodologies
are derived for quantifying the uncertainty in the various approaches. To this end, a recent work-
shop was held to describe the state of the science as well  as research and  implementation  needs for
statistical analyses to characterize uncertainty and variability in PBPK  models (Barton et al., 2007).
    Example in Cancer Risk Assessment  Returning to the earlier example of DCM, in addition to
dose metric evaluation, the PBPK model for DCM was used for low-dose extrapolation (Andersen et
al., 1987). The model prediction of the target tissue dose of the glutathione conjugate resulting from
6-h inhalation exposures to 1-4000 ppm DCM is presented in  Figure 7. The estimation of target tis-
sue dose of DCM-glutathione conjugate by linear  back-extrapolation gives rise to a 21-fold higher
estimate than that obtained by the PBPK modeling approach. This discrepancy arises from the non-
linear behavior of DCM metabolism at high-exposure concentrations. At exposure concentrations
exceeding  300  ppm, the cytochrome P-450-mediated oxidation  pathway is saturated, giving rise to
a corresponding disproportionate increase in the flux through the glutathione conjugation pathway. By
accounting for the species-specific differences in metabolism rates and physiology in the PBPK
model, the target tissue dose for humans was estimated to be some 2.7 times lower than that for the
mouse. The target tissue dose-based slope factor was subsequently used for characterizing the cancer
risk associated with human exposures (Andersen et al., 1987; Reitz et al., 1989; Haddad et al.,
2001b). The case of DCM exemplifies how PBPK models can be used to improve the dose-response
relationship on the basis of appropriate dose metrics, thus leading to scientifically sound conduct of
interspecies and high-dose to low-dose extrapolations essential for cancer risk assessments.

[Figure 7 plot not reproduced; x-axis: PPM in air]
FIGURE 7. PBPK model predictions of glutathione (GST)-pathway metabolites in mouse liver. The three curves are for a linear extrapo-
lation from the bioassay exposures of 2000 and 4000 ppm (solid thin line), the expected tissue dose based on model parameters for the
mouse (solid thick line), and the expected dose based on human model parameters (dotted line). The curvature occurs because oxida-
tion reactions become saturated as inhaled concentration increases above several hundred ppm. Reprinted from Andersen et al. (1987), with
permission from Elsevier.
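
    The qualitative effect described above can be reproduced with a toy steady-state mass balance in
which a saturable pathway competes with a first-order pathway. The sketch below is not the DCM model of
Andersen et al. (1987). The model structure, the function name, and every parameter value are
hypothetical and unitless, and the ratio it prints should not be compared with the 21-fold value
reported for DCM.

```python
import math


def gst_flux(exposure: float, q: float = 1.0, vmax: float = 100.0,
             km: float = 10.0, kf: float = 0.5) -> float:
    """Steady-state flux through a first-order (GST-like) pathway.

    Toy mass balance: q*E = vmax*C/(km + C) + kf*C, solved for C >= 0.
    Rearranged: kf*C^2 + (kf*km + vmax - q*E)*C - q*E*km = 0.
    All parameters are hypothetical and unitless.
    """
    a = kf
    b = kf * km + vmax - q * exposure
    c = -q * exposure * km
    conc = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)  # positive root
    return kf * conc


high, low = 4000.0, 1.0
linear_estimate = gst_flux(high) * (low / high)  # proportional back-extrapolation
model_estimate = gst_flux(low)                   # direct low-exposure prediction
print(f"linear/model ratio at low exposure: {linear_estimate / model_estimate:.1f}")
```

    Because the saturable pathway removes most of the chemical at low exposures, proportional
back-extrapolation from a high exposure overstates the low-exposure flux through the first-order
pathway in this toy model.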


    OTHER APPLICATIONS OF PBPK MODELS IN RISK ASSESSMENT

    Use of Pharmacokinetic Data and Models in Variability and Uncertainty Analysis
    There is increasing focus on the  application of variability and uncertainty analyses with PBPK
models. Although characterizing the variability and uncertainty in a PBPK model may be considered
an aspect of model development, such analyses also have the potential to improve risk assessments
by characterizing the uncertainty in the overall characterization of risk. For instance, evaluation
of alternative dose metrics (e.g., Cmax vs. AUC) is useful in characterizing the uncertainty in the use
of a PBPK model; examples include  DCM (Andersen et al., 1987) and retinoic acid (Clewell et al.,
1997). Likewise, uncertainty and variability may also be addressed through  specialized  analyses
such as probabilistic simulations using distributions for physiological, biochemical, and  physico-
chemical parameters as well as measurement error contributing to  uncertainty. Approaches for
incorporating human variability in risk assessment are reviewed in the section Estimating Intraspecies
Variability; both variability and uncertainty in PBPK models are discussed in greater detail elsewhere
(U.S. EPA, 2006a; Chiu et al., 2007; Barton et al., 2007).

    Use of Pharmacokinetic Data and Models in Exposure Assessment and  Interpretation
    of Biomonitoring Data
    The conventional approach to exposure assessment involves the calculation of applied dose for
each  route of exposure based on information about the concentration  of the chemical in the
medium, frequency  and duration of  exposure, rate of contact with the medium, and body weight
of the individual (Paustenbach, 2000). As described throughout this article, when sufficient data are
available, PBPK models allow for the prediction of absorbed or internal dose when the exposure
concentration is known (as in experimental  settings). Once an internal dose is calculated  for a given
species, route or duration, the  model is adjusted for alternative species and exposure scenarios and
essentially exercised in reverse in order to determine exposure concentrations associated  with the
internal dose metric. From this approach, it  is evident that PBPK models have the potential to utilize
pharmacokinetic measurements or other biomarker data  to reconstruct  previous exposure(s) to
environmental toxicants or interpret biomonitoring data (Krishnan et al., 1992; Fennell et al., 1992;
Csanady et al., 1996;  Timchalk et al.,  2001, 2004; Tan et al., 2006). Indeed, such models were
used to establish biological exposure indices (e.g., breath or blood concentrations) to protect
workers from harmful exposures to solvents (Perbellini et al., 1990; Leung, 1992; Kumagai &
Matsunaga, 1995; Thomas et al., 1996; Droz et al., 1999) or in epidemiology studies to reconstruct
human exposures over time (Roy & Georgopoulos, 1998; Canuel et al., 2000). In this regard, com-
prehensive PBPK models are being developed that provide estimates of an internal tissue dose from
multiroute (oral, inhalation, dermal) or multichemical exposures (Roy et al., 1996; Rao & Ginsberg,
1997; Corley et al., 2000; Liao et al., 2002; Levesque et al., 2002).
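
    Running a model "in reverse," as described above, amounts to finding the exposure at which the
forward prediction matches a target internal dose. A minimal sketch using bisection follows. The
forward function is a hypothetical stand-in for a human PBPK model, and the target value and search
bounds are likewise invented for illustration.

```python
from typing import Callable


def reverse_dosimetry(forward: Callable[[float], float], target: float,
                      lo: float, hi: float, tol: float = 1e-9) -> float:
    """Find the exposure whose predicted internal dose equals `target`.

    Assumes `forward` is continuous and increasing on [lo, hi] and that
    forward(lo) <= target <= forward(hi). Uses simple bisection.
    """
    if not forward(lo) <= target <= forward(hi):
        raise ValueError("target is not bracketed by [lo, hi]")
    while hi - lo > tol * max(1.0, abs(hi)):
        mid = 0.5 * (lo + hi)
        if forward(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def toy_human_model(exposure_ppm: float) -> float:
    """Hypothetical stand-in for a human PBPK model (saturable uptake)."""
    return 50.0 * exposure_ppm / (200.0 + exposure_ppm)


measured_internal_dose = 10.0  # hypothetical biomarker-derived dose metric
exposure = reverse_dosimetry(toy_human_model, measured_internal_dose, 0.0, 1000.0)
print(f"estimated exposure: {exposure:.2f} ppm")
```

    Bisection is used here only because it needs nothing beyond a monotonic forward model. A real
exposure reconstruction would also have to address timing of exposure and parameter uncertainty.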

    Use of Pharmacokinetic Data and Models in Risk Assessment of Chemical Mixtures
    PBPK models facilitate risk assessment of chemical mixtures by estimating the change in dose
metrics due to multichemical interactions (Haddad et al., 2001a). Tissue dosimetry-based assessments
for mixtures require adequately evaluated PBPK models for the mixture (in the test species
and in humans), as well as dose-response values for the individual chemicals (e.g., CSF, RfD, RfC).
An approach for using PBPK models in risk assessment of mixtures of systemic toxicants or carcinogens
exhibiting a threshold mechanism of action would consist of (Haddad et al., 2001a):
1.  Characterizing the dose metrics associated with dose-response values for the mixture components.
2.  Obtaining predictions of dose metrics of each mixture component in humans, based on informa-
   tion on exposure levels provided as input to the mixture PBPK model.
3.  Determining the sum total of the  ratios of the results of steps (1) and  (2) for each component
   during mixed exposures for each target organ or critical effect.
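
    The three steps above can be sketched as a sum of ratios, in the spirit of a hazard index computed
on internal dose metrics. The direction of the ratio (mixture dose metric divided by the reference dose
metric) is our assumption, and the function name, chemical names, and numbers are hypothetical.

```python
def hazard_index(reference_metrics: dict, mixture_metrics: dict) -> float:
    """Sum, over components, of (dose metric in mixture) / (reference dose metric).

    reference_metrics: step (1), dose metric associated with each component's
                       dose-response value (e.g., at its RfD or RfC).
    mixture_metrics:   step (2), dose metric predicted in humans by the
                       mixture PBPK model at the exposure levels of interest.
    """
    return sum(mixture_metrics[chem] / reference_metrics[chem]
               for chem in reference_metrics)


# Hypothetical dose metrics (same units within each component) for one target organ.
reference = {"chemical_A": 4.0, "chemical_B": 10.0, "chemical_C": 2.5}
in_mixture = {"chemical_A": 1.0, "chemical_B": 3.0, "chemical_C": 0.5}

hi_value = hazard_index(reference, in_mixture)
print(f"hazard index = {hi_value:.2f}")
```

    In practice the calculation would be repeated for each target organ or critical effect, using
mixture dose metrics that already reflect any pharmacokinetic interactions among the components.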

    Similarly, for carcinogens with a slope factor (Haddad et al., 2001a):

1.  The dose metric-based slope factor is  established for each component using the animal PBPK
   model.
2.  The dose metric associated with human exposure concentrations is established using mixture
   PBPK models.
3.  The results  of  steps (1)  and (2)  are combined to determine the potentially altered cancer
   response during mixed  exposures.

    Risk assessments  based on  the  use of PBPK models for single chemicals and mixtures, as
detailed in  previous sections, account for only the pharmacokinetic aspect or,  more specifically,
target tissue exposure to toxic moiety. If these tissue exposure simulations are combined with phar-
macodynamic models, then  better characterization of dose-response  relationships and prediction of
PODs (NOAEL, BMD, and BMC) may become possible.

    Linkage of PBPK Models With Pharmacodynamic Models
    The identification of PODs by simulation may become possible with the use of BBDR models.
These models would require the linkage of quantitative descriptions of pharmacokinetics and phar-
macodynamics via mechanism of action. Accordingly, the output of PBPK models is linked to the
pharmacodynamic (PD) model  using an equation that reflects the researcher's hypothesis of how
the toxic chemical participates  in the initiation  of cellular  changes leading to measurable toxic
responses. For example, certain DNA adducts induce mutations, some metabolites kill individual
cells, and expression  of growth factors can stimulate cell proliferation. In  each of these cases, the
temporal change  in  the dose metric  simulated by the  PBPK model is linked with mathematical
descriptions of the process of adduct formation, cytotoxicity, or proliferation in the BBDR models to
simulate the quantitative influence of these processes on tumor outcome.
    For example, using a PD model, the rate of chloroform metabolism (µmol/g liver/h) was related
to the fraction of liver cells killed (Page et al., 1997). In this model, a series of differential equations was
used to simulate time-dependent changes in the number of hepatocytes in the liver as a function of
basal  rates of cell division and death, chloroform-induced cytolethality, and regenerative replica-
tions (Conolly & Butterworth, 1995; Page et al., 1997).
    Table 5 presents  a list  of PD models for  cancer and noncancer endpoints. A characteristic
of  several of these PD models is that they are able to simulate the normal physiological processes
(e.g., cell proliferation rates, hormonal cycle) and additionally account for the ways in which chemicals
perturb such phenomena, leading to the onset and progression of injury. However, PD
models that are linked  with PBPK models are not available for a number of adverse effects and
modes of action. This situation  is a result, in part, of the complex nature  of these models and the
extensive data requirements for development and evaluation of these models for various exposure
and physiological conditions.
    With the availability of integrated pharmacokinetic-pharmacodynamic (PK-PD) models, the sci-
entific basis of the process of estimating/extrapolating PODs and characterizing the dose-response
curve will be significantly enhanced. Additionally, such a modeling framework will facilitate a quan-
titative analysis of the impact of pharmacodynamic determinants on the toxicity outcome, such that
the magnitude of the pharmacodynamic component of the interspecies and intraspecies factors
may be characterized more confidently. Even though some PBPK  models have been used in RfD,
RfC, and unit risk estimate derivation, the need for applying such  models (where possible) should
be continuously explored.
TABLE 5. Examples of biologically based models of relevant endpoints and toxicological processes

Toxicity endpoint
or process          Features                                Chemical studied           References

Cancer              Simulation of relative roles of         2-Acetylaminofluorene      Conolly et al. (2003); Tan et al. (2003);
                      initiation, promotion, cytolethality, Chloroform                   Thomas et al. (2000); Conolly and
                      and proliferation                     Dimethylnitrosamine          Andersen (1997); Conolly and Kimbell
                                                            Formaldehyde                 (1994); Chen (1993); Luebeck et al.
                                                            Polychlorinated biphenyls    (1991); Cohen and Ellwein (1990);
                                                            Pentachlorobenzene           Moolgavkar and Luebeck (1990);
                                                            Saccharin                    Moolgavkar and Knudson (1981);
                                                                                         Moolgavkar and Venzon (1979);
                                                                                         Armitage and Doll (1957)

Cholinesterase      Simulation of dose-dependent            Organophosphates           Timchalk et al. (2002); Gearhart et al.
  inhibition          inhibition of plasma cholinesterase,                               (1994, 1990)
                      red blood cell acetylcholinesterase
                      and brain acetylcholinesterase,
                      and nontarget B-esterase

Developmental       Simulation of altered cell kinetics as  Methylmercury              Faustman et al. (1999); Leroux et al.
  toxicity            the biological basis of developmental                              (1996)
                      toxicity

Estrus cycle        Simulation of interactions of estradiol Endocrine-modulating       Andersen et al. (1997)
                      and luteinizing hormone                 substances

Gene expression     Simulation of induction of CYP1A1/2     Tetrachlorodibenzodioxin   Santostefano et al. (1998)
                      protein expression in multiple tissues

Granulopoiesis      Simulation of loss of proliferating     Cyclophosphamide           Steinbach et al. (1980)
                      cells and loss of functional cells

Nephrotoxicity      Simulation of induction of renal        2,2,4-Trimethyl-2-phenol   Kohn and Melnick (1999)
                      α2µ-globulin in male rat kidney as a
                      function of proteolytic degradation
                      and hepatic production

Teratogenic effect  Sensitivity distribution of embryo      Hydroxyurea                Luecke et al. (1997)
                      as a function of age and stage of
                      development

    SUMMARY
    Practical computer implementation of PBPK modeling has been feasible for over 20 years. Despite
an early impact of PBPK modeling on the risk assessment of dichloromethane, PBPK model usage in
risk assessment has been relatively sparse. Today, however, there is growing interest, willingness,
and expertise to apply these models in health risk assessments. Scientific progress in the
understanding of lifestage and genetic differences in dosimetry and their impacts on variability in
susceptibility, together with ongoing development of analytical methods to characterize the
uncertainty in PBPK models, makes future PBPK model use in risk assessment increasingly likely.
As such, it is anticipated that when PBPK models are used to express adverse tissue responses in
terms of the internal target tissue dose of the toxic moiety rather than the external concentration,
the scientific basis of, and confidence in, risk assessments will be enhanced.
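
    The distinction between external concentration and internal dose can be illustrated with a
deliberately simplified sketch. It is not one of the PBPK models discussed in this review: it
assumes a single well-mixed compartment, a constant uptake rate equal to a hypothetical uptake
clearance times the external concentration, and elimination by saturable (Michaelis-Menten)
metabolism only. The function name, parameter names, and numerical values are illustrative
assumptions, not data from any cited study.

```python
def steady_state_internal_conc(c_ext, q_uptake, vmax, km):
    """Steady-state internal concentration for a toy one-compartment model.

    Assumed (hypothetical) mass balance at steady state:
        q_uptake * c_ext = vmax * c_int / (km + c_int)
    which rearranges to:
        c_int = km * q_uptake * c_ext / (vmax - q_uptake * c_ext)

    c_ext    external concentration (mg/L)
    q_uptake effective uptake clearance (L/h)
    vmax     maximum metabolic rate (mg/h)
    km       Michaelis constant (mg/L)
    """
    intake = q_uptake * c_ext  # mass entering the compartment per hour
    if intake >= vmax:
        # Metabolism cannot keep up with uptake, so no steady state exists.
        raise ValueError("uptake rate meets or exceeds Vmax; no steady state exists")
    return km * intake / (vmax - intake)


if __name__ == "__main__":
    # Illustrative parameter values only.
    for c_ext in (1.0, 5.0, 9.0):
        c_int = steady_state_internal_conc(c_ext, q_uptake=1.0, vmax=10.0, km=1.0)
        print(f"external {c_ext:4.1f} mg/L -> internal {c_int:6.3f} mg/L")
```

    With these assumed values, a fivefold increase in external concentration (1 to 5 mg/L) yields
a ninefold increase in internal concentration, which is the kind of nonlinearity that a dose
metric based on external concentration alone would not capture.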

    REFERENCES
Agutter, P. S., and Wheatley, D. N. 2004. Metabolic scaling: Consensus or controversy? Theor. Biol. Med. Model. 1:13.
Andersen, M. E., Mills, J. J., Gargas, M. L., Kedderis, L., Birnbaum, L. S., Neubert, D., and Greenlee, W. F. 1993. Modeling receptor-
    mediated processes with dioxin: implications for pharmacokinetics and risk assessment. Risk Anal. 13:25-36.
Andersen, M. E. 1995. Development of physiologically based pharmacokinetic and physiologically based pharmacodynamic models for
    applications in toxicology and risk assessment. Toxicol. Lett.  79:35-44.
Andersen, M. E., Clewell, H. J. III, Gargas, M. L., Smith, F. A., and Reitz, R. H. 1987. Physiologically based pharmacokinetics and the risk
    assessment process for methylene chloride. Toxicol. Appl. Pharmacol. 87:185-205.

-------
542                                                                                               C. M. THOMPSON ET AL
Andersen, M. E., and Dennison, J. E. 2001. Mode of action and tissue dosimetry in current and future risk assessments. Sci. Total Environ.
     274:3-14.
Andersen, M. E., Clewell, H. J., Gearhart, J., Allen, B. C., and Barton, H. A. 1997. Pharmacodynamic model of the rat estrus cycle in relation
     to endocrine disruption. J. Toxicol. Environ. Health 52:189-209.
Andersen, M., Sarangapani, R., Gentry, R., Clewell, H., Covington, T., and Frederick, C. B. 1999. Application of a hybrid CFD-PBPK
     nasal dosimetry model in an inhalation risk assessment: an example with acrylic acid. Toxicol. Sci. 57:312-325.
Andersen, M. E., Green, T., Frederick, C. B., and Bogdanffy, M. S. 2002. Physiologically based pharmacokinetic (PBPK) models for nasal
     tissue dosimetry of organic esters: Assessing the state-of-knowledge and risk assessment applications with methyl methacrylate and
     vinyl acetate. Regul. Toxicol.  Pharmacol. 36:234-245.
Andersen, M. E., and Jarabek, A.  M.  2001.  Nasal tissue dosimetry-issues and approaches for "Category 1" gases: A report on a meeting
     held in Research Triangle Park, NC, February 11-12, 1998. Inhal. Toxicol. 13:415-435.
Armitage, P., and Doll, R. 1957. A two-stage theory of carcinogenesis in relation to the age distribution of human cancer. Br. J. Cancer
     11:161-169.
Asgharian,  B., Wood, R., and  Schlesinger,  R. B., 1995. Empirical modeling of particle deposition in the alveolar region of the lungs: A
     basis for interspecies extrapolation. Fundam. Appl. Toxicol. 27:232-238.
Aylward, L. L., Hays, S. M., Karch, N. J., and Paustenbach, D. J. 1996. Relative susceptibility of animals and humans to the cancer hazard
     posed by 2,3,7,8-tetrachlorodibenzo-p-dioxin using internal measures of dose. Environ. Sci. Technol. 30:3534-3543.
Barton, H. A., Deisinger, P. J., English, J. C.,  Gearhart, J. N., Faber, W. D., Tyler, T. R., Banton, M. I., Teeguarden, J., and Andersen, M. E.
     2000. Family approach for estimating reference concentrations/doses for series of related organic chemicals. Toxicol. Sci. 54:251-261.
Barton, H. A.,  Chiu, W.  A., Setzer, R. W., Andersen, M. E.,  Bailer, A. J., Bois, F. Y., Dewoskin,  R. S., Hays, S., Johanson, G., Jones, N.,
     Loizou, G., Macphail, R. C.,  Portier, C. J., Spendiff, M., Tan, Y. M. 2007. Characterizing uncertainty and variability in physiologically-
     based pharmacokinetic (PBPK) models: State of the science and needs for research and implementation. Toxicol. Sci. 99:395-402.
Benet, L. Z., Kroetz, D. L., and Sheiner, L.  B. 1996. Pharmacokinetics: the dynamics of drug absorption, distribution, metabolism, and
     elimination. In Goodman and Gilman's The pharmacological basis of therapeutics, eds. J. G. Hardman and L. E. Limbird, pp. 3-27.
     New York: McGraw-Hill.
Benignus, V. A., Boyes, W. K., and Bushnell, P. J. 1998. A dosimetric analysis of behavioral effects of acute toluene exposure in rats and
     humans. Toxicol. Sci. 43:186-195.
Bogdanffy, M. S., Sarangapani, R., Plowchalk, D. R., Jarabek, A., and Andersen, M. E. 1999. A biologically based risk assessment for vinyl
     acetate-induced cancer and noncancer inhalation toxicity. Toxicol. Sci. 51:19-35.
Bogdanffy, M. S., Plowchalk, D. R., Sarangapani, R., Starr, T. B., and Andersen, M. E. 2001. Mode-of-action-based dosimeters for interspecies
     extrapolation on vinyl acetate inhalation risk. Inhal. Toxicol. 13:377-396.
Bogdanffy, M. S., and Sarangapani, R. 2003. Physiologically-based kinetic modeling of vapours toxic to the respiratory tract. Toxicol. Lett.
     138:103-117.
Bouchard, M., Brunet, R. C., Droz, P. O., and Carrier, G. 2001. A biologically based dynamic model for predicting the disposition of
     methanol and its metabolites in  animals and humans. Toxicol. Sci. 64:169-184.
Boyes, W. K., Bushnell, P. J., Crofton, K. M., Evans, M., and Simmons, J. E. 2000. Neurotoxic and pharmacokinetic responses to trichloroethylene
     as a function of exposure scenario. Environ. Health Perspect. 108(Suppl. 2):317-322.
Brown, R. P., Delp, M.  D., Lindstedt, S. L., Rhomberg,  L. R., and Beliles, R.  P. 1997. Physiological parameter values for physiologically
     based pharmacokinetic models. Toxicol. Ind. Health 13:407-484.
Bruckner, J. V., Keys, D. A., and Fisher, J. W. 2004. The Acute Exposure Guideline Level (AEGL)  program: Applications of physiologically
     based pharmacokinetic modeling. J. Toxicol. Environ. Health A 67:621-634.
Bush, M. L., Frederick, C. B., Kimbell, J. S., and Ultman, J. S. 1998. A CFD-PBPK hybrid model for simulating gas and vapor uptake in
     the rat nose. Toxicol. Appl. Pharmacol. 150:133-145.
Bushnell, P. J.  1997. Concentration-time relationships for the effects of inhaled trichloroethylene on signal detection behavior in rats.
     Fundam. Appl. Toxicol. 36:30-38.
Canuel,  G., Viau,  C., Krishnan, K., 2000. A modeling framework for  back-calculating ambient concentrations from data on biomarkers.
     International Conference on  Health Sciences Simulation. Proc. International Conference on Health Sciences Simulation, pp. 97-102.
     San Diego, CA.
Casanova, M., Conolly, R. B., and Heck, H. d'A. 1996. DNA-protein cross-links (DPX) and cell proliferation in B6C3F1 mice but not Syrian
     golden hamsters exposed to dichloromethane: Pharmacokinetics and risk assessment with DPX as dosimeter. Fundam. Appl.
     Toxicol. 31:103-116.
Chen, C. W. 1993. Armitage-Doll two-stage model: Implications and extension. Risk Anal. 13:273-279.
Chiu, W. A., Barton, H. A., DeWoskin, R. S., Schlosser, P., Thompson, C. M., Sonawane, B., Lipscomb, J. C., and Krishnan, K. 2007.
     Evaluation of physiologically based pharmacokinetic models for use in risk assessment. J. Appl. Toxicol. 27:218-237.
Clark, L. H., Setzer, R. W., and Barton, H. A. 2004. Framework for evaluation of physiologically-based pharmacokinetic models for use
     in safety or risk assessment. Risk Anal. 24:1697-1717.
Clarke, D. O., Duignan, J. M., and Welsch, F. 1992. 2-Methoxyacetic acid dosimetry-teratogenicity relationships in CD-1 mice exposed
     to 2-methoxyethanol. Toxicol. Appl. Pharmacol. 114:77.
Clarke, D. O., Elswick, B. A., Welsch, F., and Conolly, R. B. 1993. Pharmacokinetics of 2-methoxyethanol and 2-methoxyacetic acid in
     the pregnant mouse: A physiologically-based mathematical model. Toxicol. Appl. Pharmacol. 121:239-252.
Clewell, H. J. III, and Andersen, M. E. 1985. Risk assessment extrapolations and physiological modeling. Toxicol. Ind. Health 1:111-131.
Clewell, H. J. III, and Andersen, M. E. 1987. Dose, species and route extrapolation using physiologically-based pharmacokinetic models.
     Drink. Water Health 8:159-182.

-------
Clewell, H. J. III, and Andersen, M. E. 2004. Applying mode-of-action and pharmacokinetic considerations in contemporary cancer risk
     assessments: An example with trichloroethylene. Crit. Rev. Toxicol. 34:385-445.
Clewell, H. J. III, and Jarnot, B. M. 1994. Incorporation of pharmacokinetics in noncancer risk assessment: Example with
     chloropentafluorobenzene. Risk Anal. 14:265-276.
Clewell, H. J. III, Andersen, M. E., Willis, R. J., and Latriano, L. 1997. A physiologically based pharmacokinetic model for retinoic acid
     and its metabolites. J. Am. Acad. Dermatol. 36:S77-S85.
Clewell, H. J. III, Gentry, P. R., Covington, T. R., and Gearhart, J. M. 2000. Development of a physiologically based pharmacokinetic
     model of trichloroethylene and its metabolites for use in risk assessment. Environ. Health Perspect. 108:283-305.
Clewell, H. J., Gentry, P. R., Gearhart, J. M., Allen, B. C., and Andersen, M. E. 2001. Comparison of cancer risk estimates for vinyl chloride
     using animal and human data with a PBPK model. Sci. Total Environ. 274:37-66.
Clewell, H. J. III, Andersen, M. E., and Barton, H. A. 2002a. A consistent approach for the application of pharmacokinetic modeling in
     cancer and noncancer risk assessment. Environ. Health Perspect. 110:85-93.
Clewell, H. J., Teeguarden, J., McDonald, T., Sarangapani, R., Lawrence, G., Covington, T., Gentry, R., and Shipp, A. 2002b. Review and
     evaluation of the potential impact of age- and gender-specific pharmacokinetic differences on tissue dosimetry. Crit. Rev. Toxicol.
     32:329-389.
Clewell, H. J., Gentry, P. R., Covington, T. R., Sarangapani, R., and Teeguarden, J. G. 2004. Evaluation  of the potential impact of age-
     and gender-specific pharmacokinetic differences on tissue dosimetry. Toxicol. Sci. 79:381-393.
Cohen, S. M., and Ellwein,  L. B. 1990. Cell proliferation  in carcinogenesis. Science 249:1007-1011.
Collins, J. 1987. Prospective predictions and validations in anti-cancer therapy, pp. 431-440. Washington, DC: National Academy Press.
Conolly,  R. B.,  and  Andersen, M.  E. 1997. Hepatic foci in rats after diethyl-nitrosamine  initiation and 2,3,7,8-tetrachlorodibenzo-p-
     dioxin promotion: Evaluation of a quantitative two-cell model and of CYP 1A1/1A2 as a dosimeter. Toxicol. Appl. Pharmacol.
     146:281-293.
Conolly,  R.  B.,  and  Butterworth, B.  E.  1995. Biologically based dose response model for hepatic toxicity: A mechanistically based
     replacement for traditional estimates of noncancer risk. Toxicol. Lett. 82-83:901-906.
Conolly,  R. B.,  and  Kimbell, J. S. 1994. Computer simulation of cell  growth governed  by stochastic processes: Application to clonal
     growth cancer models. Toxicol. Appl. Pharmacol. 124:284-295.
Conolly, R. B., Kimbell, J. S., Janszen, D., Schlosser, P. M., Kalisak, D., Preston, J., and Miller, F. J. 2003. Biologically motivated computational
     modeling of formaldehyde carcinogenicity in the F344 rat. Toxicol. Sci. 75:432-447.
Corley, R. A., Bormett, G.  A., and Ghanayem, B. I. 1994. Physiologically based pharmacokinetics of  2-butoxyethanol and its major
     metabolite, 2-butoxyacetic acid, in rats and humans. Toxicol. Appl. Pharmacol. 129:61-79.
Corley, R. A., Markham, D. A., Banks, C., Delorme, P., Masterman, A., and Houle, J. M. 1997. Physiologically based pharmacokinetics
     and the dermal absorption of 2-butoxyethanol vapor by humans. Fundam. Appl. Toxicol. 39:120-130.
Corley, R. A., Gordon, S. M., and Wallace, L.  A. 2000. Physiologically based pharmacokinetic modeling of the temperature-dependent
     dermal absorption of chloroform by humans following bath water exposures. Toxicol. Sci. 53:13-23.
Corley, R. A., Mast, T. J., Carney, E. W.,  Rogers, J. M., and Daston, G. P. 2003. Evaluation of physiologically based models of pregnancy
     and lactation for their  application in children's health risk assessments. Crit. Rev. Toxicol. 33:137-211.
Cruzan, G., Carlson, G. P., Johnson, K. A., Andrews, L. S., Banton, M. I., Bevan, C., and Cushman, J. R. 2002. Styrene respiratory tract toxicity
     and mouse lung tumors are mediated by CYP2F-generated metabolites. Regul. Toxicol. Pharmacol. 35:308-319.
Csanady, G. A., Kreuzer, P. E., Baur,  C., and  Filser, J. G.  1996.  A physiological toxicokinetic model for 1,3-butadiene in rodents and
     man:  Blood  concentrations of  1,3-butadiene,  its  metabolically formed  epoxides, and of haemoglobin  adducts—Relevance  of
     glutathione depletion. Toxicology 113:300-305.
DeWoskin, R. S., Lipscomb, J. C., Thompson, C. M., Chiu, W. A., Schlosser, P., Smallwood, C., Swartout, J., Teuschler, K., and Marcus,
     A. 2007. Pharmacokinetic/physiologically based pharmacokinetic models in integrated risk information system assessments. In
     Toxicokinetics and risk assessment, eds. J. C. Lipscomb and E. V. Ohanian, pp. 301-348. New York: Informa Healthcare.
Dorne, J. L., Walton, K., and Renwick, A. G. 2001a. Human variability in glucuronidation in relation to uncertainty factors for risk
     assessment. Food Chem. Toxicol. 39:1153-1173.
Dorne, J. L., Walton, K., and Renwick, A. G. 2001b. Uncertainty factors for chemical risk assessment, human variability in the
     pharmacokinetics of CYP1A2 probe substrates. Food Chem. Toxicol. 39:681-696.
Dorne, J. L., Walton, K., Slob, W., and Renwick, A. G. 2002. Human variability in polymorphic CYP2D6 metabolism: Is the kinetic
     default uncertainty factor adequate? Food Chem. Toxicol. 40:1633-1656.
Droz, P. O., Berode, M., and Jang, J. Y. 1999. Biological monitoring of tetrahydrofuran: Contribution of a physiologically based
     pharmacokinetic model. Am. Ind. Hyg. Assoc. J. 60:243-248.
el-Masri, H. A., Bell, D. A., and Portier, C. J. 1999. Effects of glutathione transferase theta polymorphism on the risk estimates of
     dichloromethane to humans. Toxicol. Appl. Pharmacol. 158:221-230.
Faustman, E. M., Lewandowski, T. A., Ponce, R. A., and Bartell, S. M. 1999. Biologically based dose-response models for developmental
     toxicants: Lessons from methylmercury. Inhal. Toxicol. 11:559-572.
Fennell, T. R., Sumner, S. C., and Walker, V. E. 1992. A model for the formation and removal of hemoglobin adducts. Cancer Epidemiol.
     Biomarkers Prev. 1:213-219.
Fisher, J. W., and Allen, B. C. 1993. Evaluating the risk of liver cancer in humans exposed to trichloroethylene using physiological models.
     Risk Anal. 13:87-95.
Frederick, C. B., Potter, D. W., Chang-Mateu, M. I., and Andersen, M. E. 1992. A physiologically-based pharmacokinetic and
     pharmacodynamic model to describe the oral dosing of rats with ethyl acrylate and its implications for risk assessment. Toxicol. Appl.
     Pharmacol. 114:246-260.

-------
Gearhart, J. M., Jepson, G. W., Clewell, H. J. III, Andersen, M. E., and Conolly, R. B. 1990. Physiologically based pharmacokinetic and
     pharmacodynamic model for the inhibition of acetylcholinesterase by diisopropylfluorophosphate. Toxicol. Appl. Pharmacol.
     106:295-310.
Gearhart, J. M., Jepson, G. W., Clewell, H. J., Andersen, M. E., and Conolly, R. B. 1994. Physiologically based pharmacokinetic model
     for the inhibition of acetylcholinesterase by organophosphate esters. Environ. Health Perspect. 102(Suppl. 11):51-60.
Gearhart, J. M., Clewell, H. J., and Crump, K. S. 1995. Pharmacokinetic dose estimates of mercury in children and dose-response curves
     of performance tests in a large epidemiological study. Water Air Soil Pollut. 80:49-58.
Gentry, P. R., Covington, T. R., Andersen, M. E., and Clewell, H. J. 2002. Application of a physiologically based pharmacokinetic model
     for isopropanol in the derivation of a reference dose and reference concentration. Regul. Toxicol. Pharmacol. 36:51-68.
Gentry, P. R., Covington, T. R., and Clewell, H. J. III. 2003. Evaluation of the potential impact of pharmacokinetic differences on tissue
     dosimetry in offspring during pregnancy and lactation. Regul. Toxicol. Pharmacol. 38:1-16.
Gentry, P. R., Haber, L. T., McDonald, T. B., Zhao, Q., Covington, T., Nance, P., Clewell, H. J. III, Lipscomb, J. C., and Barton, H. A.
     2004. Data for physiologically based pharmacokinetic modeling in neonatal animals: Physiological parameters in mice and
     Sprague-Dawley rats. J. Child. Health 2:363-411.
Gerrity, P. G., Henry, C. J., and Birnbaum, L. 1990. Principles of route-route extrapolation for risk assessment. New York: Elsevier.
Ginsberg, G., Hattis, D., Russ, A., and Sonawane, B. 2004. Physiologically based pharmacokinetic (PBPK) modeling of caffeine and
     theophylline in neonates and adults: Implications for assessing children's risks from environmental agents. J. Toxicol. Environ. Health A
     67:297-329.
Haddad, S., Beliveau, M., Tardif, R., and Krishnan, K. 2001a. A PBPK modeling-based approach to account for interactions in the health
     risk assessment of chemical mixtures. Toxicol. Sci. 63:125-131.
Haddad, S., Restieri, C., and Krishnan, K. 2001b. Characterization of age-related changes in body weight and organ weights from birth to
     adolescence in humans. J. Toxicol. Environ. Health A 64:453-464.
Hanna, L. M., Lou, S. R., Su, S., and Jarabek, A. M. 2001. Mass transport analysis: Inhalation RfC methods framework for interspecies
     dosimetric adjustment. Inhal. Toxicol. 13:437-463.
Himmelstein, K. J., and Lutz, R. J. 1979. A review of the application of physiologically based pharmacokinetic modeling. J. Pharm.
     Biopharm. 7:127-145.
Hurst, C. H., DeVito, M. J., Setzer, R. W., Birnbaum, L. S. 2000. Acute administration of 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) in
     pregnant Long Evans rats: Association of measured tissue concentrations with developmental effects. Toxicol. Sci. 53:411-420.
IPCS. 2005. Chemical-specific adjustment factors (CSAFs) for interspecies differences and human variability in dose/concentration-
     response assessment: Guidance document for the use of data in dose/concentration-response assessment. Geneva: World Health
     Organization.
Jarabek, A. M. 1995a. Interspecies extrapolation based on mechanistic determinants of chemical disposition. Hum. Ecol. Risk Assess.
     1:641-662.
Jarabek, A. M. 1995b. The application of dosimetry models to identify key processes and parameters for default dose-response
     assessment approaches. Toxicol. Lett. 79:171-184.
Jarabek, A. M., Fisher, J. W., Rubenstein, R., Lipscomb, J. C., Williams, R. J., Vinegar, A., and McDougal, J. N. 1994. Mechanistic insights
     aid the search for CFC substitutes: Risk assessment of HCFC-123 as an example. Risk Anal. 14:231-250.
Jonsson, F., and Johanson, G. 2001. A Bayesian analysis of the influence of GSTT1 polymorphism on the cancer risk estimate for
     dichloromethane. Toxicol. Appl. Pharmacol. 174:99-112.
Kenyon, E. M., Kraichely, R.  E., Hudson, K. T., and Medinsky, M. A. 1996. Differences in rates of benzene metabolism correlate with
     observed genotoxicity. Toxicol. Appl. Pharmacol. 136:49-56.
Kim, A. H., Kohn, M. C.,  Portier, C. J., and Walker, N. J. 2002. Impact of physiologically based pharmacokinetic modeling on benchmark
     dose calculations for TCDD-induced biochemical responses. Regul. Toxicol. Pharmacol. 36:287-296.
Kimbell, J. S., Gross, E. A., Joyner, D. R., Godo, M. N., and Morgan,  K. T. 1993. Application of computational fluid dynamics to regional
     dosimetry of inhaled chemicals  in the upper respiratory tract of the rat. Toxicol. Appl. Pharmacol. 121:253-263.
Kirman, C. R., Hays, S. M., Kedderis, G. L., Gargas, M. L., and Strother, D. E. 2000. Improving cancer dose-response characterization by
     using physiologically based pharmacokinetic modeling: An analysis of pooled data for acrylonitrile-induced brain tumors to assess
     cancer potency in the rat. Risk Anal. 20:135-151.
Kohn, M. C., and Melnick, R. L. 1999. A physiological model for ligand-induced accumulation of alpha 2u globulin  in male rat kidney:
     Roles of protein synthesis and lysosomal degradation in the renal dosimetry of 2,4,4-trimethyl-2-pentanol. Toxicology 136:89-105.
Krishnan, K., and Andersen,  M. E. 2007. Physiologically based pharmacokinetic modeling in toxicology. In Principles and  methods of
     toxicology, ed. A. W. Hayes. Philadelphia: Taylor &  Francis, pp. 231-292.
Krishnan, K., and Andersen, M. E. 2001. Physiologically based pharmacokinetic modeling in toxicology. In Principles and methods of
     toxicology, ed. A. W. Hayes, pp. 193-241. Philadelphia: Taylor & Francis.
Krishnan, K., Gargas, M. L., Fennell, T. R., and Andersen, M. E. 1992. A physiologically based description of ethylene oxide dosimetry in
     the rat. Toxicol. Ind. Health 8:121-140.
Kumagai, S., and Matsunaga, I. 1995. Physiologically based pharmacokinetic model for acetone. Occup. Environ. Med. 52:344-352.
Leroux, B. C., Leisenring, W. M., Moolgavkar, S. H., and Faustman, E. M. 1996. A biologically-based dose-response model for developmental
     toxicology. Risk Anal. 16:449-458.
Leung, H. W. 1991. Development and utilization of physiologically based pharmacokinetic models for toxicological applications. J. Toxicol.
     Environ. Health 32:247-267.
Leung, H. W. 1992. Use of physiologically based pharmacokinetic models to establish biological exposure indexes. Am. Ind. Hyg. Assoc.
     J. 53:369-374.

-------
Leung, H. W., and Paustenbach, D. J. 1990. Cancer risk assessment for dioxane based upon a physiologically-based pharmacokinetic
     modeling approach. Toxicol. Lett. 51:147-162.
Levesque, B., Ayotte, P., Tardif, R., Ferron, L., Gingras, S., Schlouch, E., Gingras, G., Levallois, P., and Dewailly, E. 2002. Cancer risk
     associated with household exposure to chloroform. J. Toxicol. Environ. Health A 65:489-502.
Liao, K. H., Dobrev, I. D., Dennison, J. E., Jr., Andersen, M. E., Reisfeld, B., Reardon, K. F., Campain, J. A., Wei, W., Klein, M. T., Quann,
     R. J., and Yang,  R. S. 2002. Application of biologically based computer modeling to simple or complex mixtures. Environ. Health
     Perspect. 110(Suppl. 6):957-963.
Lipscomb, J. C. 2004. Evaluating the relationship between variance  in enzyme expression and toxicant concentration in health risk
     assessment. Hum. Ecol. Risk Assess. 10:39-55.
Lipscomb, J. C., Fisher, J. W., Confer, P. D., and Byczkowski, J. Z. 1998. In vitro to in vivo extrapolation for trichloroethylene metabolism
     in humans. Toxicol. Appl. Pharmacol. 152:376-387.
Lipscomb, J. C., Teuschler, L. K., Swartout, J., Popken, D., Cox, T., and Kedderis, G. L. 2003. The impact of cytochrome P450 2E1-
     dependent metabolic variance on a risk-relevant pharmacokinetic outcome in humans. Risk Anal. 23:1221-1238.
Loizou, G., Spendiff, M., Barton, H. A., Bessems, J., Bois, F. Y., D'Yvoire, M. B., Buist, H., Clewell, H. J. III, Meek, B., Gundert-Remy, U.,
     Goerlitz, G., and Schmitt, W. 2008. Development of good modelling practice for physiological based pharmacokinetic models for
     use in risk assessment: The first steps. Regul. Toxicol. Pharmacol. 50:400-411.
Luebeck, E. G., Moolgavkar, S. H., Buchmann, A., and Schwartz, M. 1991. Effects of polychlorinated biphenyls in rat liver:  quantitative
     analysis of enzyme-altered foci. Toxicol. Appl. Pharmacol. 111:469-484.
Luecke, R. H., Wosilait, W. D., and Young, J. F. 1997. Mathematical analysis for teratogenic sensitivity. Teratology 55:373-380.
MacDonald, A. J., Rostami-Hodjegan, A., Tucker, G. T., and Linkens, D. A. 2002. Analysis of solvent central nervous system toxicity and
     ethanol interactions using a human population physiologically based kinetic and dynamic model. Regul. Toxicol. Pharmacol. 35:165-176.
Martonen, T. B., Zhang, Z., Yu, G., and Musante, C. J. 2001. Three-dimensional computer modeling of the human upper respiratory
     tract. Cell Biochem. Biophys. 35:255-261.
Meek, M. E., Beauchamp, R., Long, G., Moir, D., Turner, L., and Walker, M. 2002. Chloroform: Exposure estimation, hazard
     characterization, and exposure-response analysis. J. Toxicol. Environ. Health B 5:283-334.
Melnick, R. L., and Kohn, M. C. 2000. Dose-response analyses of experimental cancer data. Drug Metab. Rev. 32:193-209.
Miller, F. J., Schlosser, P. M., and Janszen, D. B. 2000. Haber's rule: A special case in a family of curves relating concentration and
     duration of exposure to a fixed level of response for a given endpoint. Toxicology 149:21-34.
Monro, A. 1994. Drug toxicokinetics: Scope and limitations that arise from species differences in pharmacodynamic and carcinogenic
     responses. J. Pharmacokinet. Biopharm. 22:41-57.
Moolgavkar, S., and Knudson, A. 1981. Mutation and cancer: A model for human carcinogenesis. J. Natl. Cancer Inst. 66:1037-1052.
Moolgavkar, S. H., and Luebeck, G. 1990. Two-event model for carcinogenesis: biological, mathematical, and statistical considerations.
     Risk Anal. 10:323-341.
Moolgavkar, S., and Venzon, D.  1979. Two-event models for carcinogenesis:  Incidence curve for childhood  and adult tumors. Math.
     Biosci. 47:55-77.
Nong, A., McCarver, D. G., Hines, R. N., and Krishnan, K. 2006. Modeling interchild differences in pharmacokinetics on the basis of
     subject-specific data on physiology and hepatic CYP2E1 levels:  a case study with toluene. Toxicol. Appl. Pharmacol. 214:78-87.
O'Flaherty, E. J. 1981. Toxicants and drugs: Kinetics and dynamics. New York: John Wiley & Sons.
O'Flaherty, E. J. 1994. Physiologic changes during growth and development. Environ. Health Perspect. 102(Suppl. 11):103-106.
Overton, J. H. 2001. Dosimetry modeling of highly soluble reactive gases in the respiratory tract. Inhal. Toxicol. 13:347-357.
Page, N. P., Singh, D. V., Farland, W., Goodman, J. I., Conolly, R. B., Andersen, M. E., Clewell, H. J., Frederick, C. B., Yamasaki, H., and
     Lucier, G. 1997. Implementation of EPA revised cancer assessment guidelines: Incorporation of mechanistic and pharmacokinetic
     data. Fundam. Appl. Toxicol. 37:16-36.
Pauluhn, J. 2003. Issues of dosimetry in inhalation toxicity. Toxicol. Lett. 140-141:229-238.
Paustenbach, D. J. 2000. The practice of exposure assessment: A state-of-the-art review. J. Toxicol. Environ. Health B 3:179-291.
Perbellini, L., Mozzo, P., Olivato, D., and Brugnone, F. 1990. "Dynamic" biological exposure indexes for n-hexane and 2,5-hexanedione,
     suggested by a physiologically based pharmacokinetic model. Am. Ind. Hyg. Assoc. J. 51:356-362.
Pierce, C. H., Dills, R. L., Morgan, M.  S.,  Vicini, P., and Kalman, D. A. 1998. Biological monitoring of controlled toluene exposure. Int.
     Arch. Occup.  Environ. Health 71:433-444.
Poet, T. S., Soelberg, J. J., Weitz, K. K., Mast, T. J., Miller, R. A., Thrall, B. D., and Corley, R. A. 2003. Mode of action and
     pharmacokinetic studies of 2-butoxyethanol in the mouse with an emphasis on forestomach dosimetry. Toxicol. Sci. 71:176-189.
Poulin, P., and Krishnan, K. 1996a. A mechanistic algorithm for predicting blood:air partition coefficients of organic chemicals with the
     consideration of reversible binding in hemoglobin. Toxicol. Appl. Pharmacol. 136:131-137.
Poulin, P., and Krishnan, K. 1996b. A tissue composition-based algorithm for predicting tissue:air partition coefficients of organic
     chemicals. Toxicol. Appl. Pharmacol. 136:126-130.
Portier, C., Tritscher, A., Kohn, M., Sewall, C., Clark, G., Edler, L., Hoel, D., and Lucier, G. 1993. Ligand/receptor binding for 2,3,7,8-
     TCDD: Implications for risk assessment. Fundam. Appl. Toxicol. 20:48-56.
Portier, C. J., and Kaplan, N. L. 1989. Variability of safe dose estimates when using complicated models of the carcinogenic process. A
     case study: Methylene chloride. Fundam. Appl. Toxicol. 13:533-544.
Price, K., Haddad, S., and Krishnan, K. 2003. Physiological modeling of age-specific changes in the pharmacokinetics of organic
     chemicals in children. J. Toxicol. Environ. Health A 66:417-433.
Rao, H. V., and Ginsberg, G. L. 1997. A physiologically-based pharmacokinetic model assessment of methyl t-butyl ether in groundwater
     for a bathing and showering determination. Risk Anal. 17:583-598.

-------
Reddy, M. B., Yang, R. S. H., Clewell, H. J., and Andersen, M. E., eds. 2005. Physiologically based pharmacokinetic modeling. Hoboken,
     NJ: Wiley & Sons.
Reitz, R. H., McDougal, J. N., Himmelstein, M. W., Nolan, R. J., and Schumann, A. M. 1988a. Physiologically-based pharmacokinetic
     modeling with methyl chloroform: Implications for interspecies, high-low dose and dose-route extrapolations. Toxicol. Appl.
     Pharmacol. 95:185-199.
Reitz, R. H., Mendrala, A. L., Park, C. N., Andersen, M. E., and Guengerich, F. P. 1988b. Incorporation of in vitro enzyme data into the
     physiologically-based pharmacokinetic (PB-PK) model for methylene chloride: Implications for risk assessment. Toxicol. Lett. 43:97-116.
Reitz, R. H., Mendrala, A. L., and Guengerich, F. P. 1989. In vitro metabolism of methylene chloride in human and animal tissues: Use
     in physiologically based pharmacokinetic models. Toxicol. Appl. Pharmacol. 97:230-246.
Reitz, R. H., Mendrala, A. L., Corley, R. A., Quast, J. F., Gargas, M. L., Andersen, M. E., Staats, D. A., and Conolly, R. B. 1990a. Estimating
     the risk of liver cancer associated with human exposures to chloroform using physiologically-based pharmacokinetic modeling.
     Toxicol. Appl. Pharmacol. 105:443-459.
Reitz, R. H., McCroskey, P. S., Park, C. N., Andersen, M. E., and Gargas, M. L. 1990b. Development of a physiologically-based
     pharmacokinetic model for risk assessment with 1,4-dioxane. Toxicol. Appl. Pharmacol. 105:37-54.
Reitz, R. H., Gargas, M. L., Andersen, M. E., Provan, W. M., Green, T. L.  1996. Predicting cancer risk from vinyl chloride exposure with
     a physiologically based pharmacokinetic model. Toxicol. Appl.  Pharmacol. 137:253-267.
Renwick, A. G. 2001. Toxicokinetics—Pharmacokinetics in  toxicology.  In Principles and methods of  toxicology,  ed. A.  W. Hayes,
     pp. 137-192. Philadelphia: Taylor & Francis.
Rogers, J. M., Mole, M. L., Chernoff, N., Barbee, B. D., Turner, C. I., Logsdon, T. R., and Kavlock, R. J. 1993. The developmental toxicity
     of inhaled methanol in the CD-1 mouse, with quantitative dose-response modeling for estimation of benchmark doses. Teratology
     47:175-188.
Roth, W. L., Freeman, R. A., and Wilson, A.  G. 1993. A physiologically  based  model  for gastrointestinal absorption and excretion of
     chemicals carried by lipids. Risk Anal. 13:531-543.
Rowland, M. 1985. Physiologic pharmacokinetic models and interanimal species scaling. Pharmacol. Ther. 29:49-68.
Roy, A., and Georgopoulos, P. G. 1998. Reconstructing week-long exposures to volatile organic compounds using physiologically based
     pharmacokinetic models. J. Expos. Anal. Environ. Epidemiol. 8:407-422.
Roy, A., Weisel, C. P., Lioy, P. J., and Georgopoulos, P. G. 1996. A distributed parameter physiologically-based pharmacokinetic model
     for dermal and inhalation exposure to volatile organic compounds. Risk Anal. 16:147-160.
Santostefano, M. J., Wang, X., Richardson, V. M., Ross, D. G., DeVito, M. J., and Birnbaum, L. S. 1998. A pharmacodynamic analysis of
     TCDD-induced cytochrome P450 gene expression in multiple tissues: Dose- and time-dependent effects. Toxicol. Appl.  Pharmacol.
     151:294-310.
Sarangapani, R., Gentry, P. R., Covington, T. R., Teeguarden, J. G., and Clewell, H. J. III. 2003. Evaluation  of the potential impact of age-
     and gender-specific lung morphology and ventilation rate on the dosimetry of vapors. Inhal. Toxicol.  15:987-1016.
Schlosser,  P. M., Lilly, P. D., Conolly,  R. B., Janszen, D. B., and Kimbell, J. S. 2003. Benchmark dose risk assessment for formaldehyde
     using airflow modeling and a single-compartment, DNA-protein cross-link dosimetry model to estimate human equivalent doses.
     Risk Anal. 23:473-487.
Simmons, J. E., Evans, M. V., and Boyes, W. K. 2005. Moving from  external exposure concentration to  internal dose: Duration extrapo-
     lation based on physiologically based pharmacokinetic derived estimates of internal dose. J. Toxicol. Environ. Health A 68:927-950.
Slikker, W., Jr, Andersen, M. E., Bogdanffy, M. S., Bus, J. S., Cohen, S. D., Conolly, R. B., David, R. M., Doerrer, N. G.,  Dorman, D. C.,
     Gaylor, D. W., Hattis, D., Rogers, J. M., Setzer, R. W., Swenberg, J. A., and Wallace, K. 2004. Dose-dependent transitions in mech-
     anisms of toxicity: Case studies. Toxicol. Appl. Pharmacol. 201:226-294.
Starr, T. B., and Festa, J. L. 2003. A proposed inhalation reference concentration for methanol. Regul. Toxicol. Pharmacol. 38:224-231.
Steinbach, K. H., Raffler, H., Pabst, G., and Fliedner, T. M. 1980. A mathematical model of canine granulocytopoiesis. J. Math. Biol.
     10:1-12.
Sweeney, L. M., Tyler, T. R., Kirman, C.  R., Corley,  R. A., Reitz, R. H., Paustenbach, D. J., Holson, J. F., Whorton, M. D., Thompson, K.
     M., and Gargas, M. L. 2001. Proposed occupational exposure limits for select ethylene glycol ethers using PBPK  models and Monte
     Carlo simulations. Toxicol. Sci. 62:124-139.
Tan, Y. M., Butterworth, B. E., Gargas, M. L., and Conolly, R. B. 2003.  Biologically motivated computational modeling of chloroform
     cytolethality and regenerative cellular proliferation. Toxicol. Sci. 75:192-200.
Tan, Y. M., Liao, K. H., Conolly, R. B., Blount, B. C., Mason, A. M., and Clewell, H. J. 2006. Use of a physiologically based
     pharmacokinetic model to identify exposures consistent with human biomonitoring data for chloroform. J. Toxicol. Environ. Health A 69:
     1727-1756.
Tardif, R., Charest-Tardif, G., Brodeur, J., and Krishnan, K. 1997. Physiologically based pharmacokinetic modeling of a ternary mixture of
     alkyl benzenes in rats and humans. Toxicol. Appl. Pharmacol. 144:120-134.
Thomas, R. S., Lytle, W. E., Keefe, T. J., Constan, A. A., and Yang, R. S. 1996. Incorporating Monte Carlo simulation into physiologically
     based pharmacokinetic models using advanced continuous simulation language (ACSL): A computational method. Fundam. Appl.
     Toxicol. 31:19-28.
Thomas, R. S., Conolly, R. B., Gustafson, D. L., Long, M. E., Benjamin, S. A., and Yang, R. S. 2000. A physiologically based pharmacody-
     namic analysis of hepatic foci within a medium-term liver bioassay using pentachlorobenzene as a promoter and diethylnitrosamine
     as an initiator. Toxicol. Appl. Pharmacol. 166:128-137.
Timchalk, C., Nolan, R. J., Mendrala, A. L., Dittenber, D. A., Brzak,  K. A., and Mattsson, J. L. 2002. A Physiologically based pharmacoki-
     netic and pharmacodynamic (PBPK/PD)  model for the organophosphate insecticide chlorpyrifos in  rats and humans. Toxicol. Sci.
     66:34-53.
Timchalk, C., Poet, T. S., Kousba, A. A., Campbell, J. A., and Lin, Y. 2004. Noninvasive biomonitoring approaches to determine dosime-
     try and risk following acute chemical exposure: Analysis of lead or organophosphate insecticide in saliva. J. Toxicol. Environ. Health
     A 67:635-650.
Timchalk, C., Poet, T. S., Lin, Y., Weitz, K. K., Zhao, R., and Thrall, K. D. 2001. Development of an integrated microanalytical system for
     analysis of lead in saliva and linkage to a physiologically based pharmacokinetic model describing lead saliva secretion. Am. Ind.
     Hyg. Assoc. J.  62:295-302.
Tran, C. L., Jones, A. D., Cullen, R. T., and  Donaldson, K. 1999. Mathematical modeling of the retention and clearance of low-toxicity
     particles in the lung. Inhal. Toxicol. 11:1059-1076.
U.S. Environmental Protection Agency. 1994. Methods for derivation of inhalation reference concentrations and application of inhalation
     dosimetry. EPA/600/8-90/066F. Washington, DC: Office of Health and Environmental Assessment. http://cfpub.epa.gov/ncea/cfm/recordisplay.cfm?deid=71993
U.S. Environmental Protection Agency. 1999a. Extrapolation of the benzene inhalation unit risk estimate to the oral route of exposure (draft).
     NCEA-W-0517. Washington, DC: National Center for Environmental Assessment. http://cfpub.epa.gov/ncea/cfm/recordisplay.cfm?deid=12140
U.S. Environmental Protection Agency. 1999b. Toxicological review of ethylene glycol monobutyl ether. In support of summary information
     on the IRIS. CAS No. 111-76-2. Washington, DC: National Center for Environmental Assessment. http://www.epa.gov/iris/toxreviews/0500-tr.pdf
U.S. Environmental Protection Agency. 2000a. Benchmark dose technical guidance document (external review draft). EPA/630/R-00/001.
     Washington, DC: Risk Assessment Forum. http://www.epa.gov/ncea/pdfs/bmds/BMD-External_10_13_2000.pdf
U.S. Environmental Protection Agency. 2000b. Toxicological review of vinyl chloride. In support of summary information on the IRIS.
     EPA/635/R-00/004. Washington, DC: National Center for Environmental Assessment. http://www.epa.gov/iris/toxreviews/1001-tr.pdf
U.S. Environmental Protection Agency. 2002. A review of the reference dose and reference concentration processes. EPA/630/P-02/002F.
     Washington, DC: Risk Assessment Forum. http://cfpub.epa.gov/ncea/cfm/recordisplay.cfm?deid=55365
U.S. Environmental Protection Agency. 2004. Air quality criteria for particulate matter. EPA/600/P-99/002aF. Research Triangle Park, NC:
     National Center for Environmental Assessment. http://cfpub.epa.gov/ncea/cfm/recordisplay.cfm?deid=87903
U.S. Environmental Protection Agency. 2005a. Guidelines for carcinogen risk assessment. EPA/630/P-03/001B. Washington, DC: Risk
     Assessment Forum. http://cfpub.epa.gov/ncea/cfm/recordisplay.cfm?deid=116283&CFID=1284815&CFTOKEN=771S6427&jsessionid=663067650c6a464245b4TR60306630c230
U.S. Environmental Protection Agency. 2005b. Supplemental guidance for assessing susceptibility from early-life exposure to carcinogens.
     EPA/630/R-03/003F. Washington, DC: Risk Assessment Forum. http://cfpub.epa.gov/ncea/cfm/recordisplay.cfm?deid=160003
U.S. Environmental Protection Agency. 2005c. Toxicological review of boron and compounds. In support of summary information on the
     IRIS. CASRN 7440-42-8. Washington, DC: National Center for Environmental Assessment. http://www.epa.gov/iris/toxreviews/0410-tr.pdf
U.S. Environmental Protection Agency. 2006a. Approaches for the application of physiologically based pharmacokinetic (PBPK) models
     and supporting data in risk assessment. EPA/600/R-05/043F. Washington, DC: National Center for Environmental Assessment.
     http://cfpub.epa.gov/ncea/cfm/recordisplay.cfm?deid=157668
U.S. Environmental Protection Agency. 2006b. A framework for assessing health risks of environmental exposures to children.
     EPA/600/R-05/093F. Washington, DC: National Center for Environmental Assessment. http://cfpub.epa.gov/ncea/cfm/recordisplay.cfm?deid=158363
U.S. Environmental Protection Agency. 2006c. Use of physiologically based pharmacokinetic models to quantify the impact of human age
     and interindividual differences  in physiology and biochemistry pertinent to risk. EPA/600/R-06/014A. Washington, DC:  National
     Center for Environmental Assessment. http://cfpub.epa.gov/ncea/cfm/recordisplay.cfm?deid=157668
Van Asperen, J., Rijcken, W. R.  P., and Lammers, J.  H. C. M. 2003. Application of physiologically based toxicokinetic modelling to study
     the impact of the exposure scenario on the toxicokinetics and the behavioural effects of toluene in rats. Toxicol. Lett. 138:51-68.
Vinegar, A., and Jepson, G. W. 1996. Cardiac sensitization thresholds of halon replacement chemicals predicted in humans by physiolog-
     ically based pharmacokinetic modeling. Risk Anal. 16:571-579.
Voisin, E. M., Ruthsatz, M., Collins, J. M., and Hoyle, P. C. 1990. Extrapolation of animal toxicity to humans: Interspecies comparisons in
     drug development. Regul.  Toxicol. Pharmacol. 12:107-116.
Wagner, J. G. 1981. History of pharmacokinetics. Pharmacol. Ther. 12:537-562.
Welsch, F., Blumenthal, G.  M.,  and  Conolly, R. B. 1995. Physiologically based pharmacokinetic  models  applicable to organogenesis:
     Extrapolation between  species and potential use  in prenatal toxicity  risk assessments. Toxicol. Lett.  82-83:539-547.

-------
Publisher Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House,
37-41 Mortimer Street, London W1T 3JH, UK
Journal of Toxicology and Environmental Health, Part A
Publication details, including instructions for authors and subscription information:
http://www.informaworld.com/smpp/title~content=t713667303

Bayesian Calibration of a Physiologically Based
Pharmacokinetic/Pharmacodynamic Model of Carbaryl Cholinesterase
Inhibition
Andy Nong a; Yu-Mei Tan a; Michael E. Krolski b; Jiansuo Wang a; Curt Lunchick c; Rory B. Conolly a; Harvey
J. Clewell III a
a The Hamner Institutes for Health Sciences, Research Triangle Park, North Carolina, USA b Bayer
CropScience, Stilwell, Kansas, USA c Bayer CropScience, Research Triangle Park, North Carolina, USA

Online Publication Date: 01 January 2008
To cite this Article: Nong, Andy, Tan, Yu-Mei, Krolski, Michael E., Wang, Jiansuo, Lunchick, Curt, Conolly, Rory B. and Clewell,
Harvey J. (2008) 'Bayesian Calibration of a Physiologically Based Pharmacokinetic/Pharmacodynamic Model of Carbaryl
Cholinesterase Inhibition', Journal of Toxicology and Environmental Health, Part A, 71:20, 1363-1381
To link to this Article: DOI: 10.1080/15287390802271608
URL: http://dx.doi.org/10.1080/15287390802271608

-------
Journal of Toxicology and Environmental Health, Part A, 71: 1363-1381, 2008
Copyright © Taylor & Francis Group, LLC
ISSN: 1528-7394 print/ 1087-2620 online
DOI: 10.1080/15287390802271608
Bayesian  Calibration  of a  Physiologically Based
Pharmacokinetic/Pharmacodynamic Model of Carbaryl
Cholinesterase  Inhibition

Andy Nong1, Yu-Mei Tan1, Michael E. Krolski2, Jiansuo Wang1, Curt Lunchick3,
Rory B. Conolly1, and Harvey J. Clewell III1
1The Hamner Institutes for Health Sciences, Research Triangle Park, North Carolina, 2Bayer
CropScience, Stilwell, Kansas, and 3Bayer CropScience, Research Triangle Park, North Carolina, USA
   Carbaryl, an N-methyl carbamate (NMC), is a common
insecticide  that  reversibly  inhibits  neuronal cholinesterase
activity. The objective of this work was to use a hierarchical
Bayesian approach to estimate the parameters in a physiologi-
cally based pharmacokinetic and pharmacodynamic (PBPK/
PD)  model from experimental measurements of carbaryl in
rats. A PBPK/PD model was developed to describe the tissue
dosimetry of carbaryl  and its  metabolites  (1-naphthol and
"other hydroxylated metabolites") and subsequently to predict
the carbaryl-induced inhibition  of cholinesterase  activity, in
particular in  the brain and blood.  In support of  the model
parameterization,  kinetic tracer studies were undertaken to
determine total radioactive tissue levels of carbaryl and metab-
olites in rats  exposed by  oral or intravenous routes at doses
ranging from 0.8  to 9.2 mg/kg body weight. Inhibition of
cholinesterase activity in blood and brain was also measured
from the exposed rats. Markov Chain Monte Carlo (MCMC)
calibration of the rat model parameters was implemented using
prior information from literature for physiological  parameter
distributions together with kinetic and inhibition data on car-
baryl. The posterior estimates of the parameters  displayed at
most a twofold deviation from the mean. Monte Carlo simula-
tions of the PBPK/PD model  with the posterior distribution
estimates predicted a 95% credible interval of tissue doses for
carbaryl and  1-naphthol within the  range of observed data.
Similar prediction results were achieved for cholinesterase inhi-
bition by carbaryl. This initial model will be used to determine
the experimental studies that may  provide the highest added
value for model refinement. The Bayesian PBPK/PD modeling
approach developed here will serve as a prototype for developing
mechanism-based risk models for the other NMCs.

   Received 4 September 2007; accepted 15 April 2008.
   The authors acknowledge Drs. Melvin E. Andersen, R. Woodrow
Setzer, Miyoung Yoon, and Jerry Campbell for their valuable reviews
and comments. This work was supported by Bayer CropScience
and the American Chemistry Council Long-Range Research Initiative.
   Current address for Rory B. Conolly is U.S. EPA, Research
Triangle Park, NC 27711, USA.
   Address correspondence to Yu-Mei Tan, The Hamner Institutes
for Health Sciences, 6 Davis Drive, Research Triangle Park, NC
27709, USA. E-mail: ctan@thehamner.org

       Carbaryl (1-naphthalenyl methylcarbamate) is a registered
     insecticide used on a variety of fruits, vegetables, field crops,
     ornamentals, and turf. Carbaryl (CAS no. [63-25-2]) belongs to
     the pesticide family of N-methyl carbamates (NMCs). As with
     the  other NMCs, carbaryl  inhibits neuronal cholinesterase
     activity, which is reversible upon discontinuation of exposure
     (Sogorb & Vilanova,  2002).  Increased  liver  and  kidney
     weights were observed in rats exposed chronically to high
     oral doses of carbaryl (Carpenter et al., 1961). Presently, the
     U.S. Environmental Protection Agency (EPA) has established an
     oral reference dose for chronic exposure (RfD) of 0.1 mg/kg-d
     for liver and kidney toxicity in rats (U.S. EPA, 1999). No RfD
     was established for  inhalation or dermal exposure to carbaryl,
     although exposure by these routes to carbaryl during gardening
     may be significant.
       The kinetics of carbaryl biotransformation were studied in
     animals (Knaak  et  al.,  1965; Declume  & Benard,  1977;
     Strother & Wheeler, 1980; Tanaka et al.,  1980; Mount et al.,
     1981; Knight et al.,  1987; McCraken et al., 1993) and humans
     (Knaak et al., 1965; May et al., 1992; Ward et al., 1998). The
     metabolism of carbaryl by hepatocytes was reported to produce
     metabolites from three major pathways: aromatic hydroxyla-
     tion, aliphatic hydroxylation, and hydrolysis (Figure 1). Both
     aromatic and aliphatic hydroxylation reactions are catalyzed by
     cytochromes P-450  (Tang  et al., 2002). Esterase hydrolysis of
     carbaryl, which also takes place in plasma, produces 1-naphthol
     (McCraken et al., 1993). The resulting metabolites are subse-
     quently  conjugated with sulfate  or glucuronic  acid and
     excreted via urine and feces (IPCS, 1994).
   In addition to the kinetics of carbaryl, the neuronal response
has been associated with the inhibition and binding of
acetylcholinesterase (AChE) in red blood cells (RBC) and brain
(Banerjee et al., 1991; Dawson, 1994; Rao et al., 1994;
Mortensen et al., 1998). Binding of butyrylcholinesterase
(BChE) and binding of carboxylesterase (CaE) in plasma were
also determined in animals and humans using in vitro assays.
One in vivo study in rats measured both the pharmacokinetics
of carbaryl and the inhibition of the plasma AChE following
intravenous (iv) administration (Fernandez et al., 1982).

[Figure 1: metabolic scheme. Carbaryl is converted by oxidation,
epoxide synthetase, and conjugation to other composite metabolites,
and by hydrolysis to 1-naphthol, which is converted to 1-naphthol
sulfate.]
FIG. 1. Suggested depiction of the metabolic pathway of carbaryl.
1-Naphthol and 1-naphthol sulfate are described as the major metabolites in
the PBPK carbaryl model. All other composite metabolites are described
within a single kinetic substructure of the model.

   In the present study, tissue concentrations (in total radioactive
residues) of carbaryl and its metabolites such  as 1-naphthol,
together with cholinesterase  activities, were  collected from
male Sprague-Dawley (SD) rats exposed to [14C]carbaryl at
two dose levels following iv or oral exposures. The purpose of
the present study was to develop and calibrate a physiologi-
cally based  pharmacokinetic/pharmacodynamic  (PBPK/PD)
model with the available tissue-specific time  course profiles. In
contrast to traditional model  fitting with  mean values,  the
PBPK/PD model parameters were calibrated with a hierarchi-
cal Bayesian approach (Bernillon & Bois,  2000), permitting
the estimation of the distribution of parameters  on the basis of
individual experimental  data and  prior information.  Model
parameters lacking experimental support or previously fitted to
mean data were subjected to  Markov  Chain Monte  Carlo
(MCMC)  updating  to estimate the posterior distributions of
these parameters. The calibrated model will be used to identify
key experimental data needs for refining the model prior to its
use for human dosimetry extrapolation. Further, as the first
instance of a PBPK model for an NMC, this Bayesian PBPK/PD
modeling approach will serve as a prototype for developing
mechanism-based risk models for other NMCs.
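
The MCMC updating described above can be illustrated with a minimal random-walk Metropolis sampler. This is only a sketch under stated assumptions, not the authors' implementation: it is not hierarchical, it replaces the PBPK/PD model with a hypothetical one-compartment decay C(t) = c0 * exp(-k * t), and the prior, error model, data, and all numerical values are invented for illustration.

```python
import math
import random

def log_prior(log_k, mu=math.log(0.5), sd=0.5):
    # Normal prior on log(k), i.e. a lognormal prior on k (hypothetical values).
    return -0.5 * ((log_k - mu) / sd) ** 2

def log_likelihood(log_k, times, conc, c0=10.0, sigma=0.2):
    # Lognormal measurement error around the toy model C(t) = c0 * exp(-k * t).
    k = math.exp(log_k)
    total = 0.0
    for t, c in zip(times, conc):
        pred = c0 * math.exp(-k * t)
        total += -0.5 * ((math.log(c) - math.log(pred)) / sigma) ** 2
    return total

def metropolis(times, conc, n_iter=5000, step=0.1, seed=1):
    rng = random.Random(seed)
    current = math.log(0.5)  # start the chain at the prior median
    current_lp = log_prior(current) + log_likelihood(current, times, conc)
    samples = []
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_prior(proposal) + log_likelihood(proposal, times, conc)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        samples.append(math.exp(current))
    return samples

# Synthetic, noise-free data generated with k = 0.8 (illustrative only).
times = [0.25, 0.5, 1.0, 2.0, 4.0]
conc = [10.0 * math.exp(-0.8 * t) for t in times]
draws = metropolis(times, conc)[1000:]  # discard burn-in
posterior_mean_k = sum(draws) / len(draws)
```

The retained draws approximate the posterior distribution of k, so a credible interval can be read from their percentiles. A hierarchical version would add a population level above the individual-animal parameters.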
MATERIALS AND METHODS

Chemicals
   The [naphthyl-1-14C]carbaryl and the [naphthyl-4a,5,6,
7,8,8a-14C]carbaryl were prepared by Bayer CropScience
(Stilwell, KS). The [naphthyl-1-14C]carbaryl had a radiochemical
purity of 100% and a specific activity of 21.33 mCi/mmol.
The  [naphthyl-4a,5,6,7,8,8a-14C]carbaryl had a radiochemical
purity of 98.6% and a specific activity of 105.7 mCi/mmol.
Reference standards were obtained from Bayer CropScience
(Frankfurt, Germany). All other solvents and reagent chemi-
cals  used were obtained  from commercial suppliers without
additional purification.


Animals
   Male SD rats were approximately 7 wk of age and weighed
200  g when  obtained from Charles  River Laboratories Inc.
(Kingston, NY). These rats were acclimated for  7  d prior to
dosing. During the acclimation period, rats were housed in individual
cages and maintained on a 12-h photocycle at 22 ± 2°C
temperature and 70 ± 5% relative humidity. The rats were allowed
access to food (Rodent Diet, PMI Nutrition International, Inc., St.
Louis, MO) and municipal tap water ad libitum. Immediately prior
to dosing, the rats were fasted for approximately 10 h.


Pharmacokinetic and Pharmacodynamic Experiments
   To determine  the pharmacokinetic  behavior  of carbaryl,
[14C]carbaryl  was administered to rats  at two  dose levels
through  oral  or  intravenous  routes (Table  1). Rats  were
treated orally at 1.08 mg/kg BW (O-LDE) or 8.45 mg/kg BW
(O-HDE) or intravenously at 0.8 mg/kg BW (IV-LDE) or 9.2
mg/kg BW (IV-HDE). [Naphthyl-1-14C]carbaryl was
administered for all high-dose experiments and [naphthyl-
4a,5,6,7,8,8a-14C]carbaryl was used for low-dose experiments.
Stock aqueous  solutions were  evaporated  under a  gentle
stream of dry nitrogen. For oral dosing solutions, the residue
was taken up in 20 ml of an aqueous solution containing 0.5%
(w/v) carboxymethylcellulose and 1% (w/v) Tween 80. For iv
dosing solutions, the residue was taken up in  9.6 ml polyeth-
ylene glycol with an average molecular mass of 200 Da (PEG
200). The resulting solution was sonicated for 2 min and
then mixed for 2 min. For high-dose experiments,
an aliquot (80 mg) of nonlabelled carbaryl was also added
to the solutions. Aliquots of the dosing solution at various
times of exposure were analyzed by high-performance liquid
chromatography (HPLC) to verify dosing solution stability.
Each rat chosen for the experiment was dosed either by oral
gavage with 0.5 ml of the dosing solution or with 0.2 ml of
the dosing solution introduced through a previously
implanted jugular cannula.

TABLE 1
Summary of the In Vivo Pharmacokinetic and Pharmacodynamic Studies Conducted in Male
Sprague-Dawley Rats Exposed to Carbaryl

Experiment  Route        Average dose          Measured endpoints
                         (mg/kg body weight)
O-LDE       Oral gavage  1.08                  TRR(a) in blood, plasma, RBC(b), brain
O-HDE       Oral gavage  8.45                  TRR(a) in blood, plasma, RBC(b), brain, liver, and fat
                                               Carbaryl in brain
                                               1-Naphthol in plasma and brain
                                               Naphthol sulfate in plasma(c)
                                               N-Hydroxy carbaryl in brain(d)
                                               Cholinesterase activities in brain, plasma, and RBC(b)
IV-LDE      Intravenous  0.80                  TRR(a) in blood, plasma, RBC(b), brain
IV-HDE      Intravenous  9.20                  TRR(a) in blood, plasma, RBC(b), brain, liver, and fat
                                               Carbaryl in plasma, brain, liver and fat
                                               1-Naphthol in plasma, brain, liver and fat
                                               Naphthol sulfate in plasma(c)
                                               N-Hydroxy carbaryl in brain(d)
                                               Cholinesterase activities in brain, plasma, and RBC(b)

     Note. Four rats per experiment.
     (a) TRR: total radioactive residues.
     (b) RBC: red blood cells.
     (c) Simulated as the eliminated 1-naphthol in the model.
     (d) Included in the "other composite metabolites" in the model.

   Groups of 4 animals were sacrificed at various time points
up to 24 h (oral: 15, 30 min, 1, 2, 4, 6, 12, or 24 h; iv: 5, 10, 20,
30 min, 1, 2, 4, or 8 h) following dose administration to collect
blood and tissues including brain, liver, and fat. At each time
point  in each experiment,  individual  rats were anesthetized
with halothane (Aldrich Chemical Co. Inc.,  Milwaukee, WI).
Brain tissue and whole blood were collected in the low-dose
experiments, while in the high-dose experiments portions of
other tissues, as well as brain and blood, were collected and
processed for cholinesterase experiments. Collected whole
blood was cooled and centrifuged to separate plasma from red
blood cells (RBC).
   Total radioactive residue (TRR) levels were determined for
whole blood, plasma, RBC, brain, liver, and fat from individual
rats at predetermined time points for high-dose groups in
each of the  dosing routes. Liver and fat samples were not ana-
lyzed for the lower dose groups. Triplicate or quadruplicate tis-
sue or blood samples were used for radioassay. All liquid
samples and oxidized solid samples were radioassayed either
by HPLC with a radioactivity detector (Raytest  RAMONA,
Pittsburgh, PA) or with a liquid scintillation counter (Beckman,
model LS 6000LL, Irvine, CA). Samples with residues below
the limit of quantitation of the HPLC radiodetector (9-15 ppb)
were quantified by liquid scintillation
counting (detection limit of 0.48-0.79 ppb).
   Metabolites and the parent compound carbaryl in plasma,
brain, liver, and fat were identified and quantified by liquid
chromatography/electrospray ionization-mass spectrometry
(LC/ESI-MS). Samples from the high-dose exposure were
split in half: one half for TRR measurements and the other
for analysis of metabolic composition. The samples for metabolic
composition were pooled for a single measurement
at each time point. Among the known carbaryl metabolites,
1-naphthol, 1-naphthol sulfate, and N-hydroxymethyl carbaryl
were detected in plasma and the tissues assayed under the
current experimental conditions.
                                         Cholinesterase activities in brain, plasma, and RBC from
                                       high-dose experiment groups were determined spectrophoto-
                                       metrically with acetylcholine as a substrate using a modified
                                       Ellman method designed to minimize the reactivation of
                                       cholinesterase during measurement (Nostrandt et al., 1993).
                                       The absorbance change resulting from acetylcholine hydroly-
                                       sis was proportional to time  and protein concentration under
                                       the assay conditions. Cholinesterase inhibition by carbaryl as
                                       percent  depression  was  calculated  with  the following
                                       equation:


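$$\%\,\mathrm{ChE}_{\mathrm{depression}} = 100 \times \frac{[\mathrm{ChE}]_{\mathrm{control}} - [\mathrm{ChE}]_{\mathrm{exposed}}}{[\mathrm{ChE}]_{\mathrm{control}}} \qquad (1)$$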

                                       where [ChE] represents the cholinesterase concentration in
                                       control or exposed rats.
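
As a worked instance of Eq. (1), the short function below computes percent depression from a control and an exposed measurement. The function name and the numerical values are hypothetical and serve only to illustrate the arithmetic.

```python
def che_depression_percent(che_control, che_exposed):
    """Percent cholinesterase depression, following Eq. (1)."""
    if che_control <= 0:
        raise ValueError("control value must be positive")
    return 100.0 * (che_control - che_exposed) / che_control

# Hypothetical measurements in arbitrary but matching units.
example = che_depression_percent(che_control=2.0, che_exposed=1.5)
```

With these illustrative inputs the exposed value is three quarters of control, which corresponds to 25% depression.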

-------
Model Structure
   The PBPK/PD model for carbaryl was developed based on
the structure of the organophosphate (OP) models for
diisopropylfluorophosphate (DFP) and parathion (Gearhart et al., 1990,
1994). Since the experiments mainly consisted of individual
TRR values that combined the radioactivity of carbaryl and all its
metabolic derivatives, the model described the
combined molar mass burden of carbaryl and its composite
metabolites. Therefore, three individual PBPK models
interconnected by carbaryl metabolic processes described the
pharmacokinetics of carbaryl, 1-naphthol (as the major metabolite), and
the other composite metabolites (Figure 2). A PD model
was then added to simulate the reversible cholinesterase
inhibition in blood and brain based on carbaryl concentrations in
blood (plasma and RBC). The main differences between our
carbaryl model and the OP models include: (1) diffusion-limited
tissue compartments for carbaryl; (2) the addition of PBPK
models for carbaryl metabolites; and (3) reversible
cholinesterase binding. A detailed description of the PBPK/PD carbaryl
model is also found in the appendix.
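
The reversible cholinesterase binding mentioned above can be sketched as a two-state system: free enzyme that is synthesized and degraded, and carbamoylated enzyme that regenerates to free enzyme. The rate-law forms, parameter names, parameter values, and the exponentially declining carbaryl concentration below are all assumptions chosen for illustration; the model's actual equations are those given in the appendix.

```python
import math

def simulate_che(ksyn=0.01, kdeg=0.01, ki=5.0, kr=1.0,
                 c0=1.0, kel=2.0, t_end=12.0, dt=0.001):
    """Euler integration of free enzyme after a single carbaryl pulse.

    Assumed rate laws (hypothetical):
      d(free)/dt      = ksyn - kdeg*free - ki*C*free + kr*inhibited
      d(inhibited)/dt = ki*C*free - kr*inhibited
    """
    free = ksyn / kdeg  # pre-exposure steady state
    inhibited = 0.0
    trace = [free]
    for i in range(int(round(t_end / dt))):
        conc = c0 * math.exp(-kel * i * dt)  # assumed carbaryl time course
        inhibition = ki * conc * free
        regeneration = kr * inhibited
        d_free = ksyn - kdeg * free - inhibition + regeneration
        d_inhibited = inhibition - regeneration
        free += d_free * dt
        inhibited += d_inhibited * dt
        trace.append(free)
    return trace

free_che = simulate_che()
lowest = min(free_che)   # maximum depression during the pulse
final = free_che[-1]     # recovery once carbaryl has cleared
```

In this toy setting free enzyme falls while carbaryl is present and returns toward its baseline as the carbamoylated enzyme regenerates, which is the qualitative behavior that distinguishes reversible carbamate inhibition from the OP case.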

    PBPKModel
       Three PBPK submodels describe the time course of absorp-
    tion, distribution, metabolism, and elimination of carbaryl, and
    distribution and elimination of 1-naphthol and the "other com-
    posite metabolites" (Figure 2). Michaelis-Menten metabolism
    of carbaryl to  1-naphthol is used to describe the elimination by
    esterase of the parent compound in liver and blood. 1-Naphthol
CARBARYL
Q
o
o
m
w
O
Ul
/
I/









BRAIN

FAT

REST OF THE 1
BODY J



LIVER
A|
II
Gl TRACT
t
ORAL

+





Q 0
0 0
3 3
ARTERIAL B
VENOUS B
METABOLISM




1-NAPHTHOL
BRAIN



FAT



REST OF THE
BODY







w

A

LIVER

i





Gl TRACT
!

f \
TERIAL BLOOD
NOUS BLOOD
< >
Manhthnl-
\ * sulfate
\
\
\






OTHER
COMPOSITE
METABOLITES
BRAIN

FAT


REST OF THE
BODY

J
r
LIVER


Gl TRACT
x--*-^




\
\
V
i
c»
Q
O
3
m
Ul
?IN
                                                                                                   FECES
                                          .,'""
                                                 r^ii  °
                                     'Synthesis   py" "^fl'

                                                   \ Inhibition
                                 JFree cholinesterase       Carbamoylated cholinesterase j
                                               Regeneration
                                   ^Degradation
                                     V
FIG. 2.  Schematic representation of the PBPK/PD model structure for carbaryl. Carbaryl, 1-naphthol, and the other composite metabolites each have
their own PBPK structure; the structures are coupled by liver and blood metabolism. The metabolites are excreted into the urine and feces at first-order rates. The
dotted lines for the carbaryl tissue compartments represent the diffusion-limited distribution of carbaryl, whereas the tissue compartments in the metabolite models are
flow-limited. The PD model of cholinesterase inhibition by carbaryl is introduced in the brain and blood compartments of the carbaryl PBPK model. Activities of
acetylcholinesterase and butyrylcholinesterase are influenced by the carbaryl-cholinesterase interaction.
is modeled as being conjugated to naphthol sulfate in a first-order
formation process, and naphthol sulfate is then eliminated at a
first-order rate. The submodel for the "other composite metabolites"
groups the generic hydroxylated metabolites (e.g., 4-hydroxycarbaryl,
5-hydroxycarbaryl, N-hydroxymethylcarbaryl). These metabolites
originate from the processing of carbaryl in the liver compartment.
In the model, the CYP450-mediated metabolism of carbaryl to
hydroxylated metabolites in the liver is described as a common
Michaelis-Menten process.
   The tissue compartments in the carbaryl model include liver, fat,
brain, and the rest of the body. Tissue distribution is diffusion
limited. Flow-limited descriptions were attempted first for carbaryl
distribution, but their failure to capture the slow elimination of
carbaryl in liver and fat prompted the change to a diffusion-limited
model for the parent compound. The two metabolite models (1-naphthol
and "the others"), however, were assumed to be flow limited.
   In the carbaryl model, each tissue compartment was divided into
blood and tissue subcompartments. Although the exchange of carbaryl
between the tissue and arterial blood is defined by partition
coefficients, the tissue compartments for carbaryl are not evenly
mixed. The transport of carbaryl into tissue was assumed to be
limited by cell membrane permeability rather than by blood flow rate
[as described in Eqs. (2) and (3)]. For example, the rate of change
of the amount of carbaryl in the liver tissue ($A_{liver}$) was
expressed as:
$$\frac{dA_{liver}}{dt} = PA_L \left( CV_{liver} - \frac{C_{liver}}{P_{liver}} \right) - \frac{V_{max\_naphthol} \times C_{liver}}{K_{m\_naphthol} + C_{liver}} - \frac{V_{max\_others} \times C_{liver}}{K_{m\_others} + C_{liver}} - k_{bile} \times A_{liver} + k_a \times D_{oral} \qquad (2)$$

where $C_{liver}$ (µmol/L) is the carbaryl concentration in liver tissue,
$CV_{liver}$ (µmol/L) is the carbaryl concentration in liver blood, $PA_L$
is the liver permeability-area constant, $P_{liver}$ is the liver tissue to
blood partition coefficient, $V_{max}$ (µmol/h) is the maximum metabolic
velocity for 1-naphthol or the other composite metabolites,
$K_m$ (µmol/L) is the Michaelis-Menten affinity constant for
1-naphthol or the other composite metabolites, $k_{bile}$ (h⁻¹) is the
rate constant for bile excretion, $k_a$ (h⁻¹) is the rate constant for
oral absorption, and $D_{oral}$ (µmol/L) is the oral gavage dose.
(Units for parameters are given in Table 2.) The amount of
carbaryl in the liver blood ($Av_{liver}$) is then calculated as:

$$\frac{dAv_{liver}}{dt} = Q_{liv} \left( C_a - CV_{liver} \right) + PA_L \left( \frac{C_{liver}}{P_{liver}} - CV_{liver} \right) \qquad (3)$$
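
The liver mass balance described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the conventional diffusion-limited form (permeability-limited exchange between liver blood and liver tissue, two saturable Michaelis-Menten pathways, first-order bile excretion, and first-order uptake from the gut). The function names, the dictionary keys, and the `a_gut` argument (amount of carbaryl still available for absorption) are hypothetical and are not taken from the authors' ACSL or MCSim code.

```python
def michaelis_menten(vmax, km, conc):
    """Saturable metabolic rate (umol/h) at tissue concentration conc (umol/L)."""
    return vmax * conc / (km + conc)


def liver_rates(c_liver, cv_liver, c_art, a_liver, a_gut, p):
    """Return (dA_liver/dt, dAv_liver/dt) in umol/h for the carbaryl liver.

    p is a dict of illustrative parameters (hypothetical key names).
    """
    # Permeability-limited exchange from liver blood into liver tissue.
    exchange = p["pa_l"] * (cv_liver - c_liver / p["p_liver"])
    # Two parallel saturable pathways: 1-naphthol and "other" metabolites.
    metabolism = (
        michaelis_menten(p["vmax_naphthol"], p["km_naphthol"], c_liver)
        + michaelis_menten(p["vmax_others"], p["km_others"], c_liver)
    )
    d_tissue = exchange - metabolism - p["k_bile"] * a_liver + p["k_a"] * a_gut
    # Liver blood gains carbaryl from arterial inflow and loses it to tissue.
    d_blood = p["q_liv"] * (c_art - cv_liver) - exchange
    return d_tissue, d_blood
```

At blood-tissue equilibrium (`cv_liver == c_liver / p_liver`) the exchange term vanishes, so the tissue rate reduces to the metabolic, biliary, and absorption terms, which is a quick sanity check on the signs.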
   To simulate the in vivo radioactive residue data obtained from the
oral gavage study, the model incorporated a one-compartment gut
description for carbaryl absorption following oral administration.
The oral absorption of carbaryl is described as a first-order process
from the gut to the liver. Intravenous delivery of carbaryl is modeled
as a bolus dose delivered directly into the venous blood.

PD Model
   A description of the pharmacodynamic interaction of cholinesterase
(AChE and BChE) with carbaryl was coupled to the PBPK model (Figure 2).
Tissue-specific bimolecular inhibition of cholinesterase activity
(Gearhart et al., 1990) was described based on the levels of carbaryl
in the brain, plasma, and RBC. The tissue-specific cholinesterase
activities depended on (1) the synthesis and degradation of the free
esterase and (2) the carbamoylation and decarbamoylation of
cholinesterase. For example, the differential equations describing
the amounts of free and inhibited acetylcholinesterase in the brain
are:

$$\frac{dA_{BrAChE}}{dt} = Ks_{AChE} - k_d \times A_{BrAChE} - k_i \times C_{Brain} \times A_{BrAChE} + k_r \times A_{BrAChEI} \qquad (4)$$

$$\frac{dA_{BrAChEI}}{dt} = k_i \times C_{Brain} \times A_{BrAChE} - k_r \times A_{BrAChEI} \qquad (5)$$

where $A_{BrAChE}$ and $A_{BrAChEI}$ are the amounts of free and bound
(inhibited) brain acetylcholinesterase, $Ks_{AChE}$ (µmol/h) is the
rate of esterase synthesis, $k_d$ (h⁻¹) is the rate constant of
esterase degradation, $k_i$ (µM⁻¹ × h⁻¹) and $k_r$ (h⁻¹) are the rate
constants of carbamoylation and decarbamoylation of cholinesterase,
and $C_{Brain}$ (µmol/L) is the concentration of carbaryl
in the brain. The rate of synthesis of AChE was calculated as:
$$Ks_{AChE} = k_d \left[ \frac{B_{Tissue,AChE} \times V_{Tissue}}{tr_{AChE}} \right] \qquad (6)$$

where $B_{Tissue,AChE}$ is the basal acetylcholinesterase level (in the
brain or blood), $V_{Tissue}$ is the volume of the tissue (brain or
blood), and $tr_{AChE}$ is the basal acetylcholinesterase turnover
rate. Similar calculations were also made with BChE. In addition,
the rate of synthesis of AChE in red blood cells in the
whole body ($Ks_{AChE,rbc}$) was calculated as:
[Eq. (7): expression for $Ks_{AChE,rbc}$ in terms of $rs_{blood}$, $rs_{brain}$, $V_{blood}$, $B_{brain,AChE}$, and $tr_{AChE}$; not recoverable from the source text]   (7)

where $rs_{blood}$ and $rs_{brain}$ are the fractions of AChE in red blood
cells in blood and in brain, respectively. In Eq. (3), $C_a$ (µmol/L) is
the concentration of carbaryl in the arterial blood and $Q_{liv}$ (L/h)
is the liver blood flow.
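
The cholinesterase submodel lends itself to a small numerical illustration. The sketch below is not the authors' code: it assumes the usual bimolecular inhibition scheme (zero-order synthesis, first-order degradation, carbamoylation proportional to carbaryl concentration times free enzyme, first-order decarbamoylation) and integrates it with a simple Euler step at a constant brain concentration. The rate constants default to the AChE prior means listed in Table 2 (degradation 0.03, inhibition 0.4, regeneration 1.8); the synthesis rate `ks` and all function names are hypothetical.

```python
def ache_rates(free, bound, c_brain, ks, kd, ki, kr):
    """Rates of change (umol/h) of free and carbamoylated brain AChE."""
    inhibition = ki * c_brain * free   # bimolecular carbamoylation
    regeneration = kr * bound          # decarbamoylation
    d_free = ks - kd * free - inhibition + regeneration
    d_bound = inhibition - regeneration
    return d_free, d_bound


def simulate_depression(c_brain, hours, ks=0.03, kd=0.03, ki=0.4, kr=1.8,
                        dt=0.001):
    """Euler integration at a constant brain concentration (umol/L).

    Returns the percent depression of free AChE relative to baseline.
    """
    baseline = ks / kd                 # steady state without carbaryl
    free, bound = baseline, 0.0
    for _ in range(round(hours / dt)):
        d_free, d_bound = ache_rates(free, bound, c_brain, ks, kd, ki, kr)
        free += d_free * dt
        bound += d_bound * dt
    return 100.0 * (1.0 - free / baseline)
```

With no carbaryl the enzyme stays at its baseline `ks / kd`, and raising the brain concentration increases the simulated depression, which is the qualitative behavior the PD model is meant to capture.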
                                            TABLE 2
     Physiological, Biochemical, and Pharmacodynamic Priors for the Carbaryl PBPK/PD Model

Parameters                              Mean (GM)    Variability (GSD)  Uncertainty  Source
Physiological
  Body weight (kg)                      0.255        1.5                             Measured
  Fractional tissue volumes^a
    Brain                               0.006        1.05               1.3          Brown et al., 1997
    Liver                               0.034        1.05               1.3          Brown et al., 1997
    Fat                                 0.07         1.05               1.3          Brown et al., 1997
    Blood                               0.074        1.05               1.3          Brown et al., 1997
    Rest of the body^b                  0.716        -                  -            Difference
  Hematocrit (HCT)                      0.45         1.3                1.3          Suckow et al., 2005
  Cardiac output^a (L/h/kg)             14.1         1.5                1.3          Arms and Travis, 1988
  Fractional tissue blood flows
    Brain                               0.051        1.05               1.3          Brown et al., 1997
    Liver                               0.183        1.05               1.3          Brown et al., 1997
    Fat                                 0.07         1.05               1.3          Brown et al., 1997
    Rest of the body^b                  0.698        -                  -            Difference
Pharmacodynamic
  Acetylcholinesterase
    Basal enzyme activity
      Brain^b (µmol/h/kg tissue)        440000       -                  -            Maxwell et al., 1987
      Plasma^b (µmol/h/kg tissue)       13200        -                  -            Maxwell et al., 1987
      Turnover rate^b (h⁻¹)             1.17 × 10^7  -                  -            Wang and Murphy, 1982
    RBC enzyme activity
      Fraction in brain^b (U/kg)        11848        -                  -            Timchalk et al., 2002
      Fraction in blood (U/kg)          1398         1.5                1.3          Timchalk et al., 2002
    Cholinesterase rate constants
      Inhibition (µM⁻¹ × h⁻¹)           0.4          1.5                1.3          Fitted (originally Gearhart et al., 1990)
      Degradation^b (h⁻¹)               0.03         -                  -            Gearhart et al., 1990
      Regeneration (h⁻¹)                1.8          1.5                1.3          Fitted (originally Gearhart et al., 1990)
  Butyrylcholinesterase
    Enzyme activity
      Brain^b (µmol/h/kg)               46800        -                  -            Maxwell et al., 1987
      Plasma^b (µmol/h/kg)              15600        -                  -            Maxwell et al., 1987
      Turnover rate^b (h⁻¹)             3.66 × 10^6  -                  -            Main et al., 1972
    Cholinesterase rate constants
      Inhibition (µM⁻¹ × h⁻¹)           0.04         1.5                1.3          Fitted (originally Gearhart et al., 1990)
      Degradation^b (h⁻¹)               0.03         -                  -            Gearhart et al., 1990
      Regeneration (h⁻¹)                0.1          1.5                1.3          Fitted (originally Gearhart et al., 1990)
Chemical specific
  Carbaryl
    Partition coefficients^c
      Brain                             1.4          1.5                1.3          Calculated
      Liver                             8            1.5                1.3          Calculated
      Fat                               6.5          1.5                1.3          Calculated
      Rest of the body                  1.34         1.5                1.3          Calculated
                                            TABLE 2
                                          (Continued)

Parameters                                   Mean (GM)  Variability (GSD)  Uncertainty  Source
  Carbaryl (continued)
    Tissue permeability-area constants^a (L/h/kg)
      Brain                                  0.052      1.5                1.3          Fitted
      Liver                                  0.55       1.5                1.3          Fitted
      Fat                                    2.5        1.5                1.3          Fitted
      Rest of the body                       3.98       1.5                1.3          Fitted
    Gastrointestinal rate constants
      Oral absorption (h⁻¹)                  3.6        1.5                1.3          Fitted
      Bile excretion (h⁻¹)                   0.01       1.5                1.3          Fitted
  1-Naphthol
    Partition coefficients^c
      Brain                                  0.36       1.5                             Calculated
      Liver                                  1.5        1.5                             Calculated
      Fat                                    0.37       1.5                             Calculated
      Rest of the body                       0.003      1.5                             Calculated
    Elimination rate constants
      Liver VmaxC^a (µmol/h/kg)              13         1.5                1.5          Fitted (originally McCracken et al., 1993)
      Liver Km,naphthol (µmol/kg)            20         1.5                1.5          Fitted (originally McCracken et al., 1993)
      Blood VmaxC^a (µmol/h/kg)              27         1.5                1.5          Fitted (originally McCracken et al., 1993)
      Blood Km,blood naphthol (µmol/L)       27         1.5                1.5          Fitted (originally McCracken et al., 1993)
      Bile excretion (h⁻¹)                   15         1.5                1.5          Fitted
      Blood first-order elimination (h⁻¹)    0.001      1.5                1.5          Fitted
      Naphthol sulfate formation (h⁻¹)       20         1.5                1.5          Fitted
      Naphthol sulfate elimination (h⁻¹)     5          1.5                1.5          Fitted
  Other composite metabolites
    Partition coefficients^c
      Brain                                  1.1        1.5                1.5          Fitted
      Liver                                  1.32       1.5                1.5          Fitted
      Fat                                    25         1.5                1.5          Fitted
      Rest of the body                       1.6        1.5                1.5          Fitted
    Elimination rate constants
      Liver VmaxC^a (µmol/h/kg)              20         1.5                1.5          Fitted (originally Tang et al., 2002)
      Liver Km,others (µmol/L)               100        1.5                1.5          Fitted (originally Tang et al., 2002)
      Bile excretion (h⁻¹)                   0.003      1.5                1.5          Fitted
      Blood first-order elimination (h⁻¹)    0.001      1.5                1.5          Fitted
  Note. Parameters indicated as "fitted" were either estimated directly from in vivo pharmacokinetic data measured in the present study
or adjusted from previously reported values by fitting the model to the data. Distributions are all lognormal in shape and are described using the
geometric mean (GM) and geometric standard deviations (GSD) of variability and uncertainty. Sources refer to mean values.
  ^a Cardiac output (Qc) is scaled to BW^0.75 (e.g., Qc = Qcc × BW^0.75) and tissue volumes are scaled to BW. Vmax and permeability-area constants
are scaled to BW^0.75.
  ^b Not subject to the MCMC analysis.
  ^c Calculated from Poulin and Krishnan (1996).
Model Parameters and Prior Distributions
   Prior distribution parameter values for the PBPK/PD model were
obtained from the literature, determined experimentally, or estimated
by fitting to the mean values of the PK/PD experiments (Table 2). The
values of the physiological parameters were obtained from the
literature (Arms & Travis, 1988; Brown et al., 1997) or were measured
in the present study (e.g., body weight). Partition coefficients for
carbaryl and 1-naphthol were estimated using a published quantitative
structure-property relationship (QSPR) algorithm (Poulin & Krishnan,
1996). Cholinesterase-related parameters were either obtained from
studies of organophosphate
pesticides (Gearhart et al., 1990; Timchalk et al., 2002) or estimated
using the cholinesterase inhibition data obtained from the current
study.
   The remaining parameters that were unavailable from literature or
experimental sources (e.g., partition coefficients of the metabolites,
carbaryl-cholinesterase inhibition rate constants, oral absorption
constants, and metabolic rate constants) were estimated by fitting to
the composite tissue measurements (carbaryl, 1-naphthol, 1-naphthol
sulfate, and N-hydroxymethyl carbaryl) from the IV-HDE and O-HDE
experiments. Optimization of the prior estimates of the PBPK/PD model
with the mean data was undertaken in ACSL (Aegis, Huntsville, AL).
   For some of the parameters, means of the prior distributions
were set to values previously estimated by fitting mean values
of some of the data used in the MCMC analysis. Technically,
this practice violates the Bayesian division between prior infor-
mation and data used to refine the  priors. However, from a
pragmatic viewpoint this approach, which has been used previ-
ously (Bois 2000), was necessary to obtain convergence.
   For the MCMC analysis, although some distributions of
interindividual differences were available in the literature
(Table 2), most of the distributions were unknown. The standard
deviations for the parameters with unknown distributions were
assigned arbitrary values, as in previous Bayesian PBPK modeling
studies (Bois et al., 1996; Jonsson et al., 2001). For the
carbaryl-specific distributions, a geometric standard deviation (GSD)
of 1.5 was assigned, yielding an expected ratio of standard deviation
to mean equal to 50%. For all parameters, the uncertainty of the
parameter distribution was given a GSD of 1.3 or 1.5 (Table 2). Such
a prior distribution reflected a "reasonably vague specification" as
suggested by Carlin and Louis (2000). The assignment of these
uncertainties was necessarily subjective, given the limited
availability of literature data. Each prior was also bounded by upper
and lower values of threefold the GSD (not shown in Table 2). The
distributions of the parameters were assumed to be lognormal in shape
with a normal error distribution.
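
To make the prior specification concrete, the following Python sketch converts a GSD into a coefficient of variation and draws from a bounded lognormal prior. Two points are assumptions rather than statements from the study: the bounds are interpreted here as the geometric mean divided and multiplied by GSD cubed (three standard deviations on the log scale), and the function names are hypothetical. Note that the exact lognormal relation gives a coefficient of variation of roughly 42% for a GSD of 1.5; the 50% quoted above matches the common rule of thumb CV ≈ GSD − 1.

```python
import math
import random


def lognormal_cv(gsd):
    """Exact coefficient of variation of a lognormal with the given GSD."""
    sigma = math.log(gsd)
    return math.sqrt(math.exp(sigma ** 2) - 1.0)


def sample_prior(gm, gsd, rng, n_gsd=3.0):
    """Draw one value from a lognormal prior truncated on the log scale.

    Assumed bounds: gm / gsd**n_gsd and gm * gsd**n_gsd.
    """
    mu, sigma = math.log(gm), math.log(gsd)
    lower, upper = mu - n_gsd * sigma, mu + n_gsd * sigma
    while True:                        # simple rejection sampling
        draw = rng.gauss(mu, sigma)
        if lower <= draw <= upper:
            return math.exp(draw)


cv_15 = lognormal_cv(1.5)              # about 0.42
rng = random.Random(0)
oral_absorption = sample_prior(3.6, 1.5, rng)   # Table 2 prior GM and GSD
```

Rejection sampling is used only because it is short and obviously correct; with bounds three log-standard-deviations wide, fewer than 1% of draws are discarded.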


Parameter Characterization of Credible Interval
   A hierarchical approach was used to characterize the upper and
lower bounds (variability) of the parameters and their level of
confidence (uncertainty) in the PBPK/PD model with the MCMC technique
(Bernillon & Bois, 2000; Hack, 2006).
   Physiological parameters known to possess high variability and
model parameters with estimated prior mean values were re-estimated
with the individual data sets in the MCMC simulations (Table 2). Other
parameters, such as the degradation rate of free cholinesterase, were
held constant during the MCMC analysis to preserve these values during
the simulation and to generate posteriors conditional on these
literature values.
   The MCMC simulations were performed with MCSim software (version
5.0.0; http://toxi.ineris.fr). A different random seed value was set
to initiate each repeated MCMC simulation. The first 1000 iterations
of the MCMC process served as burn-in to generate stable initial
values. The simulation was then continued until the model parameters
converged to stable estimates of their statistical distributions.
Specifically, the second half of the iterations from three different
random MCMC simulations (also known as chains) were compared by
visual inspection as well as by the Brooks, Gelman, and Rubin
convergence test (Smith, 2005). Statistical analysis of these
posterior estimates was performed with the Bayesian Output Analysis
Program (BOA; Smith, 2005) package for R and S-Plus. These posterior
distributions were then applied in a Monte Carlo simulation to
characterize the credible interval of model outputs, including both
pharmacokinetics (i.e., tissue dosimetry) and pharmacodynamics (i.e.,
cholinesterase activities).
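
As a rough illustration of the convergence check, the sketch below computes the basic potential scale reduction factor (often written R-hat) for one parameter from several chains, using only the second half of each chain. It is a minimal textbook version written for this note; the BOA package used in the study may apply additional corrections, and the function name and the synthetic chains are hypothetical.

```python
import random
import statistics


def gelman_rubin(chains):
    """Basic potential scale reduction factor for one parameter.

    chains: list of equal-purpose sample lists, one per MCMC chain.
    Values close to 1 suggest the chains sample the same distribution.
    """
    kept = [chain[len(chain) // 2:] for chain in chains]  # drop first half
    n = min(len(chain) for chain in kept)
    kept = [chain[:n] for chain in kept]
    chain_means = [statistics.fmean(chain) for chain in kept]
    within = statistics.fmean([statistics.variance(chain) for chain in kept])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


rng = random.Random(1)
# Three synthetic chains drawn from the same distribution (well mixed).
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(3)]
# Three synthetic chains stuck around different values (not converged).
stuck = [[rng.gauss(mu, 1.0) for _ in range(2000)] for mu in (0.0, 3.0, 6.0)]
r_mixed = gelman_rubin(mixed)
r_stuck = gelman_rubin(stuck)
```

A common working rule is to continue sampling until the factor falls close to 1 for every monitored parameter; the exact threshold is a modeling choice and is not stated in the text above.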
   In addition, the blood concentrations and plasma AChE levels
simulated by the carbaryl PBPK/PD model were compared by visual
inspection with the experimental data of Fernandez et al. (1982)
(Figure 8). The comparison of the kinetic time profiles shows how
well the model, run with the mean posterior parameter values from the
MCMC analysis, fits a different data set.


    RESULTS

    Pharmacokinetic Studies—Radioactive Residue
    Measurements
       Uptake, metabolic degradation, and clearance of [14C] carbaryl
    by rats were rapid and complete for both oral and iv routes of
    administration (examples in Table 3). Peak TRR levels in all
    tissues analyzed following a single oral dose at 1.05 mg/kg
    body weight (BW) were reached within 15 min; peak levels
                           TABLE 3a
     Average Total Radioactive Residue (TRR) Levels (ppm) in
      Brain, Red Blood Cells (RBC), Liver, and Fat From Rats
                    Exposed to Carbaryl Orally

            O-LDE (1.08 mg/kg)     O-HDE (8.45 mg/kg)
Time (h)    Brain     RBC          Brain     RBC       Liver     Fat
0.25        0.13      0.44         1.97      2.56      20.9      3.6
0.5         0.06      0.32         1.15      2.59      13.5      3.4
1           0.03      0.18         0.62      2.24      8.1       5.3
2           0.03      0.1          0.21      0.61      2.6       1.3
4           0.02      0.11         0.11      0.36      1.6       0.3
6           0.01      0.07         0.1       0.41      1.9       0.2
12          0.01      0.02         0.06      0.15      0.7       0.1
24          0.003     0.01         0.01      0.04      0.1       0.0

      Note. Oral low and high dose experiment (O-LDE; O-HDE).
                        TABLE 3b
  Average Total Radioactive Residue (TRR) Levels (ppm) in
   Brain, Red Blood Cells (RBC), Liver, and Fat From Rats
             Exposed to Carbaryl Intravenously

            IV-LDE (0.8 mg/kg)     IV-HDE (9.2 mg/kg)
Time (h)    Brain     RBC          Brain     RBC       Liver     Fat
0.08        0.74      1.06         13.2      10.2      24.7      12.1
0.17        0.41      0.91         10.7      9.1       27.7      15.6
0.33        0.23      0.67         7.9       7.3       25.5      21.2
0.5         0.14      0.49         6.7       5.5       19.5      28.5
1           0.06      0.3          2.4       3.2       13.1      16.2
2           0.03      0.15         1.1       2.5       7.2       8.5
4           0.01      0.1          0.3       1         2.7       1.3
8           0.01      0.05         0.2       0.7       1.6       0.2

  Note. Intravenous low and high dose experiment (IV-LDE;
IV-HDE).
                              following a single oral dose at 8.45 mg/kg BW were attained
                              within 30 min. Peak TRR levels in whole blood, RBC, and
                              brain following an iv dose at either 0.8 mg/kg BW or 9.2 mg/kg
                              BW were found at 5 min; peak levels in liver were reached at
                              10 min and in fat at 30 min.
                                 Figure 3 illustrates the time-dependent concentration pro-
                              files of carbaryl, 1-naphthol, and 1-naphthol  sulfate in blood
                              and brain from high-dose iv experiments. A  single measure-
                              ment was sampled for every time point, since the compounds
                              were pooled together. Concentration profiles for other tissues,
                              other  metabolites,  or  oral dosing experiments were not
                              included in this  figure,  but  are described later in this article.
                              Carbaryl and metabolites were only detected in the high-dose
                              iv and  oral  experiments.  Carbaryl  and  1-naphthol  were
                              detected in plasma, brain, liver, and fat following an iv dose at
                              9.2 mg/kg BW. Peak carbaryl levels in plasma, brain, and liver
                              were reached at 5 min; peak level in fat was attained at 30 min.
                              Peak 1-naphthol levels in plasma, brain, and liver were reached
                              later than carbaryl, but peak  1-naphthol  level  in  fat was
                              reached  earlier  than for carbaryl. 1-Naphthol  sulfate and
[Figure 3: three panels, (A) to (C), each plotted against time (hours).]

FIG. 3.  Comparison of the PBPK model-simulated (A) carbaryl, (B) 1-naphthol, and (C) naphthol sulfate radioactive residue levels against average
experimental data. Blood levels (solid lines: predictions; □: data) and brain levels (dotted lines: predictions; △: data) were obtained from rats exposed to a single
iv dose of 9.2 mg/kg body weight (BW).
N-hydroxycarbaryl were detected in plasma following an oral
dose of 8.45 mg/kg BW or an iv dose of 9.2 mg/kg BW, but
these metabolites were not found in other tissues.

Pharmacodynamic Studies—Cholinesterase Inhibition
   Cholinesterase inhibition by carbaryl was measured for each of
the two routes in the high-dose exposures only. The depression of
cholinesterase activities in the brain and blood diminished as the
carbaryl concentrations decreased with time (Table 4). The half-life
of the inhibition effect was less than 2 h following a single oral
or a single iv exposure (i.e., the depression of cholinesterase
activity fell by half within 2 h of exposure).
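
One simple way to check the stated half-life is a log-linear regression of percent depression against time. The Python sketch below applies this to the brain averages from the high-dose iv experiment (Table 4b). The assumption of a single first-order decline is made here purely for illustration and is not described as the authors' method; the function name is hypothetical.

```python
import math


def first_order_half_life(times_h, percent_depression):
    """Half-life (h) from a least-squares fit of ln(depression) vs. time."""
    logs = [math.log(value) for value in percent_depression]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    slope = sxy / sxx                  # negative for a declining effect
    return math.log(2.0) / -slope


# Brain cholinesterase depression (%) after the high iv dose, Table 4b.
times = [0.083, 0.167, 0.333, 0.5, 1, 2, 4, 8]
brain_iv = [83.2, 79.6, 79.7, 75.8, 64.4, 44.7, 17.8, 1.9]
half_life = first_order_half_life(times, brain_iv)   # roughly 1.5 h
```

The fitted value of about 1.5 h agrees with the statement that the half-life of the inhibition effect was less than 2 h, although the last, very low data point has a strong influence on the slope.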

Model Parameters Characterization
   For all of the physiological parameters, the carbaryl-specific
kinetic constants (including tissue permeability and metabolic

                       TABLE 4a
 Average and Standard Deviations of Cholinesterase Activity
 Inhibition Levels^a in Brain and Red Blood Cells (RBC) From
     Rats Exposed to Oral High Doses (HDE) of Carbaryl

            Brain               RBC
Time (h)    Average   SD        Average   SD
0.25        57        8         59        9
0.5         52        7         41        31
1           45        6         76        26
2           27        10        29        25
4           10        6         50        21
6           13        6         20        37
12          2         10        3         46
24          1         8         <0.01     50

  ^a Expressed as percent cholinesterase depression.


                       TABLE 4b
 Average and Standard Deviations of Cholinesterase Activity
 Inhibition Levels^a in Brain and Red Blood Cells (RBC) From
     Rats Exposed to IV High Doses (HDE) of Carbaryl

            Brain               RBC
Time (h)    Average   SD        Average   SD
0.083       83.2      2.4       78        20
0.167       79.6      2.7       85        8
0.333       79.7      2.4       74        12
0.5         75.8      2.8       71        8
1           64.4      5.5       65        28
2           44.7      9.0       45        25
4           17.8      9.7       38        7
8           1.9       7.7       29        7

  ^a Expressed as percent cholinesterase depression.
constants), and various pharmacodynamic parameters, convergence of
the Markov chains was obtained after 25,000 iterations based on the
Brooks, Gelman, and Rubin convergence test. The resulting posterior
distributions had at most a geometric standard deviation (GSD) of 2
(Table 5), or a coefficient of variation of 100%. Similar posterior
estimates were also obtained with a larger prior GSD for the
variability of the parameters (results not shown): a GSD of 2 was
used for the metabolite-specific parameters, given the limited
literature data available.

PBPK/PD Model Simulation
   The marginal posterior distributions from the MCMC analysis were
used in the PBPK/PD model to simulate the tissue dosimetry and
cholinesterase inhibition. Correlations among the posterior
distributions were not considered in these simulations. Time-course
predictions of the carbaryl, 1-naphthol, and naphthol sulfate levels
in rat blood and brain tissue were within twofold of the measured
data (Figure 3). Using only the geometric mean posterior estimates of
the parameters, model simulations were compared to the individual
measurements of carbaryl and its components. The predicted tissue
dosimetry was then compared to the total radioactive residue (TRR)
population data for all dose levels and routes of exposure
(Figure 4). Values close to the identity line of the plot represent
consistency between predicted and observed TRR. Over 80% of the
predicted tissue concentrations are within twofold of the
experimental data. Although the tissue concentrations were predicted
consistently, the model predictions of cholinesterase inhibition
exhibited considerable variability, as the residuals are spread away
from the identity line (Figure 5). In this case, over 60% of the
model predictions are within 0.5-fold of the cholinesterase
inhibition data.
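
The "within twofold" criterion used above is easy to state in code. The Python sketch below counts the fraction of predicted values whose ratio to the observed value lies between 1/2 and 2. The function name and the four example pairs are hypothetical and are not data from the study.

```python
def fraction_within_fold(predicted, observed, fold=2.0):
    """Fraction of prediction/observation pairs agreeing within `fold`."""
    # Ratios are only defined for strictly positive values.
    pairs = [(p, o) for p, o in zip(predicted, observed) if p > 0 and o > 0]
    hits = sum(1 for p, o in pairs if 1.0 / fold <= p / o <= fold)
    return hits / len(pairs)


# Hypothetical example values (ppm), for illustration only.
predicted = [1.0, 3.0, 10.0, 0.4]
observed = [1.0, 1.0, 6.0, 1.0]
share = fraction_within_fold(predicted, observed)   # 2 of 4 pairs agree
```

Because the criterion is a ratio, it treats over- and underprediction symmetrically, which is why it is usually read off a plot with logarithmic or identity-line axes such as Figure 4.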
   Using the means and standard deviations of the marginal posterior
distributions of the parameters from the MCMC analysis, a Monte Carlo
simulation was conducted to generate distributions of model outputs.
The observed TRR values in our study are within the 95% credible
interval of the Monte Carlo predictions of the PBPK/PD model (example
for blood and brain in Figure 6). Similar results were also obtained
when comparing the model predictions with the data from low-dose
exposures (results not shown). The concentration of TRR in brain for
the oral dose experiment was overpredicted by the model. In general,
the cholinesterase depression by carbaryl measured in the experiments
was also within the 95% credible interval of the simulated values
(Figure 7). As observed with the pharmacokinetic data, the model
prediction of cholinesterase depression was more consistent with the
iv dosing data than with the oral dosing data. The MCMC analysis was
unable to determine an oral absorption rate that was consistent with
both the kinetic tissue concentrations and the cholinesterase
                                   TABLE 5
       Posterior Distribution Estimates for the Carbaryl PBPK/PD Model

Parameters                                         Mean (GM)  Variability (GSD)  Uncertainty
Physiological
  Fractional tissue volumes (kg⁻¹)
    Brain                                          0.0069     1.3                1.2
    Liver                                          0.033      1.3                1.2
    Fat                                            0.080      1.4                1.2
    Blood                                          0.065      1.3                1.2
  Hematocrit (HCT)                                 0.45       1.4                1.2
  Cardiac output^a (L/h/kg)                        16.58      1.4                1.2
  Fractional tissue blood flows
    Brain                                          0.051      1.3                1.2
    Liver                                          0.196      1.3                1.2
    Fat                                            0.067      1.3                1.2
Pharmacodynamic
  Acetylcholinesterase
    RBC enzyme activity in blood (U/kg)            824        1.6                1.2
    Cholinesterase rate constants
      Inhibition (µM⁻¹ × h⁻¹)                      0.69       1.8                1.2
      Regeneration (h⁻¹)                           6.0        2.0                1.2
  Butyrylcholinesterase
    Cholinesterase rate constants
      Inhibition (µM⁻¹ × h⁻¹)                      0.043      1.4                1.3
      Regeneration (h⁻¹)                           0.20       1.8                1.2
Chemical specific
  Carbaryl
    Partition coefficients
      Brain                                        1.04       1.6                1.2
      Liver                                        2.53       2.1                1.2
      Fat^a                                        5.44       1.5                1.2
      Rest of the body                             0.93       1.7                1.2
    Tissue permeability-area constants (L/h/kg)
      Brain                                        0.13       1.9                1.2
      Liver                                        1.82       2.1                1.2
      Fat                                          0.83       2.0                1.2
      Rest of the body                             6.29       1.7                1.2
    Gastrointestinal rate constants
      Oral absorption (h⁻¹)                        1.09       1.7                1.2
      Bile excretion (h⁻¹)                         0.011      1.6                1.3
  1-Naphthol
    Partition coefficients
      Brain                                        0.33       1.5                1.3
      Liver                                        2.45       1.8                1.2
      Fat^a                                        0.30       1.6                1.3
      Rest of the body                             0.0033     1.6                1.3
                                   TABLE 5
                                 (Continued)

Parameters                                         Mean (GM)  Variability (GSD)  Uncertainty
  1-Naphthol (continued)
    Elimination rate constants
      Liver VmaxC (µmol/h/kg)                      22.4       1.7                1.2
      Liver Km,naphthol (µmol/L)                   59.2       2.0                1.2
      Blood VmaxC (µmol/h/kg)                      21.2       1.6                1.3
      Blood Km,blood (µmol/L)                      78.0       2.0                1.2
      Bile excretion (h⁻¹)                         12.6       1.6                1.3
      First-order blood elimination (h⁻¹)          0.0011     1.6                1.3
      Naphthol sulfate formation (h⁻¹)             15.0       1.6                1.3
      Naphthol sulfate elimination (h⁻¹)           6.2        1.6                1.3
  Other composite metabolites
    Partition coefficients
      Brain                                        0.46       2.0                1.2
      Liver                                        7.96       2.0                1.2
      Fat                                          0.57       1.9                1.2
      Rest of the body                             5.16       2.1                1.2
    Elimination rate constants
      Liver VmaxC (µmol/h/kg)                      22.13      1.5                1.2
      Liver Km,others (µmol/L)                     215        1.8                1.2
      Bile excretion (h⁻¹)                         0.0032     1.6                1.2
      First-order blood elimination (h⁻¹)          0.0011     1.6                1.3
   Note. Distributions are described using geometric mean (GM) and geometric standard deviations (GSD).
Uncertainties are also presented as geometric standard deviations. Cardiac output (Qc) is scaled to BW^0.75 (e.g.,
Qc = Qcc × BW^0.75) and tissue volumes are scaled to BW. Vmax and permeability-area constants are scaled to BW^0.75.
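
The allometric scaling in the note can be illustrated with a short Python sketch. It uses the measured body weight prior from Table 2 (0.255 kg) together with the posterior cardiac output and liver volume fraction from Table 5; the function names are hypothetical, and the choice of combining these particular values is only for illustration.

```python
def scale_allometric(value_per_kg, body_weight_kg, exponent=0.75):
    """Scale a per-kg constant to the whole animal as value * BW**exponent."""
    return value_per_kg * body_weight_kg ** exponent


def tissue_volume(fraction, body_weight_kg):
    """Tissue volumes scale linearly with body weight."""
    return fraction * body_weight_kg


body_weight = 0.255                                   # kg, Table 2
cardiac_output = scale_allometric(16.58, body_weight)  # about 5.95 L/h
liver_volume = tissue_volume(0.033, body_weight)       # about 0.0084 kg
```

The same `scale_allometric` call would apply to Vmax and to the permeability-area constants, which the note says are also scaled to BW^0.75.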
[Figure 4: two panels; x-axis: Observed 14C level (mg/kg).]
FIG. 4.  Comparison of predicted and observed radioactive residue levels from rats exposed to carbaryl at various doses and routes (□: iv, •: oral). The identity
line marks similarity between prediction and experimental results.
[Figure 5: two panels; x-axis: Observed % ChE depression (0 to 100); y-axis scale 0 to 100.]
FIG. 5.  Comparison of predicted and observed cholinesterase (ChE) activity depression from rats exposed to carbaryl at various doses and routes
(O: iv, •: oral). The identity line marks similarity between prediction and experimental results.
inhibition by carbaryl. This difficulty is reflected in the uncer-
tainty of the posterior estimates.
   The model performance was also evaluated by fitting to pre-
viously reported carbaryl concentration in blood and AChE
inhibition in plasma (Fernandez et al.,  1982)  (Figure 8). The
model was  able to predict both the carbaryl concentration and
the extent of AChE inhibition over time, providing some sup-
port for the validity of the calibrated model.
DISCUSSION
   The   pharmacokinetics  of  carbaryl  was  previously
described with either a noncompartmental model or an empir-
ical model with limited virtual compartments (Houston et al.
1974; Cambon et al. 1980, 1981; Lechner & Abdel-Rahman
1986). In the current study, a PBPK/PD model for carbaryl
was developed to describe the mechanisms that link exposure
with target  tissue dose  and cholinesterase inhibition.  The
model incorporated  key pharmacokinetic characteristics of
carbaryl, including rapid uptake, rapid metabolic degradation,
and rapid elimination for all routes of administration, and rea-
sonably simulated the concentration of the parent compound
and  its metabolites  in several tissues from two different
dosing methods, i.e., iv and oral, and the resulting cholinest-
erase inhibition. Experience gained in developing this model
for carbaryl is expected  to provide  a basis for the more
efficient development of models for other NMCs, similar to
the seminal  impact of the PBPK model for DFP (Gearhart et
al., 1990) on the development of models of other OPs.  Subse-
quent to the DFP model, PBPK models were developed for a
number of other OPs, including  parathion  (Gearhart et al.,
1994), chlorpyrifos  (Timchalk et al., 2002), and diazinon
     (Poet et al., 2004); all of these models made use of data and
     approaches from the work on DFP.
       The refinement of the PBPK/PD model parameterization,
     including variability and uncertainty, in the present study was
     performed with an MCMC technique on a large set of individ-
     ual tissue concentrations, radioactive residue levels, and cho-
     linesterase inhibition data from male SD rats. While the prior
     estimates of some parameters were obtained by fitting with
     mean rat experimental values (e.g., mean [14C]carbaryl brain
     tissue concentration of all the rats together in the IV-HDE), the
     MCMC analysis made use of the individual measurements
     from the experiments (e.g., TRR brain tissue concentration of
     rat number 1, rat number 2,  rat number 3, and rat number 4
     from the IV-HDE). The present Bayesian estimation of posterior
     parameter distributions is more robust because it allows for the
     physiological and experimental variability of the kinetics and
     the inhibition response to carbaryl. In addition, because of the
     complex nature of the population data on total radioactive residues,
     the model description involved the estimation of the
     combined kinetics and responses of the parent compound as
     well as the kinetics of the metabolites. Such a large and complicated
     calibration of the parameters is beyond the capability of
     conventional modeling with average tissue concentration
     time-course data and is more sensibly performed with a
     Bayesian technique like MCMC (Hack, 2006).
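
The idea of calibrating against individual measurements rather than group means can be illustrated with a minimal sketch. It is not the model or software used in this study: the one-compartment iv model, the synthetic data, the prior, the error model, and all tuning constants below are assumptions chosen only to show a random-walk Metropolis sampler at work.

```python
import math
import random


def predict(k_el, dose, t):
    """Hypothetical one-compartment iv model: C(t) = dose * exp(-k_el * t)."""
    return dose * math.exp(-k_el * t)


def make_data(true_k=1.0, dose=10.0, n_rats=4, sigma=0.2, seed=0):
    """Synthetic individual measurements (time, concentration), one per rat and time."""
    rng = random.Random(seed)
    times = [0.5, 1.0, 2.0, 4.0]
    return [(t, predict(true_k, dose, t) * math.exp(rng.gauss(0.0, sigma)))
            for _ in range(n_rats) for t in times]


def log_posterior(log_k, data, dose, sigma=0.2,
                  prior_mean=math.log(0.5), prior_sd=1.0):
    """Normal prior on log(k_el) plus log-normal error on each observation."""
    k_el = math.exp(log_k)
    lp = -0.5 * ((log_k - prior_mean) / prior_sd) ** 2
    for t, conc in data:
        # log of the model prediction, written out to avoid underflow
        log_pred = math.log(dose) - k_el * t
        lp -= 0.5 * ((math.log(conc) - log_pred) / sigma) ** 2
    return lp


def metropolis(data, dose, n_iter=5000, step=0.05, seed=1):
    """Random-walk Metropolis on log(k_el); returns post-burn-in samples of k_el."""
    rng = random.Random(seed)
    current = math.log(0.5)  # start at the prior mean
    current_lp = log_posterior(current, data, dose)
    samples = []
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, data, dose)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        samples.append(math.exp(current))
    return samples[n_iter // 2:]  # discard the first half as burn-in


data = make_data()
posterior = sorted(metropolis(data, dose=10.0))
median = posterior[len(posterior) // 2]
lower = posterior[int(0.05 * len(posterior))]
upper = posterior[int(0.95 * len(posterior))]
print(f"k_el median {median:.3f}, 5th-95th percentile {lower:.3f}-{upper:.3f}")
```

Because every animal contributes its own data points to the likelihood, the spread of the posterior samples reflects both between-animal variability and parameter uncertainty, which a fit to mean values cannot separate.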
       The present modeling work quantified  the parameters in
     terms of variability (difference between individual rats) and
     uncertainty (level of confidence in estimates) at the popula-
     tion level, assuming that all  the rats of our study are part of
     the same population. The MCMC analysis with the individual
     data reduced uncertainty of the estimates from 0.5 to 0.3 fold.
     While some parameter priors were subjective,  the posteriors
-------
                                                     A. NONG ET AL.
[Figure 6 plots, panels (A) and (B); x-axis: Time (hours).]
FIG. 6.  Monte Carlo simulated distributions of the total radioactive residue level in the brain (A) and in blood (D) using MCMC posterior estimates. Simulated
results are compared with the high-dose exposure (HDE) data for (A) oral (8.45 mg/kg BW) and (B) iv (9.2 mg/kg BW). Dotted line represents median PK
profile, while the upper 95th and lower 5th credible intervals are presented in solid lines.
were calibrated with the individual experimental data, providing
a greater level of confidence in the estimates. In general,
the model predictions using the posteriors were consistent
with the experimental iv data in this study as well as with the
study by Fernandez et al. (1982). Consistency between tissue
concentrations and cholinesterase inhibition was less well
achieved for oral administration. Such route-specific
differences in response have been observed for other
pesticides such as carbosulfan and methyl parathion (Renzi &
Krieger, 1986; Kramer et al., 2002). Discrepancies in kinetics
and responses from the oral route suggest that the delivery of
carbaryl by the gut is more complicated than the first-order
absorption currently assumed. The dose and route
dependencies will be analyzed further in a subsequent modeling
effort.
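
For orientation, first-order absorption means that the rate of delivery from the gut is simply ka times the amount still unabsorbed. The sketch below illustrates that assumption only; the function names and the numerical value of ka are hypothetical, and the dose is used purely as an example input.

```python
import math


def gut_amount(dose_oral, ka, t):
    """Amount remaining in the gut at time t under first-order absorption."""
    return dose_oral * math.exp(-ka * t)


def absorption_rate(dose_oral, ka, t):
    """Rate of delivery from the gut at time t: ka * D_oral(t)."""
    return ka * gut_amount(dose_oral, ka, t)


def absorption_half_life(ka):
    """Time for half of the remaining dose to be absorbed."""
    return math.log(2.0) / ka


ka = 2.0  # hypothetical rate constant (1/h), not a value from this study
print(absorption_half_life(ka))        # half-life in hours
print(absorption_rate(8.45, ka, 0.5))  # delivery rate at 0.5 h for an 8.45 mg/kg dose
```

A single exponential of this kind cannot reproduce delayed or multi-phase uptake, which is one way the oral discrepancies described above could arise.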
       Previous studies quantified cholinesterase inhibition in the
    rat brain and blood based on a single route of exposure (450,
    800, or 1200 mg/kg orally in Mount et al., 1981; 20 mg/kg iv in
    Fernandez et al., 1982). In the current study, cholinesterase
    inhibition was investigated for two routes of exposure
    to carbaryl, oral and iv. The sensitivity of the predicted
    blood concentration and cholinesterase depression to the model
    parameters varied with the exposure route
    (data not shown). Following iv exposure, the predicted
    tissue concentrations are most sensitive to
    blood flows, such as blood flow to the fat or muscle,
    while the blood:brain partition coefficient has a greater impact than tissue
    blood flow on predicted carbaryl level and cholinesterase
    activity in the brain. Following oral exposures, the simulated
    tissue concentrations are most sensitive to the oral absorption
-------
[Figure plots, panels (A) and (B); x-axis: Time (hours).]
-------
Banerjee,  J.,  Ghosh, P.,  Mitra, S.,  Ghosh, N., and Bhattacharya,  S.  1991.
   Inhibition of human  fetal brain acetylcholinesterase:  marker effect  of
   neurotoxicity. J. Toxicol. Environ. Health 33:283-290.
Bernillon, P., and Bois, F. Y. 2000. Statistical issues in toxicokinetic modeling:
   A Bayesian perspective. Environ. Health Perspect. 108:883-893.
Bois, F. 2000. Statistical analysis of Clewell et al. PBPK model of trichloroethylene
   kinetics. Environ. Health Perspect. 108 (Suppl. 2):307-316.
Bois, F. Y., Jackson, E. T., Pekari, K., and Smith, M. T. 1996. Population
   toxicokinetics of benzene. Environ. Health Perspect. 104:1405-1411.
Brown, R. P., Delp, M. D., Lindstedt, S. L., Rhomberg, L. R., and Beliles, R. P.
   1997. Physiological parameter values for physiologically based pharmaco-
   kinetic models.  Toxicol. Ind. Health 13:407-484.
Cambon, C., Fernandez, Y., Falzon,  M., and Mitjavila, S. 1981. Variations of
   the digestive absorption kinetics of carbaryl with the nature of the vehicle.
   Toxicology 22:45-51.
Cambon, C., Declume, C., and Derache, R. 1980. Fetal and  maternal rat brain
   acetylcholinesterase: Isoenzymes changes following insecticidal carbamate
   derivatives poisoning. Arch. Toxicol. 45:257-262.
Carlin, B. P.,  and Louis, T. A. 2000. Bayes and empirical Bayes methods for
   data analysis, 2nd ed.  Boca Raton, FL: Chapman & Hall/CRC.
Carpenter, C. P., Weil, C. W., Palm, P. E., Woodside, M. W., Nair, J. H., III,
   and Smyth, H. F., Jr. 1961. Mammalian toxicity of 1-naphthyl-N-methylcarbamate
   (Sevin insecticide). J. Agric. Food Chem. 9:30-39.
Dawson, R. 1994.  Rate constants of carbamylation  and decarbamylation of
   acetylcholinesterase for physostigmine and  carbaryl in the presence of an
   oxime. Neurochem. Int. 24:173-182.
Declume, C.,  and Benard, P. 1977. Fetal accumulation of [14C] carbaryl in rats
   and mice.  Autoradiographic study. Toxicology 8:95-105.
Fernandez, Y., Falzon, M., Cambon-Gros, C., and Mitjavila, S. 1982.  Carbaryl
   tricompartmental toxicokinetics  and  anticholinesterase  activity.  Toxicol.
   Lett. 13:253-258.
Fillmore, C.,  and Lessenger,  J. 1993. A cholinesterase testing  program for
   pesticide applicators. J. Occup. Med. 35:61-70.
Gearhart, J., Jepson, G., Clewell, H., Andersen, M., and Conolly, R. 1990. A
   physiologically-based pharmacokinetic and pharmacodynamic model for
   the inhibition of acetylcholinesterase by diisopropylfluorophosphate.
   Toxicol. Appl. Pharmacol. 106:295-310.
Gearhart, J., Jepson, G.,  Clewell,  H., Andersen, M.,  and Conolly,  R.  1994.
   Physiologically-based pharmacokinetic and pharmacodynamic model for
   the inhibition of acetylcholinesterase  by organophosphate  esters.  Environ.
   Health Perspect. 102:51-60.
Hack, C. E. 2006. Bayesian analysis of physiologically based toxicokinetic and
   toxicodynamic models. Toxicology. 221:241-248.
Houston, J. B., Upshall, D. G., and Bridges, J. W. 1975. Pharmacokinetics and
   metabolism  of two carbamate insecticides, carbaryl and  landrin, in the rat.
   Xenobiotica 5:637-648.
International  Programme on  Chemical  Safety. 1994. Carbaryl. EHC  153.
   Geneva: World Health Organization.
Jonsson, F., Bois, F., and Johanson, G.  2001. Physiologically based pharmaco-
   kinetic modeling of inhalation exposure of humans to dichloromethane
   during moderate to heavy exercise. Toxicol. Sci. 59:209-218.
Knaak, J. B., Tallant, M. J., Bartley, W.  J., and Sullivan, L.  J. 1965. The
   metabolism  of carbaryl  in the rat, guinea pig, and man.  J. Agric.  Food
   Chem. 13:537-543.
Knight, E. V., Alvares, A. P., and Chin, B. H. 1987. Effects of phenobarbital
   pretreatment on the in vivo metabolism of carbaryl in rats. Bull. Environ.
   Contam. Toxicol. 39:815-821.
Lechner, D. W., and Abdel-Rahman, M. S. 1985. Alterations in rat liver
   microsomal enzymes following exposure to carbaryl and malathion in
   combination. Arch. Environ. Contam. Toxicol. 14:451-457.
Lockridge, O., and Masson, P. 2000. Pesticides and  susceptible  populations:
   People with butyrylcholinesterase genetic variants may  be at risk. Neuro-
   toxicology. 21:113-126.
Main, A. R., Tarkan, E., Aull, J. L., and Soucie,  W.  G. 1972. Purification of
   horse serum cholinesterase by  preparative polyacrylamide  gel  electro-
   phoresis. J. Biol. Chem. 247:566-571.
Maxwell, D. M., Lenz, D. E., Groff, W. A., Kaminskis, A., and Froehlich,
   H. L. 1987. The effects of blood flow and detoxification on in vivo
   cholinesterase inhibition by soman in rats. Toxicol. Appl. Pharmacol.
   88:66-76.
      May, D., Naukam, R., Kambam, R., and Branch, R. 1992. Cimetidine-carbaryl
         interaction  in humans:  Evidence for an active  metabolite of  carbaryl.
         J. Pharmacol. Exp. Ther. 262:1057-1061.
      McCracken, N. W., Blain, P. G., and Williams, F. M.  1993. Nature and role of
         xenobiotic  metabolizing esterases  in rat liver,  lung,  skin and blood.
         Biochem. Pharmacol. 45:31-36.
      McDougal, J. N., Jepson, G. W., Clewell, H.  J. 3rd, Gargas, M. L., and Ander-
         sen, M. E. 1990. Dermal absorption of organic chemical vapors in rats and
         humans. Fundam. Appl. Toxicol.  14:299-308.
      Mirfazaelian, A., Kim, K. B., Anand, S. S., Kim, H. J., Tornero-Velez, R.,
         Bruckner, J. V., and Fisher, J. W. 2006. Development of a physiologically
         based pharmacokinetic model for deltamethrin in the adult male
         Sprague-Dawley rat. Toxicol. Sci. 93:432-442.
      Mortensen, S. R., Hooper, M. J., and Padilla, S.  1998.  Rat brain acetylcho-
         linesterase activity: Developmental profile and maturational sensitivity to
         carbamate and organophosphorus inhibitors. Toxicology 125:13-19.
      Mount, M. E., Dayton, A. D., and Oehme, F. W. 1981. Carbaryl residues in
         tissues and cholinesterase activities in brain and blood of rats receiving
         carbaryl. Toxicol. Appl. Pharmacol. 58:282-296.
      Nostrandt, A. C., Duncan, J. A.,  and Padilla, S. 1993. A modified spectropho-
         tometric method appropriate for measuring cholinesterase activity in tissue
         from carbaryl-treated animals. Fundam. Appl. Toxicol. 21:196-203.
      Poet, T. S., Kousba, A. A., Dennison, S. L., and Timchalk, C. 2004. Physiolog-
         ically  based  pharmacokinetic/pharmacodynamic  model for the organo-
         phosphorus pesticide diazinon. Neurotoxicology 25:1013-1030.
      Poulin, P., and Krishnan, K.  1996. A  mechanistic algorithm  for predicting
         blood:air partition coefficients of organic chemicals with the consideration of
         reversible binding in hemoglobin. Toxicol. Appl. Pharmacol. 136:131-137.
      Rao, P. S., Roberts,  G. H., Pope, C. N.,  and  Ferguson, P. W.  1994. Compara-
         tive  inhibition of rodent and human erythrocyte acetylcholinesterase by
         carbofuran and carbaryl. Pestic. Biochem. Physiol. 48:79-84.
      Renzi, B. E., and Krieger, R. I. 1986. Sublethal acute toxicity of carbosulfan
         [2,3-dihydro-2,2-dimethyl-7-benzofuranyl(di-n-butylaminosulfenyl)(methyl)carbamate]
         in the rat after intravenous and oral exposures.
         Fundam. Appl. Toxicol. 6:7-15.
      Shealy, D. B., Barr,  J. R.,  Ashley, D. L., Patterson, D. G., Camann, D. E., and
         Bond,  A. E.  1997. Correlation  of environmental carbaryl measurements
         with serum and  urinary 1-naphthol  measurements in a farmer applicator
         and his family. Environ. Health Perspect. 105:510-513.
      Smith,  B.  J. 2005. Bayesian Output Analysis program  (BOA), version 1.1.5.
         Ames: University of Iowa.
      Sogorb, M. A., and Vilanova, E.  2002. Enzymes involved in the detoxification
         of  organophosphorus,  carbamate  and  pyrethroid  insecticides through
         hydrolysis.  Toxicol. Lett. 128:215-228.
      Strother, A., and Wheeler, L. 1980. Excretion and disposition of [14C]carbaryl
         in pregnant, non-pregnant and fetal tissues of the rat after acute administra-
         tion. Xenobiotica 10:113-124.
      Suckow, M. A., Weisbroth,  S. H., and Franklin, C. L., eds. 2005. The labora-
         tory rat. Boston: Elsevier Academic.
      Tanaka, R., Fujisawa, S., Nakai,  K., and Minagawa, K. 1980. Distribution and
         biliary excretion of carbaryl, dieldrin and paraquat in rats: Effect of diets.
         J. Toxicol. Sci. 5:151-162.
      Tang, J., Cao, Y., Rose, R. L., and Hodgson, E. 2002. In vitro metabolism of
         carbaryl by human  cytochrome  P450 and its inhibition by chlorpyrifos.
         Chem. Biol. Interact. 141:229-241.
      Thrall, K. D., and Woodstock, A. D. 2002. Evaluation of the dermal absorption
         of aqueous toluene in F344 rats using real-time breath analysis and physio-
         logically based pharmacokinetic  modeling. J.  Toxicol. Environ. Health A
         65:2087-2100.
      Timchalk, C., Nolan, R. J.,  Mendrala, A. L., Dittenber,  D. A., Brzak, K.  A.,
         and  Mattsson, J.  L. 2002. A physiologically based pharmacokinetic and
         pharmacodynamic (PBPK/PD) model for the organophosphate insecticide
         chlorpyrifos in rats and humans. Toxicol.  Sci. 66:34-53.
      U.S.  Environmental Protection  Agency. 1999. Integrated Risk Information
         System (IRIS) on carbaryl. Washington, DC: National Center for Environ-
         mental Assessment, Office of Research and Development.
-------
Wang, C., and Murphy, S. D. 1982. The role of non-critical binding proteins in
   the sensitivity of acetylcholinesterase from different species to diisopropyl
   fluorophosphate (DFP) in vitro. Life Sci. 31:139-149.
Ward, S., May, D., Heath, A., and Branch, R. 1988. Carbaryl metabolism is
   inhibited by cimetidine in the isolated perfused rat liver and in man. Clin.
   Toxicol. 26:269-281.
Whittaker, M. 1986. Cholinesterase. Monographs in Human Genetics, Vol. 11.
   Basel: Karger.
APPENDIX

MODEL PARAMETER ABBREVIATIONS
Doral: the oral dose
RIV: the intravenous injection rate
Qt: blood flow (Qc: cardiac output)
Hct: hematocrit
Vt: tissue volume
PAt: tissue permeability constant
Pt: tissue:blood partition coefficient
Ct: tissue concentration (a: arterial, v: venous)
Atissue: tissue amount
AVtissue: tissue blood circulatory amount
Atissue,ChE: cholinesterase tissue amount
Atissue,ChEI: inhibited cholinesterase tissue amount
ka: oral absorption rate constant
kbile: biliary excretion rate constant
RAM: rate of metabolism from carbaryl to 1-naphthol or other
   composite metabolites
Vmax: maximal metabolic velocity
Km: Michaelis-Menten affinity constant
kf, ke: rate constants of naphthol sulfate formation and elimination
RABrninh: rate of brain cholinesterase inhibition by carbaryl
RABldinh: rate of blood cholinesterase inhibition by carbaryl
kd, kr, ki: rate constants of cholinesterase degradation, regeneration
   and inhibition
Ks: rate of cholinesterase synthesis
   Note: Ks was calculated from enzyme activity constants as
described in the methods and with the values in Tables 2 and 5.
Tissue volumes given in Table 2 are the sum of true tissue
volume (95% of the reported value) and the tissue blood
volume (5% of the reported value).

PHARMACOKINETIC MODEL EQUATIONS

Carbaryl tissue kinetics
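   Liver tissue concentration:

$$\frac{dAV_{liver}}{dt} = Q_{liv}\,(Ca - CV_{liver}) + PA_{liver}\left(\frac{C_{liver}}{P_{liver}} - CV_{liver}\right)$$

$$\frac{dA_{liver}}{dt} = PA_{liver}\left(CV_{liver} - \frac{C_{liver}}{P_{liver}}\right) - RAM_{naphthol} - RAM_{others} - k_{bile} \times A_{liver} + k_a \times D_{oral}$$

$$C_{liver} = A_{liver}/(0.95 \times V_{liver})$$

$$RAM_{metabolite} = \frac{V_{max,metabolite} \times C_{liver}}{K_{m,metabolite} + C_{liver}}$$

   Fat and remaining body tissues concentration:

$$\frac{dAV_{tissue}}{dt} = Q_t\,(Ca - CV_{tissue}) + PA_t\left(\frac{C_{tissue}}{P_t} - CV_{tissue}\right)$$

$$\frac{dA_{tissue}}{dt} = PA_t\left(CV_{tissue} - \frac{C_{tissue}}{P_t}\right)$$

$$CV_{tissue} = AV_{tissue}/(0.05 \times V_{tissue})$$

   Brain tissue concentration:

$$\frac{dAV_{brain}}{dt} = Q_{brn}\,(Ca - CV_{brain}) + PA_{Br}\left(\frac{C_{brain}}{P_{brain}} - CV_{brain}\right)$$

$$\frac{dA_{brain}}{dt} = PA_{Br}\left(CV_{brain} - \frac{C_{brain}}{P_{brain}}\right)$$

$$C_{brain} = A_{brain}/(0.95 \times V_{brain})$$

$$CV_{brain} = AV_{brain}/(0.05 \times V_{brain})$$

   Arterial and venous blood concentration (Ca, Cv):

$$\frac{dA_{blood}}{dt} = Q_c\,(Cv - Ca) - RA_{Bldinh} - RAM_{blood,naphthol}$$

$$Cv = (Q_{brn} \times Cv_{brain} + Q_{fat} \times Cv_{fat} + Q_{liv} \times Cv_{liver} + Q_{bod} \times Cv_{body} + RIV)/Q_c$$

$$Ca = A_{blood}/V_{blood}$$

$$C_{plasma} = Ca\,(1 - Hct)$$

$$C_{rbc} = Ca \times Hct$$

$$RAM_{blood,naphthol} = \text{[illegible]}$$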
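
As a rough illustration of the permeability-limited tissue description listed above, the sketch below integrates a single tissue split into tissue blood (5% of the reported volume) and tissue proper (95%), following the note on tissue volumes. It assumes a constant arterial concentration and simple Euler stepping; the function name and all numerical values are hypothetical and are not parameters from this study.

```python
def simulate_tissue(ca, q, pa, p, v, t_end, dt=1e-4):
    """Euler integration of one permeability-limited tissue compartment.

    ca: arterial concentration (held constant here, an assumption)
    q:  tissue blood flow; pa: permeability constant
    p:  tissue:blood partition coefficient; v: reported tissue volume
    Returns (tissue concentration, tissue venous blood concentration).
    """
    a_tissue = 0.0   # amount in tissue proper
    av_tissue = 0.0  # amount in tissue blood
    for _ in range(int(round(t_end / dt))):
        c = a_tissue / (0.95 * v)    # tissue concentration
        cv = av_tissue / (0.05 * v)  # tissue blood concentration
        flux = pa * (cv - c / p)     # net transfer from blood into tissue
        av_tissue += dt * (q * (ca - cv) - flux)
        a_tissue += dt * flux
    return a_tissue / (0.95 * v), av_tissue / (0.05 * v)


# Hypothetical values chosen only so that the explicit Euler step is stable.
c_tissue, cv_tissue = simulate_tissue(ca=1.0, q=0.1, pa=0.05, p=2.0,
                                      v=0.01, t_end=10.0)
print(c_tissue, cv_tissue)
```

With a constant arterial input the compartment settles where the tissue blood equals the arterial concentration and the tissue equals the partition coefficient times that value, which is a quick check on the signs of the exchange terms.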
-------
1-Naphthol and other composite metabolites tissue kinetics
   Fat, brain and remaining body tissue concentration (including
liver for other metabolites):

$$CX_{tissue} = AX_{tissue}/V_{tissue}$$

$$CvX_{tissue} = CX_{tissue}/PX_{tissue}$$

   Liver tissue concentration (1-naphthol):

$$\frac{dA_{liver,naphthol}}{dt} = Q_{liv}\,(Ca_{naphthol} - Cv_{liver,naphthol}) + RAM_{naphthol} - k_{Bile,naphthol} \times A_{liver,naphthol} - k_f \times A_{liver,naphthol}$$

$$C_{liver,naphthol} = A_{liver,naphthol}/V_{liver}$$

$$Cv_{liver,naphthol} = C_{liver,naphthol}/P_{liver,naphthol}$$

$$\frac{dA_{liver,naphthol\text{-}sulfate}}{dt} = k_f \times A_{liver,naphthol} - k_e \times A_{naphthol\text{-}sulfate}$$

   Arterial and venous blood concentration (Ca, Cv):

$$\frac{dA_{blood,naphthol}}{dt} = \text{[illegible]}$$

$$Cv = (Q_{brn} \times Cv_{brain} + Q_{fat} \times Cv_{fat} + Q_{liv} \times Cv_{liver} + Q_{bod} \times Cv_{body})/Q_c$$

$$Ca = A_{blood}/V_{blood}$$

Pharmacodynamic model equations
Brain, plasma or red blood cells acetyl- and butyrylcholinesterase
(AChE, BChE) inhibition:

$$\frac{dA_{tissue,AChE}}{dt} = K_s - k_d \times A_{tissue,AChE} - k_{i,AChE} \times A_{tissue,AChE} \times C_{tissue} + k_{r,AChE} \times A_{tissue,AChEI}$$

$$\frac{dA_{tissue,AChEI}}{dt} = k_{i,AChE} \times A_{tissue,AChE} \times C_{tissue} - k_{r,AChE} \times A_{tissue,AChEI}$$

$$\frac{dA_{tissue,BChE}}{dt} = K_s - k_d \times A_{tissue,BChE} - k_{i,BChE} \times A_{tissue,BChE} \times C_{tissue} + k_{r,BChE} \times A_{tissue,BChEI}$$

$$\frac{dA_{tissue,BChEI}}{dt} = k_{i,BChE} \times A_{tissue,BChE} \times C_{tissue} - k_{r,BChE} \times A_{tissue,BChEI}$$
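
The cholinesterase balance can be sketched numerically using the rate constants named in the parameter list (Ks for synthesis, kd for degradation, ki for inhibition, kr for regeneration). The sketch assumes zero-order synthesis, first-order degradation of free enzyme, and reversible inhibition proportional to the tissue carbaryl concentration; that structure, the function names, and every numerical value are illustrative assumptions rather than the calibrated model.

```python
import math


def simulate_che(ks, kd, ki, kr, conc, t_end, dt=1e-3):
    """Euler integration of free and inhibited cholinesterase amounts.

    conc: function of time returning the tissue carbaryl concentration.
    Returns a list of (time, free enzyme, inhibited enzyme).
    """
    free = ks / kd   # baseline: synthesis balances degradation
    inhibited = 0.0
    t = 0.0
    trace = [(t, free, inhibited)]
    for _ in range(int(round(t_end / dt))):
        inhibition = ki * free * conc(t)  # carbamylation of free enzyme
        regeneration = kr * inhibited     # spontaneous decarbamylation
        free += dt * (ks - kd * free - inhibition + regeneration)
        inhibited += dt * (inhibition - regeneration)
        t += dt
        trace.append((t, free, inhibited))
    return trace


def percent_depression(trace, baseline):
    """Percent loss of free enzyme relative to baseline at each time point."""
    return [(t, 100.0 * (1.0 - free / baseline)) for t, free, _ in trace]


# Hypothetical exposure: tissue concentration decaying mono-exponentially.
trace = simulate_che(ks=1.0, kd=0.01, ki=1.0, kr=1.0,
                     conc=lambda t: 5.0 * math.exp(-2.0 * t), t_end=24.0)
depression = percent_depression(trace, baseline=1.0 / 0.01)
print(max(d for _, d in depression))  # peak percent depression
```

Because regeneration is fast relative to enzyme turnover in this example, depression peaks shortly after the concentration peak and then recovers, the qualitative pattern expected for a reversible carbamate inhibitor.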
-------
Bayesian Meta-Analysis
of Genetic Association Studies
with Different Sets of Markers

Claudio Verzilli,2,10 Tina Shah,1,10 Juan P. Casas,2 Juliet Chapman,2 Manjinder Sandhu,3
Sally L. Debenham,3 Matthijs S. Boekholdt,4 Kay Tee Khaw,3 Nicholas J. Wareham,5 Richard Judson,6
Emelia J. Benjamin,7 Sekar Kathiresan,7 Martin G. Larson,7 Jian Rong,7 Reecha Sofat,1
Steve E. Humphries,8 Liam Smeeth,2 Gianpiero Cavalleri,9 John C. Whittaker,2,*
and Aroon D. Hingorani1

Robust assessment of genetic effects on quantitative traits or complex-disease risk requires synthesis of evidence from multiple studies.
Frequently, studies have genotyped partially overlapping sets of SNPs within a gene or region of interest, hampering attempts to com-
bine all the available data. By using the example of C-reactive protein (CRP) as a quantitative trait, we show how linkage disequilibrium
in and around its gene facilitates use of Bayesian hierarchical models to integrate informative data from all available genetic association
studies of this trait, irrespective of the SNP typed. A variable selection scheme, followed by contextualization of SNPs exhibiting inde-
pendent associations within the haplotype structure of the gene, enhanced our ability to infer likely causal variants in this region with
population-scale data. This strategy, based on data from a literature-based systematic review and substantial new genotyping, facilitated
the most comprehensive evaluation to date of the role of variants governing CRP levels, providing important information on the
minimal subset of SNPs necessary for comprehensive evaluation of the likely causal relevance of elevated CRP levels for
coronary-heart-disease risk by Mendelian randomization. The same method could be applied to evidence synthesis of other quantitative traits,
whenever the typed SNPs vary among studies, and to assist fine mapping of causal variants.
Introduction

Genetic effects underlying complex traits and disorders are
small, and their detection requires comprehensive typing
of single nucleotide polymorphisms (SNPs) in large samples.1,2
Many previous genetic association studies have
been underpowered,3,4 and even very large biobanks5
may not individually provide conclusive results for certain
outcomes. Quantitative synthesis of evidence from available
studies remains vital,6-8 even in the era of genome-wide
analyses.9-11 However, a major obstacle is that studies
of the same gene, region, or even the genome as a whole
may type a different repertoire of SNPs, thereby yielding
partially overlapping genotypic data. Moreover, often
only single-SNP summary data, for instance genotype
means at each SNP, are reported.
  The meta-analysis of results from each marker in isolation
would exclude those studies that did not type the
marker in question, with a potential loss of power; moreover,
multiple single-SNP analyses are difficult to interpret.
Instead, it would be useful to be able to combine data with
information from all sites, adjusting any association at
each site for the possible correlation with the remaining
variants. One could then disentangle effects at causal sites
from those at sites that are in LD with a causal variant(s)
and also borrow information across studies. With focus
on a quantitative trait, we develop a Bayesian hierarchical
linear regression that models linear transformations of the
study-specific genotype-group-specific phenotypic means
and that uses pairwise LD measurements between markers
to make posterior inference on adjusted effects. Information
on pairwise marker LD is often provided by the individual
studies as part of the results reported. Alternatively,
for markers that are not considered jointly in any of the
studies at hand, it can often be obtained from public databases.
This information is then used to specify informative
priors in our Bayesian framework. Specifically, the
between-marker correlations are modeled by introduction
of spatially correlated random effects having a conditional
autoregressive (CAR) distribution.12,13 The between-study
variability is then accommodated with a random intercept
term across studies.
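
To make the role of LD in such a prior concrete, the sketch below computes the conditional mean and variance of one marker's random effect given the others under a proper CAR prior whose neighbor weights are pairwise LD values. This is one common CAR parameterization and not necessarily the one used in this paper; the function name, the precision `tau`, the dependence parameter `rho`, and the LD matrix are all hypothetical.

```python
def car_conditional(i, effects, ld, tau=1.0, rho=0.9):
    """Conditional mean and variance of effect i under a proper CAR prior.

    effects: current random effects, one per marker
    ld:      symmetric matrix of pairwise LD weights (e.g. r-squared)
    tau:     precision scale; rho: strength of dependence between markers
    """
    others = [j for j in range(len(effects)) if j != i]
    w_sum = sum(ld[i][j] for j in others)
    if w_sum <= 0.0:
        raise ValueError("marker has no LD-weighted neighbors")
    # LD-weighted average of the other markers' effects, shrunk by rho
    mean = rho * sum(ld[i][j] * effects[j] for j in others) / w_sum
    # More (or stronger) LD neighbors give a tighter conditional prior
    variance = 1.0 / (tau * w_sum)
    return mean, variance


# Hypothetical three-marker example: markers 0 and 1 are in strong LD.
ld = [[1.0, 0.8, 0.1],
      [0.8, 1.0, 0.2],
      [0.1, 0.2, 1.0]]
effects = [0.5, 0.3, -0.1]
print(car_conditional(0, effects, ld))
```

Under this kind of prior, a marker typed in only a few studies borrows strength mainly from markers in high LD with it, which is how partially overlapping SNP sets can still be analyzed jointly.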
      Our approach is motivated by the meta-analysis of stud-
    ies assessing the effect of variants in the C-reactive protein
    (CRP [MIM 123260]) gene region on  plasma CRP  levels.
    CRP is a  circulating monomorphic hepatic  acute-phase
    protein that  indexes and may mediate aspects of the in-
    flammatory response.14 Aside from acute-phase elevations,
1Centre for Clinical Pharmacology, University College London, London WC1E 6JF, UK; 2Department of Epidemiology and Population Health, London
School of Hygiene and Tropical Medicine, London WC1E 7HT, UK; 3Department of Public Health and Primary Care, University of Cambridge, Cambridge
CB1 8RN, UK; 4Department of Cardiology, Academic Medical Center, Amsterdam 1100 DD, Netherlands; 5MRC Epidemiology Unit, University of Cambridge,
Cambridge CB2 0QQ, UK; 6Genaissance Pharmaceuticals, New Haven, CT 06511, USA; 7Framingham Heart Study, Framingham, MA 01702-5827,
USA; 8Centre for Cardiovascular Genetics, University College London, London WC1E 6JF, UK; 9Molecular and Cellular Therapeutics, RCSI Research
Institute, Royal College of Surgeons in Ireland, Dublin 2, Ireland
10These authors contributed equally to this work.
*Correspondence: john.whittaker@lshtm.ac.uk
DOI 10.1016/j.ajhg.2008.01.016. ©2008 by The American Society of Human Genetics. All rights reserved.
                                                     The American Journal of Human Genetics 82, 859-872, April 2008  859

-------
blood concentrations of CRP show similar within-individ-
ual variability to serum cholesterol,  and like cholesterol,
CRP has been shown to be associated with future coronary
heart disease (CHD) risk in observational studies.15 How-
ever, the etiological relevance of this potentially important
and highly studied link with  CHD  is uncertain because
CRP may simply be a marker for established risk factors
or for subclinical atheroma.16,17 Common SNPs that are
in the gene encoding CRP and that influence  its level
may help  provide insight on the link because, unlike
CRP itself,  genotype is fixed and unaffected by subclinical
disease and the naturally randomized allocation of alleles
at conception balances the distribution of potential con-
founding factors among genotypic classes. Genetic associ-
ations are  therefore less prone to biases  that limit  causal
inference from observational studies, and genetic studies
possess properties of a randomized intervention trial.16-18
Therefore, identification of CRP-gene variants (HGNC:
2637; 1q21-q23) that influence its concentration is fundamental
to evaluating the causal relevance of CRP with the
principle of Mendelian randomization.19
  In the absence of hepatic stores of CRP, and given its
constant rate of clearance, gene  transcription provides
the major point of regulation.14  Transcription may be
modified by regulatory SNPs  because concentrations of
CRP show  strong concordance among monozygotic twins
and family studies suggest substantial  heritability.20 In
populations of European descent,  there  are  11 common
SNPs with minor allele frequency >5% within 6 kb of
the CRP gene, but extensive linkage  disequilibrium (LD)
means that four major haplotypes  account for 94% of
chromosomes (see Web Resources).21,22 Individual reports
evaluating associations of CRP SNPs with CRP concentra-
tion have  either typed  single SNPs  or a subset of SNPs
(sometimes tag SNPs) in this region (see Table S1 available
online). However,  the SNPs  have varied across studies,
thereby limiting the ability to pool all available data. We
therefore developed a  new integrative approach to evi-
dence synthesis of genetic association studies that  allows
for this  complexity.
  Methods for combining data from genome-wide scans
with nonoverlapping sets of  SNPs with individual-level
genotyping data have been recently proposed by March-
ini et al.23 Here,  because individual-level  data are not
available for most of  the studies on CRP,  we develop
                        Figure 1.  Location of the Eight CRP
                        SNPs Typed Directly in the 26 Data Sets
                        Included in This Study
                        The upper track shows chromosomal location;
                        the middle track shows SNP location
                        and log(P) for the per-allele
                        random-effect meta-analysis (from left
                        to right, the SNPs are ordered as follows:
                        rs3093077, rs1205, rs1130864, rs1800947,
                        rs1417938, rs3091244, rs2794521, and
                        rs3093059); and the lower track shows the
                        intron/exon structure of the CRP gene.
    a method that allows the synthesis of studies providing
    only summary data. Also, we are mainly concerned with
    the  synthesis of SNP data in regions of interest for fine
    mapping,  where the number of markers typed is small
    and interest  is  on  disentangling  independent effects
    using a variable selection scheme, for which the Marchini
    approach is not suitable.


    Material and Methods

    We first conducted a literature-based systematic review of all rele-
    vant studies (irrespective of the SNP typed). A total of 23 published
    data sets identified by systematic review evaluated associations
    of eight SNPs (rs3093059; rs2794521; rs3091244; rs1417938;
    rs1800947; rs1130864; rs1205; and rs3093077) in the CRP gene
    with CRP concentration (Figure 1). With data from SeattleSNPs,
    a combination of three SNPs (rs1130864; rs1205; and rs3093077)
    was identified as haplotype tag SNPs with the haplotype r² method
    in European subjects. These tag SNPs were typed  in three addi-
    tional population-based studies,  thereby giving an aggregate of
    26 studies including 32,802 subjects. No SNP was typed in every
    study, but there was partial overlap of SNP typing across several
    studies (see Appendix A and Table S1).

    Bayesian Hierarchical Model
    We indicate with $Y_i^s$ the continuous trait of interest for subject
    $i \in \{1,\ldots,n_s\}$ and study $s \in \{1,\ldots,S\}$. If all studies have genotyped
    individuals at all m marker locations, and these data are available for
    all individuals (individual patient data [IPD]), a sensible approach
    to pool information across studies would be the random-effect
    model

    $$Y_s \sim N\left(C_s \beta + 1_{n_s}\mu_s,\ \sigma^2 I_{n_s}\right) \qquad (1)$$

    where $C_s$ is the $n_s \times (m + 1)$ design matrix coding for the chosen
    genetic model (e.g., for an additive model, 0, 1, and 2 for homozygous
    wild-type, heterozygous, or homozygous mutant genotypes,
    respectively) and the intercept term, $\mu_s \sim N(0, \sigma_\mu^2)$ is
    a study-specific random intercept term, $1_{n_s}$ is the $n_s \times 1$ vector
    of ones, $I_{n_s}$ is the $n_s \times n_s$ identity matrix, and $\beta = (\beta_0, \beta_1, \ldots, \beta_m)'$
    is the $(m + 1) \times 1$ vector of regression coefficients of interest measuring
    the effect of genotype group on Y. One could then assess the
    relative importance of each marker by using a variable selection
    scheme; we use a reversible jump algorithm on the  space of possi-
    ble models as part of the MCMC scheme as described later in the
    text.24,25
      However, studies will rarely consider all m markers together;
    rather, $m_s < m$ will have been typed in study s corresponding to

-------
a subset $L_s$ of columns of the complete design matrix $C_s$, $X_s$ say, of
size $n_s \times (m_s + 1)$. Also, complete individual patient data for all
studies are rarely available. Instead, we have the summary statistics
reported in each study as in the case of the CRP studies. Typically,
data will consist of means, variances, and numbers of individuals
for each genotype group and each marker. These are
denoted by $\bar{y}_{gj}^s$, $v_{gj}^s$, and $n_{gj}^s$, respectively, for genotype group
$g = 1,\ldots,G_j$ of marker $j \in [L_s]$ in study s. The notation allows for
marker-specific numbers of genotype groups and thus the possibility
of having a mixture of biallelic and triallelic markers, as in the
application to the CRP data, or different genetic models.
  Our approach uses Equation (1) as the building block but
models the linear transformations $Z_s = X_s' Y_s$ as multivariate
normally distributed across studies

$$Z_s \sim MVN_{m_s+1}\left(X_s' C_s \beta + X_s' 1_{n_s}\mu_s,\ \sigma^2 X_s' X_s\right) \qquad (2)$$

where $X_s'$ indicates the transpose of $X_s$. All entries of the vector $Z_s$
can be obtained from the available data summaries. For instance,
the first element corresponding to the intercept term is the overall
sum of the y values, and any other entry can be obtained similarly
from the genotype-group-specific phenotype means and counts
$\bar{y}_{gj}^s$, $n_{gj}^s$.
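The claim that the transformed vector in Equation (2) is recoverable from published summaries can be checked on a toy example. The sketch below is illustrative only: the function names, the single additive marker, and the six data points are assumptions, not material from the study. It compares the two entries of X'Y (intercept and one marker column) computed from individual-level data with the same entries computed from genotype-group means and counts.

```python
# Toy check: X'Y from individual data equals X'Y from group means and counts.
# All names and numbers here are hypothetical illustrations.

def z_from_individual(genotypes, y):
    """Return (sum of y, sum of genotype * y), i.e. X'Y with X = [1, g]."""
    return (sum(y), sum(g * yi for g, yi in zip(genotypes, y)))

def z_from_summaries(codes, means, counts):
    """Return the same two entries using only per-group means and counts."""
    total = sum(n * m for m, n in zip(means, counts))
    weighted = sum(c * n * m for c, m, n in zip(codes, means, counts))
    return (total, weighted)

# Individual-level toy data: additive coding 0/1/2 and a continuous trait.
genotypes = [0, 0, 1, 1, 1, 2]
y = [1.0, 2.0, 2.5, 3.5, 3.0, 5.0]

# Summaries of the kind a study would report for each genotype group.
codes = [0, 1, 2]
counts = [genotypes.count(c) for c in codes]
means = [sum(yi for g, yi in zip(genotypes, y) if g == c) / n
         for c, n in zip(codes, counts)]

z_ind = z_from_individual(genotypes, y)
z_sum = z_from_summaries(codes, means, counts)
```

Both routes give the same pair of values, which is why only means and counts per genotype group are needed for this part of the model.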
  However, the new design matrix $W_s = X_s' C_s$ is only partially observed.
In particular, only the dot products involving the columns
of $X_s$ with themselves or the intercept term can be derived from
the observed genotype-group counts. The remaining entries are replaced
by their expected values under Hardy-Weinberg equilibrium
(HWE) and the known pairwise LD patterns. Specifically, indicating
with $w_{hl}$, $h \neq l$, a generic such entry, we first obtain an
estimate of the joint bivariate genotype distribution from the
known marginal allele frequencies and the pairwise measure of
LD.26 For example, if both markers are biallelic, this involves estimation
of the 3 x 3 matrix of the genotype distribution, and this
estimate is then multiplied by the study size to give expected
counts. Finally, we obtained $w_{hl}$ by summing the appropriate entries
of the resulting matrix of expected counts multiplied by the
values used to code the genotype groups in the design matrices.
Notice that the vector of coefficients $\beta$ retains the same interpretation
and scale of the original model in Equation (1) (in the example
below, additive effect of variants on log CRP plasma levels) because
it is derived from a linear transformation of the variables
therein.
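The construction of an unobserved cross-product entry can be sketched as follows for two biallelic markers. This is a minimal illustration under stated assumptions, not the authors' implementation: LD is taken to be the signed correlation r between coded alleles, haplotypes pair at random (HWE), genotypes use additive 0/1/2 coding, and the function names and example frequencies are hypothetical. The caller is assumed to supply values for which all four haplotype frequencies are non-negative.

```python
import math

def haplotype_freqs(p_a, p_b, r):
    """Two-locus haplotype frequencies from allele frequencies and r.

    Keys are (copies of coded allele at marker A, at marker B).
    """
    d = r * math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))  # D from r
    return {
        (1, 1): p_a * p_b + d,
        (1, 0): p_a * (1 - p_b) - d,
        (0, 1): (1 - p_a) * p_b - d,
        (0, 0): (1 - p_a) * (1 - p_b) + d,
    }

def joint_genotype_dist(p_a, p_b, r):
    """3 x 3 matrix P[g_a][g_b] obtained by pairing two haplotypes at random."""
    haps = haplotype_freqs(p_a, p_b, r)
    dist = [[0.0] * 3 for _ in range(3)]
    for (a1, b1), f1 in haps.items():
        for (a2, b2), f2 in haps.items():
            dist[a1 + a2][b1 + b2] += f1 * f2
    return dist

def expected_cross_product(n, p_a, p_b, r, codes=(0, 1, 2)):
    """Expected sum over n subjects of code(g_a) * code(g_b)."""
    dist = joint_genotype_dist(p_a, p_b, r)
    return n * sum(codes[i] * codes[j] * dist[i][j]
                   for i in range(3) for j in range(3))

# Hypothetical study of 600 subjects, allele frequencies 0.3 and 0.4, r = 0.5.
w_entry = expected_cross_product(600, 0.3, 0.4, 0.5)
```

With additive coding the result agrees with the closed form n(4 p_a p_b + 2D), which gives a quick sanity check on the 3 x 3 table.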
  As well as in the derivation of the new design matrix Ws, prior
information of between-marker LD patterns is also incorporated
in the specification of the (partially unobserved) variance-covariance
matrix in Equation (2). Specifically, we partition $\sigma^2 X_s' X_s$
into a  spatially structured component and a residual, unstruc-
tured,  component. We  obtained the  former  by introducing
marker-specific random  effects having a zero-mean conditional
autoregressive distribution
$$U \sim \mathrm{CAR}(R, M) \qquad (3)$$
where U is a vector of size m, the number of unique markers across
studies, R is a matrix of weights reflecting spatial associations between
the elements of Xs, and M is a diagonal matrix.12,27 Thus the
covariance  matrix in Equation (2) becomes 
-------
Figure 2.  Graphical Representation of Equation (4)
Solid and dotted lines represent stochastic and deterministic
dependencies, respectively.

distribution on the number of regression terms k included in each
model.25,30,31 The simulation study in the next section includes
a sensitivity analysis of this choice. A graphical representation of
the hierarchical model (4) is given in Figure 2.
Single-Marker Random-Effect Meta-Analysis
Results from the multilocus model are compared to those obtained
from a more traditional single-locus random-effects meta-analysis
in both simulation studies and with real data from the CRP-gene
region. For the latter, a per-allele effect (95% CI) of individual
SNPs on CRP concentration was derived from  each individual
study. The individual-study linear trend (additive effect) per cate-
gory increase in  genotype with mean  data was  calculated by
simple linear regression, with genotypes coded as 0, 1, and 2 for
homozygous common allele, heterozygous, and homozygous rare
allele, respectively, with the least-square linear-trend-coefficient
formula, which depends only on the mean values and their standard
deviations. A sensitivity analysis restricted to studies with more
than 500 subjects, healthy at time of blood sampling, or to studies
that reported all the required  standard deviations was also con-
ducted (Table S2). Subsequently, the study-specific linear  trend
and its standard  error were pooled with random-effect models.
Subsidiary analyses included pairwise comparisons within each
polymorphism. The  DerSimonian and Laird Q  test, and the I2
test,32 were  used for evaluating the  degree of  heterogeneity
between studies.
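The pooling step described above can be sketched with the standard DerSimonian and Laird moment estimator. The code below is an illustrative sketch rather than the analysis code used in the study; the function name, the returned dictionary keys, and the example inputs are assumptions. It takes study-specific per-allele trends and their standard errors and returns the pooled estimate together with Cochran's Q and I².

```python
def dersimonian_laird(effects, std_errors):
    """Random-effects pooling of study-level effects (moment estimator)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * e for w, e in zip(weights, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (e - fixed) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1

    # Between-study variance, truncated at zero.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Re-weight with the between-study variance added to each study variance.
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(w * e for w, e in zip(re_weights, effects)) / sum(re_weights)
    pooled_se = (1.0 / sum(re_weights)) ** 0.5

    # I^2: percentage of total variation attributed to heterogeneity.
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return {"pooled": pooled, "se": pooled_se, "q": q, "tau2": tau2, "i2": i2}

# Two hypothetical studies with discordant per-allele effects.
result = dersimonian_laird([0.0, 1.0], [0.1, 0.1])
```

A 95% confidence interval follows as pooled ± 1.96 × se under the usual normal approximation.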
Results

Simulation Studies
We considered various scenarios differing in the number of
studies and, for the multilocus approach, in  the priors on
the model space. Data were obtained as follows:  We first
simulated  a  pool of 4000 haplotypes at seven  biallelic
markers. Pairwise LD  measures (r)  between the  seven
[Figure 3 image: matrix of pairwise LD values for the seven simulated markers; axis label: Marker 2]

Figure 3.  Pairwise LD Measures between Markers Used in the
Simulation Study
Pairwise LD Measures are r values.

SNPs are shown  in Figure 3, with high LD only between
the last three markers. SNP 6 is assumed to be the single
causal  site in the region and is retained in all subsequent
analyses. Given the high LD between SNP 5, 6, and 7, we
expect the results from the univariate analyses to be less
conclusive than those from the multiple marker approach
that adjusts for the between-marker correlations. The study
size  ns was drawn from a normal distribution  with mean
600  and variance 100, rounded to the nearest integer.
Then, for subject $i \in \{1,\ldots,n_s\}$ and study s, a continuous
phenotype $y_i^s$ is simulated as

$$y_i^s = \beta_0 + \beta_6 g_{i6} + \mu_s + \varepsilon_i \qquad (6)$$

where $g_{i6}$ denotes the genotype of subject i at marker site 6
(0, 1, or 2 for homozygous wild-type, heterozygous, or homozygous
mutant, respectively), $(\beta_0, \beta_6) = (1, 2)$, $\mu_s \sim N(0, 1)$,
and $\varepsilon_i \sim N(0, 1)$. To reflect the fact that not all markers are
typed in every study, we select at random ms markers out of
the possible seven for each study. Thus, in most cases the
univariate analyses are based on fewer than the maximum
total of S studies. For each simulated data set, we also esti-
mated the unadjusted univariate additive effects and their
standard errors at each SNP site; the additive effects are
then combined in the univariate random-effect analyses.33,34
Tables 1 and 2 present the results from the multiple-marker
meta-analyses. The number of studies considered
was 10, 20, or 40. In each case, the tables report the
results obtained with Poisson priors on  the model size in
the reversible jump algorithm with different means (1 or
2 for priors a and b, respectively) or a uniform prior on
the model space (prior c). Notice that the Poisson  priors
give more weight to the null model and may in general
be a more reasonable choice in this setting.  For example,

-------
Table 1.  Bayesian Multilocus Meta-Analysis

Number of studies, prior, and quantity; cells are average (std) where reported.

Studies  Prior  Quantity    β1              β2              β3              β4              β5              β6              β7              σ²              σμ²
                True        0               0               0               0               0               2               0               1               1
10       a      Post prob   0.01 (0.01)     0.01 (0.004)    0.01 (0.004)    0.005 (0.002)   0.01 (0.01)     1.00 (0.00)     0.03 (0.02)     -               -
                Mean        -0.01 (0.05)    -0.004 (0.04)   0.03 (0.03)     0.01 (0.03)     -0.01 (0.07)    2.00 (0.04)     0.16 (0.14)     1.19 (0.01)     1.25 (0.61)
                BCI length  0.19            0.20            0.15            0.14            0.35            0.19            0.76            -               -
10       b      Post prob   0.02 (0.01)     0.01 (0.004)    0.01 (0.01)     0.01 (0.004)    0.02 (0.01)     1.00 (0.001)    0.07 (0.08)     -               -
                Mean        -0.03 (0.04)    -0.01 (0.03)    0.03 (0.03)     0.02 (0.03)     -0.01 (0.08)    2.00 (0.05)     0.15 (0.16)     1.08 (0.02)     1.13 (0.34)
                BCI length  0.24            0.22            0.16            0.15            0.36            0.23            0.76            -               -
10       c      Post prob   0.05 (0.02)     0.04 (0.02)     0.04 (0.02)     0.03 (0.01)     0.07 (0.03)     1.00 (0.001)    0.15 (0.05)     -               -
                Mean        -0.03 (0.04)    -0.01 (0.04)    0.02 (0.03)     0.01 (0.03)     -0.01 (0.07)    1.99 (0.04)     0.13 (0.12)     1.12 (0.01)     1.42 (0.46)
                BCI length  0.24            0.21            0.17            0.15            0.35            0.33            0.76            -               -
20       a      Post prob   0.01 (0.01)     0.01 (0.002)    0.01 (0.003)    0.004 (0.002)   0.01 (0.003)    1.00 (0.00)     0.03 (0.08)     -               -
                Mean        -0.04 (0.03)    -0.02 (0.03)    0.03 (0.02)     0.01 (0.02)     -0.01 (0.05)    1.99 (0.04)     0.09 (0.10)     1.11 (0.01)     1.10 (0.26)
                BCI length  0.16            0.14            0.11            0.10            0.23            0.12            0.56            -               -
20       b      Post prob   0.02 (0.02)     0.01 (0.01)     0.01 (0.01)     0.01 (0.003)    0.02 (0.01)     1.00 (0.001)    0.05 (0.04)     -               -
                Mean        -0.04 (0.03)    -0.02 (0.03)    0.02 (0.03)     0.01 (0.02)     -0.02 (0.05)    1.99 (0.03)     0.13 (0.112)    1.06 (0.01)     1.08 (0.17)
                BCI length  0.16            0.15            0.12            0.10            0.24            0.17            0.57            -               -
20       c      Post prob   0.06 (0.05)     0.03 (0.01)     0.03 (0.01)     0.02 (0.01)     0.06 (0.05)     1.00 (0.002)    0.13 (0.08)     -               -
                Mean        -0.05 (0.03)    -0.02 (0.02)    0.02 (0.02)     0.01 (0.02)     -0.01 (0.06)    1.99 (0.03)     0.10 (0.108)    1.09 (0.01)     0.99 (0.16)
                BCI length  0.18            0.15            0.12            0.11            0.26            0.23            0.59            -               -

  Results are averages (std) over 100 replicated data sets. Mean posterior estimates and credible intervals are conditional on the SNP being included in a model.
prior a is Poisson(1) and assigns a probability of ~0.26 of
having more than one associated site (~0.59 for b). The
values shown are averages over 100 replicates. For each sce-
nario, we report the marginal posterior probability of se-
lecting each SNP and the mean and 95% credible intervals
    of the posterior distributions of each additive effect, conditional
    on the SNP being selected.35,36 Note that posterior
    distributions can be reliably estimated only for markers
    with relatively  high  posterior  probability of inclusion
    (e.g., >0.5), and results in the table should be interpreted
Table 2.  Bayesian Multilocus Meta-Analysis

Number of studies, prior, and quantity; cells are average (std) where reported.

Studies  Prior  Quantity    β1              β2              β3              β4              β5              β6              β7              σ²              σμ²
                True        0               0               0               0               0               2               0               1               1
40       a      Post prob   0.01 (0.15)     0.01 (0.003)    0.004 (0.002)   0.003 (0.001)   0.01 (0.004)    1 (0)           0.025 (0.28)    -               -
                Mean        -0.05 (0.02)    -0.03 (0.02)    0.02 (0.02)     0.01 (0.01)     -0.02 (0.04)    1.99 (0.02)     0.10 (0.08)     1.04 (0.01)     1.09 (0.14)
                BCI length  0.12            0.10            0.08            0.07            0.17            0.10            0.46            -               -
40       b      Post prob   0.03 (0.03)     0.01 (0.01)     0.01 (0.00)     0.01 (0.002)    0.02 (0.01)     1 (0)           0.04 (0.03)     -               -
                Mean        -0.05 (0.02)    -0.02 (0.02)    0.02 (0.02)     0.01 (0.02)     -0.03 (0.03)    1.99 (0.02)     0.12 (0.07)     1.03 (0.01)     0.97 (0.13)
                BCI length  0.12            0.11            0.08            0.07            0.19            0.12            0.46            -               -
40       c      Post prob   0.07 (0.06)     0.03 (0.01)     0.02 (0.01)     0.01 (0.01)     0.04 (0.02)     1 (0)           0.11 (0.06)     -               -
                Mean        -0.05 (0.02)    -0.03 (0.02)    0.02 (0.02)     0.01 (0.01)     -0.01 (0.04)    1.99 (0.02)     0.10 (0.10)     1.03 (0.01)     1.01 (0.11)
                BCI length  0.12            0.11            0.10            0.077           0.19            0.18            0.48            -               -

Results are averages (std) over 100 replicated data sets. Mean posterior estimates and credible intervals are conditional on the SNP being included in a model.

-------
Table 3. Single-Locus Random-Effects Meta-Analysis

Studies  Quantity          SNP 1            SNP 2            SNP 3            SNP 4            SNP 5            SNP 6            SNP 7
         True              0                0                0                0                0                2                0
10       Mean (std)        -0.033 (0.028)   -0.023 (0.024)   0.069 (0.026)    0.312 (0.026)    1.158 (0.039)    1.990 (0.040)    1.945 (0.036)
         Mean BCI length   0.385            0.379            0.367            0.365            0.414            0.434            0.482
20       Mean (std)        -0.03 (0.018)    -0.019 (0.018)   0.072 (0.017)    0.314 (0.017)    1.169 (0.025)    1.999 (0.026)    1.942 (0.03)
         Mean BCI length   0.258            0.253            0.245            0.244            0.269            0.286            0.327
40       Mean (std)        -0.032 (0.014)   -0.021 (0.012)   0.067 (0.013)    0.311 (0.015)    1.165 (0.02)     1.998 (0.017)    1.94 (0.021)
         Mean BCI length   0.183            0.179            0.175            0.175            0.194            0.203            0.23

Results are averages over 100 replicated data sets.

with this in mind. The marginal probability of selecting the
causal site is 1 independently of the prior used and even
when considering as few as ten studies, with almost no var-
iability across replicates. Notably,  all other markers have
posterior inclusion probabilities close to zero and would
therefore not be selected if we were to use the traditional
threshold of 0.5. All conditional mean additive effects are
very close to the true values with a minor bias only for the
effect at SNP 7, which is the SNP in highest LD with the
causal site. The choice of prior distribution on model space
does not have a large effect on the results, with possibly nar-
rower credible intervals and slightly larger posterior proba-
bility of including SNP 7 under prior c compared to priors
a and b. This is to be  expected because the Poisson priors
favor models with few terms, whereas the uniform prior
gives equal weight to all models. The tables also report the
results for the variance terms $\sigma^2$ and $\sigma_\mu^2$, which have
posterior estimates close to the true values in both cases. In-
creasing the number of studies has little or no effect on both
marginal posterior probabilities and posterior estimate bias
but does lead to narrower credible intervals as expected.
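The prior probabilities quoted earlier for more than one associated site (about 0.26 under prior a and about 0.59 under prior b) follow from the Poisson tail. A minimal check, assuming an untruncated Poisson prior on model size (with seven markers the truncation changes these values only marginally); the function name is hypothetical:

```python
import math

def prob_more_than_one(mean):
    """P(K > 1) for K ~ Poisson(mean): 1 - P(K = 0) - P(K = 1)."""
    return 1.0 - math.exp(-mean) * (1.0 + mean)

prior_a = prob_more_than_one(1.0)  # Poisson(1), prior a
prior_b = prob_more_than_one(2.0)  # Poisson(2), prior b
```

Both priors therefore put most of their mass on models with at most one or two terms, in contrast to the uniform prior c.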
      The univariate analyses on the other hand fail to unam-
    biguously identify the causal site at position 6 (Table 3). On
    the basis of results reported therein, although SNP 6 shows
    the highest association with the phenotype, SNP 7 could
    still be considered causal if no prior information is avail-
    able to discriminate  between  the two. Even  markers 4
    and 5 would be selected  on the basis of posterior credible
    intervals; paradoxically, increasing the number of studies
    only exacerbates the problem because credible intervals
    become narrower.
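Why a univariate analysis picks up a non-causal marker in high LD can be illustrated with a small simulation. This is a simplified sketch, not the simulation design of the study: it uses a single study, two markers with equal allele frequency, no random study intercept, and hypothetical function names and parameter values. Under these assumptions the marginal regression slope at the linked marker is expected to be close to r times the causal effect.

```python
import random

def ols_slope(x, y):
    """Least-squares slope of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def simulate_slopes(n=20000, p=0.3, r=0.9, beta0=1.0, beta_causal=2.0, seed=1):
    """Simulate a causal and a linked marker; return both univariate slopes."""
    rng = random.Random(seed)
    d = r * p * (1 - p)  # D for two markers with equal allele frequency p
    haps = [(1, 1), (1, 0), (0, 1), (0, 0)]  # (causal allele, linked allele)
    freqs = [p * p + d, p * (1 - p) - d, (1 - p) * p - d, (1 - p) ** 2 + d]
    g_causal, g_linked, y = [], [], []
    for _ in range(n):
        h1, h2 = rng.choices(haps, weights=freqs, k=2)  # random mating
        gc = h1[0] + h2[0]
        g_causal.append(gc)
        g_linked.append(h1[1] + h2[1])
        # Phenotype depends on the causal genotype only, plus N(0, 1) noise.
        y.append(beta0 + beta_causal * gc + rng.gauss(0.0, 1.0))
    return ols_slope(g_causal, y), ols_slope(g_linked, y)

slope_causal, slope_linked = simulate_slopes()
```

With r = 0.9 the linked marker shows an apparent per-allele effect near 1.8 even though its true effect is zero, and a larger sample only makes that estimate more precise.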
      The previous simulation study assumed the same LD
    pattern  across  studies because  study-specific  genotype
    data are simulated from a common haplotype pool. To
    mimic a  more  realistic  scenario, we  further considered
    study-specific LD patterns by simulating genotype counts
    from  study-specific haplotype  pools  characterized  by
    slightly different LD structures.  The multilocus analysis
    then uses the average LD table shown in Figure D1 (in
    which we also report the standard deviations of the pairwise
    r² values across studies in brackets). Results are reported
    in Table 4 for replicates with 20 studies. The method
Table 4. Bayesian Multilocus Meta-Analysis when the LD Structure Is Allowed to Vary across Studies

Number of studies, prior, and quantity; cells are average (std) where reported.

Studies  Prior  Quantity    β1              β2              β3              β4              β5              β6              β7              σ²              σμ²
                True        0               0               0               0               0               2               0               1               1
20       a      Post prob   0.01 (0.01)     0.01 (0.01)     0.03 (0.02)     0.03 (0.06)     0.01 (0.01)     1.00 (0.00)     0.03 (0.03)     -               -
                Mean        -0.02 (0.04)    -0.02 (0.03)    0.07 (0.01)     0.04 (0.04)     -0.01 (0.06)    1.97 (0.01)     0.13 (0.11)     1.06 (0.02)     1.07 (0.23)
                BCI length  0.16            0.15            0.10            0.10            0.23            0.14            0.58            -               -
20       b      Post prob   0.01 (0.01)     0.01 (0.01)     0.02 (0.01)     0.01 (0.01)     0.02 (0.01)     1.00 (0.00)     0.08 (0.05)     -               -
                Mean        -0.01 (0.04)    0.01 (0.04)     0.06 (0.02)     -0.02 (0.04)    -0.01 (0.07)    1.98 (0.04)     0.14 (0.28)     1.08 (0.02)     1.03 (0.13)
                BCI length  0.2             0.12            0.15            0.15            0.32            0.29            0.67            -               -
20       c      Post prob   0.03 (0.01)     0.03 (0.01)     0.11 (0.13)     0.10 (0.08)     0.04 (0.05)     1.00 (0.00)     0.12 (0.04)     -               -
                Mean        -0.02 (0.03)    -0.02 (0.02)    0.05 (0.03)     0.04 (0.04)     0.01 (0.04)     1.99 (0.03)     0.15 (0.05)     1.04 (0.01)     1.07 (0.12)
                BCI length  0.17            0.15            0.12            0.10            0.22            0.24            0.58            -               -

Results are averages (std) over 100 replicated data sets. Mean posterior estimates and credible intervals are conditional on the SNP being included in
a model. See Figure D1.

-------
Figure 4.  Summary Effect from Traditional Meta-Analysis and Bayesian Multiple-SNP
Hierarchical Linear Model of the Eight SNPs in the CRP Gene
Values shown are additive genetic effects on (log) CRP levels with 95% confidence
intervals or credible intervals for traditional and Bayesian analyses, respectively.
For the Bayesian analysis, results are shown only for those markers that appear
to be strongly associated after variable selection (see Figure 5). N/A refers to
SNPs excluded from the model. The asterisk indicates the dominant model. Negative
values indicate the variant allele is associated with a lower CRP concentration.

CRP levels by CRP gene variant, additive model

CRP gene variant           No. studies (individuals)   Traditional meta-analysis estimate (95% confidence interval)
rs1800947 (+1059G→C)*      10 (11045)                  -0.38 (-0.50, -0.25)
rs1205 (+2302G→A)          10 (16942)                  -0.35 (-0.41, -0.28)
rs1417938 (+194A→T)        5 (7460)                    0.17 (0.03, 0.31)
rs2794521 (-717A→G)        6 (5803)                    0.12 (-0.10, 0.35)
rs1130864 (+1444C→T)       19 (21674)                  0.19 (0.14, 0.25)
rs3091244 (-286C→T/A)      7 (7786)                    0.20 (0.15, 0.25)
rs3093077 (?1899T→G)       4 (661…)                    0.48 (0.33, 0.62)
rs3093059 (-757T→C)        3 (3475)                    0.58 (0.37, 0.78)

Bayesian model estimates (95% credible intervals), in the order extracted from the figure:
0.59 (0.37, 0.95); -0.26 (-0.46, -0.11); -0.52 (-0.68, -0.36); 0.34 (0.20, 1.17); N/A for the remaining SNPs.


appears to be fairly robust to minor deviations in LD pat-
terns across studies (similar to those observed for the real
data in the next section); large differences in LD structures
across studies  would  necessarily  invalidate  the  meta-
analytical approach because there would be little informa-
tion to borrow for variable selection.
  Finally, we considered reducing the effect at the causal
site to 1.5 or placing it at marker position 2, which is in
linkage equilibrium with the other sites: In both cases,
the causal site is  selected with high posterior probability
(>0.8, results not shown).
  The WinBUGS code used to fit the model is given in
Appendix C.


A Meta-Analysis of CRP Studies
The traditional single-locus meta-analyses require that the
available  data be partitioned into  groups  of  studies in
which the same SNP was typed directly. In these analyses,
seven SNPs were  associated with a codominant effect on
CRP  concentration  (Figure  4) with the per-allele effect
in the range of 0.19-0.58 mg/L (absolute p values:
rs1800947 = 4.35 x 10^-9; rs1205 = 7.76 x 10^-26;
rs1417938 = 1.77 x 10^-2; rs1130864 = 2.73 x 10^-11;
rs3091244 = 4.50 x 10^-15; rs3093077 = 5.03 x 10^-11;
and rs3093059 = 2.27 x 10^-8), corresponding to ~0.3-0.8
SD of the population distribution of CRP.37 The
studies of >500 subjects (Table S2),  providing  strong evi-
dence for an association  at this locus. However, because
pooled analyses  of  this  type  are  limited to  individual
SNPs, it is unclear which of these SNPs have independent
effects and which  are associated because of correlation
with  other observed or unobserved SNPs, including  the
true causal variant(s). This can be overcome by incorporat-
ing available information on pairwise LD in the region (Ta-
ble S3) within a Bayesian multilocus  model as described
    above. Bayesian model selection can then facilitate identi-
    fication of variants showing the strongest independent
    association with CRP concentration (Figure 5 and Table
    5). The  approach yields  posterior model  probabilities
     [Figure 5 panels: rs3093059, rs2794521, rs3091244, rs1417938, rs1800947, rs1130864, rs1205, rs3093077; x axis from -1.0 to 1.0]

     Figure 5.  Results from the Multiple-SNP Meta-Analysis using
     the Bayesian Hierarchical Linear Model
     The shaded bars show the posterior probability that each SNP is
     included in a model, calculated from the posterior sample of models.
     The x axis indicates the additive effects of each  SNP on  log CRP
     plasma levels, conditional on that SNP being included in the model,
     and the y axis indicates the corresponding posterior density. The
     curves can thus be interpreted as smoothed histograms representing
     the probability that the SNP effects take the values on the x axis.
     Also shown are the densities, medians (A), and 95% credible inter-
     vals (	) for the additive effects of each SNP on log  CRP levels.

-------
Table 5.  Application to the Meta-Analysis of CRP Studies
Columns: Prob; SNP included (rs3093059, rs2794521, rs3091244, rs1417938, rs1800947, rs1130864, rs1205, rs3093077)
Prob: 0.22, 0.12, 0.10, 0.07, 0.06
Models with more than 2% posterior probability are shown. Results assume a Poisson(2) prior on model size in the reversible jump algorithm.
conditional on the observed data from which marginal
probabilities of association for each  SNP can  be readily
obtained. Of the markers considered, SNPs rs1130864,
rs1205, and rs3093077, all in the 3' UTR, retain the strongest
independent association with CRP concentration.
An additional synonymous SNP in exon 2 (rs1800947) appears
to be important, although its posterior probability of
association is sensitive to the prior on the model space, and
the SNP becomes unimportant if a more restrictive prior on the
number of associated markers in the region is used (results
not shown). These four SNPs yield the model with the
highest posterior probability (Figures 4 and 5 and Table
5). Again, the models were not materially altered when
analyses were limited to studies of >500 subjects (results
not shown).
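
The step from a posterior sample of models to model probabilities and marginal probabilities of association can be sketched as follows. This is a minimal illustration, not the authors' WinBUGS workflow: the function name `summarize_models` and the toy draws are hypothetical, and each retained MCMC draw is assumed to be available as the set of SNPs included in the model at that iteration.

```python
from collections import Counter


def summarize_models(sampled_models):
    """Summarize a posterior sample of models.

    sampled_models: one collection of SNP names per retained MCMC draw.
    Returns (model_probs, inclusion_probs), both as relative frequencies.
    """
    models = [frozenset(model) for model in sampled_models]
    n_draws = len(models)
    if n_draws == 0:
        raise ValueError("need at least one sampled model")
    # Posterior probability of a model: share of draws that visited it.
    model_probs = {m: c / n_draws for m, c in Counter(models).items()}
    # Marginal inclusion probability of a SNP: share of draws whose model contains it.
    snp_counts = Counter(snp for model in models for snp in model)
    inclusion_probs = {snp: c / n_draws for snp, c in snp_counts.items()}
    return model_probs, inclusion_probs


# Made-up draws for illustration only; these are not the CRP results.
toy_draws = (
    [("rs1130864", "rs1205", "rs3093077")] * 5
    + [("rs1205", "rs3093077")] * 3
    + [("rs1205",)] * 2
)
model_probs, inclusion_probs = summarize_models(toy_draws)
```

Counting relative frequencies is the usual way to read both quantities off a reversible jump sample; the numbers produced here depend entirely on the toy input.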
  Notably, SNPs rs1130864, rs1205, and rs3093077 formed
the trio of tag SNPs. Because each tag SNP marks a different
haplotype, the Bayesian model implies the presence of at
least three functional SNPs regulating CRP level (Figure 6).
Using HapMap, we found that there were 11 SNPs in strong
LD with rs1205 (five with pairwise r² = 1) within an associated
interval of ~100 kb. There were 11 SNPs in strong LD
with rs3093077 (nine with pairwise r² = 1) within a larger
associated interval of ~300 kb. A total of 22 SNPs lay in an
associated interval of 100 kb encompassing rs1130864
(nine with pairwise r² = 1) (Figure 7). Because tightly
linked SNPs were identified in the associated intervals,
a careful assessment of potential functionality for each of
these SNPs is now required.
      As mentioned in the previous section, in order to accommodate
    outliers and heavy tails, we assumed the distribution
    of the between-studies random effects μ_s to be a mixture
    of normals. In particular, inspection of the residuals
    from a model fitted without the between-studies random
    effect appears to suggest the use of a two-component
    mixture; see Figure 8. The graph plots a sample of the
    quantities

        $$ r_s^{(t)} = T_s - W_s^{(t)} \beta^{(t)} - U_s^{(t)} \qquad (7) $$

    for current values of the spatial random effects U and
    model at iteration t.28 The posterior distributions of α₁
    and α₂ had means of −0.014 and 3.356, respectively (Figure
    8), whereas π had a posterior median estimate of
    0.879. By monitoring the mixture component assignments
    of each study, we found that outlying studies were
    mostly assigned to the second component, as expected
    (results not shown).
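
To make the component assignment concrete, the sketch below computes the probability that a given study effect belongs to each component of a two-component normal mixture. It rests on several assumptions: π is read as the weight of the first component, the two components share one standard deviation, and that standard deviation (1.0) is a hypothetical value not reported in the text. The function names are illustrative only.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def component_probabilities(effect, weight1, mean1, mean2, sd):
    """Probability that a study effect belongs to component 1 or 2 of the mixture."""
    w1 = weight1 * normal_pdf(effect, mean1, sd)
    w2 = (1.0 - weight1) * normal_pdf(effect, mean2, sd)
    total = w1 + w2
    return w1 / total, w2 / total


# Component means and weight set to the posterior summaries quoted in the text;
# the common standard deviation of 1.0 is a hypothetical value.
WEIGHT1, MEAN1, MEAN2, SD = 0.879, -0.014, 3.356, 1.0

typical = component_probabilities(0.0, WEIGHT1, MEAN1, MEAN2, SD)
outlying = component_probabilities(3.5, WEIGHT1, MEAN1, MEAN2, SD)
```

Under these assumptions a study effect near zero is attributed almost entirely to the first component, while an effect near 3.5 is attributed mostly to the second, which is the qualitative pattern described for the outlying studies.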


    Discussion

    With only small genetic effects expected to contribute to
    most complex diseases, the meta-analysis of studies that
    consider variants in the same genetic region is a promising
                       Figure 6.  A  Reduced Median Network
                       Constructed with HapMap CEPH Data for
                       a 20 kb Region Containing the CRP Gene
                       Yellow circles indicate haplotypes. The size
                       of each circle  is proportional to  the fre-
                       quency of that haplotype in the  HapMap
                       CEPH population. Non-HapMap SNPs (indi-
                       cated in italics) were placed on the net-
                       work with  information  from other CEPH
                       populations.

-------
[Figure 7 graphic: "Overview of Chr1" ideogram; panel B; four tracks with vertical axes running from 0.0 to 1.0]
Figure 7.  Genomic Context for CRP Gene
(A) Ideogram depicting the chromosome and region in which the CRP gene lies (red line).
(B) Gene diagram with introns and exons depicted as horizontal and vertical blue lines, respectively.
(C) Pairwise r² LD values between independently associating SNPs from Bayesian analysis (identified in top left of window, position
indicated by red arrow) and all other HapMap SNPs in the region (release 20, build 35; red = r² > 0.8, yellow = 0.5 < r² < 0.8, gray =
0.3 < r² < 0.5, blue = 0.2 < r² < 0.3, and dark gray = missing data).
strategy to increase our chances of finding any associa-
tions. Recognizing  the importance of this approach, sev-
eral coordinated efforts have been initiated to ensure that
results from the individual studies follow agreed guidelines
and can be combined more easily.7
  Most of the meta-analyses conducted so far have consid-
ered each marker in isolation, ignoring the possible corre-
lation between markers due to linkage disequilibrium that
reduces efficiency and that compromises the identification
of any causal site. In this paper, we have presented a multi-
marker approach that yields estimates of effect at each site
adjusted for the effects of other variants, as in multiple re-
gression. In both the simulation study and the application
to the CRP data, we assumed an additive genetic model.
Other choices are possible and would only involve changes
in the entries of the matrices W and X'X.
  The methods borrow from the spatial data literature and
incorporate the prior knowledge of marker pairwise LD in
a fully Bayesian framework. For example, similar hierarchi-
cal models with spatial random effects are used extensively
in the analysis of spatial epidemiological data. A conve-
nient feature of the joint specification (Equation [3]) is
that it allows incorporation of the required correlation
structure as prior information in an explicit way.13,37 In addition,
a reversible jump algorithm on the space of possible
model structures enables the selection of the most promis-
ing associations. The proposed approach assumes data on
a continuous phenotype. However,  it could be extended
to the case of discrete outcomes,  say case-control status,
by introducing a further set of continuous latent variables
related to the discrete outcome as in probit regression. Ex-
tensions to include metaregression are straightforward and
only involve introduction of a further hierarchy for the
vector of coefficients β in Equation (4) with means that
would then depend on study-specific covariates. Work on
these extensions is currently in progress.
  When applied to the meta-analysis of studies in the CRP-
gene region, results provide evidence for three CRP modi-
fying alleles distributed over three of four common haplo-
types in  Europeans. These alleles could account for the

-------

[Figure 8 graphic: residuals plotted against iteration, x axis from 50e3 to 100e3]
Figure 8.   Posterior Sample of Residuals from the Hierarchical
Model of Material and Methods Fitted without the Between-
Study Random-Effect Term μ_s

strong association with CRP of each of the three SNPs that
are chosen for their ability to tag others and that mark the
different haplotypes. The associated interval for each inde-
pendently associating SNP extended at least 100 kb from
either side of the open reading frame with a very sharp
boundary of LD for at least two of these. Within each inter-
val were  a number of additional candidate causal SNPs in
complete LD with the index SNP from the Bayesian analy-
sis, any of which could, in theory, regulate CRP. Although
the A and T alleles of the triallelic SNP rs3091244 appeared
to exhibit functionality in previous reporter-gene studies
in vitro,21 this SNP was not retained within the Bayesian
model. Experimental studies of this type may be biased to-
ward the study of potential regulatory SNPs in the immedi-
ate vicinity against  those located remotely from the gene
of interest because of size constraints on reporter-gene con-
structs. This might explain why results of such reporter
studies are, at times, discordant with the findings of associ-
ation analyses in populations38 or alternative experimen-
tal approaches to assessing functionality.39 Irrespective of
the true causal sites, the three tag SNPs adequately capture
functional variation at this locus for large-scale gene-
disease association  studies. Although the naive expecta-
tion  would be of narrower limits of error around the point
estimates of SNP effects with a Bayesian approach that
includes  all studies simultaneously, this was not observed.
This  is because unlike  the traditional meta-analyses, the
Bayesian analyses were corrected for the effect of other
SNPs; that is, uncertainty about which SNPs are directly
associated with  the trait was  properly  incorporated in
the analyses. However, the  simultaneous use of all data
strengthens evidence for an association at the gene level;
the null model does  not appear at all in the posterior
sample of models, reflecting virtual certainty of an effect
on CRP at this gene.
  Our approach facilitates the integration of data  from
studies that have genotyped different SNPs  across the
same gene or region utilizing prior information on LD. It
has a number of favorable attributes and potential applica-
tions. By increasing the available data set of information
on  any SNP, the efficiency of evidence  synthesis is en-
hanced and the reliability of any identified associations is
increased. Further, the variable selection procedure allows
inference on the relative magnitude of any marker-pheno-
type association and identifies those SNPs that show the
strongest association with the phenotype, either because
they are the functional site(s) or because they exhibit the
strongest allelic association with (unobserved) functional
sites. IPD (where available) can also be incorporated readily
into the analysis because the regression parameters mea-
suring the effect of variants retain the same interpretation
when considering aggregate data (i.e.,  phenotype means
by genotype groups as with CRP studies) or IPD (see Material
and Methods). Moreover, where a robust evidence base
on genetic association with a quantitative trait already
exists (as it does for many blood measures, e.g.,
HDL cholesterol, triglycerides, and others), the methods
described could be used to add and integrate partially over-
lapping SNP  data from   new  genome-wide analyses,
thereby harnessing existing data both for replication and
to gain insight into likely causal sites in a gene or region.
The methods we describe, which use the freely available
software WinBUGS, are likely to be of substantial value
both to the emerging networks of investigators engaged
in synthesis of evidence on genetic associations of com-
plex quantitative traits and disorders7 and to those apply-
ing and extending findings from genome-wide association
studies.
Appendix A

Systematic Review
Two electronic databases (PubMed Medline and EMBASE)
were  searched  with the  text  words,  which were  also
MeSH terms, polymorphism(s), mutation(s),  gene(s), ge-
netic, variant(s), and SNP(s) in combination with C-reac-
tive protein and CRP. The literature search was limited to
human and to the English language. Any additional stud-
ies in the references of all identified  publications were
also searched. For inclusion, studies had to have an analyt-
ical design (case control, prospective, or cross sectional)
and examine the association between any polymorphisms
in the CRP gene and low-chronic CRP concentrations in
individuals of European descent. Studies measuring CRP
only during acute phase of an inflammatory response (e.g.,
acute ischemia or infection stimuli) were excluded. In areas
where more than one polymorphism had been studied, in-
formation about the LD between them was  extracted
where available. If relevant information was not reported

-------
(mean CRP levels, standard deviations, genotype numbers,
or linkage disequilibrium data), or it was not reported stratified
by ethnicity, the authors were contacted on several occasions
to obtain the information. A total of four potential
studies (n = 2614) in European subjects were excluded because
of unavailability of data in the appropriate form (Flex
2004, n = 471; Obisesan 2004, n = 63; Zee 2004, n = 260;
Carlson 2005, n = 1820) (see Table S1).

New Data Sets
NPHS II is a prospective study of 3012 healthy white Euro-
pean middle-aged men, of which a total of 2479 with CRP
genetic data and CRP concentrations were included in this
report. Recruitment in the study commenced in 1989⁴⁰
in nine general practices. None of the participants  had
a clinical history of unstable angina, myocardial infarction
(including  silent infarction), coronary surgery, other car-
diovascular diseases, aspirin or anticoagulant use, or malig-
nant disease (except skin cancer other than melanoma) at
the time of recruitment. The Ely Study is a prospective pop-
ulation-based cohort study of the etiology and pathogene-
sis of type  2 diabetes and related metabolic disorders in
1122  individuals recruited in 1990 in Ely, Cambridge-
shire.41 Complete data on biochemical and anthropomet-
ric variables were available in 839 participants, and a total
of 548 individuals with data on the CRP genotypes and
CRP levels were included in this analysis. The EPIC-Norfolk
study is a population-based cohort study, recruiting partic-
ipants from general practices in Norfolk.42 For the present
report, only control participants from a nested case-control
study in  coronary heart disease were included, providing
a total of 2196 participants with both data on CRP genetic
variants and CRP concentrations.

New Genotyping
Polymorphisms  in the human CRP gene (HGNC:  2637;
1q21-q23) were identified by reference to public-domain
databases of human sequence variation. We used this in-
formation to generate a consensus map of polymorphic
sites. By  using validated genotype data (minor allele fre-
quency  >5%) from  subjects of  European descent from
the SeattleSNPs database and the human HapMap database
(see Web Resources), we examined the pattern of linkage
disequilibrium  across the CRP gene.  We then used the
haplotype LD r² method to select a set of tagging (t)SNPs
capable of capturing maximum haplotype diversity among
subjects of  European descent by using the program TagIT
(see Web Resources).

LD
Public domain databases (see Web Resources) and individ-
ual publications were examined for  information on the
LD structure in the CRP gene. Both D' and r² values were
recorded, but only r² values were used in the Bayesian modeling.
If more than one r² value was reported for a given pair of markers,
a weighted mean r² was obtained.
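
A weighted mean of several reported r² values for one marker pair can be sketched as below. The text does not say which weights were used, so weighting by the sample size of each source is an assumption made here for illustration, and the function name and example numbers are hypothetical.

```python
def weighted_mean_r2(reports):
    """Weighted mean of r^2 values reported for one pair of markers.

    reports: list of (r2, weight) pairs; the weight is assumed to be the
    sample size of the source that reported the value.
    """
    for r2, weight in reports:
        if not 0.0 <= r2 <= 1.0:
            raise ValueError("r^2 values must lie between 0 and 1")
        if weight < 0:
            raise ValueError("weights must be non-negative")
    total_weight = sum(weight for _, weight in reports)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(r2 * weight for r2, weight in reports) / total_weight


# Two made-up reports for the same pair of markers.
pooled_r2 = weighted_mean_r2([(0.80, 500), (0.90, 1500)])
```

With these made-up inputs the larger source dominates, so the pooled value sits closer to 0.90 than to 0.80.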
    Appendix B

    Recovering the Joint Distribution of Multiallelic Sites
    from Allele Frequencies and Marginal Diallelic r2
    Values
    We consider two loci, the first locus having $G_1 = 3$
    alleles and the second locus having $G_2 = 2$ alleles. The
    joint probability of the haplotypes at these two loci can
    be represented in a 3 x 2 table of the form:

    Table B1.  Full 3x2 Table of Haplotype Probabilities
                         Allele at Locus 2
    Allele at Locus 1    1                          2                                                Total
    1                    $p_{11}$                   $P_1 - p_{11}$                                   $P_1$
    2                    $p_{21}$                   $P_2 - p_{21}$                                   $P_2$
    3                    $Q_1 - p_{11} - p_{21}$    $(1 - Q_1) - (P_1 - p_{11}) - (P_2 - p_{21})$    $1 - P_1 - P_2$
    Total                $Q_1$                      $1 - Q_1$                                        $1$

    $p_{ij}$ denotes the joint probability of allele $i$ at locus 1 and
    allele $j$ at locus 2, $P_i$ denotes the probability of allele $i$ at locus
    1, and $Q_j$ denotes the probability of allele $j$ at locus 2.
      The internal cells of this table are not observed. Our problem
    is to derive this table of probabilities on the basis of information
    from the margins of this table ($P_1$, $P_2$, and $Q_1$) and
    pairwise correlation within two marginal tables of the form:

    Table B2.  First Marginal 2x2 Haplotype Table
                         Allele at Locus 2
    Allele at Locus 1    1                   2                                  Total
    1                    $p'_{11}$           $P'_1 - p'_{11}$                   $P'_1$
    3                    $Q'_1 - p'_{11}$    $(1 - Q'_1) - (P'_1 - p'_{11})$    $1 - P'_1$
    Total                $Q'_1$              $1 - Q'_1$                         $1$

    and
    Table B3.  Second Marginal 2x2 Haplotype Table
                         Allele at Locus 2
    Allele at Locus 1    1                     2                                     Total
    2                    $p''_{21}$            $P''_2 - p''_{21}$                    $P''_2$
    3                    $Q''_1 - p''_{21}$    $(1 - Q''_1) - (P''_2 - p''_{21})$    $1 - P''_2$
    Total                $Q''_1$               $1 - Q''_1$                           $1$

    In the first of these tables, $p'_{11}$ denotes the joint probability
    of allele 1 at locus 1 and allele 1 at locus 2, but now this
    probability is conditional upon the allele at locus 1 being
    either 1 or 3. Similarly, $p''_{21}$ denotes the probability
    of allele 2 at locus 1 and allele 1 at locus 2 conditional upon
    the allele at locus 1 being either 2 or 3.
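      We do not observe the two tables above but only the appropriate
    deviations from linkage equilibrium, $\delta'_{13}$ and
    $\delta''_{23}$, defined by

        $$ \delta'_{13} = P'_1 Q'_1 - p'_{11} $$

    and

        $$ \delta''_{23} = P''_2 Q''_1 - p''_{21}. $$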

-------
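  We know that the probability of each pairwise haplotype
in Table B2 is equal to the corresponding probability
of that pairwise haplotype in Table B1, divided by the probability
that the allele at the first locus is equal to either 1 or
3, $(1 - P_2)$. This means that:

    $$ p'_{11} = \frac{p_{11}}{1 - P_2} $$

and

    $$ Q'_1 - p'_{11} = \frac{Q_1 - p_{11} - p_{21}}{1 - P_2}. $$

  Therefore,

    $$ Q'_1 = \frac{Q_1 - p_{11} - p_{21}}{1 - P_2} + \frac{p_{11}}{1 - P_2} = \frac{Q_1 - p_{21}}{1 - P_2} $$

and

    $$ P'_1 = \frac{P_1}{1 - P_2}. $$

  Following a similar argument, we also find that

    $$ Q''_1 = \frac{Q_1 - p_{11}}{1 - P_1} $$

and

    $$ P''_2 = \frac{P_2}{1 - P_1}. $$

  By writing $p'_{11}$ and $p''_{21}$ in terms of $\delta'_{13}$ and $\delta''_{23}$, we find
that:

    $$ p_{11} = p'_{11}(1 - P_2) = (P'_1 Q'_1 - \delta'_{13})(1 - P_2)
             = \left[ \frac{P_1}{1 - P_2} \cdot \frac{Q_1 - p_{21}}{1 - P_2} - \delta'_{13} \right] (1 - P_2) $$

and

    $$ p_{21} = p''_{21}(1 - P_1) = (P''_2 Q''_1 - \delta''_{23})(1 - P_1)
             = \left[ \frac{P_2}{1 - P_1} \cdot \frac{Q_1 - p_{11}}{1 - P_1} - \delta''_{23} \right] (1 - P_1). $$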
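  This means that we have two equations in two unknowns,
$p_{11}$ and $p_{21}$, so that by substituting the second
equation for $p_{21}$ into the first equation for $p_{11}$, we can
then solve this equation in terms of $p_{11}$. Substituting the
expression for $p_{21}$ into that for $p_{11}$ gives:

    $$ p_{11} = \frac{P_1}{1 - P_2} \left[ Q_1 - \frac{P_2 (Q_1 - p_{11})}{1 - P_1} + \delta''_{23} (1 - P_1) \right] - \delta'_{13} (1 - P_2) $$

and rearranging in terms of $p_{11}$ results in the equation:

    $$ p_{11} = P_1 Q_1 + \frac{P_1 (1 - P_1)^2 \delta''_{23} - (1 - P_1)(1 - P_2)^2 \delta'_{13}}{1 - P_1 - P_2}. $$

  We may then write $p_{21}$ in the form

    $$ p_{21} = \frac{P_2 (Q_1 - p_{11})}{1 - P_1} - \delta''_{23} (1 - P_1). $$

  We are now able to calculate the probability of every cell
of Table B1 in terms of $p_{11}$, $p_{21}$, $P_1$, $P_2$, and $Q_1$.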
  Note that $\delta'_{13}$ and $\delta''_{23}$ can be obtained from the relevant
$r^2$ values with the formulae:
                                                               <5'  =
                                                       and
  Care must be taken when choosing which sign to assign
these $\delta$ values because they must be consistent with the
margins of Tables B2 and B3.
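
The recovery of Table B1 from its margins and the two deviations can be sketched as a small linear solve. This is an illustration under stated assumptions rather than the authors' implementation: each deviation is taken with the sign convention delta = P·Q − p within its marginal table (flip the sign of both inputs if the opposite convention is used), the deviations are assumed to be already signed, and the function names and the example table are hypothetical.

```python
def deltas_from_table(table):
    """Compute (delta13, delta23) from a full 3x2 table, with delta = P*Q - p."""
    (p11, p12), (p21, p22), (p31, p32) = table
    P1, P2, Q1 = p11 + p12, p21 + p22, p11 + p21 + p31
    a, b = 1.0 - P2, 1.0 - P1
    # Marginal table for alleles 1 and 3 at locus 1 (Table B2).
    delta13 = (P1 / a) * ((Q1 - p21) / a) - p11 / a
    # Marginal table for alleles 2 and 3 at locus 1 (Table B3).
    delta23 = (P2 / b) * ((Q1 - p11) / b) - p21 / b
    return delta13, delta23


def recover_table_b1(P1, P2, Q1, delta13, delta23):
    """Recover the 3x2 haplotype table from its margins and the two deltas."""
    if min(P1, P2) <= 0.0 or P1 + P2 >= 1.0 or not 0.0 <= Q1 <= 1.0:
        raise ValueError("margins must describe three alleles with positive frequency")
    a, b = 1.0 - P2, 1.0 - P1
    # The two relations are linear in p11 and p21:
    #   p11 + (P1 / a) * p21 = c1
    #   (P2 / b) * p11 + p21 = c2
    c1 = P1 * Q1 / a - delta13 * a
    c2 = P2 * Q1 / b - delta23 * b
    det = 1.0 - (P1 * P2) / (a * b)
    p11 = (c1 - (P1 / a) * c2) / det
    p21 = (c2 - (P2 / b) * c1) / det
    table = [
        [p11, P1 - p11],
        [p21, P2 - p21],
        [Q1 - p11 - p21, (1.0 - Q1) - (P1 - p11) - (P2 - p21)],
    ]
    if any(cell < -1e-12 for row in table for cell in row):
        raise ValueError("deltas are inconsistent with the margins (check their signs)")
    return table


# Round trip on a made-up table: derive the deltas, then recover the cells.
known = [[0.30, 0.20], [0.10, 0.20], [0.05, 0.15]]
d13, d23 = deltas_from_table(known)
recovered = recover_table_b1(0.5, 0.3, 0.45, d13, d23)
```

The negativity check at the end is one simple way to catch a wrongly chosen sign: a delta with the wrong sign typically produces a cell outside the range allowed by the margins.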


Appendix C

WinBUGS Code for the Model Described in Material
and Methods
model {
  # likelihood
  for(j in 1:Q) { # where Q = sum over studies s of (m_s + 1)
    T[j] ~ dnorm(theta[j], tauy[j])
    tauy[j] <- tau.y/XsXs[j] # XsXs[j] in Equation (4)
    theta[j] <- psi[j] + sumXis[j]*mu[study[j]] + U[marker[j]] # linear predictor in Equation (4)
  }

  # pooled variances
  for(i in 1:L) { # where L = sum over studies s of m_s
    scale[i] <- tau.y/2
    shape[i] <- pooled[i,2]/2
    pooled[i,1] ~ dgamma(shape[i], scale[i]) # uses the gamma parameterization
  }

  # reversible jump part as detailed in Lunn et al.25
  psi[1:Q] <- jump.lin.pred(W[1:Q,1:m], K, tau.beta)
  id <- jump.model.id(psi[1:Q])
  pred[1:(m+1)] <- jump.lin.pred.pred(psi[1:Q],X.pred[1:
  for(i in 1:m) {
    X.pred[i,i] <- 1
    for(j in 1:(i-1)) {X.pred[i,j] <- 0}
    for(j in (i+1):m) {X.pred[i,j] <- 0}
    X.pred[(m+1),i] <- 0
    effect[i] <- pred[i] - pred[m+1]
  }
  # mixture distribution for study effects
  for(s in 1:nstudies) {
    mu[s] ~ dnorm(mumu[s], tau.mu)

-------
    mumu[s] <- alpha[comp[s]]
    comp[s] ~ dcat(phi[])
  }
  phi[2] <- 1 - phi[1]
  alpha[2] <- (-phi[1]*alp
  # prior distributions
  U[1:m] ~ car.proper(thetaU[], M[], adj[], num[], m[], prec, 1)
  # thetaU vector of zeros of length m (number of unique markers)
  # M is weighted average of the Xs'Xs matrices
  # details on vectors adj, num and m are given in the manual for GeoBUGS
  prec ~ dgamma(0.5, 0.0005)
  tau.y ~ dgamma(0.001, 0.001)
  tau.mu ~ dgamma(0.001, 0.001)
  tau.beta ~ dgamma(0.0001, 0.0001)
  phi[1] ~ dbeta(1, 1)
  alpha[1] ~ dnorm(0.0, 1.0E-6)
  K ~ dpois(1) # scenario (a)
}
  The MCMC chain was run for 1,000,000 iterations with
a burn-in of 500,000 and thinning of 100 iterations, which
took ~30 min of CPU time on an Intel Xeon 2.80 GHz with 2
GB of RAM. Convergence was checked by visual inspection
of posterior traces and by  running chains  with different
initial values.36
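
As a quick check on what these settings leave for inference, the sketch below counts the retained draws. It assumes that the burn-in is discarded first and that thinning keeps every 100th of the remaining iterations; WinBUGS bookkeeping may differ in detail, and the function name is hypothetical.

```python
def retained_draws(total_iterations, burn_in, thin):
    """Number of draws kept after discarding the burn-in and thinning the rest."""
    if thin <= 0:
        raise ValueError("thin must be a positive integer")
    if not 0 <= burn_in <= total_iterations:
        raise ValueError("burn_in must lie between 0 and total_iterations")
    return (total_iterations - burn_in) // thin


# Settings quoted in the text: 1,000,000 iterations, burn-in 500,000, thinning 100.
n_kept = retained_draws(1_000_000, 500_000, 100)
```

Under that reading the posterior summaries would rest on 5,000 retained draws.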
Appendix D
Linkage Disequilibrium (r)
Marker   2                  3                  4                  5                  6                  7
1        0.0684 (0.0486)    0.0063 (0.0161)    0.0051 (0.0143)    0.0100 (0.0137)    0.0089 (0.0167)    -0.0054 (0.0147)
2                           0.0287 (0.0559)    -0.0021 (0.0264)   0.0090 (0.0142)    0.0149 (0.0144)    0.0013 (0.0151)
3                                              -0.1594 (0.271)    -0.0454 (0.1243)   -0.0266 (0.0935)   -0.0312 (0.0816)
4                                                                 0.4017 (0.2558)    0.2812 (0.1842)    0.2396 (0.1567)
5                                                                                    0.7078 (0.1256)    0.6098 (0.1423)
6                                                                                                       0.8610 (0.1762)
Figure Dl.  Mean Pairwise LD Measures between Markers Used
in the Simulation Study  when Allowing LD Patterns to Vary
across Studies
Supplemental Data
Three tables are available at http://www.ajhg.org/.
Acknowledgments

Dongliang Ge provided help with Figures 1 and 7. Tabular data
were kindly provided by Jos E Krieger, Per Tornvall, and Moniek
P. M. de Maat. This work was supported by Medical Research
Council Research Grant G0600580. T.S. was supported by the Brit-
ish Heart Foundation (PhD Studentship  FS/02/086/14760), R.S.
was supported by a British Heart Foundation Shillingford Training
Fellowship (FS/07/011), S.E.H. was supported by a British Heart
Foundation Programme Grant (PG2000/015),  A.D.H.  was sup-
ported by a British Heart Foundation Senior Fellowship (FS/05/
125), and L.S. was supported by a Wellcome Trust Senior Clinical
fellowship (082178). A.D.H. acknowledges the generous support
of the Rosetrees Trust. J.C. acknowledges the support of the Well-
come Trust (GR076024). C.J.V. is supported by a Research Council
UK Fellowship. The EPIC-Norfolk study is supported by the Med-
ical Research Council UK, Cancer Research UK, and Stroke Asso-
ciation and Research Into Ageing. The Framingham Heart Study
is funded by N01-HC 25195,  and Framingham inflammation
research is funded by HL076784, AG028321.
Received: September 25, 2007
Revised: November 29, 2007
Accepted: January 22,  2008
Published online: April 3, 2008


Web Resources

The URLs for data presented herein are as follows:

CRP: C-reactive protein, pentraxin-related, http://pga.gs.washington.edu/data/crp/
HapMap homepage, http://www.hapmap.org/
Online Mendelian Inheritance in Man (OMIM), http://www.ncbi.nlm.nih.gov/Omim
TagIT, http://popgen.biol.ucl.ac.uk/software.html
WinBUGS software, http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml
References

 1.  Cardon, L., and Bell, J. (2001). Association study designs for
    complex diseases. Nat. Rev. Genet. 2, 91-99.
 2.  Colhoun,  H., McKeigue,  P.,  and Davey Smith, G. (2003).
    Problems  of reporting  genetic associations  with  complex
    outcomes. Lancet 361, 865-872.
 3.  Clayton,  D., and McKeigue,  P.  (2001). Epidemiological
    methods for studying genes and environmental factors  in
    complex diseases. Lancet 358, 1356-1360.
 4.  Zeggini, E., Rayner, W., Morris, A.P., Hattersley, A.T., Walker,
    M., Hitman, G.A., Deloukas, P., Cardon, L.R.,  and McCarthy,
    M.I. (2005). An evaluation of hapmap sample size and tagging
    snp performance in large-scale empirical and  simulated data
    sets. Nat. Genet. 37, 1320-1322.
 5.  Cambon-Thomsen, A.  (2003).  Assessing the  impact  of
    biobanks. Nat. Genet. 34, 25-26.
 6.  Little, J., Bradley, L., Bray, M., Clyne, M., Dorman, J., Ellsworth,
    D., Hanson, J., Khoury, M., Lau, J., O'Brien, T., et al.
    (2002). Reporting, appraising, and integrating data on geno-
    type prevalence and gene-disease associations. Am. J. Epide-
    miol. 156, 300-310.

-------
 7.  Ioannidis, J.P., Gwinn, M., Little, J., Higgins, J.P., Bernstein,
    J.L., Boffetta, P., Bondy, M., Bray, M.S., Brenchley, P.E., Buffler,
    P.A., et al. (2006). A road map for efficient and reliable human
    genome epidemiology. Nat. Genet. 38, 3-5.
 8.  Seminara, D., Khoury, M., O'Brien, T., Manolio, T., Gwinn, M.,
    Little, J., Higgins, J., Bernstein, J., Boffetta, P., Bondy, M., et al.
    (2007). The emergence of networks in human genome epidemiology:
    Challenges and opportunities. Epidemiology 18, 1-8.
 9.  Scott, L., Mohlke, K., Bonnycastle, L., Willer, C., Li, Y., Duren, W.,
    Erdos, M., Stringham, H., Chines, P., Jackson, A., et al. (2007).
    A genome-wide association study of type 2 diabetes in finns
    detects multiple susceptibility variants. Science 316, 1341-1345.
10.  Dina, C., Meyre, D., Gallina, S., Durand, E., Körner, A., Jacobson,
    P., Carlsson, L., Kiess, W., Vatin, V., Lecoeur, C., et al.
    (2007). Variation in fto contributes to childhood obesity and
    severe adult obesity. Nat. Genet. 39, 724-726.
11.  The Wellcome Trust Case Control Consortium (2007). Genome-wide
    association study of 14,000 cases of seven common
    diseases and 3,000 shared controls. Nature 447, 661-678.
12.  Besag, J., and Kooperberg, C.L.  (1995). On conditional and
    intrinsic autoregressions. Biometrika 82, 733-746.
13.  Lawson, A.B. (2001). Statistical Methods in Spatial Epidemiol-
    ogy (Chichester, UK: John Wiley).
14.  Hirschfield, G., and Pepys, M. (2003). C-reactive  protein and
    cardiovascular disease: new insights from  an old  molecule.
    QJM 96, 793-807.
15.  Danesh, J., Wheeler, J., Hirschfield, G., Eda, S., Eiriksdottir, G.,
    Rumley, A., Lowe, G., Pepys, M., and Gudnason, V. (2004). C-
    reactive protein and other circulating markers of inflamma-
    tion in the prediction of  coronary heart disease. N.  Engl. J.
    Med. 350, 1387-1397.
16.  Hingorani, A., and Humphries, S. (2005). Nature's randomised
    trials.  Lancet 366, 1906-1908.
17.  Hingorani, A., Shah, T., and Casas, J. (2006). Linking observa-
    tional and genetic approaches to determine the role of c-reac-
    tive protein in heart disease risk. Eur. Heart J. 27, 1261-1263.
18.  Davey Smith, G., and Ebrahim, S. (2003). Mendelian randomization:
    Can genetic epidemiology contribute to understanding environmental
    determinants of disease? Int. J. Epidemiol. 32, 1-22.
19.  Davey Smith, G.,  and Ebrahim, S. (2004). Mendelian random-
    ization:  Prospects,  potentials and limitations. Int. J.  Epide-
    miol. 33, 30-42.
20.  MacGregor, A., Gallimore, J., Spector, T., and Pepys, M. (2004).
    Genetic effects on  baseline values of c-reactive protein and
    serum  amyloid a  protein: A comparison of monozygotic
    and dizygotic twins. Clin. Chem. 50, 130-134.
21.  Carlson, C., Aldred, S., Lee, P., Tracy, R., Schwartz, S., Rieder,
    M., Liu, K., Williams, O.,  Iribarren, C.,  Lewis,  E., et al.
    (2005).  Polymorphisms within  the  c-reactive protein (crp)
    promoter region are associated with plasma crp levels. Am. J.
    Hum.  Genet. 77,  64-77.
22.  Kardys, I., de Maat, M., Uitterlinden, A., Hofman, A., and Witteman,
    J. (2006). C-reactive protein gene haplotypes and risk
    of coronary heart disease: The rotterdam study. Eur. Heart J.
    27, 1331-1337.
23.  Marchini, J., Howie, B., Myers, S., McVean, G., and Donnelly, P.
    (2007). A new multipoint method for genome-wide association
    studies by imputation of genotypes. Nat. Genet. 39, 906-913.
     24. Green, P. (1995). Reversible jump markov chain monte carlo
        computation and bayesian model determination. Biometrika
        82, 711-732.
     25. Lunn, D., Whittaker, J., and Best, N. (2006). A bayesian toolkit
        for  genetic  association  studies.   Genet.  Epidemiol.  30,
        231-247.
     26. Thomas, D. (2004). Statistical Methods in Genetic Epidemiol-
        ogy (New York: Oxford University Press).
     27. Besag, J., York, J., and Mollie, A.  (1991). Bayesian  image
        restoration, with  two applications  in spatial statistics (with
        discussion). Ann.  Inst. Stat. Math. 43, 1-59.
     28. Dominici, F., Parmigiani, G., Wolpert, R.L., and Hasselblad, V.
        (1999). Meta-analysis  of migraine headache treatments:
        Combining information from heterogeneous designs. J. Am.
        Stat. Assoc. 94, 16-28.
     29. Kelsall, J., and Wakefield, J.C. (1999). Comment on Bayesian
        models for spatially correlated disease and exposure data. In
        Bayesian Statistics 6 Proceedings of the Sixth Valencia Interna-
        tional Meeting (Oxford: Clarendon Press).
     30. Denison, D.G.T., Holmes, C.C., Mallick, B.K., and Smith,
        A.F.M. (2002). Bayesian Methods for Nonlinear Classification
        and Regression (Chichester, UK: John Wiley).
     31. Verzilli, C.J., Whittaker, J.C.,  and Stallard, N. (2005). A hierar-
        chical bayesian model for predicting the  functional conse-
        quences of amino acid polymorphisms. J. R. Stat. Soc. Ser. C
        Appl. Stat.  54, 191-207.
     32. Higgins, J., Thompson, S., Deeks, J.,  and Altman, D. (2003).
        Measuring inconsistency  in meta-analyses. BMJ 327, 557-
        560.
     33. DuMouchel, W.H. (1990). Bayesian meta-analysis. In Statistical
        Methodology in the Pharmaceutical Sciences (New York:
        Marcel Dekker).
     34. Normand,  S.L.T. (1999). Meta-analysis: Formulating, evaluat-
        ing, combining and reporting. Stat. Med. 18, 321-359.
     35. O'Hagan, A., and Forster, J. (2004). Bayesian Inference. In Kendall's
        Advanced Theory of Statistics, Vol 2B (London: Arnold).
     36. Carlin, B.P., and Louis, T.A. (2002). Bayes and Empirical Bayes
        Methods for Data Analysis (Boca Raton: Chapman & Hall).
     37. Parent, O., and Riou, S. (2005). Bayesian analysis of know-
        ledge spillovers in european regions. J. Reg. Sci. 45, 747-775.
     38. loannidis, J.,  and Kawoura,  F. (2006). Concordance of func-
        tional in vitro data and epidemiological associations in com-
        plex disease genetics. Genet. Med. 8,  583-593.
     39. Cirulli, E.,  and Goldstein, D.  (2007). In vitro assays fail to
        predict in  vivo effects of  regulatory polymorphisms.  Hum.
        Mol. Genet. 16, 1931-1939.
     40. Herbert, A., Lenburg, M., Ulrich, D.,  Gerry, N., Schlauch, K.,
        and Christman, M. (2007). Open-access database of candidate
        associations from a genome-wide snp scan of the framingham
        heart study. Nat. Genet. 39, 135-136.
     41. Wareham, N., Hennings, S., Prentice, A., and Day, N. (1997).
        Feasibility  of  heart-rate monitoring to  estimate total level
        and pattern  of energy expenditure in  a  population-based
        epidemiological study: The ely young cohort feasibility study.
        Br. J. Nutr.  78, 889-900.
     42. Day, N., Oakes, S., Luben, R., Khaw, K., Bingham, S., Welch, A.,
        and Wareham, N.  (1999). Epic-norfolk: Study design and char-
        acteristics of the cohort, european prospective investigation of
        cancer. Br. J. Cancer 80, 95-103.
872  The American Journal of Human Genetics 82, 859-872, April 2008

-------
                                        Regulatory Toxicology and Pharmacology 51 (2008) S27-S36
                                             Contents lists available at ScienceDirect
                             Regulatory Toxicology and Pharmacology
                                  journal homepage: www.elsevier.com/locate/yrtph
Biomonitoring Equivalents (BE) dossier for toluene (CAS No. 108-88-3)

Lesa L. Aylward a, Hugh A. Barton b, Sean M. Hays c,*
a Summit Toxicology, LLP, 6343 Carolyn Drive, Falls Church, VA 22044, USA
b National Center for Computational Toxicology, U.S. EPA, B205-01, 109 TW Alexander Drive, Research Triangle Park, NC 27711, USA
c Summit Toxicology, LLP, 165 Valley Road, Lyons, CO 80540, USA
ARTICLE  INFO

Article history:
Received 15 January 2008
Available online 22 May 2008

Keywords:
Biomonitoring
Biomonitoring Equivalents
BEs
Toluene
Pharmacokinetics
                                       ABSTRACT
Recent efforts by the US Centers for Disease Control and Prevention and other researchers have resulted
in a growing database of measured concentrations of chemical substances  in blood or urine samples
taken from the general population. However, few tools exist to assist in the  interpretation of the mea-
sured values in a health risk context. Biomonitoring Equivalents (BEs) are defined as the concentration
or range of concentrations of a chemical or its metabolite in a biological medium (blood, urine, or other
medium) that is consistent with an existing health-based exposure guideline. This document reviews
available pharmacokinetic data and models for toluene and applies these data and models to existing
health-based exposure guidance values from the US Environmental Protection Agency, the Agency for
Toxic Substances and Disease Registry, Health Canada, and the World Health Organization, to estimate
corresponding BE values for toluene in blood. These values can be used as screening tools for evaluation
of biomonitoring data for toluene in the context of existing risk assessments for toluene and for prioriti-
zation of the potential need for additional risk assessment efforts for toluene.
                                                     © 2008 Elsevier Inc. All rights reserved.
1. Introduction

   Measurements of environmental  chemicals  in  air, water, or
other media can be compared to health-based exposure guidelines
to identify chemical exposures that may be of concern, or to iden-
tify chemicals for which a wide margin of safety  appears to be
present. Interpretation of human biomonitoring data for environ-
mental compounds is hampered by a  lack of similar screening cri-
teria applicable to measurements of chemicals in biological media
such as blood or urine.  Such screening criteria would ideally be
based upon data from robust epidemiological studies that evaluate
a comprehensive set of  health endpoints in relationship to mea-
sured levels of chemicals in biological media. However, develop-
ment of such epidemiologically based  screening values is a
resource- and time-intensive effort, and appropriate data for such
values may never be available for many compounds. As an interim
effort, the development of Biomonitoring Equivalents (BEs) has
been proposed (Hays et al., 2007).
   A Biomonitoring Equivalent (BE) is defined as the concentration
or range of concentrations of a chemical in a biological medium
(blood, urine, or other medium) that is consistent with an existing
health-based exposure guideline. Existing chemical-specific
pharmacokinetic data are used to estimate biomarker concentrations
associated with the Points of Departure (PODs; such as No Observed
Effect Levels [NOELs], Lowest Observed Effect Levels [LOELs], or
Benchmark Doses [BMDs]) used as the basis for the exposure guidance
value and to estimate biomarker concentrations that are consistent
with the guidance value. BEs can be estimated using available human
or animal pharmacokinetic data. Guidelines for the derivation and
communication of BEs are available (Hays et al., 2008; LaKind et al.,
2008). BEs are designed to be screening tools to gauge which
chemicals have large, small or no margin of safety compared to
existing health-based exposure guidelines, and are designed to provide
a basis for prioritization of chemicals for risk assessment follow-up.
BEs are only as robust as the underlying health-based exposure
guidelines that they are based upon and the underlying animal and/or
human pharmacokinetic data used to derive the BEs. BEs are not
designed to be diagnostic for potential health effects in humans,
either individually or among a population.

 * Corresponding author.
   E-mail address: shays@summittoxicology.com (S.M. Hays).

0273-2300/$ - see front matter © 2008 Elsevier Inc. All rights reserved.
doi:10.1016/j.yrtph.2008.05.009

   Toluene is used as a solvent in numerous products including
industrial paints, adhesives, coatings, inks, and cleaning products.
Toluene is also added to aviation fuel to improve octane ratings
and is used as a raw material for the manufacture of polymers used to
make nylon, plastic soda bottles, and polyurethanes. It is also used
in the manufacture of pharmaceuticals, dyes, and cosmetic nail
products, and in the synthesis of organic chemicals including
benzene. According to the US Environmental Protection Agency
(USEPA), the primary pathway for exposure to toluene is inhalation
from ambient and indoor air, although ingestion may also occur
through trace amounts of toluene that may occur in food or water.

-------
Intentional inhalant abuse can result in high exposure to toluene
vapors. Additional general  information  regarding toluene can be
found at http://www.epa.gov/ttn/atw/hlthef/toluene.html.
   This dossier describes the scientific basis for and derivation of
BE values for toluene and discusses issues that are important for
the interpretation of biomonitoring data  using  biomonitoring
equivalents. This BE dossier is not designed to be a comprehensive
compilation of the available hazard, dose-response or risk assess-
ment information for toluene.

1.1. Current health-based exposure guidance values

   The primary effects of toluene in both humans and animals after
either acute or chronic exposure are effects on the central nervous
system (CNS). Acute exposure to high concentrations of toluene
causes symptoms  including fatigue,  sleepiness, headaches,  and
nausea.  Chronic exposure to toluene at levels above current occu-
pational  exposure guidelines has  been associated  with subtle
changes  in sensory function including  reduced color vision (re-
viewed in USEPA, 2005). High level exposures through intentional
inhalant abuse or  accidental or intentional ingestion of toluene
have also been reported to cause  effects on the liver, kidneys,
and other organ systems. With respect to  carcinogenicity, both
the International Agency for Research on Cancer (IARC) and the
US Environmental  Protection Agency (USEPA) consider toluene as
"not classifiable" as  to human carcinogenicity (Groups 3 and D,
respectively) (IARC, 1999; USEPA, 2005).
   Health-based exposure guidelines and toxicity values have been
established for many chemicals for the general population by the
USEPA (Reference Doses or Reference Concentrations [RfDs or
RfCs]), the Agency for Toxic Substances and Disease Registry (ATSDR)
(Minimal Risk Levels or MRLs), and Health Canada and the
World Health  Organization (WHO) (Tolerable  Daily  Intakes or
TDIs). Although these health-based exposure guidance values have
different labels and slightly different definitions, they all generally
describe an approximation  of daily intake rates (or air concentra-
tions) for a chemical expected to be without adverse effects in
the general population, including sensitive subpopulations (see footnote 1). For
chemicals considered to be carcinogenic, the USEPA also establishes
estimates of cancer potency by assigning a quantitative estimate of
the upper bound of potential increased cancer risk associated with
a unit of intake or air concentration (unit cancer risks, or UCRs). Fi-
nally, several organizations set chemical-specific air concentrations
that are considered to be safe for workers in the occupational envi-
ronment (for example, Threshold Limit Values [TLVs], Permissible
Exposure Limits [PELs], and Maximum Air Concentrations [MAKs]).
These values are generally not appropriate for application to the gen-
eral population on  a chronic basis, but can provide  perspective for
evaluating non-workplace environmental exposures.
   Several health-based exposure guidelines and toxicity values
are available for toluene including guidelines for both inhalation
and oral exposures. These values are summarized in Table 1. As
discussed above, toluene is generally not considered to be carcino-
genic so  no cancer potency estimates for toluene are available. In
addition, biological monitoring values for toluene in blood or tolu-
ene metabolites in urine in occupationally exposed individuals
(Biological Exposure Indices (BEIs) and Biological Tolerance Values
(BATs))  have also been established  (ACGIH,  2001;  Angerer et al.,
1998). As discussed above, these are not appropriate for applica-
tion to the general population.
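
The arithmetic behind the guidance values in Table 1 can be sketched as follows: a duration-adjusted point of departure is divided by the total uncertainty factor. This is only an illustrative check, not the agencies' own procedure. The function names are hypothetical, and the ppm conversion assumes a toluene molecular weight of 92.14 g/mol and a molar gas volume of 24.45 L/mol (25 degrees C, 1 atm). The results approximate the tabulated values, which the agencies report in rounded form.

```python
# Assumed constants (not stated in the dossier): molecular weight of toluene
# and the molar volume of an ideal gas at 25 degrees C and 1 atm.
MW_TOLUENE = 92.14      # g/mol
MOLAR_VOLUME_L = 24.45  # L/mol


def ppm_to_mg_m3(ppm, mw=MW_TOLUENE):
    """Convert an air concentration from ppm (v/v) to mg/m3."""
    return ppm * mw / MOLAR_VOLUME_L


def guidance_value(pod, total_uf):
    """Divide a point of departure by the total uncertainty factor."""
    return pod / total_uf


# (label, duration-adjusted POD in mg/m3, total uncertainty factor),
# all taken from the inhalation rows of Table 1.
inhalation = [
    ("USEPA RfC", 46.0, 10),
    ("Health Canada TDI-inhalation", 150.0 * 6 / 24, 10),
    ("WHO Air Quality Guideline", 80.0, 300),
]

for label, pod, uf in inhalation:
    print(f"{label}: {guidance_value(pod, uf):.2f} mg/m3")

# Unit check against Table 1: 34 ppm is listed as 128 mg/m3.
print(f"34 ppm = {ppm_to_mg_m3(34):.0f} mg/m3")
```

The same division applies to the oral values in Table 1, with the point of departure expressed in mg/kg day.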
 1 See the definition  of RfD at http://www.epa.gov/NCEA/iris/help_gloss.htm#r;
definitions for ATSDR MRLs are included in ATSDR Toxicological Profiles at http://
www.atsdr.cdc.gov/toxpro2.html. Definition of the TDI is available at  http://
ptcl.chem.ox.ac.uk/MSDS/glossary/tolerable_daily_intake.html.
1.2. Pharmacokinetics

   The pharmacokinetics of toluene have been studied extensively
in human volunteers and persons occupationally exposed as well
as in laboratory animals. Toluene is well absorbed following inha-
lation and oral exposure. Toluene  undergoes metabolism, princi-
pally via CYP2E1, and metabolites are excreted in urine. Toluene
is  also eliminated as parent compound  in urine and exhaled air.
The recent  USEPA  IRIS  review of toluene  includes a  detailed
description of the metabolic pathways for toluene (USEPA, 2005).
Detailed physiologically based  pharmacokinetic  (PBPK)  models
for toluene in humans and laboratory rats have been developed
by several groups of researchers and can accurately predict blood
levels associated  with a  variety of inhalation exposure  regimens
(human and rat models by Tardif et al., 1993, 1995; human models
by Jang, 1996; Pierce et al., 1998).

1.3. Biomarkers

   The objective of using BEs is to provide a human health risk
framework for screening-level evaluation of human biomonitoring
data. The choice of the biomarker (analyte and medium) should be
optimized to  facilitate this objective. The key criterion  for the
choice of a biomarker is that  it be as closely related to the appro-
priate dose to the target tissue as possible and that it be practical
for collection in a biomonitoring study. This, in turn, means that
the biomarker should be (i) the compound that causes the toxicity
(parent or metabolite) or (ii) a compound just upstream of the toxic
compound on the metabolic pathway, and (iii) as closely related to
the target tissue as possible.
   Several potential biomarkers are available for  assessing inter-
nal exposure to toluene (Table 2). Toluene is excreted unchanged
in exhaled air. However, as a quantitative biomarker, toluene in
exhaled breath is relatively insensitive and  it is difficult to ob-
tain  reliable,  reproducible measurements.  In the occupational
setting, exposure to toluene has been  monitored through mea-
surement of metabolites in urine and parent compound in blood
(ACGIH, 2001). However, use of urinary  metabolites of toluene as
markers for assessing exposure in  persons in the general popula-
tion is of limited utility because neither marker measured, hip-
puric   acid  or ortho-cresol,  is specific  to  toluene exposure.
Instead, each can be observed as metabolites of numerous parent
compounds  (Dossing et  al., 1983). Hippuric acid  levels in urine
are relatively  poorly correlated with  exposure even under occu-
pational exposure conditions  (Truchon et al., 1999). Under condi-
tions  of  higher  occupational  exposure levels, elevated  ortho-
cresol  levels are closely correlated  with  inhalation exposures,
but at  environmental exposure concentrations ortho-cresol levels
are non-specific. For instance, ortho-cresol is present  in cigarette
smoke. Thus,  it  cannot  serve  as  a specific marker  for toluene
exposure  at  low environmental  levels (Dossing et al.,  1983).
Two other urinary  metabolites, S-p-toluylmercapturic  acid and
S-benzylmercapturic acid,  could  potentially serve  as  specific
markers   for  toluene  exposure  (Angerer  et  al., 1998;  Inoue
et al., 2004). However, current analytical techniques are probably
not sensitive enough to quantitate concentrations following envi-
ronmental exposures, and  insufficient data  on quantitative rela-
tionships  between  these metabolites and  toluene exposure or
blood  levels  are currently  available.  Finally, unchanged toluene
in urine has also  been proposed as a biological marker for expo-
sure to toluene  in the occupational setting  (Fustinoni  et al.,
2007;  Kawai  et  al., 2008).  There is relatively  little literature
relating toluene  in urine  to external  exposures, and none of
the current  models  for  toluene  pharmacokinetics explicitly
include this pathway to allow quantitative prediction of elimina-
tion in urine under differing  exposure conditions.  Thus, although

-------
Table 1
Health-based exposure guidance values for toluene from various agencies

Each entry lists the organization, criterion, and year of evaluation,
followed by the study description, critical endpoint and dose,
uncertainty factors, and value.

Inhalation exposure guidelines

USEPA RfC (USEPA, 2005)
  Study description: Multiple studies of human occupationally exposed
    populations
  Critical endpoint and dose: Transient and persistent neurological
    effects. NOAEL (average): 34 ppm (128 mg/m3). NOAEL (adjusted for
    24 h, 7 day/week exposure): 46 mg/m3
  Uncertainty factors: 10 total (10 interindividual)
  Value: 5 mg/m3 (1.3 ppm)

Health Canada TDI-inhalation (Health Canada, 1996)
  Study description: Study of human occupationally exposed populations
  Critical endpoint and dose: Nervous system effects in humans.
    NOAEL: 150 mg/m3. NOAEL (adjusted by a factor of 6/24 for 24 h,
    7 day/week exposure): 38 mg/m3
  Uncertainty factors: 10 total (10 interindividual)
  Value: 3.8 mg/m3 (1 ppm)

World Health Organization Air Quality Guideline (WHO, 2005)
  Study description: Occupationally exposed workers with mean exposure
    at 332 mg/m3
  Critical endpoint and dose: Impacts on neurological performance
    tests. LOAEL of 332 mg/m3, duration adjusted to 80 mg/m3
  Uncertainty factors: 300 total (10 interindividual; 10 use of a
    LOAEL; 3 potential sensitivity of developing CNS)
  Value: 0.26 mg/m3 (0.07 ppm)

ATSDR acute inhalation MRL (1-14 days exposure) (ATSDR, 2000)
  Study description: Human volunteers, exposure to 10, 40, or 100 ppm,
    6 h per day for 4 days
  Critical endpoint and dose: Trend of decreased neurological
    performance, with NOAEL at 40 ppm. Duration adjusted
  Uncertainty factors: 10 total (10 interindividual)
  Value: 3 mg/m3 (0.8 ppm)

ATSDR chronic inhalation MRL (>1 year) (ATSDR, 2000)
  Study description: Occupationally exposed workers with mean exposure
    at 35 ppm
  Critical endpoint and dose: Decreased color vision after adjustment
    for age and alcohol use, with LOAEL at 35 ppm. Duration adjusted
  Uncertainty factors: 100 total (10 minimal LOAEL to NOAEL;
    10 interindividual)
  Value: 0.3 mg/m3 (0.08 ppm)

Oral exposure guidelines

USEPA RfD (USEPA, 2005)
  Study description: Rat gavage, 13 weeks (NTP, 1990)
  Critical endpoint and dose: Kidney weight changes as a precursor to
    kidney toxicity at higher doses. NOAEL: 223 mg/kg day.
    LOAEL: 446 mg/kg day (duration adjustment applied).
    BMDL: 228 mg/kg day
  Uncertainty factors: 3000 total (10 interspecies; 10 interindividual;
    10 subchronic to chronic; 3 database uncertainties)
  Value: 0.08 mg/kg day

Health Canada TDI-oral (Health Canada, 1996)
  Study description: Rat gavage, 13 weeks (NTP, 1990)
  Critical endpoint and dose: Liver and kidney weight changes.
    NOAEL: 223 mg/kg day. LOAEL: 446 mg/kg day (duration adjustment
    applied)
  Uncertainty factors: 1000 total (10 interspecies; 10 interindividual;
    10 subchronic to chronic)
  Value: 0.22 mg/kg day

World Health Organization TDI (WHO, 2004)
  Study description: Mouse gavage, 13 weeks (NTP, 1990)
  Critical endpoint and dose: Marginal hepatotoxic effects at the
    lowest dose
  Uncertainty factors: 1000 total (10 interspecies; 10 interindividual;
    10 subchronic to chronic and use of a LOAEL)
  Value: 0.2 mg/kg day

ATSDR acute oral MRL (1-14 days exposure) (ATSDR, 2000)
  Study description: Rats exposed by single dose corn oil gavage to 0,
    250, 500, or 1000 mg/kg day
  Critical endpoint and dose: Decreased flash-evoked potential at all
    dose levels (no dose-response trend)
  Uncertainty factors: 300 total (3 minimal LOAEL to NOAEL;
    10 interspecies; 10 interindividual)
  Value: 0.8 mg/kg day

ATSDR intermediate oral MRL (up to 1 year) (ATSDR, 2000)
  Study description: Mice exposed by drinking water for 28 days to 5,
    22, or 105 mg/kg day
  Critical endpoint and dose: Changes in neurotransmitter levels
  Uncertainty factors: 300 total (3 minimal LOAEL to NOAEL;
    10 interspecies; 10 interindividual)
  Value: 0.02 mg/kg day

toluene  in  urine may prove useful as a specific biomarker for
environmental  exposure  to toluene,  the  current data are  not
sufficient to  rely on toluene  in urine  for the  Biomonitoring
Equivalent  process. Thus, no urinary biomarker for exposure to
toluene  is currently useful for assessing general  environmental
exposures.
   Toluene  has also been measured in blood and correlated with
inhalation  exposure  levels  in  persons  exposed  occupationally
(see, for example, Neubert et al., 2001), in volunteers under condi-
tions of controlled exposure (see, for example, Pierce et al., 1998),
and in the general population (see,  for example, Sexton et al.,
2005). Toluene in blood has also  been identified as a useful bio-
marker in the occupational  setting (ACGIH, 2001).
   Identification of relevant dose metrics depends upon the health
endpoints that are the bases of the health-based screening values.
The available health-based  criteria presented in Table 1 focus on
two health endpoints.  The USEPA oral  RfD is based on  subtle
kidney toxicity following oral gavage dosing  in rats, while  the
ATSDR acute and intermediate MRL values are based on changes
in neurological endpoints in rats and mice. Inhalation criteria from
all agencies are based on subtle neurological effects observed in
humans after acute and chronic exposure to toluene.
   The mechanisms of the renal toxicity observed in rats following
subchronic oral gavage in the National Toxicology Program study
are unknown, but recent in vitro studies by Al-Ghamdi et al.
(2003a,b) in proximal tubule cell cultures suggest that the toxicity
may be attributable to benzyl alcohol, a toluene metabolite produced
via CYP2E1. Al-Ghamdi et al. (2003a,b) showed that inhibiting
CYP2E1 activity prevented toxicity in cell culture following
toluene exposure. Renal toxicity has also been observed in humans
following intentional or accidental ingestion of large amounts of
toluene and following chronic inhalation abuse (Stengel et al.,
1998). However, such toxicity has not been reported in occupational
populations exposed to more moderate air concentrations.
For example, Stengel et al. (1998) reported a
no-observed-adverse-effect-level (NOAEL) at the TLV of 50 ppm
(188 mg/m3) for renal function changes in a chronically exposed
occupational cohort. Renal toxicity is most likely a phenomenon
associated with

-------
Table 2
Potential biomarkers of exposure to toluene

Analyte: Toluene. Medium: Blood
  Advantages: Sensitive and specific; highly relevant to target tissue
    concentrations
  Disadvantages: Requires blood draw

Analyte: Toluene. Medium: Urine
  Advantages: Sample easily obtained; specific biomarker
  Disadvantages: Lack of robust data set or model to quantify
    relationship between exposure and observed levels; not directly
    relevant to target tissue concentrations

Analyte: Toluene. Medium: Exhaled air
  Advantages: Sample easily obtained; specific biomarker
  Disadvantages: Insensitive, difficult to obtain reproducible results

Analyte: Hippuric acid. Medium: Urine
  Advantages: Sample easily obtained
  Disadvantages: Non-specific metabolite; not directly relevant to
    target tissue concentrations

Analyte: ortho-Cresol. Medium: Urine
  Advantages: Sample easily obtained
  Disadvantages: Non-specific metabolite at environmental exposure
    levels; not directly relevant to target tissue concentrations

Analyte: S-p-Toluylmercapturic acid. Medium: Urine
  Advantages: Sample easily obtained; specific biomarker
  Disadvantages: Lack of robust data set or model to quantify
    relationship between exposure and observed levels; not directly
    relevant to target tissue concentrations; analytical sensitivity
    may not be sufficient

Analyte: S-Benzylmercapturic acid. Medium: Urine
  Advantages: Sample easily obtained; specific biomarker
  Disadvantages: Lack of robust data set or model to quantify
    relationship between exposure and observed levels; not directly
    relevant to target tissue concentrations; analytical sensitivity
    may not be sufficient

high peak toluene blood levels resulting in high rates of metabo-
lism and subsequent activity of metabolites in the kidney (Al-Gha-
mdi et al., 2003a,b).
   Neurological responses following inhalation exposure to tolu-
ene in humans or oral exposure in rats and mice are likely to be re-
lated directly to brain concentrations of toluene, which in turn are
directly related to blood concentrations (Benignus  et  al., 2007;
Bushnell et al.,  2007). The RfC for toluene derived  by USEPA is
based on evaluation of potential neurotoxicity, which appears to
be the most sensitive endpoint identified  in numerous  studies of
long-term occupationally exposed  populations  (see USEPA, 2005,
for a complete description of these studies and populations). These
studies  are  characterized by long-term exposure with monitored
air exposure concentrations. The mechanisms underlying the ob-
served neurotoxicity are not fully understood, but appear to be re-
lated to concentrations  of the  parent compound  (rather than
metabolites) reaching the brain (van Asperen et al., 2003; Benignus
et al., 2007; Bushnell et al., 2007). However, there are insufficient
data to conclusively identify whether peak or average toluene con-
centration in blood is the most appropriate dose metric for various
neurological responses.  The direct correlation between toluene
blood concentration and neurological responses  supports use of
blood concentration of toluene as a biomarker, and under chronic
exposure conditions, average blood concentration should be di-
rectly relevant.
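
The two candidate dose metrics discussed above, peak and average blood concentration, can be computed from any sampled concentration-time profile. The sketch below uses the trapezoidal rule for the time-weighted average. The function name and the sample profile are hypothetical and for illustration only; they are not measured toluene data.

```python
def peak_and_average(times_h, conc_mg_per_l):
    """Return (peak, time-weighted average) of a sampled blood profile.

    The average is the trapezoidal area under the curve divided by the
    length of the sampling window.
    """
    if len(times_h) != len(conc_mg_per_l) or len(times_h) < 2:
        raise ValueError("need two or more matched time/concentration samples")
    auc = 0.0
    for i in range(1, len(times_h)):
        width = times_h[i] - times_h[i - 1]
        auc += 0.5 * (conc_mg_per_l[i] + conc_mg_per_l[i - 1]) * width
    return max(conc_mg_per_l), auc / (times_h[-1] - times_h[0])


# Hypothetical profile (hours, mg/L), purely illustrative.
times = [0.0, 1.0, 2.0, 4.0, 8.0]
concs = [0.0, 0.4, 0.6, 0.2, 0.0]
peak, average = peak_and_average(times, concs)
print(f"peak = {peak} mg/L, average = {average} mg/L")
```

For an intermittent exposure the two metrics can differ severalfold, which is why the choice between them matters when a guidance value is translated into a blood concentration.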

2. BE derivation

2.1. Methods

2.1.1. Urine
   As discussed above, data do not support the use of urinary
markers for toluene exposure at environmental exposure levels
at this time, although, as discussed  above, selected specific metab-
olites or unchanged toluene in urine might serve as reliable bio-
markers if more data can be developed and analytical techniques
for those markers become sufficiently sensitive.  No urinary BE val-
ues were derived for toluene exposure.

2.1.2. Blood
   In order to estimate human blood levels associated with expo-
sure to toluene at the  various  health-based inhalation and oral
exposure guidelines detailed in Table 1, the  human PBPK model
for toluene developed by Tardif et al. (1993, 1995) was implemented.
Models by Pierce et al. (1996, 1998) and Jang (1996) were
also available. Each is similar to the model developed by Tardif
et al. (1995), but of the three, the Tardif et al. (1995) model has
been used the most extensively and was therefore chosen for use
in this BE derivation. The rat PBPK model by Tardif et al. (1993)
was also used to estimate blood concentrations in rats at the dose
levels used as the point of departure for derivation of the oral RfD
and oral MRL values. These blood concentrations at the point of
departure for risk assessment can provide additional context for
interpretation of measured blood concentrations in humans in
the general population.
   Both the human and the rat PBPK models required minor
modifications to incorporate the oral route of exposure. These
additions are described below.

2.1.2.1. Rat model. The PBPK model of Tardif et al. (1993) for toluene
inhalation exposure in rats was implemented in MS Excel® with
physiological and physicochemical parameters as described in
Tables 1 and 2 of that publication. The model was modified to
incorporate oral dosing by adding a virtual gastrointestinal tract
compartment with a first-order absorption process to the liver. To
parameterize the absorption rate from oral dosing, the gastrointestinal
absorption rate was calibrated visually against graphical data
from Sullivan and Conolly (1988) for the time course of blood toluene
concentrations following oral gavage at four different dose levels
in Sprague-Dawley rats. The absorption rate from the rat
gastrointestinal tract was adjusted to result in peak blood
concentrations between 2 and 2.5 h post-gavage, as reflected in the
Sullivan and Conolly (1988) data set. All other parameters were
retained as reported by Tardif et al. (1993). The parameters used in
the rat oral and inhalation toluene PBPK model are presented in Table 3.
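
The oral-route modification described above can be written as two mass-balance equations. This is a sketch that assumes the conventional flow-limited PBPK form with saturable hepatic metabolism; the symbols are introduced here for illustration and are not taken from the dossier. A_GI and A_L are the amounts of toluene in the gastrointestinal compartment and liver, k_a is the first-order absorption rate constant (the GI tract emptying rate in Table 3), Q_L is liver blood flow, C_A is arterial concentration, C_L is liver concentration, and P_L is the liver:blood partition coefficient.

```latex
\begin{align}
\frac{dA_{GI}}{dt} &= -k_a A_{GI}, \qquad A_{GI}(0) = \text{oral dose} \\
\frac{dA_{L}}{dt}  &= Q_L \left( C_A - \frac{C_L}{P_L} \right)
                      - \frac{V_{max}\, C_L / P_L}{K_m + C_L / P_L}
                      + k_a A_{GI}
\end{align}
```

Under this form a larger k_a shifts the blood concentration peak earlier, which is the behavior adjusted when the rate is calibrated against an observed time to peak.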

2.1.2.2. Adult human model. The human PBPK model of Tardif et al.
(1995) with parameters as reported by Nong et al. (2006, Table 1)
for toluene inhalation exposure was similarly implemented in MS
Excel®. The model was able to accurately reproduce the central
tendency of the measured blood and exhaled air concentrations
in an independent data set for volunteers exposed to 50 ppm toluene
for 2 h from Pierce et al. (1998; results not shown).
   An oral dose route was also added to the human PBPK model.
Addition of this dose route required parameterization of an oral
absorption rate constant. An oral absorption rate was calibrated
against the time course to peak exhaled air concentrations following
administration of toluene at measured drip rates for specified
time periods to human volunteers via nasogastric tube (Baelum
et al., 1993). The exhaled air concentrations peaked approximately
15-30 min following cessation of exposure. The full set of model
parameters for the adult human model is included in Table 3.
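
A minimal sketch of how such a model can be run is given below, using the parameter values listed in Table 3. It is not the authors' MS Excel implementation. It assumes the conventional four-compartment flow-limited structure (liver, fat, richly perfused, poorly perfused), instantaneous lung equilibration, saturable metabolism driven by liver venous concentration, and simple forward-Euler integration. The function name, the dictionary layout, the step size, and the ppm conversion constants (molecular weight 92.14 g/mol, 24.45 L/mol) are all assumptions made for illustration.

```python
# Illustrative flow-limited PBPK sketch; NOT the authors' implementation.
MG_PER_L_PER_PPM = 92.14 / 24.45 / 1000.0  # assumed: toluene, 25 C, 1 atm

# Values transcribed from Table 3. "ka" is the GI tract emptying rate (1/h).
HUMAN = {
    "v": {"liver": 1.82, "fat": 13.3, "rich": 3.5, "poor": 43.4},      # L
    "q": {"liver": 109.0, "fat": 21.0, "rich": 184.0, "poor": 104.0},  # L/h
    "p": {"liver": 2.98, "fat": 65.8, "rich": 2.66, "poor": 1.37},
    "qp": 418.0, "pb": 15.6, "vmax": 116.2, "km": 0.55, "ka": 0.69,
}
RAT = {
    "v": {"liver": 0.0123, "fat": 0.0225, "rich": 0.0125, "poor": 0.18},
    "q": {"liver": 1.33, "fat": 0.48, "rich": 2.70, "poor": 0.80},
    "p": {"liver": 4.64, "fat": 56.7, "rich": 4.64, "poor": 1.54},
    "qp": 5.3, "pb": 18.0, "vmax": 1.7, "km": 0.55, "ka": 0.23,
}


def simulate(par, hours, ppm=0.0, exposure_h=0.0, oral_mg=0.0, dt=0.001):
    """Integrate with forward Euler; return [(time_h, arterial_mg_per_L)]."""
    tissues = list(par["v"])
    # Cardiac output is taken as the sum of tissue flows so that mass
    # balance is exact (Table 3 lists rounded totals).
    qc = sum(par["q"].values())
    amount = dict.fromkeys(tissues, 0.0)  # mg of toluene in each tissue
    gut = oral_mg                         # mg remaining in the GI compartment
    profile = []
    for i in range(int(round(hours / dt)) + 1):
        t = i * dt
        c_inh = ppm * MG_PER_L_PER_PPM if t < exposure_h else 0.0
        # Venous concentration leaving each tissue, then mixed venous blood.
        cvt = {k: amount[k] / (par["v"][k] * par["p"][k]) for k in tissues}
        cv = sum(par["q"][k] * cvt[k] for k in tissues) / qc
        # Arterial concentration from steady-state lung equilibration.
        ca = (qc * cv + par["qp"] * c_inh) / (qc + par["qp"] / par["pb"])
        profile.append((t, ca))
        absorbed = par["ka"] * gut
        metabolized = par["vmax"] * cvt["liver"] / (par["km"] + cvt["liver"])
        for k in tissues:
            rate = par["q"][k] * (ca - cvt[k])
            if k == "liver":
                rate += absorbed - metabolized
            amount[k] += rate * dt
        gut -= absorbed * dt
    return profile


# Human volunteers: 50 ppm for 2 h, followed for 4 h in total.
human = simulate(HUMAN, hours=4.0, ppm=50.0, exposure_h=2.0)
# Rat: single gavage dose of 223 mg/kg to a 0.25 kg animal, followed 24 h.
rat = simulate(RAT, hours=24.0, oral_mg=0.25 * 223.0)
print("human peak arterial (mg/L):", max(c for _, c in human))
print("rat time of peak (h):", max(rat, key=lambda tc: tc[1])[0])
```

The sketch reports arterial concentration; a comparison with venous biomonitoring samples would need the mixed venous value instead, and the outputs should not be read as reproducing the dossier's results.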

-------
Table 3
Model parameters used in the rat and human PBPK models

Parameter                               Adult^b         Rat^a
Physiological parameters
  Body weight (kg)                      70              0.25
  Tissue volumes (L)
    Liver                               1.82            0.0123
    Fat                                 13.3            0.0225
    Richly perfused                     3.5             0.0125
    Poorly perfused                     43.4            0.18
  Cardiac output (L/h)                  418             5.3
  Alveolar ventilation (L/h)            418             5.3
  GI tract emptying rate^c (h⁻¹)        0.69            0.23
  Tissue blood flow rates (L/h)
    Liver                               109             1.33
    Fat                                 21              0.48
    Richly perfused                     184             2.70
    Poorly perfused                     104             0.80
Partition coefficients
  Blood:air                             15.6            18
  Liver:blood                           2.98            4.64
  Fat:blood                             65.8            56.7
  Richly perfused:blood                 2.66            4.64
  Poorly perfused:blood                 1.37            1.54
Metabolic constants
  Vmax (mg/h)                           116.2           1.7
  Km (mg/L)                             0.55            0.55

a From Tardif et al. (1993).
b From Tardif et al. (1995) and Nong et al. (2006).
c Fit to data sets as described in text.
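A quick consistency check on the Table 3 flows can be sketched as follows. It assumes the usual flow-limited PBPK convention that the tissue blood flows sum to cardiac output; the paper does not state this check, and the dictionary layout and the function name `flow_balance` are illustrative only.

```python
# Tissue blood flows and cardiac output (L/h) transcribed from Table 3.
# The data layout and function name are illustrative, not from the paper.
TABLE3_FLOWS = {
    "human": {
        "cardiac_output": 418.0,
        "tissues": {"liver": 109.0, "fat": 21.0,
                    "richly_perfused": 184.0, "poorly_perfused": 104.0},
    },
    "rat": {
        "cardiac_output": 5.3,
        "tissues": {"liver": 1.33, "fat": 0.48,
                    "richly_perfused": 2.70, "poorly_perfused": 0.80},
    },
}


def flow_balance(species):
    """Return (sum of tissue flows, cardiac output, relative difference)."""
    entry = TABLE3_FLOWS[species]
    total = sum(entry["tissues"].values())
    cardiac_output = entry["cardiac_output"]
    return total, cardiac_output, abs(total - cardiac_output) / cardiac_output


if __name__ == "__main__":
    for species in TABLE3_FLOWS:
        total, qc, rel = flow_balance(species)
        print(f"{species}: flows sum to {total:.2f} L/h vs. cardiac output {qc} L/h "
              f"(relative difference {rel:.2%})")
```

Under that assumption the human flows balance exactly and the rat flows agree with the tabulated cardiac output to within rounding.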



2.1.2.3. Evaluation of BEs for oral exposure guidelines. The general
approach for the derivation of BE values for the oral exposure
guidelines is presented in Figs. 1 and 2.

[Figure 1: flow diagram with columns External Dose, Relevant Internal Dose, and Monitored Biomarker, and rows Animal and Human; the monitored biomarker is average blood concentration.]
Fig. 1. Schematic of approach used to estimate BE values for toluene in humans
corresponding to oral exposure guidance values based on rat toxicity data. NOAELadj
POD: Point of departure, adjusted for duration and LOAEL to NOAEL, as appropriate;
UFA-PD: component of interspecies uncertainty factor for pharmacodynamic
sensitivity; UFA-PK: component of interspecies uncertainty factor for pharmacokinetic
sensitivity; UFH: intraspecies uncertainty factor; UFD: uncertainty factor component
for database uncertainties, where applicable. See text for discussion.

[Figure 2: flow diagram with rows Animal and Human, ending in Human average blood concentration.]
Fig. 2. Schematic of approach used to estimate BE values for toluene in humans
corresponding to oral exposure guidance values based on mouse toxicity data.
UFH-PD: component of intraspecies uncertainty factor for pharmacodynamic
sensitivity.

   For those exposure guidance values derived based on rat toxicity
study data, the rat and human PBPK models were used in combination
to derive BE values. Briefly, the process (Fig. 1) is as follows:

•  Step 1: Calculate relevant animal internal dose at POD. In this case,
   effects on liver and/or kidney following chronic gavage administration
   of toluene are the most sensitive effects and serve as the
   basis for the derivation of oral exposure guidelines. These effects
   are likely related to production of metabolites in these organs.
   Production of these metabolites is likely to be proportional to
   area under the curve of toluene in these organs. Thus, toluene
   area under the curve for kidney (modeled as richly perfused tissue)
   or liver (modeled explicitly) was selected as the relevant
   internal dose metric, and an estimate of the target organ AUC
   at the duration- and LOAEL-to-NOAEL adjusted POD was made
   for each of the oral exposure guidelines. The average blood
   concentration in the animals at the POD (BEPOD_animal) was also
   estimated using the PBPK model.
•  Step 2:  Interspecies extrapolation. Interspecies extrapolation of
   this relevant internal dose metric to a corresponding human tar-
   get organ AUC by application of an interspecies uncertainty factor
   for pharmacodynamic differences. An interspecies factor for phar-
   macokinetic differences was also applied. This factor accounts for
   unknown differences between humans and the experimental ani-
   mals  of interest in the  pharmacokinetics  of the metabolites
   believed to be responsible for the organ-specific toxicity.
•  Step 3: Calculate BEPOD. Application of the human pharmacokinetic
   model to identify an average blood concentration corresponding
   to the relevant target organ internal dose measure
   identified above (human equivalent BEPOD).
•  Step 4: Calculate BE. Application of relevant intraspecies uncer-
   tainty factor(s) and any additional applicable uncertainty factors
   identified by the organizations that derived the oral exposure
   guidelines initially (for  example, database  uncertainty factors
   sometimes applied by USEPA). Because the measured biomarker
   is directly  related to the  internal dose metric of interest, direct
   measurement of this biomarker concentration replaces applica-
   tion of  the  pharmacokinetic component  of the intraspecies
   uncertainty factor  in derivation of the BE values (Hays et al.,
   2008); only the pharmacodynamic factor is appropriate on an
   internal dose basis in this case.
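The arithmetic in Steps 2 and 4 can be sketched as below, using the USEPA chronic RfD column of Table 4 as the worked case. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the database uncertainty factor is taken as exactly 3, and Step 3 (running the human PBPK model) is not reproduced, so the human equivalent BEPOD of 16 µg/L is taken directly from Table 4.

```python
import math

# Half-log uncertainty factor component, written 10^0.5 in Table 4.
HALF_LOG = math.sqrt(10.0)


def human_equivalent_target_conc(animal_conc, uf_a_pd=HALF_LOG, uf_a_pk=HALF_LOG):
    """Step 2: divide the animal target-organ concentration by both
    interspecies uncertainty factor components."""
    return animal_conc / (uf_a_pd * uf_a_pk)


def be_from_bepod(bepod, uf_h_pd=HALF_LOG, other_uf=1.0):
    """Step 4: divide the human equivalent BEPOD by the intraspecies
    pharmacodynamic component and any other applicable factors. The
    intraspecies pharmacokinetic component is 1 because the biomarker
    is measured directly."""
    return bepod / (uf_h_pd * other_uf)


if __name__ == "__main__":
    # USEPA chronic RfD column of Table 4 (concentrations in ug/L).
    target = human_equivalent_target_conc(390.0)
    be = be_from_bepod(16.0, other_uf=3.0)  # database factor assumed to be 3
    print(f"Human equivalent target organ conc.: {target:.0f} ug/L")
    print(f"BE before rounding: {be:.2f} ug/L")
```

With these inputs the human equivalent target organ concentration comes out at about 39 µg/L and the BE at about 1.7 µg/L, in line with the tabulated values of 39 and 2.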

   For those oral exposure guidance values derived based on
mouse toxicity data (the WHO TDI and  the ATSDR intermediate
MRL), a modified process outlined in Fig. 2 was used because of
the lack of a mouse PBPK model. In this approach, the interspecies
extrapolation is conducted on an external dose basis to obtain the
human equivalent external dose POD. The human average blood
concentrations associated  with  this POD were then estimated
using the  human  PBPK  model to obtain the human equivalent
2.1.2.4. Evaluation of BEs for inhalation exposure guidelines. The general
approach for the derivation of BE values for the inhalation
exposure guidelines is presented in Fig. 3. Each of the applicable
guidelines is derived based on human data. Thus, the derivation
process does  not involve an  interspecies  extrapolation. Briefly,
the process is as follows:

•  Step 1: Calculate the BEPOD.  The steady-state blood concentra-
   tions in humans exposed at  the duration- and LOAEL-to-NOAEL
   adjusted PODs (based on human study data) were modeled
   using the PBPK model described above. Because blood concen-
   tration  has been identified  as a directly relevant dose metric
   for neurological effects, the relevant internal dose metric and
   the  monitored biomarker concentration are the same.  Thus,
   these modeled blood concentrations are the BEPOD values used
   in  the  derivation  of the  BEs for the  inhalation  exposure
   guidelines.
•  Step 2: Calculate the BE. Application of relevant intraspecies
   uncertainty factor(s) and any additional applicable uncertainty
   factors identified by the organizations that derived the inhalation
   exposure guidelines initially (for example, database uncertainty
   factors sometimes applied by USEPA). As above, because the
   measured biomarker is directly related to the internal dose metric
   of interest (for this set of exposure guidelines they are the same,
   blood toluene concentration), direct measurement of this
   biomarker concentration replaces application of the pharmacokinetic
   component of the intraspecies uncertainty factor in derivation
   of the BE values (Hays et al., 2008); only the
   pharmacodynamic factor is appropriate on an internal dose
   basis in this case.
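
As a worked illustration of Step 2, the relation can be written out with the human equivalent BEPOD values and uncertainty factors listed in Table 6. The symbol UF_other for the additional factors is introduced here for convenience and is not the paper's notation, the WHO factor for the developing CNS is assumed to be exactly 3, and the rounding to one significant figure is inferred from the tabulated values rather than stated in the text.

```latex
\begin{align*}
\mathrm{BE} &= \frac{\mathrm{BE}_{\mathrm{POD}}}
                   {\mathrm{UF}_{H\text{-}PD} \times \mathrm{UF}_{\mathrm{other}}} \\[4pt]
\text{USEPA RfC:}\quad
\mathrm{BE} &= \frac{170}{10^{0.5}} \approx 54\ \mu\mathrm{g\,L^{-1}}
  \quad (\text{tabulated as } 50) \\[4pt]
\text{WHO guideline:}\quad
\mathrm{BE} &= \frac{30}{10^{0.5} \times 3} \approx 3.2\ \mu\mathrm{g\,L^{-1}}
  \quad (\text{tabulated as } 3)
\end{align*}
```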
2.2. Results of modeling and identification of BE values

2.2.1. Urine
   As discussed above, no specific and useful urinary markers for
toluene exposure at environmental exposure levels currently exist.
No urinary BE values were derived for toluene exposure.

2.2.2. Blood—oral exposure
   All of the available chronic oral exposure guidelines are based
upon extrapolation from the same study of subchronic (13 weeks)
administration of toluene by gavage to rats or mice at duration-adjusted
doses of 223, 446, 893, 1786, or 3571 mg/kg day (NTP,
1990). Two organizations, the USEPA and Health Canada, based
their guidance values on liver or kidney toxicity observed in the
rat gavage study, while the WHO based its oral guidance value
on results from the mouse study. The ATSDR also derived exposure
guidance values for acute (1 to 14 day) and intermediate (up to 1
year) exposures. The acute MRL was derived based on a single dose
rat gavage study, while the intermediate MRL was derived based
on a 28-day mouse drinking water study.

[Figure 3: flow diagram with columns External Dose, Relevant Internal Dose, and Monitored Biomarker; the human NOAELadj POD passes through the human PK model to give the human average blood concentration.]
Fig. 3. Schematic of approach used to estimate BE values for toluene corresponding
to inhalation exposure guidance values. See text for discussion.

   Table 4 presents the modeling results and BE derivation for the
oral exposure guidance values based on rat toxicity data using the
general approach outlined in Fig. 1. Table 5 presents  the corre-
sponding  results and BE derivation for those oral exposure guid-
ance values derived from  mouse toxicity data according to  the
approach  in Fig.  2.

2.2.3. Blood—inhalation exposure
   All  of  the available inhalation exposure  guidance values  are
based  on  studies of human  occupationally  exposed populations
with a focus on  a range of potential neurological effects. Several
studies in human occupational cohorts  provide LOAEL or  NOAEL
exposure  estimates for all studied neurological endpoints includ-
ing both transient and persistent effects, as well as for a wide range
of other biochemical and health effect endpoints.  Different agen-
cies have made slightly different choices in their selection of points
of departure for  derivation of exposure  guidance values, as sum-
marized in Table 1. Selected occupational exposure levels were ad-
justed  to  an equivalent continuous  exposure concentration from
intermittent exposures experienced in the workplace. This adjust-
ment is applied  to account for the presumption that the general
public  could be continuously  exposed in air. Note that this adjust-
ment implicitly  assumes  that the average concentration (or area
under the curve) is the critical dose metric. However, it is possible
that peak concentrations or time above a threshold level is as—or
more—important than average concentration in producing neuro-
toxic effects. Estimated peak blood concentrations following expo-
sure under actual  occupational  exposure  concentrations  are
approximately 3-fold higher  than the duration-adjusted average
blood concentrations (modeling  not shown). The ATSDR has also
derived an  acute duration MRL (1-14  day exposure) based  on
dose-response for  neurological  effects  observed  in a volunteer
study.
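
The adjustment from intermittent workplace exposure to an equivalent continuous concentration can be sketched as a simple time-weighting. The schedule of 8 h/day and 5 days/week used here is an assumption for illustration; the text does not state the schedule each agency applied, some agencies adjust by breathing-rate ratios instead, and the function name is hypothetical.

```python
def continuous_equivalent(conc, hours_per_day=8.0, days_per_week=5.0):
    """Time-weight an intermittent exposure concentration to an equivalent
    continuous (24 h/day, 7 day/week) concentration.

    The default schedule is an assumed occupational pattern, not a value
    taken from the paper.
    """
    return conc * (hours_per_day / 24.0) * (days_per_week / 7.0)


if __name__ == "__main__":
    # Occupational LOAEL of 332 mg/m^3 from the WHO column of Table 6.
    adjusted = continuous_equivalent(332.0)
    print(f"Continuous equivalent: {adjusted:.1f} mg/m^3")
    # Applying the LOAEL-to-NOAEL factor of 10 listed in Table 6.
    print(f"After LOAEL-to-NOAEL factor of 10: {adjusted / 10.0:.1f} mg/m^3")
```

Under this assumed schedule the 332 mg/m³ LOAEL scales to about 79 mg/m³, and dividing by the LOAEL-to-NOAEL factor of 10 gives about 7.9 mg/m³, close to the POD of 8 listed for the WHO guideline in Table 6. Because the calculation weights by time alone, it carries the same implicit assumption noted above: that average concentration, not peak concentration, is the critical dose metric.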
   The results of PBPK modeling and BE derivation for the inhala-
tion exposure guidance values are presented in Table 6. The BE val-
ues from  different  agencies  differ substantially due to different
judgments regarding whether selected occupational exposure lev-
els represent NOAELs or LOAELs. The BE values corresponding to
the USEPA RfC and the Health Canada inhalation TDI are higher
than the human equivalent BEPOD values derived  from the WHO
and ATSDR exposure guidance values.

Table 4
Estimated internal dose metrics and average human blood concentrations consistent with the derivation of oral exposure guidance values for toluene based on rat toxicity data
(see Fig. 1)

BE derivation step                                USEPA chronic RfD                 Health Canada chronic oral TDI    ATSDR acute MRL
Target organ                                      Kidney                            Kidney, liver                     Brain
Administered dose regimen                         321 mg kg⁻¹ day⁻¹, rat gavage,    321 mg kg⁻¹ day⁻¹, rat gavage,    250 mg kg⁻¹, rat single-dose
                                                  5 day/week, 13 weeks (NOAEL)      5 day/week, 13 weeks (NOAEL)      gavage (LOAEL)
LOAEL-to-NOAEL adjustment                         None                              None                              3
Duration adjustment and/or benchmark dose         Adjust for 5/7 day/week;          Adjust for 5/7 day/week           None
  modeling                                        benchmark dose modeling
Subchronic to chronic adjustment                  10                                10                                NA
POD, mg kg⁻¹ day⁻¹                                23                                22                                83
BEPOD_animal, µg L⁻¹ (corresponding animal avg.   90                                90                                830
  blood conc. from PBPK model)
Animal avg. target organ conc. from PBPK          390^a                             390^a-450^b                       3650^a
  model, µg L⁻¹
Interspecies uncertainty factors
  Pharmacodynamic                                 10^0.5                            10^0.5                            10^0.5
  Pharmacokinetic                                 10^0.5                            10^0.5                            10^0.5
Human equivalent target organ avg. conc.,         39                                39-45                             365
  µg L⁻¹
Human equivalent BEPOD, µg L⁻¹ (corresponding     16                                12-16                             150
  human avg. blood conc. from PBPK model)
Intraspecies uncertainty factors
  Pharmacodynamic                                 10^0.5                            10^0.5                            10^0.5
  Pharmacokinetic                                 1^c                               1^c                               1^c
Other uncertainty factors                         3 (database uncertainties)        NA                                NA
BE value, µg L⁻¹                                  2                                 3-5                               50
Confidence rating^d                               Medium                            Medium                            Medium

  a Average daily toluene concentration in kidney resulting from once daily bolus dosing at the NOAELadj POD as estimated from the richly perfused compartment of the PBPK
model.
  b Average daily toluene concentration in liver resulting from once daily bolus dosing at the NOAELadj POD as estimated from the liver compartment of the PBPK model.
  c Measurement of a biomarker that is directly relevant to the internal dose metric of interest replaces the default uncertainty factor for pharmacokinetic sensitivity. See
text for discussion.
  d See text for discussion.

Table 5
Derivation of BEs for oral exposure guidance values for toluene based on mouse toxicity data (see Fig. 2)

BE derivation step                                ATSDR intermediate MRL              WHO chronic TDI
Target organ                                      Brain                               Liver
Administered dose regimen                         5 mg kg⁻¹ day⁻¹, mouse drinking     321 mg kg⁻¹ day⁻¹, mouse gavage,
                                                  water, 28 day (LOAEL)               5 day/week, 13 weeks (LOAEL)
  LOAEL-to-NOAEL adjustment                       3                                   10^0.5
  Duration adjustment and/or benchmark dose       NA                                  Adjust for 5/7 day/week
    modeling
  Subchronic to chronic adjustment                NA                                  10^0.5
POD, mg kg⁻¹ day⁻¹                                [missing]                           22
Interspecies uncertainty factors
  Pharmacodynamic                                 10^0.5                              10^0.5
  Pharmacokinetic                                 10^0.5                              10^0.5
Human equivalent POD, mg kg⁻¹ day⁻¹               0.17                                2.2
Human equivalent BEPOD, µg L⁻¹ (corresponding     1.7                                 23
  human avg. blood conc. from PBPK model)
Intraspecies uncertainty factors
  Pharmacodynamic                                 10^0.5                              10^0.5
  Pharmacokinetic                                 1^a                                 1^a
Other uncertainty factors                         NA                                  NA
BE value, µg L⁻¹                                  0.5                                 7
Confidence rating^b                               Medium                              Medium

  a Measurement of a biomarker that is directly relevant to the internal dose metric of interest replaces the default uncertainty factor for pharmacokinetic sensitivity. See
text for discussion.
  b See text for discussion.

2.3. Discussion of sources of variability and uncertainty

2.3.1. Model uncertainty
   The PBPK model used here has been used extensively to evaluate
data sets for human and rat inhalation exposure and can
reproduce the observed blood concentration vs. time behavior
from independent data sets. The model incorporates understanding
of the physiological, physicochemical, and metabolic determinants
of toluene pharmacokinetics. However, as discussed
above, its application to oral exposures introduces some
additional uncertainty due to the behavior of rapidly eliminated
volatile compounds and the uncertainties associated with estimation
of peak blood concentrations associated with bolus oral
dosing. At very high oral exposures, saturation of metabolism
may become an issue resulting in non-linear relationships between
external doses and resulting blood concentrations. However,
at environmentally relevant exposures, such saturation is
unlikely to occur. Thus, as a tool for predicting the central
tendency of blood concentrations associated with inhalation
exposure to toluene, the model uncertainty is low, while uncertainty
is somewhat higher for estimating peak concentrations
following oral exposures. Another potential uncertainty relates
to the modeling of kidney concentrations. The existing published
PBPK models used here do not include an explicit kidney
compartment. The use of the "richly perfused" compartment
for simulation of kidney concentrations is appropriate, but inclusion
of an explicit kidney compartment could be considered if
future data support this refinement of the model.
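
The point about saturation of metabolism can be illustrated with the adult human Vmax and Km from Table 3. The sketch below assumes metabolism follows a Michaelis-Menten rate law, as the presence of Vmax and Km suggests; the function names are hypothetical, and the concentration argument stands for whatever concentration the model supplies at the site of metabolism, which is not specified in this section.

```python
def metabolic_rate(conc, vmax=116.2, km=0.55):
    """Michaelis-Menten metabolic rate (mg/h) at concentration conc (mg/L).

    Defaults are the adult human Vmax (mg/h) and Km (mg/L) from Table 3.
    """
    return vmax * conc / (km + conc)


def linear_rate(conc, vmax=116.2, km=0.55):
    """First-order (unsaturated) approximation, valid when conc << Km."""
    return (vmax / km) * conc


def linearity_ratio(conc, vmax=116.2, km=0.55):
    """Ratio of the saturable rate to its linear approximation; values
    near 1 indicate negligible saturation."""
    return metabolic_rate(conc, vmax, km) / linear_rate(conc, vmax, km)


if __name__ == "__main__":
    for conc in (0.001, 0.055, 0.55, 5.5):  # mg/L, illustrative values only
        print(f"conc = {conc:>6} mg/L  ratio to linear = {linearity_ratio(conc):.3f}")
```

The ratio equals Km / (Km + conc), so it stays close to 1 while the concentration is well below Km and falls to 0.5 at conc = Km; the dose-to-blood-concentration relationship departs from linearity only in that upper range.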
2.3.2. Analytical
   The analytical methods for measuring toluene in blood are well
established (ACGIH, 2001). The variability due to analytical issues
should be minor in the context of the BE values.

2.3.3. Interindividual variations in pharmacokinetics
   Differences in body composition (body fat levels, etc.), level of
physical activity, metabolic capability, and other factors can lead
to variations in blood concentrations of toluene associated with a
given exposure level. Data sets from controlled exposure experiments
show variations in blood levels in individuals exposed to
the same external air concentrations. Pierce et al. (1998) exposed
individuals to controlled concentrations of toluene in air for 2 h
and followed blood concentrations for approximately 100 h after
exposure ceased. At each time point, variations of a factor of two
to three from the mean were observed among the individuals, consistent
with a default assumption that interindividual pharmacokinetic
differences could account for 3-fold variations from the
mean. These experimental results are consistent with the
variations predicted when physiological variability is incorporated
into PBPK modeling using the Tardif et al. models (Tardif et al.,
2002).

Table 6
Estimated internal dose metrics and average human blood concentrations consistent with the derivation of inhalation exposure guidance values for toluene based on human
toxicity data for central nervous system (CNS) effects (see Fig. 3)

BE derivation step                  USEPA chronic RfC       Health Canada chronic   WHO air quality         ATSDR chronic           ATSDR acute MRL
                                                            inhalation TDI          guideline               inhalation MRL
Target organ                        CNS effects             CNS effects             CNS effects             CNS effects             CNS effects
Administered dose regimen           34 ppm (128 mg m⁻³)     40 ppm (150 mg m⁻³)     88 ppm (332 mg m⁻³)     35 ppm (132 mg m⁻³)     40 ppm (150 mg m⁻³)
                                    NOAEL, occupational     NOAEL, occupational     LOAEL, occupational     LOAEL, occupational     NOAEL, volunteers
                                    exposure                exposure                exposure                exposure                exposed 6 h/day, 4 day
LOAEL-to-NOAEL adj.                 NA                      NA                      10                      10                      NA
Duration adjustment                 Adjust to continuous    Adjust to continuous    Adjust to continuous    Adjust to continuous    Adjust to continuous
                                    exposure                exposure                exposure                exposure                exposure
Subchronic to chronic adjustment    None                    None                    None                    NA                      NA
POD, mg m⁻³ continuous              46                      38                      8                       3                       30
Human equivalent BEPOD, µg L⁻¹      170                     135                     30                      10                      100
  (corresponding human avg. blood
  conc. from PBPK model)
Intraspecies uncertainty factors
  Pharmacodynamic                   10^0.5                  10^0.5                  10^0.5                  10^0.5                  10^0.5
  Pharmacokinetic                   1^a                     1^a                     1^a                     1^a                     1^a
Other uncertainty factors           NA                      NA                      3 (potential            NA                      NA
                                                                                    sensitivity of
                                                                                    developing CNS)
BE value, µg L⁻¹                    50                      40                      3                       3                       30
Confidence rating^b                 High                    High                    High                    High                    High

 a Measurement of a biomarker that is directly relevant to the internal dose metric of interest replaces the default uncertainty factor for pharmacokinetic sensitivity. See text
for discussion.
 b See text for discussion.

2.3.4.  Gender and age
   Nong et al. (2006) conducted a modeling study to evaluate the
impact of the development of CYP2E1 metabolic capability in in-
fants and children on the predicted blood concentration of toluene
following inhalation exposure. Nong et al. (2006) used data on the
concentration of hepatic CYP2E1 protein (Johnsrud et al., 2003) and
the change in liver tissue volume as a function of age to estimate
total CYP2E1  metabolic capability as a fraction of adult capability.
Using these data with age-specific physiologic parameters in  the
Tardif et al. (1995) PBPK model, Nong  et al. predicted that blood
levels in newborn infants could be as much as 3 times higher than
blood levels predicted in adults at the same air exposure level. Pre-
dicted blood concentrations in older children and adolescents were
more  similar to those predicted in adults. Nong et al. (2006) noted
that this degree of variability was consistent with the pharmacoki-
netic component of the interindividual uncertainty factor used in
the derivation of the RfC. No data on the impact of gender on the
pharmacokinetics of toluene were identified, other than those
resulting from physiological variability, which can  be accounted
for using the PBPK model. Varying bodyweight and other physio-
logical parameters in the model  to account for female vs. male
physiology does not result in marked changes in predicted blood
concentrations (variations generally less than about 10 percent)
(Pelekis et al., 2001).

2.3.5. Smoking, drugs, alcohol co-exposures
   Ethanol can inhibit metabolism of toluene through competition
for CYP2E1 (Baelum, 1991). Thus, co-exposure to ethanol or other compounds
that are substrates for or inhibitors of CYP2E1 may result in
prolonged elevation of toluene blood concentrations compared to
those resulting from exposure to toluene alone. Smoking is a source
of toluene exposure and smokers consistently demonstrate higher
blood concentrations of toluene than non-smokers (see, for example,
Churchill et al., 2001). However, no information is available regard-
ing the impact of smoking on elimination rates of toluene.

2.3.6. Polymorphisms in enzymes or other factors
   Researchers are beginning to identify polymorphisms in genes
coding for key metabolic enzymes and examine the impact of such
polymorphisms on potential responses. Polymorphisms in several
of the enzymes known to be involved in the metabolism of toluene,
including CYP2E1, have been identified. However, researchers have
focused on correlating the occurrence of such polymorphisms with
susceptibility to various conditions (see, for example, Heuser et al.,
2007; Kezic et al., 2006) rather than directly assessing the effects of
these polymorphisms on metabolic capability. Some studies have
indicated an impact of such polymorphisms on enzymatic activity,
but data are limited to date (reviewed in Gemma et al., 2006). Thus,
we cannot draw any conclusions regarding the  impact of such
polymorphisms on predicted blood concentrations in individuals
exposed  at an  exposure guideline.  Such  polymorphisms may
account for some of the variability in blood levels observed among
individuals after controlled exposures (see above).

2.4. Confidence assessment

   Guidelines for derivation of BE values (Hays et al., 2008) specify
consideration  of two main elements in the assessment of confi-
dence in the derived BE values: robustness  of the available phar-
macokinetic  models  and  data,  and  understanding  of  the
relationship between the measured biomarker and the critical or
relevant target tissue dose metric.

2.4.1. Confidence in BE values based on oral exposure guidelines
   As discussed  above, the oral exposure route introduces addi-
tional uncertainties in estimating blood BE values corresponding
to the oral exposure guidelines.  These uncertainties stem from
uncertainty regarding the appropriate dose metric (for example,
area under the curve vs.  peak target organ concentrations)  and
uncertainty in the active metabolite responsible for liver or  kid-
ney toxicity. In this assessment we have relied on area under the
curve of the parent compound in the target organ of interest as
the relevant dose metric. However, if peak concentration (or
peak metabolite production) is more relevant, then two further
uncertainties affect confidence in the BE values: the appropriate
oral absorption rate value for the models (which impacts estimates
of peak, but not average, blood concentrations), and the appropriate
dosing regimen to assume for exposure at the health-based exposure
guidelines (once per day bolus vs. divided dose; again, impacting
peak but not average blood concentrations). Blood concentration as a biomarker
for toluene  should be  directly  related to average target organ
concentration, but may be less informative regarding peak target
organ concentration. For this reason, confidence in the BE values
associated with the oral dosing route is lower than that for the
inhalation exposure route.
   In summary, the confidence ratings for BE values for oral expo-
sure guidance values are:

•  Relevance of biomarker to  relevant dose metrics: MEDIUM.
•  Robustness of pharmacokinetic data/models: MEDIUM.
2.4.2. Confidence in BE values based on inhalation exposure guidelines
   Blood toluene concentrations are directly related to target tis-
sue concentrations in the brain. The available PBPK  models are
well-validated  and have been extensively used  in combination
with occupational data sets in humans. The BE values for inhala-
tion exposure guidance values for toluene based on blood concen-
tration as  the  biomarker thus  have HIGH confidence for both
aspects.
   In summary, the confidence ratings for BE values for inhalation
exposure guidance values are:

• Relevance of biomarker to relevant dose metrics: HIGH.
• Robustness of pharmacokinetic data/models:  HIGH.

   The summary confidence ratings are presented in Tables 4-6.


3. Discussion and interpretation of BE values

   The BE values presented here represent the concentrations of
toluene in blood that are consistent with exposure at the exposure
guideline values  that have been established by various agencies
(Table  1). These BE values should be regarded as interim values
that can be updated or replaced if the exposure guideline values
are updated or if the scientific and regulatory communities develop
additional data on acceptable or tolerable concentrations in human
biological media based directly on epidemiological data.
   The BE values presented here are screening values  and can be
used to provide a screening-level  assessment of measured blood
concentrations  of toluene in population- or cohort-based studies.
Comparison of measured values to the values presented here can
provide an initial evaluation of whether the measured values in a
given study are of low, medium, or high priority for risk assess-
ment follow-up. Fig. 4 illustrates the presentation of the BE value
corresponding  to the Health Canada  inhalation TDI. Measured
biomarker values in excess of the human equivalent  BEPOD indi-
cate a  high priority for risk assessment follow-up. Values below
the BEPOD but above the BE suggest a medium priority for risk
assessment  follow-up, while those below the BE values suggest
low  priority for risk assessment follow-up.  Based on the results
of such comparisons, an evaluation can be made of the need for
additional studies on exposure pathways, potential health effects,
other aspects affecting exposure or risk, or other risk management
activities.

[Figure 4: blood toluene concentration scale marked at the BEPOD (135 µg L⁻¹) and the BE (40 µg L⁻¹), separating High, Medium, and Low priority regions.]
Fig. 4. Example presentation of the BE value corresponding to the Health Canada
inhalation TDI. The BEPOD corresponds to the average blood concentrations
estimated at the identified human no-observed-adverse-effect-level used as point
of departure for the guideline derivation. The BE value presents the blood
concentration consistent with the TDI (see Table 6 and the text for details on the
derivation). Similar graphs could be prepared for the BE values derived for each of
the available exposure guidance values.
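
The three-tier screening rule described above can be sketched as a small function. The function name is hypothetical, and the treatment of a measurement exactly equal to the BE or BEPOD (assigned to the lower tier here) is an assumption, since the text does not address boundary values. The example values are the Health Canada inhalation TDI figures from Table 6.

```python
def follow_up_priority(measured, be, bepod):
    """Classify a measured blood toluene concentration for risk assessment
    follow-up, given the BE and human equivalent BEPOD in the same units.

    Values above the BEPOD are 'high', values above the BE but not above
    the BEPOD are 'medium', and all others are 'low'. Assigning boundary
    values to the lower tier is an assumption of this sketch.
    """
    if be > bepod:
        raise ValueError("BE is expected to be at or below the BEPOD")
    if measured > bepod:
        return "high"
    if measured > be:
        return "medium"
    return "low"


if __name__ == "__main__":
    # Health Canada inhalation TDI: BE = 40 ug/L, BEPOD = 135 ug/L (Table 6).
    for value in (5.0, 60.0, 150.0):  # illustrative measurements, ug/L
        print(f"{value} ug/L -> {follow_up_priority(value, be=40.0, bepod=135.0)}")
```

As the text emphasizes, the result is a screening priority for populations or cohorts and is not a diagnostic statement about an individual.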
   Numerous  exposure guideline values and thus BEs exist for
interpreting human biomonitoring data for toluene. Selecting the
most appropriate BE (and  BEPOD) for interpreting biomonitoring
data may depend on several factors including: the year the expo-
sure guidance value was established (and thus potentially reflects
advancement in understanding of toluene toxicity, mechanism of
action, or available studies  for deriving an exposure guideline va-
lue); whether the exposure  guidance value was based upon animal
or human toxicity data; the route of exposure from which the
exposure guideline value was derived (and thus potentially reflects
the more predominant  pathway for exposure in the environment);
the degree of uncertainty involved in the derivation of the expo-
sure guidance value; and other judgments regarding the reliability
of the underlying exposure guidance value.
   BE values do not represent diagnostic  criteria and  cannot be
used to evaluate the likelihood of an adverse health effect in an
individual or even among a population. In the case of toluene, BE
values corresponding to exposure guidance values from different
agencies differ widely, and interpretation of biomonitoring data re-
sults may depend upon which guidance value is regarded as most
reliable and  appropriate for a given situation. Further discussion of
interpretation and communications aspects of the BE values is pre-
sented in LaKind et al.  (2008).

Disclaimer

   This work was reviewed by EPA and approved for publication,
but does not necessarily reflect official Agency policy. Mention of
trade names or commercial products does  not constitute endorse-
ment or recommendation by EPA for use.

Conflict of interest disclosure statement

   The authors declare that they have no conflicts of interest.
Acknowledgments


    Funding for this project was provided by the US Environmental
Protection Agency, Health Canada, the American Chemistry Coun-
cil, the American  Petroleum Institute,  Responsible  Industry for a
Sound Environment, and the Soap and  Detergent Association. The
authors thank the BE Steering Committee for their advice and guid-
ance: John  H. Duffus,  Monty Eberhart,  Bruce Fowler (advisor),
George Johnson, Mike Kaplan, Bette Meek, David  Moir, David J.
Miller,  Larry L. Needham (advisor),  and Bob  Sonawane.  Finally,
the authors thank  Michael J. Bartels, Peter J. Boogaard, and Kannan
Krishnan  for  their   careful  review  and  comments  on  this
manuscript.


References

Agency for Toxic Substances and  Disease Registry (ATSDR), 2000. Toxicological
    profile  for  toluene.  Available from:  .
Al-Ghamdi, S.S., Raftery, M.J., Yaqoob, M.M., 2003a. Acute solvent exposure induced
    activation of cytochrome P4502E1  causes proximal  tubular cell necrosis by
    oxidative stress. Toxicol. In Vitro  17, 335-341.
Al-Ghamdi, S.S., Raftery,  M.J.,  Yaqoob, M.M.,  2003b.  Organic solvent-induced
    proximal tubular cell toxicity via caspase-3 activation. J. Toxicol. Clin. Toxicol.
    41, 941-945.
American Conference of Governmental and Industrial Hygienists (ACGIH), 2001.
    Documentation of toluene BEI. Available from:  .
Angerer, J., Schildbach, M., Kramer, A., 1998. S-p-Toluylmercapturic acid in the urine
    of workers exposed to  toluene: a new biomarker for toluene exposure. Arch.
    Toxicol. 72,  119-123.
Baelum,  J.,  1991.   Human  solvent   exposure.  Factors   influencing   the
    pharmacokinetics and acute toxicity. Pharmacol. Toxicol. 68  (Suppl. 1), 1-36.
Baelum, J., Molhave, L., Honore Hansen, S., Dossing, M., 1993. Hepatic metabolism of
    toluene after gastrointestinal uptake in humans. Scand. J. Work Environ. Health
    19, 55-62.
Benignus,  V.A., Boyes, W.K., Kenyon,  E.M.,  Bushnell,  P.J., 2007. Quantitative
    comparisons of  the acute  neurotoxicity of  toluene in rats  and  humans.
    Toxicol. Sci. 100, 146-155.
Bushnell, P.J., Oshiro, W.M., Samsam, T.E., Benignus, V.A., Krantz, Q.T., Kenyon, E.M.,
    2007. A dosimetric  analysis of the acute behavioral effects of inhaled toluene in
    rats. Toxicol. Sci. 99, 181-189.
Churchill, J.E., Ashley, D.L., Kaye, W.E., 2001. Recent chemical exposures and blood
    volatile organic compound levels in a large population-based sample. Arch.
    Environ. Health 56, 157-166.
Dossing, M., Aelum, J.B., Hansen, S.H., Lundqvist, G.R., Andersen, N.T., 1983. Urinary
    hippuric acid and orthocresol excretion in man during experimental  exposure
    to toluene. Br. J. Ind. Med. 40, 470-473.
Fustinoni, S., Mercadante, R., Campo, L., Scibetta, L., Valla, C., Consonni, D., Foa, V.,
    2007. Comparison between urinary o-cresol and toluene as biomarkers of
    toluene exposure. J. Occup. Environ. Hyg. 4, 1-9.
Gemma, S., Vichi, S., Testai, E., 2006. Individual susceptibility and alcohol effects:
    biochemical and genetic aspects. Ann. Ist. Super. Sanita 42, 8-16.
Hays, S.M., Becker, R.A., Leung, H.W., Aylward, L.L., Pyatt, D.W., 2007. Biomonitoring
    equivalents: a screening approach for interpreting biomonitoring results from a
    public health risk perspective. Regul. Toxicol. Pharmacol. 47, 96-109.
Hays, S.M., Aylward, L.L., LaKind, J.S., Bartels, M.J., Barton, H.A., Boogaard, P.J., Brunk,
    P.J., Dizio, S., Dourson, M., Goldstein, D.A., Lipscomb, J., Kilpatrick, M.E., Krewski,
    D., Krishnan, K., Nordberg, M., Okino, M., Tan, Y.-M., Viau, C., Yager, J.W., 2008.
    Guidelines  for the  derivation  of Biomonitoring Equivalents: Report  from  the
    Biomonitoring Equivalents Expert Workshop. Regul. Toxicol. Pharmacol. 51, S4-
    S15.
Health Canada, 1996.  Health-based tolerable daily  intakes/concentrations  and
    tumorigenic  doses/concentrations  for priority  substances. Environmental
    Health Directorate. Available from: .
Heuser, V.D., Erdtmann, B., Kvitko, K., Rohr,  P., da Silva, J., 2007. Evaluation of
    genetic damage in  Brazilian footwear-workers: biomarkers of exposure, effect,
    and susceptibility.  Toxicology 232, 235-247.
Inoue, O., Kanno, E., Kasai, K., Ukai, H., Okamoto, S., Ikeda, M., 2004.
    Benzylmercapturic acid  is superior  to  hippuric acid  and o-cresol  as a
    urinary marker  of occupational exposure to toluene. Toxicol. Lett. 147,
    177-186.
International Agency for Research on Cancer (IARC), 1999. IARC Monographs on the
    Evaluation of Carcinogenic Risks to Humans. Vol. 71: Re-Evaluation of Some
    Organic Chemicals, Hydrazine  and Hydrogen Peroxide. Summary of Data
    Reported  and  Evaluation.  Available from:   (accessed 12/2007).
Jang, J.-Y., 1996. Simulation of the toluene in venous blood with a physiologically
    based pharmacokinetic model:  its application to  biological exposure  index
    development. Appl. Occup. Environ. Hyg. 11, 1092-1095.
Johnsrud, E.K., Koukouritaki, S.B., Divakaran, K., Brunengraber, L.L., Hines, R.N.,
    McCarver, D.G., 2003. Human hepatic CYP2E1 expression during development.
    J. Pharmacol. Exp. Ther. 307, 402-437.
Kawai, T., Ukai, H., Inoue, O., Maejima, Y., Fukui, Y., Ohashi, F., Okamoto, S., Takada,
    S., Sakurai, H., Ikeda, M., 2008. Evaluation of biomarkers of occupational
    exposure to toluene at low levels. Int. Arch. Occup. Environ. Health 81, 253-262.
Kezic, S., Calkoen, F., Wenker,  M.A., Jacobs, J.J., Verberk, M.M.,  2006. Genetic
    polymorphism  of  metabolic enzymes modifies the  risk  of chronic solvent-
    induced encephalopathy. Toxicol. Ind. Health 22, 281-289.
LaKind, J.S., Aylward, L.L., Brunk, C., DiZio, S., Dourson, M., Goldstein, D.A., Kilpatrick,
    M.E., Krewski, D., Bartels, M., Barton, H.A., Boogaard, P.J., Lipscomb, J., Krishnan,
    K., Nordberg, M., Okino, M., Tan, Y.-M., Viau, C., Yager, J.W., Hays, S.M., 2008.
    Guidelines for the communication of Biomonitoring Equivalents: Report from
    the Biomonitoring Equivalents Expert Workshop. Regul. Toxicol. Pharmacol. 51,
    S16-S26.
National Toxicology Program (NTP), 1990. Toxicology and carcinogenesis studies of
    toluene (CAS #108-88-3) in F344/N rats and B6C3F1 mice (inhalation studies).
    NTP Technical Report Series No. 371. U.S. Department of Health and Human
    Services.
Neubert, D., Gericke, C., Hanke, B., Beckmann, G., Baltes, M.M., Kühl, K.P., Bochert, G.,
    Hartmann, J., Toluene Field Study Group, 2001. Multicenter field trial on possible
    health effects of toluene: II. Cross-sectional evaluation of acute low-level
    exposure. Toxicology 168, 158-183.
Nong, A., McCarver,  D.G.,  Hines, R.N., Krishnan, K., 2006. Modeling  interchild
    differences in  pharmacokinetics  on the basis  of subject-specific  data  on
    physiology and hepatic CYP2E1  levels: a  case study with toluene. Toxicol.
    Appl. Pharmacol. 214, 78-87.
Pelekis,  M., Gephart, L.A., Lerman, S.E., 2001. Physiological-model-based derivation
    of the adult  and child pharmacokinetic intraspecies uncertainty factors  for
    volatile organic compounds. Regul. Toxicol. Pharmacol. 33, 12-20.
Pierce, C.H., Dills, R.L., Morgan, M.S., Nothstein, G.L., Shen, D.D., Kalman, D.A., 1996.
    Interindividual  differences  in  2H8-toluene  toxicokinetics  assessed  by
    semiempirical physiologically based  model. Toxicol. Appl. Pharmacol. 139,
    49-61.
Pierce, C.H., Dills, R.L., Morgan, M.S., Vicini, P., Kalman, D.A., 1998. Biological
    monitoring of controlled toluene exposure. Int. Arch. Occup. Environ. Health 71,
    433-444.
Sexton, K., Adgate, J.L., Church, T.R., Ashley, D.L., Needham, L.L., Ramachandran, G.,
    Fredrickson, A.L., Ryan, A.D., 2005. Children's exposure to volatile organic
    compounds as determined by longitudinal measurements in  blood. Environ.
    Health Perspect. 113, 342-349.
Stengel, B., Cenee, S., Limasset, J.C., Diebold, F., Michard, D., Druet, P., Hemon, D.,
    1998. Immunologic and renal markers among photogravure printers exposed to
    toluene. Scand. J. Work Environ. Health 24, 276-284.
Sullivan, M.J., Conolly, R.B., 1988.  Comparison  of blood  toluene  levels after
    inhalation and oral administration. Environ. Res. 45, 64-70.
Tardif, R., Lapare, S., Krishnan, K., Brodeur, J., 1993. Physiologically based modeling
    of the toxicokinetic  interaction between toluene  and m-xylene in the rat.
    Toxicol. Appl. Pharmacol. 120, 266-273.
Tardif, R., Droz, P.O., Charest-Tardif, G., Pierrehumbert, G., Truchon, G., 2002. Impact
    of human variability on the biological monitoring of exposure to toluene: I.
    Physiologically based toxicokinetic modelling. Toxicol. Lett. 134, 155-163.
Tardif,   R., Lapare,  S.,  Charest-Tardif,  G.,  Brodeur,  J., Krishnan,  K.,   1995.
    Physiologically based  modeling  of a  mixture of  toluene  and xylene  in
    humans. Risk Anal. 15,  335-342.
Truchon, G., Tardif, R., Brodeur, J., 1999. o-Cresol: a good indicator of exposure  to
    low levels of toluene. Appl. Occup. Environ. Hyg. 14, 677-681.
United  States Environmental Protection  Agency (USEPA),  2005.  Toxicological
    Review of Toluene (CAS No. 108-88-3) In Support of Summary Information on
    the Integrated Risk Information System (IRIS). EPA/635/R-05/004.
van Asperen, J., Rijcken, W.R., Lammers, J.H., 2003. Application of physiologically
    based toxicokinetic modelling to study the impact of the exposure scenario on
    the toxicokinetics  and the behavioural effects of toluene in rats. Toxicol. Lett.
    138, 51-62.
World Health Organization (WHO),  2004. Guidelines for Drinking-Water Quality,
    vol.  1. Recommendations, third ed. World Health Organization, Geneva.
World Health Organization  (WHO), 2005. Air Quality Guidelines for Europe, second
    ed. WHO Regional Office for Europe, Copenhagen. Available from: .

-------
TOXICOLOGICAL SCIENCES 108(1), 207-221 (2009)
doi:10.1093/toxsci/kfp005
Advance Access publication January 27, 2009
   Comparative Microarray Analysis  and  Pulmonary  Changes in Brown
 Norway  Rats Exposed  to Ovalbumin  and  Concentrated Air Particulates

Brooke L. Heidenfelder,* David M. Reif,† Jack R. Harkema,‡ Elaine A. Cohen Hubal,† Edward E. Hudgens,* Lori A. Bramble,‡
        James G. Wagner,‡ Masako Morishita,§ Gerald J. Keeler,§ Stephen W. Edwards,¶ and Jane E. Gallagher*,1
   *Mail Drop 58 C Human Studies Division, National Health and Environmental Effects Research Laboratory, Office of Research and Development, US
   Environmental Protection Agency, Research Triangle Park, North Carolina 27711; †Mail Drop D343-03 and Mail Drop 205-01, National Center for
  Computational Toxicology, Office of Research and Development, US Environmental Protection Agency, Research Triangle Park, North Carolina 27711;
 ‡Department of Pathobiology and Diagnostic Investigation, Michigan State University, East Lansing, Michigan 48824; §Air Quality Laboratory, University
    of Michigan, Ann Arbor, Michigan 48109; and ¶Mail Drop B305-01 Office of the Associate Director of Health, National Health and Environmental
    Effects Research Laboratory Immediate Office, US Environmental Protection Agency, Office of Research and Development, Research Triangle Park,
                                                   North Carolina 27711

                                      Received October 7, 2008; accepted December 23, 2008
  The interaction between air particulates and genetic suscepti-
bility has been  implicated in the pathogenesis of asthma. The
overall objective of this study was to  determine the effects of
inhalation exposure to environmentally  relevant concentrated air
particulates (CAPs) on  the lungs of ovalbumin (ova) sensitized
and challenged Brown Norway rats. Changes in gene expression
were compared with lung tissue histopathology, morphometry,
and  biochemical  and cellular parameters in bronchoalveolar
lavage  fluid (BALF). Ova challenge  was responsible  for  the
preponderance of  gene  expression  changes, related largely  to
inflammation. CAPs exposure alone resulted in no significant gene
expression changes, but CAPs and ova-exposed rodents exhibited
an  enhanced effect  relative  to ova alone  with differentially
expressed genes primarily related  to inflammation  and airway
remodeling. Gene expression data were consistent with the
biochemical and cellular analyses of the BALF, the pulmonary
pathology, and morphometric changes when comparing the CAPs-
ova group to the air-saline or CAPs-saline group. However, the
gene expression data were more sensitive than the BALF cell type
and number for assessing the effects of  CAPs and ova versus the
ova challenge alone. In  addition,  the  gene expression results
provided some additional insight into the TGF-β-mediated
molecular processes underlying these changes.  The broad-based
histopathology and functional genomic analyses demonstrate that
exposure to CAPs exacerbates allergen-induced allergic inflammation
in rodents and suggest that asthmatics may be at
increased risk for air pollution effects.
  Key Words: ovalbumin;  allergen; asthma;  particulate matter;
remodeling; inflammation; microarray.
  1 To whom correspondence should be addressed at Mail Drop 58C National
Health and Environmental Effects Laboratory, US Environmental Protection
Agency, Office of Research and  Development, Research Triangle  Park, NC
27711. Express mail: 104 Mason Farm Rd. Human Study Facility, Chapel Hill,
NC 27514. Fax: (919) 966-0655. E-mail: gallagher.jane@epa.gov.

Published by Oxford University Press 2009.
        The prevalence of asthma  has  increased  in recent years,
      particularly in industrialized countries, making it an important
      public health concern. The air pollution and specific allergens
      that  accompany  urbanization are thought  to  be  partly re-
      sponsible for this increase (Busse and Mitchell, 2007). Diesel
      exhaust (DE) is a major contributor to particulate matter (PM)
      related  air  pollution,  especially  in urban areas. As DE
      emissions have increased  globally, so has asthma prevalence
      (Keller and Lowenstein, 2002). Airborne PM has a number of
      detrimental health effects, particularly in asthmatics (Riedl and
      Diaz-Sanchez, 2005). Genetics are also a factor; they determine
      the  susceptibility to asthma or other respiratory diseases that
      can be influenced by PM (Moller et al., 2007).
        PM from air pollution has several mechanisms of action for
      generating  inflammation,  damaging cells, and  exacerbating
      airway hyperresponsiveness (Gilmour et al., 2006). In par-
      ticular, PM derived from a variety of sources induces the
      production of reactive oxygen species  in inflammatory  cells
      (Becker et al., 2002) and oxidative  DNA damage  in human
      airway epithelial cells  (Prahalad et al., 2001). PM exposure
      also  has  specific allergy-related effects.  For  example,  the
      polyaromatic hydrocarbons in DE particles have been shown to
      increase IgE production by human B cells  (Takenaka et al.,
      1995; Tsien et al., 1997). Increases in coarse PM correlate with
      increases in circulating eosinophils,  serum triglycerides, and
      decreased heart  rate variability  in asthmatics (Yeatts et al.,
      2007). PM has  also been shown to act as an  adjuvant  in
      ovalbumin (ova)-induced allergic reactions in mice and Brown
      Norway (BN) rats (Dong et al., 2005; Harkema et al., 2004a;
      Matsumoto et al., 2006; Miyabara et al., 1998; Takano et al.,
      1997). BN rats challenged with ova have greater PM particle
      retention and eosinophilia in the lungs, compared with
      unchallenged BN rats (Morishita et al., 2004). Humans with
      allergies have increased Th2 cytokines and ragweed-specific
      IgE expression after coexposure to ragweed allergen and DE
-------
                                                   HEIDENFELDER ET AL.
particulates (Diaz-Sanchez et al., 1997). The mechanism for
the effect of PM on asthma remains unclear, due to its complex
interactions with  the immune system and  other  metabolic
pathways (McCunney, 2005).
  Examining  the complex  interaction between environmental
exposures  and genetic factors  can provide vital  information
needed to better  understand  and treat respiratory diseases
(Kleeberger and Peden, 2005). Because asthma genetics are
complex  and likely to involve  many  genes  that  vary by
population, emerging genomics tools such as
expression profiling and pathway analyses are needed (Ober
and Hoffjan, 2006). Using  a rodent model of airway allergic
inflammation or pulmonary  allergy minimizes genetic variation
so that allergy- and exposure-induced gene expression changes
can be isolated and  investigated. Gene expression profiling in
animal  models  has yielded large numbers  of differentially
expressed  genes,  from  which   genes related  to  asthma
susceptibility  and pathogenesis have been distilled (Follettie
et al., 2006; Izuhara and Saito, 2006; Kuperman et al., 2005;
Walker et al., 2006). However, there is little information
regarding  the  precise  mechanisms  of  gene-environment
interactions in relation to asthma (London,  2007).
  BN  rats sensitized  with  ova are thought to  resemble the
clinically significant features of allergic asthma (Salmon et al.,
1999; Tarayre et al., 1992; Underwood et al., 1995) and are often
used as an animal model for pulmonary allergic conditions
such as  asthma because  they have a high capacity for IgE
production and exhibit airway hyperresponsiveness following
exposure  to inhaled allergens (Abadie and Prouvost-Danon,
1980; Pauwels et al., 1979). A few studies have examined
global  gene  expression   profiles  in  rodents   exposed  to
environmental agents, including concentrated ambient partic-
          ulates (CAPs) (Gunnison and Chen, 2005; Sigaud et al., 2007),
          environmental tobacco smoke (Izzotti et al., 2005; Nadadur
          et al., 2002), ozone (Leikauf et al., 2001; Park et al., 2004;
          Williams et al., 2007), and PM (Kooter et al., 2005; Nadadur
          and Kodavanti, 2002; Sato et al., 1999; Wise et al., 2006). BN
          rats coexposed to both lipopolysaccharide (LPS) and tobacco
          smoke (Meng et al., 2006) or mice coexposed to LPS and DE
          (Yanagisawa et al., 2004) have also been examined via micro-
          array, as models of chronic obstructive pulmonary disorder and
          acute lung injury, respectively.
            To our  knowledge, this  study is  the  first to use  gene
          expression array profiling and traditional toxicological ap-
          proaches to decipher and evaluate how
          susceptibility and exposure to environmentally relevant CAPs
          act together to perturb biological pathways that may be
          important to the exacerbation of asthma.
                         MATERIALS AND METHODS

            Animals and exposure regimen.  Figure 1 shows the experimental design
          for the four treatment groups of BN rats (Charles River, Indianapolis, IN). Each
          group consisted of eight male rats, aged  10-12 weeks. Rats were free of
          pathogens and respiratory disease, and used in accordance with guidelines set
          forth by the Institutional Animal Care and Use Committee at Michigan State
          University. All four groups were sensitized in the animal care facility, for three
          consecutive days to ova (0.5% in saline, intranasally). Two weeks later, two of
          the groups were then challenged with saline vehicle, the other two with ova, by
          intranasal instillation (0.5% ova in saline, 150 µl/nasal passage) for 3 con-
          secutive days. Starting the day after the last challenge, two groups of rats (one
          each of the saline and ova-challenged groups) were exposed to  CAPs from
          Grand Rapids, MI (8 h/day for 13 days). There was a second ova or saline
          challenge 9 days following the first ova challenge. Rats were sacrificed 24 h
          after the last CAPs exposure.
  Treatment group   Sensitization   1st challenge   Exposure       2nd challenge
                    (days 0-2)      (days 14-16)    (days 17-29)   (day 26)
  Air-saline        Ova             Saline          Filtered air   Saline
  Air-ova           Ova             Ova             Filtered air   Ova
  CAPs-saline       Ova             Saline          CAPs           Saline
  CAPs-ova          Ova             Ova             CAPs           Ova
  End (lung tissue harvest): day 30
  FIG. 1.  Experimental design and exposure regimen. The initial ova sensitizations and first ova or saline challenges were administered intranasally over three
consecutive days (0-2 and 14-16, respectively). Animals were exposed to CAPs or ambient filtered air on days 17-29. A second ova or saline challenge was
administered intranasally on day 26. Lung tissue was harvested on day 30.

-------
                    COMPARATIVE MICROARRAY ANALYSIS AND PULMONARY CHANGES IN BROWN NORWAY RATS
   The CAPs exposure and ova challenge occurred via the mobile air research
laboratory, AirCARE 1 (Harkema et al., 2004b). AirCARE 1 contained whole
body  inhalation  chambers  with  a  Harvard/EPA  ambient  fine  particle
concentrator, a biomedical lab,  an inhalation exposure lab, and an atmospheric
monitoring lab (Harkema et al., 2004b; Keeler et al., 2007). The fine particle
concentrator was a three-stage aerosol concentrator that used virtual
impactors to increase the concentration of particles (size range 0.1-2.5 µm)
by an approximate factor of 30 (Sioutas et al., 1997). The remaining two groups
were exposed to HEPA-filtered ambient air, also in AirCARE 1.

   Characterization of CAPs.  CAPs were collected during each 8-h exposure
period. The mass concentrations of CAPs were determined by placing 47-mm
Teflon filters (Gelman Sciences, Inc., Ann Arbor,  MI) in Teflon filter packs
attached to the back of the animal exposure chambers at  flow rates of 3 LPM.
After the gravimetric determination, CAPs samples  on Teflon filters were
extracted  in  10%  nitric  acid  and analyzed for  a suite of trace elements
(including the crustal/urban dust related elements Fe, Si, Ca, Al) using inductively
coupled plasma-mass spectrometry (ELEMENT 2, Thermo Finnigan, San Jose,
CA). Pre-baked quartz filters (Gelman Sciences, Inc., Ann Arbor, MI) were also
placed in Teflon filter packs mounted  on the exposure  chambers and were
analyzed  for carbonaceous (organic and elemental) aerosols  by  a thermal-
optical analyzer (Sunset Labs, Forest Grove, OR). Annular denuder/filter pack
samples were also collected and analyzed for acid gases and ion species by ion
chromatography (Model  DX-600, DIONEX, Sunnyvale,  CA).
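
The gravimetric step above implies a simple conversion from filter mass gain to airborne mass concentration. The text does not state the formula, so the following is a minimal sketch assuming the standard relation (mass collected divided by the volume of air sampled), with the 3 LPM flow rate and 8-h exposure period from the text as defaults. The function name and the example mass are hypothetical.

```python
def mass_concentration_ug_m3(filter_mass_gain_ug, flow_lpm=3.0, hours=8.0):
    """Particle mass on the filter divided by the volume of air sampled.

    Assumed relation (not given in the text): C = mass / (flow * time).
    """
    # Convert L/min over the sampling period to cubic metres (1000 L = 1 m^3).
    sampled_volume_m3 = flow_lpm * 60.0 * hours / 1000.0
    return filter_mass_gain_ug / sampled_volume_m3


# Illustrative value only: a 144 ug mass gain over 1.44 m^3 of sampled air.
example = mass_concentration_ug_m3(144.0)
```

At 3 LPM for 8 h the sampled volume is 1.44 m³, so the illustrative 144 µg filter gain corresponds to roughly 100 µg/m³.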

   RNA isolation.   Four animals from each group were randomly chosen for
gene expression analysis. Total RNA was isolated  from the right cranial lung
lobe,  using  RNeasy Mini kits (Qiagen Inc.,  Valencia, CA) with DNase
treatment. RNA quality  was checked using an Agilent  Bioanalyzer (Agilent
Technologies, Palo Alto, CA).  RNA was quantified on a NanoDrop ND-1000
spectrophotometer (NanoDrop Technologies, Wilmington, DE), and then 3.8 µg
of each sample was sent to Expression Analysis (Durham, NC) for cDNA target
generation and hybridization to  rat R230 2.0 whole genome arrays (Affymetrix,
Inc., Santa Clara, CA). RNA from one rat in the CAPs-saline exposure group
failed to generate enough target to hybridize to a microarray chip.

   Gene array data analysis.   Invariant probe signals were removed using the
REDI  (reduction  of invariant probes)  method,  proprietary  to Expression
Analysis. Array data were normalized using  robust multiple-array  averaging
(RMA), and compared by group using the permutation analysis for differential
expression (PADE, http://www.expressionanalysis.com/pdf/PADE_TechNote_2005.pdf)
algorithm. Noise from the PADE was removed by setting probe
signals that were < 128 to 128 (the level of noise) to prevent artificially high
fold change differences, and fold changes were recalculated. Probesets with
absolute fold changes < 1.5 and those not meeting an accumulated false discovery
rate (FDR) of < 0.05 were removed. The resulting significant probesets from each
PADE analysis are shown in Supplemental Table S1. Probesets were annotated
using NetAffx (Affymetrix Inc., Santa Clara, CA).  Only three PADE analyses
had a list of significant  genes  that met the cutoff values. The probeset lists
generated by  this procedure  were uploaded  into GeneSpring 7  (Agilent
Technologies, Palo Alto,  CA) for principal components and clustering analyses.
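
The noise flooring and cutoff steps described above can be sketched as follows. This is an illustration under stated assumptions, not the PADE implementation: the text does not say how fold changes were computed, so a ratio of group means on the linear signal scale is assumed, and all function names are hypothetical. Only the thresholds (128, 1.5, and 0.05) come from the text.

```python
NOISE_FLOOR = 128.0   # signal level the text treats as noise
MIN_ABS_FOLD = 1.5    # absolute fold-change cutoff from the text
MAX_FDR = 0.05        # accumulated FDR cutoff from the text


def floored_fold_change(treated_signals, control_signals, floor=NOISE_FLOOR):
    """Signed fold change of group means after raising low signals to the floor.

    Assumption: fold change is the ratio of group means on the linear scale.
    """
    treated = [max(s, floor) for s in treated_signals]
    control = [max(s, floor) for s in control_signals]
    ratio = (sum(treated) / len(treated)) / (sum(control) / len(control))
    # Report down-regulation as a negative fold change (e.g. 0.5 becomes -2.0).
    return ratio if ratio >= 1.0 else -1.0 / ratio


def passes_cutoffs(fold_change, fdr):
    """Keep a probeset only if it meets both the fold-change and FDR cutoffs."""
    return abs(fold_change) >= MIN_ABS_FOLD and fdr < MAX_FDR
```

Flooring matters most when one group sits near background: control signals of 32 and 64 against treated signals of 512 would give a fold change above 10 unfloored, but 4 once both controls are raised to 128.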
   Pathways analysis with MetaCore.  Probeset lists created using an FDR of
<0.2 (Supplemental Table S2) rather than <0.05 were uploaded into MetaCore
4.5 (GeneGo, Inc., St Joseph,  MI,  http://www.genego.com/metacore.php) for
functional and pathways analysis.  Each gene  identifier was  mapped to its
corresponding gene object in the  MetaCore  database. The  genes were then
compared with both gene ontology (GO) processes  and GeneGo  maps to
determine processes or pathways which were significantly overrepresented in
the differentially expressed gene list. The p values for maps and processes were
calculated using a hypergeometric distribution.
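
A hypergeometric enrichment p value of the kind mentioned above can be computed as an upper-tail probability. The sketch below shows the general technique only; MetaCore's exact formulation and background gene set are not described in the text, and the function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe, in_pathway, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `universe` genes, of which `in_pathway` belong to the pathway."""
    upper = min(in_pathway, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(in_pathway, k) * comb(universe - in_pathway, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

For example, with 10 genes in the universe, 4 in a pathway, and 3 selected, the probability of seeing at least 2 pathway genes in the selection is 40/120, or one third.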

   Pathways analysis with Ingenuity Pathways Analysis.  Probeset lists
created using an FDR of <0.2 (Supplemental  Table  S2) rather than <0.05
were analyzed through the use of Ingenuity Pathways Analysis (Ingenuity
Systems,   www.ingenuity.com). Each   gene  identifier  was mapped  to  its
corresponding gene object in the Ingenuity Pathways Knowledge Base. These
       genes  were  overlaid  onto  a  global  molecular  network  developed  from
       information contained in the Ingenuity Pathways Knowledge Base. Networks
       of these focus genes  were then algorithmically  generated based on  their
       connectivity.  The Functional Analysis of the top scoring network identified the
       biological functions and/or diseases that were most significant to the genes in
       the network.  The network genes associated  with biological functions and/or
       diseases in the Ingenuity Pathways Knowledge Base were considered for the
       analysis. Fisher's exact test was used to calculate a p value determining the
       probability that each biological function and/or disease assigned to that network
       is due to chance alone.
          Confirmatory quantitative PCR.   Total RNA was  used to generate cDNA
       for confirmatory quantitative PCR (qPCR). Total RNA (350 ng) was reverse
       transcribed in a buffer containing 1X reverse transcriptase polymerase chain
       reaction buffer, 25 µM random hexamers, 5 mM dithiothreitol, 500 µM deoxynucleotide
       triphosphates, 20 U RNase OUT, and 200 U Superscript III (all from
       Invitrogen Corp., Carlsbad, CA) for 10 min at 25°C,  60 min at 50°C, and 15 min
       at 75°C. Six genes of interest and one endogenous control (β-actin) were
       analyzed by confirmatory qPCR. TaqMan MGB probes (Applied Biosystems,
       Foster City, CA) for transforming growth factor β3 (TGF-β3), Fc receptor IgE
       high affinity I alpha polypeptide (Fcer1a), complement component 4 binding
       protein (C4BP), vascular endothelial growth factor C (VEGF-C), chitinase
       3-like 1 (Chi3l1), metallothionein 1a (Mt1a), and the β-actin endogenous control
       were used according to the recommended procedure on a 7500 Real-Time PCR
       machine (Applied Biosystems, Foster City, CA). Briefly, 5 µl of cDNA was
       mixed with 2.5 µl of 20X TaqMan MGB probe mix, 17.5 µl of water, and 25 µl
       of 2X TaqMan Universal PCR Master Mix for a total volume of 50 µl per well
       in a 96-well optical plate (Applied Biosystems, Foster City, CA). Reactions for
       each sample were performed in duplicate. The plate was run for 2 min at 50°C,
       then 10 min at 90°C,  followed by 40 cycles of 15 s  at 95°C and 1 min at 60°C.
       The data were analyzed using the Auto CT method to generate a standard curve,
       and the duplicate relative  concentration measurements for each sample were
       averaged. Data were  normalized to the control gene concentration by dividing
       the mean relative concentration  for  a gene of interest by  the mean relative
       concentration of the β-actin control.  Normalized values were then averaged
       among rats in each treatment group.
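
The normalization described above (duplicate measurements averaged, gene of interest divided by the endogenous control, then averaged over the rats in a group) can be sketched as below. Function names and the input layout are hypothetical, and the relative concentrations are assumed to have already been read off the standard curve.

```python
from statistics import mean


def normalized_expression(gene_duplicates, actin_duplicates):
    """Mean relative concentration of the gene divided by that of beta-actin."""
    return mean(gene_duplicates) / mean(actin_duplicates)


def group_mean(rats):
    """Average the normalized values over the rats in one treatment group.

    `rats` is a list of (gene_duplicates, actin_duplicates) pairs, one per rat.
    """
    return mean(normalized_expression(g, a) for g, a in rats)
```

Normalizing within each rat before averaging keeps differences in RNA input between animals from dominating the group mean.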
          Statistical  analysis of qPCR results.  Data from qPCR (n  = 4)  were
       expressed as the mean group value ± the standard error of the mean. The data
       were subjected to ANOVA for factors of inhalation exposure type (e.g., CAPs
       or filtered air) and airway sensitization/challenge (ova or saline). Significant
       differences between  experimental groups were identified  using the Tukey
       honest  significant difference  post  hoc  test.  The criterion for statistical
       significance was p < 0.05.
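
The summary statistic reported above (group mean ± standard error of the mean) can be computed as in this minimal sketch. It assumes the sample standard deviation (n - 1 denominator); the ANOVA and Tukey post hoc test would normally come from a statistics package and are not shown. The function name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_sem(values):
    """Return (mean, standard error of the mean) using the sample SD (n - 1)."""
    return mean(values), stdev(values) / sqrt(len(values))
```

With n = 4 per group, as in the qPCR data, the SEM is half the sample standard deviation.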
          Lung tissue section and bronchoalveolar lavage. All animal sacrifices
       were conducted in the laboratory of Dr. Harkema at Michigan State University.
       At the time of sacrifice, animals were deeply anesthetized, 5 ml of whole blood
       was collected from the ascending vena cava, and the animal was exsanguinated.
       The trachea was cannulated, and the  heart/lung block  was removed from the
       thoracic cavity. The right extrapulmonary bronchus was ligated with suture and
       the right lung lobes were removed. The  entire right cranial lobe was processed
       for isolation  of total  RNA (see above). The  left lung lobe  was lavaged with
       saline  and  the recovered lavage fluids (bronchoalveolar lavage fluid; BALF)
       from each rat were analyzed for total and differential cell counts, total and ova-
       specific IgE  and secreted mucins  by  enzyme-linked immunosorbent assay
       (ELISA). After bronchoalveolar lavage  the left lung lobe was perfusion-fixed
       with 10% neutral buffered formalin via the trachea at a  constant pressure of
       30 cm of fixative. After 2 h of intratracheal fixation, the trachea was ligated and
       the lung lobe was immersed in a large volume of the same fixative for at least
       24 h before the left  lung lobe was further processed for light microscopy as
       described below.
          Analysis of BALF.  Total cells recovered by bronchoalveolar lavages were
       determined manually using a hemacytometer. Cytospin preparations from eight
       rodents per treatment group were made  with a cytocentrifuge and stained with
       Diff-Quick  (IMEB,   Inc., San  Marcos,   CA).  Differential  cell  counts

-------
(e.g., neutrophils, macrophages, eosinophils, lymphocytes) were determined by
counting 200 cells per animal. The BALF was centrifuged to remove cells and
debris, and the supernatant was stored at -80°C. Cell-free supernatant was
assayed for protein content by the bicinchoninic acid method (#23255, Pierce
Chemical Co., Rockford, IL). Mucin glycoprotein 5AC in BALF was analyzed
by ELISA  using a  monoclonal antibody (mucin5AC Ab-1;  Neomarkers,
Fremont,  CA), a peroxidase-conjugated avidin/biotin complex (ABC Reagent;
Vector Laboratories,  Burlingame, CA), and a fluorescent substrate (Quanta-
Blue;  Pierce Chemical, Rockford, IL). Total and ova-specific IgE were
determined in ELISA (colorimetric capture assays) by coating plates with anti-
rat IgE (Pharmingen, San Diego, CA) or ova, respectively. After incubation
with BALF, bound samples were detected with biotinylated anti-rat IgE and
quantified with a peroxidase system (Vector  Laboratories, Burlingame, CA)
using  a Bio Tek Elx 808 plate reader.
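A differential tally of this kind is commonly converted to absolute cell numbers by scaling each cell type's share of the counted cells to the total recovered cell count. The Python sketch below illustrates that arithmetic only; the function name and the example animal are hypothetical, and the text does not state that the authors reported absolute counts in exactly this way.

```python
def absolute_counts(total_cells, tally):
    """Scale a differential tally to absolute cell numbers.

    total_cells: total cells recovered by lavage (hemacytometer count).
    tally: dict mapping cell type -> cells classified on the cytospin slide.
    """
    counted = sum(tally.values())
    if counted == 0:
        raise ValueError("tally is empty")
    return {cell: total_cells * n / counted for cell, n in tally.items()}


# Hypothetical animal: 2.0e6 total cells, 200 cells classified.
example = absolute_counts(
    2.0e6,
    {"macrophages": 150, "eosinophils": 30, "neutrophils": 14, "lymphocytes": 6},
)
```

With 200 cells counted, each classified cell stands for 0.5% of the total, so rare cell types carry the largest relative counting error.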
   Lung tissue collection and epithelial morphometry.  The intrapulmonary
airways of the fixed left lung lobe from each of eight rodents were
microdissected. Beginning at the lobar bronchus, airways were split down
the long axis of the largest daughter branches (i.e., main axial airway; large
diameter conducting airway) through the 12th airway generation. Tissue blocks
that traverse the entire lung lobe at the level of the fifth and eleventh airway
generations of the main axial airway were excised and processed for light
microscopy and morphometric analyses. The lung tissue blocks were embedded
in paraffin, and 5- to 6-µm sections from the proximal face of each block were
cut and placed on charged slides (Probe-on-plus; Fisher Scientific, Pittsburgh,
PA). Tissue sections were histochemically stained with (1) hematoxylin and
eosin (H&E) for evaluation of epithelial morphology and quantitation of
epithelial cell numeric densities, or (2) Alcian Blue (pH 2.5)/Periodic Acid
Schiff's sequence (AB/PAS) to detect acidic and neutral mucosubstances for
quantitation of stored mucosubstances within the airway epithelium.
   Morphometric analysis of intraepithelial mucosubstances in airway.
The volume density (Vs) of stored intraepithelial mucosubstances
(IMs) in the surface epithelium lining the proximal and distal pulmonary axial
airways (generations 5 and 11, respectively) was determined using image
analysis and standard morphometric techniques previously described in detail
(Harkema et al., 1997a, b). The quantity of stored mucosubstances per unit area
was determined as described by Harkema et al. (1987a, b) and expressed as the
mean volume (nl) of intraepithelial mucosubstances (IM)/mm² of basal lamina
± standard error of the mean (Fig. 8D).
   Statistical analysis of BALF and morphometric endpoints.  Data describing
the type and magnitude of the pulmonary inflammatory response in
BALF and morphometric changes in mucosubstances in airway epithelium
(n = 8) were expressed as the mean group value ± the SEM. The data were
subjected to ANOVA for factors of inhalation exposure type (e.g., CAPs or
filtered air) and airway sensitization/challenge (ova or saline). Significant
differences between experimental groups were identified using appropriate
post hoc tests (Tukey's omega procedure). Transformation of data (e.g., log or
arcsine) was performed if needed to render variances homogeneous. The
criterion for statistical significance was p < 0.05.
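The effect of a variance-stabilizing transformation can be illustrated with a short Python sketch. It compares the ratio of sample variances between two groups before and after a log transform. The two groups are hypothetical values chosen so that spread grows in proportion to the mean; the helper names are illustrative and do not reproduce the authors' statistical software.

```python
import math
import statistics


def variance_ratio(group_a, group_b):
    """Ratio of the larger to the smaller sample variance (1.0 means equal)."""
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    return max(var_a, var_b) / min(var_a, var_b)


def log10_transform(values):
    """Log-transform strictly positive measurements."""
    return [math.log10(v) for v in values]


# Hypothetical groups whose spread scales with their mean.
low = [1.0, 2.0, 4.0]
high = [10.0, 20.0, 40.0]

raw_ratio = variance_ratio(low, high)
log_ratio = variance_ratio(log10_transform(low), log10_transform(high))
```

On the raw scale the variances differ by a factor of about 100; after the log transform they are essentially equal, which is the condition ANOVA assumes.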
                            RESULTS

Characterization of CAPs
   The average CAPs concentration during the 13-day exposure
period was 493 ± 391 µg/m³ (Table 1). As shown in Table 1,
over half of the CAPs collected in Grand Rapids were organic
and elemental carbon. Although the 13-day averaged concentration
of sulfate was only about 10% of the CAPs, during the
first week of the exposure study a large amount of secondary
particles composed mostly of sulfate and organic carbon was
observed (Fig. 2). The Hybrid Single-Particle Lagrangian
Integrated Trajectory model (HYSPLIT, National Oceanic
and Atmospheric Administration) indicated regional transport
of an air mass that passed through Missouri, Illinois, and
northwestern Indiana (Fig. 2). Cities in northwest Indiana,
including Gary and East Chicago, have been home to heavy
industry for the last century.

                               TABLE 1
      Average Mass and Major Composition of Concentrated Air
               Particles during 13-Day Exposure Period

   Component             Concentration
   Mass                  493 ± 391
   Organic carbon        244 ± 144
   Elemental carbon       10 ± 4
   Sulfate                79 ± 131
   Nitrate                39 ± 67
   Ammonium               39 ± 59
   Urban dust^a           18 ± 6

     Note. All values are presented as mean ± SD (µg/m³).
     ^a Crustal urban dust was estimated from Fe, Al, Ca, and Si.

   Differentially  Expressed Genes
     Criteria for the pairwise analyses were FDR < 0.05 and fold
   change >  1.5 for expression over background. The centroid
   plot in Figure 3A is a visual representation of  the relative
     [Figure 2 graphic not reproduced; the time axis runs daily from 7/17 to 7/29.]
     FIG. 2.  Air mass contributing to the CAPs used for exposure. The
   48-h backward trajectories generated by the HYSPLIT model (National
   Oceanic and Atmospheric Administration) show the history of an air mass
   arriving in Grand Rapids at noon on July 21. (Data are from National Oceanic
   and Atmospheric Administration.)
   [Figure 3 graphics not reproduced. (A) Centroid plot with one column per treatment group (Air-Ova, Air-Saline, CAPs-Ova, CAPs-Saline) and one row per gene symbol. (B) Venn diagram of the three comparisons: air-saline vs. air-ova, air-saline vs. CAPs-ova, and CAPs-saline vs. CAPs-ova.]

  FIG. 3.  Probesets with an absolute fold change > 1.5 and an FDR < 0.05 in the three permutation analyses for differential expression comparisons were
considered significant (total of 87 differentially expressed probesets). (A) Centroid plot of significant genes. The bars represent expression values relative to the
centroid value for each gene considering all groups. Bars extending to the right of the vertical line in each treatment group represent elevated expression of that
gene; bars extending to the left represent lower expression of that gene. Genes are plotted in descending order based on expression by the CAPs-ova treatment
group. Red = air-ova, green = air-saline, dark blue = CAPs-ova, light blue = CAPs-saline. (B) Venn diagram of all 87 significant probesets.
class-wise expression of genes that met these criteria, not including
probesets of limited or ambiguous annotation (for the full list
of genes see Supplemental Table S1). The shrunken centroids
were calculated using the method described by Tibshirani
et al. (2002). It is clear in this centroid plot that the relative
gene expression patterns for the CAPs-ova treatment group
were opposite to those of the air-saline and CAPs-saline treatment
groups and more like those seen in the air-ova treatment group.
Figure 3B shows a Venn diagram indicating the amount of
overlap between the significant probesets from pairwise
comparisons. The first pairwise comparison studied
was the air-exposed, saline-challenged rats compared with the
air-exposed, ova-challenged rats (air-saline vs. air-ova). The 28
differentially expressed probesets in this comparison
were all upregulated; the greatest gene upregulation was 3.39-fold
for immunoglobulin heavy chain gamma 2a (IgG-2a). The
expression of 39 probesets was significantly changed in
the CAPs-exposed, saline-challenged rats compared with the
CAPs-exposed, ova-challenged rats (CAPs-saline vs. CAPs-ova).
The greatest increase in expression was 4.95-fold for
stearoyl-coenzyme A desaturase 1 (Scd1); the greatest
decrease was -4.23-fold for complement component 5
(C5). A total of 32 probesets were differentially expressed in
the air-saline rats compared with the CAPs-exposed,
ova-challenged rats (air-saline vs. CAPs-ova). The greatest
upregulation was 3.42-fold for IgG-2a, and the greatest
downregulation was 2.49-fold for Mt1a. No significant differentially
expressed probesets were found by comparing the air-saline
and air-ova groups to the CAPs-saline and CAPs-ova groups,
respectively (data not shown).
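The two-part selection rule (FDR below 0.05 and absolute fold change above 1.5) can be sketched in a few lines of Python. The `Probeset` record and the function name are hypothetical. The fold changes for Scd1 and C5 are those quoted above and the PLAU values are those quoted in the network analysis, but the FDR values assigned to Scd1 and C5 and the "GeneX" row are invented placeholders, and rows from different comparisons are mixed purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Probeset:
    symbol: str
    fold_change: float  # signed; negative means downregulated
    fdr: float


def is_significant(p, fdr_cutoff=0.05, fc_cutoff=1.5):
    """Apply both criteria: FDR below cutoff and |fold change| above cutoff."""
    return p.fdr < fdr_cutoff and abs(p.fold_change) > fc_cutoff


probesets = [
    Probeset("Scd1", 4.95, 0.01),    # FDR is a placeholder
    Probeset("C5", -4.23, 0.02),     # FDR is a placeholder
    Probeset("PLAU", -1.3, 0.146),   # values quoted in the network analysis
    Probeset("GeneX", 1.2, 0.001),   # hypothetical: low FDR, small change
]

selected = [p.symbol for p in probesets if is_significant(p)]
```

Using the absolute value keeps strongly downregulated probesets such as C5, while a probeset with a very low FDR but a small change is still excluded.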
  The relative upregulation/downregulation of genes was
confirmed using qPCR (Fig. 4). Specifically, the PADE
analysis indicated that TGF-β3 is expressed 1.6- and 2.2-fold
higher in CAPs-ova versus air-saline and versus CAPs-saline
exposed rodents, respectively; Fcer1a was expressed 2.1-fold
higher in CAPs-ova versus air-saline rodents; C4BP was
expressed 2.8-fold lower in CAPs-ova versus CAPs-saline
rodents; VEGF-C was expressed 1.6-fold higher in CAPs-ova
  [Figure 4 bar chart not reproduced. Title: qPCR Confirmation of Gene Expression Changes. x-axis: Gene Symbol (TGFB3, FCER1, C4BP, VEGF-C, CHI3L1, Mt1a).]
  FIG. 4.   Gene expression by confirmatory qPCR. Values are the average relative
concentration of each target gene (run in duplicate) divided by the
corresponding β-actin endogenous control per rat, then averaged across all rats
of the same treatment group. Error bars show the standard error of the mean.
The relative expression of each gene by treatment group corresponds with
microarray data outputs (compare to Supplemental Table 1). *Significantly
different from air-saline treatment (p < 0.05); +significantly different from
CAPs-saline treatment (p < 0.05); there were no significant differences between
the air-ova and CAPs-ova groups.
versus air-saline rodents; CHI3L1 was expressed 3.5-fold lower
in CAPs-ova versus CAPs-saline rodents; and Mt1a was
expressed 3.3-fold lower in CAPs-ova versus CAPs-saline
rodents and 2.5-fold lower in air-ova versus air-saline rodents.
Owing to the small sample size, the changes for three of the six genes
were not statistically significant, illustrating the danger in
relying upon measurements of single genes in studies like this.
Microarray studies allow the simultaneous consideration of
groups of functionally related genes, which allows smaller changes
to be detected (as discussed below).
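The normalization described in the Figure 4 legend (target signal divided by the β-actin control for each rat, then averaged within a treatment group with the SEM as the error bar) can be sketched as follows. The function names and the duplicate readings are hypothetical and serve only to make the arithmetic concrete; they are not data from the study.

```python
import math
import statistics


def normalized_expression(target_replicates, actin_replicates):
    """Mean target signal divided by the mean beta-actin signal for one rat."""
    return statistics.mean(target_replicates) / statistics.mean(actin_replicates)


def group_mean_sem(per_rat_values):
    """Group mean and standard error of the mean across rats."""
    mean = statistics.mean(per_rat_values)
    sem = statistics.stdev(per_rat_values) / math.sqrt(len(per_rat_values))
    return mean, sem


# Hypothetical duplicate readings (target, beta-actin) for three rats in one group.
rats = [
    ([2.0, 4.0], [1.5, 1.5]),
    ([1.0, 1.0], [1.0, 1.0]),
    ([4.5, 4.5], [1.5, 1.5]),
]
per_rat = [normalized_expression(t, a) for t, a in rats]
group_mean, group_sem = group_mean_sem(per_rat)
```

A fold change between two treatment groups would then be the ratio of their group means. With only a few rats per group the SEM is wide, which is consistent with the observation that half of the tested genes did not reach significance.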

Principal Components and Cluster Analysis
   PCA by treatment conditions was performed on array data
using the list of 87 differentially expressed probesets described
above (Fig. 5A). The saline-challenged animals were separated
from the ova-challenged animals by the first principal component,
which accounts for 55.92% of the variance between
samples. The second principal component (accounting for
15.21% of variance) separated, with one exception, the ova-challenged
rats exposed to laboratory air from those exposed to CAPs.
The air-saline and the CAPs-saline groups
were indistinguishable in the first two principal components.
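How a first principal component separates two groups of samples, and how its share of the variance is obtained, can be illustrated with a small standard-library Python sketch. It uses power iteration on the sample covariance matrix; this is not the software used in the study, the function name is hypothetical, and the toy matrix (two saline-like and two ova-like samples measured on three probesets) is invented.

```python
import math


def first_principal_component(rows, iterations=500):
    """Return (loadings, scores, explained_fraction) for the first PC.

    rows: list of samples, each a list of feature values.
    Power iteration is a simple sketch; it assumes the starting vector
    is not orthogonal to the leading eigenvector.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    total_variance = sum(cov[j][j] for j in range(p))
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores, eigenvalue / total_variance


# Hypothetical expression values: rows 0-1 saline-like, rows 2-3 ova-like.
toy = [
    [1.0, 1.1, 5.0],
    [1.2, 0.9, 5.2],
    [3.0, 3.1, 2.0],
    [3.2, 2.9, 2.1],
]
loadings, scores, explained = first_principal_component(toy)
```

In this toy case the two saline-like samples receive scores of one sign and the two ova-like samples scores of the opposite sign, mirroring the kind of separation reported for component 1.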
   Hierarchical clustering of all samples by condition, based on
the list of 87 differentially expressed probesets, separated  the
animals   into groups  of  either  saline-challenged  or  ova-
challenged rodents (Fig.  5B). Individual probesets  were nor-
malized  to  the  median  value  for coloring,  and  clustering
indicated two main branches. The bottom branch showed lower
levels in six of the eight ova-challenged samples versus higher
   [Figure 5 graphics not reproduced. (A) PCA scatter plot; x-axis: PCA component 1 (55.9% variance); legend: Air-ova, CAPs-ova, Air-saline, CAPs-saline. (B) Hierarchical clustering heat map.]
     FIG. 5.  Principal components analysis and hierarchical clustering of
   differentially expressed probesets. The list of 87 differentially expressed
   probesets was used for the analyses. (A) Two-dimensional principal
   components analysis by conditions of treatment. x-axis: component 1,
   accounting for 55.92% of variance; y-axis: component 2, accounting for
   15.21% of variance. Component 1 strongly separates the ova-challenged from
   the saline-challenged rats. (B) Hierarchical clustering of treatment conditions
   and genes. Expression levels were derived from REDI-adjusted and RMA-normalized
   signal values, which were then normalized to the per-gene median
   in order to achieve the color scale (red indicates a high level of expression,
   green indicates a low level). Clustering shows that the rats fall into two main
   groups: ova-challenged and saline-challenged. The blue bracket indicates
   probesets in a subset of the top branch that are highly expressed in all of the
   CAPs-ova exposed rats and in one rat each from the air-ova and the
   CAPs-saline exposure groups.
   expression levels in all but one of the saline-challenged
   animals. The top branch was the reverse: it showed mostly
   higher expression in the ova-exposed animals and lower
   expression values in all but one saline-exposed animal. One
   gene, Scd1, is shown between these two groups.
Functional Analysis
  Functional analysis to screen the differentially expressed
genes in the three group comparisons was performed in
MetaCore using a larger number of genes (air-saline vs. air-ova =
75, CAPs-saline vs. CAPs-ova = 93, and air-saline vs. CAPs-ova
= 200) obtained via a relaxed FDR cutoff of < 0.2
(Supplemental Table S2). The most significant MetaCore maps
(corresponding to known biological pathways curated by
GeneGo) of the air-saline versus air-ova comparison were split
into two main groups: nucleotide metabolism (maps 1-2, p =
1.279 × 10⁻² and 2.445 × 10⁻², respectively) and immune
response/histamine metabolism (maps 3, 4, 6, 8; p value
range = 3.227 × 10⁻² to 4.501 × 10⁻²) (data not shown).
Maps 1 and 5 for the CAPs-saline vs. CAPs-ova comparison
were related to cell adhesion and extracellular matrix (ECM)
remodeling (p = 2.726 × 10⁻⁴ and 8.206 × 10⁻³, respectively),
whereas maps 2-4 were related to immune response via
complement pathways (p value range = 8.324 × 10⁻⁴ to
6.367 × 10⁻³) (data not shown). The top 10 maps for the
air-saline versus CAPs-ova comparison also broke into two main
groups. Maps 1-3 were related to immune response via
complement pathways (p value range = 3.229 × 10⁻⁵ to 9.242 ×
10⁻⁵), and maps 4-9 related to remodeling and cell adhesion
(p value range = 1.841 × 10⁻⁴ to 3.405 × 10⁻³) (data not
shown). GO processes for this comparison broke into similar
groups, with the first and second processes relating to lung
development (p = 4.0489 × 10⁻⁷ and 4.5838 × 10⁻⁷) and
processes 3-7 relating to the inflammatory response (p value
range = 6.0503 × 10⁻⁷ to 2.2709 × 10⁻⁵) (data not shown).
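Enrichment p values of this kind are commonly obtained from an upper-tail hypergeometric test, which asks how likely it is to see at least the observed number of category members in a gene list drawn from a larger background. The Python sketch below shows that calculation together with the -log10 scaling used for bar charts and a simple Bonferroni adjustment. That MetaCore computes its p values in exactly this form is an assumption; the function names and the small example are hypothetical.

```python
from math import comb, log10


def enrichment_p(population, in_category, drawn, hits):
    """Upper-tail hypergeometric probability P(X >= hits).

    population: genes in the background; in_category: background genes in
    the category; drawn: genes in the list; hits: list genes in the category.
    """
    most = min(in_category, drawn)
    tail = sum(
        comb(in_category, i) * comb(population - in_category, drawn - i)
        for i in range(hits, most + 1)
    )
    return tail / comb(population, drawn)


def neg_log10(p):
    """Scale a p value so that smaller p gives a longer bar."""
    return -log10(p)


def bonferroni(p, n_tests):
    """Conservative multiple-testing adjustment, capped at 1."""
    return min(1.0, p * n_tests)


# Hypothetical toy case: 10 background genes, 4 in the category,
# 3 genes in the list, 2 of them in the category.
p_toy = enrichment_p(10, 4, 3, 2)
```

Because many maps and GO processes are tested at once, an unadjusted p value near 10⁻² can lose significance after an adjustment such as Bonferroni, whereas values near 10⁻⁷ are much more robust.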
  The most statistically significant gene ontology processes
across all gene lists are shown in Figure 6. The top three relate to
tissue development and the fourth to the defense response (Fig.
6). It is of note that the top four categories were much more
significant in the presence of CAPs than with ova treatment
alone. Air-saline versus CAPs-ova had the most significant
relationship to lung/respiratory tube development and defense
response, indicating both inflammatory responses and
remodeling in response to this treatment. The CAPs-saline
versus CAPs-ova comparison had the most significant relationship
with skeletal development, and the differentially expressed
genes in the skeletal development category were predominantly
regulated via TGF-β. This, along with the significant
enrichment for the "regulation of TGF-β receptor signaling"
category (Fig. 6), suggests that changes in TGF-β signaling are
particularly significant when comparing these two groups. The
expression changes for TGF-β3 (Fig. 3) may explain this effect
in that transcription is reduced by CAPs exposure in the
absence of ova challenge but not in the presence of ovalbumin.
  [Figure 6 bar chart not reproduced. x-axis: -log(pValue). Processes on the y-axis:
     1. skeletal development
     2. lung development
     3. respiratory tube development
     4. defense response
     5. embryonic neurocranium morphogenesis
     6. epithelial to mesenchymal transition
     7. regulation of transforming growth factor beta receptor signaling pathway
     8. negative regulation of phagocytosis
     9. negative regulation of release of sequestered calcium ion into cytosol
    10. defense response to fungus, incompatible interaction
   Legend: one bar each for air-saline vs. air-ova, CAPs-saline vs. CAPs-ova, and air-saline vs. CAPs-ova.]

  FIG. 6.  Overrepresented gene ontology categories from MetaCore for all three treatment comparisons. The y-axis shows the top 10 GO processes sorted by the
lowest p value for any single treatment comparison. Longer bars correspond to lower p values for enrichment of genes in a GO category within the differentially
expressed genes in each treatment contrast, with p values ranging from 10⁻³ to 10⁻¹⁵. Categories 1 and 7 were most significantly enriched in the CAPs-saline
versus CAPs-ova treatment. Categories 2-4, 6, and 8-10 were most significantly enriched in the air-saline versus CAPs-ova treatment. Category 5 was most
significantly enriched in the air-saline versus air-ova treatment. Top: air-saline versus air-ova; middle: CAPs-saline versus CAPs-ova; bottom: air-saline versus
CAPs-ova. p Values reported by MetaCore are not corrected for multiple testing.
The only genes that overlap between the skeletal and
lung/respiratory tube development processes and that are found in these
two gene lists were TGF-β1 and TGF-β3; however, several of
the other genes from the skeletal development process (Fbn1,
Col1a1, Col1a2, and Cdh11) could be involved in
lung/respiratory tube development as well. p Values reported by
MetaCore are not corrected for multiple testing.
Network Analysis
  The CAPs-ova synergy was investigated further with
pathways analysis using Ingenuity IPA. The highest-scoring
network generated for air-saline versus CAPs-ova had top
functions of cellular movement, immune response, and cell-to-cell
signaling and interaction (Fig. 7) and matched the results seen
with the MetaCore analysis (Fig. 6). The genes in this network
can be categorized as related to remodeling (orange
circle), inflammation (purple circle), or both. Overlaying data
relating only to air-saline versus air-ova onto this network
shows three genes (Fcer1a, Ms4a2, Bcl6b) related to
inflammation (Albrecht et al., 2004; Galli et al., 2008), two genes
(cadherin 11 [Cdh11] and TGF-β3) that fall under both
inflammation and remodeling (Broide, 2008), and one gene
(elastin [Eln]) that is consistent with remodeling only (Shi
et al., 2007). Overlaying data relating only to the air-saline
versus CAPs-ova comparison shows a dramatic increase in the
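  [Figure 7 network graphic not reproduced; circles mark the Remodeling and Inflammation gene sets. Inset table, number of genes in each class:]

                 Remodeling   Both   Inflammation
    Air+ova           1         2          3
    CAPs+ova          9        13         10
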
  FIG. 7.  Gene network derived from PADE data for the air-saline versus CAPs-ova comparison, uploaded into Ingenuity IPA. Cutoffs were set at > 1.5-fold
for fold change and < 0.2 for FDR. This network represents the highest-scoring network (score = 61), with genes involved in either inflammatory (purple circle) or
remodeling (orange circle) mechanisms. For nodes, red = upregulated expression and green = downregulated expression in the CAPs-ova versus air-saline
comparison. Bars adjacent to each node represent statistically significant expression (upregulated or downregulated) for air-saline versus air-ova (left) and air-saline
versus CAPs-ova (right). The table (inset) shows the number of genes falling in each class, showing the enhancement of effects by coexposure to CAPs. Note that three
nodes (Sod, Vegf, G-protein beta) in the network correspond to gene families rather than genes and were not counted in the totals (see Supplemental Table 4).
number of the genes falling into inflammation, remodeling, or
both (Fig. 7, inset).
   The genes in the remodeling category that do not overlap with
inflammation are predominantly upregulated, indicating an
increase in pathways that impact remodeling. Expression levels
of the remaining genes in the remodeling category and those in
the inflammation category are mixed. For example, the IgE
receptors FCER1A and MS4A2/FCER1B and the cytokine IL1A
are upregulated as expected, whereas genes (Mmp12 and
CCL5/RANTES) that have been shown to be upregulated in asthma are
downregulated in this study. This is possibly due to mechanisms
such as feedback inhibitory signaling at the level of transcription
and may not reflect a decrease in the associated protein. The
inflammatory and remodeling genes PLAU, PLAUR, and Mmp9
were found in a screen for genes related to COPD (Wang et al.,
2007), and both PLAU and MMP9 have been implicated in
asthma (Begin et al., 2007; Sampsonas et al., 2007). Thus, these
three genes were included in the network even though their fold
changes and/or FDRs did not meet cutoff values. PLAU came
very close to significance, with an FDR of 0.146 and a fold
  [Figure 8 graphics not reproduced. Treatment groups on each x-axis: Air-saline, CAPs-saline, Air-ova, CAPs-ova.
   (A) Bar charts including Muc5AC and Total Protein.
   (B) BALF Cell Summary; legend: Eosinophils, Neutrophils, Lymphocytes, Macrophages & Monocytes, Total Cells.
   (C) Gene Expression Function by Treatment; legend: Remodeling, Inflammation.
   (D) Airway Morphometric Determinations of Volume Density; legend: Proximal (Generation 5), Distal (Generation 11).]

  FIG. 8.  Effect of ova sensitization and challenge (with and without coexposure to CAPs) on biochemical and cellular parameters in the BALF and gene
expression and airway morphometry in lung tissue. (A) IgE, mucin 5AC, and total protein measured in BALF; (B) BALF cell differential; (C) number of genes
significantly changed in each treatment group (n = 4) by functional category. Data represent the group mean (n = 8) and SEM in each treatment group for (A, B,
D). *Significantly different from air-saline group (p < 0.05); +significantly different from air-ova group (p < 0.05). (D) Morphometric determinations of the
volume density of AB/PAS-stained IMs lining the axial airways of the left lobe (proximal airway generation 5 and distal airway generation 11). ^Significantly
different from CAPs-saline group (p < 0.05). Volume density (Vs; nl/mm²).
change of -1.3 in the air-saline versus CAPs-ova comparison. Overall, the
network analysis shows a perturbation of inflammatory genes in
rodents challenged with allergen. Moreover, genes involved in
the remodeling of lung tissue are altered by CAPs exposure only
in the presence of allergen challenge.

BALF
   Figure 8 contains the results of the biochemical and cellular
analyses of the BALF and the gene expression and lung morphometric
determinations from each of the exposure groups
(Supplemental Table S3 contains the BALF data in table form).
Exposure to CAPs for 13 consecutive days produced no changes
in BALF proteins or cellularity in saline-challenged rats.
Sensitization and challenge with ova caused increases in total
cells (fourfold), neutrophils (15-fold), lymphocytes (16-fold),
eosinophils (fourfold) (Fig. 8B), and mucin glycoprotein
(Muc5AC) (threefold) (Fig. 8A). CAPs exposure did not affect
ova-induced increases in BALF cellularity but enhanced the
ova-induced increase in mucin glycoprotein by 50% (Fig. 8A).
Furthermore, only ova-challenged rats exposed to CAPs had
significant elevations in total BALF protein. CAPs exposure also
enhanced ova-induced increases in ova-specific IgE in BALF
(Fig. 8A). The gene expression changes reflected this same trend
(Fig. 8C), with the number of inflammatory genes increasing
from 5 (ova alone) to 23 (CAPs plus ova) and the number of
remodeling genes increasing from 3 (ova alone) to 22 (CAPs
plus ova). Total IgE in BALF was not significantly increased by
CAPs exposure or ova challenge. The small sample size for this
study did not provide the power required to explicitly test for
correlation, but the cellular and protein changes are consistent
with the gene expression results (Figs. 7 and 8C), with the caveat
that the cellular changes are less sensitive with regard to the
enhancement of inflammation and remodeling by CAPs.
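The gene counts quoted above agree with the Figure 7 inset if genes assigned to both categories are counted once under inflammation and once under remodeling. The short Python sketch below makes that bookkeeping explicit. The inset values are as read from the figure (remodeling only, both, inflammation only); the dictionary keys and function name are illustrative.

```python
# Class counts as read from the Figure 7 inset.
inset = {
    "ova alone": {"remodeling_only": 1, "both": 2, "inflammation_only": 3},
    "CAPs plus ova": {"remodeling_only": 9, "both": 13, "inflammation_only": 10},
}


def category_totals(row):
    """Genes in the 'both' class count toward each category total."""
    return {
        "inflammation": row["inflammation_only"] + row["both"],
        "remodeling": row["remodeling_only"] + row["both"],
    }


totals = {treatment: category_totals(row) for treatment, row in inset.items()}
```

Under this reading the totals are 5 and 3 for ova alone and 23 and 22 for CAPs plus ova, matching the inflammatory and remodeling counts reported in the text.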
   The number of eosinophils in the BALF from our ova-sensitized
and saline-challenged BN rats was higher than
routinely seen in other strains/stocks of rats (e.g., F344,
Sprague-Dawley) but not uncommon for the BN strain of rat
and well below previously published reports in non-infected
BN rats that were similarly ova-sensitized and
saline-challenged (Noritake et al., 2007). The levels of both
eosinophils and neutrophils in the BALF of our
ova-challenged rats were markedly and significantly elevated
compared with our saline-challenged controls, indicative of
an allergen-induced hypersensitivity response.

Pulmonary Pathology/Morphometry
   Light microscopic examination of the selected lung sections
from all the rats in this  study, substantiated and correlated well
with the analytical observations of the BALF  from these same
rats. Furthermore, significant morphometric differences among
the groups, described below, correlated with our findings in the
BALF and histopathology.  The  lungs of control rats  (ova-
sensitized and saline-challenged  rats) had  a minimal influx of
eosinophils widely scattered along with a few  mononuclear
  leukocytes (e.g., lymphocytes, plasma cells) in the interstitium
  surrounding some of the bronchiolar airways. In addition, there
  were occasional  small foci of eosinophils and mononuclear
  cells (macrophages,  monocytes,  and  lymphocytes)  in  the
  alveolar parenchyma  of these control rats. These background
  findings are  consistent  with previous  reports on  the  normal
  pulmonary morphology of pathogen-free  BN  rats (Noritake
  et al., 2007). In contrast,  the lungs of BN rats sensitized  and
  challenged with ova had conspicuous allergic bronchiolitis  and
  alveolitis (allergic bronchopneumonia). There was a noticeable
  proximal to distal decrease in the severity of the bronchiolitis in
  the  left  lung lobe  of these rodents  with more inflammatory
  lesions in  the proximal lung section (G5  axial airway level;
  closest to the hilus of the lung lobe) compared with the more
  distal section (G11 axial airway level). Ova-induced inflammatory
  and epithelial lesions in the conducting airways involved
  the  large  diameter,  proximal  axial  airways  and  the small
  diameter, distal pre-terminal and terminal airways.  Inflamma-
  tory and epithelial lesions were usually more severe in the more
  proximal axial bronchioles compared with those in the more
  distal  pre-terminal and  terminal bronchioles. Ova-induced
  bronchiolitis  was  characterized by peribronchiolar  edema
  associated with a mixed inflammatory cell influx of eosinophils,
  lymphocytes, plasma  cells, and occasional neutrophils (Fig. 9).
     Bronchiole-associated lymphoid tissues in the air-ova group
  airways  were  also enlarged  due  to  lymphoid hyperplasia
  relative to the air-saline group. Perivascular interstitial
  accumulation of a similar mixture of eosinophils and mononuclear
  cells, along with perivascular edema, was also present
  in the lungs of air-ova rats (i.e., surrounding pulmonary  arteries
  adjacent  to  bronchioles  and  pulmonary  veins  scattered
  throughout the alveolar parenchyma).
     Air-ova rats had mild to marked epithelial hypertrophy and
  mucous cell metaplasia/hyperplasia (MCM) with increased
  amounts of AB/PAS-stained mucosubstances in the mucous cells
  within the surface epithelium (i.e., IMs) lining the affected large
  diameter bronchioles, including  the proximal and distal axial
  airways (Fig. 9). Saline-challenged rats (the air-saline group) had
  mucous cells with less IM compared with the ova-challenged (air-
  ova) rats (Fig.  10). There was no significant difference in the
  amounts of IM between air-saline and CAPs-saline rats (Fig. 8D).
     In addition to  the perivascular and peribronchiolar lesions,
  there were varying  sized focal areas of allergic alveolitis in the
  lung parenchyma of  the  air-ova rats.  These alveolar  lesions
  were characterized by  accumulations  of large numbers  of
  alveolar macrophages, epithelioid cells, and eosinophils, with
  lesser numbers of lymphocytes, monocytes, and plasma cells,
  in the alveolar airspace. Often the alveolar septa in these areas
  of  alveolitis were thickened  due  to type  II  pneumocyte
  hyperplasia and  hypertrophy, intracapillary accumulation of
  inflammatory cells, and capillary congestion.
     Ova-challenged rats exposed to CAPs (the CAPs-ova group)
  had a more severe allergic bronchopneumonia than the  air-ova
  group. This was reflected both in the severity and distribution of
the allergic bronchiolitis and alveolitis. Air-ova rats had a mild
to moderate allergic bronchopneumonia with the inflammatory
and epithelial lesions in approximately one fourth to one third of
the lung lobe. In contrast, CAPs-ova rats had a moderate to
marked bronchopneumonia with lesions in approximately one
half or more of the lung lobe. CAPs-ova exposed rats also had
more severe MCM in the epithelium lining the large diameter
axial airways compared with the air-ova group.

  FIG. 9.  Light photomicrographs of the respiratory epithelium (e) lining the proximal axial airway (generation 5) in the left lung lobe of rats exposed to (A) air/
saline (control), (B) CAPs/saline, (C) air/ova, or (D) CAPs/ova. Compared with the normal airway in the control rats (A), significant morphologic changes are
present only in rats challenged with ova (C, D). The most prominent histologic changes due to ova challenge included a thickened, hypertrophic, respiratory
epithelium with increased numbers of mucous goblet cells, and a mixed inflammatory cell infiltrate (asterisks) consisting mainly of lymphocytes, plasma cells and
widely scattered eosinophils (arrows) in the interstitium (i) of the airway wall. These histologic airway changes are slightly greater in the CAPs/ova rats
(D) compared with those in the air/ova rats (C). All tissues stained with H&E. Sm, smooth muscle.

   Thirteen consecutive days of CAPs exposure in saline-challenged
rats did not cause changes in the morphometrically
measured amounts  of stored mucus in airway epithelium in
either proximal or distal airways (Fig. 8D). The only significant
increases in IMs were found in CAPs-ova rats  (Fig. 8D). This
CAPs-induced enhancement of ova-induced IM correlated with
the increased amounts of muc5AC  in BALF from these animals
(Fig. 8A). There was a trend (though not statistically significant)
for a CAPs-related exacerbation of ova-induced increases in the
stored intraepithelial AB/PAS-stained mucosubstances in both
the proximal and  distal axial  airways of the left lung lobe
(quantitative estimate of the severity of MCM).
                       DISCUSSION

  Asthma is  a  complex  airway  disease  involving  gene-
environment interactions. This study used comparative micro-
      array analysis to investigate the effect that CAPs have on a BN
      rodent model of airway allergic inflammation. An enhanced
      effect on the type and magnitude of gene  expression changes
      was observed with  ova-challenged BN rats  coexposed  with
      CAPs. The genomic data  were consistent with the observed
      histopathology.  This combination  of allergen challenge  and
      CAPs exposure led to differential expression of genes related to
      airway remodeling that  was not observed in rodents  treated
      with either allergen or CAPs alone.
        In agreement with the data presented here, air particulates
      have  been  shown to have an adjuvant effect  on  allergic
      reactions  as determined by changes  in  histopathology.  For
      example,  BN rats  exposed to  both CAPs and ova  showed
      adjuvant effects  which resulted in increased airway mucus,
      mucin glycoprotein muc5AC, pulmonary inflammation, and
      airway epithelial remodeling  (e.g., MCM) (Harkema et al.,
      2004a). Other studies point to  an adjuvant  effect of diesel
      particulates  on  ova-challenged rodents  which  resulted  in
      enhanced airway hyperresponsiveness and pulmonary inflam-
      mation (Dong et al., 2005; Matsumoto et al., 2006; Miyabara
      et al., 1998;  Takano  et al., 1997). At the  molecular level,
      exposure to diesel particulates in ragweed-sensitized humans
      resulted in an increase in mRNA transcripts of asthma-related
      cytokines (Diaz-Sanchez et al.,  1997).
  CAPs exposure alone had no significant effect on gene
expression in the lung in this study. Exposure of ApoE/LDL double
knockout mice to inhaled CAPs for a longer period also
resulted in no significant changes in gene expression in lung
tissue (Gunnison and Chen, 2005). Another study in which
mice were intranasally instilled with CAPs showed increased
gene expression in several cytokines and an increase in
polymorphonuclear cells in BALF (Sigaud et al., 2007);
however, the authors note that the high dose of particles used for
instillation was useful for proof of principle but not applicable
to realistic exposures. A recent toxicogenomics study found
differential gene expression upon inhalation exposure of
BALB/c mice to DE (Stevens et al., 2008) in the presence or
absence of ova. The genes identified in that study were involved
in immune function, cell signaling, and metabolic and oxidative
stress response. Taken together, these findings suggest that the
response is dependent on the source of air pollutants, individual
susceptibility, and possibly route and duration of exposure.

  FIG. 10.  Light photomicrographs of AB/PAS-stained mucosubstances (arrows; dark magenta stain) in mucous goblet cells of the respiratory epithelium (e)
lining the proximal axial airway (generation 5) in the left lung lobe of rats exposed to (A) air/saline (control), (B) CAPs/saline, (C) air/ova, or (D) CAPs/ova.
Compared with the normal airway in the control rat (A), significant increases in the amount of IMs are present only in rats challenged with ova (C, D). This ova-
induced intraepithelial change is slightly greater in the CAPs/ova rats (D) compared with that in the air/ova rats (C). All tissues stained with AB (pH 2.5)/PAS
sequence to detect acidic and neutral mucosubstances. Sm, smooth muscle; I, interstitial layer of the airway wall.

  Ova  challenge  alone resulted in the  largest  changes in
magnitude of gene expression, albeit for a limited number of
genes, in the present study (Fig. 5B). Two  prominent genes
upregulated by ova challenge were Fcer1a and Ms4a2/Fcer1b,
receptors for IgE that factor into allergy-mediated events. In the
present study and in previous studies, cytological and
immunohistochemical methods for assessing ova-induced
allergic  airway inflammation in BN rats showed responses
common to asthma such as increased levels of airway mucous
glycoproteins, MCM  in  bronchiolar  epithelium,  increased
numbers of eosinophils, neutrophils,  and lymphocytes  in both
lung tissue and BALF, and increased IgE in the serum (Salmon
et al.,  1999;  Tarayre et al., 1992; Underwood et al., 1995).
Gene expression studies  in ova-challenged mice have impli-
cated a variety of genes that  relate  to inflammation, airway
hyperresponsiveness, atopy, or  mucus  production  and may
impact asthma pathogenesis (Follettie et al., 2006; Izuhara and
Saito, 2006; Kuperman et al., 2005; Walker et al., 2006). Our
study is consistent with these previous reports in that they all
implicate genes involved in inflammation.
  In contrast to the results  seen with ova alone, the effects
specifically due  to CAPs/ova coexposure were more  subtle
though much more widespread with regard to number of genes
affected. In the current study, CAPs exposure in the ova-challenged
rats led to expression changes in remodeling genes,
and a larger number of genes related to inflammation compared
with  ova challenge  alone.  Inflammation  and  remodeling
pathways centered predominantly on TGF-β1 (Fig. 7). Two
other growth factors, TGF-β3 and VEGF-C, were also
identified  in the network. Members  of both  the TGF family
and the VEGF family are thought to be important for airway
remodeling and  inflammation (Lloyd and  Robinson, 2007).
There were 23 TGF-β-related inflammatory genes significantly
changed with CAPs-ova coexposure versus five with ova
exposure  alone  (Figs.  7  and 8C).  This corresponded  with
a significant increase in the amount of ova-specific IgE found
in the BALF from  CAPs-ova rats in our study when compared
with either the air-saline  or air-ova groups  (Fig. 8A). This is
further evidence for an enhancement of ova challenge by CAPs
via increased allergy mediators.
  Airway remodeling is  a hallmark of asthma  referring  to
changes in structural cells, including the thickening of airway
walls (Lloyd  and  Robinson, 2007). The findings of both
bronchiolar and alveolar wall thickening in air-ova and CAPs-
ova rats in the present study (due to epithelial hypertrophy and
MCM)  are characteristic  histopathologic features of allergic
airway disease  that mimic those reported in asthmatic airways
of  humans. The  MCM  in the epithelium lining the  large
diameter axial airways was more severe in the CAPs-ova rats
compared with the air-ova rats,  consistent with the  gene
expression data implicating remodeling events in the CAPs-ova
exposure group. Significant increases in IM and  total BALF
protein and elevated muc5AC were apparent only in the CAPs-ova
group. Taken together, pulmonary histopathology (subjective
assessment of inflammatory and epithelial responses),
morphometric measurements (quantitative assessment of airway
epithelial  remodeling),  and  BALF analyses   (quantitative
assessment of  inflammatory  and  mucosecretory responses)
consistently demonstrated that CAPs exposure enhanced the
allergic airway disease. Typically, inflammation and remodeling
processes are viewed separately; however, there are emerging
data that suggest that smooth muscle changes can contribute
directly to proinflammatory changes that may perpetuate airway
inflammation and  the  development of airway  remodeling
(Broide, 2008). Collectively, our data support this  conclusion.
  In total, 22 TGF-β-related genes implicated in remodeling were
significantly up- or downregulated in response to  CAPs-ova
coexposure (Figs. 7 and 8C). Several of these are components  of
the ECM—fibrinogen, vitronectin, nidogen 1, fibrillin 1, elastin,
and matrix Gla protein, and three genes  are related to smooth
muscle and/or the cytoskeleton—calponin 1, alpha  2 actin, and
tropomyosin 3.  Altered ECM protein profiles in asthmatic  lungs
result in an increase in the proliferation rate of airway smooth
muscle  (ASM)  cells,  a feature of  remodeling that causes the
thickening of airway walls (Johnson et al., 2004). TGF-β1 can
increase the proliferation of smooth muscle cells and the deposition
of ECM proteins, whereas  the smooth muscle cells aid in the
downregulation of  ECM-degrading matrix  metalloproteinases
such as Mmp9 and Mmp12 (Parameswaran et al., 2006).
  Because inflammatory cells in the BALF were not
significantly different between the CAPs-ova and CAPs-saline
groups, the changes in gene expression are most likely due to
an intracellular transcriptional response from either the
inflammatory cells, the resident alveolar cells, or both. Our
results cannot distinguish whether the transcriptional changes
shown in Figure 7 represent signaling in a single cell type or
integrated responses across multiple cell types, but they do
provide clues to the signaling changes taking place in response
to ova and CAPs coexposure. TGF-β1 and TGF-β2 have
previously been causally linked to allergen-induced proliferation
of ASM cells and mucus-producing goblet cell hyperplasia
(Lloyd and Robinson, 2007; McMillan et al., 2005). TGF-β2
treatment of primary bronchial epithelial cells in culture
produced significant increases in MUC5AC mRNA and protein
levels as well as elevated mucin as measured by AB/PAS
staining (Chu et al., 2004). The results from the current study
show more pronounced changes in TGF-β3 transcript levels,
suggesting that TGF-β3 may also play a key role in lung
remodeling. This is supported by reports of delays in
pulmonary development in TGF-β3 knockout mice (Kaartinen
et al., 1995). This study supports previous findings of TGF-β
involvement in lung remodeling, implicates a third member of
this gene family in the process, provides clues as to the
downstream mediators of this process, and shows an
enhancement of these effects upon coexposure to CAPs.
        A concern from an air regulatory point of view is whether
      allergic  individuals are more sensitive to adverse effects from
      exposure to air PM than nonallergic individuals. Application of
      new genomic technologies together with more traditional
      toxicological approaches provides a means to begin to address
      links among inhalation exposures to fine particles, genetic and
      metabolic changes in  the lung  cells in response to these
      exposures,  and  aggravation of the symptoms of asthma. The
      integration of these approaches can add important new
      information to the knowledge base regarding specific asthma
      mechanisms at more environmentally relevant doses of PM2.5 to
      help ensure adequate protection against adverse health effects,
      particularly for susceptible individuals.
        Asthma is a disease that results from a variety of environmental
      factors acting on a background of genetic factors. There are likely
      subtypes of asthmatics with varying susceptibility to a wide array
      of environmental exposures. Using a rodent model of allergic
      airway disease,  this study explores the application of emerging
      toxicogenomic tools in conjunction with bronchoalveolar lavage,
      pulmonary pathology and morphometric analyses to investigate
      the multifactorial etiology of allergy-induced airway inflammation
      using ova-sensitized and challenged BN rats exposed to
      environmentally relevant CAPs. By using a sensitive model such
      as this, the mechanistic basis for the environmental influence can
      be  more easily characterized. Although the contribution of
      genetics remains uncharacterized, this study provides a frame-
      work for interpreting  human studies which incorporate  the
      genetic variability most relevant for risk assessment.
                      SUPPLEMENTARY DATA

        Supplementary data are available online at
      http://toxsci.oxfordjournals.org/.
                              FUNDING

        The research described in this paper was funded wholly by
      the    United   States   Environmental  Protection   Agency
      (EPD07069 to jh and ED-D-07-017 for Expression Analysis).
                     ACKNOWLEDGMENTS

   We would like to thank Drs Susan Hester, Ian Gilmour, and
Markey Johnson for their helpful comments and careful review
of this manuscript. We also acknowledge Expression Analysis,
Durham, NC, for chip hybridization and preliminary analyses.
This research was subjected to review by the National Health
and Environmental Effects Research Laboratory and approved
for  publication.  Approval does not signify that the contents
necessarily reflect the views and policies  of the Agency nor
does mention of trade names or commercial products constitute
endorsement or recommendation for use.
                          REFERENCES

Abadie, A., and Prouvost-Danon, A. (1980). Specific and total IgE responses to
  antigenic  stimuli in Brown-Norway,  Lewis  and Sprague-Dawley rats.
  Immunology 39, 561-569.
Albrecht, E. A., Chinnaiyan, A. M., Varambally, S., Kumar-Sinha, C., Barrette, T. R.,
  Sarma, J. V.,  and Ward, P.  A.  (2004).  C5a-induced  gene expression in
  human umbilical vein endothelial cells. Am. J. Pathol. 164, 849-859.
Becker, S., Soukup, J. M., and Gallagher, J.  E. (2002). Differential particulate
  air pollution induced oxidant stress in human granulocytes, monocytes and
  alveolar macrophages. Toxicol. In Vitro 16, 209-218.
Begin, P., Tremblay, K.,  Daley, D., Lemire, M.,  Claveau, S., Salesse,  C.,
  Kacel,  S.,  Montpetit,  A.,  Becker,  A., Chan-Yeung,  M., et al.  (2007).
  Association of urokinase-type plasminogen activator with asthma and atopy.
  Am. J. Respir.  Crit. Care Med. 175, 1109-1116.
Broide, D. H. (2008). Immunologic  and inflammatory mechanisms that drive
  asthma progression to remodeling. J. Allergy Clin. Immunol. 121, 560-570;
  quiz 571-572.
Busse, W. W., and Mitchell, H. (2007). Addressing  issues of asthma in inner-
  city children. J. Allergy Clin. Immunol. 119, 43-49.
Chu, H. W., Balzar, S., Seedorf, G. J., Westcott, J. Y., Trudeau, J. B., Silkoff, P.,
  and Wenzel, S.  E. (2004). Transforming growth factor-beta2 induces bronchial
  epithelial mucin expression in asthma. Am. J. Pathol. 165, 1097-1106.
Diaz-Sanchez,  D., Tsien, A., Fleming, J., and  Saxon, A. (1997). Combined
  diesel exhaust particulate and ragweed allergen challenge markedly enhances
  human in vivo  nasal ragweed-specific IgE  and skews cytokine production to
  a T helper cell 2-type pattern. J. Immunol. 158, 2406-2413.
Dong, C. C., Yin, X. J., Ma, J. Y., Millecchia, L., Wu, Z. X., Barger, M. W.,
  Roberts, J. R.,  Antonini, J. M., Dey, R. D., and Ma, J. K. (2005). Effect of
  diesel exhaust  particles on  allergic reactions and airway responsiveness in
  ovalbumin-sensitized brown Norway rats.  Toxicol. Sci. 88, 202-212.
Follettie,  M.  T., Ellis, D. K., Donaldson, D. D., Hill, A. A., Diesl,  V.,
  DeClercq, C., Sypek, J. P., Dorner, A. J.,  and Wills-Karp, M. (2006). Gene
  expression analysis in a murine model of allergic asthma reveals overlapping
  disease and therapy dependent pathways in the lung. Pharmacogenomics J.
  6, 141-152.
Galli,  S.  J., Tsai, M., and Piliponsky, A.  M. (2008). The development of
  allergic inflammation. Nature 454, 445-454.
Gilmour, M. I., Jaakkola, M. S., London, S. J., Nel, A. E., and Rogers, C. A.
  (2006).  How  exposure  to   environmental  tobacco smoke, outdoor  air
  pollutants, and increased pollen burdens influences the incidence of asthma.
  Environ. Health Perspect. 114, 627-633.
Gunnison, A., and Chen,  L.  C. (2005). Effects of subchronic exposures to
  concentrated ambient particles (CAPs) in mice. VI. Gene expression in heart
  and lung tissue. Inhal. Toxicol. 17, 225-233.
   Harkema, J. R., Barr, E. B., and Hotchkiss, J. A. (1997a). Responses of rat nasal
     epithelium to short- and  long-term exposures of ozone: Image analysis of
     epithelial injury, adaptation and repair. Microsc. Res. Tech. 36, 276-286.
   Harkema, J. R., Hotchkiss, J. A., and Griffith, W. C. (1997b).  Mucous cell
     metaplasia in rat  nasal epithelium  after a 20-month exposure to ozone:
     A morphometric study of epithelial differentiation. Am. J. Respir. Cell. Mol.
     Biol. 16,  521-530.
   Harkema, J. R., Keeler, G., Wagner, J., Morishita, M., Timm, E., Hotchkiss, J.,
     Marsik, F., Dvonch,  T., Kaminski, N., and Barr, E.  (2004a). Effects of
     concentrated ambient particles on normal and hypersecretory airways in rats.
     Res. Rep. Health Eff. Inst. 1-68; discussion 69-79.
   Harkema, J. R., Keeler, G., Wagner, J., Morishita, M., Timm, E., Hotchkiss, J.,
     Marsik, F., Dvonch, T., Kaminski, N., and Barr, E. (2004b). HEI Research
     Report. Effects of inhaled urban air  particulates  on  normal and hyper-
     secretory airways  in rats.  Health Effects Institute, Cambridge, MA.
   Harkema, J. R., Plopper, C. G., Hyde, D.  M., and St George, J. A.  (1987a).
     Regional differences  in quantities of histochemically detectable mucosub-
     stances in nasal, paranasal, and nasopharyngeal epithelium of the bonnet
     monkey. J. Histochem. Cytochem. 35, 279-286.
   Harkema, J. R., Plopper, C. G., Hyde, D. M., St George, J. A., and Dungworth, D. L.
     (1987b). Effects of an ambient level of ozone on  primate nasal epithelial
     mucosubstances. Quantitative histochemistry. Am. J. Pathol. 127, 90-96.
   Izuhara, K., and Saito, H.  (2006). Microarray-based identification of novel
     biomarkers in asthma. Allergol. Int. 55, 361-367.
   Izzotti, A.,  Bagnasco, M.,  Cartiglia,  C., Longobardi, M., Balansky, R.  M.,
     Merello,  A., Lubet, R. A., and De  Flora,  S.  (2005). Chemoprevention of
     genome, transcriptome, and proteome alterations induced by cigarette smoke
     in rat lung. Eur. J. Cancer 41, 1864-1874.
   Johnson, P. R., Burgess, J.  K., Underwood,  P.  A., Au, W., Poniris, M. H.,
     Tamm, M., Ge, Q., Roth, M., and Black, J. L. (2004). Extracellular matrix
     proteins modulate  asthmatic airway smooth muscle cell proliferation via an
     autocrine mechanism. J. Allergy Clin. Immunol. 113, 690-696.
   Kaartinen,  V.,  Voncken,  J.  W.,  Shuler,  C., Warburton,  D.,  Bu,  D.,
     Heisterkamp, N., and Groffen, J. (1995). Abnormal lung development and
     cleft palate in mice  lacking TGF-beta  3  indicates defects of epithelial-
     mesenchymal interaction. Nat. Genet. 11, 415-421.
   Keeler, G.  J.,  Morishita, M.,  Wagner,  J. G., and Harkema, J. R.  (2007).
     Characterization of urban atmospheres  during inhalation exposure studies in
     Detroit and Grand Rapids, Michigan. Toxicol. Pathol. 35, 15-22.
   Keller, M. B., and Lowenstein, S. R. (2002). Epidemiology of asthma. Semin.
     Respir. Crit. Care Med. 23, 317-329.
   Kleeberger, S.  R., and  Peden, D. (2005).  Gene-environment interactions in
     asthma and other respiratory diseases. Annu. Rev. Med. 56, 383-400.
   Kooter, I., Pennings, J., Opperhuizen, A., and Cassee, F. (2005). Gene
     expression pattern in spontaneously hypertensive  rats  exposed to urban
     particulate matter (EHC-93). Inhal. Toxicol. 17, 53-65.
   Kuperman,  D.  A.,  Lewis,  C.  C.,  Woodruff,  P.  G., Rodriguez,  M.  W.,
     Yang, Y. H., Dolganov, G. M., Fahy, J. V., and Erle, D. J. (2005). Dissecting
     asthma using focused transgenic  modeling  and  functional  genomics.
     J. Allergy Clin. Immunol. 116, 305-311.
   Leikauf, G. D., McDowell, S.  A.,  Wesselkamper,  S.  C.,  Miller,  C. R.,
     Hardie,  W.  D.,  Gammon,  K.,  Biswas,  P.  P.,  Korfhagen,  T.  R.,
     Bachurski, C. J., Wiest, J. S., et al.  (2001). Pathogenomic mechanisms for
     particulate matter induction of acute lung injury and inflammation in mice.
     Res. Rep. Health Eff. Inst. 5-58; discussion 59-71.
   Lloyd, C.  M.,  and Robinson,  D.  S.   (2007).  Allergen-induced  airway
     remodelling. Eur. Respir. J. 29, 1020-1032.
   London, S. J. (2007). Gene-air pollution interactions in asthma. Proc. Am.
     Thorac. Soc. 4, 217-220.
   Matsumoto, A., Hiramatsu,  K., Li, Y., Azuma, A., Kudoh, S., Takizawa, H.,
     and Sugawara, I. (2006). Repeated exposure to low-dose diesel exhaust after
  allergen challenge exaggerates asthmatic responses in mice. Clin. Immunol.
  121, 227-235.
McCunney, R. J. (2005). Asthma, genes, and air pollution. J. Occup. Environ.
  Med. 47, 1285-1291.
McMillan, S. J., Xanthou, G., and Lloyd, C. M. (2005). Manipulation of allergen-
  induced airway remodeling by treatment with anti-TGF-beta antibody: Effect
  on the Smad signaling pathway. J. Immunol. 174, 5774-5780.
Meng, Q. R., Gideon, K. M.,  Harbo, S. J., Renne, R. A.,  Lee, M.  K.,
  Brys, A. M., and Jones, R. (2006). Gene expression profiling in lung tissues
  from mice exposed to cigarette smoke,  lipopolysaccharide, or smoke plus
  lipopolysaccharide by inhalation. Inhal. Toxicol. 18, 555-568.
Miyabara, Y., Takano, H., Ichinose,  T., Lim, H.  B., and Sagai,  M. (1998).
  Diesel exhaust enhances allergic airway inflammation and hyperresponsive-
  ness in mice. Am. J. Respir. Crit. Care Med. 157, 1138-1144.
Moller, M., Gravenor, M. B., Roberts, S. E., Sun, D., Gao, P., and Hopkin, J. M.
  (2007). Genetic haplotypes of Th-2 immune signalling link allergy to enhanced
  protection to parasitic worms. Hum. Mol. Genet. 16, 1828-1836.
Morishita, M., Keeler, G.,  Wagner, J., Marsik, F., Timm, E., Dvonch, J., and
  Harkema,  J. (2004). Pulmonary retention of particulate matter is associated
  with airway inflammation in allergic rats exposed to air pollution in urban
  Detroit. Inhal. Toxicol.  16, 663-674.
Nadadur, S. S., and  Kodavanti, U. P.  (2002). Altered gene expression profiles
  of rat lung in response to an emission particulate and its metal constituents.
  J. Toxicol. Environ. Health A 65, 1333-1350.
Nadadur, S. S., Pinkerton, K. E., and Kodavanti, U. P. (2002). Pulmonary gene
  expression profiles of spontaneously hypertensive rats exposed  to environ-
  mental tobacco smoke.  Chest 121,  83S-84S.
Noritake,  S., Ogawa,  K.,  Suzuki, G.,  Ozawa, K., and Ikeda,  T. (2007).
  Pulmonary inflammation in brown Norway rats: Possible  association  of
  environmental particles  in the animal room environment. Exp.  Anim.  56,
  319-327.
Ober, C., and Hoffjan, S. (2006). Asthma genetics 2006: The long and winding
  road to gene discovery. Genes Immun. 7, 95-100.
Parameswaran, K., Willems-Widyastuti, A., Alagappan, V. K., Radford, K.,
  Kranenburg, A. R., and Sharma, H. S. (2006). Role of extracellular matrix
  and its regulators in human airway smooth  muscle biology. Cell Biochem.
  Biophys. 44, 139-146.
Park,  J. W., Taube, C., Swasey, C.,  Kodama, T., Joetham, A., Balhorn, A.,
  Takeda, K.,  Miyahara,  N., Allen, C.  B.,  Dakhama,  A., et  al. (2004).
  Interleukin-1  receptor  antagonist  attenuates  airway hyperresponsiveness
  following  exposure  to ozone. Am. J. Respir. Cell. Mol. Biol. 30, 830-836.
Pauwels, R., Bazin, H., Platteau, B., and Van Der Straeten, M. (1979). Relation
  between total  serum IgE levels and IgE antibody production in rats. Int.
  Arch. Allergy Appl. Immunol. 58, 351-357.
Prahalad, A. K., Inmon, J., Dailey, L. A., Madden, M. C., Ghio, A. J., and
  Gallagher, J. E. (2001). Air pollution particles mediated oxidative DNA base
  damage in a cell free system and in human airway epithelial cells in relation
  to particulate  metal content and  bioreactivity. Chem. Res. Toxicol.  14,
  879-887.
Riedl, M., and Diaz-Sanchez,  D. (2005). Biology of diesel exhaust effects on
  respiratory function. J. Allergy Clin. Immunol. 115, 221-228; quiz 229.
Salmon, M., Walsh, D. A., Koto, H., Barnes, P. J., and Chung, K. F. (1999).
  Repeated allergen exposure of sensitized Brown-Norway rats induces airway
  cell DNA  synthesis and remodelling. Eur. Respir. J.  14, 633-641.
Sampsonas,   F.,  Kaparianos,  A.,   Lykouras,  D.,  Karkoulias,   K.,  and
  Spiropoulos,  K. (2007).  DNA sequence variations of metalloproteinases:
  Their role in asthma and COPD. Postgrad. Med. J. 83, 244-250.
Sato,  H., Sagai,  M., Suzuki,  K. T., and Aoki, Y.  (1999). Identification, by
  cDNA microarray, of A-raf and proliferating cell nuclear antigen as genes
  induced in rat lung by exposure  to diesel exhaust. Res.  Commun. Mol.
  Pathol. Pharmacol. 105, 77-86.
Shi, W., Bellusci, S., and Warburton, D. (2007). Lung development and adult
  lung diseases. Chest 132, 651-656.
Sigaud, S., Goldsmith, C. A., Zhou, H., Yang, Z., Fedulov, A., Imrich, A., and
  Kobzik, L. (2007). Air pollution particles diminish bacterial clearance in the
  primed lungs of mice. Toxicol. Appl.  Pharmacol. 223, 1-9.
Sioutas, C.,  Koutrakis, P., Godleski, J. J., Ferguson, S. T., Kim, C. S., and
  Burton, R. M. (1997). Fine particle concentrators for inhalation exposures—
  Effect of particle size and composition. /. Aerosol Sci. 28, 1057-1071.
Stevens, T., Krantz, Q. T., Linak, W. P., Hester, S., and Gilmour, M. I. (2008).
  Increased  transcription of immune and metabolic pathways  in naive and
  allergic mice exposed to diesel exhaust. Toxicol. Sci. 102, 359-370.
Takano, H., Yoshikawa, T.,  Ichinose, T.,  Miyabara,  Y., Imaoka, K., and
  Sagai, M. (1997). Diesel exhaust particles enhance antigen-induced  airway
  inflammation and local cytokine expression in mice. Am. J. Respir.  Crit.
  Care Med. 156, 36-42.
Takenaka, H., Zhang, K., Diaz-Sanchez, D., Tsien, A., and Saxon, A. (1995).
  Enhanced human  IgE production  results from exposure to the  aromatic
  hydrocarbons from diesel exhaust: Direct effects on B-cell IgE production. /.
  Allergy Clin. Immunol. 95,  103-115.
Tarayre, J. P., Aliaga, M., Barbara,  M., Tisseyre, N., Vieu, S.,  and Tisne-
  Versailles, J. (1992). Model of bronchial allergic inflammation in the brown
  Norway rat.  Pharmacological  modulation. Int. J. Immunopharmacol. 14,
  847-855.
Tibshirani, R., Hastie, T., Narasimhan, B., and Chu, G. (2002). Diagnosis of
  multiple cancer types by shrunken centroids of gene expression. Proc. Natl.
  Acad. Sci. U. S. A. 99, 6567-6572.
Tsien,  A., Diaz-Sanchez,  D.,  Ma, J.,  and Saxon, A. (1997). The organic
  component of  diesel  exhaust  particles and phenanthrene, a major poly-
  aromatic hydrocarbon constituent, enhances IgE production by IgE-secreting
  EBV-transformed human B  cells in vitro. Toxicol. Appl. Pharmacol.  142,
  256-263.
Underwood, S. L., Kemeny, D. M., Lee, T. H., Raeburn, D., and Karlsson, J. A.
  (1995).  IgE production, antigen-induced  airway  inflammation and  airway
  hyperreactivity in the  brown Norway rat:  The effects of ricin. Immunology
  85, 256-261.
Walker,  J.  K.,  Ahumada,  A., Frank,  B.,   Gaspard,  R.,  Berman,  K.,
  Quackenbush,  J.,   and  Schwartz,  D.  A.   (2006).  Multistrain  genetic
  comparisons reveal CCR5 as a  receptor involved in airway hyperresponsive-
  ness. Am.  J. Respir. Cell. Mol. Biol. 34, 711-718.
Wang, I.  M.,  Stepaniants,  S.,  Boie,  Y.,  Mortimer,  J.  R.,  Kennedy, B.,
  Elliott, M., Hayashi, S., Loy, L., Coulter, S., Cervino, S., et al. (2007). Gene
  expression profiling in patients with chronic obstructive pulmonary disease
  and lung cancer. Am. J. Respir. Crit.  Care Med.
Williams,  A.  S., Issa,  R.,  Leung, S.  Y.,  Nath, P.,  Ferguson,  G.  D.,
  Bennett, B.  L., Adcock,  I. M., and Chung, K. F.  (2007). Attenuation of
  ozone-induced  airway inflammation and  hyper-responsiveness by  c-Jun
  NH2 terminal kinase  inhibitor SP600125. /.  Pharmacol. Exp. Ther.  322,
  351-359.
Wise, H., Balharry, D., Reynolds, L. J., Sexton, K., and Richards, R. J. (2006).
  Conventional and toxicogenomic assessment of the acute pulmonary damage
  induced by  the instillation of Cardiff PM10  into the rat lung. Sci. Total
  Environ. 360, 60-67.
Yanagisawa,  R.,  Takano,  H.,   Inoue,  K.,  Ichinose,  T.,  Yoshida,  S.,
  Sadakane, K., Takeda, K., Yoshino, S., Yamaki, K., Kumagai, Y., et al.
  (2004).  Complementary  DNA microarray  analysis  in  acute lung injury
  induced by lipopolysaccharide  and diesel exhaust particles. Exp. Biol. Med.
  (Maywood) 229, 1081-1087.
Yeatts, K., Svendsen, E., Creason,  J., Alexis, N., Herbst, M., Scott, J., Kupper, L.,
  Williams,  R., Neas, L., Cascio, W., et al. (2007). Coarse particulate matter
  (PM2.5-10)  affects heart rate variability,  blood lipids, and circulating
  eosinophils in adults with asthma. Environ. Health Perspect. 115, 709-714.

-------
J Pharmacokinet Pharmacodyn (2008) 35:683-712
DOI 10.1007/s10928-008-9108-2
Comparing models for perfluorooctanoic acid
pharmacokinetics using Bayesian analysis

John F. Wambaugh • Hugh A. Barton •
R. Woodrow Setzer
Received: 13 May 2008 / Accepted: 8 December 2008 / Published online: 8 January 2009
© Springer Science+Business Media, LLC 2008
Abstract   Selecting  the  appropriate  pharmacokinetic (PK) model  given  the
available data is investigated for perfluorooctanoic acid (PFOA), which has been
widely  analyzed with  an  empirical, one-compartment  model.  This research
examined the results of experiments [Kemper R. A., DuPont Haskell Laboratories,
USEPA Administrative Record AR-226.1499 (2003)] that administered  single oral
or iv doses of PFOA to adult male and female rats. PFOA concentration was
observed over time, in plasma for some animals and in fecal and urinary excretion
for others. There were four rats per dose group, for a total of 36 males and 36
females.  Assuming  that the PK parameters for each individual within a gender
were drawn from the same, biologically varying population, plasma and excretion
data were jointly analyzed using a hierarchical framework to  separate uncertainty
due to measurement error from  actual biological variability. Bayesian analysis
using Markov Chain Monte  Carlo (MCMC) provides tools  to perform such  an
analysis as well as  quantitative diagnostics to evaluate and discriminate between
models. Starting from a one-compartment PK model with  separate clearances to
urine and feces, the model was incrementally expanded using Bayesian measures
to assess if the expansion  was supported  by the data. PFOA excretion is sexually
dimorphic in rats; male rats have bi-phasic elimination that  is  roughly 40 times
slower than that of  the females, which appear to have a single elimination phase.
The male and female data were analyzed separately, keeping  only  the parameters
describing  the  measurement process in  common.  For  male  rats,   including
Electronic supplementary material  The online version of this article (doi:
10.1007/s10928-008-9108-2) contains supplementary material, which is available to authorized users.
J. F. Wambaugh (✉) • H. A. Barton • R. W. Setzer
National Center for Computational Toxicology, US EPA, Research Triangle Park, NC 27711, USA
e-mail: wambaugh.john@epa.gov
-------

excretion data initially decreased certainty in the one-compartment parameter
estimates compared to an analysis using plasma data only. Allowing a third,
unspecified clearance improved agreement and increased certainty when all the
data were used; however, a significant amount of eliminated PFOA was estimated
to be missing from the excretion data. Adding an additional PK compartment
reduced the unaccounted-for elimination to amounts comparable to the cage wash.
For both sexes, an MCMC estimate of the appropriateness of a model for a given
data type, the Deviance Information Criterion, indicated that this
two-compartment model was better suited to describing PFOA PK. The median
estimate was 142.1 ± 37.6 ml/kg for the volume of the primary compartment and
1.24 ± 1.1 ml/kg/h for the clearances of male rats and 166.4 ± 46.8 ml/kg and
30.3 ± 13.2 ml/kg/h, respectively, for female rats. The estimates for the second
compartment differed greatly with gender: volume 311.8 ± 453.9 ml/kg with
clearance 3.2 ± 6.2 for males and 1400 ± 2507.5 ml/kg and 4.3 ± 2.2 ml/kg/h
for females. The median estimated clearance was 12 ± 6% to feces and 85 ± 7%
to urine for male rats and 8 ± 6% and 77 ± 9% for female rats. We conclude that
the available data may support models for PFOA PK beyond two compartments
and that the methods employed here will be generally useful for more
complicated models, including PBPK models.

Keywords  Bayesian analysis • Pharmacokinetics • Model comparison •
Hierarchical models • WinBUGS • Perfluorooctanoic acid • Sprague-Dawley rats •
Markov Chain Monte Carlo • Risk assessment • Sexually dimorphic •
Deviance Information Criterion • Population models
Introduction

Compartmental pharmacokinetic models range in complexity from first order, one
compartment descriptions to physiologically based pharmacokinetic (PBPK) models
of the absorption, distribution, metabolism, and excretion of a given compound by
an organism using biologically relevant compartments connected by biologically
based flows. The concentration of the toxicant in each compartment is described by
differential equations that contain parameters which may be biologically based or
phenomenological and  about which there may or may not be prior experimental
knowledge [1].  Calibrating  a pharmacokinetic  model to describe a particular
organism  may  therefore  require simultaneously  determining a large number of
parameters from  a  sparse data set. For  this reason the choice of the specific
pharmacokinetic model used for  a particular  toxicant in a  particular  organism
depends both on the behavior of the toxicant and the relevant data available  [2-6].
Bayesian  techniques  are particularly well-suited  to the  task  of  evaluating
pharmacokinetic models:  distributions of model parameters can be determined for
complicated model structures and these distributions can remain broad if the data
does not inform the model; knowledge about model parameters can be incorporated
in the form of prior distributions; and diverse model structures can be quantitatively
compared with each other [7-9].
-------

   We are investigating perfluorooctanoic acid (PFOA), a perfluorinated fatty acid
analog that  is of concern  in part because it  has been detected in the general
population and wildlife around the world [10]. PFOA is a persistent chemical with a
human half-life that has been calculated to be 4 years [11]. Because it is
widespread and has a long half-life, PFOA is of interest to toxicology and environmental
risk assessment [11-14]. PFOA is not metabolized [15], and at first glance seems
ideally suited to simple, empirical pharmacokinetic approaches that model
organisms as one or two well-mixed compartments. The half-life of PFOA is
much shorter in animals than in humans, ranging from hours in female rats to days
in male rats to weeks in monkeys [10]. These differences lead to estimated
exposures that vary by several orders of magnitude between species. PFOA
excretion is well known to be sexually dimorphic in rats and there is evidence that
testosterone-dependent expression of specific transporters is responsible [16-18].
Surveys of human plasma concentration have found no clear gender differences [13].
If the pharmacokinetics of PFOA is driven by transporters, then saturation of these
transporters at high concentrations may be a further complication [12].
   Determining the level of detail sufficient to characterize the relationship between
exposure and response for PFOA is an ongoing question. There is evidence in
studies on monkeys for saturable kinetics resulting in non-linear dose-dependence at
high concentrations [12]; however, serial measures of PFOA serum concentration in
occupationally exposed humans are consistent with linear kinetics [11]. Since the
human kinetics are presumed to be at steady-state and the observed PFOA
concentrations in the general population of the United States are even lower than
those of the occupationally exposed [13], it may be hard to provide the information
needed to support complicated pharmacokinetic models. There is limited evidence for
dose-dependent pharmacokinetics in rats [19], but much of the serial rat
pharmacokinetics data [20] considered by the US EPA draft risk assessment appears
linear with dose. Some, but not all, of the time-courses indicate bi-phasic
elimination behavior. Because the choice of model is not clear by inspection, and
since the human data appear linear, a one-compartment model for PFOA has been used
by the draft risk assessment and elsewhere [14, 21]. Given the limited human data,
care must be taken if a more complicated model (whether it be two-compartments, a
biologically based model like that of Andersen et al. [12], or even a full PBPK
model) is to be used for PFOA pharmacokinetics.
   This research has two goals. First, we wish to examine the tradeoff between
making models more sophisticated  and  introducing more  unknowns that could
potentially increase uncertainty. We do this to help the development of verifiable
methodologies for understanding when a model is appropriate for the available data
[7, 22]. We  used empirical pharmacokinetic  models with very few parameters to
describe data for rats exposed to PFOA. Because analytic solutions exist for these
models they can be rapidly solved,  allowing Bayesian analysis with little prior
information  via Markov Chain Monte  Carlo  (MCMC).  The  methodology  of
Bayesian statistics allows us to make model comparisons  and determine which
parameters can be estimated and the extent to which the data inform these estimates.
Because  not all parameters  in empirical pharmacokinetic models  have direct
biological analogs,  we  made  relatively uninformative  assumptions about  the
-------

parameters to investigate whether the data truly contains information  about the
parameters for our  assumed models. This lets us assess  the usefulness of these
techniques for estimating pharmacokinetic parameters in the hypothetical situation
of no prior estimates. When we find that the available data does not inform a model
parameter, we then learn that we must simplify the model, make an assumption
about the parameter value, or acquire more data.
   Our second goal  is to characterize the  empirical pharmacokinetics of PFOA in
rats. Since  a  one-compartment  model  has been  widely used,  quantitatively
determining  the  relative   advantages  of a two-compartment  model  is both
immediately useful  and instructive for future model development. We assumed a
population model for the parameters of individual rats allowing us to attempt to
separate measurement uncertainty from biological variability. This approach allows
more accurate assessment  of the information that  would be gained by additional
experiments as well as greater predictive power [23]. Since a population model is
one method for allowing different types of measurements to inform each other, we
jointly analyzed plasma and excretion data to determine, for instance, the
elimination of PFOA from the plasma and the fractions of that elimination that are
excreted in urine and feces. Reconciling data of different types requires careful
model construction; for a poorer model, additional data may actually increase
apparent uncertainty for parameter estimates.
   Since there is ample reason to expect that PFOA pharmacokinetics are more
complicated than can be described by a one-compartment model, PFOA provides an
opportunity to develop quantitative arguments for more complicated models that are
both immediately useful to PFOA risk assessment and generally useful for
data-driven modeling.
Methods

WinBUGS version  1.4.2 [24] was used  to  describe  and analyze  population
pharmacokinetic models for  PFOA in rats. WinBUGS uses MCMC techniques to
make Bayesian  statistical inferences about the probability distribution of the
parameters in the model. Analyses were performed on a dual-processor 3 GHz
Pentium-D desktop computer.
   The PKBUGS [25] pharmacokinetic interface for WinBUGS was not used, in
order to allow the examination of slightly modified compartmental models with two
elimination rates instead of one. The freely available R statistical software package
[26] and the R2WinBUGS [27]  library were used to parameterize our model and run
WinBUGS. To  increase processing  speed, compiled functions were  created to
calculate pharmacokinetic properties using WBDEV, the WinBUGS  developer
package [28], and the freely available Blackbox Component Pascal language [29] in
which WinBUGS is written.
   Following the example of Gelman et al. [2], we identify the following three
components of our model: the individual pharmacokinetic model, the population
model, and the measurement model.
-------

Individual pharmacokinetic model

One-compartment model

The one-compartment model consisted of a volume of distribution $V_d$ (ml/kg) from
which there is clearance $Cl$ (ml/kg/h). Although physiologically the volume of
distribution and the clearances are independent of each other, the mathematical form
of the time-dependence of the concentration depends on the ratio of the two, i.e. the
elimination rate $k = Cl/V_d$. For the actual MCMC implementation we sampled
elimination rates rather than clearances because the rate could be changed
independently of the volume of distribution and therefore allowed more rapid
convergence.
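
As a minimal sketch of why the rate is a convenient sampling variable, the Python snippet below evaluates a one-compartment model for an intravenous bolus. The closed form C(t) = (dose/V) exp(-k t) is the textbook bolus solution and is an assumption here, not the authors' Blackbox implementation. The function names are hypothetical, and the numbers are the male-rat volume and clearance quoted in the abstract, used only for illustration.

```python
import math

def elimination_rate(cl_ml_kg_h, vd_ml_kg):
    """Elimination rate k (1/h) as the ratio of clearance to volume."""
    return cl_ml_kg_h / vd_ml_kg

def conc_iv_bolus(t_h, dose_mg_kg, vd_ml_kg, k_per_h):
    """Assumed textbook form: C(t) = (dose / V) * exp(-k t), in mg/ml."""
    return dose_mg_kg / vd_ml_kg * math.exp(-k_per_h * t_h)

# Illustrative values only (male-rat volume and clearance from the abstract).
vd, cl = 142.1, 1.24
k = elimination_rate(cl, vd)
half_life_h = math.log(2.0) / k

# Doubling the volume at fixed k rescales the curve but keeps its time course,
# whereas doubling the volume at fixed clearance would also halve k.
c_small = conc_iv_bolus(half_life_h, 1.0, vd, k)
c_large = conc_iv_bolus(half_life_h, 1.0, 2.0 * vd, k)
print(f"k = {k:.4f} 1/h, half-life = {half_life_h:.1f} h")
print(f"C at one half-life: {c_small:.5f} vs {c_large:.5f} mg/ml")
```

With k and the volume as the sampled pair, a proposal that changes the volume moves only the amplitude of the curve, which is one way to read the faster convergence reported above.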
  Because we have separate data for urinary and fecal clearance, two clearances, $Cl_{uri}$
and $Cl_{fec}$, are distinguishable. However, initial attempts to model excretion using a
total clearance that is the sum of the separate clearances, $Cl = Cl_{uri} + Cl_{fec}$, failed.
Instead, we had to parameterize using a total clearance $Cl$ and two fractions $FRAC_{uri}$
and $FRAC_{fec}$, where $Cl_x = FRAC_x \times Cl$ with $FRAC_{uri} + FRAC_{fec}$
-------
occurs with a clearance $Cl_d$. The parameter vector now contains these two additional
parameters:
$$\theta = (V_d, V_t, Cl, FRAC_{fec}, FRAC_{uri}, Cl_d, k_a, f).$$
   For the two-compartment model the differential equations for the concentration
in the two compartments are:
$$\frac{dC_1}{dt} = \frac{f \cdot dose_{or} \cdot k_a}{V_d} e^{-k_a t} - \frac{Cl}{V_d} C_1 - \frac{Cl_d}{V_d}(C_1 - C_2) \tag{3}$$
$$\frac{dC_2}{dt} = \frac{Cl_d}{V_t}(C_1 - C_2). \tag{4}$$

   These equations can also be solved to give an analytic solution for the primary
compartment (we do not need the solution for the secondary compartment), $C_1(\theta, t)$.
   For both the one- and two-compartment models the concentration and integrated
concentration were each  implemented in WinBUGS as analytic functions using
Blackbox Component Pascal [29].
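
The paper relies on analytic solutions, but a numerical check can be useful when modifying the model. The sketch below integrates a two-compartment model with a hand-written fourth-order Runge-Kutta step. The right-hand side is the standard two-compartment form with first-order oral absorption, which is assumed here rather than taken from the authors' code. The dictionary keys, function names, `k_a`, `f`, and the dose are hypothetical, and the volumes and clearances are illustrative values from the abstract.

```python
import math

def rhs(t, c1, c2, p):
    """Assumed two-compartment equations with first-order oral absorption."""
    absorbed = p["f"] * p["dose_or"] * p["ka"] * math.exp(-p["ka"] * t)
    exchange = p["Cld"] * (c1 - c2)            # net flow, primary -> secondary
    dc1 = (absorbed - p["Cl"] * c1 - exchange) / p["Vd"]
    dc2 = exchange / p["Vt"]
    return dc1, dc2

def rk4_step(t, c1, c2, h, p):
    """One classical Runge-Kutta step for the pair (c1, c2)."""
    k1 = rhs(t, c1, c2, p)
    k2 = rhs(t + h / 2, c1 + h / 2 * k1[0], c2 + h / 2 * k1[1], p)
    k3 = rhs(t + h / 2, c1 + h / 2 * k2[0], c2 + h / 2 * k2[1], p)
    k4 = rhs(t + h, c1 + h * k3[0], c2 + h * k3[1], p)
    c1 += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    c2 += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return c1, c2

def simulate(p, t_end, n_steps):
    """Concentrations in both compartments at t_end, starting from zero."""
    h = t_end / n_steps
    t, c1, c2 = 0.0, 0.0, 0.0
    for _ in range(n_steps):
        c1, c2 = rk4_step(t, c1, c2, h, p)
        t += h
    return c1, c2

# Hypothetical parameter set (ml/kg, ml/kg/h, 1/h, mg/kg).
params = {"Vd": 142.1, "Vt": 311.8, "Cl": 1.24, "Cld": 3.2,
          "ka": 1.0, "f": 0.9, "dose_or": 1.0}
c1, c2 = simulate(params, t_end=48.0, n_steps=4800)
print(f"C1(48 h) = {c1:.5f} mg/ml, C2(48 h) = {c2:.5f} mg/ml")
```

Setting `Cl` to zero gives a quick mass-balance check: the amount in the two compartments should approach the absorbed dose `f * dose_or`.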

Population model

A hierarchical statistical framework that assumes that each individual has  its own
pharmacokinetic parameters that have  been drawn from  the  same  population
distribution was used. We distinguish between parameters — which lead to predictions
for the observed data — and the hyper-parameters that characterize the distribution of
the parameters. Because of the sexually dimorphic excretion of PFOA, we assumed
separate populations for male and female pharmacokinetic parameters.
   For most of the pharmacokinetic model parameters we assume that there are two
population hyper-parameters that generally describe the mean ($\nu$) and standard
deviation ($\eta$) of a log-normally distributed population of individual parameter
values.  Achieving convergence  for  the  estimated  distributions  of  population
standard  deviations was  the slowest  of all parameter types, necessitating long
MCMC run times. For this reason, cross-correlations were  not included,  as they
would be even less informed by the available data.
   Since we have excretion data indicating that the bioavailability was likely less
than one, we felt comfortable assuming that non-radiolabeled PFOA in the mass-
balance experiments was  not a significant factor. For the bioavailability to be less
than or equal to one and the excretion fractions to sum to one, we assume that those
parameters  are Dirichlet  distributed. The forms  of the distributions are given in
Table 1. For the one-compartment model:
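$$V_c \sim TLN(\mathrm{Vd.pop.}\mu,\ \mathrm{Vd.pop.}\tau,\ -\infty,\ 8)$$
$$k_{tot} \sim TLN(\mathrm{ktot.pop.}\mu,\ \mathrm{ktot.pop.}\tau,\ -\infty,\ -0.5)$$
$$k_a \sim TLN(\mathrm{ka.pop.}\mu,\ \mathrm{ka.pop.}\tau,\ -4,\ 4)$$
$$FRAC_x \sim D(FRAC_{hyper})$$
$$f \sim D(BIOAVAIL_{hyper})$$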
   The log-normal distribution  constrains  the individual parameters  to positive
values. In order to speed convergence the truncated log-normal distribution, which
-------
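Table 1 Definition of probability distributions

                   Truncated log-normal                        Dirichlet
Notation           $TLN(x, \mu, \tau, a, b)$                   $D(x, \alpha)$
Random variable    Univariate $x$, $e^a < x < e^b$             Multivariate $x$, $\sum_{i=1}^{k} x_i = 1$, $0 < x_i < 1$
Hyper-parameters   Location $\mu$ and precision $\tau$ of $\log x$   $\alpha$ is a $k$ element vector
Form               [expression not recoverable from the source]   [expression not recoverable from the source]
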
is both constrained to $e^{lower} < x < e^{upper}$ and normalized to the interval [lower,
upper], was used to sample individual values. By preventing extreme individual
parameter values, the likelihood is more sensitive to changes in parameter value.
With  this  approximation,  the posterior  parameter  value  distributions  of  all
individuals must be inspected to ensure that mass has not  accumulated  near a
truncation boundary.
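
The truncation check described above can be sketched in a few lines of Python. This is a simple rejection sampler written for illustration, not the WinBUGS sampler: `sample_tln` and its arguments are hypothetical names, the location, precision, and bounds are made-up values, and the bounds are given on the log scale as in the TLN notation.

```python
import math
import random

def sample_tln(mu, tau, lower, upper, rng):
    """Draw x with log(x) ~ Normal(mu, sd = 1/sqrt(tau)), lower < log(x) < upper."""
    sigma = 1.0 / math.sqrt(tau)
    while True:                          # rejection sampling: redraw until in range
        log_x = rng.gauss(mu, sigma)
        if lower < log_x < upper:
            return math.exp(log_x)

rng = random.Random(0)                   # fixed seed for reproducibility
draws = [sample_tln(math.log(100.0), 4.0, -math.inf, 8.0, rng)
         for _ in range(1000)]

# Diagnostic in the spirit of the text: how much mass sits near the upper bound?
near_upper = sum(math.log(x) > 7.5 for x in draws) / len(draws)
print(f"fraction of draws within 0.5 log-units of the upper bound: {near_upper:.3f}")
```

A non-negligible value of `near_upper` would suggest that the truncation, rather than the data, is shaping the distribution.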
   There are two additional parameter distributions for the two  compartment model:
$$V_t \sim TLN(\mathrm{Vt.pop.}\mu,\ \mathrm{Vt.pop.}\tau,\ -\infty,\ 8)$$
$$k_{12} \sim TLN(\mathrm{k12.pop.}\mu,\ \mathrm{k12.pop.}\tau,\ -\infty,\ -1)$$

where $k_{12} = Cl_d/V_c$. We used the WinBUGS Markov chain Monte Carlo package to
determine a Bayesian posterior probability distribution [30]  for all parameters and
hyper-parameters.  We explain this process in  detail  in the section  "Bayesian
Analysis."  The number  of iterations necessary  for  convergence  was  greatly
decreased by working with the logarithm of the parameters when creating new
samples. By estimating the logarithm of the parameter, changes in parameter value
are of a proportional size, rather than fixed, speeding up convergence.
   A  Bayesian analysis  requires prior  probability distributions for the  hyper-
parameters to characterize previous knowledge  about their values. We specified
distributions that were uniform over an arbitrary interval that was hopefully large
enough to encompass the true value of each parameter. An important check of this
assumption was to ensure that there is little probability mass near these  arbitrary
limits in the converged posterior distributions.
   We chose to use priors  on the actual parameter values, as opposed to the log-
transformed parameters, so that the units of the prior distributions were the same as
the parameters themselves. We believe that using uniform priors, rather than more
numerically efficient conjugate priors, permits the upper and lower limits of  the
parameter values to be more  accessible to a general biological modeling audience.
   We assumed the prior distributions indicated by Table 2, which were the same
for the analyses of all data and both  one-  and two-compartment models,  with  the
exception that different ka priors were used  for males and females. In most cases  the
mean and standard deviation hyper-parameters are drawn from the uniform
distribution $U(x, min, max) = \frac{1}{max - min}$, $min < x < max$ [31]. We made use of
distributions that gave the logarithms of parameters but were uniform on the
original scale.
-------


Table 2 Population priors for pharmacokinetic hyper-parameters

Parameter                            Units   Mean                  SD                          Location      Scale
                                             Min         Max       Min         Max             Min    Max    Min         Max
$V_c$                                ml/kg   10          $10^3$    10          $2 \times 10^3$ -      -      -           -
$k_{tot}$                            1/h     $10^{-4}$   1         $10^{-3}$   2               -      -      -           -
$\sum_{i=1}^{k}$ FRAChyper           -       1           50        -           -               -      -      -           -
$\sum_{i=1}^{k}$ BIOAVAILhyper       -       1           50        -           -               -      -      -           -
$k_a$                                1/h     $10^{-2}$   20        $10^{-2}$   60              -4     5.5    $10^{-4}$   15
$V_t$                                ml/kg   -           -         -           -               0      8      $10^{-4}$   $10^3$
$k_{12}$                             1/h     -           -         -           -               -25    -2     $10^{-1}$   25

   In addition to the use of log-transformed variables, we also found that sampling
speed could be improved if, for some parameters, we sampled the precision and not
the standard deviation. For instance, if the standard deviation is large it is insensitive
to small perturbations to its value. Since the precision is  the reciprocal of the
squared standard deviation, it is a very small number for large standard  deviations
and much more sensitive to changes by the sampler.
   Since we wished to use a uniform prior for the mean $\nu$ and standard deviation
$\eta$, it was necessary to convert to log-normal location $\mu = \log(\nu) - \frac{1}{2}\log(1 + \eta^2/\nu^2)$
and scale $\tau = 1/\log(1 + \eta^2/\nu^2)$ to draw log-normally distributed individual
parameters. Problems estimating scale parameters in hierarchical models are well known
[31, 32]. In our case, because the log-normally transformed variables $\mu$ and $\tau$ each
depend on both $\nu$ and $\eta$, varying the mean and standard deviation independently
greatly slows the approach to convergence. For example, if all individuals have a
parameter value near 100, then since $\log(100) \approx 4.6$ we know that $\mu \approx 4.6$. If
the standard deviation is 10, then a mean of 100 gives $\mu \approx 4.6$. If the standard
deviation is 100, however, the likelihood of large parameter values grows, so that
for $\mu$ to be centered where the individuals are located, the mean must actually be
larger than before, $\approx 127$. If the standard deviation is larger still, say 1,000, then
the mean needed to describe a log-normal distribution located at $\mu \approx 4.6$ is
$\approx 325$. In short, for large standard deviations the choice of mean depends on the
choice of standard deviation.
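
The coupling between mean and standard deviation can be reproduced with the standard log-normal moment relations, mu = log(mean) - 0.5 log(1 + sd^2/mean^2) and tau = 1/log(1 + sd^2/mean^2). The Python sketch below is an illustration under those relations. The function names are hypothetical, and the inverse uses the closed-form root of the quadratic in mean^2 that follows from the first relation.

```python
import math

def lognormal_location_precision(mean, sd):
    """Convert arithmetic mean and SD to log-scale location mu and precision tau."""
    log_var = math.log(1.0 + (sd / mean) ** 2)    # variance of log(x)
    mu = math.log(mean) - 0.5 * log_var
    tau = 1.0 / log_var
    return mu, tau

def mean_for_location(mu, sd):
    """Arithmetic mean that yields location mu for a given arithmetic SD."""
    a = math.exp(2.0 * mu)
    # Solve u**2 - a*u - a*sd**2 = 0 for u = mean**2 (positive root).
    u = 0.5 * (a + math.sqrt(a * a + 4.0 * a * sd * sd))
    return math.sqrt(u)

target_mu = math.log(100.0)                        # individuals located near 100
for sd in (10.0, 100.0, 1000.0):
    mean = mean_for_location(target_mu, sd)
    mu, tau = lognormal_location_precision(mean, sd)
    print(f"sd = {sd:6.0f}  mean = {mean:6.1f}  mu = {mu:.3f}  tau = {tau:.4f}")
```

The required mean rises from roughly 100 to about 127 and then about 324 as the standard deviation grows from 10 to 100 to 1,000, close to the values quoted in the text.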
   For the female rats, the clearance occurred so rapidly that only the  lower limit
of absorption  was  sharply  defined  (the absorption must  be sufficiently fast to
explain  the presence of PFOA  in the plasma). Because data was collected  at the
same  time points  for males and females, we effectively have less female  data.
With few data points, it is hard to set an upper limit since the concentration
primarily depends upon $e^{-k_a t}$; for $k_a$ greater than 10 this term is effectively zero.
For this reason, the standard deviation can be large for the female rats and we
found that we could not achieve convergence by using the same parameterization
as for the other variables. Instead we estimated the log-normal parameters $\mu$ and $\tau$,
and specified uniform priors on the transformed variables. This amounts to a
-------

non-uniform prior on the mean and standard deviation for ka, but analyses of male
rat  data using uniform mean  and  standard  deviation  or uniform  log-normal
location and scale gave similar results. For both male and female rats, we found
we  also had to use  uniform priors on the log-normal location and scale  for the
inter-compartmental clearance rate $k_{12}$.


Measurement model

We assume a single measurement model with the same parameters for both male
and female rats.
  For each individual we either have a series of observations of plasma
concentration $y(t_1)$ at several discrete times $t_1$, or data for the fraction of the PFOA
dose recovered separately in urine and feces at times $t_2$, $y_{uri}(t_2)$ and $y_{fec}(t_2)$.
  To evaluate the likelihood that a given set of pharmacokinetic parameters for an
individual, 9, gave rise to observations  made for  that individual,  we  model the
measurements as normally distributed about the model values for those parameters:
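$$y(t_i) \sim N(x, C(\theta, t_i), y.\tau(\theta, t_i)) \tag{5}$$
where $N(x, \nu, \tau) = \sqrt{\tau/(2\pi)}\, e^{-\tau (x - \nu)^2/2}$ and the precision $y.\tau$ of each observation $y$ is
modeled as:

$$y.\tau(\theta, t) = \left(\varepsilon^a \times C(\theta, t)^{\varepsilon^b} + \varepsilon^c\right)^{-2} \tag{6}$$

Note that this precision depends both on the concentration and a constant offset $\varepsilon^c$.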
Thus  large concentrations are considered  to be less precise, but the precision for
small concentrations  does not become arbitrarily large. This  is  similar to the
approach of Rocke and Lorenzato [33] except that both the concentration-dependent
and independent contributions to error are lumped together and  are  normally
distributed in order to reduce the number of calculations necessary to find the error.
   The corresponding standard deviation $y.\sigma$ is:
$$y.\sigma(\theta, t) = \varepsilon^a \times C(\theta, t)^{\varepsilon^b} + \varepsilon^c. \tag{7}$$
   When observations were below the limit of detection or quantitation, we modeled
a censored observation by requiring that the modeled value lie below the smallest
observed quantity for that dose-level because  we did not have the actual limit of
detection for each observation dose group. We allowed the measurement threshold
to depend on dose-level in case different dilutions were used for the different doses.
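
A minimal Python sketch of this measurement model follows. It assumes an error standard deviation of the form eps_a * prediction^eps_b + eps_c, following the three error parameters listed in Table 3, and a normal likelihood written in terms of precision. For censored values it uses the probability that a measurement falls below the threshold, which is one common way to express censoring and may differ from the authors' WinBUGS construct. All names and numerical values are hypothetical.

```python
import math

def obs_sd(pred, eps_a, eps_b, eps_c):
    """Assumed error model: SD grows with the prediction plus a constant offset."""
    return eps_a * pred ** eps_b + eps_c

def log_lik(obs, pred, threshold, eps_a, eps_b, eps_c):
    """Log-likelihood of one observation; obs=None marks a censored value."""
    sd = obs_sd(pred, eps_a, eps_b, eps_c)
    if obs is None:
        # Censored: probability that a measurement falls below the threshold.
        z = (threshold - pred) / (sd * math.sqrt(2.0))
        return math.log(max(0.5 * (1.0 + math.erf(z)), 1e-300))
    tau = 1.0 / sd ** 2                  # precision = 1 / variance
    return 0.5 * math.log(tau / (2.0 * math.pi)) - 0.5 * tau * (obs - pred) ** 2

eps = dict(eps_a=0.1, eps_b=1.0, eps_c=1e-3)    # hypothetical error parameters
print(log_lik(0.95, 1.0, 0.01, **eps))          # quantified observation
print(log_lik(None, 0.002, 0.01, **eps))        # below-threshold observation
```

The constant offset keeps the precision finite as the predicted concentration approaches zero, which is the behavior the text describes for small concentrations.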
   Observations of the urinary elimination are similarly modeled as:
$$y_{uri}(t_i) \sim N(x, elim_{uri}(\theta, t_i), y_{uri}.\tau(\theta, t_i)) \tag{8}$$

where the precision has the same form as for the concentration but different
parameter values:
$$y_{uri}.\tau(\theta, t) = \left(\varepsilon^a_{uri} \times elim_{uri}(\theta, t)^{\varepsilon^b_{uri}} + \varepsilon^c_{uri}\right)^{-2} \tag{9}$$

and the elimination over the time interval $(t_{i-1}, t_i)$ is:
-------
$$elim_{uri}(\theta, t_i) = \frac{Cl_{uri}}{dose_{or}} \int_{t_{i-1}}^{t_i} C(\theta, t')\,dt'. \tag{10}$$
   For the  fecal  elimination  observations  we  note  two features  that need
modification  of  our one-  and  two-compartment  analyses.  First,  there is  a
conspicuously large  fraction of the  PFOA that is eliminated in the  first  one or
two observed eliminations that does not seem well-described  by an exponential
process. This might be due to a bioavailability-like issue in which some fraction of
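the initial dose passes straight through to feces. We include a fraction of $1 - f$ of
oral doses in the first sample of the feces.
$$y_{fec}(t_i) \sim N(x, elim_{fec}(\theta, t_i), y_{fec}.\tau(\theta, t_i)) \tag{11}$$

with the precision given by:
$$y_{fec}.\tau(\theta, t) = \left(\varepsilon^a_{fec} \times elim_{fec}(\theta, t)^{\varepsilon^b_{fec}} + \varepsilon^c_{fec}\right)^{-2} \tag{12}$$
and the elimination is calculated as:

$$elim_{fec}(\theta, t_i) = \int_{t_{i-1}}^{t_i} \left[ \frac{Cl_{fec}}{dose_{or}} C(\theta, t') + (1 - f) \frac{dose_{or}}{dose_{or} + dose_{iv}} \delta(t') \right] dt'$$
where $\delta(t)$ is the Dirac $\delta$-function defined as $\int \delta(t) f(t)\,dt = f(0)$.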
   Note that times earlier than the initial dosing time (assumed to be $t = 0$ here)
must be truncated when solving for concentrations. This is handled by the Blackbox
functions.
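
The interval elimination can be sketched numerically as follows. The Python snippet integrates a concentration curve over each collection interval with the trapezoid rule and scales by the route clearance divided by the dose. It assumes a one-compartment iv-bolus curve, and the function names, collection times, and urinary fraction are hypothetical illustrations rather than the authors' Blackbox functions.

```python
import math

def conc_iv(t, dose, vd, k):
    """Assumed one-compartment iv-bolus concentration; zero before dosing at t = 0."""
    return dose / vd * math.exp(-k * t) if t >= 0.0 else 0.0

def interval_fraction(t_prev, t_now, dose, vd, k, frac_route, n=2000):
    """(Cl_route / dose) * integral of C over (t_prev, t_now), trapezoid rule."""
    t_prev = max(t_prev, 0.0)                  # truncate times before dosing
    if t_now <= t_prev:
        return 0.0
    cl_route = frac_route * k * vd             # route clearance = fraction * total
    h = (t_now - t_prev) / n
    area = 0.5 * (conc_iv(t_prev, dose, vd, k) + conc_iv(t_now, dose, vd, k))
    for j in range(1, n):
        area += conc_iv(t_prev + j * h, dose, vd, k)
    return cl_route / dose * area * h

times = [0.0, 12.0, 24.0, 48.0, 72.0, 96.0]    # hypothetical collection times (h)
k, vd, dose = 30.3 / 166.4, 166.4, 1.0         # illustrative female-rat values
fractions = [interval_fraction(a, b, dose, vd, k, 0.77)
             for a, b in zip(times, times[1:])]
total = sum(fractions)
print([round(x, 4) for x in fractions], round(total, 4))
```

For this simple curve the interval fractions sum to the route fraction times the share of the dose eliminated by the last collection time, which gives a convenient consistency check.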
   Another aspect of the measurements of PFOA in feces and urine is that not all of
the urine and feces is recovered until the cages  are washed at the end of the study.
Usually, around 1 % of the total dose was found in the cage and residual feed. In one
instance, however, the  amount recovered was 16%. Because we do  not expect the
contribution to cage wash to be consistent from measurement-to-measurement, the
contribution to total cage wash would have to be estimated for each elimination data
point. For the sake of model simplicity  we have not done this, and  as a result we
have potentially increased estimates of ε^c_fec and ε^c_uri. When we later model a third
route of elimination, in addition to feces and urine, we can interpret this unknown
fraction as  an estimated average loss to  cage wash.
   The models for precision introduce several additional parameters which must be
estimated with the MCMC  analysis. For each of the parameters we must specify a
prior distribution reflecting our assumptions about that parameter. The priors
indicated in Table 3 were found to be sufficiently general that, for all types of data
analyzed and for both the one- and two-compartment models, there was minimal
posterior probability mass near the limits of the priors.

Table 3 Prior distributions assumed for error model parameters

Parameter    Distribution   Minimum   Maximum     Mean   SD
ε^a          Uniform        10^-2     10
ε^b          Normal                               1      1
ε^c          Uniform        10^-8     5 x 10^-3
ε^a_fec      Uniform        10^-2     20
ε^b_fec      Normal                               1      1
ε^c_fec      Uniform        10^-6     0.5
ε^a_uri      Uniform        10^-2     5
ε^b_uri      Normal                               1      1
ε^c_uri      Uniform        10^-6     0.5

Bayesian analysis

Markov Chain Monte Carlo assists Bayesian statistical analysis  by constructing
samples from the posterior distribution of the model parameters using the priors and
likelihood to construct a Markov Chain, a kind of random walk through the space
defined by the parameters: each new step in the walk being  a  new full set of
parameter values.  The chain is constructed  so  that, in the long run, the joint
probability  distribution  of the collection  of steps  converges to  the posterior
distribution  [30].  Generally, parameters  in  the posterior  distribution are not
independent of each other; entire sets of parameter values must be drawn from the
chain at once  to  preserve  inter-parameter correlations.  Multiple chains  can be
created from different initial conditions to determine if similar posterior distribu-
tions are reached [30].
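
The random walk described above can be illustrated with a minimal sketch of a random-walk Metropolis sampler for a single parameter. This is not the WinBUGS sampler used in the analysis; the function names, the standard normal target, the starting value and the step size are all assumptions chosen only to show how each step proposes a new parameter value and how the collected steps come to approximate the target distribution.

```python
import math
import random


def log_target(theta):
    """Unnormalized log density of a hypothetical standard normal posterior."""
    return -0.5 * theta * theta


def metropolis(log_density, start, n_steps, step_sd=1.0, seed=0):
    """Random-walk Metropolis: each step proposes a value near the current one."""
    rng = random.Random(seed)
    current = start
    current_lp = log_density(current)
    chain = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step_sd)
        proposal_lp = log_density(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


# Start far from the target to mimic an arbitrary initial condition.
chain = metropolis(log_target, start=5.0, n_steps=20000)
kept = chain[len(chain) // 2:]  # discard the first half as burn-in
mean = sum(kept) / len(kept)
var = sum((x - mean) ** 2 for x in kept) / (len(kept) - 1)
print(f"estimated mean {mean:.2f}, estimated variance {var:.2f}")
```

Running the sketch from several different `start` values and comparing the resulting estimates is the single-parameter analogue of creating multiple chains from different initial conditions.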
   Typically, priors for model parameters would be based on the results of previous
relevant  experiments, and physical constraints  relevant  to  the system being
modeled.  However,  there  are  few  very  stringent physical constraints  on the
parameters of the one- or two-compartment models used here, and we chose not to
base priors on previous experimental work  in order to evaluate the effectiveness of
Bayesian analysis  for a novel chemical. Priors for most parameters were  instead
minimally informative proper uniform prior distributions.
   To estimate the posterior distributions of the individual and hyper parameters,
two Markov chains were initialized using values drawn from the prior distributions
and several  hundred thousand iterations of the WinBUGS sampler were typically
run for both chains simultaneously on separate processors. Parameter  values in
consecutive samples  are often very  similar;  to minimize this  auto-correlation a
thinning interval n was selected such that only every nth sample was recorded and
the intervening samples were discarded.  Depending on the total  number of
iterations, the thinning interval was  varied so that 8,000 evenly  spaced sampler
states were recorded.  The first half of each  chain was treated as a "burn in" period
during which  the  chain converges to the  posterior distribution.  The final 4,000
sampled states were then taken as the estimate of the Bayesian posterior distribution
[30].
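
The thinning and burn-in bookkeeping can be sketched as follows. The function name and the choice to record the last iteration of each thinning interval are assumptions made for illustration; the 960,000 iterations used in the example are the chain length reported for the final two-compartment analysis.

```python
def thin_and_burn(n_iterations, n_recorded=8000):
    """Return the thinning interval and the iteration indices kept after burn-in."""
    interval = n_iterations // n_recorded  # thinning interval n
    # Record every nth iteration (0-based index of the last step in each interval).
    recorded = list(range(interval - 1, n_iterations, interval))[:n_recorded]
    burn_in = len(recorded) // 2  # the first half is treated as burn-in
    return interval, recorded[burn_in:]


interval, kept = thin_and_burn(960_000)
print(interval, len(kept))  # 120 4000
```

With 960,000 iterations the thinning interval is 120, so 8,000 states are recorded and the final 4,000 are retained as the posterior sample.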

   Estimating the posterior is itself an iterative process in which we ran the sampler
for two chains, evaluated their convergence and then, if the chains did not pass our
convergence tests, either modified our model or re-initialized using draws from the
chains and ran the chains  for more iterations.
   When using MCMC to perform a Bayesian analysis a primary concern is whether
a chain is sufficiently long to fully estimate the posterior  distribution.  Since it  is
possible  to obtain physically  plausible posterior distributions from unconverged
chains the utmost care must be taken  in assessing convergence [34,  35]. Though
convergence can never be proven, we apply the following four methods to estimate
chain convergence.
   We first visually inspected the traces of parameter value  with iterations to detect
transients dependent upon the initial conditions and obvious auto-correlation. We
then used the Convergence Diagnosis and Output Analysis (CODA) package [36] as
implemented for R [37]. We use Gelman's $\hat{R}$ quantity [38], which compares the
variance from multiple chains to the variance within each chain. Values close to
unity indicate convergence, and we typically run until $\hat{R}$ < 1.02 for every parameter.
We also applied the Raftery and Lewis criteria for estimating sufficient chain length
[39]. Following Vicini and Dodds' MCMC approach to Bayesian analysis of
hierarchical pharmacokinetics  models, we used  less restrictive  values for the
Raftery and Lewis convergence criteria than the CODA default [40]. For a target
quantile of 0.025 to 0.975 and a target accuracy of ±0.02 we estimated the chain
length sufficient  for a  90%  probability of attaining our  target accuracy.  These
parameters are appropriate when the parameter probability distributions do not have
"fat" tails [39].
   We found that, at the least, both the Raftery and Lewis criteria and the Gelman $\hat{R}$
were necessary to determine convergence: insufficiently long chains could pass
one without passing the other.
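
The comparison of between-chain and within-chain variance can be sketched as below. This is the basic textbook form of the Gelman and Rubin statistic for one parameter, written from its definition; it is not the CODA implementation, which may apply additional corrections that this sketch omits. The function name and the toy chains are hypothetical.

```python
import math
from statistics import mean, variance


def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    within = mean([variance(c) for c in chains])           # W: mean within-chain variance
    between_over_n = variance([mean(c) for c in chains])   # B / n: variance of chain means
    pooled = (n - 1) / n * within + between_over_n         # pooled variance estimate
    return math.sqrt(pooled / within)


# Hypothetical toy chains: two that overlap and one stuck in a different region.
chain_a = [0.0, 1.0, 2.0, 3.0]
chain_b = [1.0, 0.0, 3.0, 2.0]
chain_c = [10.0, 11.0, 12.0, 13.0]

print(gelman_rubin([chain_a, chain_b]))  # near or below 1: chains agree
print(gelman_rubin([chain_a, chain_c]))  # far above 1.02: chains disagree
```

In practice the statistic would be computed for every parameter on the post-burn-in samples, and sampling would continue until all values fall below the chosen threshold.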
   Comparing  prior to posterior distributions helps us recognize parameters that are
poorly identified by the available data, allowing us to either simplify the model or
choose different  prior  distributions.   Additionally,  analysis  of the  posterior
distributions provides a fourth criterion for assessing parameter estimates: large
probability mass near a boundary of the prior distribution indicates that we have not
used a sufficiently vague prior. If we observed evidence of an over-restrictive prior,
we widened the boundaries on the prior by roughly an order of magnitude and reran
the sampler.

Data analyzed

The data set analyzed includes the results from several different experiments [20] on
adult male and female Sprague-Dawley rats using radioactively labeled PFOA. The
mean and standard deviation of the body weights  were 229.9 ± 22.9 g for males
and  191.1 ± 12.4 g for females. Serial plasma concentration time course samples
were collected for doses of approximately 0.1, 1, 5 and 25 mg/kg of PFOA that
were administered by oral gavage in cohorts of four each of males and females. For
the male rats, samples were collected 0.25, 0.5, 1, 2, 4, 8, 12, 16, and 24 h after
dosing; then every 24 h through the 8th day; and then every 48 h through the 22nd day
for a total of 23 data points. For the female rats, for whom clearance was observed
to be much faster, samples were collected 0.25, 0.5, 1, 2, 4, 8, 12, 16, 24, 36, 48, 72,
and 96 h after dosing for a total  of 13 data points. Plasma concentration was also
monitored at the same time intervals for additional cohorts of four male and four
female rats that  were  administered  approximately 1 mg/kg intravenously. An
additional,  extended time course cohort of four  male and four female rats was
administered 0.1 mg/kg, by oral gavage. The males in the extended time course
cohort were monitored for 2,016 h (84 days) and the females were monitored for
312 h (13 days).
   For three additional cohorts of four male and four female rats each, PFOA was
administered by oral gavage at doses of approximately 1,  5 and  25 mg/kg. For the
male rats, feces and urine were collected after 4,  8,  12, and 24 h; then after 24-h
intervals through the 14th day; and then after 48-h intervals through the 28th day,
giving a total of 24 data points per male rat. For female rats the same time points
were collected only until the 7th day, giving a total of 10 data points. Radioactivity
was monitored in liquid samples using liquid scintillation counting  (LSC) and in
solid samples using combustion followed by LSC. The limit of quantitation was
1 ppb. Because PFOA is not  known to  be metabolized, radioactivity  from
metabolites was not considered as  an influence on the results of the mass-balance
experiments.
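
The data-point counts quoted for each sampling schedule can be checked with a short sketch. The helper name is hypothetical, and the reading of "through the Nth day" as ending at 24 x N hours after dosing is an assumption; under that reading the schedules reproduce the stated totals of 23, 13, 24 and 10 points.

```python
def schedule(early_hours, daily_through_day, alternate_through_day=None):
    """Sampling times in hours: early points, then every 24 h, then every 48 h."""
    times = list(early_hours)
    # Daily samples start on day 2 because the 24 h point is in early_hours.
    times += range(48, 24 * daily_through_day + 1, 24)
    if alternate_through_day is not None:
        start = 24 * daily_through_day + 48
        times += range(start, 24 * alternate_through_day + 1, 48)
    return times


male_plasma = schedule([0.25, 0.5, 1, 2, 4, 8, 12, 16, 24], 8, 22)
female_plasma = [0.25, 0.5, 1, 2, 4, 8, 12, 16, 24, 36, 48, 72, 96]
male_excreta = schedule([4, 8, 12, 24], 14, 28)
female_excreta = schedule([4, 8, 12, 24], 7)

print(len(male_plasma), len(female_plasma), len(male_excreta), len(female_excreta))
```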
   The entire data set includes 36 male and 36 female rats.  The investigators fed the
rats PMI Nutrition  International, LLC Certified Rodent Lab  Diet 5002. The rats
were  fasted overnight prior  to  dosing and  for  approximately 24  h afterwards.
Haskell Animal Welfare Committee guidelines were  followed.

Results

The medians of the distributions for the population means and standard deviations
of the pharmacokinetic parameters for the two-compartment model are summarized
in Table 4. Only a portion of the two-compartment results are reported in detail
here. Complete results for both the one- and two-compartment model analyses,
including simulated experiments, are available in on-line supplementary material.

Table 4 Means and standard deviations of two compartment model parameter distributions

                                          Male                  Female
Population parameter       Units      Mean      SD          Mean       SD
Vc                         ml/kg      142.10    37.61       166.40     46.80
V, second compartment      ml/kg      311.80    453.85      1400.00    2507.50
Cltot                      ml/kg/h    1.24      1.08        30.33      13.15
FRACfec                               0.12      0.06        0.08       0.06
FRACuri                               0.85      0.07        0.77       0.09
Cld                        ml/kg/h    3.17      6.20        4.30       2.21
ka                         1/h        5.18      29.45       3.01       2.93
Bioavailability f                     0.97      0.05        0.96       0.03

Two compartment pharmacokinetics for PFOA

To generate our final, converged chains we used 960,000 iterations, which required
over 63 h per chain for the plasma, urine  and feces data jointly.
  The distribution of predictions for the two-compartment model is  shown for male
rats in Fig. 1.  The addition of a second compartment leads to plasma concentration
predictions that better capture  the higher PFOA plasma concentration  and more
rapid  elimination at early times than the single-compartment model. The scatter of
the  predictions  is  slightly broader than with  the  simpler model, but  simulated
experiments using  means  of four simulated animals at each dose  look  more like
the experimental observations. Although the two-compartment model allows
two phases of clearance, the male extended time course data are still
underestimated, possibly indicating the need for an additional phase of even
slower redistribution and clearance.

[Figure 1: predicted and observed PFOA plasma concentration versus Time (h) for male rats; recoverable panel titles: 1 mg/kg Oral, 1 mg/kg Intravenous, 5 mg/kg Oral, 0.1 mg/kg Oral (Extended)]
Fig. 1  Distribution of predicted PFOA plasma concentrations using a two-compartment model for male
rats. We compare the means (□) and 95% confidence interval (vertical lines) of 500 simulated
experiments to the observations. The mean observations (•) from cohorts of four animals for each dosing
fall mostly within the confidence intervals

   For the male rats the predicted urine excretion, shown in Fig. 2, better matches
the observations  than the  one-compartment  model predictions.  The  predicted
excretion to feces, also shown in Fig. 2, is  generally higher than the observations for
the 5 mg/kg  dose, although  the  predictions  for  both  1 and  25 mg/kg  seem
appropriate.
   Though the excretion data also indicates a time lag—PFOA is observed in the
plasma and urine roughly 6 h before appearing in the feces—we had little success in
estimating this parameter using simple modifications to one- and two-compartment
pharmacokinetics  so we did not include a fecal time lag in our final model.

[Figure 2: predicted and observed PFOA excretion versus Time (h) for male rats; recoverable panel titles: 1 mg/kg - Feces, 1 mg/kg - Urine, 5 mg/kg - Feces, 5 mg/kg - Urine, 25 mg/kg - Urine]
Fig. 2 Distribution of predicted fecal and urinary PFOA excretion using a two-compartment model for
male rats. We compare the means (□) and 95% confidence interval (vertical lines) of 500 simulated
experiments to the observations. The mean observations (•) from cohorts of four animals for each dosing
fall mostly within the confidence intervals

[Figure 3: predicted and observed PFOA plasma concentration versus Time (h) for female rats; recoverable panel titles: 0.1 mg/kg oral, 1 mg/kg oral, 1 mg/kg iv, 5 mg/kg oral, 25 mg/kg oral, 0.1 mg/kg oral (Extended)]
Fig. 3 Distribution of predicted PFOA plasma concentrations using a two-compartment model for
female rats. We compare the means (□) and 95% confidence interval (vertical lines) of 500 simulated
experiments to the observations. The mean observations (•) from cohorts of four animals for each dosing
fall mostly within the confidence intervals. Although plasma samples were collected after 30 h, the
concentrations were below the limit of quantitation

   Because of the rapid clearance of PFOA from female rats, the predicted plasma
pharmacokinetics shown in Fig. 3 do not appear especially different from the
one-compartment case. One clear difference between the models, however, is that
the estimated  measurement  error is reduced for the  two-compartment model,
resulting in much better agreement at long times in the extended time-course data.
The mean excretion for female rats predicted using the two-compartment estimates,
shown in Fig. 4, is also very similar to the one-compartment case. However, the
uncertainty about these  predictions  is significantly  reduced and  the estimated
measurement error seems more similar to the observed point-to-point variation.

[Figure 4: predicted and observed PFOA excretion versus Time (h) for female rats; recoverable panel titles: 1 mg/kg - Feces, 1 mg/kg - Urine, 5 mg/kg - Feces, 5 mg/kg - Urine, 25 mg/kg - Feces, 25 mg/kg - Urine]
Fig. 4 Distribution of predicted fecal and urinary PFOA excretion using a two-compartment model for
female rats. We compare the means (□) and 95% confidence interval (vertical lines) of 500 simulated
experiments to the observations. The mean observations (•) from cohorts of four animals for each dosing
fall mostly within the confidence intervals

   In Fig. 5 we plot the mean and 95% quantiles for the posterior distribution for the
parameters for every male rat. All the parameter distributions are too broad to
demonstrate systematic dose-dependence. However, for the four male rats receiving
the lowest dose (0.1 mg/kg) but not monitored over extended time, the distribution
of Vc is centered noticeably higher than for all other male rats, including the
extended time-course low dose category. There is no similar change with dose for
the volume of distribution of the second compartment. One possible explanation for
the differences in Vc is that an error in the dosing of those animals caused them to
receive less than the desired 0.1 mg/kg. Separate parameter estimations including an
additional factor  multiplying the  doses  received  by  those  four  animals were
performed, resulting in an estimate that those animals may have received only 0.67
of the reported dose. Since there is no further evidence of this hypothesis, it was not
included in the final analyses. For the male excretion data, qualitatively different fits
are found for the 1 mg/kg dose group and one of the 5 mg/kg animals than are found
for  the higher-dose excretion data  and plasma  data. The total clearance, Cltot, is
substantially higher for these animals. Variation in the female parameter estimates
was less pronounced. The individual fits to the data for these animals indicate that
the feces data is better fit than the urine data. For the higher dose excretion data the
urine data is better fit than the feces. There is no discernable dose-dependence in the
means of the inter-compartmental clearance Cld, but the density of the distribution
for the lowest dose is much broader than for the higher doses. For the two-
compartment model there is no obvious dose dependence of fraction of PFOA
excreted to feces or urine.

[Figure 5: panels of individual parameter estimates; x-axis: Dose (mg/kg) with groups at 0.1, 1, 5 and 25]
Fig. 5 Dose dependence of two-compartment model parameters for male rats. The parameters for the
individual animals give an estimate of parameter dependence on dose and study type. Individuals (indicated
by tone) are grouped into four study types (indicated by symbol): orally dosed, plasma time-course (O),
orally dosed, plasma extended time-course (D), iv-dosed, plasma time-course (O), and orally dosed, urine
and feces time-course (A). Each symbol indicates the mean parameter values and upper and lower error bars
indicate the 0.975 and 0.025 quantiles respectively. The x-axis is not continuous; individuals have been
offset from their actual dose value for clarity

  As  a  reality  check on  our  pharmacokinetic estimates  we examined  the
dependence of certainty about individual parameter estimates  on the type of data
available  for and dosing  administered to that animal, as indicated by  symbol in
Fig. 5. As expected, while the estimated means do not significantly vary, the breadth
of the distributions depends strongly upon dose-type. For the intravenously dosed
animals the volume of the central compartment Vc is well known since  it is  not
confounded by the  rate of absorption ka.  Correspondingly, the probability density
for  ka is quite broad for iv-dosed  animals since this parameter is unrelated to the
experimental observations and is only informed by the population distribution. Also
as expected, for all of the parameters the distributions for animals where excretion
data was available are  much broader.
   The predicted exposure,  as quantified by the integrated area under the plasma
concentration curve,  is lower for the male rats when a two-compartment model is
used. The distribution  of the ratio of the two-compartment  AUC to the one-
compartment  AUC is centered near 2/3, indicating a reduced predicted exposure.
This lower ratio  is  driven  by the  excretion  data—when using the plasma data
alone the difference  between the one- and two-compartment models is  much less.
The ratio is closer to unity for the  female rats,  but is still peaked at slightly less
than one.

Measurement model parameter estimates

Using 500 draws from our  posterior parameter distributions, we calculated  a
distribution of  predictions  for  every observed time point and subtracted each
prediction from the observed value to obtain a distribution of residuals. We scaled
the residuals by the standard deviation predicted using the measurement model. In
Fig. 6 the median of these standardized residuals is plotted  against  the  median
prediction for each male  rat observation. For the  majority of data  points  the
predicted residuals are evenly distributed  about zero  and within  three standard
deviations across a wide range of values and experiment types. Some model predictions,
however,  deviate  significantly  from observations.  Additionally,  the  predicted
standard deviations appear  to be  over-estimated since the standardized residuals
are clustered near zero.
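
The residual calculation described in this paragraph can be sketched for a single observation as follows. The function name and the numerical values are hypothetical and purely illustrative; in the analysis each observation would have 500 predictions and 500 predicted standard deviations, one per posterior draw.

```python
from statistics import median


def standardized_residuals(observed, predictions, predicted_sds):
    """Residual for each posterior draw, scaled by its predicted standard deviation."""
    return [(observed - p) / s for p, s in zip(predictions, predicted_sds)]


# Hypothetical posterior draws for one observed time point.
observed = 1.2
predictions = [1.0, 1.1, 0.9, 1.3, 1.0]
predicted_sds = [0.2, 0.2, 0.1, 0.2, 0.25]

z = standardized_residuals(observed, predictions, predicted_sds)
# The point plotted for this observation: median residual against median prediction.
print(median(z), median(predictions))
```

A well-calibrated measurement model should give standardized residuals with a spread of roughly one; values clustered much closer to zero suggest that the predicted standard deviations are too large.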
 
Table 5 The medians of the estimated measurement parameters

Data      ε^a     ε^b     ε^c             Heteroskedasticity
Plasma    0.14    0.94    5.62 x 10^-8    0.646
Urine     2.11    1.85    2.36 x 10^-3    0.001
Feces     7.65    1.67    2.05 x 10^-4    0.068

   As was clear in Fig. 1, at low concentrations (~10^-5 mg/l) the male plasma
extended time-course observations are significantly underestimated and correspondingly
some of the residuals in Fig. 6 are as large as 100 standard deviations from the
actual observation and are beyond the plotted region. For the male excretion time
courses, large residuals are predicted for a few  of the large feces measurements,
corresponding in Fig.  2 to the observation of the large initial quantity of PFOA in
feces that we have approximated as being excreted in the first time point,  but
actually was occasionally distributed over the first few time points.
   In Table 5 we list the estimated measurement parameters for the plasma, urine
and feces data. To give a sense of heteroskedasticity—how the relative magnitude
of the error varies with concentration—we also indicate the ratio  of error for a
concentration of 10^-2 to the error for concentration 10^-5. Constant error would be
indicated by a ratio of 1.0, but there was pronounced variation in the magnitude of
the measurement error for all data sets, especially the excretion data.
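
The heteroskedasticity ratio can be sketched under two assumptions that are not stated explicitly in this section: that the measurement standard deviation has the form ε^a x C^(ε^b) + ε^c, and that "error" in the ratio means error relative to the concentration. The function names are hypothetical. Under these assumptions the plasma and urine medians from Table 5 give ratios close to the tabulated values.

```python
def relative_error(conc, eps_a, eps_b, eps_c):
    """Assumed measurement standard deviation divided by the concentration."""
    sd = eps_a * conc ** eps_b + eps_c
    return sd / conc


def heteroskedasticity(eps_a, eps_b, eps_c, high=1e-2, low=1e-5):
    """Ratio of the relative error at the high concentration to that at the low one."""
    return relative_error(high, eps_a, eps_b, eps_c) / relative_error(low, eps_a, eps_b, eps_c)


# Median measurement parameters for two of the data types in Table 5.
plasma = heteroskedasticity(0.14, 0.94, 5.62e-8)
urine = heteroskedasticity(2.11, 1.85, 2.36e-3)
print(f"plasma {plasma:.3f}, urine {urine:.3f}")
```

A purely proportional error (ε^b = 1, ε^c = 0) returns a ratio of 1.0, which corresponds to the constant-error case described above.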

Model comparison

To compare  the one-  and  two-compartment models and assess the  additional
information gained by jointly analyzing the plasma concentration and excretion data
we used the posterior distributions to predict simulated data, investigated obvious
discrepancies between model predictions and data,  and then  examined  general
model appropriateness  criteria for Bayesian  analyses. The  posterior predictive
distribution analysis is available in supplementary material on-line.

Unknown elimination  in one-compartment model

We  performed separate analyses for two  data  sets, one of just  the 24 plasma
concentration  curves   for  each  gender,  and  one  including  both  the  plasma
concentration time-courses as well as the  mass-balance  time courses for  the  12
additional animals of each gender for whom the amount of PFOA in feces and urine
was measured. When including the mass-balance data, separate clearances Clfec and
Cluri can be estimated for the feces and urine excretion.
   We first attempted to estimate the clearance Cl as the sum of separate clearances
Clfec and Cluri, but had difficulty creating converged chains. The number of degrees
of freedom pD estimated by WinBUGS became negative, indicating major problems
reconciling the data with the model [41, 42] (see the section "Model  appropriate-
ness  given the available data" for further discussion of pD). Especially of concern
was  that,  for a one compartment model  in  which  excretion  to urine  and feces

[Figure 8: two panels (one-compartment model, left; two-compartment model, right); x-axis: Fraction of Eliminated PFOA in Excretion]
Fig. 8 Distribution of the estimated fraction of PFOA eliminated from the plasma that is found in urine
and feces for male rats. By allowing a third, unknown route of excretion the total PFOA found in urine
and feces can be less than the estimated elimination from the plasma. The parameters estimated for the
one compartment model (left) are unable to account for nearly a third of the eliminated PFOA, while for
the two compartment model (right) only a small amount, if any, is missing

two-compartment model,  the discrepancy between unknown elimination and cage
wash is resolved.
   Somewhat the reverse is found for the female case: for the one-compartment
model, all of the eliminated PFOA is almost certainly accounted for by excretion.
When we use the two-compartment model, however,  we find that the uncertainty
has actually increased. This seems to indicate that using a two-compartment model
for female rats may  introduce unnecessary parameters.

Model appropriateness given the available data

Model parameter estimates were qualitatively assessed by examining the difference
between the prior and posterior parameter distributions to determine whether model
parameter estimates have  been  informed  by  the  available  data. For  example,
because of the rapid elimination of PFOA from the plasma of female rats  the role of
a second  compartment is ambiguous.  Estimating the breadth of the population
distribution for second compartment parameters was especially difficult. When only
plasma data was  used, the posterior distribution of the precision of the volume of
distribution for the second compartment is essentially identical to the assumed prior
distribution.  Only  when  excretion  data was  considered was there   sufficient
information to produce a posterior  parameter distribution  that  differed from the
prior. Visually  inspecting  the  posterior distributions sometimes  allowed rapid
determination of the appropriateness of some model parameters.
   As  a  measure of  quantitative model consistency,   Spiegelhalter et al.  [41]
introduced a Bayesian deviance describing the fit of the data to the posterior
distribution  of model parameters. They then  took the  difference  between this
posterior deviance and the deviance of the data when only the means of the posterior
distribution are used to estimate the complexity of the  data,  as characterized by the
estimated degrees of freedom, pD.


   Each parameter in the model contributes 1 estimated degree of freedom if it is
completely free, 0 if it is completely constrained, and some fraction between if it is
correlated with other parameters. For the one compartment model with plasma data
only we have four parameters per rat, eight hyper-parameters and three measure-
ment model parameters, so for  24 rats we have 107 total parameters. The pD is
48.37  for  the  males  and 46.29  for  the females,  indicating roughly two free
parameters per individual animal. This is unsurprising since many of the parameters
are interrelated, such as the bioavailability and the volume of distribution.
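
The parameter bookkeeping in this paragraph can be checked with a short calculation. The variable names below are ours; the counts and pD values are those quoted in the text.

```python
# Parameter count for the one-compartment model fitted to plasma data only.
n_rats = 24
params_per_rat = 4
hyper_params = 8
measurement_params = 3

total_params = n_rats * params_per_rat + hyper_params + measurement_params

# Estimated degrees of freedom (pD) reported in the text.
pd_male, pd_female = 48.37, 46.29
free_per_male = pd_male / n_rats
free_per_female = pd_female / n_rats

print(total_params)               # total nominal parameters
print(round(free_per_male, 2))    # effective free parameters per male rat
print(round(free_per_female, 2))  # effective free parameters per female rat
```

The total is 107 nominal parameters, while pD corresponds to roughly two effectively free parameters per animal.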
   When we jointly analyzed excretion and plasma data with the one-compartment
model, there were roughly 37 additional degrees of freedom for the males and 27 for
the females—more than two for each of the 12 additional animals. This indicates
that including the excretion  data  does provide a more informative picture of the
pharmacokinetics.
   We also found that the female feces data alone was  insufficient to estimate both
pharmacokinetic  and  error  model  parameters  (resulting  in negative estimated
degrees of freedom). A major advantage of analyzing the male and female data set
jointly is that the information about the error model parameters in the male feces
data allowed the female feces data to be used.
   When we examined the estimated free parameters for the two-compartment
model, we found that for both male and female plasma data nearly one
additional degree of freedom per individual animal is available for bi-phasic fits. It
is interesting to note that when we perform a joint analysis using the two-
compartment model, the feces data provides a significant number of the degrees
of freedom and actually constrains the fit to the plasma data. This would seem to
indicate that both fecal and plasma data  are important for  characterizing PFOA
pharmacokinetics.
   One method for evaluating the goodness of fit of a Bayesian analysis of a
particular model is the Deviance Information Criterion (DIC). Like the Akaike
Information Criterion or the Bayesian Information Criterion, the DIC is intended to
allow model comparison by characterizing the additional information gained by
including additional parameters. The DIC is specifically designed for Bayesian
analyses of hierarchical models. Roughly, the DIC is the difference between the
deviance of the observations from the predictions for each individual and the
deviance of the observations from the predictions for the population means. If a
model includes additional parameters that simply improve the "fit" of the mean
trends of the data, then the DIC is higher. If the additional parameters actually
describe experimental variation, then the DIC is lower. The DIC is meant to
discriminate between models, with the better model having a lower value [41].
   For all combinations of data examined, listed in Table 6, the two-compartment
model has a lower value of DIC. This suggests that the additional complexity of a
second compartment is actually supported by the data. Note that all of the DIC's
calculated  here are negative.  This is  because we  examined parameters with
probability  densities  much  greater than  one, resulting  in negative,  but still
appropriate, values [41].
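
A small numerical illustration of why the deviance, and hence the DIC, can be negative: when observations are tightly clustered, the probability density at each observation exceeds one, its logarithm is positive, and -2 times the log-likelihood is negative. The observations and standard deviations below are hypothetical and chosen only to show the sign change.

```python
import math

def normal_log_density(x, mu, sigma):
    """Log of the Normal(mu, sigma) probability density at x."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))

def deviance(xs, mu, sigma):
    """Deviance (-2 * log-likelihood) of independent observations xs."""
    return -2.0 * sum(normal_log_density(x, mu, sigma) for x in xs)

obs = [0.101, 0.099, 0.100, 0.102, 0.098]  # hypothetical tightly clustered values

peak_density = math.exp(normal_log_density(0.100, 0.100, 0.002))  # well above 1
tight = deviance(obs, mu=0.100, sigma=0.002)  # densities > 1, so deviance < 0
wide = deviance(obs, mu=0.100, sigma=2.0)     # densities < 1, so deviance > 0

print(peak_density, tight, wide)
```

Only differences in deviance between models fitted to the same data are meaningful, so a negative value is not in itself a sign of a problem.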

Table 6  Comparison of Deviance Information Criterion for one- and two-compartment models

                   One comp.              Two comp.
Data               Plasma      Joint      Plasma      Joint
Male plasma         -8295      -8294       -8745      -8808
Male urine                     -2113                  -2049
Male feces                     -2196                  -2511
Male total          -8295     -12603       -8745     -13368
Female plasma       -2387      -2387       -2655      -2689
Female urine                    -618                   -676
Female feces                    -891                   -945
Female total        -2387      -3896       -2655      -4310

   The success of the models was also examined by comparing the different
distributions predicted for plasma concentration integrated over time from the origin
to infinity (the AUC). The AUC was normalized across doses using the administered
dose and the volume of the primary compartment. For the one-compartment
model, the normalized AUC is equivalent to the clearance. As shown in Fig. 9, for
the male rats and the one-compartment model (in the top row), the AUC is more
precisely estimated when the excretion data is used than when plasma data is used
alone. The means of the two distributions are roughly similar, however. For the two-
compartment model (in the bottom row) the addition of excretion data shifts the
AUC to lower values. This indicates that the predicted pharmacokinetics are
somewhat different for our most elaborate analysis. Since the DIC is the lowest for
this analysis, it is likely that the male PFOA AUC is lower than would be estimated
from the one-compartment model and/or plasma data alone.

[Figure 9: four distribution panels; x-axis: Normalized Area Under the Curve (h), ticks at 100, 200, 300]

Fig. 9 Distribution of integrated plasma concentration in male rats for combinations of models and data.
We integrate the plasma concentration from zero to infinity (AUC) for one- (top) and two- (bottom)
compartment models for Bayesian estimations conducted with the plasma data alone (left) or plasma,
feces and urine data jointly (right). The AUC is calculated for a 25 mg/kg oral dose and is then
normalized by 25 mg/kg and the volume of the primary compartment

[Figure 10: four distribution panels; x-axis: Normalized Area Under the Curve (h), ticks from 0 to 10]

Fig. 10 Distribution of integrated plasma concentration in female rats for combinations of models and
data. We integrate the plasma concentration from zero to infinity (AUC) for one- (top) and two- (bottom)
compartment models for Bayesian estimations conducted with the plasma data alone (left) or plasma,
feces and urine data jointly (right). The AUC is calculated for a 25 mg/kg oral dose and is then
normalized by 25 mg/kg and the volume of the primary compartment. Note that the x- and y-axis scalings
are changed from Fig. 9 to better show the details of the distributions

   For the AUCs for the female rats shown in Fig. 10, the modes of the distributions
of AUC are  very  similar for  all models and combinations of data. However, it is
clear that for plasma data alone there is still some possibility of AUCs much
smaller than the mode. The  inclusion of  excretion data appears to rule out the
possibility of much smaller AUCs for female rats.
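
To make the normalized AUC concrete, the sketch below evaluates it for a one-compartment model with first-order oral absorption and elimination. The functional form is the standard textbook solution and is assumed here rather than taken from this paper; the parameter values are hypothetical, and only the 25 mg/kg dose comes from the figure captions.

```python
import numpy as np

def conc_oral_one_compartment(t, dose, f_abs, ka, ke, v):
    """Plasma concentration after an oral dose (first-order absorption and elimination)."""
    return f_abs * dose * ka / (v * (ka - ke)) * (np.exp(-ke * t) - np.exp(-ka * t))

def auc_analytic(dose, f_abs, ke, v):
    """Integral of the concentration from zero to infinity."""
    return f_abs * dose / (v * ke)

dose = 25.0                               # mg/kg, as in Figs. 9 and 10
f_abs, ka, ke, v = 0.9, 1.5, 0.01, 0.2    # hypothetical: fraction, 1/h, 1/h, L/kg

# Numerical check of the analytic AUC with the trapezoid rule on a long time grid.
t = np.linspace(0.0, 3000.0, 300001)
c = conc_oral_one_compartment(t, dose, f_abs, ka, ke, v)
auc_numeric = float(np.sum(0.5 * (c[1:] + c[:-1]) * np.diff(t)))

auc_exact = auc_analytic(dose, f_abs, ke, v)
auc_normalized = auc_exact * v / dose     # hours; reduces to f_abs / ke

print(auc_numeric, auc_exact, auc_normalized)
```

Under these assumptions the normalized value reduces to the absorbed fraction divided by the elimination rate constant, which has units of hours, matching the axes of Figs. 9 and 10. Applying the same calculation to every posterior draw gives a distribution of normalized AUC rather than a single value.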
Discussion

We have performed Bayesian estimation of the distribution of parameters for one-
and  two-compartment pharmacokinetics of PFOA in male and female rats. We
found  that estimates could be obtained by analyzing 24 rats of each gender  for
which  plasma concentration  data is  available  alone, and that also  including
excretion data from twelve additional rats of each gender improved these estimates.
   We conducted this research with two goals. We wished not only to characterize
the pharmacokinetics of PFOA in rats, but also to assess the benefits of Bayesian
analysis of pharmacokinetic models in general, as Bayesian methods are likely to be
of great value for more complex models such as PBPK [43].

The pharmacokinetics of PFOA

For both male and female rats we find that pharmacokinetic parameter estimates are
improved when plasma data is carefully combined with excretion data. Our initial
attempts to model elimination  as a sum of the two types of excretion resulted in
parameter estimates where uncertainty increased as additional data was used. Since
additional data made the parameter estimates for the initial model  seem more
unlikely, we modified the model; at first by allowing a third,  unspecified  route of
elimination. With this change uncertainty was reduced by  including additional data
but the  estimated amount of PFOA eliminated by this unknown route was quite
large.
   What appeared to  be a third elimination route for the  one-compartment model
was consistent with the distribution to  a second, deep-tissue compartment: for a
two-compartment model the estimated amount of PFOA  cleared by the unknown
route was negligible. Thus, analyzing the uncertainty about one-compartment model
parameter estimates and jointly considering the plasma and excretion data  forced a
more complicated, though still relatively simple, model. This is the main reason we
believe  that the two-compartment model is more appropriate for PFOA  pharma-
cokinetics in the male rat.
   A quantitative measure of model fitness, the DIC, also indicates that the two-
compartment  model  is better supported by available data. This means  that the
additional uncertainty from estimating two more parameters per individual appears
to be offset by better matching the data. For the female data, however, the plasma
AUC does not seem  to differ significantly between the one- or two-compartment
model. This suggests that for some applications simple, empirical models may be
adequate to describe PFOA  pharmacokinetics in female rats.
   For male Sprague-Dawley rats we estimate a population mean clearance of
1.24 ± 1.08 ml/kg/h and a total volume of distribution of 453.9 ± 455.41 ml/kg,
compared to 2.1 ± 0.6 ml/kg/h and 345.6 ± 57.3 ml/kg in Kudo et al.'s
[16] studies of Wistar rats. For female Sprague-Dawley rats we estimate a clearance
of 30.33 ± 13.15 ml/kg/h  and a  total volume  of distribution  of   1566.40 ±
2507.94 ml/kg compared to 93.06 ± 33.54 ml/kg/h and 211.2 ± 28.2 ml/kg for
the Wistar rats of Kudo et al. [16]. Because of the rapid  clearance of PFOA from
female  rats, the larger  volume of distribution is likely due to the  diminished
influence of, and thus greater uncertainty about, the second compartment for the
female time-courses—since we report the medians of parameter distributions, broad
distributions will have large median values.
   PFOA clearance is strikingly faster in female than in male rats, complicating
extrapolation of rat exposures to humans. In both cases the clearance is much more
rapid than would be consistent with the 3.8-year human half-life [11], and
extrapolation is further complicated by the lack of a sexual dimorphism in human
clearance [13]. It has been hypothesized that renal transporters may act on PFOA,
inhibiting renal excretion [12]. The function of these transporters has been shown
to be modulated by sex hormones in rats, possibly explaining the difference between
rat genders [17]. However, in vitro studies of the affinity of these transporters for
PFOA have not  found significant differences between rats and humans, possibly
indicating that expression of the transporters may drive the interspecies differences
[18].
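
As a rough sense of scale for this comparison, the clearance and total volume of distribution quoted above can be converted to an approximate half-life. The sketch assumes the simple first-order relation t1/2 = ln(2) V / CL applied to the point estimates only. This ignores the two-compartment structure and the large uncertainties reported for these parameters, so the result is an order-of-magnitude illustration and not an estimate made in this analysis.

```python
import math

def half_life_h(volume_ml_per_kg, clearance_ml_per_kg_h):
    """Approximate first-order half-life, ln(2) * V / CL, in hours."""
    return math.log(2.0) * volume_ml_per_kg / clearance_ml_per_kg_h

# Point estimates quoted in the text for Sprague-Dawley rats.
male_h = half_life_h(453.9, 1.24)
female_h = half_life_h(1566.40, 30.33)

# Human half-life of 3.8 years [11], converted to hours.
human_h = 3.8 * 365.25 * 24.0

print(f"male rat:   {male_h:.0f} h ({male_h / 24.0:.1f} days)")
print(f"female rat: {female_h:.0f} h")
print(f"human:      {human_h:.0f} h, about {human_h / male_h:.0f} times the male rat value")
```

On this rough basis the male rat half-life is on the order of 250 h and the female on the order of 36 h, both far shorter than the human value.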
  In studies of Sprague-Dawley rats by Vanden Heuvel et al. [15] 91% of PFOA
was eliminated through urine within  24 h of dosing female rats. In male rats, the
cumulative excretion of PFOA was 36.4% to urine and 35.1% to feces after 28 days
(672 h). Using a one-compartment model and the data of Kudo et al. [16], Harada
et al. [44] calculated that 5 h after dosing, 91.4 ± 50.2% of PFOA that was cleared
from plasma had been excreted through urine  in males, while 47.2 ± 22.1% of
cleared PFOA was excreted through urine in females. Using the two compartment
model we  estimated that  overall, 77 ± 9%  is  excreted to urine and 8 ± 6% is
excreted to feces in female Sprague-Dawley rats in the Kemper data [20]. For male
Sprague-Dawley rats,  we  estimate  that 85 ±  7%  and 12 ± 6% of the PFOA
eliminated  from  plasma is excreted to urine and feces respectively.
  The presence of residual PFOA in  the  animal carcass,  confirmed by the
experimenters [20], indicates that more complicated models that include additional,
deeper compartments may be needed to accurately describe PFOA pharmacokinet-
ics. Research by Kudo et al. [45] has  found that small (0.041 mg/kg) and  large
(16.56 mg/kg) doses of PFOA  distribute very differently  to liver in Wistar rats,
making it very likely that the PFOA kinetics are concentration-dependent. We did
not observe trends among the two-compartment parameter estimates for animals
receiving different doses, but since we assumed that all of our parameters were
dose-independent, only very large trends should be observable.
  Concentration-dependent  models for PFOA PK, possibly including differential
levels of transporter expression, should be investigated in order to develop inter-
species extrapolations. For  such  models it will not be possible to use analytic
pharmacokinetic solutions, since the kinetics will change as the PFOA is distributed
and cleared.  In  order to perform a  Bayesian analysis  using  MCMC it will  be
necessary to solve differential equations for the relevant parameters  for  each
iteration of the Markov chains—greatly increasing the time  per iteration. If new
models will allow better  interspecies extrapolation then this is a computational
hurdle worth overcoming.
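
To indicate what each Markov chain iteration would then involve, the sketch below integrates a two-compartment model with saturable (Michaelis-Menten) elimination using a fixed-step fourth-order Runge-Kutta scheme. The model form, the parameter names and all values are hypothetical; they illustrate one possible concentration-dependent model and are not a model fitted in this work.

```python
import numpy as np

def rhs(state, k12, k21, vmax, km, v1):
    """Rates of change for amounts in the central and deep compartments."""
    a1, a2 = state                      # amounts (mg/kg)
    c1 = a1 / v1                        # central concentration (mg/L)
    elim = vmax * c1 / (km + c1)        # saturable elimination (mg/kg/h)
    return np.array([-k12 * a1 + k21 * a2 - elim,
                     k12 * a1 - k21 * a2])

def rk4(state, dt, n_steps, **params):
    """Fixed-step fourth-order Runge-Kutta integration; returns the full trajectory."""
    out = np.empty((n_steps + 1, 2))
    out[0] = state
    for i in range(n_steps):
        s = out[i]
        k1 = rhs(s, **params)
        k2 = rhs(s + 0.5 * dt * k1, **params)
        k3 = rhs(s + 0.5 * dt * k2, **params)
        k4 = rhs(s + dt * k3, **params)
        out[i + 1] = s + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return out

# Hypothetical parameter values (1/h, 1/h, mg/kg/h, mg/L, L/kg).
params = dict(k12=0.05, k21=0.02, vmax=0.5, km=10.0, v1=0.2)

# 25 mg/kg placed in the central compartment at time zero, followed for 200 h.
traj = rk4(np.array([25.0, 0.0]), dt=0.1, n_steps=2000, **params)
total = traj.sum(axis=1)
print(total[0], total[-1])
```

Every proposed parameter set in the chain would require one such integration per animal, which is the source of the additional time per iteration noted above.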

Bayesian analysis of pharmacokinetic models

Using fairly uninformative, though bounded, uniform priors we can obtain estimates
of some parameters both for specific test animals and the population from which the
test animals were drawn. We specify three models, one each for the pharmacokinetics,
the distribution of parameters within the population of rats, and the
measurements performed  on each test animal. A clear benefit of the  Bayesian
approach is the requirement for explicit,  quantitative descriptions of the pharma-
cokinetic model  and prior knowledge of the parameter distributions. Estimates
obtained  given a fully described Bayesian analysis  can be replicated  because  no
further "goodness of fit" assumptions are made about the values of the parameters.
If the Markov chains have converged, then a representative sample of the possible
combinations of parameters has been examined with respect to the likelihood of
having produced the observations.
   Convergence  is  a  primary  drawback  of Bayesian  analysis  performed with
MCMC,  especially when using uninformative priors. For the empirical models
studied here, chain lengths on  the order of a million iterations were required to
determine  posterior distributions.  These  models  have  analytic  solutions,  so
determining the predictions based on the new parameters for each iteration  is
relatively rapid. If more complicated models require numerically solving systems of
differential equations,  then the time for each iteration may increase greatly. Part of
the need for large numbers of  iterations is driven, however, by  the necessity of
vague prior distributions on empirical pharmacokinetic parameters. One advantage
of more complicated, biologically based models may be that there should be a trade-
off between the time needed per iteration  to solve more sophisticated models and
the additional available information about physiological parameter values that can
reduce the  number of iterations  needed to  converge.
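
One widely used check on convergence is the potential scale reduction factor of Gelman and Rubin [38], which compares the variance between parallel chains with the variance within them. The sketch below is a minimal version of that statistic for a single scalar parameter; the simulated chains are stand-ins for real sampler output, and we do not claim this is the exact procedure applied to the models here.

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor for an array of shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    between = n * chains.mean(axis=1).var(ddof=1)   # between-chain variance B
    within = chains.var(axis=1, ddof=1).mean()      # mean within-chain variance W
    var_hat = (n - 1) / n * within + between / n    # pooled variance estimate
    return float(np.sqrt(var_hat / within))

rng = np.random.default_rng(1)

# Four well-mixed chains sampling the same distribution.
mixed = rng.normal(0.0, 1.0, size=(4, 5000))

# The same chains with two of them offset, mimicking chains stuck in different regions.
stuck = mixed + np.array([[0.0], [0.0], [3.0], [3.0]])

r_mixed = gelman_rubin(mixed)
r_stuck = gelman_rubin(stuck)
print(r_mixed, r_stuck)
```

Values close to 1 are consistent with convergence, while values well above 1 indicate that the chains have not yet explored the same distribution and more iterations are needed.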
   There  are  many advantages of  having performed a  Bayesian  analysis  of
pharmacokinetic  data.  Using  sets  of parameters  drawn from  the  posterior
distribution, we are able to easily calculate probability distributions for quantities
that depend upon the parameters, e.g. the Area Under the Curve for the plasma
concentration  as  a measure of tissue exposure.  By  analyzing a hierarchical
statistical model we also found distributions for the population hyper-parameters,
allowing  us  to  simulate  new  individuals under various  dose  conditions. The
hierarchical model also allowed us to analyze the gains and uncertainty in using
different  types of data—plasma  concentration and  excretion—to  inform the
estimates of different parameters that are not identifiable with just one type of the
data alone.
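
The simulation of new individuals from population hyper-parameters can be sketched as follows. The sketch assumes a log-normal population distribution for clearance and complete absorption, and it uses randomly generated stand-ins in place of real posterior draws; all names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n_draws = 4000

# Stand-ins for posterior draws of the population hyper-parameters of log clearance.
# In practice these would be read from the MCMC output.
mu_log_cl = rng.normal(np.log(1.2), 0.1, size=n_draws)    # population mean of log CL
sd_log_cl = np.abs(rng.normal(0.3, 0.05, size=n_draws))   # population SD of log CL

# One simulated new individual per hyper-parameter draw, so that both
# uncertainty (in the hyper-parameters) and variability (between animals) propagate.
cl_new = np.exp(rng.normal(mu_log_cl, sd_log_cl))         # ml/kg/h

dose = 25.0                                               # mg/kg, assumed fully absorbed
auc_new = dose / cl_new                                   # mg*h/ml

lo, med, hi = np.percentile(auc_new, [2.5, 50.0, 97.5])
print(f"AUC median {med:.1f}, 95% interval {lo:.1f} to {hi:.1f} mg*h/ml")
```

Any other quantity that depends on the parameters can be summarized in the same way by evaluating it for every draw.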
   Another, very useful, outcome of Bayesian analysis is that  we identify certain
parameters that, while possibly  important to the underlying biology or pharmaco-
kinetics,  simply are not sufficiently informed by the data to be estimated. When the
estimated posterior distribution  is too similar to the assumed prior distribution we
know that our assumptions, and not the data, are driving the parameter estimation.
Recognizing when these situations might occur could inform the design of potential
future experiments.
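
A simple numerical companion to visual inspection is the ratio of the posterior to the prior standard deviation for each parameter. The sketch below uses made-up draws and a hypothetical helper function; a ratio near 1 suggests that the data have added little to the prior, while a ratio well below 1 suggests an informed parameter.

```python
import numpy as np

def sd_ratio(prior_draws, posterior_draws):
    """Posterior standard deviation divided by prior standard deviation."""
    return float(np.std(posterior_draws, ddof=1) / np.std(prior_draws, ddof=1))

rng = np.random.default_rng(3)

prior = rng.uniform(0.0, 10.0, size=20000)        # bounded uniform prior
informed = rng.normal(4.0, 0.3, size=20000)       # posterior narrowed by the data
uninformed = rng.uniform(0.0, 10.0, size=20000)   # posterior indistinguishable from the prior

ratio_informed = sd_ratio(prior, informed)
ratio_uninformed = sd_ratio(prior, uninformed)
print(ratio_informed, ratio_uninformed)
```

This summary only captures changes in spread. A posterior that shifts without narrowing would need a comparison of the full distributions.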
   Problems with estimating parameters also inform model design. For instance,
when we found that we could not constrain the total clearance to be the sum of the
fecal and urinary excretion, we changed to estimating excretion as a fraction of the
total clearance and learned that there may be some fraction of the elimination of
PFOA from the plasma for which we do not account.
   Finally, Markov-chain Monte Carlo approaches to Bayesian analysis allow direct
comparison of different models. As in our comparison of one- and two-compartment
empirical models, we can generally use the DIC to evaluate whether additional
information is gained from the data by the inclusion of additional model parameters.
In this way we can determine whether to expand or contract a model. This technique
should be extremely useful in constructing and comparing physiologically based
pharmacokinetic models.

Acknowledgments and Disclaimer  The United States Environmental Protection Agency through its
Office of Research and Development funded and managed the research described here. Interagency
Agreement  RW-75-92207501  with the National Toxicology Program  at  the National Institute  for
Environmental Health Science was a partial source of funding. This research has been subjected to
Agency review and approved for publication.
References

  1. Andersen ME, Clewell HJ, Frederick CB (1995) Applying simulation modeling  to problems in
    toxicology and risk assessment—a short perspective. Toxicol Appl Pharmacol 133:181-187
  2. Gelman A, Bois FY, Jiang J (1996) Physiological pharmacokinetic analysis using population mod-
    eling and informative prior distributions. J Am Stat Assoc 91:1400-1412
  3. Parham FM, Matthews HB, Portier CJ (2002) A physiologically based pharmacokinetic model of
    p,p'-dichlorodiphenylsulfone. Toxicol Appl Pharmacol 181:153-163
  4. Keys DA, Wallace DG, Kepler TB, Conolly RB (1999)  Quantitative evaluation of alternative
    mechanisms  of blood and testes disposition of di(2-ethylhexyl) phthalate and mono(2-ethylhexyl)
    phthalate in rats. Toxicol Sci 49:172-185
  5. Collins AS, Sumner SCJ, Borghoff SJ, Medinsky MA (1999) A physiological model for tert-amyl
    methyl ether and tert-amyl alcohol: hypothesis testing of model structures. Toxicol Sci 49:15-28
  6. Cobelli C, DiStefano JJ (1980) Parameter and structural identifiability concepts and ambiguities: a
    critical review and analysis. Am J Physiol Regul, Integr Comp Physiol 239:7-24
  7. Bernillon P,  Bois FY (2000) Statistical issues in toxicokinetic modeling: a Bayesian perspective.
    Environ Health Perspect 108:883-893
  8. Mezzetti M,  Ibrahim JG, Bois FY, Ryan LM, Ngo L,  Smith TJ (2003) A Bayesian compartmental
    model for the evaluation of 1,3-butadiene metabolism. J R Stat Soc Ser C-Appl Stat 52:291-305
  9. Gueorguieva I, Aarons L, Rowland M (2006) Diazepam pharamacokinetics from preclinical to phase
    I using a Bayesian population physiologically based pharmacokinetic model with informative prior
    distributions  in Winbugs. J Pharmacokinet Pharmacodyn 33:571-594
10. Lau C,  Anitole K, Hodes C, Lai D, Pfahles-Hutchens A, Seed J (2007) Perfluoroalkyl acids: a review
    of monitoring and toxicological findings. Toxicol Sci 99:366-394
11. Olsen GW, Burris JM, Ehresman  DJ, Froehlich JW, Seacat AM, Butenhoff JL, Zobel LR  (2007)
    Half-life  of  serum  elimination  of perfluorooctanesulfonate, perfluorohexanesulfonate, and perflu-
    orooctanoate in retired fluorochemical production workers. Environ Health Perspect 115:1298-1305
12. Andersen ME, Clewell HJ, Tan Y-M,  Butenhoff JL, Olsen GW (2006) Pharmacokinetic modeling of
    saturable, renal resorption of perfluoroalkylacids in  monkeys-probing the determinants of long
    plasma half-lives. Toxicology 227:156-164
13. Calafat AM,  Wong  L-Y, Kuklenyik Z, Reidy JA, Needham LL (2007) Polyfluoroalkyl chemicals in
    the US population:  data from the National Health and Nutrition Examination Survey (NHANES)
    2003-2004 and comparisons with NHANES 1999-2000. Environ Health Perspect 115:1596-1602
14. Washburn ST, Bingman TS, Braithwaite SK,  Buck RC, Buxton LW, Clewell HJ, Haroun LA,
    Kester  JE, Rickard RW,  Shipp AM (2005)  Exposure assessment and risk characterization for
    perfluorooctanoate in  selected consumer articles. Environ Sci Technol 39:3904-3910
15. Heuvel JPV,  Kuslikis BI, Rafelghem MJV, Peterson RE (1991) Tissue distribution, metabolism, and
    elimination of perfluorooctanoic acid  in male and female rats. J Biochem Toxicol 6:83-92
16. Kudo N, Katakura  M, Sato Y, Kawashima Y  (2002) Sex hormone-regulated renal  transport of
    perfluorooctanoic acid. Chemico-Biol Interact 139:301-316
17. Katakura M, Kudo  N, Tsuda T, Hibino Y, Mitsumoto A, Kawashima Y (2007) Rat organic anion
    transporter 3  and organic anion transporting polypeptide 1 mediate perfluorooctanoic  acid transport. J
    Health  Sci 53:77-83
18. Nakagawa H et al. (2007) Roles of organic anion transporters in the renal excretion of
    perfluorooctanoic acid. Basic Clin Pharmacol Toxicol 103:1-8
19. Loveless  SE, Finlay C, Everds NE, Frame SR, Gillies PJ, O'Connor JC, Powley CR, Kennedy GL
    (2006)  Comparative responses of rats and mice exposed to  linear/branched, linear, or branched
    ammonium perfluorooctanoate (apfo). Toxicology 220:203-217
20. Kemper RA  (2003) Perfluorooctanoic acid: toxicokinetics in the rat, DuPont Haskell Laboratories,
    Laboratory Project ID: DuPont-7473.  USEPA Administrative Record AR-226.1499
21. Butenhoff JL, Gaylor DW, Moore JA, Olsen GW, Rodricks J, Mandel JH, Zobel LR (2004) Char-
    acterization of risk for general population exposure to perfluorooctanoate. Regul Toxicol Pharmacol
    39:363-380
22. Portier CJ,  Lyles CM (1996) Practicing safe modeling:  GLP for biologically based mechanistic
    models. Environ Health Perspect 104:806
23. Sheiner LB, Beal SL (1980) Evaluation  of methods for estimating population pharmacokinetic
    parameters. I. Michaelis-Menten model: routine clinical pharmacokinetic data.  J Pharmacokinet
    Pharmacodyn 8:553-571
24. Lunn DJ, Thomas A, Best N, Spiegelhalter D (2000) WinBUGS—a Bayesian modeling framework:
    concepts, structure and extensibility. Stat Comput 10:325-337
25. Lunn DJ, Best NG, Thomas A, Wakefield J, Spiegelhalter D (2002) Bayesian analysis of population
    PK/PD models: general concepts and software. J Pharmacokinet Pharmacodyn 29:271-307
26. Ihaka  R, Gentleman R (1996) R: a language for data analysis and graphics. J Comput Graph Stat
    5:299-314
27. Sturtz S, Ligges U, Gelman A (2005) R2WinBUGS: a package for running WinBUGS from R. J Stat
    Software, 12
28. Lunn DJ (2003) WinBUGS development interface (WBDev). ISBA Bull 10:10-11
29. Oberon Microsystems Inc (2001) Component pascal language report
30. Gelman A, Carlin JB, Stern HS, Rubin DB (2004) Bayesian data analysis, 2nd edn. Chapman and
    Hall/CRC
31. Gelman A (2006) Prior distributions for variance parameters in hierarchical models. Bayesian Anal
    1:515-533
32. Lambert PC, Sutton AJ, Burton PR, Abrams  KR, Jones DR (2005) How vague is vague? A simu-
    lation  study of the impact of the use  of vague prior distributions in MCMC using WinBUGS. Stat
    Med 25:2401-2428
33. Rocke DM, Lorenzato S (1995) A two-component model for measurement error in analytic chemistry.
    Technometrics 37:176-184
34. Bois FY, Gelman A, Jiang J, Maszle D, Zeise L, Alexeef G (1996) Population toxicokinetics of
    tetrachloroethylene. Arch Toxicol 70:347-355
35. Chiu WA, Bois FY (2006) Revisiting the population toxicokinetics of tetrachloroethylene. Arch
    Toxicol 80:382-385
36. Best NG, Cowles MK, Vines K (1995) CODA: convergence diagnosis and output analysis for Gibbs
    sampling output
37. R Development Core Team (2008)  R: a language and environment for statistical computing, ISBN
    3-900051-07-0, URL http://www.R-project.org
38. Gelman A, Rubin DB (1992) Inferences from iterative simulation using multiple sequences. Stat Sci
    7:457-472
39. Raftery AE, Lewis SM (1992) Comment: one long run with diagnostics: implementation strategies
    for Markov Chain Monte Carlo. Stat Sci 7:493-497
40. Dodds MG, Vicini P (2004) Assessing convergence of Markov Chain Monte Carlo simulations in
    hierarchical Bayesian models for population pharmacokinetics. Ann Biomed Eng 32:1300-1313
41. Spiegelhalter DJ, Best NG, Carlin BP (2002) Bayesian measures of model complexity and fit. J R Stat
    Soc B 64:583-639
42. Spiegelhalter DJ, Best NG, Carlin BP  (1998) Bayesian deviance, the effective number of parameters,
    and the comparison  of arbitrarily complex models. Technical Report, Medical Research Council
    Biostatistics Unit, Cambridge, UK
43. Barton HA et al. (2007) Characterizing uncertainty and variability in physiologically based  phar-
    macokinetic models: state of the science and needs for research and implementation. Toxicol Sci
    99:395-402
44. Harada K, Inoue K, Morikawa A,  Yoshinaga T, Saito N, Koizumi A  (2005) Renal clearance of
    perfluorooctane sulfonate  and perfluorooctanoate in humans and  their species-specific excretion.
    Environ Res 99:253-261
45. Kudo N, Sakai A, Mitsumoto A, Hibino Y, Tsuda T, Kawashima Y (2007) Tissue distribution and
    hepatic subcellular distribution of perfluorooctanoic acid at low dose are different from those at high
    dose in rats. Biol Pharm Bull 30:1535-1540
46. Stone  CJ, Hansen M, Kooperberg C, Truong YK (1997) The use  of polynomial splines and their
    tensor products in extended linear modeling (with discussion). Ann Stat 25:1371-1470

-------
                                        Environ. Sci. Technol. 2008, 42, 934-939
934 ENVIRONMENTAL SCIENCE & TECHNOLOGY / VOL. 42, NO. 3, 2008

ELAINE A. COHEN HUBAL,*,†
MARCIA G. NISHIOKA,‡
WILLIAM A. IVANCIC,‡
MICHELE MORARA,‡ AND
PETER P. EGEGHY§
National Center for Computational Toxicology, U.S. EPA,
Research Triangle Park, North Carolina 27711, Battelle
Memorial Institute, Columbus, Ohio 43201, and National
Exposure Research Laboratory, U.S. EPA,
Research Triangle Park, North Carolina 27711

   * Corresponding author phone: (919) 541-4077; e-mail:
hubal.elaine@epa.gov.
   † National Center for Computational Toxicology, U.S. EPA.
   ‡ Battelle Memorial Institute.
   § National Exposure Research Laboratory, U.S. EPA.

Received July 6, 2007. Revised manuscript received October
31, 2007. Accepted November 5, 2007.

Transfer of chemicals from contaminated surfaces such as
foliage, floors, and furniture is a potentially significant source
of both occupational exposure and children's residential exposure.
Increased understanding of relevant factors influencing
transfers from contaminated surfaces to skin and resulting dermal
loading will reduce uncertainty in exposure assessment. In a
previously reported study, a fluorescence imaging system was
developed, tested, and used to measure transfer of riboflavin
residues from surfaces to hands. Parameters evaluated included
surface type, surface loading, contact motion, pressure,
duration, and skin condition. Results of the initial study indicated
that contact duration and pressure were not significant for
the range of values tested, but that there are potentially significant
differences in transfer efficiencies of different compounds. In
the study reported here, experimental methods were refined and
additional transfer data were collected. A second fluorescent
tracer, Uvitex OB, with very different physicochemical
properties than riboflavin, was also evaluated to better
characterize the range of transfers that may be expected for
a variety of compounds. Fluorescent tracers were applied
individually to surfaces and transfers to skin were measured
after repeated hand contacts with the surface. Additional trials
were conducted to compare transfer of tracers and co-applied
pesticide residues. Results of this study indicate that
dermal loadings of both tracers increase through the seventh
brief contact. Dermal loading of Uvitex tends to increase at
a higher rate than dermal loadings of riboflavin. Measurements
of co-applied tracer and pesticide suggest that results for these
two tracers may provide reasonable bounding estimates
of pesticide transfer.

Introduction
     Although monitoring for surface contamination in work with
     radioactive materials and dermal monitoring of pesticide
     exposure to agricultural workers have been standard practice
     for 50 years, regular surface sampling and dermal monitoring
     methods have been applied to industrial and residential
     contamination only since the 1980s. In recent years, there
     have been significant advances in tools available to measure
     and assess dermal exposures resulting from contacts with
     contaminated surfaces (1). However, due to the complexity
     of this system, there are still important gaps in our under-
     standing of determinants of dermal transfer and how best
     to measure and assess resulting exposure. To identify major
     uncertainties associated with quantifying dermal exposures
     resulting from contact with contaminated surfaces it is useful
     to consider  pathways and mechanisms for these exposures.
     Transfer of  contaminants from a contaminated surface to
     skin is a function of (1) contaminant form (residue, particle,
     formulation, age, physicochemical properties); (2)  surface
     characteristics (hard, plush, porous, surface loading, previous
     transfer); (3) nature of interaction between contaminant and
     surface; (4)  skin characteristics (moisture, age, loading); (5)
     contact mechanics (pressure, duration, smudge, repetition);
     and (6) environmental  conditions  (temperature,  relative
     humidity). Currently, it  is not clear which of these many
     factors will  drive transfer and  under what  conditions.
     Increased understanding of the most significant factors for
     influencing transfers from contaminated surfaces to skin and
     resulting dermal loading is required to reduce uncertainty
     in exposure assessment  (2).
        Transfer of pesticide residues has been studied previously
     by hand press to pesticide spiked surfaces  (3-6). Currently,
     there are no direct methods for measuring pesticide residues
     on hands. As such, in each of these studies the hand was
     wiped  or rinsed with  2-propanol  to collect transferred
     pesticide for quantification.  In  addition  to  uncertainty
     introduced  by use of a  rinse  or wipe, conducting  studies
     with direct pesticide contact to children is clearly unethical.
     Video fluorescent imaging is one approach that has been
     used successfully in both occupational and residential settings
     to explore dermal exposure mechanisms and mitigation
     strategies (7-9). Application of nontoxic fluorescent tracers
     provides  an opportunity to design studies that address
     limitations of pesticide transfer studies and lend insight on
     occupational and residential  pesticide exposures to both
     children and adults.
        In a previously reported study, a fluorescence imaging
     system was  developed, evaluated (10), and used to measure
     transfer of riboflavin (Vitamin B2) residues from surfaces to
     hands for multiple contacts (11). Results of this initial study
     indicated that surface loading and skin  condition were
     important parameters affecting residue transfer of riboflavin.
     Contact duration and pressure were not significant for the
     range of values tested, and surface type was not significant
     after the  first contact. Preliminary results also suggested
     potentially significant differences in transfer efficiencies of
     different compounds.  Limitations  of  this study included
     surface loadings that were  relatively high compared with
     contaminant loadings expected in residential environments
     using  current crack and crevice application methods.  In
     addition,  only one  fluorescent surrogate was evaluated
     limiting our ability to evaluate  the effect of compound
     properties on transfer  efficiency and to extrapolate results
     to a range of current use pesticides.
        In  the study  presented  here,  experimental methods
     developed previously were refined and dermal transfer data
     were collected using both Uvitex OB (lipophilic) and riboflavin
     (hydrophilic). Residue transfers to three types of dislodgeable
     residue sampling tools were also collected using these two
     fluorescent tracers and five different organophosphate (OP)
                                                                  10.1021/es071668h CCC: $40.75
                                                                                          © 2008 American Chemical Society
                                                                                               Published on Web 12/19/2007

-------
TABLE 1. Hand Contact Trials^a

                                     Trial Number
parameter         1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16
surface type      a  A  A  a  A  a  a  A  A  a   a   A   A   a   a   A
surface loading   b  B  b  B  b  B  b  B  b  B   b   B   b   B   b   B
contact type      c  c  C  C  C  C  c  c  c  c   C   C   c   c   C   C
skin condition    d  d  D  D  d  d  D  D  d  d   D   D   D   D   d   d
tracer            e  e  E  E  e  e  E  E  E  E   e   e   e   e   E   E

  ^a Key: Surface, skin, and contact parameters. A = carpet; a = laminate. B = low surface loading (0.2 µg/cm2); b = high
surface loading (2 µg/cm2). C = uniform press; c = smudge. D = dry hand; d = moist hand. E = Riboflavin; e = Uvitex.

and pyrethroid pesticides. There were two main objectives
of this study: evaluate impact  of parameters  related to
compound, surface, and skin on transfer efficiency; and relate
transfer of tracers to transfer of representative pesticides with
similar physicochemical properties.
Materials and Methods
Study Design. This study was performed using both riboflavin
and Uvitex OB as surrogates for pesticide residues, and video-
imaging technology to quantify dermal loadings of fluorescent
tracers following contact with  tracer-treated surfaces. The
approach was as follows: (1) Apply fluorescent tracer to test
surfaces. (2) Conduct controlled hand-transfer experiments
varying selected parameters. (3) Measure mass of tracer
transferred and estimate surface and dermal  contact areas.
(4) Assess relative transfers of tracers and pesticides using
transferable residue sampling techniques.
   A number of fluorescent tracers were considered, espe-
cially those used in previous studies. Safety was the overriding
concern in choosing tracers. Two tracers were selected having
physicochemical properties that bound properties of several
pesticides of interest: Uvitex OB (Ciba Specialty Chemicals)
and riboflavin. Pesticides selected are of current interest due
to widespread use in the United States for residential and
agricultural applications. Chlorpyrifos and diazinon were OPs
used most extensively in the indoor residential market and
are still being measured in U.S. homes. Pyrethroids are now
the dominant residential-use insecticides. Cis- and
trans-permethrin as well as esfenvalerate are commonly found in
homes at measurable levels.
   Parameters evaluated in this study included tracer, surface
type, surface loading, contact motion, and skin condition.
Eight experiments or trials involving contact with riboflavin-treated
surfaces, and 8 experiments involving Uvitex-treated
surfaces  (Table 1) were conducted; each experiment was
repeated in triplicate. Because riboflavin can be washed from
hands, 3 subjects were recruited for riboflavin experiments;
each person completed all 8 experiments. In contrast, because
Uvitex cannot  be washed from hands, 24 subjects  were
recruited to gather triplicate data for each of the 8 Uvitex
experiments.
   As described previously (2), the Youden ruggedness test
(12) was used to select parameter combinations for each
trial. With this design, more than one parameter can be
varied at a time, minimizing the number of trials required
to test for main effects of all parameters. The experimental
plan used here is a 1/2 fractional replication of a 3 x 2^5 factorial
(Table 1). Tested parameter values for this study and the
previous study are summarized in Table 2. Available data for
octanol/water partition coefficient, vapor pressure, and water
solubility for tested pesticides and tracers are listed in Table
3. Study design, protocols, and consent forms were approved
by the Battelle Memorial Institute IRB for use of human
subjects, and subsequently reviewed by the EPA administrator
for human subjects experiments.
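
The structure of the design in Table 1 can be checked with a short script. This is a minimal sketch, not part of the study's analysis: the trial codes are transcribed from Table 1, while the +1/-1 coding of upper- and lower-case levels and the function names are assumptions made for illustration. It checks that every parameter is tested equally often at each level, that every pair of parameters is orthogonal, and that the 16 trials form one half of the 32 combinations of five two-level parameters.

```python
from itertools import combinations

# Trial codes transcribed from Table 1, in the order: surface type,
# surface loading, contact type, skin condition, tracer.
TRIALS = [
    "abcde", "ABcde", "AbCDE", "aBCDE", "AbCde", "aBCde", "abcDE", "ABcDE",
    "AbcdE", "aBcdE", "abCDe", "ABCDe", "AbcDe", "aBcDe", "abCdE", "ABCdE",
]


def to_signs(code):
    """Code upper-case levels as +1 and lower-case levels as -1 (arbitrary choice)."""
    return [1 if ch.isupper() else -1 for ch in code]


def check_design(trials):
    """Return simple balance and orthogonality diagnostics for a two-level design."""
    rows = [to_signs(t) for t in trials]
    columns = list(zip(*rows))
    # Balanced: each parameter appears equally often at both levels.
    balanced = all(sum(col) == 0 for col in columns)
    # Orthogonal: every pair of parameter columns has zero dot product.
    orthogonal = all(
        sum(x * y for x, y in zip(columns[i], columns[j])) == 0
        for i, j in combinations(range(len(columns)), 2)
    )
    # Product of the five signs per trial; a single constant value means the
    # trials are one half of the full 2^5 set of combinations.
    products = set()
    for row in rows:
        p = 1
        for s in row:
            p *= s
        products.add(p)
    return {
        "distinct_trials": len(set(trials)),
        "balanced": balanced,
        "orthogonal": orthogonal,
        "sign_products": products,
    }


report = check_design(TRIALS)
print(report)
```

Because the columns are balanced and mutually orthogonal, the main effect of each parameter can be estimated from the same 16 trials without being confounded with the main effect of any other parameter.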
   Application of Tracers to Test Surfaces. General protocols
for spray applications to surfaces have been  discussed
     TABLE 2.

     parameter          initial experiments      refined experiments^a
     tracer             riboflavin^b             riboflavin^b or Uvitex^c
     skin condition     dry, moist, or sticky    dry or moist
     surface type       carpet or laminate       carpet or laminate
     surface loading    2 or 10 µg/cm2           0.2 or 2 µg/cm2
     contact motion     press or smudge          press or smudge
     contact duration   2 or 20 s                2 s^d
     contact pressure   0.1 or 1 psi             0.1 psi^d
     contact number     multiple                 multiple

        ^a Refined experiments added Uvitex, reduced loading
     levels, and reduced number of parameters tested. ^b Relatively
     water soluble. ^c Relatively water insoluble. ^d Parameter was
     not varied in study.
     previously (2). Details of differences specific to this study are
     presented in the Supporting Information (Section S1). In
     general, an improved spray system was used to deliver
     smaller, finer droplets to test surfaces than in the previous
     study. Measured variability in loading across a platform was
     25% with riboflavin solution and 14% with Uvitex OB solution.
     Each application surface (platform), 60 cm x 180 cm, was
     platted into 3 rows of 11 blocks. One block per row was
     allocated to a deposition coupon; 10 blocks were used for
     dermal contact. Each block was used only once.
        Contact  Trials and Transfer Off Protocols. For each
     experiment, the subject was instructed on contact motion and
     skin condition for surface contact. Contact duration was held
     at 2 s and contact pressure was held at approximately 0.1 psi
     for all experiments. For dry condition, hands were washed, dried,
     and then held up in room air for 30 s. For moist condition,
     hands were washed, dried and then held 8-10 cm away from
     outlet of CoolMist vaporizer for 20 s. To familiarize subjects
     with the feel of 0.1 psi contact, subjects practiced 10 presses on
     a scale prior to initiating an experiment. The subject contacted
     surface in an unused area (block), had the hand imaged, and
     then repeated contact motion in a new area. A series of seven
     sequential  contacts with surface using the same  motion
     constituted one experiment.
        Measurement of Dermal Loading. The fluorescent imaging
     system described in Ivancic et al. (10) was used to monitor
     and measure fluorescence on hands following contact with
     tracer-treated surface.  Details  of  the  fluorescent lamp
     configurations and wavelength settings for each of the tracers
     are presented in the Supporting Information (Section S2).
        For tests with riboflavin,  a full calibration curve of
     riboflavin (different points) was obtained with each subject.
     Different amounts of a 100 µg/mL aqueous riboflavin solution
     were  deposited on the hand to simulate different riboflavin
     loadings. In contrast, for Uvitex, each subject completed one
     experiment with his right hand, and a series of 3-6 different
     calibration curve points was imaged on his left hand. Uvitex
     OB calibration curve solution was prepared as an aqueous
     solution of Uvitex in 0.1% Pluronic (emulsifier to suspend

-------
TABLE 3. Properties of Tested Pesticides and Tracers

                     octanol/water   vapor pressure          water solubility
analyte              partition       (mPa)                   (mg/L)
diazinon             6,400^a         0.097 @ 20 °C^b         40 @ 20 °C^b
chlorpyrifos         50,000^b        2.5 @ 25 °C^b           2 @ 25 °C^b
tech. permethrin^c   3,160,000^d     0.0013 @ 20 °C^e        0.2 @ 20 °C^b
cis-permethrin       na^f            0.0025 @ 20 °C          na
trans-permethrin     na              0.0015 @ 20 °C          na
esfenvalerate        1,660,000^b     0.067 @ 25 °C^b         <0.3 @ 25 °C^b
riboflavin           0.035^g         negligible              150^h
Uvitex OB            not available   0.000003 @ 20 °C^i      negligible^i

^a Ref (13). ^b Ref (14). ^c Technical grade. ^d Ref (15). ^e Ref (16). ^f Not available. ^g Ref (17).
^h Determined here. ^i Ref (18).
Uvitex in water) to accurately simulate spectral interferences
that might arise from contact with surfaces that also contained
Pluronic.
   Subject-specific calibration curves for riboflavin were not
significantly different. As such, the three individual riboflavin
curves were merged into one universal calibration curve and
applied to all data. Similarly, Uvitex calibration data from all
subjects were merged into one calibration curve and applied
to all  Uvitex  measurements.  Details of the  calibration
approach  are  presented in the Supporting Information
(Section S2).
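
Pooling calibration points into one curve can be sketched as follows. This is an illustration only: the calibration points are invented, the signal units are arbitrary, and a straight-line response is an assumption (the functional form used in the study is described in the Supporting Information, not here). The function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical calibration points per subject:
# (tracer mass deposited in µg, fluorescence signal in arbitrary units).
subject_points = {
    "subject_1": [(0.0, 2.0), (1.0, 12.0), (2.0, 22.0)],
    "subject_2": [(0.5, 7.0), (1.5, 17.0), (3.0, 32.0)],
    "subject_3": [(0.0, 2.0), (2.5, 27.0), (4.0, 42.0)],
}

# Merge all subjects into one pooled ("universal") calibration data set.
pooled = [point for points in subject_points.values() for point in points]
slope, intercept = fit_line([m for m, _ in pooled], [s for _, s in pooled])


def signal_to_mass(signal):
    """Invert the pooled calibration line to estimate tracer mass (µg)."""
    return (signal - intercept) / slope


print(slope, intercept, signal_to_mass(52.0))
```

Pooling is only reasonable when the subject-specific curves do not differ significantly, which is the condition the text reports for riboflavin.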
   Coapplied Pesticide and Tracer Transferable Residue
Sampling. Detailed application and sampling methods are
presented in the Supporting Information (Section S3). Two
separate mixtures of five pesticides and selected tracer were
applied to carpet and laminate platforms. One mixture
consisted of distilled/deionized water doped with an aqueous
solution of riboflavin, concentrated acetonitrile solutions of
diazinon, cis- and trans-permethrin, and esfenvalerate, and
a commercial "ready-mix" aqueous formulation of 6%
chlorpyrifos (Spectracide brand). Cis- and trans-permethrin
doping solution was prepared from neat material in 1:4 ratio
of cis/trans isomers. The second mixture consisted of
distilled/deionized water with 0.1% Pluronic F68 doped with
concentrated acetone solution of Uvitex OB and the same
pesticide solutions. Application loading of analytes, except
cis-permethrin, was 0.2 µg/cm2 for all of these tests; loading
of cis-permethrin was 0.04 µg/cm2. Three methods were used
to collect transferable residue samples from test surfaces:
aqueous wipe, C18 Empore (3M) Press Disk, and PUF roller.
In addition, deposition coupons were used to verify surface
loading.
   Detailed analytical methods are presented in the Supporting
Information (Section S3). Pesticide QC spike samples
had recoveries as follows: 85-105% (depending on pesticide)
for deposition coupons (10 µg), 87-106% for aqueous wipes
(10 µg), 20-46% for PUF roller sleeves (1 µg), and 89-103%
for C18 disks (25 ng). PUF Roller data were surprising, as
previous testing of method showed recoveries of 120-130%.
For these QC spike samples analyzed with field samples,
when coapplied with riboflavin, recoveries of pesticides from
PUF were 45 ± 1, 36 ± 3, 30 ± 3, 31 ± 5, and 35 ± 4% for
diazinon, chlorpyrifos, cis-permethrin, trans-permethrin, and
esfenvalerate, respectively. When coapplied with Uvitex,
recoveries from PUF were 18 ± 3, 20 ± 5, 16 ± 4, 17 ± 1, and
17 ± 2% for diazinon, chlorpyrifos, cis-permethrin,
trans-permethrin, and esfenvalerate, respectively. PUF Roller sample data were
corrected by individual spike recoveries. Pesticides were not
detected in field matrix blanks.
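
A recovery correction of this kind can be sketched as below. The recovery percentages are the PUF values reported above for pesticides coapplied with riboflavin; the correction formula (measured value divided by fractional recovery) is the conventional one and is assumed here, since the text does not spell it out, and the measured value in the example is hypothetical.

```python
# Mean PUF roller spike recoveries (%) reported in the text for pesticides
# coapplied with riboflavin.
PUF_RECOVERY_PERCENT = {
    "diazinon": 45,
    "chlorpyrifos": 36,
    "cis-permethrin": 30,
    "trans-permethrin": 31,
    "esfenvalerate": 35,
}


def correct_for_recovery(measured, recovery_percent):
    """Scale a measured amount by the fractional spike recovery (assumed formula)."""
    if recovery_percent <= 0:
        raise ValueError("recovery must be positive")
    return measured / (recovery_percent / 100.0)


# Hypothetical measured transfer of 0.09 µg for diazinon on a PUF sleeve.
corrected = correct_for_recovery(0.09, PUF_RECOVERY_PERCENT["diazinon"])
print(corrected)
```

With recoveries this low, the correction multiplies measured values by roughly two to three (or about five to six for the Uvitex mixture), so uncertainty in the recovery estimate carries directly into the corrected transfer values.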
   Uvitex OB QC spike samples had recoveries as follows:
89% for deposition coupons (10 µg), 69% for aqueous wipes
(10 µg), 94% for PUF Roller sleeves (1 µg), and 127% for C18
disks (25 ng). Uvitex OB was not detected in matrix blanks
with exception of low levels in PUF Roller sleeve and C18
disks; corrections for these were made to data. Riboflavin
     QC spike samples had recoveries as follows: 98% for
     deposition coupons (10 µg), 91% for aqueous wipes (10 µg),
     90% for PUF Roller sleeves (1 µg), and an anomalous 468%
     for 25 ng applied to C18 disks. Earlier studies showed
     recoveries of 85% for this method, so this was assumed to
     be an outlier. Riboflavin was not detected in matrix blanks
     with exception of low levels in C18 press disks sample extracts.
        Statistical Methods. Descriptive statistics (mean and
     standard deviation) were calculated for measured loading
     on and percent transferred to hands. Multifactorial ANOVA
     tests were performed  to determine which experimental
     parameters were statistically significant predictors of either
     hand loading or transfer efficiency using data only from the
     initial  contact  between  hand and surfaces. Parameters
     evaluated in these two models were surface type, surface
     loading, contact type, and skin condition. A linear mixed-
     effects regression  test was  used  to evaluate the above
     parameters, with addition of contact number, on repeated
     hand loading measurement data. Differences between tracers
     in the rate of  transfer to  skin  were evaluated with an
     interaction term in the mixed-effects model.
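
The contrast that underlies a main-effect test in a two-level design can be illustrated with the trial codes from Table 1. This sketch is not the ANOVA or the mixed-effects model used in the study: it only computes the difference in mean response between the two levels of each parameter, and the responses are synthetic values constructed so that only surface loading matters. Function and variable names are hypothetical.

```python
# Trial codes transcribed from Table 1 (surface type, surface loading,
# contact type, skin condition, tracer).
TRIALS = [
    "abcde", "ABcde", "AbCDE", "aBCDE", "AbCde", "aBCde", "abcDE", "ABcDE",
    "AbcdE", "aBcdE", "abCDe", "ABCDe", "AbcDe", "aBcDe", "abCdE", "ABCdE",
]
FACTORS = ["surface type", "surface loading", "contact type",
           "skin condition", "tracer"]


def main_effects(trials, responses):
    """Mean response at the upper-case level minus mean at the lower-case level."""
    effects = {}
    for k, name in enumerate(FACTORS):
        upper = [y for t, y in zip(trials, responses) if t[k].isupper()]
        lower = [y for t, y in zip(trials, responses) if t[k].islower()]
        effects[name] = sum(upper) / len(upper) - sum(lower) / len(lower)
    return effects


# Synthetic responses: trials at high surface loading ('b') get 1.5,
# trials at low surface loading ('B') get 1.0; nothing else has an effect.
responses = [1.5 if t[1] == "b" else 1.0 for t in TRIALS]
effects = main_effects(TRIALS, responses)
print(effects)
```

Because the design is balanced and orthogonal, the synthetic loading effect shows up only in the surface loading contrast; the other four contrasts are zero.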

     Results
     Hand Contact Trials. A summary of results for individual
     experiments is presented in Table S1 (Supporting Information).
     Averages for the 3 subjects are shown for initial contact
     and seventh contact in terms of loading on hand (µg/cm2)
     and overall percent transfer. The range of dermal loading
     after first contact was 0.01-0.62 µg/cm2, and the range of
     loading after seventh contact was 0.02-1.69 µg/cm2. Overall
     percent transfer ranged between 0.8 and 45.5% for the first
     contact, and 0.6-19.4% for the seventh contact. Overall
     percent transfer takes into account total area contacted over
     the multiple presses, defined previously (2).
        Results for this study compare well with previous data
     reported in the literature. Camann et al. (6) studied press
     transfer of pesticide residues to moistened hands. Resulting
     transfer efficiencies ranged from 1 to 5%. Hsu et al. (3)
     evaluated transfer of thirteen  different pesticides  from
     aluminum foil to the hand heel following a series of 10 presses.
     Both dry and moist hands were studied; two different press
     times (1 and 5 s per press) and two  different hand motions
     were also tested. The effect of varying these parameters was
     not evaluated.  Mean  pesticide transfer efficiencies  were
     between 5 and 16% with a relative standard deviation greater
     than 30%. Mean dermal recovery did not  differ significantly
     among evaluated pesticides.
        Hand loading by contact number is compared for the two
     tracers (riboflavin or Uvitex) at the two surface loadings (2
     µg/cm2 or 0.2 µg/cm2) and presented in Figure 1. Results of
     this study indicate that median dermal loadings of both
     tracers increase in a near linear fashion through the seventh
     contact. Dermal loading of Uvitex tends to increase at a higher
     rate than dermal loading of riboflavin: 0.13 versus 0.069
     µg/cm2/contact, respectively, at the higher surface loading, and

-------
[Figure 1 image not recovered. Panels: "Loading by Contact No., Follow-Up Experiment" for Loading = high and Loading = low, with Analyte = Riboflavin and Analyte = Uvitex; x-axis: Contact Number (1-7).]
FIGURE 1. Hand loading by contact number using riboflavin (left panels) or Uvitex (right

    loading (µg/cm2)        p < 0.01
                 surface type   surface loading   contact motion   skin condition   contact number

first contact (ANOVA)
                 p < 0.05       p < 0.01          p < 0.1          p > 0.1          b
                 p < 0.05       p = 0.001         p < 0.001        p > 0.1
repeated contact (Mixed-Effects Model)
                 p > 0.1        p < 0.001         p < 0.001        p < 0.05         p < 0.001

  ^a Bold text indicates parameter is statistically significant at p < 0.05. ^b Contact number only relevant to repeated contact
evaluation.
0.059 versus 0.017 µg/cm2/contact, respectively, at the lower
surface loading. Rates of change are significantly different (p
< 0.0001) in each condition. At the 0.2 µg/cm2 surface loading
(low loading), Uvitex transfers more efficiently than riboflavin.
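
The reported per-contact rates can be put side by side with a few lines of arithmetic. The four rates are taken from the text; the straight-line projection with a zero intercept is an assumption made for illustration (the text says only that the increase is near linear), so the projected loadings are not study results. Names are hypothetical.

```python
# Rates of increase in dermal loading reported in the text (µg/cm2 per contact).
RATE = {
    ("Uvitex", "high"): 0.13,
    ("riboflavin", "high"): 0.069,
    ("Uvitex", "low"): 0.059,
    ("riboflavin", "low"): 0.017,
}


def projected_loading(tracer, loading_level, n_contacts, intercept=0.0):
    """Linear projection of dermal loading after n contacts (zero intercept assumed)."""
    return intercept + RATE[(tracer, loading_level)] * n_contacts


def rate_ratio(loading_level):
    """Ratio of the Uvitex rate to the riboflavin rate at one surface loading."""
    return RATE[("Uvitex", loading_level)] / RATE[("riboflavin", loading_level)]


for level in ("high", "low"):
    print(level, round(rate_ratio(level), 2),
          round(projected_loading("Uvitex", level, 7), 3),
          round(projected_loading("riboflavin", level, 7), 3))
```

The ratio of rates is larger at the low surface loading than at the high one, which is in line with the statement that Uvitex transfers more efficiently than riboflavin at 0.2 µg/cm2.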
   Results of statistical analysis to identify significant pa-
rameters for characterizing residue transfers are  presented
in Table  4. These results show that effects of surface loading
on transfer from surfaces to hands under study conditions
                      is significant at alpha = 0.05. Surface type is significant only
                      with initial contact and  skin condition is significant only
                      with repeated contacts.  Comparison  of "first contact" to
                      "repeated contact" results suggests that effect of surface type
                      appears to diminish with repeated contact while effect of
                      skin condition appears to increase with repeated contact.
                      Although effect estimates are similar for surface type: 0.126
                      (initial) versus 0.127 (repeated) there is loss of significance

-------
[Figure image not recovered. Title: "Transfer Efficiency (%Transfer) for Pesticides and Uvitex OB and Riboflavin". Panels: Transfer to Aqueous Wipe, Transfer to PUF Roller, Transfer to C18 Press Disk (20 sec), each from laminate and from carpet. Legend: Diazinon, Chlorpyrifos, cis-Permethrin, trans-Permethrin, Esfenvalerate, Riboflavin.]
TABLE 5. Evidence of Importance of Factors Tested across
Surface-to-Skin Transfer Experiments^a

parameter          initial experiments   refined experiments
tracer                                   •o
skin condition     •o                    •o
surface type       oo                    •o
surface loading    •o                    ••
contact motion     •o                    •o
contact duration   oo                    -
contact pressure   oo                    -
contact number     ••                    ••

-------
pesticides  shortly after application. Compounds in other
forms (e.g., particle bound) may transfer differently.
   On the whole, data developed in these studies will reduce
uncertainty in screening-level exposure assessments that are
based on limited default assumptions. In particular, these
results are currently being used with the SHEDS model to
improve estimates  of  exposures resulting  from hand-to-
mouth behavior (21,22). However, the importance of multiple
contacts for characterizing residue transfers to skin and the
need to link dermal loading with absorption to characterize
dose suggest that measurement and modeling approaches
incorporating important temporal aspects of the system need
to be adapted for use in assessing exposures resulting from
dermal contact with contaminated surfaces.

Acknowledgments
The authors acknowledge laboratory assistance of K. Andrews,
J. Sowry, M. McCauley, A. Gregg, and C. Lukuch of Battelle.
We also acknowledge Tom McCurdy of U.S. EPA for helpful
review of this manuscript. The United States Environmental
Protection  Agency through its  Office of Research and
Development funded the  research  described here under
contract 68-D-99-011 to Battelle. It has been subjected to
Agency review and approved for publication.

Supporting Information Available
Detailed methods and summary results for the transfer
experiments (both tracer and pesticide). This information is
available free of charge via the Internet at http://pubs.acs.org.

Literature Cited
 (1)  Fenske, R. A. Dermal exposure: a decade of real progress. Ann.
      Occup. Hyg. 2000, 44 (7), 489-491.
 (2)  Cohen Hubal, E. A.; Sheldon, L. S.; Zufall, M. J.; Burke, J. M.;
      Thomas, K. W. The challenge of assessing children's exposure
      to pesticides. J. Exposure Anal. Environ. Epidemiol. 2000, 10,
      638-649.
 (3)  Hsu, J. P.; Camann, D. E.; Schattenberg, H., III; Wheeler, B.;
      Villalobos, K.; Garza, M.; Millard, P.; Lewis, R. G. New dermal
      exposure sampling technique. In EPA/AWMA International
      Symposium: Measurement of Toxic and Related Air Pollutants,
      Raleigh, NC, 1990.
 (4)  U.S. EPA. Protocol for Dermal Assessment: A Technical Report;
      EPA/600/X-93/005; Office of Research and Development:
      Washington, DC, 1993.
 (5)  Geno, P. W.; Camann, D. E. Handwipe sampling and analysis
      procedure for the measurement of dermal contact with
      pesticides. Arch. Environ. Contam. Toxicol. 1996, 30 (1), 132-8.
 (6)  Camann, D.; Harding, H. J.; Geno, P. W.; Agrawal, S. R.
      Comparison of Methods to Determine Dislodgeable Residue
      Transfer from Floors; EPA/600/R-96/089; Office of Research and
      Development: Washington, DC, 1996.
 (7)  Fenske, R. A. Nonuniform dermal deposition patterns during
      occupational exposure to pesticides. Arch. Environ. Contam.
      Toxicol. 1990, 19, 332-7.
 (8)  Black, K. G.; Fenske, R. A. Dislodgeability of chlorpyrifos and
      fluorescent tracer residues on turf: comparison of wipe and
      foliar wash sampling techniques. Arch. Environ. Contam. Toxicol.
      1996, 31, 563-70.
 (9)  Fenske, R. A.; Birnbaum, S. G. Second generation video imaging
      technique for assessing dermal exposure (VITAE System). Am.
      Ind. Hyg. Assoc. J. 1997, 58, 636-45.
 (10) Ivancic, W. A.; Nishioka, M. G.; Barnes, R. H.; Cohen Hubal,
      E. A. Development and evaluation of a quantitative video
      fluorescence imaging system and fluorescent tracer for
      measuring transfer of pesticide residues from surfaces to hands
      with repeated contacts. Ann. Occup. Hyg. 2004, 48, 519-532.
 (11) Cohen Hubal, E. A.; Suggs, J. C.; Nishioka, M. G.; Ivancic, W. A.
      Characterizing residue transfer efficiencies using a fluorescent
      imaging technique. J. Exposure Anal. Environ. Epidemiol. 2005,
      15 (3), 261-270.
 (12) Cochran, W. G.; Cox, G. M. Experimental Designs, 2nd ed.; John
      Wiley and Sons: New York, 1957; Vol. 53, pp 2-540.
 (13) Montgomery, J. H. Agrochemicals Desk Reference; CRC Press:
      Boca Raton, FL, 1997.
 (14) Kamrin, M. A. Pesticide Profiles; CRC Press: Boca Raton, FL,
      1997.
 (15) International Programme on Chemical Safety. Chemical Safety
      Information from Intergovernmental Organizations;
      www.inchem.org/documents.
 (16) British Crop Protection Council. The Pesticide Manual; British
      Crop Protection Council: Hampshire, U.K., 1997.
 (17) Syracuse Research Corporation. interkow; www.syrres.com/esc/.
 (18) Ciba Specialty Chemicals. Uvitex OB Fact Sheet; June 1998.
 (19) U.S. EPA. Important Exposure Factors for Children: An Analysis
      of Laboratory and Observational Field Data Characterizing
      Cumulative Exposure to Pesticides; EPA 600/R-07/013; Office of
      Research and Development: Washington, DC, 2007, p 63;
      http://www.epa.gov/nerl/research/data/ (accessed 06/01/07).
 (20) Cohen Hubal, E. A.; Egeghy, P.; Leovic, K.; Akland, G. Measuring
      potential dermal transfer of a pesticide to children in a daycare
      center. Environ. Health Perspect. 2006, 114 (2), 264-269.
 (21) Ozkaynak, H. Personal communication. October 26, 2007.
 (22) Zartarian, V. G.; Ozkaynak, H.; Burke, J. M.; Zufall, M. J.; Rigas,
      M. L.; Furtaw, E. J. A modeling framework for estimating
      children's residential exposure and dose to chlorpyrifos via
      dermal residue contact and non-dietary ingestion. Environ.
      Health Perspect. 2000, 108 (6), 505-514.

          ES071668H

-------
                                                                                                           Commentary
Computational Molecular Modeling  for  Evaluating the Toxicity
of  Environmental  Chemicals:  Prioritizing  Bioassay  Requirements
James R. Rabinowitz, Michael-Rock Goldsmith, Stephen B. Little, and Melissa A. Pasquinelli
National Center for Computational Toxicology, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, USA
 BACKGROUND: The human health risk from exposure to environmental chemicals often must be
 evaluated when relevant elements of the preferred data are unavailable. Therefore, strategies are
 needed that can predict this information and prioritize the outstanding data requirements for the
 risk evaluation. Many modes of molecular toxicity require the chemical or one of its biotransforma-
 tion products to interact with specific biologic macromolecules (i.e., proteins and DNA). Molecular
 modeling approaches may be adapted to study the interactions of environmental chemicals with
 biomolecular targets.
 OBJECTIVE: In this commentary we provide an overview of the challenges that arise from applying
 molecular modeling tools developed and commonly used for pharmaceutical discovery to the prob-
 lem of predicting the potential toxicities of environmental chemicals.
 DISCUSSION: The use of molecular modeling tools to predict the unintended health and environ-
 mental consequences of environmental chemicals differs strategically from the use of the same tools
 in the pharmaceutical discovery process in terms of the goals and potential applications. It also
 requires consideration of the greater diversity of chemical space and binding affinity domains than
 is covered by pharmaceuticals.
 CONCLUSION: Molecular modeling methods offer one of several complementary approaches to evalu-
 ate the risk to human health and the environment as a result of exposure to environmental chemicals.
 These tools can streamline the hazard assessment process by simulating possible modes of action and
 providing virtual screening tools that can help prioritize bioassay requirements. Tailoring these
 strategies to the particular challenges presented by environmental chemical interactions makes them
 even more effective.
 KEY WORDS: computational toxicology, docking, enrichment, false negatives, high-throughput
 screening, molecular modeling, prioritizing bioassays, virtual screening. Environ Health Perspect 116:573-577
 (2008). doi:10.1289/ehp.11077 available via http://dx.doi.org/ [Online 1 February 2008]
A diverse spectrum of anthropogenic mole-
cules is found in the environment, including
chemicals introduced deliberately as well as
unintended by-products of human  activity.
Through diligent monitoring, we are learn-
ing the identity, distribution, extent, and
environmental persistence of these chemicals.
To  provide a reliable evaluation of the risk
presented by these compounds, information
about the specific molecules is required. This
includes  knowledge of the interaction of the
chemicals  with the environment  and the
effects of the chemicals or their successors on
human health and ecologic systems.
    The health and environmental effects of a
chemical derive from a continuum of processes
that proceed from the source of a chemical or
its predecessors to a set of outcomes. However,
it is often convenient to consider each process
in the continuum as a discrete entity [U.S.
Environmental Protection Agency (EPA)
2003]. Ideally, a risk assessment uses informa-
tion relative to the specific chemical being
considered. However, often the potential
effects of a chemical must be evaluated when
some relevant elements of the preferred data
matrix are missing. In these situations, an esti-
mate is derived by extrapolating from existing
information.
    Various approaches,  including computa-
tional methods, have been developed to model
these discrete steps in the source-to-outcome
paradigm. These models provide approxima-
tions of the missing experimental information
and a measure of the impact of specific miss-
ing data on the evaluation of risk. The mod-
els use existing information and can suggest
new experiments. As a result, the source-to-
outcome continuum becomes populated with
information that includes experimental data,
model-derived data, and connection models.
The toxicant-target paradigm is a computa-
tional approach that employs molecular model-
ing methods to estimate relevant interactions
and to populate the outcomes side of the
source-to-outcome continuum.

The Toxicant-Target  Paradigm
The differential step in many mechanisms of
toxicity may be generalized as  the interaction
between  a small molecule  (a toxicant) and
one or more macromolecular targets. Targets
include genetic material, receptors,  transport
molecules, and enzymes. In addition, other
targets for toxicity could conceptually  be
described. The difference in activity observed
between chemicals acting through  the same
biologic mode of action may then be under-
stood as differences between their interactions
with putative targets.
   Some molecular modeling methods have
been developed specifically to study interactions
of this type and are commonly employed for
the discovery of novel pharmaceutical agents
(Coupez and Lewis 2006; Sousa et al. 2006).
These methods can estimate the capacity of a
chemical to interact with a specific target and
cause a biologic effect. In the context of esti-
mating chemical toxicity, this  approach can
yield predictions of the potential biologic
activity. These molecular modeling tools can
inform testing strategies or provide elements
in a scheme for estimating toxicity that also
include experimental results.

Molecular Modeling in Computational Toxicology: Probing Toxicant-Target Structure

The toxicant-target paradigm can be used to
develop models for predicting chemical toxic-
ity. These models are composed of approxi-
mate mathematical descriptions of the
underlying physics and chemistry governing
the behavior of the interacting molecules.
These descriptions and their computational
implementations construct a bridge between
the information domains of experimental bio-
molecular structure  and biologic effects.
Figure 1  depicts how molecular modeling can
be used to estimate chemical toxicity via the
toxicant-target paradigm.
    Experimental information is used to pro-
vide a putative list of potential macromolecu-
lar targets related to chemical toxicity. For
some of these targets,  structural information
is available or may be inferred from similar
structures via homology modeling (Hillisch
et al.  2004). The specific  interactions
between potential toxicants and the struc-
tures of known targets may be modeled via
"docking" molecular modeling formalisms
(Kuntz 1992).

Address correspondence to M.A. Pasquinelli, North
Carolina State University, Fiber and Polymer
Science/Department of TECS, Campus Box 8301,
2401 Research Dr., Raleigh, NC 27695 USA.
Telephone: (919) 515-9426. Fax: (919) 515-6532.
E-mail: Melissa_Pasquinelli@ncsu.edu
  During a portion of this work, M-R.G. was sup-
ported by National Health and Environmental Effects
Research Laboratory-Department of Environmental
Sciences and Engineering Training Agreement EPA
CT829471.
  This work was reviewed by the U.S. Environmental
Protection Agency and approved for publication but
does not necessarily reflect official agency policy.
  The authors declare they have no competing
financial interests.
  Received 16 November 2007; accepted 1 February
2008.
Environmental Health Perspectives • VOLUME 116 | NUMBER 5 | May 2008
Rabinowitz et al.

    In the absence of specific structural infor-
mation  about the targets, an alternative is to
employ a ligand-based, cheminformatics strat-
egy. This method derives relationships among
various  attributes of a database of ligands and
known  target-based activities. The attributes
of the ligand may be simple or complex struc-
tural descriptions and properties that are
either measured or derived computationally
(Tong et al.  1997; Waller et al. 1996). Note
that these cheminformatics methods have also
been applied to predict chemical toxicity
without direct consideration of a target
(Prathipati et al.  2007), but methods of this
type are not the primary subject of this report.
    With both the structural bioinformatics
and cheminformatics approaches, predictive
models  are developed and tested with experi-
ments.  A feedback process  may be used to
improve the quality of the predictions. In
addition, these prediction tools can be used to
identify important missing experimental infor-
mation and relevant  bioassays or properties
that are currently unavailable. The underlying
mechanism of action  determines the range of
applicability of the model. In order to use this
approach as an element in a toxicity screen or
for developing bioassay strategies, a number of
choices must be made.
    To  a large extent,  the  pharmaceutical
industry has driven  recent advances in the
design of molecular modeling tools for study-
ing the interactions between a small molecule
and a complex macromolecule (Jorgensen
2004). One approach for the discovery of
leads for developing novel pharmaceutical
agents employs computational "docking" of
each member of a chemical library to macro-
molecular targets that are chosen for potential
therapeutic benefit. Molecular docking is
designed to simulate the binding feasibility
and affinity of small molecules to protein tar-
gets (Abagyan and Totrov 2001; Halperin
et al. 2002). A docking calculation generates a
variety of poses of a small molecule within a
"binding region" of the macromolecular tar-
get, and  typically includes ligand flexibility
(Sousa et al. 2006). At times, some form of
macromolecular flexibility (Carlson 2002) is
also included. An important component of the
docking simulation is to identify the potential
binding sites within a macromolecular target.
These sites could be an interior pocket or an
indentation on the macromolecular surface
(Huang and Schroeder 2006).
    The  calculation of a score assesses the
potential relevance of each docking pose.
Functions used for scoring poses typically
take into account geometric shape comple-
mentarity as well as the physicochemical
interactions between the small molecule and
the macromolecular target (Coupez and Lewis
2006;  Sousa et al. 2006). The docking score
can be construed as a surrogate for the energy
[Figure 1 labels: structural bioinformatics; analytics (numeric and visual); experiment (kinetic parameters, thermodynamic parameters, structural parameters, functional insight, molecular mode of action, biophysical interactions, screening); knowledge]
Figure 1. An overview of molecular modeling in computational toxicology. Abbreviations: QSAR, quantita-
tive structure-activity relationship; QSPR, quantitative structure-property relationship. After the identifica-
tion of a putative toxicant and target complexes (yellow sphere), the target structure  (red spheres) is
either experimentally determined or modeled based on structures with known sequence identity.
Cheminformatics approaches and molecular docking (green spheres) can be used to obtain information
about the putative toxicant (overlap of red and green spheres) and predict the desired properties, such as
target-specific binding affinity and molecular modes of binding. Mathematical and visual analytics, such as
hierarchical clustered heat maps or target-specific linkage maps, can yield knowledge that is chemical-
class specific or target specific. Experimental guidance (blue arrow) optimizes this virtual screening
approach.
of interaction between the target and the
small molecule, and in some cases is provided
in terms of measures such as the log of the
dissociation constant for inhibitor binding
(Ki) or kilocalories per mole that may be
directly compared with binding experiments.
Comparison  of these  scores or computed
interaction energies for a library of chemicals
provides a means for ordering the molecules
by their capacity to interact with the macro-
molecular target. Chemicals with the best
scores are most likely to interact with the tar-
get and are selected as subjects for  further
study. It is important to consider more than
just a single pose with the  best score because
there are likely several local minimum energy
poses in the interaction profile and a variety
of highly ranked poses (Coupez and Lewis
2006; Sousa et al. 2006).
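
A minimal sketch of the ranking step described above, under stated assumptions: the conversion from an inhibition constant to an energy uses the standard thermodynamic relation ΔG = RT ln(Ki), which the article does not itself specify, and every chemical name and score below is invented for illustration. The sketch keeps several top poses per chemical rather than only the single best one.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ki_to_delta_g(ki_molar, temperature_k=298.15):
    """Binding free energy (kcal/mol) from an inhibition constant in mol/L.

    Uses delta_G = R * T * ln(Ki); smaller Ki gives a more negative energy.
    """
    return R_KCAL * temperature_k * math.log(ki_molar)


def rank_library(pose_scores, keep_poses=3):
    """Order chemicals by their best pose while retaining the top poses.

    pose_scores maps a chemical name to a list of pose energies (kcal/mol),
    where more negative means stronger predicted interaction.
    """
    ranked = []
    for name, scores in pose_scores.items():
        top = sorted(scores)[:keep_poses]  # most negative poses first
        ranked.append((name, top))
    ranked.sort(key=lambda item: item[1][0])  # sort by best pose
    return ranked


# Hypothetical pose energies, not taken from the article.
library = {
    "chem_A": [-6.1, -9.4, -8.8, -5.0],
    "chem_B": [-7.2, -7.0],
    "chem_C": [-10.3, -4.4, -9.9],
}
ranking = rank_library(library)
```

Retaining `keep_poses` entries per chemical reflects the point that several local-minimum poses may be relevant; how many to keep is a judgment call rather than something the source prescribes.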
   As is the  case for the design of novel
pharmaceutical agents, the  successful applica-
tion of docking methods to problems in
chemical toxicity depends on the identification
and availability of the crystal structures of the
macromolecular targets or similar proteins. A
variety of structures are available for macro-
molecular targets that are known to be linked
to the adverse  effects of environmental chemi-
cals, and their number is continually increasing
(Hu et al. 2005; Wang et al. 2005). However,
simulations of the interaction between small
molecules and a macromolecular target for the
purposes of drug discovery versus toxicity
screening have distinct differences and, thus,
present distinct challenges: a) the focus on dif-
ferent (yet overlapping) regions of chemical
space; b) the strength of interaction between a
small molecule and  macromolecular targets;
and c) the ultimate purpose of the virtual
screening results.
    Figure 2 is an approximate depiction of the
chemical space for nonpharmaceutical com-
mercial  chemicals versus druglike chemicals in
three selected  dimensions of physicochemical
characteristics.  Viable drug candidates are typi-
cally those that have a strong interaction with a
specific target, have good bioavailability,  and
are readily metabolized to inactive compounds
and cleared from the system, in other words,
compounds that have specific absorption, dis-
tribution, metabolism, excretion, and toxicity
(ADMET) profiles and prescribed chemical
properties. In contrast, environmental chemi-
cals span a considerably larger chemical space
and tread into "undesirable" property space
from an ADMET perspective (too small, too
insoluble, too reactive, etc.). They can  also
elicit adverse biologic effects from both strong
and weak interactions with targets and in both
a specific and nonspecific manner. Weak inter-
actions  and nonspecificity are also important
aspects of pharmaceutical development because
some side effects might arise from unintended
binding to secondary targets (Ekins 2004;
Ji et al. 2006). In addition, some environmen-
tal chemicals are produced and disposed of in
significantly larger quantities than are pharma-
ceuticals and, hence, may present inadvertent
human hazards over a long-term, low-dose
exposure scenario. This  is  particularly the case
if they are more chemically  stable and persis-
tent (i.e., resistant to metabolism), are poten-
tially as bioavailable as drug  candidates, or act
through common pathways (thus posing
cumulative effects) even if their individual tar-
get-specific interactions are much weaker than
drugs or endogenous chemicals. Hence, evalu-
ating the relative effectiveness of chemicals that
bind more weakly or to multiple targets less
specifically presents  a greater challenge experi-
mentally and computationally than does the
discovery of novel pharmaceutical leads.
Scoring functions in molecular modeling
methods are typically optimized to identify
chemicals that bind best  to the target.
    Another significant difference between
pharmaceutical  optimization and assessing the
chemical toxicity of environmental chemicals
is the purpose of an  initial  screen  of a chemical
library.  For the pharmaceutical  industry, the
purpose  of the  initial screen for finding  new
drug candidates is to limit the number  of
chemicals that proceed  to the next  (more
expensive) phase of testing while increasing
the ratio of chemicals likely to become drugs
to those likely  to be inactive (i.e., increasing
the "hit rate"). As long as the hit rate becomes
significantly improved by  this  process, the
exclusion of some active chemicals is a reason-
able cost. In contrast, the purpose of an initial
screen of environmental chemicals is to maxi-
mize the chance that active chemicals advance
to the next phase of testing while eliminating
as many inactive chemicals as possible. Given
this objective and the corresponding uncer-
tainties in assessing "potency" or activity based
solely on computed scoring functions, the goal
is to discover all or almost all of the agents
having the potential to interact with the tar-
get, even those in significantly lower binding
affinity domains than the endogenous or puta-
tive cognate ligand for the receptor.  Thus,
minimizing the number of false negatives is
critical when screening environmental chemi-
cals because the expectation is that positive
chemicals will be tested later in an experimen-
tal protocol. A toxicity screen should not reject
a compound (i.e., classify as inactive or safe)
that has a weak affinity for a target or multiple
targets without considering its ADMET prop-
erties, persistence, and chance of exposure.
Obtaining activity signatures from receptor
affinity profiles of compounds not intended
for therapeutic application may become an
important aspect of multilevel screening  pro-
grams that include measured biologic proper-
ties, such as ToxCast (Dix et al. 2007).

Enrichment and  False
Negatives
Figure 3 shows two hypothetical data scenar-
ios derived  from computational docking
[Figure 2 axes: percent halogen; log P (octanol/water); topologic polar surface area (Å²). Legend: registered pharmaceuticals; high-production-volume chemicals.]
Figure 2. Plot of environmental anthropogenic compounds and registered pharmaceuticals subject to a
Lipinski druglike filter. The axes represent three physicochemical characteristics for each compound: total
polar surface area, partition coefficient (log P) between octanol and water, and fraction halogenated. The
environmental compounds are the high-production-volume chemicals (Wolf et al. 2006), and the registered
pharmaceuticals are the FDAMDD [FDA (Food and Drug Administration) maximum (recommended) daily
dose] set from the DSSTox (Distributed Structure-Searchable Toxicity) database (Matthews et al. 2005).
experiments using the same library of chemi-
cals against a model target. The difference
between the two sets arises either from choos-
ing different docking score thresholds between
predicted active and inactive chemicals or
from  using different scoring functions. For
this example,  definitive (ideal) experimental
tests determine that 5% of the chemicals are
active and 95%  are inactive relative to the
macromolecular target of interest.  Scenario A
has 89% of the chemicals classified correctly,
whereas scenario B has only 55% of the chem-
icals classified correctly. The enrichment factor
for scenario A is 4 because 20% of the chemi-
cals selected for further testing (i.e., screened
positive) will prove to be positive, whereas the
enrichment factor for scenario B is only 2.
However, the type II error for scenario A is
0.6, whereas it is 0.0 for scenario B.
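
The figures quoted for the two scenarios can be checked with a short calculation. The sketch below uses the definitions given in the Figure 3 caption; the confusion-matrix counts per 100 chemicals are back-calculated from the stated percentages (an inference, not values printed in the article), and the function and variable names are illustrative.

```python
def screen_metrics(tp, fp, fn, tn):
    """Accuracy, enrichment factor, and type II error for one screen.

    tp, fp, fn, tn are counts of true positives, false positives,
    false negatives, and true negatives.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    screen_positive_rate = tp / (tp + fp)   # true positives among screened positives
    library_positive_rate = (tp + fn) / total  # actives in the whole library
    enrichment = screen_positive_rate / library_positive_rate
    type_ii = fn / (fn + tp)
    return {
        "accuracy": accuracy,
        "enrichment_factor": enrichment,
        "type_ii_error": type_ii,
    }


# Counts per 100 chemicals with 5 actives and 95 inactives (inferred).
scenario_a = screen_metrics(tp=2, fp=8, fn=3, tn=87)
scenario_b = screen_metrics(tp=5, fp=45, fn=0, tn=50)
```

With these counts, scenario A sends 10 chemicals forward and misses 3 of the 5 actives, while scenario B sends 50 forward and misses none, which is the trade-off the text goes on to discuss.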
   The screening method used for scenario A
appears to be better by many measures and is
an appropriate approach if the goal is to dis-
cover novel pharmaceutical leads. On the
other hand, the  screening method used for
scenario B is more appropriate when screening
chemicals for potential toxicity. Scenario B
will carry many more chemicals to the next
phase of testing, but the negatives are true
negatives. Chemicals identified by this pre-
screen as negative will have a lower priority for
continued testing and perhaps will not be
tested in any other manner for effects at this
particular target.
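
One way to picture the difference between the two screening goals is as a choice of score cutoff. The following is a sketch under assumptions, not a method from the article: it supposes a calibration set with known actives, treats more negative scores as stronger predicted binding, and uses invented scores. A cutoff chosen this way removes false negatives only on the calibration set; it does not guarantee the same for new chemicals.

```python
def zero_false_negative_cutoff(scores, is_active):
    """Strictest cutoff that still flags every known active.

    A chemical is screened positive when its score <= cutoff, so the
    cutoff must be at least as high as the weakest active's score.
    """
    active_scores = [s for s, active in zip(scores, is_active) if active]
    if not active_scores:
        raise ValueError("calibration set contains no known actives")
    return max(active_scores)


def classify(scores, cutoff):
    """Return True for each chemical that is screened positive."""
    return [s <= cutoff for s in scores]


# Hypothetical calibration data (kcal/mol-like scores).
scores = [-9.5, -8.1, -4.2, -7.0, -3.9, -5.5]
is_active = [True, False, True, False, False, False]

toxicity_cutoff = zero_false_negative_cutoff(scores, is_active)
toxicity_flags = classify(scores, toxicity_cutoff)

# A stricter, lead-discovery style cutoff for comparison (assumed value).
discovery_flags = classify(scores, -8.0)
missed_actives = sum(
    1 for flag, active in zip(discovery_flags, is_active) if active and not flag
)
```

In this toy set the permissive cutoff carries five of six chemicals forward, whereas the strict cutoff carries two forward but misses the weakly binding active.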
   This discussion addresses the challenges in
using current  docking methods for assessing
chemical toxicity. The methods that are cur-
rently available for computational molecular
docking were developed for drug discovery
and therefore are optimized to screen  large
chemical databases to find the most active
molecules and increase the enrichment factor.
[Figure 3 panel labels: scenario A, enrichment factor = 4, type II error = 0.6; scenario B, 55% predictive accuracy, enrichment factor = 2, type II error = 0; predictions compared against experiment.]

Figure 3. Illustration of type II errors and enrichment factors in chemical screening. The statistical
"type II error" is the ratio of the number of false negatives to the sum of false negatives and true
positives. The "enrichment factor" is the ratio of the true positive rate of the screen (the number of
true positives divided by the number of true positives plus false positives) to the ideal positive rate
of the chemical library (the number of positive chemicals in the library divided by the number of
chemicals in the library).
Some false negatives  are  not an important
concern as long as the enrichment rate is sig-
nificantly increased. In contrast, a screen for
assessing chemicals for potential toxicity often
deals with a smaller database of chemicals (the
chemicals encountered in the environment)
and must be capable of identifying chemicals
with much lower affinities than  the natural
ligands. Therefore, scoring functions and/or
methods for delineating active chemicals from
inactive chemicals must be explored  and bet-
ter understood in the context of environmen-
tal chemicals, and may involve computational
methods that are  more accurate but computa-
tionally intensive.

Virtual Screening  of  Chemicals
The usual approach for virtual screening of
chemicals for toxicity is to screen a database of
chemicals for each chemical's capacity to inter-
act with a single  macromolecular target and
initiate a single mode  of chemical toxicity. A
virtual screen that is  receptor specific pro-
duces a score vector where each element rep-
resents  the interaction  of that receptor with a
different chemical entity.  Inverting the prob-
lem so  that the vector  now contains  elements
that represent the capacity of a single chemi-
cal to interact with each of a series of targets
allows the most likely targets and, therefore,
the most likely modes of toxicity for a specific
chemical to be identified. A matrix is pro-
duced  by interrogating a library of targets
with a  database of chemicals. The relation-
ships among the elements in this  matrix have
the potential to yield additional insights, such
as receptor cross-talk or  multiple modes of
biologic potency (Macchiarulo et al. 2004),
and modes of sequestration (Perry et al.
2004).  A combination of these computation-
ally derived data  and experimentally derived
data can be data-mined to extract patterns and
associations. These associations can provide
additional knowledge for assessing the hazards
of chemicals and chemical mixtures or be used
to improve  prediction  tools in the context of
toxicity such as scoring functions  for molecu-
lar docking calculations.
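
The vector and matrix view described above can be sketched with plain dictionaries. This is an illustration only: the target and chemical names and all scores are hypothetical, and more negative scores are assumed to mean stronger predicted interaction.

```python
def invert(matrix):
    """Turn a target -> chemical -> score table into chemical -> target -> score."""
    inverted = {}
    for target, row in matrix.items():
        for chemical, score in row.items():
            inverted.setdefault(chemical, {})[target] = score
    return inverted


def likely_targets(by_chemical, chemical, top_n=2):
    """Targets with the best (most negative) scores for one chemical."""
    row = by_chemical[chemical]
    return sorted(row, key=row.get)[:top_n]


# Each inner dict is the score vector from one receptor-specific screen.
scores_by_target = {
    "target_1": {"chem_A": -9.1, "chem_B": -5.0},
    "target_2": {"chem_A": -6.3, "chem_B": -8.7},
    "target_3": {"chem_A": -4.0, "chem_B": -7.9},
}

scores_by_chemical = invert(scores_by_target)
targets_for_a = likely_targets(scores_by_chemical, "chem_A")
targets_for_b = likely_targets(scores_by_chemical, "chem_B")
```

Holding the full table, rather than one vector at a time, is what allows patterns across targets and chemicals to be mined later.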
    For some targets in the library,  other inter-
actions in addition to those included  in dock-
ing algorithms  must be considered. For
instance, modes of toxicity have been identified
that require covalent interactions between the
toxicant and the target (Zhou et al. 2005) or
that necessitate the redistribution of charge in
both the toxicant and  the target (Pardo et al.
1993). These interactions involve the elec-
tronic structure of both the putative  toxicant
and target molecules and, hence, require some
level of quantum chemistry.  However, most
current docking methods include only classical
interactions. One  approach is to use molecular
docking to  determine the structure  of com-
plexes,  and then  to calculate  the  short-range
interactions with quantum chemistry methods.
A few attempts have been made in recent
years to build essential  quantum effects
directly  into molecular docking calculations,
such as  quantum polarized ligand  docking
(Cho et al. 2005).
    In addition, to take  into account the
known or hypothesized biotransformation
products during the molecular docking calcu-
lations, each constituent must be included as
a separate chemical entity in the docking cal-
culations. Computational  tools already exist
for predicting metabolites (Jolivette and Ekins
2007),  so  docking calculations could be
improved by networking with metabolism
prediction models.
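
A rough sketch of coupling a metabolite list to a docking run follows. It assumes the metabolites have already been predicted by some external tool, and it summarizes each parent by the best score among the parent and its metabolites; that summary rule is an assumption made here, not something the article specifies. All names and scores are hypothetical.

```python
def expand_with_metabolites(parents, predicted_metabolites):
    """List every entity to dock as (entity, parent) pairs.

    Each parent and each of its predicted metabolites becomes its own entry.
    """
    entries = []
    for parent in parents:
        entries.append((parent, parent))
        for metabolite in predicted_metabolites.get(parent, []):
            entries.append((metabolite, parent))
    return entries


def best_score_per_parent(entries, scores):
    """For each parent, keep the entity with the most negative score."""
    best = {}
    for entity, parent in entries:
        score = scores[entity]
        if parent not in best or score < best[parent][1]:
            best[parent] = (entity, score)
    return best


parents = ["parent_X", "parent_Y"]
predicted_metabolites = {"parent_X": ["X_hydroxylated", "X_epoxide"]}

entries = expand_with_metabolites(parents, predicted_metabolites)

# Hypothetical docking scores for every entity.
scores = {
    "parent_X": -5.2,
    "X_hydroxylated": -8.4,
    "X_epoxide": -6.0,
    "parent_Y": -7.1,
}
summary = best_score_per_parent(entries, scores)
```

In this toy case the parent compound scores weakly while one metabolite scores strongly, so a screen that docked only the parent would have ranked it lower.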
    Another application for virtual screening is
predicting prospective targets for a particular
chemical and its metabolites using inverse
docking strategies.  In drug discovery, this
approach can identify potential alternate uses
for drug candidates or  predict side effects of
pharmaceuticals that might arise from unin-
tended interactions with other targets (i.e., off-
target effects),  thus  producing  adverse
outcomes. As an element in a toxicity screen
for environmental chemicals, inverse docking
tools can be used to guide experimental test-
ing. Inverse docking can help focus efforts and
lead to a reduction in the  use of resources as
well as the  time required for a hazard or risk
assessment. Some attempts at inverse docking
methods have arisen in recent years (Ekins
2004; Ji  et  al. 2006),  although these  methods
still face some limitations  that prevent their
more general use in virtual screenings. Inverse
docking  strategies could become a more viable
resource as further target crystal structures
become available and molecular docking
methods are improved, and in conjunction
with systems biology methods such as pro-
teomics and genomics (Loging et al. 2007).

Conclusions
Computational molecular  modeling  methods
aid the risk assessment process by providing a
rational  approach for some extrapolations in
the evaluation of chemical hazard.  For
instance, when elements of a data set required
for evaluating the potential hazard of a chemi-
cal are unavailable and inferences can be made
based on interactions with putative targets,
molecular modeling can be used to  simulate
the relevant missing information. Both ligand-
and structure-based molecular modeling
methods used in pharmaceutical discovery can
be adapted to provide this  type of simulated
data. However, because  of the greater diversity
of chemical space  and  binding  affinity
domains being considered and the differences
in the strategic  application  of the results (the
need to minimize  false  negatives), these
molecular modeling strategies require addi-
tional considerations when assessing  chemical
hazards. Molecular docking of potential envi-
ronmental chemicals to putative macromolecu-
lar targets for toxicity provides a measure of
their capacity to interact and hence is an aid in
the (pre)screening process for specific modes of
toxicity. These results provide a rationale for
developing further, more complete testing
strategies.

                 REFERENCES

Abagyan R, Totrov M. 2001. High-throughput docking for lead
   generation. Curr Opin Chem Biol 5(4):375-382.
Carlson HA. 2002. Protein flexibility and drug design: how to hit
   a moving target. Curr Opin Chem Biol 6(4):447-452.
Cho AE, Guallar V, Berne B, Friesner RA. 2005. Importance of
   accurate charges in molecular docking: quantum mechani-
   cal/molecular mechanical (QM/MM) approach. J Comput
   Chem 26:915-931.
Coupez B, Lewis RA. 2006. Docking and scoring: theoretically
   easy, practically impossible? Curr Med Chem 13:2995-3003.
Dix DJ, Houck KA, Martin MT, Richard AM, Setzer RW,
   Kavlock RJ. 2007. The ToxCast program for prioritizing toxi-
   city testing of environmental chemicals. Toxicol Sci
   95(1):5-12.
Ekins S. 2004. Predicting undesirable drug interactions with
   promiscuous proteins in silico. Drug Discov Today
   9(6):276-285.
Halperin I, Ma B, Wolfson H, Nussinov R. 2002. Principles of
   docking: an overview of search algorithms and a guide to
   scoring functions. Proteins 47(4):409-443.
Hillisch A, Pineda LF, Hilgenfeld R. 2004. Utility of homology
   models in the drug discovery process. Drug Discov Today
   9(15):659-669.
Hu L, Benson ML, Smith RD, Lerner MG, Carlson HA. 2005. Binding
   MOAD (mother of all databases). Proteins 60(3):333-340.
Huang B, Schroeder M. 2006. LIGSITECSC: predicting ligand bind-
   ing sites using the Connolly surface and degree of conser-
   vation. BMC Struct Biol 6:19; doi:10.1186/1472-6807-6-19
   [Online 24 September 2006].
Ji ZL, Wang Y, Yu L, Han LY, Zheng CJ, Chen YZ. 2006. In silico
   search of putative adverse drug reaction related proteins
   as a potential tool for facilitating drug adverse effect pre-
   diction. Toxicol Lett 164(2):104-112.
Jolivette LJ, Ekins S. 2007. Methods for predicting human drug
   metabolism. Adv Clin Chem 43:131-176.
Jorgensen WL. 2004. The many roles of computation in drug
   discovery. Science 303(5665):1813-1818.
Kuntz ID. 1992. Structure-based strategies for drug design and
   discovery. Science 257(5073):1078-1082.
Loging W, Harland L, Williams-Jones B. 2007. High-throughput
   electronic biology: mining information for drug discovery.
   Nat Rev Drug Disc 6(3):220-230.
Macchiarulo A, Nobeli I, Thornton JM. 2004. Ligand selectivity
   and competition between enzymes in silico. Nat
   Biotechnol 22(8):1039-1045.
Matthews EJ, Kruhlak NL, Benz RD, Contrera JF, Rogers BA,
   Wolf  MA, et al. 2005. DSSTox FDA Maximum (Recom-
   mended) Daily Dose Database (FDAMDD): SDF Files and
   Documentation. Updated version FDAMDD_v2b_1217_
   10Apr2006. Available: http://www.epa.gov/ncct/dsstox/
   [accessed 1 September 2007].
Pardo L, Osman R, Weinstein H, Rabinowitz JR. 1993.
   Mechanisms of nucleophilic addition to activated double
   bonds: 1,2- and 1,4-Michael addition of ammonia. J Am
   Chem Soc 115:8263-8269.
Perry JL, Goldsmith MR, Peterson MA, Beratan DN, Wozniak G,
   Rueker F, et al. 2004. Structure of the ochratoxin A binding
   site within human serum albumin. J Phys Chem B
   108(43):16960-16964.
Prathipati P, Dixit A, Saxena AK. 2007. Computer-aided drug
   design: integration of structure-based and ligand-based
   approaches in drug design. Curr Comp Aid Drug Des
   3(2):133-148.
Sousa SF, Fernandes PA, Ramos MJ. 2006. Protein-ligand
   docking: current status and future challenges. Proteins
   Struct Funct Bioinform 65:15-26.
Tong W, Perkins R, Xing L, Welsh WJ, Sheehan DM. 1997.
   QSAR models for binding of estrogenic compounds to
   estrogen receptor alpha and beta subtypes. Endocrinology
   138(9):4022-4025.
U.S. EPA. 2003. A Framework for a Computational Toxicology
    Research Program in ORD. EPA/600/R-03/065. Washington,
    DC:U.S. Environmental Protection Agency.
Waller CL, Oprea TI, Chae K, Park HK, Korach KS, Laws SC,
    et al. 1996. Ligand-based identification of environmental
    estrogens. Chem Res Toxicol 9(8):1240-1248.
Wang R, Fang X, Lu Y, Yang CY, Wang S. 2005. The PDBbind
    database: methodologies and updates. J Med Chem
    48(12):4111-4119.
Wolf MA, Burch J, Richard AM. 2006. DSSTox EPA High
    Production Volume Challenge Program Structure-Index
    File: SDF File and Documentation. Updated version
    HPVCSI_v1b_3548_04Oct2006. Available:
    http://www.epa.gov/ncct/dsstox/ [accessed 1 September 2007].
Zhou S, Chan E, Duan W, Huang M, Chen Y-Z. 2005. Drug bio-
    activation, covalent binding to target proteins and toxicity
    relevance. Drug Metab Rev 37(1):41-213.

-------
TOXICOLOGICAL SCIENCES 103(1), 14-27 (2008)
doi: 10.1093/toxsci/kfm297
Advance Access publication December 7, 2007
                                                     REVIEW

       Computational Toxicology—A State  of the  Science  Mini  Review

    Robert J. Kavlock,*,1 Gerald Ankley,† Jerry Blancato,* Michael Breen,‡ Rory Conolly,* David Dix,* Keith Houck,*
          Elaine Hubal,* Richard Judson,* James Rabinowitz,* Ann Richard,* R. Woodrow Setzer,* Imran Shah,*
                                          Daniel Villeneuve,† and Eric Weber‡
   *National Center for Computational Toxicology; †National Health and Environmental Effects Research Laboratory; and ‡National Exposure Research
           Laboratory, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina

                                       Received October 5, 2007; accepted December 5, 2007
  Advances in computer sciences and hardware combined with
equally  significant  developments  in  molecular  biology and
chemistry are providing toxicology with a  powerful  new tool
box. This tool box of computational models promises to increase
the efficiency and the effectiveness by which the hazards and risks
of environmental chemicals are determined. Computational toxi-
cology focuses on applying these tools across many scales, in-
cluding vastly increasing the numbers of chemicals and the types
of biological interactions  that can be evaluated. In  addition,
knowledge of toxicity pathways gathered within the tool box will
be directly applicable to the study of the biological responses
across a range of dose levels, including those more likely to be
representative of exposures to the human population. Progress in
this field will facilitate the transformative shift called for in the
recent report on toxicology in the 21st century by the National
Research Council. This review surveys the state of the art in many
areas  of computational toxicology and points to several hurdles
that will be important to overcome as the field moves forward.
Proof-of-concept studies need to clearly demonstrate the addi-
tional predictive power gained from these tools. More researchers
need to become  comfortable working with both the data  gener-
ating  tools  and the computational modeling capabilities, and
regulatory authorities must show a willingness to embrace new
approaches as they gain scientific acceptance. The next few years
should witness the early fruits of these efforts, but as the National
Research Council indicates, the paradigm shift will take a
long-term investment and commitment to reach full potential.
  1 To whom correspondence  should be addressed  at B-205-01, National
Center for Computational Toxicology, Office of Research and Development,
U.S. Environmental Protection Agency, Research Triangle Park, NC 27711.
Fax: 919-541-1194. E-mail: kavlock.robert@epa.gov.
  This mini review  is based  on presentations and discussions  at  the
International  Science  Forum  on  Computational Toxicology  that  was
sponsored by the Office of  Research and Development of  the U.S.
Environmental Protection Agency and held in Research Triangle Park, NC
on May 21-23, 2007. The complete agenda and copies  of the individual
presentations from the Forum are available on the Internet (www.epa.gov/ncct/
sciforum).

Published by Oxford University Press 2007.
  Key Words: bioinformatics;  biological  modeling; QSAR;
systems biology; cheminformatics; high throughput  screening;
toxicity pathways.
  Computational toxicology is a growing research area that is
melding advances in molecular biology and chemistry  with
modeling and computational science in order to  increase the
predictive  power  of the  field  of  toxicology.  The  U.S.
Environmental Protection Agency (U.S. EPA) defines compu-
tational toxicology as the "integration of modern computing
and information technology with molecular biology to improve
Agency prioritization of data requirements and risk assessment
of chemicals" (U.S. EPA, 2003). Success in this  area would
translate to greater efficiency and effectiveness in determining
the hazards of the many environmental stressors that must be
dealt with, and deciding  what types of information are most
needed to decrease uncertainties in the protection of human
health  and the environment. Computational toxicology differs
from traditional toxicology in  many aspects, but perhaps the
most  important is that  of scale.  Scale in  the  numbers  of
chemicals that are studied, breadth of endpoints and pathways
covered, levels of biological organization examined, range of
exposure conditions considered, and in the coverage  of life
stages, genders, and species. It will take considerable progress
in all  these  areas to  make toxicology a broadly predictive
science. Key advances leading the  field include  construction
and  curation  of  large-scale  data repositories  necessary  to
anchor the interpretation  of information from new technolo-
gies; the  introduction of virtual  and laboratory-based high-
throughput assays on hundreds to thousands of chemicals per
day and high-content assays with hundreds  to thousands of
biological  endpoints  per sample  for  the  identification  of
toxicity pathways; and the latest advances  in  computational
modeling that are providing the tools needed to integrate in-
formation across multiple levels of biological organization for
characterization of chemical hazard and risk to individuals and

-------
                                              COMPUTATIONAL TOXICOLOGY
                                                        TABLE 1
 Tasks Identified by the National Research Council (2007) in Each Main Topic Area that are Necessary to Transform Toxicity Testing
 from the Current Animal-Model Based Approach to One that is more Reliant on In Vitro Systems to Detect and Characterize Toxicity
                                                   Pathways of Concern

Topic area                                  Task

Population-based and human-exposure data    Develop novel approaches to gather exposure data needed for making
                                              hazard ID and risk assessment decisions.

Chemical characterization                   Environmental chemicals would be first characterized for a number of properties
                                              related to environmental distribution, exposure risk, and physicochemical properties.

Toxicity pathway characterization           Toxicity pathways describe the key details of modes and mechanisms at a molecular level.
                                              By characterizing these and developing relevant in vitro assays, one can make definitive
                                              statements about the potential hazards posed by chemicals being tested.

Targeted testing                            In many cases, even when it is known what toxicity pathways are activated by a chemical,
                                              it will be necessary to perform specialized or targeted tests, for instance to determine
                                              dose-response relationships. The targeted testing phase may continue to use animal models.

Dose-response and extrapolation modeling    Increasingly accurate and predictive computer models need to be developed to make use
                                              of the information derived from the earlier phases and to aid in making regulatory decisions.

populations.  Collectively,  these advances reflect the wave of
change that is sweeping and reinvigorating toxicology, just in
time to facilitate the vision of toxicology in the 21st century
that was recently released by the National Research Council
(NRC) of the National Academy of Sciences (National Research
Council, 2007).  The NRC report's overall objective is to foster
a transformative paradigm shift in toxicology based largely on
the use of in vitro systems  that will (1) provide broad coverage
of chemicals, chemical mixtures, outcomes, and life stages; (2)
reduce the cost  and time of testing; (3) use fewer animals and
cause minimal suffering in the animals used; and (4) develop
a more robust scientific base for  assessing health effects of
environmental agents. The report describes  this effort as one
that will require the involvement of multiple organizations in
government, academia,  industry, and  the public.  This mini
review describes advances that are now occurring in many of
the areas that are contributing to computational toxicology, and
is  organized along  the dimensions outlined by  the National
Research Council (2007). The principal tasks outlined in the
NRC report are  presented in Table 1, and each relevant  aspect
of computational toxicology is discussed accordingly.
            CHEMICAL CHARACTERIZATION

  Chemical characterization involves the compilation of data
on  physical and chemical properties,  uses,  environmental
surveillance, fate and transport, and properties that relate to the
potential for exposure, bioaccumulation, and toxicity (National
Research Council, 2007).

Predicting the Environmental Fate and Transport of
  Chemical Contaminants

  The ability to conduct chemical  exposure and risk assess-
ments is dependent on tools and models  capable of predicting
environmental concentrations. As the size (currently > 80,000
             chemicals) and  diversity of the regulated  chemical universe
             continue to increase, so does the need for more sophisticated
             tools and models for calculating the physical-chemical prop-
             erties necessary for predicting environmental fate and transport.
             This need is further driven by the increasingly complex array of
             exposure  and  risk assessments necessary to develop  scientif-
             ically defensible regulations. As this modeling capability in-
             creases in complexity and scale, so must the data inputs. These
             new predictive models will require huge arrays of input data,
             and many of the required inputs are neither available nor easily
             measured.
               Currently,  the  Estimation Program  Interface  Suite  (EPI
             Suite) is the primary modeling system utilized within U.S. EPA
             for  providing estimates of the common  physical-chemical
             properties necessary for predicting chemical fate and transport
             such as octanol/water partition coefficients, water solubility,
             hydrolysis rate constants,  and Henry's  law constants (http://
             www.epa.gov/oppt/exposure/pubs/episuite.htm). The EPI Suite
             calculators are based  primarily  on a fragment constant ap-
             proach that has been validated with an independent  set of
             chemicals. In general, the EPI Suite predicts physical-chemical
             properties  within an  order of  magnitude,  which is normally
             sufficient for screening level regulatory assessments.
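
The fragment constant idea described above can be sketched in a few lines: a property estimate is the sum of per-fragment contributions, weighted by how often each fragment occurs in the molecule, plus an intercept. The fragment names, coefficient values, and intercept below are illustrative placeholders chosen for this sketch; they are not the values used by the EPI Suite.

```python
# Illustrative fragment-constant estimator for log Kow.
# All coefficients are made-up placeholders, NOT EPI Suite values.
HYPOTHETICAL_FRAGMENTS = {
    "CH3": 0.55,
    "CH2": 0.49,
    "OH": -1.41,
}
HYPOTHETICAL_INTERCEPT = 0.23


def estimate_log_kow(fragment_counts, coefficients=HYPOTHETICAL_FRAGMENTS,
                     intercept=HYPOTHETICAL_INTERCEPT):
    """Sum count * coefficient over all fragments, then add the intercept."""
    unknown = set(fragment_counts) - set(coefficients)
    if unknown:
        # A fragment method cannot estimate chemicals outside its library.
        raise KeyError(f"no coefficient for fragments: {sorted(unknown)}")
    return intercept + sum(
        count * coefficients[name] for name, count in fragment_counts.items()
    )


# 1-Butanol written as CH3-(CH2)3-OH.
butanol = {"CH3": 1, "CH2": 3, "OH": 1}
log_kow = estimate_log_kow(butanol)
```

The explicit failure on an unknown fragment mirrors a practical limit of any fragment library: a chemical containing a substructure that was not in the training set cannot be estimated without extending the library.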
               The limitations of the EPI Suite calculators (e.g., inability to
             calculate ionization constants (pKas) and transformation rate
             constants  beyond hydrolysis) require the use of other compu-
             tational methods for meeting  data needs.  SPARC Performs
             Automated Reasoning in Chemistry (SPARC) uses computa-
             tional algorithms  based on fundamental  chemical structure
             theory (i.e., a blending of linear free energy relationships [LFER] to compute
             thermodynamic properties  and PMO theory to describe quan-
             tum effects) to estimate  numerous physical-chemical proper-
             ties (Hilal  et al., 2005; Whiteside et al., 2006). The power of
             the  tool box is its ability to couple whole  molecule and site-
             specific chemistry to calculate new properties. For example,
             pKa and property  models  are coupled to calculate tautomeric

-------
                                                   KAVLOCK ET AL.
equilibrium  constants;  and pKa, hydrolysis,  and  property
models are coupled to calculate complex macro pKa's where
ionization, hydrolysis, and tautomerization may couple to yield
very complex apparent pKa's. This capability is essential for
calculating physical-chemical properties of organic chemicals
with complex chemical structures that contain multiple ioniz-
able functional moieties, such as many of the pharmaceuticals
that are being detected in the effluents of many waste water
treatment plants.
  In addition to the more traditional computational approaches
such as the  fragment constant approach and LFER,  quantum
mechanical calculators coupled with aqueous solvation models
are also finding increasing applications in predicting physical-
chemical properties, for predicting chemical reactivity (Lewis
et al.,  2004) and  for  investigating reaction mechanisms for
transformation processes of interest  such as  reductive trans-
formations (Arnold et al.,  2002).
  Tools for predicting transformation kinetics  and pathways
are  quite limited, particularly with respect to  biological  pro-
cesses. The  EPI Suite and  SPARC  calculators have limited
capability for the calculation of hydrolysis rate  constants, and
currently  have no ability  to calculate  biodegradation  rate
constants. CATABOL is an expert system that begins to fill this
gap by predicting biotransformation pathways and calculating
probabilities  of  individual transformations (Jaworska et  al.,
2002).  The  core of CATABOL  is a degradation simulator,
which includes a library of hierarchically ordered individual
transformations  (abiotic and  enzymatic  reactions).  It  also
provides the magnitude and chemical properties of the stable
daughter products resulting from biodegradation.
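
To make the notion of a hierarchically ordered transformation library concrete, the following toy simulator applies the highest-priority applicable rule at each step and multiplies the step probabilities along the pathway. It is not CATABOL's algorithm or data: the rule names, the string tags standing in for chemical structures, the probabilities, and the use of a simple product are all assumptions made for this sketch.

```python
# Toy degradation simulator: NOT CATABOL's algorithm or data.
# Each rule: (name, substructure tag it acts on, product tag, probability).
# Rules are listed in priority order; the first applicable rule fires.
HYPOTHETICAL_RULES = [
    ("ester hydrolysis", "ester", "acid", 0.90),
    ("alcohol oxidation", "alcohol", "aldehyde", 0.70),
    ("aldehyde oxidation", "aldehyde", "acid", 0.95),
]


def simulate_pathway(start, rules=HYPOTHETICAL_RULES, max_steps=10):
    """Follow the highest-priority applicable rule until none applies.

    Returns the list of (rule name, product) steps and the pathway
    probability, taken here as the product of the step probabilities.
    """
    state, steps, probability = start, [], 1.0
    for _ in range(max_steps):
        match = next((rule for rule in rules if rule[1] == state), None)
        if match is None:
            break  # stable product: no rule in the library applies
        name, _, product, step_probability = match
        steps.append((name, product))
        probability *= step_probability
        state = product
    return steps, probability


steps, probability = simulate_pathway("alcohol")
```

In this toy library the alcohol is oxidized to an aldehyde and then to an acid, which is stable because no rule acts on it.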
  The future development of models for predicting the environ-
mental fate and transport of chemical contaminants is driven pri-
marily by the need for multimedia and multipathway assessments
over broad spatial and temporal scales. Geographic information
system-based technologies  will  be  required for  accessing,
retrieving, and processing data contained in a wide  range of
national databases maintained by various government agencies.


Toxico-Cheminformatics

  The term "Toxico-Cheminformatics" encompasses  activities
designed to  harness, systematize,  and integrate the  disparate
and largely textual  information available on the toxicology and
biological activity  of chemicals. These data exist in corporate
archives, published literature, public data compilations, and in
the  files of U.S. government  organizations such as the  National
Toxicology Program (NTP),  U.S. EPA, and the U.S. Food and
Drug Administration. Data mining approaches and predictive
toxicity models that can advance our ability to effectively screen
and prioritize large lists of chemicals are dependent upon the
ability to effectively access and employ such data resources.
  The   National  Center   for  Biotechnology  Information
(NCBI)'s PubChem project (http://pubchem.ncbi.nlm.nih.gov/)
is a large, public chemical  data repository and open search/
  retrieval system that links chemical structures to bioassay data.
  PubChem has become an indispensable resource for chemists
  and biologists due to its wide coverage of chemical space (> 10
  million structures)  and biological space  (> 500  bioassays),
  structure-searching and analysis tools, and linkages  to the large
  suite  of NCBI databases (http://www.ncbi.nlm.nih.gov). Pub-
  Chem includes data for the NCI 60 cell line panel, used by the
  NCI Developmental Therapeutics Program to screen more than
  100,000 compounds   and  natural  products  for   anticancer
  activity and providing a rich data resource  for a comprehen-
  sively characterized set of cells. Weinstein  (2006)  has  incor-
  porated these data into a fully relational, public resource titled
  "CellMiner," and coined the term "integromics" to convey the
  highly  flexible  functionality  of this  system  for  chemical/
  biological  profiling,   spanning  genomics,  high-throughput
  screening (HTS), and chemical information domains. Contrib-
  uting to efforts in data standardization and access, U.S. EPA is
  creating a large relational data warehouse  for chemical  and
  toxicity data from various public resources. This Aggregated
  Computational Toxicology  Resource is designed  to support
  flexible data mining and modeling efforts across a  wide range
  of  biological  information  domains  and the new  U.S. EPA
  ToxCast program (Dix et al., 2007).
    With HTS  approaches being increasingly applied to tox-
  icology data  sets,  such as represented by the NTP  High-
  Throughput Testing Program (National Toxicology Program
  High-Throughput Screening Program, 2006), come challenges
  to determine the most effective means for employing such data
  to improve toxicity prediction  models. Anchoring large ma-
  trices  of HTS activity data to relatively  sparse  phenotypic
  endpoint data across chemical compound space presents  a fun-
  damental challenge. Yang (2007) has demonstrated the value
  of linking bioassays with toxicity endpoints via the structural
  feature dimension, rather than the compound level, generating
  matrices to determine correlation of bioassays with toxicity.
  This  paradigm addresses the practical problem of  the sparse
  data space and allows  quantitative multivariate analysis.
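
A small sketch can show why moving from the compound level to the structural feature level helps with sparse data. In the toy data below, the assay and the toxicity endpoint were measured on disjoint sets of compounds, so nothing can be paired compound by compound, yet both can be summarized per structural feature and then correlated. The compounds, features, values, and the choice of a mean per feature followed by a Pearson correlation are assumptions for illustration, not the procedure of Yang (2007).

```python
from math import sqrt

# Hypothetical toy data: structural features present in each compound.
FEATURES = {
    "c1": {"nitro", "phenol"},
    "c2": {"nitro"},
    "c3": {"phenol", "ester"},
    "c4": {"ester"},
    "c5": {"nitro", "ester"},
    "c6": {"phenol"},
}
# Sparse measurements on disjoint compound sets (1 = active or toxic).
ASSAY_ACTIVE = {"c1": 1, "c2": 1, "c3": 0}
TOXIC = {"c4": 0, "c5": 1, "c6": 0}


def feature_profile(measurements, features=FEATURES):
    """Mean measured value per feature over the compounds containing it."""
    sums, counts = {}, {}
    for compound, value in measurements.items():
        for feature in features[compound]:
            sums[feature] = sums.get(feature, 0) + value
            counts[feature] = counts.get(feature, 0) + 1
    return {feature: sums[feature] / counts[feature] for feature in sums}


def pearson(xs, ys):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


assay_by_feature = feature_profile(ASSAY_ACTIVE)
tox_by_feature = feature_profile(TOXIC)
shared = sorted(set(assay_by_feature) & set(tox_by_feature))
r = pearson([assay_by_feature[f] for f in shared],
            [tox_by_feature[f] for f in shared])
```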
    These toxico-cheminformatics tools and public resources are
  evolving in tandem with increasing legislative pressures within
  the United States, Europe, and Canada to prioritize large lists of
  existing chemicals for  testing  and/or assessment.  Health
  Canada has been the  first to fully implement a tiered Hazard
  ID  and Exposure Assessment evaluation process relying upon
  weight-of-evidence  consideration of existing data  and results
  of toxicity prediction  models, and structure-analog inferences
  (Health  Canada, 2007). The   approach is pragmatic  and
  transparent, relying upon existing capabilities  and technolo-
  gies,  and was successfully employed  to prioritize the Domestic
  Substance  List  inventory   of  23,000  chemicals  by  the
  legislatively mandated deadline under  the Canadian Environ-
  mental Protection Act of September  2006. This approach will
  greatly benefit from advances in toxico-cheminformatics,  and
  will influence other governmental agencies as they struggle
  with  similar mandates  for prioritizing large lists of chemicals.

-------
Molecular Modeling Methods as a Virtual Screening Tool for
  the Assessment of Chemical Toxicity
  Molecular modeling methods  provide  an  approach  for
estimating chemical activity when the relevant data are not
available. When used in this way, they become an important tool
for  screening chemicals for toxicity and hazard identification.
Computational  molecular methods may also be  applied to
model toxicity  pathways when some of the relevant experi-
mental data are unavailable.  As noted above, some of these
methods  have  been used  to  estimate various  physical  and
chemical properties of the molecules relevant to environmental
fate and transport.  Other molecular modeling methods may be
applied to simulate critical processes in specific mechanisms of
action involved in toxicity. An initial and often differential step
in many of these mechanisms of action requires the interaction
of  the molecular  environmental contaminant,  or  one of its
descendants, with  a (macro)molecular target.  An element of
a virtual screen for potential toxicity may be developed from the
characterization of these toxicant-target interactions. One large
and important  subset  of target-toxicant  interactions  is  the
interaction  of  chemicals with proteins. Many  computational
approaches for screening libraries of molecules for pharmaceu-
tical application have been developed. These methods also may
be applied to screen environmental chemicals for toxicity, but the
differing  requirements of these two similar problems must be
considered. For example, screening of environmental chemicals
requires minimizing false negatives, whereas drug discovery only
requires the identification of some of the most potent chemicals,
which can yield a significant number of false negatives.
  Molecular modeling methods that incorporate the
structure of the protein target, that of known ligands, or both
have been used to investigate nuclear receptor and cytochrome
P450 targets. In addition to the ligand binding site, features on
the  protein surface, such as the Activation Function 2 site or
other coactivator  and  corepressor regions  of the  Human
Pregnane X Receptor, are  potential  sites for interference by
environmental chemicals (Wang et al., 2007). Methods  that
map the binding of functional groups from chemicals to protein
surfaces  and binding sites have been developed (Kaya et al.,
2006; Sheu et al., 2005). These maps of the favorable positions
of  molecular   substructures  provide  fragment libraries  to
which chemicals may be fitted and their suitability for binding
evaluated. Current studies have demonstrated the importance of
the  motion of the  target for ligand binding, protein function,
and subunit assembly. Local motion of the amino acids in the
binding site provides the flexibility to allow the potential ligand
to sculpt the ligand binding domain. Concepts that incorporate
protein flexibility to identify binding modes of toxicological
interest are being developed  (Lill et al., 2006;  Vedani et al.,
2006). This technology combines structure-based  molecular
docking  with multidimensional quantitative structure  activity
relationships. Global modes of protein motion have been found
to influence protein function by affecting binding and subunit
      assembly (Wang et al., 2007). Metabolizing enzymes present
      potential targets for clearance of chemicals as well as activation
      that could  result in toxicity. Understanding the  relationship
      between structure and function for P450 serves to illuminate both
      of these issues that are relevant for assessing the effects of chem-
      icals. Pharmacophores and quantitative structure activity relation-
      ships have been developed for the various CYPs (Jolivette and
      Ekins, 2007), and machine learning methods have been developed
      to predict metabolic routes (Ekins et al., 2006). These approaches
      will allow relatively rapid and comprehensive coverage of the
      interaction  of chemicals with multiple macromolecules, thus
      complementing results from HTS assays (see below).
                        TOXICITY PATHWAYS

        Toxicity pathways represent the normal cellular responses
      that  are  expected to  result in adverse health effects when
      sufficiently perturbed by chemical exposure (National Research
      Council,  2007). A wide variety of in vitro and in vivo tools are
      being developed to identify critical toxicity pathways.

      Application of Drug Discovery Technologies in
        Environmental Chemical Prioritization

        Strategies for investigating the  toxicity of environmental
      chemicals have changed little over many years and continue to
      heavily rely on  animal  testing. However, recent advances in
      molecular biology, genomics, bioinformatics, systems biology,
      and computational toxicology have led to the application of
      innovative  methods  toward   more   informative  in  vitro
      approaches. The application of quantitative, HTS assays is
      a key method. Originally developed for use in drug discovery
      by the pharmaceutical industry, these assays quantify molecular
      target-, signaling pathway-, and cellular phenotype-focused
      endpoints with capacity  to evaluate thousands of chemicals in
      concentration-response format. As an example, the National
      Institutes  of Health  (NIH)  Chemical  Genomics  Center  has
      built  an  infrastructure for robust,  quantitative,  HTS  assays
      (Inglese  et al., 2006) that is  currently being used to  screen
      thousands  of   environmental  chemicals  for a   variety  of
      toxicology-related endpoints. This project utilizes data provided
      by the NTP's  HTS  Initiative  (http://ntp.niehs.nih.gov/index.
      cfm?objectid=05F80E15-F1F6-975E-77DDEDBDF3B941CD)
      and U.S.  EPA's  ToxCast Program (Dix et al., 2007).
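
Concentration-response data of the kind described above are commonly summarized by a sigmoidal curve and a half-maximal activity concentration (AC50). The sketch below assumes a four-parameter Hill model and a deliberately crude grid search over candidate AC50 values; the function names, fixed parameters, and synthetic titration are hypothetical and are not taken from the screening programs cited in the text.

```python
def hill(conc, bottom, top, ac50, slope):
    """Four-parameter Hill curve: response at a concentration > 0."""
    return bottom + (top - bottom) / (1.0 + (ac50 / conc) ** slope)


def fit_ac50(concs, responses, bottom, top, slope, candidates):
    """Return the candidate AC50 with the smallest sum of squared errors.

    bottom, top, and slope are held fixed to keep the sketch short; a real
    fit would estimate all four parameters with a nonlinear optimizer.
    """
    def sse(ac50):
        return sum((hill(c, bottom, top, ac50, slope) - r) ** 2
                   for c, r in zip(concs, responses))
    return min(candidates, key=sse)


# Synthetic 7-point titration (arbitrary units) from a known curve.
concs = [0.01, 0.1, 1.0, 10.0, 100.0, 1000.0, 10000.0]
true_ac50 = 10.0
responses = [hill(c, 0.0, 100.0, true_ac50, 1.0) for c in concs]

# Log-spaced candidates from 0.01 to 10000, four per decade.
candidates = [10 ** (k / 4) for k in range(-8, 17)]
best_ac50 = fit_ac50(concs, responses, 0.0, 100.0, 1.0, candidates)
```

Testing each chemical across a concentration series, rather than at a single concentration, is what makes a potency estimate such as the AC50 available for every chemical screened.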
        HTS using cellular  assays offers perhaps the greatest hope
      for  transformation of the current  toxicity testing paradigm.
      Such  systems incorporate comprehensive, functioning, cellular
      signaling pathways, the disturbance of which by environmental
      chemicals would suggest a potential for toxicity. Development
      of high-content  screening  (HCS)  platforms consisting  of
      automated, fluorescence microscope imaging  instruments and
      image analysis  algorithms greatly facilitated quantitation  of
      chemical  perturbations of cell signaling pathways and vital
      organelle function on  a  single cell basis.  As an illustration of

-------
the utility of this approach, human liver toxicants with a variety
of mechanisms of action were detected with both good sen-
sitivity and specificity through screening multiple  endpoints
such as nuclear area and cell proliferation in a human liver cell
line  (O'Brien et al., 2006). This  approach also  is useful in
examining  effects  of new classes of  chemicals  (e.g.,  nano-
materials) for potential toxicity by reporting effects on toxicity-
associated  endpoints  and  allowing visual appreciation for
novel, and perhaps unexpected, effects on cellular morphology
and function (Ding et al., 2005). With an eye toward repro-
ducing normal  physiology  in vitro to the  greatest extent
possible, Berg et al. (2006) established coculture systems of
primary human cells and developed assays that measure many
endpoints encompassing a wide variety of signaling pathways.
Screening of pharmacological probes in these assays demon-
strated similar behavior  of  chemicals  related by mechanism
of action,  thus providing  a  system  potentially useful for
understanding mechanisms of toxicity. Although HCS was not
used in  this  application,  the  marriage of complex,  primary
human cell cultures with HCS analysis is a likely, and highly
valuable, development in the field of toxicity screening. HTS
approaches do have imposing hurdles to overcome, however,
including volatile or aqueous  insoluble environmental chem-
icals, need for inclusion of  biotransformation capacity in the
in vitro test systems, the myriad of potential toxicity pathways
that  must be covered, the likelihood of cell-type dependent
activity,  and the probability of dependence of some mecha-
nisms of toxicity on higher level interactions not found in cell
culture systems (Houck  and  Kavlock, 2007).
   The HTS and HCS methods described are all data-intensive
and require computational approaches to analyze and properly
interpret. The high dimensionality of the data may require novel
statistical approaches. Results are likely to be used in building
models that predict the potential for toxicity for new  chemicals
based on their behavior in in vitro assays. In addition, screening
results integrated into systems biology models  should lead to
insights into mechanisms of action that will be invaluable for
risk assessment. Validation and harmonization of protocols at
the international level should result in a much more efficient and
comprehensive safety net for  hazardous  chemical protection,
and greatly reduce the number of laboratory animals needed to
accomplish this (Hartung, 2006).


Using Genomics to Predict Potential Toxicity

   Transcriptomics  is a useful  approach for understanding the
interactions  of chemicals with biological  targets,  and  can
complement  the HTS  assays  used  for bioactivity profiling.
Using bioactivity profiles to  accurately  predict  toxicity  and
prioritize chemicals for further testing would allow for the
focusing  of resources on greater  potential  hazards  or risks.
Prioritization efforts to which genomics data might contribute
include U.S. EPA's voluntary  high production volume (HPV)
program, wherein chemicals  manufactured in large amounts are
  identified  and hazard  characterized according  to  chemical
  category. Genomics is being developed as  part of a suite of
  tools to help confirm the category groupings of HPV chemicals,
  and identify  which chemicals  or chemical categories may
  present  greater  hazard or  risk. The  U.S. EPA is  actively
  developing the methods, policies,  and infrastructure for using
  genomics data in such a regulatory context (Dix et al., 2006).
  In  vitro toxicogenomics  methods  are  being developed  and
  evaluated for toxicity prediction and for addressing fundamen-
  tal  questions  about the ability to identify toxicity pathways for
  large numbers of chemicals in a number of research programs
  in  the  United  States,  Europe, and Asia. The  throughput,
  molecular  specificity,  and  applicability of this  approach to
  human  cell systems are highly  consistent with the goals  and
  directions described in the NRC  report on the future of toxicity
  testing (National Research Council, 2007).
    Genomic signatures  predictive  of toxicological outcomes
  have been derived from in vivo studies, and the evaluation and
  application of these  signatures to hazard  identification  and
  risk assessment is an  area of active research. Perhaps most
  significantly, genomic signatures predicting tumor incidence in
  2-year rodent cancer bioassays  have the potential to provide
  shorter-term tests  as an alternative to the expensive two-year
  rodent  bioassay. The  ability  to predict chemically  induced
  increases in lung  tumor incidence  based on gene expression
  biomarkers has  been  demonstrated  in microarray  studies
  performed on mice exposed  for  90 days  to chemicals  that
  were previously tested  by the National Toxicology Program
  (Thomas et al., 2007). In an even shorter 5-day study design,
  liver gene expression data from rats treated with structurally
  and mechanistically diverse chemicals were used to derive
  a genomic signature that predicted nongenotoxic liver tumor-
  igenicity in the 2-year bioassay  (Fielden et al.,  2007). In both
  of  these studies,  sensitivity and  specificity of the genomic
  signatures were high, and the signatures provided accurate
  predictions and identified plausible modes of action. Both the
  Thomas et al. and the Fielden et al. data  sets are being utilized
  in the Microarray Quality Control assessment of best practices
  in developing and validating predictive genomic signatures (http://
  www.fda.gov/nctr/science/centers/toxicoinformatics/maqc/).
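
Sensitivity and specificity, the two performance measures quoted for these signatures, are simple to compute once predictions are paired with bioassay outcomes. The sketch below uses ten hypothetical chemicals; the calls are invented for illustration and do not reproduce results from the studies cited above.

```python
def sensitivity_specificity(predicted, observed):
    """Return (sensitivity, specificity) for paired boolean calls.

    sensitivity = true positives / all observed positives
    specificity = true negatives / all observed negatives
    """
    pairs = list(zip(predicted, observed))
    tp = sum(p and o for p, o in pairs)
    fn = sum((not p) and o for p, o in pairs)
    tn = sum((not p) and (not o) for p, o in pairs)
    fp = sum(p and (not o) for p, o in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical outcomes: True = tumorigenic in the long-term bioassay.
observed = [True, True, True, True,
            False, False, False, False, False, False]
# Hypothetical signature calls from the short-term study.
predicted = [True, True, True, False,
             False, False, False, False, True, False]

sensitivity, specificity = sensitivity_specificity(predicted, observed)
```

For screening environmental chemicals, where missing a hazardous chemical is the costlier error, sensitivity is usually the measure to protect first.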
    Success in  developing  predictive genomic signatures from
  in vitro studies has been more modest, to date, than what has
  been accomplished using in vivo data. Gene expression profiles
  for  more  than  100 reference compounds in  isolated rat
  hepatocytes have  been used  to derive  predictive signatures
  identifying potential mitochondrial damage, phospholipidosis,
  microvesicular steatosis,  and  peroxisome  proliferation, with
  a high degree of sensitivity and specificity (Yang et al., 2006).
  A large  European Union program project entitled carcinoGE-
  NOMICS  (http://www.carcinogenomics.eu/) was  initiated in
  2006 to develop genomics-based in vitro screens predictive of
  genotoxicity  and  carcinogenicity  in the liver,  kidneys,  and
  lungs.  In  vitro  toxicogenomics is also part of U.S. EPA's
  ToxCast research program, which is being designed to forecast

-------
toxicity based on genomic and HTS bioactivity profiles (Dix
et al., 2007; http://www.epa.gov/comptox/toxcast/). The initial
goal of these in vitro toxicogenomic efforts is hazard prediction
and chemical prioritization for subsequent in vivo testing, but
the ultimate goal goes beyond refinement to actually replacing
in vivo testing. This  will require a sustained, systematic, and
substantial  effort  on  the part of government,  academic,
industry, and nongovernmental organization partners.

Signaling as a Determinant for Systems Behavior

  Understanding  processes at the  molecular, cellular,  and
tissue levels is an ongoing challenge in toxicology. Central to
this hierarchy of biological complexity is the field of signal
transduction, which  deals with the  biochemical mechanisms
and pathways by  which  cells respond  to  external stimuli.
Computational systems approaches are critical for mechanistic
modeling of environmental chemicals to predict adverse out-
comes in humans at low doses.
  For decades, computational  modeling has  complemented
laboratory-based biology with in silico experiments to generate
and test mechanistic hypotheses. Computational approaches
have  been used to model biological networks as  dynamical
systems in which the quantitative variation of molecular entities
is elucidated by the solution of differential equations
(Aldridge et al., 2006). Such models  of signaling networks
have been used to predict the dynamic response at molecular
(Behar et al., 2007), cellular (Sasagawa et al., 2005), and tissue
levels  (Schneider and Haugh, 2006). Postgenomic, large-scale
biological assays present new challenges and opportunities for
modeling signaling networks. Though large-scale data provide
a global view of  a  biological  system,  they  remain difficult
to utilize directly  in traditional dynamic models.  This has
stimulated  research  on alternative  formalisms for  modeling
pathways (Faure et al., 2006). In addition, concurrent measure-
ments on thousands of proteins, genes, and  metabolites in
response to stimuli,  or in different  disease states, enable the
"reverse-engineering" of biological  networks from data using
empirical methods  (D'haeseleer et al., 2000).
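
  The dynamical-systems formulation described above can be illustrated
with a minimal sketch. The two-species cascade, the rate constants, and
the function names below are hypothetical and are not taken from any of
the cited models; the example only shows how a signaling response
follows from the numerical solution of coupled differential equations.

```python
def simulate_cascade(stimulus, k_act=1.0, k_deact=0.5, k_syn=2.0, k_deg=1.0,
                     dt=0.001, t_end=20.0):
    """Integrate a toy two-step signaling cascade with forward Euler.

    r: fraction of active receptor (0..1), activated by the stimulus
    p: level of a downstream product synthesized in proportion to r
    All rate constants are illustrative assumptions.
    """
    r, p = 0.0, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # dr/dt = activation of inactive receptor - deactivation
        dr = k_act * stimulus * (1.0 - r) - k_deact * r
        # dp/dt = synthesis driven by active receptor - first-order loss
        dp = k_syn * r - k_deg * p
        r += dt * dr
        p += dt * dp
    return r, p


def steady_state(stimulus, k_act=1.0, k_deact=0.5, k_syn=2.0, k_deg=1.0):
    """Analytical steady state of the same system, used as a check."""
    r = k_act * stimulus / (k_act * stimulus + k_deact)
    return r, k_syn * r / k_deg
```

Calling simulate_cascade over a range of stimulus values traces a
saturating dose-response curve for the downstream product, which can be
compared against steady_state to confirm that the integration has
converged.
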
  Synthesizing disparate  information into coherent mechanis-
tic hypotheses is an important challenge for modeling toxicity
pathways. Knowledge-based approaches (Karp, 2001) provide
an avenue for efficiently managing the magnitude and com-
plexity of such information. Through such techniques, large-
scale biological interaction data can be algorithmically searched
to infer signaling pathways (Scott et al., 2006), to extrapolate
between species, or to identify mechanistic gaps. Some of these
gaps may be filled by literature mining (Krallinger et al., 2005)
and others  will require   additional  experiments.  Moreover,
intelligent computational techniques  will aid in designing such
experiments by using biological knowledge to infer testable
hypotheses about novel mechanisms (Nguyen and Ho, 2006).
  Computational  predictive  modeling of  cellular  signaling
systems will aid risk  assessment in two important ways. First,
      knowledge-based and data-driven approaches will aid in orga-
      nizing and refining biological insight on perturbations leading
      to adverse  outcomes. Second,  dynamic simulation  of these
      mechanisms will help in predicting dose-dependent response.
      This will reduce the scope of  animal  testing  and the time
      required  for understanding the  risk of toxic effects due to
      environmental chemicals.


      Systems Biology Models of the HPG Axis

        Over the past decade, there has been a focused international
      effort to identify possible adverse  effects  of endocrine  dis-
      rupting chemicals (EDCs) on humans and wildlife. Scientists
      have identified alterations in the concentration  dynamics of
      specific hormones as risk factors for common cancers such as
      breast  cancer  (estrogen, progesterone), endometrial  cancer
      (estrogen),  and  prostate cancer (estrogen, testosterone) in
      humans (Portier, 2002).  Chemicals capable of acting as EDCs
      include pesticides, pharmaceuticals, and industrial  chemicals.
      Ecological  exposures to EDCs are primarily from industrial
      and waste water treatment effluents,  whereas human exposures
      are  mainly  through the  food  chain. There is convincing  evi-
      dence that fish are being  affected by EDCs  both  at the
      individual and population levels.
        As many  of the adverse  effects have  been  related to
      alterations  in  the  function  of the hypothalamus-pituitary-
      gonadal (HPG) axis, the development of computational systems
      biology models that describe the biological perturbations at the
      biochemical level  and  integrate information toward  higher
      levels of biological organization will be useful in predicting
      dose-response behaviors at the whole organism and population
      levels. For example, a mechanistic computational model of the
      intraovarian metabolic network has  been developed to predict
      the  synthesis and secretion of testosterone  and estradiol and
      their responses to the EDC fadrozole (Breen et al., 2007).
      Physiologically based pharmacokinetic  (PBPK)  models cou-
      pled with pharmacodynamic models that include  the regulatory
      feedback of the HPG  axis also can  be used  to predict the
      biological response to EDCs  in  whole organisms (Plowchalk
      and Teeguarden, 2002;  Watanabe  et al., 2006). In addition,
      these computational models can be developed for fish and other
      wildlife. They can be used to identify biomarkers of exposure
      to EDCs that are indicative of the ecologically relevant  effects
      at the individual and population levels in support of predictive
      environmental risk assessments  (Rose et al., 2003).
        Because  the mechanism of  action of EDCs is generally
      understood, there has been  a considerable emphasis on the
      development of screening tools for use in hazard identification,
      and the involvement of feedback  loops in physiological re-
      gulation of hormone function has provided a foundation upon
      which to build computational models of the relevant biology.
      Hence, EDCs represent a prime  example of  how toxicity
      pathway  elucidation and characterization can be  applied to
      hazard  and risk assessment  as  envisioned by  the  National
Research Council (2007). Additional research is still needed
in this area to expand the use of cell-based screening assays,
especially those that incorporate human cells or receptors,
and to apply the computational models of response.


   DOSE-RESPONSE AND EXTRAPOLATION MODELS

  Dose-response is the combination of the relationship between
exposure and a relevant measure of internal dose (pharmacokinetics)
and the relationship between internal dose and the
toxic effect (pharmacodynamics). Dose-response models are intended
to reliably predict the consequences of exposure at other dose levels
and life stages, in other species, or in susceptible individuals.

Dose-Response and Uncertainty

  Risk analysis  for environmental exposures involves expo-
sure assessment  (factoring  in various routes such as drinking
water, food,  air, and skin exposure) and the effects of those
exposures on individuals (dose-response assessment). In
modern exposure assessments, exposure may  well be charac-
terized by  a distribution of possible exposure levels over a
population, with confidence intervals on the quantiles of that
distribution (e.g., specifying the 99th percentile of the exposure
distribution  and its  95  percent confidence  bounds),  and
a sophisticated analysis of the components of variability and
uncertainty (e.g., Cullen and Frey, 1999;  U.S. EPA,  1997). In
contrast,  standard approaches to dose-response analysis treat
the uncertainties surrounding dose-response metrics  simplisti-
cally, using standard factors to extrapolate across species and
to quantify  variability among exposed people. Probabilistic
dose-response assessment  methods  allow a  more  complete
characterization of uncertainty and variability in dose-response
analysis (Evans et al., 2001; Hattis et al., 2002; Slob and
Pieters, 1998), and are naturally  compatible with probabilistic
exposure assessments (van der Voet and Slob, 2007). Dose-
response analysis is divided into  the analysis of the delivery of
toxic substances to target tissues (pharmacokinetics), and the
action of toxic substances at their targets (pharmacodynamics).
  Much  progress has been made in understanding pharmaco-
kinetics and in building models  (PBPK models) that quantify
that understanding. Such models may be used to  quantify the
relationship between potency in  animals  and humans, human
variability for internal dose, and the overall uncertainty of such
predictions  (Barton  et  al.,  2007). Hierarchical   Bayesian
techniques are  useful for  characterizing the uncertainty  of
model outputs (Hack et al., 2006). Monte-Carlo methods allow
uncertainty and variability in model parameters to be translated
into distributions of internal doses in a human population with
attendant uncertainty (Allen et al.,  1996; Clewell et al., 1999).
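
  A minimal sketch of the Monte Carlo idea follows. It is far simpler
than a PBPK model: it assumes a one-compartment model at steady state
(concentration = dose rate / clearance) and a lognormal distribution of
clearance across the population. The parameter values and function
names are hypothetical, and only inter-individual variability is
sampled; characterizing uncertainty as well would require an additional
outer sampling loop over the distribution parameters.

```python
import math
import random
import statistics


def internal_dose_distribution(dose_rate, n=10000, seed=0,
                               cl_median=10.0, cl_gsd=1.5):
    """Sample steady-state internal concentrations across a population.

    dose_rate : external dose rate (mg/h), treated as fixed
    cl_median : hypothetical median clearance (L/h)
    cl_gsd    : hypothetical geometric standard deviation of clearance
    """
    rng = random.Random(seed)
    mu, sigma = math.log(cl_median), math.log(cl_gsd)
    # Steady-state concentration of a one-compartment model: C_ss = rate / CL
    return [dose_rate / rng.lognormvariate(mu, sigma) for _ in range(n)]


def percentile(values, q):
    """Nearest-rank percentile (q in 0..100) of a list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100.0 * len(ordered)))
    return ordered[rank - 1]


concentrations = internal_dose_distribution(dose_rate=5.0)
median_css = statistics.median(concentrations)   # typical individual
p95_css = percentile(concentrations, 95)         # highly exposed tail
```

The ratio of p95_css to median_css summarizes how much higher the
internal dose is in the upper tail of the simulated population than in a
typical individual, given the assumed spread in clearance.
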
  Ideally,  pharmacodynamic relationships  also  would   be
modeled  based on  mechanistic  understanding (Setzer et  al.,
2001). In practice, however,  dose-response  evaluations  are
  based  on  empirical  dose-response  modeling  of  animal
  toxicology data. Typically,  many empirical curves may fit a
  given dataset, reflecting real uncertainty about the "true" dose-
  response  relationship. Wheeler and Bailer (2007) have de-
  veloped a method using model averaging that approximates the
  uncertainty  in  our understanding of a given  dose-response
  relationship.
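
  The general idea of model averaging can be sketched as follows. This
is not the procedure of Wheeler and Bailer (2007); it is a generic
illustration that uses Akaike weights, and the three candidate curves,
their coefficients, and their AIC values are hypothetical stand-ins for
models that would in practice be fitted to bioassay data.

```python
import math


def akaike_weights(aic_values):
    """Convert AIC values into normalized model weights."""
    best = min(aic_values)
    raw = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(raw)
    return [r / total for r in raw]


def model_averaged_response(dose, models, weights):
    """Weighted average of the responses predicted by each fitted model."""
    return sum(w * f(dose) for f, w in zip(models, weights))


# Hypothetical fitted curves: probability of response as a function of dose.
def logistic(dose):
    return 1.0 / (1.0 + math.exp(-(-3.0 + 0.08 * dose)))


def one_hit(dose):
    return 1.0 - math.exp(-0.012 * dose)


def weibull(dose):
    return 1.0 - math.exp(-0.002 * dose ** 1.4)


models = [logistic, one_hit, weibull]
aics = [101.2, 103.0, 100.5]          # illustrative values only
weights = akaike_weights(aics)
averaged_at_25 = model_averaged_response(25.0, models, weights)
```

Because the averaged prediction is a convex combination of the
individual curves, it always lies between the lowest and highest
single-model prediction at a given dose, and the spread among the
candidate models gives a rough indication of model uncertainty.
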
     Uncertainty  in a risk  assessment may be reduced  by the
  collection  of  further  information,  and  sensitivity analysis
  (Saltelli et al.,  2000) can help  to quantify the contribution of
  individual sources of uncertainty and their interactions to that
  of the overall risk analysis. Frey and Patil (2002) and Mokhtari
  et al. (2006) have compared the utility of different sensitivity
  analysis methods in a probabilistic risk assessment. Mokhtari
  and Frey  (2005) have recommended how sensitivity analysis
  can be used and applied to aid  in addressing risk management
  and research planning questions. These  approaches  provide
  considerable information to the  risk  manager for  making
  decisions  about the  exposure levels needed  to  protect target
  populations.
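
  One simple sampling-based form of sensitivity analysis is sketched
below. It is an illustration only and is not drawn from the cited
comparisons: the multiplicative risk model, the input distributions,
and the function names are assumptions. Each input is sampled, the
output is computed, and the correlation between each (log-transformed)
input and the output indicates how strongly that input drives the
overall spread.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_sensitivity(n=5000, seed=1):
    """Correlate each sampled input with the output of a toy risk model.

    Hypothetical model: risk = intake * potency / body_weight, with
    lognormal inputs whose spreads are chosen for illustration.
    """
    rng = random.Random(seed)
    intake = [rng.lognormvariate(0.0, 0.8) for _ in range(n)]
    potency = [rng.lognormvariate(0.0, 0.3) for _ in range(n)]
    body_weight = [rng.lognormvariate(math.log(70.0), 0.2) for _ in range(n)]
    log_risk = [math.log(i * p / bw)
                for i, p, bw in zip(intake, potency, body_weight)]
    return {
        "intake": pearson([math.log(v) for v in intake], log_risk),
        "potency": pearson([math.log(v) for v in potency], log_risk),
        "body_weight": pearson([math.log(v) for v in body_weight], log_risk),
    }


sensitivities = correlation_sensitivity()
```

In this toy setting the input with the widest distribution (intake)
shows the strongest correlation with risk, which is the kind of ranking
that can point to where further data collection would reduce uncertainty
most.
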


  Genetic Variation, Gene-Environment Interactions, and
     Environmental Risk Assessment

     Understanding relationships  between environmental expo-
  sures and complex disease requires  consideration  of multiple
  factors, both extrinsic (e.g., chemical exposure) and intrinsic
  (e.g., genetic variation). This information must be integrated to
  evaluate gene-environment  interactions to identify vulnerable
  populations  and  characterize  life-stage  risks.  Although the
  association between genetic and environmental factors  in de-
  velopment of disease has long been recognized, tools for large-
  scale characterization  of  human genetic  variation have only
  recently become  available (The International HapMap  Con-
  sortium, 2005).
     It is well known that different species, and individuals within
  species, react differently to identical  exposures to pharmaceut-
  icals or environmental chemicals. This is, in part, driven by
  genetic variation in multiple pathways affecting multiple processes
  such as absorption, metabolism, and signaling. Recent
  advances  in our  understanding  of the  pattern  of  human
  molecular genetic variation have opened the door  to genome-
  wide genetic variation studies  (Gibbs and  Singleton,  2006).
  Pharmacogenetics  is  a   well-developed  field  studying the
  interaction between  human  genetic  variation and  differential
  response to pharmaceutical  compounds (Wilke, 2007). Many
  of the insights developed in these studies have direct relevance
  to environmental chemicals. Pharmacogenetic studies increas-
  ingly analyze both pharmacokinetics and pharmacodynamics
  pathways. Emphasis  is shifting from  a  focus on individual
  markers, such as  single-nucleotide polymorphisms (SNPs), to
  multi-SNP and multigene haplotypes.
     Gene-drug interaction studies have provided many insights
  for  understanding  the   effects  of  chemical  exposure  in
genetically heterogeneous  populations. For  example, inves-
tigators in the NIH Pharmacogenetics Research Network are
examining multiple approaches to correlate drug response with
genetic variation. Data from this program are stored and
annotated in a publicly accessible knowledge base (Giacomini
et al., 2007). Lessons learned from these and related studies are
being incorporated into drug development and governmental
regulation, and  are models for  approaches  to  identify vul-
nerable populations in the context of environmental exposure.
  Although genetic  variation plays  a major role in gene-
environment interactions, recent work has  shown that epige-
netic  effects also are important. This complicates the picture
because the effects of exposure can lead to multigenerational
effects even in the absence of genetic mutations. Epidemio-
logical evidence increasingly suggests that environmental ex-
posures early in development have a role in susceptibility to
disease in later life, and that some  of these effects are passed on
through a second generation. Epigenetic modifications provide
a plausible link between the  environment and  alterations in
gene  expression that might lead to  disease  phenotypes. For
example,  a potential mechanism  underpinning early life pro-
gramming is that of exposure to excess stress steroid hormones
(glucocorticoids) in early life.  It has recently been shown that
the programming effects of glucocorticoids can be transmitted
to a second generation.  This information provides a basis for
understanding the inherited association between  low birth
weight and cardiovascular disease risk later in life (Drake et al.,
2005).
  It  is becoming increasingly  clear  that  specific  genetic
variants modulate individual vulnerability  to many diseases.
A major challenge for future toxicogenomics research is to link
exposure, internal dose, genetic variation, disease, and gene-
chemical interactions (Schwartz and Collins, 2007). This effort
should  yield improved dosimetry  models  that  will  reduce
uncertainties associated  with the  assumption that populations
are homogeneous in their  response  to toxic chemicals. Ex-
posure information on par with available toxicogenomic infor-
mation  will  improve  our  ability   to  identify  vulnerable
populations, classify exposure in studies of complex  disease,
and elucidate important  gene-environment interactions.
  The study of genetic variation intersects with several issues
discussed in the NRC report. At one end, genetic variation
provides a handle for  investigating  mechanism of action of
chemicals  and  for elucidating  toxicity   pathways.  Gene
knockout strains in many species provide a standard tool for
delineating pathways (Wijnhoven et al., 2007), but less severe
changes in the form of genetic polymorphisms are also useful
and potentially more relevant to understanding human
health effects. By testing a chemical in a panel of animals
with polymorphic, but well-characterized genetic backgrounds
(Roberts et al., 2007), one can generate valuable information
on  what  pathways  are  being modulated  by the chemical
(Ginsburg, 2005). At the  other  end  of the spectrum,  it is
possible in some  cases to understand in detail how genetic
      differences alter dose-response relationships, and from there to
      develop specific risk assessment recommendations which take
      into  account genetic variation  in  human populations.  The
      primary examples of this approach to risk assessment involve
      chemical metabolism (Dorne, 2007), which is also the
      best studied area in the field of pharmacogenetics. In summary,
      there is an ever growing body of knowledge about the effects
      and uses of genetic variation  in many species,  and the field of
      predictive computational toxicology will be  able to  increas-
      ingly benefit from these advances.


      Computational  Tools for Ecological Risk Assessment

        Ecological systems pose some unique challenges for quan-
      titative risk assessment. Human health risk assessment requires
      extrapolation from effects in well-characterized animal models
      to well-studied human  biology, with the  aim of protecting
      individuals. In contrast, ecological risk assessment requires
      extrapolation among  widely  divergent taxonomic groups  of
      relatively understudied organisms, with the intent of protecting
      populations and critical  functional processes within ecological
      communities.
        Modern computational capabilities and tools for conducting
      high-content biological analyses  (e.g., transcriptomics, proteo-
      mics, and  metabolomics)  have  the  potential  to  significantly
      enhance our ability to predict or evaluate ecological risks. For
      example, high-content assays that provide multivariate results
      can  be used to quantitatively  classify individual organisms
      (sentinels)  or  communities  of organisms (e.g.,  microbial
      communities) as within or deviated  from a normal operating
      range (Kersting, 1984; van Straalen and Roelofs, 2006). As a
      key  advantage, these general profiling and multivariate con-
      cepts can be applied to  species that  lack a well-characterized
      genome (van Straalen and Roelofs,  2006). Beyond profiling
      approaches, high-content biological analyses provide powerful
      tools for  examining  system-wide  responses  to stressors.
      Through iterations of system-oriented hypothesis  generation,
      testing, and gradual refinement of biologically based models, it
      should be feasible to establish a credible scientific foundation
      for  predicting  adverse  effects  based  on  chemical mode  of
      action  and/or extrapolating effects  among  species with well
      conserved  biological pathways (Villeneuve  et al.,  2007).
      However,  even with  the ability  to conduct  high-content
      analyses, high  quality data sets for  parameterizing computa-
      tional models,  particularly those that bridge  from effects on
      individual  model animals to predicted effects  on  wildlife
      populations,  are  likely to  remain  rare (e.g.,  Bennett and
      Etterson, 2007). Consequently, strategies for making the best
      possible  use of laboratory  toxicity  data  to  forecast/project
      population-level risks  will  remain  critical  (Bennett  and
      Etterson,  2007). Additionally,  alternative  computational  ap-
      proaches will have  an  important role to play. For example,
      computational methods that examine overall network topology
      may be used as  a  way to  deduce  system function, control
properties, and robustness of biological networks to stressors.
Such approaches  can be applied at many scales of biological
organization, from gene regulatory networks within a single
cell to trophic interactions and  food webs  at the ecosystem
level (Proulx et al., 2005). Similarly, there is an increasingly
important role for  models, simulation,  and landscape level
spatial forecasting related to the overlapping impacts of mul-
tiple stressors (e.g.,  chemicals,  climate change, habitat loss,
exotic  species). There are many examples of creative uses of
geographic information systems and remote sensing technol-
ogies  for this  purpose (e.g., Haltuch et al., 2000; Kehler  and
Rahel, 1996; Kooistra et al., 2001; Leuven and Poudevigne,
2002; McCormick, 1999; Tong, 2001). Thus, although the
challenge of  ecological  risk  assessment  and balancing
environmental  protection  against  the demands  of human
commerce and activities remains  daunting, ecotoxicologists,
"stress  ecologists"   (van Straalen, 2003),  and  risk assess-
ment professionals have increasingly powerful tools  at their
disposal.


Virtual Tissues—The Next Big Step for Computational
  Biology

  To  date, biologically motivated  computational  modeling in
toxicology has consisted largely of dosimetry models (PBPK
and respiratory tract airway models)  and, to a lesser extent,
biologically  based dose-response models that combine dosim-
etry with descriptions of one or more modes of action (Clewell
et al., 2005; Conolly et al., 2004). PBPK models are usually
highly lumped and  contain  little  spatial information. Early
models of the lung  were one-dimensional,  though more re-
cently, three-dimensional descriptions  of both  the nasal  and
pulmonary airways have been developed (Kimbell et al., 2001;
Timchalk et  al., 2001). Thus, for  the  most  part,  current
biologically   motivated  modeling in  toxicology  involves
significant abstraction of biological structure.
  Ongoing  developments  in high-throughput  technologies,
systems biology, and  computer hardware and  software  are
creating  the  opportunity for  "multiscale"  modeling  of bi-
ological systems  (Hunter et al., 2006; Kitano, 2002). These
models incorporate structural and functional  information at
multiple  scales  of  biological  organization.  For  example,
Bottino et al. (2006) studied cardiac  effects of  drugs using
a hierarchical set of models  extending from ion channels to
cells to the tissue level. They showed how such models can be
developed for multiple species and how in silico experiments
can be conducted where drugs are  used to perturb the cardiac
system.  An  additional  important aspect of  this  kind  of
modeling is  that one  can superimpose  certain risk  factors,
such as hypokalemia and ischemia, in order to make clinical
predictions prior to  the  actual use of the  drug in the clinic. A
conceptually similar approach is  being taken in the HepatoSys
project (HepatoSys, 2007), where a suite of models describing
various aspects  of  the  functional  biology of hepatocytes is
  under development. The overall aim of the HepatoSys project
  is to  arrive at a holistic understanding  of hepatocyte biology
  and to be able to present and make these processes accessible
  in silico.
     A "virtual liver" is being developed at U.S. EPA's National
  Center for Computational Toxicology. The overall goal of this
  project is to develop a multiscale, computational model of the
  liver that incorporates anatomical and biochemical information
  relevant to toxicological mechanisms and responses. As model
  development progresses, integration of within-cell descriptions
  and cell-to-cell communication will evolve into a computational
  description of the liver. The approach will be to first describe
  normal biological processes, such as  energy and oxygen  me-
  tabolism,  and  then  describe  how  perturbations of these
  processes  by chemicals lead  to  toxic  effects. In the longer
  run, the  project also  will provide an opportunity to  develop
  descriptions of diseases, such as diabetes, and to examine how
  such diseases influence susceptibility to environmental stressors.
     Virtual tissues are being developed not only in the context of
  computational toxicology, but also in clinical and translational
  research. Thus, there  is an increasing emphasis on systematic
  integration  of scientific data,  visualization, and transparent
  computing  that  creates easily  accessible and customizable
  workflows for users. This  integration  of basic research  and
  clinical data has created the demand for more streamlined tools
  and necessary  resources for on demand  investigation  and
  modeling  of pressing  biological problems, and subsequent
  validation  of in silico predictions through further  clinical  and
  environmental observations. In response to this need, the
  National Biomedical Computation Resource (NBCR;  http://
  nbcr.sdsc.edu/) and  their collaborators  are  developing tools
  such  as  Continuity, which describes  molecular interactions,
  diffusion, and electrostatics in the human heart. Continuity is
  capable  of  transparently   accessing  remote  computational
  resources from an end user's desktop environment. Development
  of middleware at the NBCR, such as the Opal toolkit,
  makes such transparent access possible.
     The potential payoffs from development of virtual tissues in
  toxicology are significant. Virtual tissues will build on current
  successes with PBPK modeling and take the development of
  quantitative descriptions of biological mechanisms to a new
  level of complexity.  Virtual tissues  will have much  greater
  capabilities than  PBPK models  for  providing insights  into
  dose-response and time course behaviors,  and will promote
  inclusion of larger amounts of integrated biological data  into
  risk assessment.
     With adequate development, virtual tissues will also provide
  capabilities necessary for a full implementation of the
  National Research Council (2007) report.
  Development of in vitro  assays of toxicity  pathways  will
  require validation studies that can at present only be conducted
  in vivo. In the future, sufficiently mature virtual tissues will
  provide  an in silico alternative for  at  least some aspects of
  in  vivo  testing.  The  continuing  and probably increasing
pressure  to reduce  animal use for  toxicity testing  will only
encourage this trend.
  Finally, it  must be noted that success  in  development of
virtual tissues  will depend  not  only on  coordination of
computational modeling with targeted data collection but also,
perhaps even more importantly, on the  appropriate training of
a  new  generation  of  computational  toxicologists.  These
individuals  will  have  expertise  in  computational  tools,
mathematics,  and biology,  and will be able  to move seamlessly
between  the laboratory and the computer.  It is likely that this
vision applies not  only  to development of virtual tissues but
also, more broadly, to research and development in toxicology
and risk  assessment.


             SUMMARY AND CONCLUSION

  The field of toxicology is rapidly approaching what could be
a golden era. Spurred on by far-reaching advances in biology,
chemistry, and computer sciences, the tools needed to open the
veritable black boxes that have prevented significant achievements
in predictive power are becoming available. We have
highlighted many of the topic areas that have demonstrated
advances in the state of the science, and  from which  more
advances are expected in the near future. Although the new
paradigm suggested by the NRC in its Toxicity Testing in the
Twenty First Century: A Vision and a Strategy (National
Research Council, 2007) departs somewhat from the traditional
risk assessment approach espoused by the National Research
Council (1983), the two approaches can be mapped together,
and the tools of computational toxicology can provide  outputs
that will help close gaps in many of the areas (Table 2). Some
aspects of computational toxicology discussed here, such as the
use of fate and transport models, the development of  curated
and widely accessible databases, physiologically based
pharmacokinetic models, and the characterization of uncertainty in models, are
already  being used  in evaluating  chemical  risks, although
continued development is necessary to address emerging issues
such  as nanomaterials.  Other  aspects,  such  as  HTS  and
toxicogenomics  are  witnessing  extensive  development and
application efforts in toxicology but have yet to become part of
mainstream data generation.  Still others, like the assessment of
gene-environment interactions and  development  of virtual
tissues are really only beginning to be tested for applicability,
although these areas offer significant potential for  improved
understanding of susceptibility and for extrapolating responses
across life stages, genders, and species.
  Much of the high-throughput and  genomics  technology
beginning to be  applied to toxicology was developed by the
pharmaceutical industry for use in  drug discovery. Environ-
mental chemicals differ from drug candidates in a number of
important ways.  For  example,  drugs  are developed  with
discrete targets in mind, conform to physicochemical properties
that assist in absorption, distribution, metabolism, and
excretion, have well understood metabolic profiles, and have
use  patterns  that  are known  and  quantified.  In  contrast,
environmental chemicals  generally  are  not  designed  with
biological activity as a goal, cover extremely diverse chemical
space,  have  poorly  understood  kinetic  profiles,  and  are
generally evaluated at exposure levels well in excess of likely
real  world situations. The challenge to successfully  employ
these screening technologies for broader goals in  toxicology
will  be considerable, given that they have  yet to yield the
significant increase  in the pace of drug discovery that was
expected. On the  other  hand, whereas  the goal  of drug
discovery is to find the "needle in the haystack" using targeted
screening tools, the goal of predictive toxicology is to use these
tools more broadly to discern patterns of activity with regard to
chemical impacts on biological systems, and hence may be
more achievable. It  will take a concerted effort on the part of
government,   academia,  and industry to  achieve  the trans-
formation of "Toxicity Testing in the 21st Century" that is so
eagerly awaited. Success  will depend on building  a robust
chemo-informatics  infrastructure  to  support the  field,   on
conducting large-scale proof-of-concept studies that integrate
diverse data sources and types into more complete understand-
ing of biological activity,  on developing a cadre of scientists
comfortable  with  both molecular  tools  and mathematical
modeling languages, and  on convincing risk managers  in
regulatory agencies  that the uncertainties inherent  in the new
approaches are sufficiently smaller or better characterized than
in traditional  approaches.  The rewards  from such a  success
would be significant. More chemicals will  be evaluated  by
more powerful and broad-based tools, animals will be used
more efficiently and effectively  in the bioassays designed to
answer specific questions rather than to fill in a checklist, and
the effects of mixtures of chemicals will be better  understood
by employing system-level approaches that encompass  the
underlying biological pathways whose interactions determine
the responses  of the individual and joint effect of components
of mixtures.   Clearly  this  will not happen soon,  or  without
significant investment. The National Research Council (2007)
estimates a 10- to 20-year effort at  about $100 million per year
will  be required for the paradigm shift they envisioned. This is
probably several-fold more than is being invested currently in
the  area  and, in most cases,  those funds  have not been
specifically guided by an overarching  strategic vision such as
put  forth by  the NRC. Nonetheless, there  are  pockets  of
progress occurring and the first success will  likely be seen in
the ability to detect  and quantify the interactions of chemicals
with key identifiable biological targets (e.g., nuclear receptors,
transporters, kinases, ion channels) and to be able to map these
potentials to toxicity pathways and phenotypic outcomes using
computational tools.  Later successes will be seen in modeling
responses that require ever  greater understanding  of system-
level functioning that  will  ultimately  take us  to  the un-
derstanding of susceptibility factors (be they for the individual,
life-stage, gender or species).  All of  these  new methods,
   capabilities, and advances offer great promise for the predictive
   discipline of toxicology.


                           FUNDING

     The  Office  of Research and  Development  of the United
   States Environmental Protection Agency.


                     ACKNOWLEDGMENTS

     The  authors wish to  recognize  the contributions to the
   International Science Forum on Computational Toxicology of
   the  session co-chairs (Steve Bryant,  Richard Corley,  Sean
   Ekins, Tim Elston, Wout Slob, Rusty Thomas, Donald Tillit,
   Raymond  Tice, and Karen Watanabe), and presenters  (Ellen
   Berg, Robert Boethling, Steve Bryant, Lionel Carreira, Fanqing
   Frank Chen,  Harvey  Clewell,  Richard  Corley,  Christopher
   Cramer, Amanda  Drake, Sean Ekins, Tim Elston,  Matthew
   Etterson, H. Mark  Fielden, Christopher Frey, Anna Georgieva,
   Thomas Hartung,  Jason  Haugh,  Kate Johnson,  Jun Kanno,
   Shinya Kuroda, Wilfred Li, Markus Lill, Bette Meek, Ovanes
   Mekenyan, John  Petterson,  Steve  Proulx,  Matt  Redinbo,
   Matthias Reuss, Kenneth Rose, Phil Sayre, Wout Slob, Roland
   Somogyi,  Clay Stephens, Justin Teeguarden,  Rusty Thomas,
   Raymond  Tice, Sandor  Vajda,  Nico van  Straalen, Chihae
   Yang, Jeff Waring, Karen Watanabe,  Richard Weinshilboum,
   John Weinstein, and Matt Wheeler) all of whom were instru-
   mental in bringing the state of the science of toxicology to the
   International Science Forum on Computational Toxicology.
                         REFERENCES

  Aldridge, B. B., Burke, J. M., Lauffenburger, D. A., and Sorger, P. K. (2006).
    Physicochemical modeling of cell signaling pathways. Nat. Cell Biol. 8,
    1195-1203; Available at: http://dx.doi.org/10.1038/ncb1497 (accessed June
    27, 2007).
  Allen, B. C., Covington, T. R., and Clewell, H. J. (1996). Investigation of the
    impact of pharmacokinetic variability and uncertainty on risks predicted with
    a pharmacokinetic model for chloroform. Toxicology 111, 289-303.
  Arnold, W., Winget, P., and Cramer, C. J. (2002). Reductive dechlorination of
    1,1,2,2-tetrachloroethane. Environ. Sci. Technol. 36, 3536.
  Barton, H.  A., Chiu, W. A., Setzer, R. W., Andersen, M. E., Bailer, A. J.,
    Bois, F. Y., Dewoskin, R. S., Hays, S., Johanson, G., Jones, N., et al. (2007).
    Characterizing uncertainty and variability in physiologically-based pharma-
    cokinetic (PBPK) models: State of the science and needs for research and
    implementation. Toxicol. Sci.  4.  [Epub ahead of print].
  Behar, M., Hao, N., Dohlman, H. G., and Elston, T. C. (2007). Mathematical
    and computational analysis of adaptation via feedback inhibition in signal
    transduction pathways. Biophys. J. 93, 806-821.
  Bennett, R.  S., and Etterson, M.  A. (2007). Incorporating results of avian
    toxicity tests into a model of annual reproductive success. Integr. Environ.
    Assess. Monitor. 3(4), 498-507.
  Berg, E.  L.,  Kunkel,  E. J., Hytopoulos, E.,  and Plavec,  I.  (2006).
    Characterization  of compound  mechanisms and secondary activities  by
    BioMAP analysis. J. Pharmacol. Toxicol. Methods. 53, 67-74.
Bottino,  D.,  Penland, R.  C.,  Stamps,  A.,  Traebert,  M.,  Dumotier,  B.,
  Georgieva, A., Helmlinger, G., and Lett, G. S. (2006). Preclinical cardiac
  safety assessment of pharmaceutical compounds using an integrated systems-
  based computer model of the heart. Prog. Biophys. Mol. Biol. 1-3, 414-443.
Breen, M. S., Villeneuve, D. L.,  Breen, M., Ankley, G. T., and Conolly, R. B.
  (2007). Mechanistic computational model  of ovarian steroidogenesis to
  predict biochemical responses  to endocrine active compounds. Ann. Biomed.
  Eng. 35, 970-981.
Clewell,  H.  J.,  Gearhart,  J.   M.,   Gentry,  P.  R.,  Covington,  T.  R.,
  VanLandingham, C. B., Crump, K.  S., and Shipp, A.  M. (1999). Evaluation
  of the uncertainty in an oral Reference Dose for methylmercury due to
  interindividual variability in pharmacokinetics. Risk Anal. 19, 547-558.
Clewell, H. J., Gentry, P. R., Kester, J. E., and Andersen, M. E. (2005).
  Evaluation of physiologically based pharmacokinetic models in risk assess-
  ment: An example with perchloroethylene. Crit. Rev. Toxicol. 35, 413-433.
Conolly, R. B., Kimbell, J. S., Janszen, D. J., Schlosser, P. M., Kalisak, D.,
  Preston, J., and Miller, F. J. (2004). Human  respiratory tract cancer risks of
  inhaled formaldehyde: Dose-response predictions derived from biologically-
  motivated computational modeling of a combined rodent and human dataset.
  Toxicol. Sci. 82, 279-296.
Cullen, A. C., and Frey, H. C. (1999). The Use of Probabilistic Techniques in
  Exposure Assessment: A  Handbook for Dealing  with  Variability and
  Uncertainty in Models and Inputs. Plenum, New York.
D'haeseleer, P., Liang, S., and Somogyi, S. (2000). Genetic network inference:
  From  co-expression  clustering  to reverse  engineering.  Bioinformatics
  (Oxford, England) 16, 707-726.
Ding, L., Stilwell, J., Zhang, T., Elboudwarej, O., Jiang, H., Selegue, J. P.,
  Cooke,  P.  A.,  Gray, J.  W., and  Chen, F.  F.  (2005).  Molecular
  characterization of the cytotoxic mechanism of multiwall carbon nanotubes
  and nano-onions on human skin fibroblast. Nano Lett. 5, 2448-2464.
Dix, D. J., Gallagher, K., Benson, W. H., Groskinsky, B. L., McClintock, J. T.,
  Dearfield, K. L., and Farland, W. H. (2006).  A framework for the  use of
  genomics data at the EPA. Nat. Biotechnol.  24, 1108-1111.
Dix, D. J., Houck, K. A., Martin, M. T., Richard, A. M., Setzer, R. W., and
  Kavlock, R. J. (2007). The ToxCast program for prioritizing toxicity testing
  of environmental chemicals. Toxicol. Sci. (Forum)  95, 5-12.
Dorne, J. L. (2007). Human variability in hepatic and renal elimination:
  Implications for risk assessment. J. Appl. Toxicol. 27, 411-420.
Drake, A.  J., Walker,  B.  R.,  and  Seckl, J. R. (2005). Intergenerational
  consequences of fetal programming by in utero exposure to glucocorticoids
  in rats. Am. J. Physiol. Regul. Integr. Comp. Physiol. 288(1), R34-R38.
Ekins,  S.,  Andreyev,  S., Ryabov,  A., Kirillov,  E.,  Rakhmatulin, E.  A.,
  Sorokina, S.,  Bugrim, A., and Nikolskaya, T. (2006).  A combined approach
  to drug  metabolism and toxicity  assessment. Drug Metab. Dispos.  34,
  495-503.
Evans, J. S., Rhomberg, L. R., Williams, P. L., Wilson, A. M., and  Baird, S. J.
  (2001). Reproductive and developmental risks from ethylene oxide: A
  probabilistic characterization of possible regulatory thresholds.  Risk Anal.
  21, 697-717.
Faure, A., Naldi, A., Chaouiya, C., and Thieffry, D. (2006). Dynamical analysis
  of a generic Boolean model for the control of the mammalian cell cycle.
  Bioinformatics (Oxford, England) 22(14), e124-e131.
Fielden,  M.  R., Brennan, R.,  and  Gollub, J.  (2007).  A gene  expression
  biomarker provides early prediction and mechanistic  assessment of hepatic
  tumor  induction by non-genotoxic chemicals. Toxicol. Sci. 99(1), 90-100.
Frey, H.  C.,  and Patil, S. R. (2002).  Identification and review of sensitivity
  analysis methods. Risk Anal. 22, 553-578.
Giacomini, K. M., Brett, C. M., Altman, R. B., Benowitz, N. L., Dolan, M. E.,
  Flockhart, D. A., Johnson, J. A., Hayes, D. F., Klein, T., Krauss, R. M., et al.
  (2007). The pharmacogenetics research network: From SNP discovery to
  clinical drug response. Clin. Pharmacol. Ther. 81, 328-345 (Review).
Gibbs, J. R., and Singleton, A. (2006). Application of genome-wide single
  nucleotide  polymorphism typing: Simple  association  and beyond. PLoS
  Genet. 2(10), e150.
Ginsburg, G. (2005). Identifying  novel genetic determinants of hemostatic
  balance. J. Thromb. Haemost. 3, 1561-1568.
Hack, C. E., Chiu, W. A., Jay Zhao, Q., and Clewell, H. J. (2006). Bayesian
  population  analysis of a harmonized physiologically based pharmacokinetic
  model of trichloroethylene and its metabolites. Regul. Toxicol. Pharmacol.
  46, 63-83.  [Epub 2006 Aug 4].
Haltuch, M.  A., Berkman, P.  A., and  Garton, D.  W.  (2000). Geographic
  information system (GIS) analysis of ecosystem invasion: Exotic mussels in
  Lake Erie. Limnol. Oceanogr. 45, 1778-1787.
Hartung, T. (2006). ECVAM's progress in implementing the 3Rs in Europe.
  ALTEX 23(Suppl.), 21-28.
Hattis, D.,  Baird,  S.,  and Goble, R. (2002). A  straw  man  proposal  for
  a quantitative definition of the RfD. Drug Chem. Toxicol. 25, 403-436.
Health  Canada.  (2007).  An integrated  framework for the health-related
  components of categorization of the  Domestic Substances List under CEPA
  1999. Available at: http://www.hc-sc.gc.ca/exsd.  Accessed December  24,
  2007.
HepatoSys. (2007).  HepatoSys competence network: Systems biology  of
  hepatocytes. Available  at: http://www.systembiologie.de/en/ (accessed July
  10, 2007).
Hilal, S. H.,  Karickhoff,  S. W., Carreira, L. A.,  and Shresth, B. P.  (2005).
  Hydration equilibrium  constants of  aldehydes,  ketones, and quinazolione.
  QSAR Comb. Sci. 24, 631.
Houck, K.  A.,  and  Kavlock, R.  J. (2007). Understanding mechanisms  of
  toxicity: Insights from  drug discovery research. Toxicol. Appl. Pharmacol.
  in press.
Hunter, P. J. and Borg, T. K. (2003).  Integration from proteins to organs: the
  Physiome Project. Nat. Rev. Mol. Cell Biol. 4(3),  237-243.
Inglese, J., Auld, D. S., Jadhav, A., Johnson, R.  L., Simeonov, A., Yasgar, A.,
  Zheng,  W.,  and  Austin,  C.  P.   (2006).  Quantitative  high-throughput
  screening: A titration-based  approach that efficiently  identifies biological
  activities in large  chemical libraries.  Proc.  Natl. Acad. Sci.  U.S.A. 103,
  11473-11478.
Jaworska, J.,  Dimitrov,   S., Nikolova,  N.,  and   Mekenyan,  O.  (2002).
  Probabilistic assessment of biodegradability based on metabolic pathways:
  CATABOL system. SAR QSAR  Environ. Res. 13, 307-323.
Jolivette, L. J., and  Ekins, S.  (2007). Methods for predicting human drug
  metabolism. Adv. Clin.  Chem. 43, 131-176.
Karp,  P. D.  (2001). Pathway  databases: A  case  study  in  computational
  symbolic theories.  Science 293,  2040-2044.
Kaya, T., Mohr, S. C., Waxman, D. J., and Vajda, S. (2006). Computational
  screening of phthalate monoesters for binding to PPARgamma. Chem. Res.
  Toxicol. 19, 999-1009.
Kehler, C. J.,  and Rahel, F. J. (1996). Thermal limits to salmonid distributions
  in the Rocky Mountain region  and potential  habitat loss due to  global
  warming: A geographic information system (GIS) approach. Trans.  Am.
  Fish. Soc. 125, 1-13.
Kersting, K. (1984). Normalized ecosystem strain: A system parameter for the
  analysis of toxic stress  in micro-ecosystems. Ecol. Bull. 36, 150-153.
Kimbell, J. S.,  Subramaniam,  R. P.,  Gross, E.  A., Schlosser, P. M.,  and
  Morgan,  K.  T.  (2001). Dosimetry  modeling  of  inhaled formaldehyde:
  Comparisons  of local flux predictions in the rat, monkey, and human nasal
  passages. Toxicol. Sci. 64, 100-110.
Kitano, H.  (2002).  Systems  biology: A  brief  overview.  Science 295,
  1662-1664.
Kooistra, L., Leuven,  R. S. E.  W., Nienhuis,  P. H., Wehrens,  R.,  and
  Buydens, L. M. C. (2001). A procedure for incorporating spatial variability
  in ecological risk assessment of Dutch river flood plains. Environ. Manage.
  28, 359-373.
Krallinger,  M.,  Erhardt,  R. A.,  and Valencia,  A.  (2005).  Text-mining
  approaches in molecular biology and biomedicine. Drug Discov. Today.
  10, 439-445.
Leuven, R.  S. E. W., and Poudevigne, I. (2002). Riverine landscape dynamics
  and ecological risk assessment. Freshw. Biol. 47, 845-856.
Lewis, A., Bumpus, J. A., Truhlar, D. G., and Cramer, C. J. (2004). Molecular
  modeling of environmentally important processes: Reduction potentials. J.
  Chem. Educ. 81, 596.
Lill, M. A., Dobler, M., and Vedani, A. (2006). Prediction of small-molecule
  binding  to  cytochrome P450 3A4: Flexible docking  combined with
  multidimensional QSAR. Chem. Med. Chem. 103(1), 14-27.
McCormick, C. M. (1999). Mapping exotic vegetation in the Everglades from
  large-scale  aerial  photographs.  Photogramm.  Eng.  Remote  Sens.   65,
  179-184.
Mokhtari, A., and Frey,  H. C. (2005).  Recommended practice  regarding
  selection of sensitivity analysis methods applied to microbial food safety
  process risk models. Hum. Ecol. Risk Assess. 11, 591-605.
Mokhtari, A., Frey, C., and Zheng, J. (2006).  Evaluation and recommendation
  of sensitivity analysis methods for application to stochastic human exposure
  and dose simulation models. J. Exposure Sci. Environ. Epidemiol. 16,
  491-506.
National Research Council. (1983). Risk Assessment in the Federal Govern-
  ment: Managing the Process. National Academies Press, Washington D. C.
National Research Council. (2007). Toxicity testing in the twenty-first century:
  A vision and strategy.  Available  at: http://dels.nas.edu/dels/reportDetail.
  php?link_id=4286&;session_id. Accessed December 24, 2007.
National Toxicology Program High-Throughput Screening  Program.  (2006).
  Available at: http://ntp.niehs.nih.gov/ntpweb/index.cfm?objectid=05F80E15-
  F1F6-975E-77DDEDBDF3B941CD. Accessed December 24, 2007.
Nguyen, T. P.,  and  Ho, T.  B. (2006).  Discovering signal transduction
  networks using  signaling domain-domain interactions.  Genome Inform.
  17(2), 35-45.
O'Brien, P. J., Irwin, W.,  Diaz, D.,  Howard-Cofield, E., Krejsa,  C.   M.,
  Slaughter, M. R., Gao, B., Kaludercic, N., Angeline, A., Bernardi, P., et al.
  (2006). High concordance of drug-induced human hepatotoxicity with in
  vitro cytotoxicity measured in a novel cell-based model using high content
  screening. Arch.  Toxicol. 80, 580-604.
Plowchalk,  D. R., and Teeguarden, J. (2002). Development of a physiologically
  based  pharmacokinetic  model  for  estradiol  in rats  and  humans:  A
  biologically motivated quantitative framework  for evaluating responses to
  estradiol  and other endocrine-active compounds. Toxicol.  Sci.  69, 60-78.
Portier, C. J. (2002). Endocrine dismodulation and cancer. Neuro Endocrinol.
  Lett. 23(Suppl. 2), 43-47.
Proulx, S. R., Promislow, D. E. L., and Phillips, P. C. (2005). Network thinking
  in ecology and evolution. Trends Ecol. Evol. 20, 345-353.
Roberts,  A., Pardo-Manuel  de Villena, F.,  Wang, W.,  McMillan, L.,  and
  Threadgill,  D. W. (2007). The polymorphism architecture of mouse genetic
  resources elucidated using genome-wide resequencing data: Implications for
  QTL discovery and systems genetics. Mamm. Genome. 18, 473-481.
Rose, K. A., Murphy, C. A., Diamond, S. L., Fuiman, L. A., and Thomas, P.
  (2003). Using nested models and laboratory data for predicting population
  effects of contaminants  on fish: A step towards a bottom-up approach for
  establishing causality in  field studies. Hum. Ecol. Risk Assess. 9, 231-257.
Saltelli,  A., Chan, K., and Scott, E. M. (2000). Mathematical and Statistical
  Methods: Sensitivity Analysis. Wiley, Chichester.
Sasagawa, S., Ozaki, Y.-I., Fujita, K., and Kuroda, S. (2005). Prediction and
  validation of the distinct dynamics of transient and sustained ERK activation.
  Nat. Cell Biol. 7, 365-373.
   Schneider, I. C., and Haugh, J. M. (2006). Mechanisms of gradient sensing and
     chemotaxis: Conserved pathways, diverse regulation. Cell Cycle (George-
     town, Tex.) 5,  1130-1134.
   Schwartz, D., and Collins, F.  (2007). Environmental biology  and human
     disease. Science 316, 695-696.
   Scott, J., Ideker,  T., Karp, R. M., and Sharan, R. (2006). Efficient algorithms
     for detecting signaling pathways in protein interaction networks. J. Comput.
     Biol. 13, 133-144.
   Setzer, R. W., Lau, C., Mole, L. M., Copeland, M. F., Rogers, J. M., and
     Kavlock, R. J.  (2001). Toward a biologically based dose-response model for
     developmental toxicity of 5-fluorouracil in the rat: A mathematical construct.
     Toxicol. Sci. 59, 49-58.
   Sheu, S.  H., Kaya, T.,  Waxman, D. J., and Vajda, S. (2005).  Exploring  the
     binding site structure of the PPAR-γ ligand binding domain by
     computational solvent mapping. Biochemistry 44, 1193-1209.
   Slob, W., and Pieters,  M. N. (1998).  A probabilistic approach for deriving
     acceptable human intake limits and human health risks  from toxicological
     studies: General framework. Risk Anal. 18, 787-798.
   The International HapMap Consortium. (2005). A haplotype map of the human
     genome. Nature 437, 1299-1320.
   Thomas,  R. S., Pluta, L., Yang, L., and Halsey, T. A. (2007). Application of
     genomic biomarkers  to predict increased lung tumor incidence in 2-year
     rodent cancer bioassays. Toxicol. Sci. 97, 55-64.
   Timchalk, C., Trease, H. E., Trease, L. L., Minard, K. R., and Corley, R. A.
     (2001). Potential technology  for  studying  dosimetry and response to
     airborne chemical and  biological pollutants.  Toxicol. Ind. Health.  17,
     270-276.
   Tong, S. T. Y. (2001).  An integrated exploratory approach to examining  the
     relationships of environmental stressors and fish responses. J. Aquat. Ecosys.
     Stress Recov. 9, 1-19.
   U.S. EPA. (2003).  A  framework for a computational toxicology research
     program. Washington, D.C. EPA600/R-03/65.
   U.S. EPA. (1997).  Guiding principles for Monte Carlo analysis. U.S. Envi-
     ronmental Protection Agency, Risk Assessment Forum,  Washington, D.C.
     EPA/630/R-97/001.
   van der Voet, H., and Slob, W. (2007). Integration of probabilistic exposure
     assessment and probabilistic hazard characterization. Risk Anal. 27, 351-371.
   van Straalen, N.  M. (2003). Ecotoxicology becomes stress ecology. Environ.
     Sci. Technol. 37, 324A-330A.
   van  Straalen, N. M.,  and Roelofs, D.  (2006). Introduction  to Ecological
     Genomics. Oxford University Press, New York.
   Vedani, A., Dobler, M., and Lill, M. A.  (2006). The challenge of predicting
     drug toxicity in silico. Basic Clin. Pharmacol. Toxicol. 99, 195-208.
   Villeneuve, D. L., Larkin, P., Knoebl, I., Miracle, A. L., Kahl, M. D.,
     Jensen, K. M., Makynen, E. A., Durhan, E. J., Carter, B. J., Denslow, N.  D.,
     et al.  (2007).  A  graphical  systems model to  facilitate hypothesis-driven
     ecotoxicogenomics research on  the teleost  brain-pituitary-gonadal  axis.
     Environ. Sci. Technol. 41, 321-330.
   Wang, H., Huang,  H., Li, H.,  Teotico, D.  G., Sinz, M., Baker, S.  D.,
     Staudinger, J.,  Kalpana, G., Redinbo, M. R., and Mani, S. (2007). Activated
     pregnenolone X-receptor is  a target for ketoconazole and its  analogs. Clin.
     Cancer Res. 13, 2488-2495.
   Watanabe, K. H., Jensen, K. M., Villeneuve, D. L., Nichols, J.,  Zhenhong, L.,
     Breen, M. S., Bencic, D. C., Collette, T., Denslow, N. D., and Ankley, G. T.
     (2006). A  physiologically  based model of  the  hypothalamus-pituitary-
     gonadal axis  in female fathead  minnows.  27th  Annual  Society  of
     Environmental Toxicology and Chemistry (SETAC) North America Meeting.
     Montreal, Quebec, Canada, November 2006 (Abstract).
   Weinstein, J. (2006). Spotlight on molecular profiling: 'Integromic' analysis of
     the NCI-60 cancer cell lines. Mol. Cancer Ther. 5, 2601-2605.
Wheeler, M. W., and Bailer,  A. J.  (2007). Properties  of model-averaged
  BMDLs:  A  study  of model averaging in dichotomous response risk
  estimation. Risk Anal. 27, 659-670.
Whiteside,  T. S., Hilal, S. H., and Carreira, L. A.  (2006). Estimation  of
  phosphate  ester hydrolysis  rate constants—Alkaline  hydrolysis. QSAR
  Comb. Sci. 25, 123-133.
Wijnhoven, S. W., Hoogervorst, E. M., de Waard, H., van der Horst, G. T., and
  van Steeg, H. (2007). Tissue specific mutagenic  and carcinogenic responses
  in NER defective mouse models. Mutat. Res. 3, 77-94.
Wilke, R. A., Lin, D. W., Roden, D. M., Watkins, P. B., Flockhart, D.,
  Zineh, I., Giacomini, K. M., and  Krauss, R. M. (2007). Identifying genetic
  risk factors for  serious  adverse drug  reactions:  current  progress and
  challenges. Nat. Rev. Drug Discov. 6(11), 904-916.
Yang,  C. (2007).  Understanding toxicity through  chemical  and biological
  fingerprints. Int.  Sci. Forum Comput. Toxicol. 21-23; (Abstract).
Yang, Y., Abel, S. J., Ciurlionis, R., and Waring, J. F. (2006). Development of
  a  toxicogenomics in vitro  assay  for the  efficient characterization  of
  compounds. Pharmacogenomics 7, 177-186.

-------
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales  Registered Number: 1072954 Registered office: Mortimer House,
37-41 Mortimer Street, London W1T 3JH, UK
Journal of Toxicology and Environmental Health, Part B
Publication details, including instructions for authors and subscription information:
http://www.informaworld.com/smpp/title~content=t713667286


Database for Physiologically Based Pharmacokinetic (PBPK) Modeling:
Physiological Data for Healthy and Health-Impaired Elderly
Chad M. Thompson a; Douglas O. Johns b; Babasaheb Sonawane a; Hugh A. Barton c; Dale Hattis d; Robert
Tardif e; Kannan Krishnan e
a National Center for Environmental Assessment, Office of Research and Development, U.S. Environmental
Protection Agency, Washington, DC, USA
b National Center for Environmental Assessment, Office of Research and Development, U.S. Environmental
Protection Agency, Research Triangle Park, North Carolina, USA
c Pfizer Inc., Pharmacokinetic/Pharmacodynamic Modeling, Groton, CT, USA
d Marsh Institute Center for Technology, Environment and Development, Clark University, Worcester,
Massachusetts, USA
e Groupe de recherche interdisciplinaire en santé et Département de santé environnementale et santé au
travail, Université de Montréal, Montréal, Canada

Online Publication Date: 01 January 2009
To cite this Article: Thompson, Chad M., Johns, Douglas O., Sonawane, Babasaheb, Barton, Hugh A., Hattis, Dale, Tardif, Robert and
Krishnan, Kannan (2009) 'Database for Physiologically Based Pharmacokinetic (PBPK) Modeling: Physiological Data for Healthy and
Health-Impaired Elderly', Journal of Toxicology and Environmental Health, Part B, 12:1, 1-24
To link to this Article: DOI: 10.1080/10937400802545060
URL: http://dx.doi.org/10.1080/10937400802545060

-------
Journal of Toxicology and Environmental Health, Part B, 12:1-24, 2009
ISSN: 1093-7404 print / 1521-6950 online
DOI: 10.1080/10937400802545060
           DATABASE FOR PHYSIOLOGICALLY BASED PHARMACOKINETIC (PBPK)
           MODELING: PHYSIOLOGICAL DATA FOR HEALTHY
           AND HEALTH-IMPAIRED ELDERLY

           Chad M. Thompson1, Douglas O. Johns2, Babasaheb Sonawane1, Hugh A. Barton3,
           Dale Hattis4, Robert Tardif5, Kannan Krishnan5
           1National Center for Environmental Assessment, Office of Research and Development, U.S.
           Environmental Protection Agency, Washington, DC, USA, 2National Center for Environmental
           Assessment, Office of Research and Development, U.S. Environmental Protection Agency,
           Research Triangle Park, North Carolina, USA, 3Pfizer Inc., Pharmacokinetic/Pharmacodynamic
           Modeling, Groton, CT, USA, 4Marsh Institute Center for Technology, Environment and
           Development, Clark University, Worcester, Massachusetts, USA, and 5Groupe de recherche
           interdisciplinaire en santé et Département de santé environnementale et santé au travail,
           Université de Montréal, Montréal, Canada
           Physiologically based pharmacokinetic (PBPK) models have increasingly been employed in chemical health risk
           assessments. By incorporating individual variability conferred by genetic polymorphisms, health conditions, and
           physiological changes during development and aging, PBPK models are ideal for predicting chemical disposition in
           various subpopulations of interest. In order to improve the parameterization  of PBPK models for healthy and health-
           impaired elderly (herein defined as those aged 65 yr and older), physiological parameter values were obtained from
           the peer-reviewed literature, evaluated, and entered into a Microsoft ACCESS database. Database records include
           values for key age-specific model inputs such as ventilation rates, organ volumes and blood flows, glomerular filtration
           rates, and other clearance-related processes. In total, 528 publications were  screened for relevant data, resulting in
           the inclusion of 155 publications comprising 1051 data records for healthy elderly adults and 115 data records for
           elderly with conditions such as diabetes, chronic obstructive pulmonary disease (COPD), obesity, heart disease, and
           renal disease. There are no consistent trends across parameters or their associated variance with age; the gross
           variance in body weight decreased with advancing age, whereas there was no change in variance for brain weight.
           The database contains some information to inform ethnic and  gender differences in parameters; however, the
           majority of the published data pertain to Asian (mostly Japanese) and Caucasian males.  As expected, the number of
           records tends to decrease with advancing age. In addition to a general lack of data for parameters in the elderly with
           various health conditions, there is also a dearth of information on blood and tissue composition in all elderly groups.
           Importantly, there are relatively few records for alveolar ventilation rate; therefore, the relationship between this
           parameter and cardiac output (usually assumed to  be 1:1) in the elderly  is  not well  informed by the database.
           Despite these limitations, the database represents a potentially useful resource for parameterizing PBPK models for
           the elderly to facilitate the prediction of dose metrics in older populations for application in risk assessment.
    Physiologically based pharmacokinetic (PBPK) models and biologically based  dose-response
(BBDR) models  have increasingly been employed in chemical  health risk assessments, and their
evaluation and application in a regulatory context has recently received much attention (Clark et al.,
2004; Barton et  al., 2007; Chiu et al., 2007; Thompson et al., 2008). These models utilize biological
information in order to predict the disposition of chemicals for which limited human data exist. It is
widely recognized that, in addition  to  genetic variation, differences  in life stage and  health status
affect the disposition of environmental  toxicants. When pharmacokinetic models are developed for
predicting the disposition of environmental toxicants  in humans, the  models frequently represent a


    The authors extend their gratitude to Drs. Andrew Geller and Linda Birnbaum of the National Health and Environmental Effects
Research Laboratory for their insightful assistance in the development of this database. This database was developed under a contract
(RFQ-DC-03-00328) funded by the U.S. Environmental Protection Agency.
    The views expressed in this article are those of the authors and do not necessarily reflect the views or policies of the U.S. Environ-
mental Protection Agency.
    Address correspondence to Chad M. Thompson, National Center for Environmental Assessment, Office of Research and Development,
U.S. Environmental Protection Agency, 1200 Pennsylvania Avenue, NW, Washington, DC, 20460, USA. E-mail: Thompson.Chad@epa.gov

-------
                                                                        C. M. THOMPSON ET AL
standard healthy adult, in part because the human data used to develop such models are often
obtained from younger adults. Thus children, the elderly, and health-impaired individuals represent
subpopulations  that may  benefit from specific consideration of susceptibility by modeling internal
dosimetry via PBPK modeling.
    In order to construct models for these subpopulations, it is important to have as much information
as possible for the physiological parameter values that most influence disposition. Such data include
alveolar ventilation, cardiac  output, organ and tissue weights/volumes and  corresponding  blood
flows, clearance parameters (e.g., glomerular filtration rate,  liver enzyme content), and body com-
position. In developing PBPK models, analysts often  rely on a variety of resources for obtaining
these data; unfortunately, this can add to the variability and uncertainty in and among models. In
an attempt to alleviate these problems,  peer-reviewed compilations of physiological parameters
were developed for children, laboratory animals, and adult humans (Davies & Morris, 1993; Brown
et al., 1997; Price et al., 2003a, 2003b; Gentry et al., 2004);  however, no such compilations exist
for  empirical measures of the quantitative changes in physiological  parameters in  healthy and
health-impaired older adults. Recognizing that aging  may  impart various sensitivities to environ-
mental exposures as well  as increased variability in responses (Geller & Zenick, 2005;  Brown et al.,
2005), it is anticipated that the development of such information will improve health risk assess-
ments by better characterizing risk to older populations and perhaps reducing the uncertainty
inherent in  risk  assessment.
    The primary objectives of this study were to review and collect relevant physiological data for
the elderly (i.e., individuals >65 yr of age), and to integrate these data within a freely downloadable
relational database. In  addition, an analysis of the influence of age on selected physiological param-
eters is presented,  and data gaps in physiological parameters required for PBPK modeling in these
populations are identified.


    LITERATURE REVIEW AND DATABASE DESCRIPTION

    Literature Review
    This work was conducted between 2005 and 2006 using published reports in the peer-reviewed
literature, which was searched via PUBMED using the terms aged, elderly, old, age-dependent, aging, ageing,
or geriatric, in combination with the terms related to body mass index (BMI), body composition,
cardiac output,  breathing  rate, tissue volumes, tissue blood flows, tissue composition, and clearance
parameters. Initially, about 3000 to 24,000 references were obtained every time a search was con-
ducted using two terms in combination (e.g., cardiac output and age-dependent). Further, the use
of aged, old,  age-dependent, aging, ageing, and geriatric redundantly produced  several thousand
papers that were all either previously captured by the  use of the term "elderly" or irrelevant to the
present study. On  the basis  of these observations and feasibility  issues, the search  strategy was
refined as follows. Searches were conducted using elderly as the sole key word in combination with
each  of the physiological terms and by specifying certain limits  during  the search.  These limits
resulted in literature that (1) contained both search terms in either the title or the abstract, (2)  inves-
tigated human subjects, and  (3) collected data in subjects aged 65  yr or older. The searches were
then repeated with the addition  of a term representative of the  pathophysiological  condition of
interest, i.e., diabetes, COPD, obesity, angina  pectoris, and  renal disease (Table 1). There is no
claim that 100% of publications on quantitative physiological information in elderly were identified
during this  process, but it is believed that these approaches yielded a reasonably unbiased repre-
sentation of the available  literature.
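As a minimal sketch of the refined strategy, the term combinations described above can be enumerated programmatically. The boolean query syntax and the function name build_queries are illustrative assumptions; they are not the actual PUBMED queries or search limits used in this study.

```python
from itertools import product

# Search terms taken from the description of the literature review.
PHYSIOLOGICAL_TERMS = [
    "body mass index", "body composition", "cardiac output", "breathing rate",
    "tissue volumes", "tissue blood flows", "tissue composition",
    "clearance parameters",
]
CONDITION_TERMS = ["diabetes", "COPD", "obesity", "angina pectoris", "renal disease"]


def build_queries(age_term="elderly"):
    """Return the base queries, then the same queries narrowed by each condition."""
    base = [f"{age_term} AND {term}" for term in PHYSIOLOGICAL_TERMS]
    narrowed = [f"{query} AND {cond}" for query, cond in product(base, CONDITION_TERMS)]
    return base + narrowed


if __name__ == "__main__":
    for query in build_queries()[:3]:
        print(query)
```

Using a single age term keeps the number of searches at 8 base queries plus 40 condition-specific queries, which reflects the feasibility concern raised above.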
    Following the literature search, the abstract of each publication (title  when the abstract was not
available in PUBMED) was evaluated to see whether it might contain relevant data. In this process,
clinical and in vivo pharmacokinetic studies with various drugs were excluded except when the
study used  a probe substance for an enzyme of interest, measured renal clearance parameters, or
included measurements of organ weights  or blood flows. In total,  528 peer-reviewed publications
were retrieved and reviewed to identify the specific parameter  evaluated as well as the  experimental

-------
PHYSIOLOGICAL PARAMETERS IN THE ELDERLY
TABLE 1. Health Conditions Included in the Database

Health condition   Reason for inclusion                                                                      Records
Obesity (a)        Prevalent among elderly; about 26% of men aged 65-74 and a greater percentage of women    31
                   aged between 65 and 74 are obese
Diabetes (a)       18% of all people over 60 have diabetes                                                   25
COPD (a)           Fourth largest cause of death in the United States and ranks as the second major Social   30
                   Security-compensated disability
Heart disease (a)  Has a major incidence (30%) and is the leading cause of death in persons aged >65 yr      14
Liver disease      Influence on hepatic clearance                                                            10
Renal disease (a)  75% of elderly persons have renal failure at presentation with 20% requiring dialysis     5

  (a) Beers (2005).

design/protocol, sampling method, number of samples, statistical analysis, precision/variability, and
any bias. A publication was excluded from further consideration if it reported data only in figures,
data not relevant to the database, or data without appropriate or interpretable units. Studies reporting
physiological parameters without body weight or cardiac output measurement were retained, as
were studies that did not identify the ethnicity (frequent) or gender (rare) of the subjects; however,
these studies reported physiological  data (e.g., tissue blood flow) along with age or some other
parameter relevant for further analysis (e.g., tissue weight, body mass index). Designation of study
subjects' disease states by the original investigator was accepted as  such, and no  reevaluations of
these classifications were attempted. Generally, the medication  history of the study subjects was not
reported; however, based on the summary information available, studies with individuals (or patients)
on treatment/medication affecting the physiological  parameter of interest were excluded from the
present study. In all, 155  publications were retained as sources of data for inclusion in the database.
Although some relevant studies may have been overlooked, the current structure of the database
will permit the future addition of  new data and updating of existing physiological information in
older adults.

    Description of the Database
    A general description of the structure of the database can be found in Gentry et al. (2004), as
well as at the following U.S. Environmental Protection Agency (EPA) web site, where the database is
freely available for download (http://cfpub.epa.gov/ncea/cfm/recordisplay.cfm?deid=188288). This
Microsoft ACCESS database is an extension of a similar database for early human life stages developed
in cooperation with the International Life Sciences  Institute. (The early life stage database is available
at http://rsi.ilsi.org/Projects/Physio_Parameters_db.htm). In brief, this  database contains three tables
linked by a study identification number (Figure 1). Following the entry of information into the study
table, the information on the study subjects was entered  into the subject characteristics and PBPK
tables. Figure 1 shows the fields in each table.  Some of the  fields in the PBPK table include
those relating to specific  parameter evaluated, value of parameter, parameter units, group or indi-
vidual data,  parameter variability, type of variability, number evaluated, body weight, method used,
and comments.
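The three-table layout can be sketched as follows, using SQLite in place of Microsoft ACCESS. This is an illustration under stated assumptions rather than the actual database definition: only a handful of fields are shown, the names loosely follow Figure 1, ParameterValue is a hypothetical field name, and the sample row is the first alveolar ventilation entry listed in Table 3.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Study (
    IDNO INTEGER PRIMARY KEY,          -- unique study identifier
    ReferenceCitation TEXT
);
CREATE TABLE SubjectCharacteristics (
    IDNO INTEGER REFERENCES Study(IDNO),
    SSNO INTEGER,                      -- subject or group number within a study
    Sex TEXT,
    Age TEXT,
    AgeUnits TEXT,
    PRIMARY KEY (IDNO, SSNO)
);
CREATE TABLE PBPK (
    IDNO INTEGER,
    SSNO INTEGER,
    SpecificParameterEvaluated TEXT,
    ParameterValue REAL,               -- hypothetical field name
    ParameterUnits TEXT,
    ParameterVar REAL,
    ParameterVarType TEXT,
    NumberEvaluated INTEGER,
    BodyWeight REAL,
    BWUnits TEXT,
    FOREIGN KEY (IDNO, SSNO) REFERENCES SubjectCharacteristics (IDNO, SSNO)
);
"""

QUERY = """
SELECT s.ReferenceCitation, c.Sex, c.Age,
       p.SpecificParameterEvaluated, p.ParameterValue, p.ParameterUnits
FROM PBPK AS p
JOIN SubjectCharacteristics AS c ON c.IDNO = p.IDNO AND c.SSNO = p.SSNO
JOIN Study AS s ON s.IDNO = p.IDNO
WHERE p.SpecificParameterEvaluated = ?
"""


def demo():
    """Build the sketch schema in memory, add one record, and query it."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    con.execute("INSERT INTO Study VALUES (1, 'Miller and Tenney (1956)')")
    con.execute("INSERT INTO SubjectCharacteristics VALUES (1, 1, 'M', '68-89', 'yr')")
    con.execute(
        "INSERT INTO PBPK VALUES (1, 1, 'Alveolar ventilation', 4.86, 'L/min', "
        "0.8, 'SD', 18, 66.4, 'kg')"
    )
    rows = con.execute(QUERY, ("Alveolar ventilation",)).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    print(demo())
```

Keying the subject and parameter tables on the pair (IDNO, SSNO) mirrors the description of SSNO as a subfield under IDNO, so one study can contribute several groups and several parameters per group.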
    Each data record  contained the numerical value along with variability for each parameter (if
applicable and  available), recognizing that for grouped data the values capture  some elements of
biological variability and  measurement-related  uncertainty. Multiple data records were created for
studies reporting values for more than one parameter. Similarly, multiple records were created for
studies containing values in  more  than one study group  (e.g.,  diabetic group and  control group).
Although the initial intent was  to collect  parameters in  older adults  of specific  health statuses
(healthy, obese, diabetic, chronic obstructive pulmonary disease [COPD], renal disease, or cardiac
disease), data were also collected in those designated as patients but not specifically representative
of the aforementioned categories.  This was done, in part, because the literature review indicated
that a number of studies  reported  data collected in patients often  admitted for  diagnosis or
treatment of other symptoms. Therefore, in an effort to include such data, the term "patients" was

-------
[Figure 1: diagram of the three linked database tables and their fields.
  Study: IDNO, ReferenceCitation, Laboratory/PrincipalInvestigator, Published, RefID.
  Subject Characteristics: IDNO, SSNO, Sex, AgeCategory, Age, AgeUnits, EthnicGroup.
  PBPK: IDNO, SSNO, NumberEvaluated, BodyWeight, BWUnits, BWVarType, BWVar, BMI, BMIUnits, BMIVarType, BMIVar,
  BSA, BSAUnits, BSAVarType, BSAVar, SpecificParameterEvaluated, MethodsUsedtoMeasureParameter,
  SensitivityofMethod, GroupData, ParameterUnits, ParameterVarType, TypeofDistributionforMeasurement,
  ParameterVar, Comments, LinktoData, UncertaintyinParameter, QuestionableReliabilityofMethods.
  Two further PBPK field names are illegible in the source.]
FIGURE 1. Database structure. Database contains three tables linked by a study identification number. Abbreviations: IDNO, unique
identification number for each study; RefID, reference  manager database identification number; SSNO, subject-specific number
assigned to each individual or group within a given study (i.e., it is a subfield under IDNO); BW, body weight; Var, variability; BMI, body
mass index; BSA, body surface area.


introduced into the database. End users of this database will need to consider
whether individuals identified as patients should be included with healthy subjects, as it is probable
that many older adults are affected by similar yet undiagnosed conditions. Some studies reported
physiological  parameter values along  with  data on BMI and  body surface area (BSA), but did  not
include data  on  body weight.  Thus, additional fields were added for BMI and BSA to potentially
allow for normalization of physiological parameters (e.g., cardiac output divided by  BSA)  to facili-
tate comparison across studies.
    The large majority of the studies were cross-sectional in design, comparing individuals or popu-
lations of different ages at the  same time. There were also  a  few longitudinal studies that  reported
measures of a parameter obtained more than once during  a period of several years. Such data
corresponding to  each sampling time  were entered  as  individual  records.  In total, 1166 data
records were entered  and in several cases a single record  consisted  of several useful observations
(e.g., body weight, BMI, BSA, blood flow rate). For quality assurance, 10% of the data entered into
the database were independently checked  by a second scientist, who compared the values entered
into the database with those in the original study. Studies reporting parameter values for individuals
between 50 and 64 yr of age were included when there were limited or no data in elderly, as these
data can provide  additional support of changes, if any, in physiological parameter values as a
function of age.


    PHYSIOLOGICAL DATA IN HEALTHY ELDERLY

    Gross Weight

    Body Weight   Body weight is a basic anthropometric parameter reported in most studies included
in the database. Most records in the database containing a numerical  value for a physiological

-------
parameter also  have data on the body weight, BSA, or BMI  measured in the same individual or
study group.  Four  major reports used in  populating the database provide data on  body weight.
Inoue and Otsu (1987) published data based on 1067 serial autopsies (493 women and 507 men)
performed from June 1972 to March 1977 at a general hospital in Tokyo. Puggaard et al. (2002)
reported body weight data for Danish women aged 65 yr (n = 22), 75 yr (n = 26), and
85 yr (n = 31). Galloway et al. (1965) reported mean body weights from 400 autopsies of
people in their twenties through nineties performed at a Veterans Administration Hospital in
Wisconsin.  Unfortunately, the latter study did not provide data on the number of subjects in each
of their 10-yr age groups, the standard deviations of the measurements, ethnicity, or gender. It was
presumed that these data represent male subjects only, since they were collected before large numbers
of women veterans became eligible for medical services from this agency. Unlike the previous three
studies,  the Third National Health and Nutrition Examination  Survey (NHANES III)* contains body
weight data for 5679 living individuals (2742 males and 2937 females). Table 2 provides an illustra-
tion  of the data on body weight as a function of age in Japanese males and females based on
autopsy data  from  Inoue and Otsu (1987). These data indicate an overall decline in body weights
with age, and perhaps a modest tendency toward narrowing of the distribution, as indicated by the
lower gross coefficient of variation (CV) values in older age groups for both genders. Comparison of
gross mean male body weights in the Japanese and U.S. populations (Figure 2) indicates  that the
U.S. subjects studied by Galloway et al. (1965) are much heavier on average than the male  subjects
as reported by Inoue and Otsu (1987). Therefore it is important to judge the rates of body-weight
decline  indicated for the two populations in relative terms. When the slope of each  of the regres-
sion  lines in Figure  2 is normalized to the  mean body weights  in the 60-69 yr  age group, it can be
seen that both populations show body-weight declines of approximately 0.4% per year in  the eld-
erly age range.
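The normalization described above can be reproduced with a few lines of arithmetic. The sketch below divides each regression slope from Figure 2 by the mean body weight of the 60-69 yr group. It assumes that the U.S. reference weight is the 153 lb listed in Table 4 for the youngest Galloway et al. (1965) group, which is an inference rather than a value stated in the text; the function name is illustrative.

```python
LB_TO_KG = 0.45359237  # exact definition of the avoirdupois pound in kilograms


def percent_decline_per_year(slope_kg_per_yr, reference_weight_kg):
    """Express a regression slope as a percentage of a reference body weight."""
    return 100.0 * abs(slope_kg_per_yr) / reference_weight_kg


# Slopes from the Figure 2 regression equations (kg per year of age).
japan = percent_decline_per_year(-0.187, 45.1)            # 45.1 kg: Table 2, males 60-69
usa = percent_decline_per_year(-0.236, 153 * LB_TO_KG)    # assumed 153 lb reference

if __name__ == "__main__":
    print(f"Japanese males: {japan:.3f}% per year")
    print(f"U.S. males:     {usa:.3f}% per year")
```

Under these assumptions the two results round to the 0.415 and 0.340 shown in the Figure 2 legend, consistent with the figure of approximately 0.4% per year quoted above.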
    Body Mass Index, Fat Mass, and Fat-Free Mass  BMI, or Quetelet's index, is defined as body
weight (kg) divided by the square of height (m2) (Quetelet, 1869). The database contains numerous
entries for BMI, as this index is computed fairly routinely in clinical studies, particularly in those
with elderly subjects. Data for BMI were entered into the database along with body-weight infor-
mation or data on specific physiological parameters for each individual or study group, where avail-
able. Figure 3 depicts changes in body weight and BMI as a function of age based on  the data from
the NHANES III study. It can be seen that despite the larger body weights in adult males, mean BMI


TABLE 2. Elderly Body weight (kg) in Japanese Population

Age range   Number of cases  Gross mean   SD   Gross CV   Standardized mean   Standardized SD   Standardized CV
Males
60-69       104              45.1         9.4  0.208      46.0                9.0               0.196
70-79       234              44.0         8.8  0.200      44.3                9.0               0.203
80-89       158              41.8         7.9  0.189      42.5                8.3               0.195
90-99       11               39.6         4.4  0.111      39.5                7.5               0.190
100+
Total       507
Females
60-69       73               39.2         8.6  0.219      40.5                9.5               0.235
70-79       198              38.4         8.8  0.229      38.8                9.8               0.253
80-89       174              35.1         7.3  0.208      36                  7.3               0.203
90-99       44               33.9         6.6  0.195      34.5                7.3               0.212
100+        4                34.1         5.9  0.173      31.9                6.7               0.210
Total       493

  Note. Data from Inoue and Otsu (1987).
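The gross CV column of Table 2 is the standard deviation divided by the mean. As a small check, the sketch below recomputes it from the gross mean and SD values for males; the dictionary layout and function name are illustrative choices.

```python
def coefficient_of_variation(mean, sd):
    """Return the coefficient of variation (SD divided by the mean)."""
    if mean <= 0:
        raise ValueError("mean must be positive")
    return sd / mean


# Gross mean and SD of body weight (kg) for males, from Table 2.
MALES = {
    "60-69": (45.1, 9.4),
    "70-79": (44.0, 8.8),
    "80-89": (41.8, 7.9),
    "90-99": (39.6, 4.4),
}

if __name__ == "__main__":
    for age_range, (mean, sd) in MALES.items():
        print(age_range, round(coefficient_of_variation(mean, sd), 3))
```

The recomputed values agree with the tabulated gross CVs to three decimal places, including the lower value in the 90-99 yr group that underlies the narrowing of the distribution noted in the text.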


    *NHANES III is based on a nationally representative sample of the U.S. civilian non-institutionalized population between the ages
of 2 months and 90 yr.

-------
[Figure 2: scatter plot of mean male body weight against Midpoint of Age Range (Yr), 60 to 100, with fitted regression lines.
  Legend: U.S. Males, 0.340; y = 84.2 - 0.236x, R^2 = 0.824.
          Japanese Males, 0.415; y = 57.6 - 0.187x, R^2 = 0.980.]
 FIGURE 2. Age-related declines in male body weight. Data for U.S. (Galloway et al., 1965) and Japanese (Inoue & Otsu, 1987) autopsies.
[Figure 3: two panels, (A) and (B), each plotted against Age (yrs) from 0 to 90.]
           FIGURE 3. Population-weighted mean body weights (A) and BMI (B). Data are derived from NHANES III.


values are almost identical between the genders with the possible exception of a short period  in
early adulthood. It is also evident, by comparing Figures 2 and 3A, that body weight has increased
in U.S. males during the 30 yr between the Galloway et al. (1965) and NHANES III studies.
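For reference, the BMI definition used in this section can be written as a short function. This is a generic sketch; the example weight and height are hypothetical and are not taken from the database.

```python
def body_mass_index(weight_kg, height_m):
    """Return BMI (Quetelet's index): weight in kg divided by height in m, squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


if __name__ == "__main__":
    # Hypothetical individual: 70 kg, 1.75 m tall.
    print(round(body_mass_index(70.0, 1.75), 1))
```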
    One reason for calculating BMI is that it shows a strong relationship with body fat content. Even
though there are sophisticated methods for the quantification  of body fat, the BMI  has been used
frequently for this purpose (Deurenberg et al.,  1991). "Body fat"  in this  context refers to organic
compounds  constituting the esters of glycerol and  fatty acids and  their associated organic groups,
which serve  as a reserve of energy for the body. Body fat reported in the literature represents both
essential and nonessential fat. "Essential"  fat refers to the lipid constituents of cells and represents

-------
about 2 to 5% of the lean body mass, whereas the "nonessential" fat is contained in adipose tissue,
which occurs principally in subcutaneous tissue, yellow bone marrow, and the abdominal cavity—
genital,  perirenal, mesenteric, and omental compartments (ICRP, 1975).  It is worth noting that in
PBPK modeling,  "essential" fat is included in all tissue compartments, whereas the "nonessential"
fat is often represented as a separate compartment with its own tissue volume and separate blood
flow. Several hundred values of fat mass (in kilograms and as percentages) in older adults have been
entered into the database. The fat mass is a function of a number of factors: height, weight, age, gen-
der, diet, and physical activity (Sitar, 1998). Generally, women have a higher proportion of body fat
than men with the same BMI, and in both genders the fat percentage increases with age. In women,
there appears to be a postmenopausal acceleration of this trend. In general, the maximum age for
fat accumulation  is around the sixth decade of life, with a plateau phase and a subsequent reduction
in the amount and proportion of body fat occurring in the seventh and eighth decades of life (Sitar,
1998). The data for older adults included in the database are consistent with this general trend.
Although the fat-free mass refers to body mass devoid of all physically extractable fat, the lean body
mass additionally includes the essential fat. These data were entered along with body weight or BMI
information from each study, thus facilitating  further analysis, particularly with respect to tissue
weights.

    Cardiopulmonary Parameters

    Breathing Rate and Pulmonary Function Parameters  Vital capacity is the most widely mea-
sured respiratory parameter and information is  available from both cross-sectional and longitudinal
studies. Previous analyses  suggest a vital capacity decline of about 36 ml/yr  in men and about
21 ml/yr in women after age 40 (Goldman & Becklake, 1959; Knudson,  1991; Crapo, 1993). On
the other hand, cross-sectional and longitudinal studies are suggestive of an age-dependent increase
in dead air space; i.e., the volume of the conducting airways where gas exchange does not occur
(Pierce & Ebert, 1958; Anthonisen et al., 1994). Seven records on alveolar ventilation rate, obtained
from three studies (Morris et al., 1956, 1964; Miller & Tenney, 1956), are included in the database.
Table 3 contains data for most of the cardiopulmonary entries in the database, including those for
individuals with  impaired health (note that not all fields in the database are shown in the table).
    Cardiac Output and Cardiac Index   Cardiac output refers  to the volume of blood  ejected
from each ventricle of the heart per unit time.  Cardiac output normalized to BSA is referred to as
cardiac  index. Some studies suggested a reduction in cardiac output with increasing age whereas
others did not (Lauson et al., 1944; Brandfonbrener et al., 1955; Julius et al., 1967; Collis et al.,
2001; Katori, 1979; Luisada et al., 1980; Crean et al., 1986; Lakatta, 1990; Okazaki et al., 1996).
This might indicate that factors other than age exert important influences on cardiac output and that
these factors may have differed among the individuals in the various studies.
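Because cardiac output is reported in both L/min and ml/min across studies, a small helper for computing cardiac index with a consistent unit can be useful when comparing records. The following is a minimal sketch; the function name, the units argument, and the example values are assumptions for illustration and do not come from the database.

```python
def cardiac_index(cardiac_output, bsa_m2, output_units="L/min"):
    """Return cardiac index in L/min/sq.m (cardiac output divided by BSA)."""
    if bsa_m2 <= 0:
        raise ValueError("body surface area must be positive")
    if output_units == "ml/min":
        cardiac_output = cardiac_output / 1000.0  # convert ml/min to L/min
    elif output_units != "L/min":
        raise ValueError("output_units must be 'L/min' or 'ml/min'")
    return cardiac_output / bsa_m2


if __name__ == "__main__":
    # Hypothetical subject: cardiac output 5.0 L/min, BSA 1.8 sq.m.
    print(round(cardiac_index(5.0, 1.8), 2))
```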

    Tissue Weights (Volumes)

    Liver Weight A number of studies showed that liver weight decreases with the advancement
of age. Boyd  (1933) reported a decrease in liver weight between ages 20 and 80 yr in men and
women, based on measurements in 1582 subjects following accidental death. This study and others
are suggestive of an approximate 20% decrease in liver weight between the second or third decade
and the eighth or ninth decade (Swift et al., 1978; Marchesini et al., 1988; Bach et al., 1981). The
database contains 33 records on liver volume from the following studies: Galloway et al. (1965),
Sato et al. (1970), Swift et al. (1978), Bach et al. (1981), Inoue and Otsu (1987), Wynne et al.
(1989), Puggaard et al. (2002), and Chouker et al. (2004). Weights of the liver and other
organs/compartments are shown in Table 4. Comparison of the gross liver weights in the U.S. and
Japanese male subjects indicates a more rapid decline with age in the U.S. subjects (1.17 vs. 0.97%
per year) (Figure 4A).
    Brain Weight  The database contains  16 records of brain weight in older adults, collected from
four studies (Gordan, 1956; Galloway et al., 1965; Inoue & Otsu, 1987; Puggaard et al., 2002).
Based on data from Inoue and Otsu (1987), brain weight exhibits the smallest interindividual variation

-------
TABLE 3. Sample Query Data for Cardiopulmonary Function in Healthy and Health-Impaired Elderly
Parameter                 Health         Gender  Age        Values   Units        Var  Type   BW    BW units  BW var  BW type  n
Alveolar ventilation (a)  Healthy        M       68-89      4.86     L/min        0.8  SD     66.4  kg        9.7     SD       18
Alveolar ventilation (b)  NS             M       60-69      3.80     L/min                                                     71
Alveolar ventilation (c)  NS             M       60-69      4.34     L/min                                                     39
Alveolar ventilation (b)  NS             M       70-79      3.89     L/min                                                     54
Alveolar ventilation (c)  NS             M       70-79      4.37     L/min                                                     38
Alveolar ventilation (b)  NS             M       80-89      3.87     L/min                                                     8
Alveolar ventilation (c)  NS             M       80-89      4.19     L/min                                                     21
Cardiac index (d)         COPD           F/M     42-72      3.5      L/min/sq.m   0.7  SD                                      22
Cardiac index (d)         COPD           F/M     42-72      3.09     L/min/sq.m   0.5  SD                                      9
Cardiac index (e)         COPD           F/M     65 ±8      2.8      L/min/sq.m   0.3  SD     57    kg        15      SD       11
Cardiac index (f)         Diabetes       F/M     29-70      3.18     L/min/sq.m   0.2  SE                                      19
Cardiac index (f)         Diabetes       F/M     29-70      4.03     L/min/sq.m   0.2  SE                                      25
Cardiac index (g)         Healthy        F       60-69      2.59     L/min/sq.m   0.6         142   lb                         12
Cardiac index (g)         Healthy        F       70-79      2.56     L/min/sq.m   0.3         142   lb                         5
Cardiac index (g)         Healthy        F       80-89      2.60     L/min/sq.m   0.2         121   lb                         7
Cardiac index (h)         Healthy        F/M     60         3.49     L/min/sq.m                                                1
Cardiac index (h)         Healthy        F/M     61         3.47     L/min/sq.m                                                1
Cardiac index (h)         Healthy        F/M     62         3.44     L/min/sq.m                                                1
Cardiac index (h)         Healthy        F/M     63         3.42     L/min/sq.m                                                1
Cardiac index (h)         Healthy        F/M     64         3.39     L/min/sq.m                                                1
Cardiac index (h)         Healthy        F/M     65         3.37     L/min/sq.m                                                1
Cardiac index (h)         Healthy        F/M     66         3.35     L/min/sq.m                                                1
Cardiac index (h)         Healthy        F/M     67         3.32     L/min/sq.m                                                1
Cardiac index (h)         Healthy        F/M     68         3.30     L/min/sq.m                                                1
Cardiac index (h)         Healthy        F/M     69         3.28     L/min/sq.m                                                1
Cardiac index (h)         Healthy        F/M     70         3.26     L/min/sq.m                                                1
Cardiac index (h)         Healthy        F/M     71         3.24     L/min/sq.m                                                1
Cardiac index (h)         Healthy        F/M     72         3.21     L/min/sq.m                                                1
Cardiac index (h)         Healthy        F/M     73         3.19     L/min/sq.m                                                1
Cardiac index (h)         Healthy        F/M     74         3.17     L/min/sq.m                                                1
Cardiac index (h)         Healthy        F/M     75         3.15     L/min/sq.m                                                1
Cardiac index (h)         Healthy        F/M     76         3.12     L/min/sq.m                                                1
Cardiac index (h)         Healthy        F/M     77         3.10     L/min/sq.m                                                1
Cardiac index (h)         Healthy        F/M     78         3.08     L/min/sq.m                                                1
Cardiac index (h)         Healthy        F/M     79         3.05     L/min/sq.m                                                1
Cardiac index (i)         Healthy        M       65.4       2.58     L/min/sq.m   0.2  SD                                      10
Cardiac index (i)         Healthy        M       73.3       2.54     L/min/sq.m   0.2  SD                                      9
Cardiac index (i)         Healthy        M       82         2.36     L/min/sq.m   0.2  SD                                      7
Cardiac index (g)         Healthy        M       60-69      2.83     L/min/sq.m   0.7         160   lb                         25
Cardiac index (j)         Healthy        M       63 ±2      2.43     L/min/sq.m   0.1  SE     79.8  kg        2.3     SE       14
Cardiac index (g)         Healthy        M       70-79      2.86     L/min/sq.m   0.4         165   lb                         15
Cardiac index (g)         Healthy        M       80-89      2.39     L/min/sq.m   0.3         148   lb                         13
Cardiac index (k)         Heart disease  F/M     27-67      1.5-3.7  ml/sq.m           Range                                   20
Cardiac index (l)         Heart disease  F/M     >66        1.75     L/min/sq.m   0.5  SD                                      41
Cardiac index (m)         Heart disease  M       60         3.3      L/min/sq.m                                                1
Cardiac index (m)         Heart disease  F       60         2.4      L/min/sq.m                                                1
Cardiac index (m)         Heart disease  M       60         3.3      L/min/sq.m                                                1
Cardiac index (m)         Heart disease  M       68         1.6      L/min/sq.m                                                1
Cardiac index (n)         Patients       F       64         1.96     ml/min/sq.m                                               1
Cardiac index (n)         Patients       M       63         1.55     ml/min/sq.m                                               1
Cardiac index (n)         Patients       M       63         2.40     L/min/sq.m                                                1
Cardiac index (n)         Patients       M       65         1.85     L/min/sq.m                                                1
Cardiac index (o)         Patients       NS      90-97      2.72     L/min/sq.m   0.6                                          9
Ventilation rate (p)      Healthy        M       62         4.90     L/min                    70.5  kg                         1
Ventilation rate (p)      Hypertensive   F       61         6.45     L/min                    63.7  kg                         1
Ventilation rate (p)      Hypertensive   M       71         3.72     L/min                    58.8  kg                         1
Vital capacity (d)        COPD           F/M     42-72      3.5      L            0.6  SD                                      22
Vital capacity (d)        COPD           F/M     42-72      2.5      L            0.4  SD                                      9
                                                                                              (Continued)

-------
TABLE 3. (Continued)
Parameter                 Health         Gender  Age        Values   Units        Var  Type   BW    BW units  BW var  BW type  n
Vital capacity (q)        COPD           M       67.7 ±6.8  2.81     L            0.8  SD     55.3  kg        6.4     SD       13
Vital capacity (q)        COPD           M       71.5 ±8.8  2.15     L            0.7  SD     43.8  kg        6.2     SD       11
Vital capacity (b)        NS             M       60-69      4.18     L            0.8  SD                                      71
Vital capacity (b)        NS             M       70-79      3.87     L            0.7  SD                                      55
Vital capacity (b)        NS             M       80-89      3.38     L            0.7  SD                                      8
  Note. For clarity, only select fields are shown in this table (and those that follow). More detailed queries and output can be generated
by users after downloading the database. NS, not specified; SD, standard deviation; SE, standard error; Var, Variability; sq.m, square
meters.
  (a) Miller and Tenney (1956).
  (b) Norris et al. (1964).
  (c) Norris et al. (1964); Shock and Yiengst (1955).
  (d) Seibold et al. (1988).
  (e) Capderou et al. (2000).
  (f) Jermendy et al. (1986).
  (g) Luisada et al. (1980).
  (h) Katori et al. (1979).
  (i) Brandfonbrener et al. (1955).
  (j) Dinenno et al. (2001).
  (k) Crean et al. (1986).
  (l) Cody et al. (1988).
  (m) Benchimol et al. (1968).
  (n) Lauson et al. (1944).
  (o) Okazaki et al. (1996).
  (p) Bolomey et al. (1948).
  (q) Nishimura et al. (1995).
among all organ weights covered in the database; however, there is no clear trend of gross coeffi-
cient of variation (CV) values with age. Given that neurotoxicity is of particular concern in the eld-
erly  (Ginsberg et al., 2005;  Brown  et al., 2005; Geller & Zenick, 2005), it may  be important to
include this organ specifically in PBPK models for older adults. Figure 4B shows that the age-related
decline in brain weights in elderly is identical between Japanese and U.S. males.
    Heart Weight   Twenty-two records in the database relate to heart weight in healthy older adults.
With the exception of 2 records that contain data collected in mixed groups of adults (>60 yr) and
elderly, all other records contain values of heart weight from studies that investigated elderly adults
aged 65 to over 100 yr (Galloway et al., 1965; Inoue & Otsu, 1987; Olivetti et al., 1995;  Puggaard
et al., 2002).
    Kidney Weight   The  database  contains  36  records on  kidney  weight in  healthy elderly,
obtained primarily from 5 studies (Galloway et al., 1965; Tauchi et al., 1971; Inoue & Otsu, 1987;
Schmitz et al., 1990; Puggaard et al., 2002). Although some records contain data collected in mixed
groups of adults over the age of 60, there are enough data to ascertain kidney weight as a function
of age in adults from 65 to 100 yr of age. Because some studies reported whole kidney weight while
others  reported  right and  left kidney weights, database entries can  be found  for "kidney weight,"
"left kidney weight," and "right kidney weight."
    Spleen Weight  There are 16 records for spleen weight in the database, based on the studies
of Galloway et al. (1965), Inoue and Otsu  (1987), and Puggaard et al.  (2002). Of all the organ weights
entered into the database, the largest variability is seen for the  spleen. This may be the result of
upregulation of the immune system leading to increases in functional cells to fight infections on the
one  hand, and possible vulnerability to stress-hormone related decreases in organ size on  the other.
    Other Tissue Weights  The database also contains several entries on weights of left lung, right
lung, testes, thyroid, pituitary, pancreas, prostate and adrenal glands reported for three or four sub-
groups of the elderly (average age: 65, 73, 84, or 92 yr), on the basis of 400 autopsies performed by

-------
TABLE 4. Sample Query Data for Weight of Selected Tissues in Elderly
Tissue            Gender  Age      Value    Units       Var     Type  BW     BW units  BW var  BW type  n
Adrenal (a)       F/M     64.5     16.7     g                         153    lb                         400
Adrenal (a)       F/M     73.3     19.6     g                         143    lb
Adrenal (a)       F/M     83.8     14.6     g                         145    lb
Brain (b)         F       64.3     1321     g           126     SD    61.7   kg        13.3    SD       61
Brain (b)         F       74       1271     g           104     SD    60     kg        11.8    SD       48
Brain (b)         F       83.9     1236     g           112     SD    56.2   kg        11.6    SD       19
Brain (c)         F       60-69    1220.1   g           93.3    SD    39.2   kg        8.6     SD       70
Brain (c)         F       70-79    1158.5   g           108.5   SD    38.4   kg        8.8     SD       194
Brain (c)         F       80-89    1139     g           101.2   SD    35.1   kg        7.3     SD       172
Brain (c)         F       90-99    1092.1   g           69.9    SD    35.1   kg        7.3     SD       43
Brain (c)         F       >99      1107.5   g           63.4    SD    34.1   kg        5.9     SD       4
Brain (a)         F/M     64.5     1300     g                         153    lb                         400
Brain (a)         F/M     73.3     1270     g                         143    lb
Brain (a)         F/M     83.8     1198     g                         145    lb
Brain (c)         M       60-69    1321.9   g           124.4   SD    45.1   kg        9.4     SD       99
Brain (c)         M       70-79    1294     g           127.1   SD    44     kg        8.8     SD       225
Brain (c)         M       80-89    1242.1   g           116     SD    41.8   kg        7.9     SD       150
Brain (c)         M       90-99    1165.5   g           77.2    SD    39.6   kg        4.4     SD       11
Heart (b)         F       64.3     344      g           76      SD    61.7   kg        13.3    SD       61
Heart (b)         F       74       342      g           64      SD    60     kg        11.8    SD       48
Heart (b)         F       83.9     337      g           64      SD    56.2   kg        11.6    SD       19
Heart (c)         F       60-69    287.9    g           99.9    SD    39.2   kg        8.6     SD       73
Heart (c)         F       70-79    321.9    g           96.7    SD    38.4   kg        8.8     SD       196
Heart (c)         F       80-89    308.4    g           77.1    SD    35.1   kg        7.3     SD       174
Heart (c)         F       90-99    307.6    g           76.7    SD    35.1   kg        7.3     SD       44
Heart (c)         F       >99      306.3    g           83.2    SD    34.1   kg        5.9     SD       4
Heart (a)         F/M     64.5     444      g                         153    lb                         400
Heart (a)         F/M     73.3     463      g                         143    lb
Heart (a)         F/M     83.8     369      g                         145    lb
Heart (a)         F/M     92       390      g                         135    lb
Heart (c)         M       60-69    345.9    g           116.3   SD    45.1   kg        9.4     SD       103
Heart (c)         M       70-79    343.3    g           96.3    SD    44     kg        8.8     SD       225
Heart (c)         M       80-89    327.1    g           79      SD    41.8   kg        7.9     SD       150
Heart (c)         M       90-99    285      g           42.4    SD    39.6   kg        4.4     SD       11
Kidney (b)        F       64.3     252      g           47      SD    61.7   kg        13.3    SD       61
Kidney (b)        F       74       213      g           39      SD    60     kg        11.8    SD       48
Kidney (b)        F       83.9     200      g           40      SD    56.2   kg        11.6    SD       19
Kidney (d)        F/M     60-69    355      g           13.67   SD                                      35
Kidney (d)        F/M     60-69    249.5    g           17.7    SD                                      19
Kidney (e)        F/M     61 ±9    141      g/1.73 m2   24      SD                                      14
Kidney (d)        F/M     70-79    327      g           10.89   SD                                      32
Kidney (d)        F/M     70-79    224.5    g           9.5     SD                                      11
Kidney (d)        F/M     >81      294.9    g           18.38   SD                                      22
Kidney (d)        F/M     >81      183.6    g           7.33    SD                                      14
Left kidney (c)   F       60-69    119.2    g           40.9    SD    39.2   kg        8.6     SD       73
Left kidney (c)   F       70-79    112.2    g           34.8    SD    38.4   kg        8.8     SD       196
Left kidney (c)   F       80-89    100      g           29.2    SD    35.1   kg        7.3     SD       172
Left kidney (c)   F       90-99    95       g           27.4    SD    35.1   kg        7.3     SD       44
Left kidney (c)   F       >99      93.8     g           31.5    SD    34.1   kg        5.9     SD       4
Left kidney (a)   F/M     64.5     177      g                         153    lb                         400
Left kidney (a)   F/M     73.3     149      g                         143    lb
Left kidney (a)   F/M     83.8     145      g                         145    lb
Left kidney (a)   F/M     92       110      g                         135    lb
Left kidney (c)   M       60-69    144.7    g           38.7    SD    45.1   kg        9.4     SD       102
Left kidney (c)   M       70-79    145.1    g           106.5   SD    44     kg        8.8     SD       225
Left kidney (c)   M       80-89    123.4    g           33.4    SD    41.8   kg        7.9     SD       150
Left kidney (c)   M       90-99    96.2     g           18.2    SD    39.6   kg        4.4     SD       11
Left lung (a)     F/M     64.5     575      g                         153    lb                         400
                                                                                           (Continued)

-------
PHYSIOLOGICAL PARAMETERS IN THE ELDERLY
                                                                                            11
TABLE 4. (Continued)

Tissue          Gender  Age     Value    Units       Var     Type  BW     Units  Var    Type  n
Left lung^a     F/M     73.3    539      g                         143    lb
Left lung^a     F/M     83.8    612      g                         145    lb
Left lung^a     F/M     92      690      g                         135    lb
Liver^b         F       64.3    1425     g           305     SD    61.7   kg     13.3   SD    61
Liver^b         F       74      1295     g           241     SD    60     kg     11.8   SD    48
Liver^b         F       83.9    1178     g           289     SD    56.2   kg     11.6   SD    19
Liver^c         F       60-69   1006.9   g           405.1   SD    39.2   kg     8.6    SD    72
Liver^f         F       61-70   1581.81  g           381.1   SD    68.48  kg     15.49  SD    48
Liver           F       61-70   1720     g           364     SD                               16
Liver^f         F       61-70   1825     g           264     SD                               11
Liver^c         F       70-79   984      g           482     SD    38.4   kg     8.8    SD    196
Liver^c         F       80-89   803.6    g           260.2   SD    35.1   kg     7.3    SD    173
Liver^c         F       90-99   718.8    g           200.2   SD    35.1   kg     7.3    SD    44
Liver^c         F       >99     565      g           102.5   SD    34.1   kg     5.9    SD    4
Liver^a         F/M     64.5    1569     g                         153    lb                  400
Liver^a         F/M     73.3    1398     g                         143    lb
Liver^a         F/M     83.8    1273     g                         145    lb
Liver^a         F/M     92      1000     g                         135    lb
Liver^c         M       60-69   1137.1   g           389.3   SD    45.1   kg     9.4    SD    103
Liver^g         M       60-69   1410.1   g           53.93   SD                               214
Liver^g         M       60-69   1487.8   g           46.1    SD                               214
Liver^g         M       60-69   1479.5   g           54.57   SD                               214
Liver^g         M       60-69   1004.9   g           32.59   SD                               231
Liver^f         M       61-70   1953     g           377     SD                               34
Liver^f         M       61-70   2042     g           432     SD                               29
Liver^f         M       61-70   1832.23  g           452     SD    80.03  kg     17.21  SD    101
Liver^c         M       70-79   1064.4   g           417.5   SD    44     kg     8.8    SD    225
Liver^g         M       70-79   886.6    g           26.4    SD                               231
Liver^c         M       80-89   879.2    g           252.1   SD    41.8   kg     7.9    SD    150
Liver^g         M       >81     771.8    g           38.53   SD                               231
Liver^c         M       90-99   833      g           121.2   SD    39.6   kg     4.4    SD    11
Pancreas^a      F/M     64.5    112      g                         153    lb                  400
Pancreas^a      F/M     73.3    105      g                         143    lb
Pancreas^a      F/M     83.8    103      g                         145    lb
Pancreas^a      F/M     92      110      g                         135    lb
Pituitary^a     F/M     64.5    0.72     g                         153    lb                  400
Pituitary^a     F/M     73.3    0.69     g                         143    lb
Pituitary^a     F/M     83.8    0.7      g                         145    lb
Prostate^a      F/M     64.5    40.6     g                         153    lb                  400
Prostate^a      F/M     73.3    52.9     g                         143    lb
Prostate^a      F/M     83.8    50.4     g                         145    lb
Right kidney^c  F       60-69   115.6    g           37.5    SD    93.2   kg     8.6    SD    72
Right kidney^c  F       70-79   111.6    g           32.8    SD    38.4   kg     8.8    SD    196
Right kidney^c  F       80-89   97.4     g           31      SD    35.1   kg     7.3    SD    171
Right kidney^c  F       90-99   88.3     g           22.7    SD    35.1   kg     7.3    SD    43
Right kidney^c  F       >99     81.3     g           22.5    SD    34.1   kg     5.9    SD    4
Right kidney^a  F/M     64.5    173      g                         153    lb                  400
Right kidney^a  F/M     73.3    146      g                         143    lb
Right kidney^a  F/M     83.8    138      g                         145    lb
Right kidney^a  F/M     92      105      g                         135    lb
Right kidney^c  M       60-69   139.9    g           48.4    SD    45.1   kg     9.4    SD    104
Right kidney^c  M       70-79   126.5    g           36.1    SD    44     kg     8.8    SD    225
Right kidney^c  M       80-89   114.6    g           32.7    SD    41.8   kg     7.9    SD    150
Right kidney^c  M       90-99   93.7     g           14.2    SD    39.6   kg     4.4    SD    11
Right lung^a    F/M     64.5    706      g                         153    lb                  400
Right lung^a    F/M     73.3    640      g                         143    lb
Right lung^a    F/M     83.8    722      g                         145    lb
Right lung^a    F/M     92      690      g                         135    lb
                                                                                     (Continued)
TABLE 4. (Continued)

Tissue          Gender  Age     Value    Units       Var     Type  BW     Units  Var    Type  n
Testes^a        M       64.5    42.1     g                         153    lb                  400
Testes^a        M       73.3    45.3     g                         143    lb
Testes^a        M       83.8    32.9     g                         145    lb
Thyroid^a       F/M     73.3    26.2     g                         143    lb
Thyroid^a       F/M     83.8    29.5     g                         145    lb

  Note. Data are for individuals with health characterized as "Healthy" or "Unspecified" in the database.
  ^a Calloway et al. (1965).
  ^b Puggaard et al. (2002).
  ^c Inoue et al. (1987).
  ^d Tauchi et al. (1971).
  ^e Schmitz et al. (1990).
  ^f Chouker et al. (2004).
  ^g Sato et al. (1970).
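
Because Table 4 reports body weight in pounds for some studies and in kilograms for others, records have to be put on a common basis before organ weights can be compared across sources. The following is a minimal Python sketch of that step, not part of the database itself; the function names are illustrative, and the only value assumed beyond the table is the standard definition 1 lb = 0.45359237 kg.

```python
LB_TO_KG = 0.45359237  # standard definition of the avoirdupois pound in kg


def body_weight_kg(value, units):
    """Return a body weight in kg for a record reported in 'lb' or 'kg'."""
    units = units.strip().lower()
    if units in ("lb", "lbs"):
        return value * LB_TO_KG
    if units == "kg":
        return value
    raise ValueError(f"unsupported body-weight units: {units!r}")


def organ_percent_of_body_weight(organ_g, bw_value, bw_units):
    """Express an organ weight (g) as a percentage of body weight."""
    return 100.0 * (organ_g / 1000.0) / body_weight_kg(bw_value, bw_units)


# Two records taken from Table 4:
# liver, F/M, 64.5 yr, 1569 g, body weight 153 lb
liver_pct = organ_percent_of_body_weight(1569, 153, "lb")
# brain, F, 60-69 yr, 1220.1 g, body weight 39.2 kg
brain_pct = organ_percent_of_body_weight(1220.1, 39.2, "kg")
print(f"liver: {liver_pct:.2f}% of BW, brain: {brain_pct:.2f}% of BW")
```

Expressing organ weight relative to body weight in this way is one common scaling used when parameterizing PBPK models; whether it is appropriate for a given record depends on how the original study reported body weight.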
[Figure 4 plots. Panel A: liver weight (y-axis, 800 to 1800) versus Midpoint of Age Range (Yr) (x-axis, 60 to 100); legend: U.S., 1.17; y = 2776 - 18.3x, R^2 = 0.974; Japanese, 0.97; y = 1856 - 11.0x, R^2 = 0.947. Panel B: brain weight (y-axis, 1100 to 1400) versus Midpoint of Age Range (Yr) (x-axis, 60 to 100); legend: Japanese, 0.394; y = 1673 - 5.21x, R^2 = 0.958; U.S., 0.392; y = 1638 - 5.10x, R^2 = 0.947.]
FIGURE 4. Age-related declines in male organ weights. Comparison of age-dependent declines in liver (A) and brain (B) weight. Data are
from U.S. (Calloway et al., 1965) and Japanese (Inoue & Otsu, 1987) autopsies. Note that ethnicity was not reported in the Calloway
et al. (1965) study; thus, these data may not be representative of a mixed U.S. population.
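
The linear fits printed in the Figure 4 legend can be evaluated directly to estimate organ weight at a given age. The sketch below is illustrative only: the coefficients are those shown in the legend, the function and dictionary names are hypothetical, and the fits should be assumed valid only over the plotted range of age-group midpoints (roughly 60 to 100 yr).

```python
# Fits from the Figure 4 legend: weight (g) = intercept - slope * age (yr)
FITS = {
    ("liver", "US"): (2776.0, 18.3),
    ("liver", "Japanese"): (1856.0, 11.0),
    ("brain", "Japanese"): (1673.0, 5.21),
    ("brain", "US"): (1638.0, 5.10),
}


def predicted_weight_g(organ, population, age_yr):
    """Organ weight predicted by the linear fit at the given age."""
    intercept, slope = FITS[(organ, population)]
    return intercept - slope * age_yr


def percent_decline(organ, population, age_from, age_to):
    """Percent decline in predicted weight between two ages."""
    w_from = predicted_weight_g(organ, population, age_from)
    w_to = predicted_weight_g(organ, population, age_to)
    return 100.0 * (w_from - w_to) / w_from


for organ, population in FITS:
    decline = percent_decline(organ, population, 65, 95)
    print(f"{organ:5s} {population:8s} decline from 65 to 95 yr: {decline:.1f}%")
```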


Calloway et al. (1965). For each tissue, there are three or four records in the database, each
corresponding to one age group as reported in the original study. For several of these tissues, there is no
significant age-related change in absolute weight. Data on blood volume obtained from Lauson
et al. (1944), Fulop et al. (1985), Jermendy et al. (1986), Kenney and Zappe (1994), and Kenney
and Ho (1995) are also included in the database.

    Tissue Blood Flows

    Liver (Hepatic and Splanchnic Blood Flow)   Blood flow to the liver consists of hepatic arterial
flow and portal venous flow. The portal flow represents the blood collected from the alimentary
organs such as the stomach, spleen, pancreas, and intestine. Total hepatic blood flow and portal
blood flow measurements in older adults were reported in Dencker et al. (1972) and Zoli et al.
(1989, 1999). The records based on Dencker et al. (1972) contain subject-specific data for 5
individuals aged 61, 66, 66, 67, and 74 yr, whereas the data from Zoli et al. (1989, 1999) are presented
as age-specific group means and standard deviations. These data are presented in 20 records in the
database. Of these, 15 records contain flow rates exclusively for those 65 to 75 yr of age, whereas
the 5 other records contain data collected in mixed groups of elderly and adults over the age of 56.
The hepatic blood flow specified in PBPK models frequently represents the sum of the hepatic
artery flow and the portal vein flow. Hepatic artery flow rises to about 25% of total hepatic flow
after age 45 and remains relatively stable thereafter (Zoli et al., 1999). However, the portal
vein flow decreases significantly, resulting in a 29% reduction of total hepatic flow in adults 75 yr of
age and older as compared to those 45 yr of age and younger (Zoli et al., 1989, 1999).
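
The relationships described above can be written out as a short calculation. This Python sketch is only an illustration: the function names are hypothetical, the 29% reduction is the group-level figure quoted from Zoli et al. (1989, 1999), and the flow values used in the example are placeholders rather than database records.

```python
def total_hepatic_flow(hepatic_artery, portal_vein):
    """Total hepatic blood flow as the sum of arterial and portal flow."""
    return hepatic_artery + portal_vein


def hepatic_artery_fraction(hepatic_artery, portal_vein):
    """Fraction of total hepatic flow contributed by the hepatic artery."""
    return hepatic_artery / total_hepatic_flow(hepatic_artery, portal_vein)


def age_adjusted_total_flow(total_flow_young, reduction=0.29):
    """Scale a younger-adult (45 yr or younger) total hepatic flow to adults
    75 yr and older, using the reported 29% reduction as a default."""
    return total_flow_young * (1.0 - reduction)


# Placeholder flows in ml/min (hypothetical values, for illustration only)
artery, portal = 250.0, 750.0
young_total = total_hepatic_flow(artery, portal)
print(hepatic_artery_fraction(artery, portal))  # arterial share of total flow
print(age_adjusted_total_flow(young_total))     # scaled total for age >= 75 yr
```

Applying a single scaling factor ignores interindividual variability; the standard deviations stored with the Zoli et al. records would be needed to describe that spread.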
    Splanchnic blood flow refers to the blood flow to the abdominal organs such as the liver, spleen,
stomach, pancreas, and intestine. However, when the splanchnic blood flow is estimated on the
basis of clearance of substances that are extracted primarily by the liver (e.g., indocyanine green and
bromsulphalein), the resulting data essentially reflect liver blood flow (Bender, 1965; Williams &
Leggett, 1989). Eleven records on splanchnic blood flow from three studies in older adults have been
included in the current database (Sherlock et al., 1950; Kenney & Ho, 1995; Ho et al., 1997). Tissue
blood flows for liver and other organs/compartments are shown in Table 5.
    Cerebral Blood Flow   Blood flow to the brain in older adults has been measured more
frequently than the blood flow to other tissues. Even though the data are generally suggestive of a
decline in absolute blood flow to the brain with advancing age (Williams & Leggett, 1989), several
of the earlier studies reported brain blood flow without additional information such as body weight,
BSA, BMI, cardiac output, brain weight, gender, race, or health status. Thus, the older data on brain
blood flow were included in the database only when, at the very least, the age and health status of
individuals were known. In total, 39 records on cerebral blood flow in the elderly were included
(Schieve & Wilson, 1953; Fazekas et al., 1952, 1953, 1960; Gordan, 1956; Scheinberg et al.,
1950; Shenkin et al., 1953; Meltzer et al., 2000; Slosman et al., 2001; Kamper et al., 2004). Of
these records, 34 relate to data collected only in individuals aged 65 and older. Four records
relate to data on the blood flow to the gray and white matter, reported separately (Frackowiak
et al., 1980). The mean values for the gray and white matter can be combined to calculate the
mean cerebral blood flow, on the basis of the volume ratio of these components (3:2) (Frackowiak
et al., 1980).
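
As an illustration of the combination described above, the following Python sketch computes a volume-weighted mean from separate gray and white matter values using the 3:2 ratio. The function name is hypothetical, it assumes both inputs share the same per-tissue units (e.g., ml/100 g/min), and the example flows are placeholders rather than values from Frackowiak et al. (1980).

```python
def mean_cerebral_blood_flow(gray_flow, white_flow, gray_parts=3.0, white_parts=2.0):
    """Volume-weighted mean of gray and white matter blood flow.

    The default weights reflect the 3:2 gray-to-white volume ratio
    mentioned in the text; both flows must be in the same units.
    """
    total_parts = gray_parts + white_parts
    return (gray_parts * gray_flow + white_parts * white_flow) / total_parts


# Placeholder flows in ml/100 g/min (hypothetical, for illustration only)
print(mean_cerebral_blood_flow(50.0, 20.0))
```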
    Renal Blood Flow  Renal  blood flow  refers to the volume of blood  perfusing the excretory
tissue (kidney)  per unit time. It is  understood that renal blood flow decreases with increasing age,
and the  age-related  decrease in perfusion  holds  even  after correcting  for the kidney volume
(Hollenberg et al., 1974). There are 40 data records on renal blood flow in the database, reflecting
data from 5 studies (Lauson et al., 1944; Bolomey et al.,  1949; Davies & Shock, 1950; Kenney &
Ho, 1995; Ho et al., 1997). Most of the  records correspond to individual data obtained in people
between 61 and 89 yr of age. Additionally, the data on the clearance of p-aminohippurate (PAH) in
individuals from 60 to 88 yr of age were included (McDonald et al., 1951; Miller  et al.,  1951;
Lindeman et al., 1966; Slack & Wilson,  1976). These data  can be used to calculate  renal  plasma
flow, and converted  to renal  blood flow  on the basis of hematocrit levels (Williams  & Leggett,
1989).
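
The conversion from PAH clearance to renal blood flow mentioned above can be sketched as follows. This is an illustrative Python example under stated assumptions: PAH clearance is treated as an estimate of renal plasma flow, an optional extraction ratio (default 1.0, i.e., complete extraction) is exposed as a parameter, and the hematocrit of 0.42 in the example is an assumed value, not one taken from the database. The function names are hypothetical.

```python
def renal_plasma_flow(pah_clearance, extraction_ratio=1.0):
    """Renal plasma flow estimated from PAH clearance.

    With extraction_ratio = 1.0 this is the effective renal plasma flow;
    a measured extraction ratio below 1 gives a larger, corrected value.
    """
    if not 0.0 < extraction_ratio <= 1.0:
        raise ValueError("extraction_ratio must be in (0, 1]")
    return pah_clearance / extraction_ratio


def renal_blood_flow(plasma_flow, hematocrit):
    """Convert renal plasma flow to renal blood flow using hematocrit."""
    if not 0.0 <= hematocrit < 1.0:
        raise ValueError("hematocrit must be a fraction in [0, 1)")
    return plasma_flow / (1.0 - hematocrit)


# PAH clearance for ages 60-69 from Table 5: 439 ml/min/1.73 sq.m
rpf = renal_plasma_flow(439.0)
rbf = renal_blood_flow(rpf, hematocrit=0.42)  # hematocrit is an assumed value
print(f"renal blood flow: {rbf:.0f} ml/min/1.73 sq.m")
```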
    Muscle and Skin  Blood Flows   There are 17  records  in the database relating to muscle
and skin blood flow in older adults. Because the data on skin blood flow in the literature often
are measured/reported as forearm or leg blood  flow, not all records will be found under "skin
blood  flow." As described by Williams and Leggett (1989), the blood flow to forearm, calf,
and foot may be used to estimate the skin blood flow rate for the entire body.  It should be
noted  that several of the studies included in the database reported muscle blood flow for a
mixed group of younger adults and elderly (e.g., 7 records cover 59 to 79 yr old), but the
results were reported on the basis of milliliters per minute per 100 g (or 100 ml) tissue. All
values on muscle blood flow  rate included  in the database represent means and variability for
a group, since none of the studies  reported subject-specific muscle blood flow rates (Hellon &
Clarke, 1959;  Lassen et al., 1964; Amery et al.,  1969;  Dwyer and Howe,  1995; Kenney & Ho,
1995;  Ho et al., 1997; Dinenno et al., 2001). Several of  the studies just mentioned as well as
other reports  not included in the database investigated  leg and forearm blood  flow rates in
older adults during exercise; however, this is not an aspect specifically covered  in the data-
base at this point.

TABLE 5. Sample Query for Data on Blood Flow to Tissues in the Elderly
Tissue
Adipose3
Adipose3
Adipose3
Adipose3
Adipose3
Adipose3
Adipose3
Calf6
Cerebral0
Cerebral
Cerebral6
Cerebral6
Cerebral6
Cerebral6
Cerebral0
Cerebral'
Cerebral'
Cerebral6
Cerebral6
Cerebral6
Cerebral6
Cerebral6
Cerebral6
Cerebral6
Cerebral6
Cerebral6
Cerebral6
Cerebral6
Cerebral6
Cerebral6
Cerebral11
Cerebral'
Cerebral'
Cerebral'
Cerebral'
Cerebral'
Cerebral'
Cerebral'
Cerebral'
Cerebral'
Cerebral'
Foot6
Forearm*
Forearm*
Forearm'
Forearm"1
Forearm"
Leg0
Liverp
Liverp
Liver'
Liver'
LiverP
Liverp
Muscle'
Gender
F
F
M
M
M
M
M
M
F
F/M
M
M
M
M
M
M
M
NS
NS
NS
NS
NS
NS
NS
NS
NS
NS
NS
NS
NS
NS
NS
NS
NS
NS
NS
NS
NS
NS
NS
NS
M
F/M
F/M
M
M
M
M
F
F
F/M
F/M
M
M
M
Age
66
77
64
66
70
74
74
70-82
50-71
69.8 ±5.4
65
69
74
79
50-71
78 ± 6.6
78 ± 6.6
72
72
77
78
78
78
78
79
81
86
91
91
91
50-76
66-92
66-92
66-92
66-92
66-92
66-92
66-92
66-92
66-92
90-1 02
70-82
60-79
60-79
38-73
59-71
65 ±1
43-68
>66
>66
61-75
>76
>66
>66
38-73
Value
0.453
0.491
0.202
0.538
0.602
0.397
0.387
0.7-3.1
36.4
62
56
55
50
40
38.5
612
567
41.1
42.7
27.8
32.0
37.0
34.2
34.7
57.9
34.7
30.6
36.1
42.4
44.1
55
17.2
33.7
66.4
38.8
32.5
18.7
24.6
44.9
40.0
39.3
1 .0-6.7
4.4
4.1
4.01
1.38
4.13
0.34
0.95
869
1297
1020
1211
1.07
2.8
Units
L/min
L/min
L/min
L/min
L/min
L/min
L/min
ml/100g/min
ml/min/100g
ml/min/100 g
ml/100g/min
ml/100g/min
ml/100g/min
ml/100g/min
ml/min/100g
ml/min
ml/min
ml/min/100 g
ml/min/100g
ml/min/100g
ml/min/100g
ml/min/100 g
ml/min/100 g
ml/min/100 g
ml/min/100g
ml/min/100g
ml/min/100g
ml/min/100 g
ml/min/100 g
ml/min/100 g
ml/min/100g
ml/100g/min
ml/100g/min
ml/100g/min
ml/100g/min
ml/100g/min
ml/100g/min
ml/100g/min
ml/100g/min
ml/100g/min
ml/100g/min
ml/100g/min
ml/min/100g
ml/min/100g
ml/100g/min
ml/min/100 g
ml/min
L/min
ml/min/g tissue
ml/min
ml/min
ml/min
ml/min
ml/min/g tissue
ml/100g/min
Var








6.4
4.4




5.3
34
21













9











1.9
1.8
0.35
0.11
0.61
0.04
0.04
62
253
148
66
0.04
0.48
Type








SD
SD




SD
SE
SE













SD











SD
SD
SE
SE
SE
SE
SE
SE
SD
SD
SE
SE
SE
BW Units Var Type n
62.8 kg 1
74.7 kg 1
57.3 kg 1
78.9 kg 1
78.1 kg 1
56.9 kg 1
77.1 kg 1
20
3
9
1
1
1
1
18
7
7
1
1
1
1
1
1
1
1
1
1
1
1
1
7
1
1
1
1
1
1
1
1
1
22
20
70 kg 5 SD 10
71 kg 7 SD 11
25
74.1 kg 5.2 SE 6
89 kg 2 SE 4
10
11
11
10
10
11
11
25
                                                                                          (Continued)
TABLE 5. (Continued)
Tissue
Muscler
Muscles'
PAH clearance'
PAH clearance"
PAH clearance'
PAH clearance"
Portal""
Portal""
Portal""
Portal""
Renalx
Renalx
Renalx
Renal^
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renalx
Renal"1
Renalx
Renal"
Renalx
Renalx
Skin'
Skinz
Splanchnic33
Splanchnic33
Splanchnic33
Splanchnic33
Splanchnic33
Splanchnic33
Gender
M
NS
F/M
F/M
F/M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
Age
50-75
>51
60-69
>61
70-79
77-88
56-70
56-70
>72
>72
61
62
62
62
64
65
66
66
67
68
69
70
71
71
71
72
72
74
77
78
80
80
80
80
81
83
85
86
86
87
87
89
59-71
61-68
65 ±1
70-78
80-89
38-73
66-82
60
60
62
63
64
65
Value
3.15
1.96
439
431
386
283
656
394
595
361
806
941.4
853
903
733.6
931.2
673.6
837
453.2
845.8
672
411
728.7
425.9
603.6
568
844.6
474.4
627.3
587.8
460
284.1
731.5
680.8
360
498.6
411.1
506
570.6
412.2
237
553.8
1004
774.7
895
589
475.4
25.6
2.99
765
660
890
500
920
1020
Units Var Type
ml/min/1 00 g 0.55 SD
ml/100g/min 0.56 SD
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min 27 SE
ml/min 175 SD
ml/min/sq.m 101 SD
ml/min 106 SD
ml/min/sq.m 53 SD
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min/1 .73 sq.m
ml/min 71 SE
ml/min/1 .73 sq.m 139 SD
ml/min 69 SE
ml/min/1 .73 sq.m 133 SD
ml/min/1 .73 sq.m 141 SD
ml/100g/min 5.05 SE
ml/100g/min 0.46 SD
ml/min/sq.m
ml/min/sq.m
ml/min/sq.m
ml/min/sq.m
ml/min/sq.m
ml/min/sq.m
BW Units Var Type n
10
42
5
4
3
61 kg 7
60
60
60
60
1
1
1
70.5 kg 1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
74.1 kg 5.2 SE 6
10
89 kg 2 SE 4
9
12
25
74.9 kg 5.3 SD 7
60 kg 1
49 kg 1
55 kg 1
57 kg 1
64 kg 1
48 kg 1
                                                                                     (Continued)
TABLE 5. (Continued)

Tissue          Gender  Age     Value    Units        Var     Type  BW     Units  Var    Type  n
Splanchnic^aa   M       67      795      ml/min/sq.m                48     kg                  1
Splanchnic^aa   M       68      723      ml/min/sq.m                46     kg                  1
Splanchnic^aa   M       75      578      ml/min/sq.m                50     kg                  1
Splanchnic^m    M       59-71   1302.00  ml/min       109     SE    74.1   kg     5.2    SE    6
Splanchnic^n    M       65 ± 1  1050     ml/min       158     SE    89     kg     2      SE    4

  Note.  Data are for individuals with health characterized as "Healthy" or "Unspecified"  in the database. NS, not specified; SD,
standard deviation; SE, standard error; sq.m, square meters.
  ^a Lesser et al. (1967).
  ^b Allwood et al. (1958).
  ^c Slosman et al. (2001).
  ^d Meltzer et al. (2000).
  ^e Scheinberg et al. (1950).
  ^f Kamper et al. (2004).
  ^g Fazekas et al. (1952).
  ^h Schieve et al. (1953).
  ^i Fazekas et al. (1960).
  ^j Fazekas et al. (1953).
  ^k Dwyer et al. (1995).
  ^l Hellon et al. (1959).
  ^m Kenney and Ho (1995).
  ^n Ho et al. (1997).
  ^o Thomson et al. (1988).
  ^p Wynne et al. (1989).
  ^q Zoli et al. (1999).
  ^r Amery et al. (1969).
  ^s Lassen et al. (1964).
  ^t Miller et al. (1951).
  ^u Slack et al. (1976).
  ^v Lindeman et al. (1966).
  ^w Zoli et al. (1989).
  ^x Davies and Shock (1950).
  ^y Bolomey et al. (1948).
  ^z Rooke et al. (1994).
  ^aa Sherlock et al. (1950).
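
Table 5 reports variability as either a standard deviation (SD) or a standard error (SE), so records may need to be placed on a common basis before they are pooled or compared. The Python sketch below shows the usual conversion SD = SE x sqrt(n). It is an illustration rather than part of the database; the function names are hypothetical, and the conversion assumes the SE is the standard error of the mean for the stated n.

```python
import math


def se_to_sd(se, n):
    """Convert a standard error of the mean to a standard deviation."""
    if n < 2:
        raise ValueError("n must be at least 2 to define a standard deviation")
    return se * math.sqrt(n)


def variability_as_sd(var, var_type, n):
    """Return the variability of a record as an SD, whatever its type."""
    var_type = var_type.strip().upper()
    if var_type == "SD":
        return var
    if var_type == "SE":
        return se_to_sd(var, n)
    raise ValueError(f"unsupported variability type: {var_type!r}")


# Body-weight entries from Table 5: 74.1 kg, 5.2 SE, n = 6 and 70 kg, 5 SD, n = 10
print(round(variability_as_sd(5.2, "SE", 6), 2))
print(variability_as_sd(5, "SD", 10))
```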
    Adipose Tissue Blood Flow   Published  reports  on blood flow to adipose tissues  are  scant.
During the literature review, only one study on the  measurement of blood flow to adipose tissue
was identified (Lesser & Deutsch, 1967). In this study, blood flow to adipose tissue was reported for
7 healthy individuals (5 males and 2 females) aged 64 to 74 yr. These individual data were included
in the database along with subject-specific body weight and adipose tissue volume.
    Blood Flow to Other Organs  The literature search did not yield  any data on  blood flow to
other organs in subjects aged 65 yr or older.
    Metabolic Clearance
    The data on metabolic clearance included in the  database relate  to the activities (measured
with a probe substrate) and levels of enzymes responsible for phase I biotransformation, as well as
the levels of cofactors  and activities of enzymes responsible for various phase II conjugation reac-
tions. It should  be noted  that several seminal studies on drug metabolism in  older adults reported
results either in terms of percent activity relative to controls or to  some specific age group (usually
young  adults).  Further,  many of these  studies  report data  in graphical  form,  which  were  not
included  in this current iteration of the database.  For many environmental chemicals, the lack of
such data is not crucial since the decline in  metabolic activity in elderly individuals might  be more a
consequence of the decrease in blood flow to liver and reduction in hepatic mass than a change in
protein concentration or enzyme activity. A compilation and analysis of age-related differences for
in vivo metabolism for a large number of individual drugs is reported by Ginsberg et al. (2005) and
made accessible at http://www2.clarku.edu/faculty/dhattis.

    Phase I Enzymes
    Schmucker et al. (1990) analyzed liver samples from individuals aged 9 to 89 yr (n = 54) for
the levels of microsomal protein, cytochrome P-450 (CYP), epoxide hydrolase, and NADPH
cytochrome c reductase. They did not find any change with age in enzyme activity (assessed with
13 substrates and 9 procarcinogens), enzyme content, or cytochrome P-450 reductase levels
expressed on a volume basis (Schmucker et al., 1990). Shimada et al. (1994) analyzed liver samples
from 60 individuals aged 12 to 73 yr (30 Japanese and 30 Caucasian patients) for the content
and activities related to CYP 1A2, 2A6, 2B6, 2C, 2D6, 2E1, and 3A. These authors and others
(Woodhouse et al., 1984; Wynne et al., 1988; Hunt et al., 1990) could not detect any apparent
age-related changes in CYP content or activity. Furthermore, Wynne et al. (1988) reported that
there was no correlation between age and apparent affinity (Km) of either high- or low-affinity
components of 7-ethoxycoumarin O-deethylase. Blanco et al. (2000) did not find any significant
age-related differences in the oxidation of substrates reflective of the activities of various isoforms of
CYP in 37 samples (age: 6 mo to 93 yr): ethoxyresorufin (CYP1A2), ethoxycoumarin (CYP2E1),
teniposide and midazolam (CYP3A4/5), paclitaxel (CYP2C8), and tolbutamide (CYP2C9).
    In contrast to the studies just mentioned, some studies  reported age-dependent changes in
phase I enzymes. Sotaniemi et al. (1997) reported a 29% reduction in antipyrine clearance and
33% reduction in liver P-450 content in subjects >70 yr of age relative to individuals 20-29 yr of
age. Parkinson et al.  (2004) reported that CYP activities varied widely  in liver microsomes from
donors of all ages and that there were only a few statistically significant differences. Based on a few
samples per age group, these authors reported that activities associated with CYP1A2, CYP2B6,
CYP2C19, CYP2D6, and CYP2E1 appeared to decrease with age, whereas CYP2A6 and CYP4A11
activities appeared to increase with age. An in vivo study by Bebia et al. (2004) reported some
trends consistent with Parkinson et al. (2004) (decreased CYP2C19 activity), but also reported
contrasting data (increased CYP2E1 activity and no change in CYP2D6). The changes reported in
these studies appear to be relatively small compared to the extent of interindividual differences and
the consequences of decreased liver weight in the elderly.
    Data on alcohol dehydrogenase class I (ADHI) activity included in this database were obtained
from Seitz et al. (1993), who reported that elderly men had lower gastric ADH activity compared to
young  men.  Elderly  women, however,  exhibited ADH  activity levels  comparable to younger
women.
    Phase II Enzymes  The limited available data suggest that there is no age-related difference in
phase II metabolism of xenobiotics in older adults (e.g., acetylation of isoniazid and glucuronidation
of temazepam) (Farah et al., 1977). Data indicate no significant difference in the  number of fast
acetylators among the young (< 35 yr) and  elderly (>65 yr), on the basis of the half-lives of isoniazid
(Farah et al., 1977). Further, Woodhouse et al.  (1984) reported that  hepatic glutathione concentra-
tions are  not affected by aging.  Limited data on extrahepatic  phase  II metabolism (e.g.,  sulfotrans-
ferase activity in kidney and lungs) included in this database do not indicate any significant change
or loss of the activity in elderly compared to younger adults (Pacifici et al., 1988).
    Renal Clearance   Kidney function in general is known to be impaired by aging (Jassal &
Oreopoulos, 1998). A decrease in glomerular filtration rate (GFR), as reflected by creatinine
clearance (CLcr), was documented in a number of studies in the elderly. More recent analyses
suggest that the decrease in GFR in the elderly, as reflected by CLcr, may be due to an increased
incidence of renal disease rather than a true age-related effect (McLean & LeCouteur, 2004).
Further, the estimates of CLcr have been questioned because the rate of production
of this endogenous substrate is highly variable (e.g., decrease in serum creatinine due to
decrease in muscle mass) and because creatinine may undergo tubular secretion. The most reliable
estimates of GFR have been made with the urinary clearance of inulin, a starch-like polymer of
fructose. GFR estimates were also obtained using ethylenediamine tetraacetic acid (EDTA) and the
results compare well with CLcr (Groth et al., 1989). The database contains 91 records relating to
measurements of GFR, based on  clearance of creatinine, inulin, and EDTA in  the elderly. The
subjects in these studies were not on medication known to affect renal function.
    Body, Blood, and Tissue Composition  The data under this category specifically refer to the
neutral lipids (triglycerides,  diglycerides, monoglycerides,  cholesterol, and other  nonpolar lipids),
phospholipids (phosphatidylcholine, phosphatidylethanolamine, phosphatidylserine, sphingomyelin,
and other lipids containing  phosphoric acid esterified at the 1-position of the glycerol molecule),
proteins (albumin, hemoglobin, gamma-globulin, alpha-1 acid glycoprotein), and  water content in
blood and tissues. These data are potentially  useful in estimating the volume of distribution and
partition coefficients essential for development of PBPK models for organic chemicals. In general,
the observations included in the database indicate that serum cholesterol, high-density lipoproteins,
and triglycerides tend to increase with age but decrease in individuals over 90 yr of age (Tietz et al.,
1992).
    In regard to serum albumin  level,  a decline with older age groups was observed  (Greenblatt,
1979). Data on albumin levels entered into the database are consistent with the pharmaceutical
literature,  which suggests that the  reduced binding  of a number of drugs in elderly (compared to
younger subjects) is  not a consequence of an age-related change in binding affinity, but rather due
to decreased serum  albumin levels leading to a reduction of the total number of binding sites.
    Body water content is generally considered as a measure of fat-free mass.  For all ages, percent
body water is less in women than  in men, and  it declines  with increasing age, coinciding with the
decline in muscle mass in both genders (Sitar, 1998). The database contains several entries for body
water content as well as extracellular and intracellular water content  in older adults. Further, data
on the composition  of adipose tissue were included  in the  database; however,  data on the compo-
sition  of other tissues were  not obtained during the literature review. Indications are that hepatic
triglyceride and cholesterol  levels  rise  with aging while phospholipid content  remains unchanged
(Schneeman & Richter, 1993).


    PHYSIOLOGICAL DATA IN  HEALTH-IMPAIRED ELDERLY

    It  is well established that health  conditions affect the disposition of pharmaceuticals and
environmental toxicants  (Timbrell, 2000). There is little doubt that fat mass is increased in  obese
individuals, or that alveolar  ventilation rate is altered in those with chronic obstructive pulmonary
disease (COPD). Overall, there are only 115 data records in the database that relate to physiological
parameters in health-impaired older adults. Some of these data  were collected in mixed groups of
middle-aged and older adults. Additional data supporting the alteration of physiological parameters
in specific disease conditions are available in the literature but relate to young  adults and  therefore
are not relevant to the present database effort.  Reasons for including certain disorders and disease
conditions are briefly summarized in Table 1.

    Obesity
    Obesity is characterized by excessive adipose tissue deposition, which is primarily the result of
increased  caloric  intake above  the  maintenance energy  requirements of  the body  (Crandall &
DiGirolamo, 1990). Obesity is characterized as a  condition of volume expansion, accompanied
by increased blood volume  and  elevated cardiac output (Alexander et al., 1962-1963). According
to the Centers for  Disease Control and Prevention, individuals  with a BMI exceeding 30  are
considered obese  (http://www.cdc.gov/nccdphp/dnpa/obesity/defining.htm);  alternatively,  those
who are at or above 120%  of their  ideal  body  weight are  considered obese (Beers,  2005).
In extreme obesity,  adipose tissue blood flow can represent up to one half of the entire cardiac
output,  emphasizing the importance  of expanding adipose tissue  mass on the hemodynamic
changes (Alexander et al., 1962-1963).  Lesser and  Deutsch (1967) concluded that adipose tissue
perfusion rose with increasing degrees of obesity. Kjellberg and Reizenstein (1970) reported that the
blood volume increases with body weight in obese  individuals (38 to 42 yr old). Data relating to all
of the above parameters were included in the database. Further, Lucas et al. (1998) and Wang et al.
(2003), using chlorzoxazone as a substrate, found that hepatic CYP2E1 activity increased in obese
individuals with type 2 diabetes; similarly, O'Shea et al. (1994) reported enhanced clearance of chlorzoxazone in
obese individuals aged 24 to 48 yr. None of these data were obtained specifically in obese elderly,
and therefore were not included in the database. There are also data that suggest increases in
glucuronidation  and  sulfation in obese individuals with no changes in other phase II  reactions
(glycine conjugation and acetylation) in comparison with lean individuals (Blouin & Warren, 1999).
Data on GFR in obese individuals are mixed. Some studies indicate no change, whereas others
indicate an increase or decrease in GFR in obese women. The discrepancy among the various
studies might be due to differences in the extent of obesity and/or associated renal pathology (Blouin
& Warren, 1999) in the samples of people studied. In all, the database contains 31 records on
physiological parameters in obese older adults.
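
The two obesity criteria cited in this section (a BMI exceeding 30, or a body weight at or above 120% of ideal body weight) can be expressed as simple checks. The Python sketch below is illustrative only; the function names are hypothetical, BMI is computed with the standard definition of weight in kilograms divided by height in meters squared, and the ideal body weight must be supplied by the user because the text does not specify how it is derived.

```python
def body_mass_index(weight_kg, height_m):
    """BMI in kg/m^2 (standard definition)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def is_obese_by_bmi(weight_kg, height_m, threshold=30.0):
    """True when BMI exceeds the threshold (30 in the cited criterion)."""
    return body_mass_index(weight_kg, height_m) > threshold


def is_obese_by_ideal_weight(weight_kg, ideal_weight_kg, ratio=1.20):
    """True when body weight is at or above 120% of ideal body weight."""
    return weight_kg >= ratio * ideal_weight_kg


# Hypothetical individual, for illustration only
print(round(body_mass_index(95.0, 1.75), 1))
print(is_obese_by_bmi(95.0, 1.75))
print(is_obese_by_ideal_weight(85.0, 70.0))
```

The two criteria need not agree for a given individual, which is one reason the health classification attached to a record should be traced back to the definition used in the original study.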

    Diabetes
    Non-insulin-dependent diabetes mellitus (NIDDM  or type II diabetes)  is characterized  by
persistent hyperglycemia but rarely leads to ketoacidosis. Type II diabetes generally manifests
after age 40  and is often a result  of genetic defects  causing both insulin  resistance and  insulin
deficiency. There is also  a strong correlation between  obesity and  onset of  type II  diabetes
(http://web.indstate.edu/theme/mwking/diabetes.html). Although reduced GFR is associated with
diabetes  (Knobler et al., 2004), little is  known  about possible alterations  in other physiological
parameters. The database  contains a total of 25 records  on GFR, cardiac output, fat mass, and
serum triglyceride  levels in elderly with type II diabetes.  Data on hepatic CYP2E1  activity and
CYP2E1  mRNA  in obese type II diabetics were also  included  (Wang et al.,  2003). It should  be
noted that in  these studies the treatment history is not always known or reported, even though the
diagnostic information is routinely provided.

    Chronic Obstructive Pulmonary Disease (COPD)
    COPD, characterized by obstruction of airflow, is projected to be  the  third  leading cause of
mortality and fifth leading cause of disability worldwide by the year 2020 (Sandstrom et al., 2003).
In patients with COPD, decreased airway diameter limits the airflow rates (Mahler et al., 1986). The
ventilation-perfusion relationships are also altered  in COPD patients. Mahler et al. (1985) showed
that the  relationship between cardiac index and pulmonary arterial pressure was increased in
COPD patients compared to normal individuals. The literature search yielded 8 articles (30 records)
for the alteration of cardiac output, alveolar ventilation rate, and physiological dead space in aged
COPD patients. The search also identified two studies on the fat mass and BMI in COPD patients,
which were entered into the database.

    Heart, Kidney, and Liver Disease
    Limited data were retrieved from the literature on values of cardiac output, cerebral blood flow,
leg blood flow, and GFR in older adults with heart disease. In total, 14 records on the elderly with
heart disease were included in the database. With regard to kidney disease, the only data found during
the literature search relate to GFR in patients with renal disease; thus, there are no data entries
for renal function in individuals with other health conditions. There are 10 records in the database
from individuals characterized as having liver disease or acute liver failure.


    DATA GAPS AND CONCLUDING REMARKS

    The data availability for the various physiological parameters in the elderly is summarized in
Figure 5. A lack of data is particularly marked in those over the age of 85. However, the overall data
gap across the age groups in healthy older adults relates to tissue composition data (neutral lipid,
phospholipids, and protein levels), which are necessary for estimating volume of distribution and
partition coefficients for PBPK models. The blood composition data found in the database are also
very limited and need to be expanded. There is some information, albeit limited, on the age-specific
variation between ethnic groups and genders. Most of the information in the database, however,
relates to values collected in Asian and Caucasian males. The physiological measurements in
health-impaired older adults are also highly limited and represent a major data gap; although there is some
information regarding the direction of change in physiological parameters in specific pathophysiological
conditions, most of these data are from middle-aged adults or those 60-65 yr of age.

[Figure 5: horizontal bar chart. x-axis: Number of Records (0 to 150). Rows: Tissue Volume and Tissue Blood Flow (row labels Brain, Liver, Kidney, SPT, Fat, Blood, Other), Ventilation, Cardiac Output, Renal Clearance, Metabolism, Tissue Comp. Legend: 50-64, 65-74, 75-84, >85 yr.]
FIGURE 5. Database summary. Note that collection of data for the 50-64 yr age range was not the focus of this work; thus the record
numbers are not directly comparable to the other age ranges. SPT, slowly perfused tissue; Comp, composition.

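
To illustrate why tissue composition data matter for PBPK parameterization, the following minimal sketch computes a tissue:blood partition coefficient from neutral lipid, phospholipid, and water fractions. It assumes a commonly used lipid-and-water composition form that is not taken from this database; protein binding is ignored, and the function name, the 30/70 treatment of phospholipids, the composition values, and the octanol:water coefficient are all hypothetical choices made for illustration.

```python
def tissue_blood_partition(kow, tissue, blood):
    """Composition-based tissue:blood partition coefficient (illustrative form).

    `tissue` and `blood` are dicts of volume fractions with keys
    "nl" (neutral lipid), "pl" (phospholipid), and "w" (water).
    Assumption: phospholipids behave as 30% lipid-like and 70% water-like.
    """
    def solubility_term(comp):
        lipid_like = comp["nl"] + 0.3 * comp["pl"]
        water_like = comp["w"] + 0.7 * comp["pl"]
        return kow * lipid_like + water_like

    return solubility_term(tissue) / solubility_term(blood)


# Hypothetical volume fractions and Kow; not values from the database.
fat = {"nl": 0.80, "pl": 0.002, "w": 0.15}
blood = {"nl": 0.003, "pl": 0.002, "w": 0.82}

p_fat_blood = tissue_blood_partition(kow=100.0, tissue=fat, blood=blood)
print(f"fat:blood partition coefficient = {p_fat_blood:.1f}")
```

Under these assumptions, an age-related shift in the lipid or water fraction of a tissue changes the predicted partition coefficient directly, which is why the missing composition data for older adults limit this kind of estimate.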
    There are several ways to  utilize the physiological parameter values in  this database for risk
assessment purposes. For instance, the data can  be analyzed statistically to characterize the impact
of age, gender, health, and ethnicity on specific parameters  (e.g., ventilation  or CYP3A content)
for informing dosimetric adjustment in risk assessment, particularly when PBPK models are not
available for a specific chemical. In addition  to  such dosimetric adjustments, the database can be
useful  for PBPK  modeling in  three ways. First, the database can be viewed  as  a collection  of
peer-reviewed  literature containing relevant data for lifestage  PBPK model development. Second,
the database can  be used to develop age-, gender-, health-, or  ethnic-specific  point estimates based
on multiple sources within the database and subsequently to use these point estimates as inputs
into deterministic PBPK models. Finally, the database can be used together with statistical analyses
to replace point estimates of PBPK model inputs with population distributions (or plausible ranges)
for probabilistic modeling (e.g., Monte Carlo simulation).
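
The second and third uses can be sketched as follows. This is a minimal illustration, not the authors' procedure: the records are hypothetical placeholders rather than database values, and the sample-size weighting, the lognormal form, and all function names are assumptions.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Record:
    """One hypothetical database record for a single parameter."""
    mean: float  # reported group mean (e.g., cardiac output, L/min)
    sd: float    # reported standard deviation
    n: int       # number of subjects in the study


def pooled_point_estimate(records):
    """Sample-size-weighted mean and pooled SD across studies (assumed rule)."""
    total_n = sum(r.n for r in records)
    mean = sum(r.mean * r.n for r in records) / total_n
    # Pooled variance combines within-study and between-study variation.
    var = sum(r.n * (r.sd ** 2 + (r.mean - mean) ** 2) for r in records) / total_n
    return mean, math.sqrt(var)


def lognormal_params(mean, sd):
    """Convert an arithmetic mean and SD to lognormal mu and sigma."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    return math.log(mean) - 0.5 * sigma2, math.sqrt(sigma2)


def sample_parameter(mean, sd, size, seed=0):
    """Draw Monte Carlo samples of a strictly positive model input."""
    rng = random.Random(seed)
    mu, sigma = lognormal_params(mean, sd)
    return [rng.lognormvariate(mu, sigma) for _ in range(size)]


# Hypothetical records for one parameter in one age group.
records = [Record(4.8, 0.9, 20), Record(5.2, 1.1, 35), Record(4.5, 0.8, 12)]
point_mean, point_sd = pooled_point_estimate(records)  # deterministic input
draws = sample_parameter(point_mean, point_sd, size=10_000)  # probabilistic input
```

The point estimate would feed a deterministic model run, whereas each draw would parameterize one simulated individual in a Monte Carlo run.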
    This database, in its current form, may not yet be sufficient for obtaining or deriving physiological
parameters for all elderly age groups and conditions of interest. However, the current database can aid
researchers and risk assessors in the development of PBPK models, and it can also serve as a starting
point for further physiological parameter data collection and analysis.
    Overall, the literature review conducted in this study resulted in the identification of 528
publications potentially useful for populating the database; of these, 155 publications contained relevant
physiological data that resulted in 1051 and 115 data records for healthy and diseased elderly,
respectively. These data provide a scientific basis for developing distributions and reference values
of several key physiological parameters for PBPK modeling in these populations.

     REFERENCES

Alexander, J. K., Dennis, E. W., Smith, W. G., Amad, K. H., Duncan, W. C., and Austin, R. C. 1962-1963. Blood volume, cardiac output,
     and distribution of systemic blood flow in extreme obesity. Cardiovasc. Res. Cent. Bull. 1:39-44.
Allwood, M. J. 1958. Blood flow in the foot and calf in the elderly; A comparison with that in young adults. Clin. Sci. (Lond.) 17:331-338.
Amery, A., Bossaert, H., Verstraete, M., and Belgium, L. 1969. Muscle blood flow in normal and hypertensive subjects: Influence of age,
     exercise, and body position. Am. Heart J. 78:211-216.
Anthonisen, N. R., Connett, J. E., and Kiley, J. P. 1994. Effects of smoking intervention and the use of an inhaled anticholinergic
     bronchodilator on the rate of decline of FEV1. J. Am. Med. Assoc. 272:1497-1505.
Bach, B., Hansen, J. M., Kampmann, J. P., Rasmussen, S. N., and Skovsted, L. 1981. Disposition of antipyrine and phenytoin correlated
     with age and liver volume in man. Clin. Pharmacokinet. 6:389-396.
Barton, H. A., Chiu, W. A., Setzer,  R. W., Andersen, M. E., Bailer, A. J., Bois, F. Y., DeWoskin, R. S., Hays, S.,  Johanson, G., Jones, N.,
     Loizou, G., MacPhail, R. C., Portier, C. J., Spendiff,  M., and Tan, Y.-M. 2007.  Characterizing uncertainty and variability in
     physiologically based pharmacokinetic models: State of the science and needs for research and implementation. Toxicol. Sci.
     99:395-402.
Bebia, Z., Buch, S. C., Wilson, J. W., Frye, R. F., Romkes, M., Cecchetti, A., Chaves-Gnecco, D., and Branch, R. A. 2004. Bioequivalence
     revisited: Influence of age and sex on CYP enzymes. Clin. Pharmacol. Ther. 76:618-627.
Beers, M. H. 2005. The Merck  manual of geriatrics, 3rd ed. Rahway, NJ: Merck & Co.
Benchimol, A., Maroko, P. R.,  Pedraza, A., Brener, L., and Buxbaum, A. 1968. Left ventricular end-diastolic pressure and cardiac output
     at rest and  during exercise in patients with angina pectoris. Cardiology 53:261-279.
Bender, A. D. 1965. The effect of increasing age on the distribution of peripheral blood flow in man. J. Am. Geriatr. Soc. 13:192-198.
Blanco, J. G., Harrison, P. L., Evans, W. E., and Relling, M. V. 2000. Human cytochrome P450 maximal activities in pediatric versus adult
     liver. Drug Metab. Dispos. 28:379-382.
Blouin, R. A., and Warren, G. W. 1999. Pharmacokinetic considerations in obesity. J. Pharm. Sci. 88:1-7.
Bolomey, A. A., Michie, A. J., Michie, C., Breed, E. S., Schreiner, G. E., and Lauson, H. D. 1948. Simultaneous measurement of effective
     renal blood flow and cardiac output in resting normal subjects and patients with essential hypertension. J. Clin. Invest. 28:10-17.
Boyd, E. 1933. Normal variability in weight of the adult human liver and spleen. Arch. Pathol. 16:350-372.
Brandfonbrener, M., Landowne,  M., and Shock, N. W. 1955. Changes in cardiac output with age. Circulation 12:557-566.
Brown, R. P., Delp, M. D., Lindstedt, S. L., Rhomberg, L. R., and Beliles, R. P. 1997. Physiological parameter values for physiologically
     based pharmacokinetic models. Toxicol. Ind. Health 13:407-484.
Brown, R. P., Lockwood,  A. H., and Sonawane, B.  R. 2005. Neurodegenerative diseases: An overview of environmental risk factors.
     Environ. Health Perspect.  113:1250-1256.
Calloway, N. O., Foley, C. F., and Lagerbloom, P. 1965. Uncertainties in geriatric data. II. Organ size. J. Am. Geriatr. Soc. 13:20-28.
Capderou, A., Aurengo, A., Derenne, J. P., Similowski, T., and Zelter, M. 2000. Pulmonary blood flow distribution in stage 1 chronic
     obstructive pulmonary disease. Am. J. Respir. Crit. Care Med. 162:2073-2078.
Chiu, W. A., Barton, H. A., DeWoskin, R. S., Schlosser, P., Thompson, C. M., Sonawane, B., Lipscomb, J. C., and Krishnan, K. 2007.
     Evaluation of physiologically based pharmacokinetic models for use in risk assessment. J. Appl. Toxicol. 27:218-237.
Chouker, A., Martignoni, A., Dugas, M., Eisenmenger, W., Schauer, R., Kaufmann, I., Schelling, G., Lohe, F., Jauch, K. W., Peter, K., and
     Thiel, M. 2004. Estimation of liver size for liver transplantation: The impact of age and gender. Liver Transplant. 10:678-685.
Clark, L. H., Setzer, R. W., and Barton, H. A. 2004. Framework for evaluation of physiologically-based pharmacokinetic models for use
     in safety or risk assessment. Risk Anal. 24:1697-1717.
Cody, R. J., Ljungman, S.,  Covit, A.  B., Kubo, S. H., Sealey, J. E., Pondolfino, K., Clark, M., James, G., and Laragh, J. H. 1988. Regulation
     of glomerular filtration rate in chronic congestive heart failure patients. Kidney Int. 34:361-367.
Collis, T., Devereux, R. B., Roman, M. J., de, S. G., Yeh, J., Howard, B. V., Fabsitz, R. R., and Welty, T.  K. 2001. Relations of stroke
     volume and cardiac output to body composition: The strong heart study. Circulation 103:820-825.
Crandall, D. L., and DiGirolamo, M. 1990. Hemodynamic and metabolic correlates in adipose tissue: pathophysiologic considerations.
     FASEB J. 4:141-147.
Crapo, R. O. 1993. The aging lung:  Pulmonary disease in the elderly patients, ed. D. A. Mahler,  pp. 1-25. New York: Dekker.
Crean, P. A., Pratt, T., Davies, G. J., Myers, M.,  Lavender, P., and Maseri, A. 1986. The fractional distribution  of the cardiac output in
     man using  microspheres labelled with technetium 99m. Br. J.  Radiol.  59:209-215.
Davies, B., and Morris, T.  1993. Physiological parameters in laboratory animals and  humans. Pharm. Res. 10:1093-1095.
Davies, D., and Shock, N. 1950. Age changes in glomerular filtration rate, effective renal plasma flow, and tubular excretory capacity in
     adult males. J. Clin. Invest. 29:496-507.
Dencker, H., Gothlin, J., Olin, T., and Tibblin, S. 1972. Portal circulation in humans studied by a dye-dilution technique. Eur. Surg. Res. 4:81-89.
Dinenno, F. A., Seals, D. R., DeSouza, C. A., and Tanaka, H. 2001. Age-related decreases in basal limb blood flow in humans: Time
     course, determinants and habitual exercise effects. J. Physiol. 531:573-579.
Duerenberg, P., Weststrate, J. A., and Siedell, J. C. 1991. Body mass index as a measure of body fatness: Age and sex specific prediction
     formulas. Br. J. Nutr. 65:105-114.
Dwyer, R., and Howe, J. 1995. Peripheral blood flow in the elderly during inhalational anaesthesia. Acta Anaesthesiol. Scand. 39:939-944.
Farah, F., Taylor, W., Rawlins, M. D., and James, O. 1977. Hepatic drug acetylation and oxidation: Effects of aging in man. Br. Med. J.
     2:155-156.
Fazekas, J., Thomas, A., Johnson, J., and Young, W. 1960. Effect of arterenol (norepinephrine) and epinephrine in cerebral hemodynamics
     and metabolism. AMA Arch. Neurol. 2:435-438.
Fazekas, J., Kleh, J., and Witkin, L. 1953. Cerebral hemodynamics and metabolism in subjects over 90 years of age. J. Am. Geriatr. Soc.
     1:836-839.
Fazekas, J., Alman, R., and Bessman, A. 1952. Cerebral physiology of the aged. Am. J. Med. Sci. 223:245-257.
Frackowiak, R. S., Lenzi, G. L., Jones, T., and Heather, J. D. 1980. Quantitative measurement of regional cerebral blood flow and oxygen
     metabolism in man using 15O and positron emission tomography: theory, procedure, and normal values. J. Comput. Assist. Tomogr.
     4:727-736.
Fulop, T., Worum, I., Csongor, J., Foris, G., and Leovey, A. 1985. Body composition in elderly people. I. Determination of body composition
     by multiisotope method and the elimination kinetics of these isotopes in healthy elderly subjects. Gerontology 17:6-14.
Geller, A. M., and Zenick, H. 2005. Aging and the environment: A research framework. Environ. Health Perspect. 113:1257-1262.
Gentry, P. R., Haber, L. T., McDonald, T. B., Zhao, Q., Covington, T., Nance, P., Clewell, J. J. III., and Lipscomb, J. C. 2004. Data for
     physiologically  based pharmacokinetic modeling in neonatal animals: Physiological parameters in mice and Sprague-Dawley rats.
     J. Child. Health 2:363-411.
Ginsberg, G., Hattis, D., Russ, A., and Sonawane, B. 2005. Pharmacokinetic and pharmacodynamic factors that can affect sensitivity to
     neurotoxic sequelae in the elderly. Environ. Health Perspect. 113:1243-1249.
Goldman, H. I., and Becklake, M. R. 1959. Respiratory function tests:  Normal values of median altitudes and the prediction of normal
     results. Am. Rev. Tuberc.  79:457-467.
Gordan, G. S. 1956. II. Hormones and metabolism. Influence of steroids on cerebral metabolism  in man.  Recent Prog.  Horm. Res.
     12:153-174.
Greenblatt, D. J. 1979. Reduced serum albumin concentration in the elderly: A report from the Boston Collaborative Drug Surveillance
     Program. J. Am. Geriatr. Soc. 27:20-22.
Groth, S., Assted, M., and Vestergaard, B. 1989. Screening of kidney function by plasma creatinine and single-sample 51Cr-EDTA
     clearance determination—A comparison. Scand. J. Clin. Lab. Invest. 49:707-710.
Hellon, R. F., and Clarke, R. S. J. 1959. Changes in forearm blood flow with age. Clin. Sci. 18:1-7.
Ho, C. W., Beard, J. L., Farrell, P. A., Minson, C. T., and Kenney, W. L. 1997. Age, fitness, and regional blood flow during exercise in the
     heat. J. Appl. Physiol. 82:1126-1135.
Hollenberg, N. K., Adams, D. F., Solomon, H. S., Rashid, A., Abrams, H. L., and Merrill, J. P. 1974. Senescence and the renal vasculature
     in normal man. Circ. Res. 34:309-316.
Hunt, C. M., Strater, S., and Stave, G. M. 1990. Effect of normal aging on the activity of human hepatic cytochrome P450IIE1. Biochem.
     Pharmacol.  40:1666-1669.
ICRP. 1975. Report of the Task Group on Reference Man. Publication 23. Oxford: Pergamon Press.
Inoue, T., and Otsu, S. 1987. Statistical analysis of the organ weights in  1,000 autopsy cases of Japanese aged over 60 years. Acta Pathol.
     Jpn. 37:343-359.
Jassal, S. V., and Oreopoulos, D. 1998. The aging kidney. Geriatr. Nephrol. Urol. 8:141-147.
Jermendy, G., Istvanffy, M., Kammerer, L., Koltai, M., and Pogatsa, G. 1986. Circulating blood volumes in diabetic patients. Exp. Clin.
     Endocrinol. 88:123-125.
Julius, S., Amery, A., Whitlock, L., and Conway, J. 1967. Influence of age on the hemodynamic response to exercise. Circulation 36:222-230.
Kamper, A. M., Spilt, A., de Craen, A. J., van Buchem, M. A., Westendorp, R. G., and Blauw, G. J. 2004. Basal cerebral blood flow is
     dependent on the nitric oxide pathway in elderly but not in young healthy men. Exp. Gerontol. 39:1245-1248.
Katori, R. 1979. Normal cardiac output in relation to age and body size. Tohoku J. Exp. Med. 128:377-387.
Kenney, W. L., and Zappe, D. H. 1994. Effect of age on renal blood flow during exercise. Aging (Milano) 6:293-302.
Kenney, W. L., and Ho, C. W. 1995. Age alters regional distribution of blood flow during moderate-intensity exercise. J. Appl. Physiol.
     79:1112-1119.
Kjellberg, J., and Reizenstein, P. 1970. Body composition in obesity. Acta Med. Scand. 188:161-169.
Knobler, H., Zornitzki, T., Vered, S., Oettinger, M., Levy, R., Caspi, A., Faraggi, D., and Livschitz, S. 2004. Reduced glomerular filtration
     rate in asymptomatic diabetic patients: Predictor of increased risk for cardiac events independent of albuminuria. J. Am. Coll.
     Cardiol. 44:2142-2148.
Knudson, R. J. 1991. Physiology of the aging lung. In The lung, Scientific foundations, ed. R. G. Crystal, pp. 1749-1759. New York: Raven
     Press.
Lakatta, E. G. 1990. Changes in cardiovascular function with aging. Eur. Heart J. 11(Suppl. C):22-29.
Lassen, N., Lindbjerg, J., and Munck, O. 1964. Measurements of blood-flow through skeletal muscle by intramuscular injection of
     xenon-133. Lancet 15:686-689.
Lauson, H. D., Bradley, S. E., Cournand, A., and Andrews, V. V. 1944. The renal circulation in shock. J. Clin. Invest. 23:381-402.
Lesser, G. T., and Deutsch, S. 1967. Measurement of adipose tissue blood flow and perfusion in man by uptake of 85Kr. J. Appl. Physiol.
     23:621-630.
Lindeman, R. D., Lee, T. D., Yiengst, M. J., and Shock, N. W. 1966. Influence of age, renal disease, hypertension, diuretics, and calcium
     on the antidiuretic responses to suboptimal infusions of vasopressin. J. Lab. Clin. Med. 68:206-223.
Lucas, D., Farez, C.,  Bardou, L. G., Vaisse, J., Attali, J. R., and Valensi, P. 1998. Cytochrome  P450 2E1 activity in diabetic and obese
     patients as assessed by chlorzoxazone hydroxylation. Fundam. Clin. Pharmacol. 12:553-558.
Luisada, A. A., Bhat, B. A.,  and  Bioeng, V. K. 1980. Changes of cardiac  output caused by aging: An impedance cardiographic study.
     Angiology 31:75-81.
Mahler, D. A., Matthay, R. A., Snyder, P. E., Neff, R. K., and Loke, J. 1985. Determination of cardiac output at rest and during exercise
     by carbon dioxide rebreathing method in obstructive airway disease. Am. Rev. Respir. Dis. 131:73-78.
Mahler, D. A.,  Barlow, P. B., and Matthay, R. A. 1986. Chronic obstructive pulmonary disease. Clin. Geriatr. Med. 2:285-312.
Marchesini, G., Bua, V., Brunori, A., Bianchi, G., Pisi, P., Fabbri, A., Zoli, M., and Pisi, E. 1988. Galactose elimination capacity and liver
     volume in aging man. Hepatology 8:1079-1083.
McDonald, R. K., Solomon, D. H., and Shock, N. W. 1951. Aging as a factor in the renal hemodynamic changes induced by a standardized
     pyrogen. J. Clin. Invest. 30:457-462.
McLean, A. J., and Le Couteur, D. G. 2004. Aging biology and geriatric clinical pharmacology. Pharmacol. Rev. 56:163-184.
Meltzer, C. C., Cantwell, M. N., Greer, P. J., Ben-Eliezer, D., Smith, G., Frank, G., Kaye, W. H., Houck, P. R., and Price, J. C. 2000. Does
     cerebral blood flow decline in healthy aging? A PET study with partial-volume correction. J. Nucl. Med. 41:1842-1848.
Miller, J. H., McDonald, R. K., and Shock, N. W. 1951. The renal extraction of p-aminohippurate in the aged individual. J. Gerontol.
     3:213-216.
Miller, R. M., and Tenney, S. M. 1956. Dead space ventilation in old age. J. Appl. Physiol. 9:321-327.
NHANES III. 1997. National Health and Nutrition Examination Survey, http://www.cdc.gov/nchs/about/major/nhanes/nh3data.htm
Nishimura, Y., Tsutsumi, M., Nakata, H., Tsunenari, T., Maeda, H., and Yokoyama, M. 1995. Relationship between respiratory muscle
     strength and lean body mass in men with COPD. Chest 107:1232-1236.
Norris, A. H., Mittman, C., and Shock, N. W. 1964. Changes in ventilation with age. In Aging of the lung perspectives, eds. L. Gander and
     J. H. Moyer, pp. 311-317. New York: Grune and Stratton.
Okazaki, K., Hirano, M., Suga, T., and Ohara, Y. 1996. [Cardiac output in patients in their 90s]. Masui 45:352-355.
Olivetti, G., Giordano, G., Corradi, D., Melissari, M., Lagrasta, C., Gambert, S. R., and Anversa, P. 1995. Gender differences and aging:
     effects on the human heart. J. Am. Coll. Cardiol. 26:1068-1079.
O'Shea, D. O., Davis, S. N., Kim, R. B., and Wilkinson, G. R. 1994. Effect of fasting and obesity in humans on the 6-hydroxylation of
     chlorzoxazone: A putative probe of CYP2E1 activity. Clin. Pharmacol. Ther. 56:359-367.
Pacifici, G. M., Franchi, M., Colizzi, C., Giuliani, L., and Rane, A. 1988. Sulfotransferase in humans: Development and tissue distribution.
     Pharmacology 36:411-419.
Parkinson, A., Mudra, D. R., Johnson, C., Dwyer, A., and Carroll, K. M. 2004. The effects of gender, age, ethnicity, and liver cirrhosis on
     cytochrome P450 enzyme activity in human liver microsomes and inducibility in cultured human hepatocytes. Toxicol. Appl.
     Pharmacol. 199:193-209.
Pierce, J. A., and Ebert, R. V. 1958. The elastic properties of the lungs in the aged. J. Lab. Clin. Med. 51:63-71.
Price, K., Haddad, S., and Krishnan, K. 2003a. Physiological modeling of age-specific changes in the pharmacokinetics of organic chemicals
     in children. J. Toxicol. Environ. Health A 66:417-433.
Price, P. S., Conolly, R. B., Chaisson, C. F., Gross, E. A., Young, J. S., Mathis, E. T., and Tedder, D. R. 2003b. Modeling interindividual
     variation in physiological factors used in PBPK models of humans. Crit. Rev. Toxicol. 33:469-503.
Puggaard, L., Bjornsbo, K. S.,  Kock, K., Luders, K., Thobo-Carlsen, B.,  and  Lammert, O. 2002. Age-related decrease in energy expenditure
     at rest parallels reductions in mass of internal organs. Am. J. Hum. Biol. 14:486-493.
Quetelet, L. A. 1869. Physique sociale, Vol. 2, p. 92. Brussels: C. Muquardt.
Rooke, G. A., Savage, M. V., and Brengelmann, G. L. 1994. Maximal skin blood flow is decreased in elderly men. J. Appl. Physiol. 77:11-14.
Sandstrom, T., Frew, A. J., Svartengren, M., and Viegi, G. 2003. The need for a focus on air pollution research in the elderly. Eur. Respir. J.
     21:92S-95S.
Sato, T., Miwa, T., and Tauchi, H. 1970. Age changes in the human liver of the different races. Gerontologia 16:368-380.
Scheinberg, P., Blackburn, I., Rich, M., and Saslaw, M. 1950. Effects of aging on cerebral circulation and metabolism. Am. J. Med. 8:77-85.
Schieve, J. F., and Wilson, W. P. 1953. The influence of age, anesthesia and cerebral arteriosclerosis on cerebral vascular activity to CO2.
     Am. J. Med. 15:171-174.
Schmitz, A., Nyengaard, J.  R., and Bendtsen, T. F.  1990. Glomerular volume in type 2 (noninsulin-dependent) diabetes estimated by a
     direct and unbiased stereologic method. Lab.  Invest. 62:108-113.
Schmucker, D. L., Woodhouse, K. W., Wang, R. K., Wynne, H., James, O. F., McManus, M., and Kremers, P.  1990. Effects of age and
     gender on in vitro properties of human liver microsomal monooxygenases. Clin. Pharmacol. Ther. 48:365-374.
Schneeman, B. O., and Richter, D. 1993. Changes in plasma and hepatic lipids, small intestinal histology and pancreatic enzyme activity
     due to aging and dietary fiber in rats. J. Nutr. 123:1328-1337.
Seibold, H., Wieshammer, S., and Kress, P. 1988. Relation of pressure and flow of pulmonary circulation in patients with chronic
     obstructive pulmonary disease. Clin. Physiol. Biochem. 6:29-35.
Seitz, H. K., Egerer,  G., Simanowski, U. A.,  Waldherr, R., Eckey, R., Agarwal, D. P., Goedde, H. W., and von Wartburg, J. P. 1993.
     Human gastric  alcohol dehydrogenase activity: Effect of age, sex, and alcoholism. Gut 34:1433-1437.
Shenkin, H. A., Novak, P., Goluboff, B., Soffe, A. M., Bortin, L., Golden, D., and Batson, P. 1953. The effects of aging, arteriosclerosis,
     and hypertension upon the cerebral circulation. J. Clin. Invest. 32:465.
Sherlock, S., Bearn, A. G., Billing, B. H., and Paterson, J. C. 1950. Splanchnic blood flow in man by the bromsulfalein method: The relation
     of peripheral plasma bromsulfalein level to the calculated flow. J. Lab. Clin. Med. 35:923-932.
Shimada, T., Yamazaki, H., Mimura, M., Inui, Y., and Guengerich, F. P. 1994. Interindividual variations in human liver cytochrome
     P-450 enzymes involved in the oxidation of drugs, carcinogens and toxic chemicals: Studies with liver microsomes of 30 Japanese
     and 30 Caucasians. J. Pharmacol. Exp. Ther. 270:414-423.
Shock, N. W., and Yiengst, M. J. 1955. Age changes in basal respiratory measurements and metabolism in males. J. Gerontol. 10:31-40.
Sitar, D. S. 1998. Geriatric clinical pharmacology. In Principles of medical pharmacology, eds. H. Kalant and W. H. E. Roschlau, 6th ed.,
     pp. 820-829. New York: Oxford University Press.
Slack, T. K., and Wilson, D. M. 1976. Normal renal function. Mayo Clin. Proc. 51:296-300.
Slosman, D. O., Chicherio, C., Ludwig, C., Genton, L., de, R. S., Hans, D., Pichard, C., Mayer, E., Annoni, J. M., and de, R. A. 2001.
     (133)Xe SPECT cerebral blood flow study in a healthy population: Determination of T-scores. J. Nucl. Med. 42:864-870.
Sotaniemi, E. A., Arranto, A. J., Pelkonen, O., and Pasanen, M. 1997. Age and cytochrome P450-linked drug metabolism in humans:
     An analysis of 226 subjects with equal histopathologic conditions. Clin. Pharmacol. Ther. 61:331-339.
Swift, C. G., Homeida, M., Halliwell, M., and Roberts, C. J. 1978. Antipyrine disposition and liver size in the elderly. Eur. J. Clin. Pharmacol.
     14:149-152.
Tauchi, H., Tsuboi, K., and Okutomi, J. 1971. Age changes in the human kidney of the different races. Gerontologia 17:87-97.
Thomson, A., Fletcher, P. J., Harris, P. J., Freedman, B., and Kelly, D. T. 1988. Regional distribution of cardiac output at rest and during
     exercise in patients with exertional angina pectoris before and after nifedipine therapy. J. Am. Coll. Cardiol. 11:837-842.
Thompson, C. M., Sonawane, B., Barton, H. A., DeWoskin, R. S., Lipscomb, J. C., Schlosser, P., Chiu, W. A., and Krishnan, K. 2008.
     Approaches for applications of physiologically based pharmacokinetic models in risk assessment. J. Toxicol. Environ. Health B
     11:1-29.
Tietz, N.  W., Shuey, D.  F., and Wekstein, D.  R. 1992. Laboratory values in fit aging individuals—Sexagenarians through centenarians.
     Clin. Chem. 38:1167-1185.
Timbrell, J. 2000. Principles of biochemical toxicology, 3rd ed. Philadelphia: Taylor & Francis.
Wang, Z., Hall, S. D., Maya, J. F., Li, L., Asghar, A., and Gorski, J. C. 2003. Diabetes mellitus increases the in vivo activity of cytochrome
     P450 2E1 in humans. Br. J. Clin. Pharmacol. 55:77-85.
Williams, L., and Leggett, R. 1989. Reference values for resting blood flow to organs of man. Clin. Phys. Physiol. Meas. 10:187-217.
Woodhouse, K. W., Mutch, E., Williams,  F. M., Rawlins, M. D., and James, O.  F. 1984. The effect of age on pathways of drug metabolism
     in human liver. Age Ageing 13:328-334.
Wynne, H. A., Mutch, E., James, O. F., Wright, P., Rawlins, M. D., and Woodhouse, K. W. 1988. The effect of age upon the affinity of
     microsomal mono-oxygenase enzymes for substrate in human liver. Age Ageing 17:401-405.
Wynne, H. A.,  Cope, L. H., Mutch, E., Rawlins, M. D., Woodhouse, K. W., and James, O. F. 1989. The effect of age upon liver volume
     and apparent liver blood flow in healthy man. Hepatology 9:297-301.
Zoli, M., Iervese, T., Abbati, S., Bianchi, G. P., Marchesini, G., and Pisi, E. 1989. Portal blood velocity and flow in aging man. Gerontology
     35:61-65.
Zoli, M., Magalotti, D., Bianchi, G., Gueli, C., Orlandini, C., Grimaldi, M., and Marchesini, G. 1999. Total and functional  hepatic blood
     flow decrease in parallel with ageing. Age Ageing 28:29-33.

-------
Regulatory Toxicology and Pharmacology 50 (2008) 400-411
                                www.elsevier.com/locate/yrtph
         Development of good modelling practice for physiologically based pharmacokinetic
                  models for use in risk assessment: The first steps

   George Loizou a,*, Martin Spendiff a, Hugh A. Barton b, Jos Bessems c, Frederic Y. Bois d,
   Michel Bouvier d'Yvoire e, Harrie Buist f, Harvey J. Clewell III g, Bette Meek h,
   Ursula Gundert-Remy i, Gerhard Goerlitz j, Walter Schmitt j

a Health and Safety Laboratory, Harpur Hill, Buxton, SK17 9JN, UK
b National Centre for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency,
  Research Triangle Park, NC 27711, USA
c Centre for Substances and Integrated Risk Assessment (SIR), National Institute for Public Health and the Environment (RIVM),
  P.O. Box 1, Bilthoven, 3720 BA, The Netherlands
d INERIS, Parc Technologique Alata BP2, 5 rue Taffanel, Verneuil-en-Halatte 60550, France
e ECVAM, Institute for Health and Consumer Protection, EC-Joint Research Centre, via E. Fermi 1, I-21020 Ispra (VA), Italy
f TNO, P.O. Box 360, Zeist, 3700 AJ, The Netherlands
g The Hamner Institutes for Health Sciences, P.O. Box 12137, 6 Davis Drive, Research Triangle Park, NC, USA
h McLaughlin Institute, University of Ottawa, 1 Stewart Street, Ottawa, Ont., Canada K1N 6N5
i BfR, Federal Institute for Risk Assessment, Thielallee 88-92, Berlin D-14195, Germany
j Bayer CropScience AG, Building 6690, Monheim 40765, Germany

                                               Received 10 January 2008
                                            Available online 1 February 2008
Abstract

   The increasing use of tissue dosimetry estimated using pharmacokinetic models in chemical risk assessments in various jurisdictions
necessitates the development of internationally recognized good modelling practice (GMP). These practices would facilitate sharing of
models and model evaluations and consistent applications in risk assessments. Clear descriptions of good practices for (1) model
development, i.e., research and analysis activities, (2) model characterization, i.e., methods to describe how consistent the model is with biology
and the strengths and limitations of available models and data, such as sensitivity analyses, (3) model documentation, and (4) model
evaluation, i.e., independent review, will assist risk assessors in their decisions of whether and how to use the models, and will also help model
developers to understand expectations for various purposes, e.g., research versus application in risk assessment. Next steps in the
development of guidance for GMP and research to improve the scientific basis of the models are described based on a review of the current
status of the application of physiologically based pharmacokinetic (PBPK) models in risk assessments in Europe, Canada, and the
United States at the International Workshop on the Development of GMP for PBPK Models in Greece on April 27-29, 2007.
Crown copyright © 2008  Published by Elsevier Inc. All rights reserved.

Keywords: Good modelling practice; PBPK; Risk assessment

* Corresponding author.
  E-mail address: George.loizou@hsl.gov.uk (G. Loizou).
0273-2300/$ - see front matter Crown copyright © 2008 Published by Elsevier Inc. All rights reserved.
doi:10.1016/j.yrtph.2008.01.011

1. Introduction

   The increasing use of tissue dosimetry estimated
using pharmacokinetic models in chemical risk assessments
in a number of countries necessitates the development of
internationally recognized good modelling practices.
These practices would facilitate sharing of models and
model evaluations and consistent applications in risk
assessments. Clear descriptions of good
practices for:
1. Model development, i.e., research and analysis activities,
2. Model characterization, i.e., methods to describe how
   consistent the model is in capturing the relevant biological
   events with respect to mode of action and the
   strengths and limitations of available models and data,
   e.g., sensitivity analyses,
3. Model documentation, and
4. Model evaluation, i.e., independent review,

will assist risk assessors in their decisions of whether and how
to use the models, and assist model developers to meet various
expectations (e.g., research versus application in risk
assessment) (Cobelli et al., 1984; Portier and Lyles,
1996; Rescigno and Beck, 1987).

   For risk assessors, good modelling practice would pro-
vide guidance as a basis to evaluate the  potential for a
pharmacokinetic model, particularly a  physiologically
based pharmacokinetic (PBPK) model, to  contribute to a
risk assessment. PBPK models represent part of a continuum
of increasingly data-informed approaches to dose-response
characterization that incorporate progressively
more information and, as such, contribute to better understanding
and precision in estimating risks. These
approaches range  from  default  ("presumed  protective")
to  more  "biologically-based  predictive"  (Meek et  al.,
2001). Default approaches are based on empirical  observa-
tions  from broad  databases of information that  are not
group, species or chemical specific; pharmacokinetics and
dynamics are not explicitly addressed. "Categorical" and
"species-specific" approaches incorporate category- or
group-specific information, and chemical-specific data are
increasingly incorporated further along the continuum. This
includes development of chemical-specific adjustment
factors (CSAFs) incorporating compound-related or
chemical-specific pharmacokinetic (including PBPK
models) or pharmacodynamic data (Gundert-Remy and
Sonich-Mullin, 2002; IPCS, 2005). When appropriate, fully
data-derived, chemical-specific, biologically-based
dose-response risk assessment methods can be employed
for chemicals of high concern or with high economic
impacts, thus entailing fuller quantitative characterization
of toxicokinetic and toxicodynamic aspects.
   Increasingly, data-derived approaches to dose-response
assessment are based  on  weight of evidence descriptions of
known or hypothesized modes of action, the latter being a
description of the key events leading to toxicity rather than
a full mechanistic understanding. A framework for orga-
nizing and evaluating the weight of evidence supporting
modes of action in animals and their relevance to humans
has been developed which is applicable to all toxicity endpoints.
This framework has evolved from consideration of
the weight of evidence of an animal mode of action
(Sonich-Mullin et al., 2001) to extension to human relevance
(Boobis et al., 2006; Meek et al., 2003b) and from
cancer (Boobis et al., 2006; Meek et al., 2003b) to non-cancer
endpoints (Boobis et al., 2008; Seed et al., 2005). This
framework, which is now widely used in assessments
nationally and internationally, continues to evolve and is
currently being extended to integrate dose-response analysis.
As such, it provides a transparent basis for defining the sufficiency
of data on mode of action that is needed to inform
the use of physiologically based pharmacokinetic models in
risk assessment.
   For modellers, GMP is important to delineate the nat-
ure of model  characterization and documentation that is
optimal for application in risk assessment. The initial cre-
ation of models, along with needed laboratory experimen-
tation, can be a creative and unpredictable  process that
will be minimally altered  by GMP. However, even at this
very early stage,  awareness of  GMP can be  valuable,
including  recommendations regarding transparency  for
publication of the models in the peer reviewed  literature
(Andersen et al., 1995). For example, modellers  often try
several alternative structures as they attempt to  reconcile
the available  data and the description of the biology in
the model. While documentation to the same degree as a
model  proposed for  use in risk assessment is unnecessary,
understanding of the alternatives considered  is important
in supporting  the model structure eventually selected (Bar-
ton et  al., 2007).
   The International Workshop on the Development of
GMP for PBPK models¹ was convened with two principal
themes:

1. The selection and evaluation of an appropriate deterministic²
  model structure.
2. Increasing  the  understanding  of regulators  and risk
  assessors through increased transparency and accessibil-
  ity to user-friendly modelling techniques.

   This was the first  forum dedicated to promotion of best
practice in deterministic  PBPK  model development and
parameterisation, including consideration of transparency
in documentation with clear audit trails for model compo-
nents. Increase in consistency and transparency of support-
ing documentation is expected to facilitate dialogue and
understanding between PBPK practitioners, risk assessors
and  regulators.  By  bringing together PBPK modellers,
mathematicians,  statisticians,  risk  assessors, regulators
and laboratory scientists, the sponsors of this workshop
seek increased implementation of PBPK modelling in risk
assessment internationally, which GMP for PBPK should
facilitate. This paper presents the results and conclusions of
the GMP workshop.

 ¹ April 26-April 28, 2007, at the Mediterranean Agronomic Institute of
Chania, Crete, Greece. Presentations and discussion papers are at
http://www.hsl.gov.uk/news/news_pbpk.htm. Additional information is
available at www.pbpk.org.
 ² A "deterministic" model is the mathematical representation of the
biological/chemical system (e.g., PBPK model and metabolic scheme) as
opposed to a "non-deterministic" model, which is the mathematical/
statistical representation of the uncertainty, variability, and covariance of
the data and parameters of the deterministic model (e.g., statistical model
for measurement errors and population variability). Non-deterministic
modelling was a focus of the International Workshop on Uncertainty and
Variability in PBPK Models, 2006, North Carolina, US
http://www.epa.gov/ncct/uvpkm/ (Barton et al., 2007).

2. Current practice—where do we stand?

   The structure of PBPK models may differ to reflect the
requirements of the application, e.g., research (hypothesis
testing) or risk assessment. Appropriate practices for
these different uses and for the various stages of model
development are desirable. Past efforts to develop GMP for
other types of models applied in environmental regulation
are informative in terms of their form, content, and
application.

3. Value of PBPK  models in risk assessments

   The need for increasing incorporation  of kinetic data
in the current risk  assessment paradigm is  due to an
increasing demand from risk  assessors and regulators
for higher precision of risk  estimates,  a  greater under-
standing of uncertainty  and  variability   (Allen et  al.,
1996;  Barton et al.,  1996; Clewell et al.,  1999, 2002b;
Cox, 1996; Delic et  al., 2000), more  informed means of
extrapolating across  species, routes, doses and time (Cle-
well and Andersen, 1987), the need for a more meaning-
ful  interpretation   of   biological  monitoring  data
(Georgopoulos et  al., 1994; Hays et al., 2007) and reduc-
tion in the reliance on animal testing (Barratt et al.,  1995;
Blaauboer  et al.,  1996,  1999; DeJongh  et  al.,  1999).
Incorporating PBPK modelling into the risk assessment
process can advance all of these objectives. Further, the
increasing trend toward cost-benefit analysis should also
increase the utility of biologically based approaches in
support of risk management decisions by regulatory
agencies (US EPA, 2006).
   In addition, testing and risk assessment are increasingly
being driven by considerations of mode of action, resulting
in more data-informed approaches to characterization
of dose response, which should facilitate the incorporation
of PBPK modelling. These approaches are
increasingly being adopted by risk assessment and regulatory
communities, based on, for example, international initiatives
such as the IPCS harmonization initiative for the
risk assessment of chemicals (Sonich-Mullin et al., 2001).
The latter initiative  seeks to  improve  methods and  to
increase understanding and acceptance through the pursuit
of common principles and approaches by drawing on glo-
bal expertise, leading ultimately to  greater consistency
and convergence which will permit the sharing of assess-
ments  and avoid duplication. Potential areas of conver-
gence  for which  analytical  frameworks,   guidance  and
associated training materials have been developed through
this initiative include  weight of evidence for mode  of
action, CSAFs and more recently PBPK modelling (Boobis
et al.,  2006, 2008; IPCS, 2005; Meek et al., 2001,  2002,
2003a,b;  Meek and Renwick, 2006; Sonich-Mullin et  al.,
2001).
4. Current status of implementation of PBPK models in risk
assessments

  One of the first PBPK models to be adopted in regula-
tory risk  assessment was  that  for methylene chloride,
whose  evolution involved an iterative hypothesis  testing
process for the pharmacokinetics and glutathione transfer-
ase-mediated mode of action leading to cancers in rodents.
The mathematical model gave a quantitative form to  the
researcher's conception of the biological system, permitting
the  development of a testable, quantitative hypothesis,  the
design  of informative experiments and the ability to recog-
nize inconsistencies between theory (model) and data. The
explicit description  of model parameters also led  to  the
ability to study and quantify uncertainty. The model has
been widely applied in risk assessments: by the US Consumer
Product Safety Commission (Babich, 1998), by
the US Occupational Safety and Health Administration
for establishing the permissible exposure level, including
use of Bayesian statistical parameter estimation and characterization
of uncertainty and variability (OSHA, 1997),
by the US Environmental Protection Agency in the Integrated
Risk Information System (IRIS) assessment for
inhalation cancer risk (Dewoskin, 2007; US EPA, 1987)
and by Health Canada in its assessment for the general population
under the Canadian Environmental Protection Act
(Government of Canada, 1993).
  An overview of the use of PBPK modelling by various risk
assessment/regulatory authorities is presented in Table 1.
   The results of this limited analysis presented in Table
1 indicate that PBPK models are increasingly being
adopted in risk assessment by regulatory agencies in
Europe and North America, to date most often as a
quantitative basis for considering interspecies differences
in place of the default approach. The
extent of documentation of the rationale for accepting
or rejecting the use of particular models varies considerably
and is likely to depend upon access to relevant
expertise. In most cases, lack of adoption of
particular models within risk assessment has been a
function of insufficient weight of evidence for the underlying
hypothesized mode of action and/or the lack of a
standardized procedure for the evaluation of PBPK
models and their output.

Table 1
The use of PBPK modelling by various risk assessment/regulatory authorities

Assessments                           Use of PBPK models                      Impact/Rationale

Random selection of 80/141 EU         Mentioned in 8/80
Existing Substances Reports           Adopted in 4/8 (vinyl acetate,          Reduction of uncertainty factor for
(1996-2007) (European Chemicals       2-butoxyethanol, propylene methyl       interspecies differences or reduction of
Bureau)                               glycol, styrene)                        classification category
                                      Not used in 4/8:
                                        benzene                               Mode of action judged to vary between
                                                                              humans and animals
                                        acrylic acid                          In vitro activity of enzymes in one tissue
                                                                              as surrogates of in vivo activity in
                                                                              another tissue judged to be implausible
                                        cyclohexane, methyl methacrylate      No reason provided

UK Health and Safety Executive        Formaldehyde                            Lack of biological plausibility of
                                                                              association with leukaemia (Franks, 2005)
                                      2-butoxyethanol                         Consideration of validity of a biomarker
                                                                              and robustness of past regulatory
                                                                              decisions (Delic et al., 2000; Franks et
                                                                              al., 2006)

French Agency for Environmental and   Consideration in setting reference      (INERIS, 2007)
Occupational Health                   values for reproductive health

Health Canada Priority Substances     Considered inadequate for quantifying
under the Canadian Environmental      interspecies differences for
Protection Act (n = 44 on the first   tetrachloroethylene, styrene and
Priority Substances List (PSL 1) and  diethylhexylphthalate (PSL 1)
25 on PSL 2) (1989-1994) (Health      Adopted for:
Canada Priority Substances              Cadmium (PSL 1)                       Quantification of human variability
Assessment Program)                     Formaldehyde                          Quantification of interspecies differences
                                                                              in biologically motivated case specific
                                                                              model
                                        Chloroform                            Quantification of interspecies differences
                                        2-butoxyethanol                       Quantification of interspecies differences

US FDA                                Trans retinoic acid                     Consideration of potential risk of dermal
                                                                              application (Clewell et al., 1997; Rowland
                                                                              et al., 2004)

US EPA (IRIS)                         Not applied for:
                                        Acetone                               Lack of necessary exposure route in model.
                                        Chloroform                            Lack of model parameterization in species
                                                                              with critical effect.
                                        Methyl ethyl ketone                   Lack of sufficient supporting data for
                                                                              model and demonstration of predictive
                                                                              capability.
                                      Adopted for:
                                        Dichloromethane                       Quantification of interspecies differences.
                                        Ethylene glycol monobutyl ether       Quantification of interspecies differences.
                                        Vinyl chloride                        Quantification of interspecies differences
                                                                              in PK and demonstration of interspecies
                                                                              similarities in cancer PD. Route-to-route
                                                                              extrapolations to derive point of departure.
                                        Xylene                                Comparison to default RfC (a)

 (a) The concentration of a chemical in air that is very unlikely to have adverse effects if inhaled continuously over a lifetime
(http://cfpub.epa.gov/ncea/cfm/recordisplay.cfm?deid=55365).

5. What can we learn from other similar modelling experiences?

   While the use of quantitative modelling in human
health risk assessment has been more limited, particularly
biologically-based dose-response analyses, modelling for
environmental fate and transport has gained increased
acceptance since the 1990s and is now widely accepted
in European, Canadian, and US regulatory contexts.
Today in Europe, modelling endpoints for groundwater
are decisive in the registration of pesticides. In North
America and Europe, risk assessments for specific contaminated
sites or permitting of industrial facilities also
rely heavily on often complex models for exposure pathways,
including food chains (US EPA, 1989). More
recently, in view of the introduction of demanding mandates
to consider much larger numbers of existing chemical
substances (e.g., categorization and screening of the
Domestic Substances List (DSL) (Health Canada
Domestic Substances List) in Canada and the Registration,
Evaluation and Authorization of Chemical Substances
(REACH) (European Community Regulation
REACH) in Europe), there is increasing development of
quantitative structure activity relationship (QSAR) models,
particularly for application in human health risk
assessment, and associated GMP. These experiences provide
perspectives that are potentially useful for the development
of GMP for PBPK modelling.
6. Environmental modelling: achieving acceptance in the regulatory world

   The development of good practice in environmental fate
modelling may provide a relevant perspective for the development
of GMP for PBPK modelling. GMP for environmental
fate modelling evolved in Europe as a result of
two issues. Firstly, EU legislation in the late 1980s set a maximum
pesticide residue concentration of 0.1 µg L⁻¹ in both
drinking and ground water. Secondly, lysimeter³ studies
took three to four years, leading to long
delays in decisions on the use of critical products in agriculture
while avoiding contamination of groundwater
resources. Environmental fate modelling was recognized
as a promising approach to address these issues, but questions
were raised concerning whether model predictions
were sufficiently reliable and how to ensure the integrity
of model calculations.
  Clear  divisions  in attitudes among environmental fate
modellers, regulators, and registrants emerged following ini-
tial discussions. Researchers used the models for the investi-
gation  of processes and systems,  requiring flexibility and
adaptability while maintaining full control  of processes
and  algorithms in  the models. Regulators and registrants
wanted to predict exceedance of or adherence to a regulatory
limit. They required scientific and legal certainty and preferred
models for which the code could not be altered and
which had complete documentation with clear audit trails for
calculations. Further conflicts arose because version control
and documentation of research models were rudimentary at
best, no guidance on the selection of appropriate input
parameters was available, and it was rarely properly established
whether a model design was suitable for regulatory
purposes. These issues reflected the differing objectives
of models developed for research versus regulatory application,
with the former being intended for use by a specialist
with specific and intensive training, which at the time was
almost totally lacking in regulatory agencies and companies
assessing the environmental behaviour of plant protection
products. As a consequence, results from different modellers
using the same models in similar applications varied.
   Initially, software packages comprising models with a
user-friendly graphical interface and pre-configured scenarios
were developed. However, non-expert users still produced
poor results for two main reasons: (i) model
processes, algorithms and standard parameters did not
appropriately reflect substance properties, and (ii) substance
data from standard environmental fate studies were
conceptually different from those required for model implementation.
This led to proposals from regulatory agencies
to apply good laboratory practice (GLP) to modelling to
ensure that all data could be 'verified'. Also, GLP had just
been successfully transferred from toxicology to metabolism,
environmental fate and residue analysis laboratories.
On the other hand, measurements are never perfectly
reproducible (especially not for living systems) whereas
simulations are, and GLP is difficult to apply to electronic
data systems and calculations.

 ³ The measurement of the water percolating through soils and the
determination of the materials dissolved in the water.
   This led to the development of a short document by the
Federal Biological Research Centre for Agriculture and For-
estry (BBA), the Federal Environmental Agency (UBA), the
Fraunhofer Institute for Environmental Chemistry and Eco-
toxicology (FhG IUCT) and  the German Agrochemical
Industry (IVA) entitled, "Rules for the correct performance
and evaluation  of model calculations for simulation of the
environmental  behaviour  of pesticides" (Gorlitz,  1993).
Later referred to as the 'Codex', this document outlined general
principles of GMP rather than prescriptive guidance. It
focused on leaching models but was generally applicable to
other simulation models and addressed the following topics:
selection of models, documentation  of models, validation,
support, official recognition and version control, selection
and treatment of input data, consistency of input data and
models, documentation of simulations, reporting and inter-
pretation.  The Codex led to regulatory acceptance of simula-
tion  models  on a national scale in Germany, as well as
providing  a basis to address the requirements of the Euro-
pean directive 91/414 (European Community  Regulation
Council Directive 91/414).
   After several informal meetings between modellers, regulators
and registrants, the FOrum for the Coordination of
pesticide fate models and their USe (FOCUS) was created
in 1993 through an initiative of the European Commission
(European Commission FOCUS). The steering committee
of FOCUS met under the auspices of the  EU Directorate
General for Health and Consumer Affairs (DG SANCO)
for the first time in 1993 and approved two research area
themes  on models for groundwater and surface  water.
FOCUS has equal representation of regulators,
researchers and industry, operates by consensus and
offers technical support to the EU registration process under
Directive 91/414. It has no administrative infrastructure, but DG
SANCO provides funds for attendance at meetings for regulatory
experts and researchers. The FOCUS committee
meets approximately four times per year and has two permanent
institutions. The FOCUS website (European Commission
FOCUS) provides all the reports of past FOCUS
projects, the currently recommended versions of models as
well as essential scenario data. Members of the supporting
technical Version Control Group are model developers/
supporters. This group approves new model versions, and
the content of the website, by correspondence.
   Currently, FOCUS reports figure prominently in exposure
assessments for the registration of plant protection
products in the EU. This is best illustrated by the fact that
the present draft of the revision of EU Directive 91/414
on the authorization of plant protection products directly
references FOCUS reports as guidance on important
decision points. FOCUS outputs are also widely adopted
as guidance by member states in their exposure and risk
assessments.
7. Evolving acceptance of QSAR modelling

   International advances in the documentation
and implementation of quantitative structure activity
relationship (QSAR) models to meet demanding
mandates to consider much larger numbers of existing
substances may also contribute to the development of
GMP for PBPK models. These include principles for verification
of QSAR model output (OECD, 2007) and proposed
templates for QSAR development, prediction and
reporting (http://ecb.jrc.it/qsar/) (European Chemicals
Bureau RIP 3.3). Whilst the documentation is still evolving
internationally, it has been proposed that information on
the training domain, internal validation, cross validation
and external validation requirements be included in a
'development template', whereas substance-specific information
would be included in a 'prediction
template'.

8. Future directions—where do we need to go?

  The following sections briefly summarize some of the
major issues  considered and recommendations from the
workshop designed to facilitate the development of GMP
for  PBPK modelling   as well  as to  identify research
priorities.

8.1. Risk assessors' needs and their role in the process

  Two possible paradigms were proposed for the involve-
ment of the risk assessor throughout the modelling pro-
cess: (1)  issues raised  by the  risk assessor are included
during model development, and (2) at appropriate times,
the  model would be evaluated for fitness  for  regulatory
use. To the extent that it is possible, the former process
is clearly preferred  and necessitates involvement of  an
interdisciplinary team in model development and charac-
terization (Barton et al., 2007), whereas the latter process
is more   typical for  models  that  have  already been
published.
  Risk assessors have important roles to play in mode of
action  and dosimetry-based  risk  assessments utilizing
PBPK models.  These include  transparently assessing the
weight of evidence of hypothesized modes  of action as a
basis for clearly delineating the goals for using the model
in the  risk assessment  (Clewell et al.,  2002a;  US  EPA,
2006) and participating  in a  transparent process that
brings together appropriate interdisciplinary expertise  to
evaluate the model and its proposed risk assessment applications
(Chiu et al., 2007; Clark et al., 2004). Furthermore,
risk assessors play pivotal roles in organizing the
information on mode of action and dose-response (e.g.,
critical studies and endpoints) that forms the context for
applying a dosimetry model. Transparent frameworks
developed  for this  purpose (Boobis  et al., 2008; IPCS,
2005) may assist the risk assessor in assimilation  of this
information.  Determining whether  a  PBPK  model  is
parameterised for  the chemical(s), including metabolites,
species and life stages,  exposure routes and matrices in
the toxicity studies used in dose-response analysis or the
human exposures  relevant for the risk assessment,  can
be accomplished by non-modellers. Identifying the  dose
metrics relevant to the  modes  of action under  consider-
ation and evaluation of the biology captured by the model
often requires communication among risk assessors,  toxi-
cologists, and modellers. Evaluation of the  mathematical
and  computer implementation as well as characterization
of its consistency  with available data and the model's
strengths and weaknesses for the proposed risk assessment
applications will generally require involving those  with
appropriate mathematical, statistical  and computational
expertise. However, to ensure a transparent process, com-
munications describing the review process and its conclu-
sions need  to be clear and comprehensible to all parties.

8.2. Model development practices

   Model standardization can facilitate intra- and inter-dis-
ciplinary communication but creates challenges of adapting
to a variety of software  used to produce a range of model
structures necessary to describe different kinetic behaviours
and address varying model purposes. There are significant
benefits to  the  use  of generic model structures,  including
the establishment of standard abbreviations or parameter
nomenclature and glossary, which would facilitate efficient
communication of models and avoid confusion in semantics
that can hinder understanding. In addition, the need
to justify selected aspects of the model, which is currently
met by citing existing literature, could be eliminated. To be truly
generic, however, a model would have to encompass a wide
range of physiological compartments and all useful dose
metrics.
   Standard methodology for model building might  be a
more viable alternative  than a  fixed model form (Cobelli
et al.,  1984).  Moreover,  the use of  a hybrid  approach
whereby a simple standard model is used as a starting point
and  refinements during  the modelling workflow are  con-
ducted utilising a standardized  model building methodol-
ogy may be a viable compromise. In discussing the issues
associated with model code that is specific to a particular
solver package, it was  agreed by the workshop delegates
that the use of a standard representation similar to Systems
Biology Mark-up Language (SBML) or Cell Mark-up  Lan-
guage (cellML)4 would  improve communication between
modellers and risk  assessors.  Mark-up Language (ML) is
a type of representation that gives a structured description
of the conceptual model,  free of mathematical equations
and confusing syntax. The provision of an intuitive graph-
ical interface such  as MEGen5 could make such standard
formats more  accessible  to  non-modellers  by allowing
rapid generation of this  'PBPKML' representation.
 4 http://sbml.org; www.cellml.org
 5 www.opentox.com/megen
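
To illustrate what a solver-independent, structured model description might look like, the sketch below encodes a small conceptual model as plain data and derives the rate equations from it. The schema (`compartments`, `transfers` and their field names), the rate constants and the function `build_rhs` are hypothetical and chosen for illustration only; they are not SBML, CellML or the output of MEGen.

```python
# Hypothetical, solver-independent description of a conceptual model.
# Field names and values are illustrative assumptions, not a real schema.
MODEL = {
    "compartments": ["blood", "liver", "metabolised"],
    "transfers": [
        # amount moved per unit time = rate_constant * amount in source
        {"source": "blood", "target": "liver", "rate_constant": 0.8},
        {"source": "liver", "target": "blood", "rate_constant": 0.5},
        {"source": "liver", "target": "metabolised", "rate_constant": 0.2},
    ],
}


def build_rhs(model):
    """Turn the structured description into a derivative function."""
    names = model["compartments"]
    index = {name: i for i, name in enumerate(names)}

    def rhs(amounts):
        rates = [0.0] * len(names)
        for transfer in model["transfers"]:
            src = index[transfer["source"]]
            dst = index[transfer["target"]]
            flux = transfer["rate_constant"] * amounts[src]
            rates[src] -= flux  # what leaves the source ...
            rates[dst] += flux  # ... arrives in the target
        return rates

    return rhs


rhs = build_rhs(MODEL)
derivatives = rhs([10.0, 0.0, 0.0])  # 10 units initially in blood
```

Because every transfer is subtracted from one compartment and added to another, the derivatives sum to zero by construction, so a description of this kind can be handed to any solver without rewriting the equations by hand.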

G. Loizou et al. / Regulatory Toxicology and Pharmacology 50 (2008) 400-411
8.3. Model verification

   Models can be analysed to demonstrate that they are
mathematically and computationally free of errors and
that the behaviour of the model, in the biologically plausible
region of parameter space, reasonably approximates
the available data (Barton et al., 2007; Oreskes,
1998). Demonstrating that a model is correctly implemented,
both mathematically and computationally, can involve
checks incorporated in the model (e.g., mass balance
checks), rigorous manual checking of the equations and
computer code, and independent recoding of the model
in another software environment. The ease of implementing
these options varies with the particular software
used. A PBPK model code generator tool such as
MEGen5 could facilitate these checks by permitting rapid
recoding of models.
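
A mass balance check of the kind mentioned above can be sketched as follows. The two-compartment structure, the rate constants and the function names are illustrative assumptions and are not taken from any published PBPK model; the point is only that the amounts in all compartments, plus the amount metabolised, should add up to the administered dose at every step.

```python
def simulate(dose, k_uptake, k_return, k_met, dt, n_steps):
    """Euler integration of a blood/tissue model with metabolism in tissue.

    All rate constants are illustrative values (per hour), not taken from
    any published model.
    """
    blood, tissue, metabolised = dose, 0.0, 0.0
    for _ in range(n_steps):
        to_tissue = k_uptake * blood * dt
        to_blood = k_return * tissue * dt
        cleared = k_met * tissue * dt
        blood += to_blood - to_tissue
        tissue += to_tissue - to_blood - cleared
        metabolised += cleared
    return blood, tissue, metabolised


def mass_balance_error(dose, amounts):
    """Relative difference between the dose and the total accounted for."""
    return abs(sum(amounts) - dose) / dose


amounts = simulate(dose=100.0, k_uptake=0.6, k_return=0.3, k_met=0.1,
                   dt=0.01, n_steps=5000)
error = mass_balance_error(100.0, amounts)
# A failure here would point to a coding error in the rate equations.
assert error < 1e-9, "mass balance violated: check the model equations"
```

If a term were omitted or given the wrong sign in one of the update lines, the total would drift away from the dose and the assertion would fail, which is what makes this a cheap built-in verification step.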
8.3.1. Roles and methods of sensitivity analysis
   Sensitivity analysis is a tool for model characterization
that can address a number of issues frequently raised con-
cerning PBPK models.
   Sensitivity analyses can be implemented in model devel-
opment, characterization and evaluation to address several
aspects including the following:

1. Characterizing parameters  that are well determined by
   available data.
2. Iterating with experiments and evaluating the sensitivity
   of parameters to new data that will be collected (Cho
   et al., 2003; Gueorguieva et al., 2006; Nestorov et al.,
   1998).
3. For dose-response analysis predictions, evaluating the
   sensitivity of dose metrics predicted under the conditions
   relevant to the toxicity studies (or epidemiological stud-
   ies) to the parameters in the model.
4. For risk assessment, evaluating the predicted dose met-
   rics  in humans under relevant environmental exposure
   conditions to  characterize their sensitivity with respect
   to the model parameters.

   The many existing sensitivity analysis methods can be
grouped into two categories:  (1) local methods that con-
sider sensitivities close to a specific  set of input param-
eter values, and (2)  global methods that calculate the
contribution  of  a parameter  over the set of all possible
input  parameters.  Currently, gaining  insight  into  a
model often involves the adjustment of individual  model
parameters and  observation of the  predicted changes in
model output,  either at a single time  or throughout a
time course.  This useful  practice can be supplemented
by examining the time-dependent global sensitivities of
the chosen dose-metric for dominant parameters.  When
trying  to  establish the contribution  of a parameter to
model predictions, local  sensitivity  analysis techniques
are fairly  rapid and  simple to implement but can give
somewhat  misleading results  if  there  are substantial
interactions among multiple parameters.
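
As a minimal sketch of a local method, the example below computes a normalised sensitivity coefficient, (dC/dp)(p/C), by central finite differences around one set of parameter values and at several time points. The one-compartment bolus model, the parameter values and the function names are assumptions made for illustration; a real PBPK model would replace `concentration`.

```python
import math


def concentration(t, dose, volume, clearance):
    """One-compartment bolus model: C(t) = (dose / V) * exp(-(CL / V) * t)."""
    return dose / volume * math.exp(-clearance / volume * t)


def local_sensitivity(model, params, name, t, rel_step=1e-4):
    """Normalised local sensitivity (dC/dp) * (p / C) by central differences."""
    up = dict(params, **{name: params[name] * (1 + rel_step)})
    down = dict(params, **{name: params[name] * (1 - rel_step)})
    base = model(t, **params)
    slope = (model(t, **up) - model(t, **down)) / (2 * rel_step * params[name])
    return slope * params[name] / base


# Hypothetical parameter values (dose in mg, volume in L, clearance in L/h).
params = {"dose": 100.0, "volume": 10.0, "clearance": 2.0}
for t in (1.0, 5.0, 10.0):
    s_cl = local_sensitivity(concentration, params, "clearance", t)
    print(f"t = {t:4.1f} h  sensitivity to clearance = {s_cl:.3f}")
```

For this simple model the coefficient for clearance has the analytic value -CL*t/V, so the numerical result can be checked directly; it also shows why sensitivities are best reported over the time course rather than at a single time. Because only one parameter is perturbed at a time, interactions among parameters are not captured.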
   Global sensitivity analysis using the Extended Fourier
Amplitude Sensitivity Test (FAST) is a variance-based
method that makes no assumptions about the model
structure and is effective for both monotonic (exclusively
increasing or decreasing) and non-monotonic
models (Campolongo and Saltelli, 1997). FAST is
preferable to other global methods because of its computational
efficiency and its capability to consider parameter
interactions as well as main effects. Since PBPK models are
likely to become increasingly complex as more pertinent
data become readily available, more robust sensitivity
analysis techniques will be required, and FAST appears to
satisfy these criteria.
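
The distinction between main effects and interactions can be illustrated with a small variance-based calculation. The sketch below is not an implementation of extended FAST; it uses a plain Monte Carlo "pick and freeze" estimator of the same kind of first-order and total-order variance-based indices, applied to a toy function. The toy model, the Uniform(-1, 1) input distributions and all function names are assumptions made for illustration.

```python
import random


def model(x1, x2, x3):
    """Toy output with a main effect (x1) and a pure interaction (x2 * x3)."""
    return x1 + x2 * x3


def variance_based_indices(func, n_params, n_samples, seed=0):
    """Monte Carlo estimates of first-order and total-order indices.

    Inputs are sampled independently from Uniform(-1, 1), an assumption
    made only for this illustration.
    """
    rng = random.Random(seed)
    a = [[rng.uniform(-1, 1) for _ in range(n_params)] for _ in range(n_samples)]
    b = [[rng.uniform(-1, 1) for _ in range(n_params)] for _ in range(n_samples)]
    f_a = [func(*row) for row in a]
    f_b = [func(*row) for row in b]

    pooled = f_a + f_b
    mean = sum(pooled) / len(pooled)
    variance = sum((y - mean) ** 2 for y in pooled) / (len(pooled) - 1)

    first, total = [], []
    for i in range(n_params):
        # "Pick and freeze": rows of A with column i taken from B.
        f_ab = [func(*(ra[:i] + [rb[i]] + ra[i + 1:])) for ra, rb in zip(a, b)]
        v_first = sum(fb * (fab - fa)
                      for fa, fb, fab in zip(f_a, f_b, f_ab)) / n_samples
        v_total = sum((fa - fab) ** 2
                      for fa, fab in zip(f_a, f_ab)) / (2 * n_samples)
        first.append(v_first / variance)
        total.append(v_total / variance)
    return first, total


first_order, total_order = variance_based_indices(model, n_params=3,
                                                  n_samples=20000)
```

For this toy function the exact first-order indices are 0.75, 0 and 0, while the exact total-order indices are 0.75, 0.25 and 0.25. The second and third inputs therefore appear irrelevant when only main effects are considered, yet each is involved in a quarter of the output variance through their interaction, which is the situation in which one-at-a-time analyses can mislead.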
8.4. Model documentation

   Suggestions for documenting models in publications
have  been   presented  previously  (Andersen   et  al.,
1995).  As noted  therein, model documentation  must
address a diverse readership. Recommendations  from
this workshop were  to  develop a standard, brief model
description  summary  for the  broad  risk  assessment
audience and more detailed  documentation for  special-
ists. The summary would  contain at least seven elements
including:

1.  Introduction including problem formulation (applicabil-
   ity of model).
2.  A text description of the model (species, routes, etc.) with
   schematic diagram, and an overview of the information
   and data supporting the model structure.
3.  Metabolic pathways for the chemical and an overview of
   the supporting information and data.
4.  Relationship to  mode of action including dose metric
   predictions and supporting information.
5.  Distributional predictions of model  outputs  and their
   implications (e.g.,  Monte Carlo simulation of human
   variability).
6.  Overview of uncertainty and sensitivity analyses.
7.  Source  of complete information (e.g., citation).

   Further recommendations  for more complete model
documentation  could include  the  possibility  of utilising
hyperlinked documents that facilitate easy access  to sup-
porting materials, including calculations done to  convert
published scientific information into the form used in
the model. This extended model  documentation would
be utilized by subject experts in the model  evaluation pro-
cess and would ideally be publicly accessible via the Inter-
net. The  documentation  would strive for transparency
through the integration of diagrams of model structure
and metabolic pathways,  tables of model state variables
and parameters  and  mathematical equations and model
code.

8.5. Model evaluation

   Best practices allow efficient evaluation of models
through standardization, documentation, and transparency.
A useful framework for model evaluation is the six-step
process described by Clark et al. (2004) and extended
in more detail by Chiu et al. (2007): assessment of model
purpose; assessment of model structure and biological
characterizations; assessment of mathematical descriptions;
assessment of computer implementation; parameter
analysis and assessment of model fit; and assessment of
any specialized analyses. Further, it would be valuable to
specify criteria that assist reviewers in determining the
strengths and limitations of a specific model, and to define
a process for implementing model evaluation that is
transparent and involves independent review.
   Development of a robust model evaluation process must
take into account the need for external review. While
involvement of risk assessors and modellers throughout
the steps leading from model development to application
in risk assessment is valuable, it can affect the perception
of the model evaluation as an independent process. An
independent review is essential to identify and correct mis-
takes and to make judgments on the adequacy of the model
and its supporting scientific database. Such reviews present
a challenge internationally, not least because of the limited
PBPK modelling expertise globally.  For this  reason,  it
would be valuable to be able to share model evaluations
among countries, by agreeing upon a common framework
and process even if the final decisions concerning model use
might be different, for example due  to risk assessment
needs.
   A major challenge of model evaluation is to provide per-
spective on the scientific uncertainties identified by a model
and its supporting scientific database. Models allow char-
acterization of uncertainty in a way that default analyses
cannot:  for example, a default value of 10 is commonly
applied for interspecies extrapolation, but the uncertainty
for any  specific chemical with  regard to  the toxicity  it
causes in animals ranges from close to zero (the effect only
occurs in the animals) to  a much larger value  (the effect
only occurs in humans). While the factor of 10 represents
a judgment concerning the general tendency across many
chemicals, it cannot describe the uncertainties for a specific
chemical, whereas this is possible using biologically based
modelling. However, this creates the challenge of deciding
whether the model adequately captures the science
and thus should be implemented in the risk assessment.

8.6. Improving the scientific basis supporting models

   Efforts to use PBPK models more broadly  have  also
resulted in a range of scientific issues that require addi-
tional research. These include improving methods for using
in vitro data in order to limit controlled animal and human
studies,  for model development  by  extrapolating  from
widely studied chemicals to those with limited information
and for better characterizing uncertainty and variability in
PBPK models.
8.6.1. In vitro to in vivo extrapolations
   Ideally, in vitro data should be used in PBPK models
because they can limit the need for in vivo studies in ani-
mals or humans. However, limitations of models to predict
in vivo rat data using metabolic parameters estimated from
in vitro studies have been noted (Csanady and Filser, 2007;
Faller et al., 2001; Lee et al., 2005; Osterman-Golkar et al.,
2003).
   In vitro to in vivo extrapolation, particularly with
regard to metabolism, requires further detailed study
(Blaauboer et al., 1999, 1996; DeJongh et al., 1999; Gulden and
Seibert, 2003; Houston,  1994;  Kedderis,  1997; Lipscomb
et al.,  1998; Miners et al.,  1994; Rostami-Hodjegan and
Tucker, 2004; Verwei  et al., 2006; Wilson et al., 2003).
Protein and non-specific binding and partitioning of
substrates are fundamental to improving
the utility of in vitro systems and the use of such data in
PBPK models. While there are initiatives underway to
assist in addressing many of these issues6 and encouraging
results have recently been reported (Acutetox Newsletter
July, 2007), the limitations of in vitro metabolism data must
be borne in mind until and unless they can be demonstrated
to be reliable surrogates.
8.6.2. Cross chemical extrapolation
   Risk assessors increasingly have to address prioritisation
and assessment of the large numbers of chemicals
in commerce, notably under the REACH legislation in Europe or
the Categorization and Screening of the Domestic Substances
List under the Canadian Environmental Protection
Act (1999)7. Methods to develop initial PBPK models for
chemicals using cross-chemical prediction methods would
be valuable. Efforts to date have primarily been directed
at predicting tissue:blood or tissue:air partition coefficients
(Beliveau et al., 2005), though in vitro to in vivo extrapolation
for metabolism and other aspects of pharmacokinetics
is also receiving attention.


8.6.3. Uncertainty and variability in PBPK models
   Much of the focus in the development of PBPK models
has been to identify and capture the average behaviour of
the key biological processes controlling the pharmacokinetics
of a chemical. These models have successfully
assisted in evaluating biological hypotheses for mode of
action (e.g., methylene chloride carcinogenesis, described
previously), as well as identifying previously unrecognised
pharmacokinetic behaviours. The increasing application
of PBPK models in risk assessment has led to a range
of efforts to better characterize the relationship between
the model and supporting data and quantify uncertainty
and variability.
 6 (http://www.acutetox.org/)
 7 (http://www.ec.gc.ca/substances)
   Improved computing power is essential to more wide-
spread use of distributional analyses to characterize human
variability with Monte Carlo sampling techniques  and
methods of parameter estimation ranging from optimisa-
tion of selected chemical specific parameters (e.g., meta-
bolic   rates)   to  global  parameter  estimation  using
Bayesian statistical characterization of  uncertainty  and
variability. Priorities for research and implementation  of
concepts of uncertainty and variability in risk assessments
using PBPK models have been previously described (Bar-
ton et al., 2007).
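
A distributional analysis of human variability by Monte Carlo sampling can be sketched in a few lines. The example below samples a single parameter (clearance) from a lognormal distribution and propagates it to a steady-state dose metric; the steady-state relationship Css = R/CL stands in for a full PBPK model, and the median, geometric standard deviation and function names are hypothetical values chosen for illustration.

```python
import math
import random
import statistics


def steady_state_concentration(intake_rate, clearance):
    """Steady-state blood concentration for constant intake: Css = R / CL."""
    return intake_rate / clearance


def monte_carlo_css(intake_rate, cl_median, cl_gsd, n_subjects, seed=1):
    """Sample clearance from a lognormal distribution; return sorted Css.

    cl_median and cl_gsd (geometric standard deviation) are hypothetical
    population values chosen for illustration.
    """
    rng = random.Random(seed)
    mu, sigma = math.log(cl_median), math.log(cl_gsd)
    return sorted(
        steady_state_concentration(intake_rate, rng.lognormvariate(mu, sigma))
        for _ in range(n_subjects)
    )


css = monte_carlo_css(intake_rate=1.0, cl_median=2.0, cl_gsd=1.35,
                      n_subjects=20000)
median = statistics.median(css)
p95 = css[int(0.95 * len(css))]
print(f"median Css = {median:.3f}, 95th percentile = {p95:.3f}, "
      f"ratio = {p95 / median:.2f}")
```

The ratio of an upper percentile to the median is one way such output is summarised when characterizing human variability. With a full PBPK model the same loop applies, but each sample requires a model run, which is why computing power matters for this kind of analysis.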
8.7. Good modelling practices for PBPK models: developing
a description, case studies and training materials

   The  International Programme  on  Chemical  Safety
(IPCS) steering group of the World Health Organization
(WHO) identified PBPK modelling  as an important com-
ponent  of chemical  risk assessment that  merits interna-
tional  harmonization8.  The  ability to review a  PBPK
model  according to accepted criteria would greatly facili-
tate widespread acceptance, in particular amongst regula-
tors. While  agreement amongst PBPK model developers
is paramount for the development of GMP, the guidelines
must also be acceptable to regulators and risk assessors.
Development  of  guidelines  for GMP  is  best achieved
through  a cross-disciplinary exchange  of  experience and
ideas among laboratory scientists, PBPK modellers, regula-
tors and risk assessors.
   The adequacy of the GMP description can be evaluated
using  case studies that  in turn could form the basis for
training materials on GMP.  Some recommendations were
proposed for case studies:

 • Comparing a dose metric for which data were directly
   available versus one where they were not.
 • Examples where PBPK models were accepted and used
   by  regulatory  Agencies  and ones  where  they were
   rejected to ensure  appropriate documentation.
 • Comparisons of data-rich chemicals  with data-limited
   chemicals including not just comparison  of pharmacoki-
   netic or metabolic data, but also mode of action data
   such as toxicogenomic or  metabolomic data.
 • Illustrations of different risk assessment  applications.

   Potential  chemicals  to  use as  case  studies  would
include those  for which PBPK models had been consid-
ered or applied in risk  assessments in Europe, Canada,
and the United States. Other chemicals could include isopropanol
(with acetone metabolite submodel) for
non-cancer endpoints, styrene as an example of an inaccessible
dose metric, acrylamide as an example of great
current regulatory interest with multiple proposed modes
of action and target sites, and 1,3-butadiene due to the
substantial animal modelling and uncertainty in human
metabolism resulting in assessment based upon
epidemiology.
 8 http://www.who.int/ipcs/methods/harmonization/areas/pbpk/en/index.html
   Finally, development of training materials and hiring of
personnel with  the required expertise will be essential to
facilitate  implementation of mode of action and dosime-
try-based risk assessment by  regulatory Agencies. Training
materials are needed so that risk assessors and managers
with diverse expertise can successfully interact with model-
lers to implement PBPK models in risk assessment. Train-
ing will  also be important for modellers  to learn about
newer methodologies  for characterizing uncertainty and
variability in PBPK models or implementing  local and
global sensitivity  analyses at appropriate stages of model
maturation. A  longer-term strategy would be to include
a more quantitative, computationally based study of toxi-
cology in university courses. The adaptation of a PBPK
model generator tool  such as  MEGen as  a teaching tool
would be very  useful in demonstrating to students how
biological knowledge  can be applied  to solve real-world
problems.

9. Conclusions and recommendations

   The use  of  PBPK modelling  in  risk  assessment is
increasing in various jurisdictions but would benefit from
development of  principles  and guidance for  GMP  to
assist modellers during design and verification and risk
assessors in  evaluation  for application.  Experience  in
development of similar guidance in other areas such as
environmental fate modelling and more recently evolving
principles  and  documentation prototypes  for QSAR
response modelling can inform this process. Recommen-
dations for  aspects to be addressed in GMP for model
development, characterization, documentation and evalu-
ation were  based on an  international  workshop  and
include:

1. Transparency of model documentation  and the weight
   of evidence for the underlying  hypothesized mode of
   action  is needed to aid transition of models from devel-
   opers to evaluators  and users.
2. Independent  review of models is essential to evaluate
   documentation and implementation quality for applica-
   tions in risk assessment.
3. Consistent model evaluation approaches would facilitate
   international sharing of the analyses that would then
   form the basis for decisions appropriate  to different reg-
   ulatory applications. An international committee would
   be valuable to further this goal.
4. Successful development, evaluation and application of a
   PBPK model requires multidisciplinary skills through-
   out the process. Regulatory agencies need to develop
   access  to those who can  provide those skills through
   training, hiring or other approaches.
5. Training in toxicology at the university and professional
   levels needs to recognize that quantitative risk assess-
   ment applications are major drivers for interest in toxi-
   cological   data   by   providing   more   quantitative,
   computationally-based studies.

   These recommendations will be considered further in the
development of relevant guidance  in an ongoing initiative
of the IPCS harmonization project.

Disclaimer

   The views expressed in this article  are  those of  the
authors and do not necessarily represent the views or pol-
icies of the UK Health and Safety  Executive, US Environ-
mental Protection Agency  or  INERIS.  Mention of  trade
names or commercial products does not constitute endorse-
ment or recommendation for use.
   The present document does  not represent an  official
position of the European Commission.

Acknowledgments

   The Health and  Safety Executive (UK),  the European
Chemical  Industry Council (CEFIC), Health Canada and
the European Centre for the Validation of Alternative Meth-
ods (ECVAM) provided funds for the support of the work-
shop.  The  scientific committee  and  organizers  of  the
workshop thank the speakers: Woodrow Setzer (US EPA),
Gerhard Goerlitz (Bayer CropScience, Germany), Mel
Andersen (The Hamner Institutes for Health Research,
USA), Bette Meek (McLaughlin Institute, University of
Ottawa, Canada), Martin Spendiff (HSL, UK), Gyorgy
Csanady (GSF-Institute of Toxicology, Germany) and
Ursula Gundert-Remy (BfR, Germany) for her impromptu
presentation; the breakout session chairs: Hugh Barton
(US EPA), George Loizou (HSL, UK), Martin Spendiff
(HSL, UK) and Kannan Krishnan (University of Montreal,
Canada); the rapporteurs: Harvey Clewell (The Hamner
Institutes for Health Research, USA) and Rob DeWoskin
(US EPA); and the recorders: Cecilia Tan (The Hamner
Institutes for Health Research, USA), Tammie Covington
(Henry Jackson Foundation, USA), Jos Bessems (RIVM,
The Netherlands)  and  Marco  Zeilmaker  (RIVM,  The
Netherlands).  The  scientific  committee  and organizers
also thank  Katerina  Karapataki and her staff  at  the
Mediterranean  Agronomic Institute at  Chania, Crete,
Greece for their invaluable logistical  help and advice.

References

Acutetox  (Newsletter July 2007). .
Allen, B.C., Covington, T.R., Clewell,  H.J.,  1996. Investigation  of the
   impact  of  pharmacokinetic variability  and  uncertainty  on risks
   predicted with a pharmacokinetic model for chloroform.  Toxicology
   111, 289-303.
Andersen,  M.E., Clewell  III,  H.J.,  Frederick, C.B.,  1995. Applying
   simulation modeling to problems in toxicology and risk assessment—a
   short perspective. Toxicology and Applied Pharmacology 133, 181-
   187.
Babich, M.A., 1998. Risk assessment of low-level chemical exposures from
   consumer products under the U.S. Consumer Product Safety Commission
   Chronic Hazard Guidelines. Environmental Health Perspectives 106
   (Suppl. 1), 387-390.
Barratt, M.D., Castell, J.V., Chamberlain, M., Combes, R.D., Dearden,
   J.C., Fentem, J.H., Gerner, I., Giuliani, A., Gray, T.J.B., Livingstone,
   D.J., Provan, W.M.L., Rutten, F., Verhaar, H.J.M., Zbinden, P., 1995.
   The integrated  use of alternative approaches for predicting toxic
   hazard—the report and  recommendations  of Ecvam  workshop-8.
   ATLA-Alternatives To Laboratory Animals 23, 410-429.
Barton, H.A., Chiu, W.A., Setzer, R.W.,  Andersen, M.E., Bailer, A.J.,
   Bois, F.Y., Dewoskin, R.S., Hays, S., Johanson, G., Jones, N., Loizou,
   G.,  Macphail, R.C., Portier, C.J., Spendiff, M., Tan,  Y.M., 2007.
   Characterizing  uncertainty and variability  in  physiologically-based
   pharmacokinetic (PBPK)  models:  state of the science and needs for
   research and implementation. Toxicological Sciences 99, 395-402.
Barton, H.A., Flemming, C.D., Lipscomb, J.C., 1996. Evaluating human
   variability in chemical risk assessment: hazard identification and dose-
   response assessment for noncancer oral toxicity of trichloroethylene.
   Toxicology 111,  271-287.
Beliveau, M., Lipscomb, J., Tardif, R., Krishnan, K., 2005. Quantitative
   structure-property relationships for interspecies extrapolation of the
   inhalation pharmacokinetics of organic chemicals. Chemical Research
   in Toxicology 18, 475-485.
Blaauboer, B.J., Barratt, M.D., Houston, B.J., 1999. The integrated use of
   alternative methods in  toxicological risk evaluation:  ECVAM inte-
   grated testing strategies task force report 1.  ATLA-Alternatives To
   Laboratory Animals 27, 229-237.
Blaauboer, B.J., Bayliss, M.K., Castell, J.V., Evelo, C.T.A., Frazier, J.M.,
   Groen, K., Gulden, M., Guillouzo, A.,  Hissink, A.M., Houston, B.J.,
   Johanson, G., de Jongh, J., Kedderis, G.L., Reinhardt, C.A., van de
   Sandt, J.J.M., Semino, G., 1996. The use of biokinetics and in vitro
   methods in  toxicological risk evaluation.  ATLA-Alternatives To
   Laboratory Animals 24, 473-497.
Boobis, A.R., Cohen, S.M.,  Dellarco, V., McGregor, D., Meek, M.E.,
   Vickers, C.,  Willcocks,  D., Farland, W.,  2006.  IPCS  framework for
   analyzing the relevance of a cancer mode of action for humans. Critical
   Reviews in Toxicology 36, 781-792.
Boobis, A.R., Doe, J.E., Heinrich-Hirsch, B., Meek, M.E., Munn,  S.,
   Ruchirawat, M., Schlatter, J., Seed,  J., Vickers, C., 2008. IPCS
   framework for analysing the relevance of a non-cancer mode of action
   for humans.  Critical Reviews in Toxicology 38,  87-96.
Campolongo, F., Saltelli, A.,  1997. Sensitivity analysis of an environmen-
   tal model;  a worked  application of different  analysis methods.
   Reliability Engineering and System Safety, 49-69.
Chiu, W.A., Barton, H.A., Dewoskin, R.S., Schlosser,  P., Thompson,
   C.M.,  Sonawane, B., Lipscomb, J.C., Krishnan, K., 2007. Evaluation
   of physiologically based  pharmacokinetic models for use in  risk
   assessment. Journal of Applied Toxicology 27, 218-237.
Cho, K.-H., Shin, S.-Y., Kolch, W., Wolkenhauer, O., 2003. Experimental
   design in systems biology, based on  parameter sensitivity  analysis
   using a Monte Carlo method: a case study  for the TNFalpha-mediated
   NF-kappaB signal transduction pathway.  Simulation 79, 726-739.
Clark, L.H., Setzer,  R.W., Barton, H.A., 2004. Framework for evaluation
   of physiologically-based pharmacokinetic  models for use in  safety or
   risk assessment.  Risk Analysis 24,  1697-1717.
Clewell 3rd, H.J., Andersen, M.E., Barton,  H.A., 2002a. A consistent
   approach for the application of pharmacokinetic modeling in cancer
   and noncancer risk assessment. Environmental Health Perspectives
   110, 85-93.
Clewell 3rd, H.J.,  Andersen, M.E., Wills, R.J., Latriano, L., 1997. A
   physiologically based pharmacokinetic model for retinoic acid and its
   metabolites.  Journal of the American Academy of Dermatology 36,
   S77-S85.
Clewell, H.J., Gearhart, J.M., Gentry, P.R., Covington, T.R., VanLand-
   ingham, C.B., Crump,  K.S., Shipp, A.M., 1999. Evaluation of the
   uncertainty in an oral reference dose for methylmercury  due to
   interindividual variability in pharmacokinetics. Risk Analysis 19, 547-
   558.
Clewell, H.J., Teeguarden, J., McDonald, T., Sarangapani, R., Lawrence,
   G.,  Covington,  T.,  Gentry,  R., Shipp, A.,  2002b.  Review  and
   evaluation of the  potential  impact  of  age-  and gender-specific
   pharmacokinetic differences on tissue dosimetry. Critical Reviews in
   Toxicology 32, 329-389.
Clewell III,  H.J., Andersen,  M.E.,  1987.  Dose, species  and route
   extrapolation using physiologically-based pharmacokinetic modeling.
   Drinking Water and Health 8, 159-184.
Cobelli, C., Carson, E.R., Finkelstein, L., Leaning, M.S., 1984. Validation
   of simple and complex models in physiology and medicine. American
   Journal of Physiology 246, R259-R266.
Cox Jr., L.A., 1996. Reassessing benzene risks using internal doses and
   Monte-Carlo uncertainty analysis. Environmental Health Perspectives
   104 (Suppl. 6), 1413-1429.
Csanady, G.A., Filser, J.G., 2007. A physiological toxicokinetic model for
   inhaled propylene oxide in rat and human with special emphasis on the
   nose. Toxicological Sciences 95,  37-62.
DeJongh,  J., Forsby, A.,  Houston, J.B., Beckman, M.,  Combes,  R.,
   Blaauboer, B.J.,  1999. An integrated approach to the  prediction of
   systemic toxicity using computer-based biokinetic models and biolog-
   ical in vitro test methods: overview of a prevalidation study based on
   the ECITTS project. Toxicology In Vitro 13, 549-554.
Delic, J.I., Lilly, P.D., MacDonald,  A.J., Loizou, G.D., 2000. The utility
   of  PBPK  in  the safety  assessment  of chloroform  and  carbon
   tetrachloride. Regulatory Toxicology and Pharmacology 32, 144-155.
Dewoskin,  R.S., 2007. PBPK models in risk  assessment-A focus on
   chloroprene. Chemico-Biological Interactions 166, 352-359.
European   Chemicals  Bureau. Existing  Substances Lists.  .
European  Chemicals Bureau (RIP  3.3).  .
European Commission (FOCUS), .
European Community Regulation (Council Directive 91/414). .
European  Community Regulation (REACH), .
Faller, T.H., Csanady, G.A., Kreuzer, P.E., Baur, C.M., Filser, J.G., 2001.
   Kinetics of propylene oxide metabolism in  microsomes and cytosol of
   different organs from  mouse,  rat,  and  humans.  Toxicology  and
   Applied Pharmacology  172, 62-74.
Franks, S.J.,  2005. A  mathematical  model for  the  absorption  and
   metabolism of formaldehyde  vapour  by  humans.  Toxicology  and
   Applied Pharmacology  206, 309-320.
Franks, S.J., Spendiff, M.K., Cocker, J., Loizou, G.D.,  2006. Physiolog-
   ically based pharmacokinetic modelling of human  exposure to 2-
   butoxyethanol. Toxicology Letters 162, 164-173.
Georgopoulos, P.G., Roy, A., Gallo, M.A., 1994. Reconstruction of
   short-term multi-route exposure  to volatile organic compounds using
   physiologically based pharmacokinetic models.  Journal of Exposure
   Analysis and Environmental Epidemiology 4, 309-328.
Gorlitz, G., 1993. Rules for the correct performance and evaluation of
   model calculations for simulation of the environmental behaviour of
   pesticides (Biologische Bundesanstalt fur Land- und Forstwirtschaft,
   Fraunhofer Institut  fur Umweltchemie  und  Okotoxikologie  and
   Arbeitsgruppe "Simulationsmodelle" in Industrieverband Agrar and
   the  Umweltbundesamt,  eds.).   BayerCropScience  AG,  Internal
   Report.
Government  of Canada,  1993. Priority Substances Assessment Report
   on    Dichloromethane.    .
Gueorguieva, I., Aarons, L., Ogungbenro, K., Jorga, K.M., Rodgers, T.,
   Rowland, M., 2006. Optimal design for multivariate response phar-
   macokinetic models. Journal of Pharmacokinetics and Pharmacody-
   namics 33, 97-124.
Gulden, M., Seibert, H., 2003. In vitro-in vivo extrapolation: estimation
   of human serum concentrations of chemicals equivalent  to cytotoxic
   concentrations in vitro. Toxicology 189, 211-222.
Gundert-Remy, U., Sonich-Mullin, C., 2002. The use of toxicokinetic and
   toxicodynamic data in risk assessment: an international perspective.
   Science of the Total Environment 288, 3-11.
Hays, S.M., Becker,  R.A., Leung, H.W., Aylward, L.L., Pyatt, D.W.,
   2007. Biomonitoring equivalents: a screening approach for interpreting
   biomonitoring results from a public health risk perspective. Regulatory
   Toxicology and Pharmacology 47, 96-109.
Health Canada  (Domestic Substances  List),  .
Health  Canada  (Priority Substances Assessment  Program).  .
Houston, J.B., 1994. Relevance of in vitro kinetic parameters to
   in vivo metabolism of xenobiotics. Toxicology In Vitro 8, 507-
   512.
INERIS, 2007. Reprotoxicity of Ethylene Glycol Ethyl Ether (EGEE) in
   Humans—Development of a Dose-Response Relationship.
IPCS, 2005. Chemical-Specific Adjustment Factors (CSAFs) for Interspe-
   cies Differences and Human Variability: Guidance Document for Use of
   Data in Dose/Concentration-Response Assessment, WHO, Geneva, p.
   96. .
Kedderis, G.L., 1997. Extrapolation of in vitro enzyme induction data to
   humans in vivo. Chemical and Biological Interactions 107, 109-121.
Lee, M.S., Faller, T.H., Kreuzer, P.E., Kessler, W., Csanady, G.A., Putz,
   C., Rios-Blanco, M.N., Pottenger, L.H.,  Segerback, D., Osterman-
   Golkar, S., Swenberg, J.A., Filser,  J.G., 2005. Propylene oxide in
   blood and soluble nonprotein thiols in nasal mucosa and other tissues
   of male  Fischer 344/N rats exposed to  propylene oxide vapors—
   relevance  of  glutathione depletion for propylene oxide-induced rat
   nasal tumors. Toxicological Sciences  83, 177-189.
Lipscomb,  J.C., Fisher,  J.W., Confer, P.D., Byczkowski, J.Z.,  1998.
   In vitro to in vivo extrapolation for trichloroethylene  metabolism
   in  humans.  Toxicology  and  Applied  Pharmacology   152,
   376-387.
Meek, B., Renwick, A., Sonich-Mullin, C., 2003a. Practical application of
   kinetic data in risk assessment—an IPCS initiative. Toxicology Letters
   138,  151-160.
Meek,  M.E., Bucher,  J.R., Cohen, S.M.,  Dellarco,  V.,  Hill,  R.N.,
   Lehman-McKeeman, L.D., Longfellow, D.G., Pastoor, T., Seed, J.,
   Patton, D.E., 2003b. A framework for human relevance analysis of
   information on carcinogenic modes  of action. Critical Reviews in
   Toxicology 33, 591-653.
Meek, M.E., Renwick, A. (Eds.), 2006. Guidance for the development of
   chemical specific adjustment factors—integration with mode of action
   frameworks. Informa Healthcare, New York.
Meek,  M.E.,  Renwick,  A.,  Ohanian,  E.,  Dourson, M., Lake,  B.,
   Naumann, B.D., Vu, V., 2001. Guidelines  for application of chemical
   specific adjustment factors (CSAF)  in dose/concentration response
   assessment. Comments on Toxicology 7, 575-590.
Meek,  M.E.,  Renwick,  A.,  Ohanian,  E.,  Dourson, M., Lake,  B.,
   Naumann, B.D., Vu, V.,  2002. Guidelines for application of chemi-
   cal-specific adjustment factors in dose/concentration-response assess-
   ment. Toxicology  181-182, 115-120.
Miners,  J.O., Veronese, M.E., Birkett, D.J., 1994. In vitro approaches for
   the prediction  of human  drug  metabolism.  Annual  Reports  in
   Medicinal Chemistry 29, 307-316.
Nestorov, I.A., Aarons, L.J., Arundel, P.A., Rowland, M., 1998. Lumping
   of whole-body physiologically based pharmacokinetic models.  Journal
   of Pharmacokinetics and Biopharmaceutics 26, 21-46.
OECD,  2007. Guidance on the Validation  of (Quantitative) Structure
   Activity Relationship  [(Q)SAR] Modelling, Paris.  In: Environment

-------
                                 G. Loizou et al. / Regulatory Toxicology and Pharmacology 50 (2008) 400-411
   and Health Publications Series on Testing and Assessment. Organiza-
   tion for Economic Cooperation and Development, Paris.
Oreskes, N.,  1998. Evaluation (not validation) of quantitative models.
   Environmental Health Perspectives 106 (Suppl. 6), 1453-1460.
OSHA, 1997. Occupational Exposure to Methylene Chloride; Final Rule
   20  CFR  Parts  1910, 1915  and   1926  70FR1493-1619.   (Accessed August 1, 2007).
Osterman-Golkar, S., Czene, K., Lee, M.S., Faller, T.H., Csanady, G.A.,
   Kessler, W., Perez, H.L., Filser, J.G., Segerback, D., 2003. Dosimetry
   by means of DNA  and hemoglobin adducts  in  propylene  oxide-
   exposed rats. Toxicology and Applied Pharmacology 191, 245-254.
Portier,  C.J., Lyles,  C.M., 1996. Practicing safe  modeling: GLP  for
   biologically based mechanistic models. Environmental  Health Per-
   spectives 104, 806.
Rescigno, A., Beck, J.S., 1987. The use and abuse of models. Journal of
   Pharmacokinetics and Biopharmaceutics 15, 327-344.
Rostami-Hodjegan, A.,  Tucker, G.T.,  2004. 'In  silico' simulations to
   assess the 'in vivo' consequences of 'in vitro' metabolic drug-drug
   interactions. Drug Discovery Today: Technologies 1, 441-448.
Rowland, M., Balant, L., Peck, C., 2004. Physiologically based pharma-
   cokinetics in drug development and regulatory science:  a workshop
   report (Georgetown University, Washington, DC, May 29-30, 2002).
   American Association of Pharmaceutical Scientists 6, 1-12.
Seed,  J.,  Carney,  E.W.,  Corley, R.A.,  Crofton, K.M., DeSesso, J.M.,
   Foster,  P.M., Kavlock, R., Kimmel, G.,  Klaunig, J., Meek, M.E.,
   Preston, R.J., Slikker  Jr., W., Tabacova, S., Williams, G.M., Wiltse, J.,
   Zoeller, Fenner-Crisp, P., Patton, D.E., 2005. Overview: using mode of
   action and life stage information to evaluate the human relevance of
   animal toxicity data. Critical Reviews in Toxicology 35, 664-672.
Sonich-Mullin, C., Fielder, R., Wiltse,  J.,  Baetcke, K., Dempsey, J.,
   Fenner-Crisp, P., Grant,  D., Hartley, M.,  Knaap, A.,  Kroese,  D.,
   Mangelsdorf, I., Meek, E., Rice,  J.M.,  Younes, M.,  2001.  IPCS
   conceptual framework for evaluating a mode of action for chemical
   carcinogenesis. Regulatory  Toxicology and Pharmacology 34, 146-
   152.
US EPA,  1987. Update  to the Health Assessment Document and
   Addendum for Dichloromethane (Methylene Chloride): Pharmacoki-
   netics, Mechanism of Action, and Epidemiology. External Review
   Draft, EPA/600/8-87/030A.
US EPA, 1989. Risk Assessment Guidance for Superfund, vol 1. Human
   Health Evaluation Manual (Part A), EPA/540/1-89/002.
US EPA, 2006. Approaches for the Application of Physiologically Based
   Pharmacokinetic (PBPK)  Models  and  Supporting Data in Risk
   Assessment (Final Report).  EPA/600/R-05/043A.
Verwei, M., van Burgsteden, J.A., Krul, C.A., van de Sandt, J.J., Freidig,
   A.P.,  2006. Prediction of in vivo embryotoxic effect levels with a
   combination of in vitro  studies  and PBPK modelling.  Toxicology
   Letters.
Wilson, Z.E., Rostami-Hodjegan, A., Burn,  J.L., Tooley, A., Boyle, J.,
   Ellis, S.W., Tucker, G.T.,  2003. Inter-individual variability in levels of
   human microsomal protein and hepatocellularity per  gram of liver.
   British Journal of Clinical Pharmacology 56, 433-440.

-------
                                         APPLICATIONS NOTE
                              Vol. 25 no. 5 2009, pages 692-694
                              doi: 10.1093/bioinformatics/btp042
Databases and ontologies

DSSTox chemical-index  files for exposure-related experiments
in ArrayExpress and  Gene  Expression  Omnibus: enabling
toxico-chemogenomics  data linkages
ClarLynda R. Williams-DeVane1,*, Maritja A. Wolf2 and Ann M. Richard1
1National Center for Computational Toxicology, Office of Research and Development, US EPA and
2Lockheed Martin, Research Triangle Park, NC 27711, USA
Received on October 31, 2008; revised on January 12, 2009; accepted on January  18, 2009
Advance Access publication January 21, 2009
Associate Editor: Alex Bateman
ABSTRACT
Summary:  The Distributed Structure-Searchable Toxicity (DSSTox)
ARYEXP and GEOGSE files are newly published, structure-annotated
files of the chemical-associated and  chemical exposure-related
summary  experimental  content  contained in the  ArrayExpress
Repository  and Gene Expression Omnibus (GEO) Series (based on
data extracted on September 20, 2008).  ARYEXP and GEOGSE
contain 887 and 1064 unique chemical substances mapped to
1835 and 2381  chemical exposure-related experiment  accession
IDs, respectively. The  standardized files allow one  to  assess,
compare and search the chemical content in each resource, in the
context of the larger DSSTox toxicology data  network, as well as
across large public  cheminformatics resources such as PubChem
(http://pubchem.ncbi.nlm.nih.gov).
Availability: Data files and documentation may be accessed online
at http://epa.gov/ncct/dsstox/.
Contact:  williams.clarlynda@epa.gov
Supplementary information: Supplementary data are available at
Bioinformatics online.
1   INTRODUCTION
In recent years, the number of publicly available gene expression,
toxicology and cheminformatics resources with the potential to
support toxicogenomics  investigation  has  grown considerably
(http://www.microarryaworld.com/DatabasePage.html;    Richard
et al., 2008; C.R.Williams-DeVane et al., manuscript submitted).
These trends are encouraging aggregation and use of data in a
much broader context, spanning domains of inquiry in relation to
toxicology, chemistry and genomics (Waters et al., 2008).
  The European Bioinformatics Institute's (EBI) ArrayExpress
Repository (http://www.ebi.ac.uk/microarray-as/ae/) and the National
Center for Biotechnology Information's (NCBI) Gene Expression
Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo/) are the two main
public repositories for gene expression experiments associated
with the published scientific literature. Although they each support
MIAME-compliant submissions (i.e. adhering to guidelines
for Minimum Information about Microarray Experiments;
http://www.mged.org/Workgroups/MIAME/miame.html), neither
resource has standardized requirements for reporting of chemical
information associated with submitter-deposited microarray
experiments. As a result, not only has it been difficult to assess the
chemical-related content within these resources, but also microarray
data have been effectively isolated from rapidly growing public
sources of chemically indexed information pertaining to toxicology
(Richard et al., 2006).

*To whom correspondence should be addressed.
     We report here  the publication of chemical-index files for
   experimental  content in the  ArrayExpress  Repository  and GEO
   Series (data extracted on September 20,  2008), in association
   with the  Environmental Protection Agency's (EPA) Distributed
   Structure-Searchable Toxicity (DSSTox) Data  Network  project
   (http://www.epa.gov/ncct/dsstox/) (Supplementary References).


   2   DATABASE METHODS AND COMPONENTS

   2.1  DSSTox
   The DSSTox project publishes high-quality, standardized chemical
   structure toxicity data files pertaining to high-interest  chemicals
   for  environmental toxicology and  of potential use  for  structure-
   activity  relationship  (SAR)  modeling. The DSSTox  website
   offers  documentation,  freely  downloadable  structure   data
   files (SDF) and tabular  data  files  (.xls) for  each  published
   Data  File  (http://www.epa.gov/ncct/dsstox/DataFiles.html).  A
   unique  aspect of this effort is the  quality annotation, review
   and  representation  of chemical information both  in terms of
   a unique mapping  to  a  curated  chemical structure, as well
   as  at the generic test substance level   (similar  to  Chemical
   Abstracts Service  (CAS) Registry Number distinctions)—see
   http://www.epa.gov/ncct/dsstox/MoreonStandardChemFields.html.
   The  current  DSSTox inventory  contains  over  8000  unique
   chemicals and has been incorporated into the  online DSSTox
   Structure-Browser (http://www.epa.gov/dsstox_structurebrowser/),
   the  NCBI PubChem (http://pubchem.ncbi.nlm.nih.gov/) inventory
   containing millions  of  searchable  chemical  structures  and
   thousands  of bioassays,  ChemSpider (http://chemspider.com/)
   containing millions more  chemical  structures, properties  and
   linkages and the new EPA Aggregated Computational Toxicology
   Resource (ACToR) database (http://www.epa.gov/actor/) providing
   searchability and comparative read-across for over  200 chemical
   inventories specifically pertaining to environmental toxicology.
© The Author 2009. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

-------
                                                                          Chemical-indexing of gene expression experiments
2.2  ArrayExpress
Since the  public launch of ArrayExpress in 2002, the repository
has grown to more  than  6500 experiments.  Public microarray
data in ArrayExpress are  available  for  browsing  and querying
across  a wide  range  of experiment  properties, including array
type, submitter, species, MIAME score, etc. and complete datasets
or  subsets can be  retrieved  (http://www.ebi.ac.uk/microarray-
as/aer/entry). Although the TOXM label is available to designate
toxicogenomics experiments, it is rarely  used and represents a
very small portion of the total ArrayExpress chemical-experiment
inventory.  Chemical involvement is  primarily indicated by non-
standard,  error-prone chemical names or abbreviations included
in free-text  user  description  fields,  and these  are  very  rarely
accompanied  by chemical  identifiers  such as  CAS  or  Chemical
Entities of Biological Interest  (ChEBI) numbers. Where ChEBI
identifiers  are used (~20 instances), these have been recently cross-
referenced within the ChEBI system (http://www.ebi.ac.uk/chebi/).

2.3   GEO series
A GSE is a GEO Series Accession ID (e.g. GSE5594) that defines a
set of related samples considered to be part of a single experiment
and, for present purposes, most closely (or precisely) corresponds to
an ArrayExpress Accession ID (e.g. E-GEOD-5594). GEO currently
contains over 9900 Series Accession entries and more than 2000
curated GEO Datasets. However, fewer than 6500 of the  Series
Accession entries  could be programmatically extracted due to a
backlog in the  GEO  curation process at the time of annotation.
Chemical  information is most often located in GEO  Series records
as chemical names or abbreviations in the Summary field (a user-
submitted, free-text description field),  but in some  cases  this
information was provided only in the 'Title' or 'Samples' field.
A handful  of GEO records contained  chemical identifiers such as
CAS, but the quality of chemical annotation across GEO, in general,
was poor and in some cases absent entirely.

2.4   Methods: ARYEXP_Aux and GEOGSE_Aux
Recent  additions to  ArrayExpress  include  new  portals  for
programmatic access  where users can query and download data
in a systematic manner from the ArrayExpress  FTP site (http://
www.ebi.ac.uk/microarray/doc/help/programmatic_access.html).
To create  the  ARYEXP auxiliary file (ARYEXP_Aux), content
was  extracted using  Perl scripts from the programmatic  access
FTP site in XML format. Similarly, the creation of GEOGSE_Aux
involved first  programmatically accessing GEO to  retrieve  all
current  experimental  descriptions associated with Series records
(http://www.ncbi.nlm.nih.gov/projects/geo/info/geo_paccess.html).
Entrez tools were  used to generate an XML document containing
a  summary  of each  of the GEO Series experiments currently
curated in the GEO  Data Sets Accession  ID (GDS) system.
A series of Perl scripts was developed to parse both the
ArrayExpress  and  GEO  XML  documents  and  to  retrieve
records with   probable  chemical   association.  The  resulting
experiment descriptions were evaluated and verified  manually, and
chemical information was  extracted and  subsequently underwent
stringent review and annotation according to DSSTox procedures
(http://www.epa.gov/ncct/dsstox/ChemicalInfQAProcedures.html).
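
The parse-and-flag step described above can be sketched roughly as follows. This is only an illustration in Python, not the authors' Perl scripts: the XML element names, the sample descriptions, the accession E-DEMO-1 and the cue-word list are all assumptions made for the example, and the real ArrayExpress and GEO exports follow their own schemas. As in the text, flagged records would still need manual review.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical XML layout and descriptions, invented for this sketch.
SAMPLE_XML = """
<experiments>
  <experiment accession="E-TOXM-18">
    <description>Mouse liver response to acetaminophen treatment</description>
  </experiment>
  <experiment accession="E-DEMO-1">
    <description>Time course of seed germination</description>
  </experiment>
</experiments>
"""

# Illustrative cue words only; this is not the term list used by the authors.
CHEMICAL_CUES = re.compile(
    r"\b(treated|treatment|exposed|exposure|dose|compound|drug)\b",
    re.IGNORECASE,
)


def candidate_records(xml_text):
    """Return (accession, description) pairs whose free text hints at a chemical."""
    root = ET.fromstring(xml_text.strip())
    hits = []
    for experiment in root.iter("experiment"):
        text = experiment.findtext("description", default="").strip()
        if CHEMICAL_CUES.search(text):
            hits.append((experiment.get("accession"), text))
    return hits


if __name__ == "__main__":
    for accession, text in candidate_records(SAMPLE_XML):
        print(accession, "->", text)
```

A keyword filter of this kind only narrows the pool of records; it cannot decide which chemical is involved or in what role, which is why the manual verification step matters.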
   Each of  the resulting  DSSTox  files,  ARYEXP_Aux  and
GEOGSE_Aux, is a chemical-experiment pair index (one  record
       per  chemical  per experiment). Each file also contains  the full
       complement  of 20 DSSTox Standard  Chemical Fields  and an
       additional  14 Standard Genomics Fields  (including URL field
       linking to experiment accession IDs) to allow cross  comparisons
       of ArrayExpress and GEO content similarly indexed by DSSTox
       (Supplementary  Tables 1  and 2).  In addition, ARYEXP_Aux
       contains  an  additional   set  of 30   experimental  description
       fields  specific  to ArrayExpress (Supplementary Table  3), and
       GEOGSE_Aux contains  an additional  set of four experimental
       description fields specific to GEO Series (Supplementary Table 4).
       Further details of chemical-indexing and data extraction methods,
       comparison of ArrayExpress  and GEO chemical-experimental
       summary content and assessment of toxicologically relevant
       chemical content are provided elsewhere (C.R.Williams-DeVane et al.,
       manuscript submitted; Supplementary References).

       2.5    Methods: ARYEXP and GEOGSE
       Of greatest toxicogenomics and SAR interest are those experiments
       for which a chemical  treatment and the resulting gene expression
       changes are the primary focus of the experiment. We introduced
       the Standard Genomics Field, Chemical_StudyType, to annotate the
       purpose of the chemical  in the experiment, which could include
       uses such as 'Treatment', 'Vehicle', 'Reference', 'Media',  etc. The
       main DSSTox files, ARYEXP and GEOGSE, pertain only to  the
       'Treatment' category of ArrayExpress and GEO Series experiments
       and contain one record per unique chemical substance. In ARYEXP
       and  GEOGSE, one chemical substance can map to one or more
       experiments in GEO or ArrayExpress. Hence, unlike the Auxiliary
       files, where  records  map  to  individual experiments, the main
       DSSTox structure-index files do not contain summary details of
       particular experiments. Rather, these files contain DSSTox Standard
       Chemical Fields, Chemical_StudyType  (which  can  be  Treatment
       AND other conditions), one or more Experiment Accession IDs and
       the corresponding URLs to ArrayExpress or GEO Series Experiment
       Summary pages.
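
The relationship between an auxiliary file (one record per chemical per experiment) and a main file (one record per unique chemical) can be illustrated with a short Python sketch. The field names, the vehicle row and the pairing of chemicals with accessions below are assumptions for illustration only and are not taken from the DSSTox files; the sketch also simplifies by keeping only rows annotated as 'Treatment', whereas the main files can carry Treatment together with other study types.

```python
from collections import defaultdict

# Hypothetical rows mimicking a chemical-experiment pair index.
AUX_ROWS = [
    {"chemical": "acetaminophen", "accession": "E-TOXM-18", "study_type": "Treatment"},
    {"chemical": "acetaminophen", "accession": "E-MEXP-82", "study_type": "Treatment"},
    {"chemical": "dimethyl sulfoxide", "accession": "E-TOXM-18", "study_type": "Vehicle"},
]


def build_main_index(aux_rows):
    """Collapse chemical-experiment pairs into one entry per treatment chemical."""
    accessions_by_chemical = defaultdict(list)
    for row in aux_rows:
        if row["study_type"] == "Treatment":
            accessions_by_chemical[row["chemical"]].append(row["accession"])
    # Deduplicate and sort so each chemical maps to a stable accession list.
    return {
        chemical: sorted(set(accessions))
        for chemical, accessions in accessions_by_chemical.items()
    }


if __name__ == "__main__":
    print(build_main_index(AUX_ROWS))
```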
       3   CONCLUSIONS AND PERSPECTIVES
       The ARYEXP_Aux file  contains a  total of  2365  chemical-
       experiment records (with 44 total source fields), corresponding
       to  1011 unique chemical substances. Of these 2365  chemical-
       experiment pairs, 1835 were identified as 'Treatment' and these map
       to 887 unique chemical records in the ARYEXP file. Similarly, the
       GEOGSE_Aux file contains a total of 2381 chemical-experiment
       records (with 18 total source fields), corresponding to 1064 unique
       chemical  substances. Of these 2381  chemical-experiment pairs,
       2134 were identified  as  'Treatment'  and these map to 1014
       unique chemical records in the GEOGSE file. These numbers
       indicate that the exposure-related experimental content in these
       two public resources covers a significant range of chemicals of
       potential interest and utility for toxicogenomics investigations. All
       four files are  available for download  from the  DSSTox website
       (http://www.epa.gov/ncct/dsstox/).
         The  ARYEXP   and   GEOGSE  chemical-index  files,  with
       associated ArrayExpress  Experiment Accession ID  URLs, have
       been incorporated into the DSSTox Structure-Browser, ACToR and
       PubChem. This enables ArrayExpress and GEO Series experiments
       to be structure located from these public  resources, creating the

-------
C.R.Williams-DeVane et al.
[Figure 1 image: "Chemical Search Paradigm" diagram. A structure search for acetaminophen in the EPA DSSTox Structure-Browser links DSSTox toxicity data files (e.g. CPDBAS, FDAMDD, NTPBSI), CEBS, ChemSpider and PubChem with ArrayExpress experiments (E-TABM-131, E-MEXP-82, E-TOXM-18, E-TOXM-31) and GEO Series experiments, feeding an acetaminophen meta-dataset.]
Fig. 1. Illustration showing linkages from DSSTox Structure-Browser to
experiments from ARYEXP and GEOGSE, linked to various chemically
indexed resources of bioassay and toxicity data for constructing a meta-
dataset for acetaminophen; links to actual microarray data are provided by
GEO or ArrayExpress (dashed arrows indicate future linkages).
capability to query multiple domains of data by chemical structure.
Figure 1 illustrates this capability with a chemical structure search
of acetaminophen. Summary toxicology and microarray results are
provided or located through the DSSTox structure search, and
other domain results are provided through linkages with PubChem,
ACToR and ChemSpider. With these new capabilities, a meta-dataset
on a particular chemical or family of structurally similar chemicals
could be constructed for further analysis.
  The DSSTox ARYEXP and GEOGSE chemical-index files
have been deposited within PubChem such that chemicals can
be located by keyword search under 'PubChem Substance' on
the main search page (e.g. ARYEXP, ArrayExpress, etc.). From
the PubChem Substance results page, a user can link directly
to the chemical-associated experimental accession ID summary
pages in GEO and ArrayExpress. Likewise, through chemical
linkages, these microarray data could be placed in a much larger
data and chemical context (including linkage to data for structurally
and biologically similar chemicals) within PubChem. ARYEXP
and GEOGSE auxiliary data files contain summary experimental
factors (Supplementary Tables 3 and 4), but do not contain actual
microarray data or annotation files. To encompass these data
types, integration of ArrayExpress and GEO into the Chemical
Effects in Biological Systems (CEBS; http://cebs.niehs.nih.gov/)
toxicogenomics database is underway (Waters et al., 2008). An automated
process of porting microarray data and annotation files directly
from ArrayExpress and GEO to CEBS is to be implemented with
chemical annotation handled in collaboration with the DSSTox
project. Recommendations for chemical standards for microarray
experiments are being forwarded to the MIBBI (Minimum
Information for Biological and Biomedical Investigations) project
(http://www.mibbi.org/).
   Better coordination between NCBI's GEO and PubChem projects
is needed. Likewise, EBI's ChEBI and ArrayExpress projects
have recently improved their coordination. The present effort has
chemically annotated a large portion of the current inventories of
ArrayExpress Repository and GEO Series, although more efficient
mechanisms for updates must be instituted. The preferred solution
is to incorporate standard chemical reporting requirements into
these resources directly. The present effort charts a path forward
and will be used to encourage implementation of these changes.
ArrayExpress currently has the ability to adequately capture most
of the required chemical information; however, depositors are
not doing so. We recommend that GEO and ArrayExpress, in
coordination with projects such as MIBBI and DSSTox, each
adopt formal requirements for a minimum level of chemical
annotation in relation to experiments (e.g. valid chemical name,
Chemical_StudyType). Once repositories and depositors recognize
the importance and enhanced capabilities to be gained from chemical
annotation and indexing, and make it a priority, the possibility to
more fully integrate and utilize existing data in toxicogenomics
studies can be realized. Updates to DSSTox GEOGSE and ARYEXP
files by current means are scheduled for February 2009.
                                                                 ACKNOWLEDGEMENTS
                                                                 This  manuscript was approved by the US EPA's National Center
                                                                 for Computational Toxicology for publication; the contents do not
                                                                 necessarily reflect the views and policies of the EPA and mention of
                                                                 trade names or commercial products does not constitute endorsement
                                                                 or recommendation for use.
                                                                 Funding:   NCSU/EPA   Cooperative   Training   Program   in
                                                                 Environmental   Sciences   Research,    Training    Agreement
                                                                 CT833235-01-0 with North Carolina State University (to C.R.W.).
                                                                 Conflict of Interest: none declared.
                                                                 REFERENCES
                                                                 Richard,A. et al. (2008) Toxicity data informatics: supporting a new paradigm for
                                                                   toxicity prediction. Tox. Mech. Meth., 18, 103-118.
                                                                 Richard,A.M. et al. (2006) Chemical structure indexing of toxicity data on the Internet.
                                                                   Curr. Opin. Drug Discov. Dev., 9, 314-325.
                                                                 Waters,M. et al. (2008) CEBS—chemical effects in biological systems: a public data
                                                                   repository integrating study design and toxicity data with microarray and proteomics
                                                                   data. Nucleic Acids Res., 36, D892-D900.

-------
Bioinformation
www.bioinformation.net
by Biomedical Informatics Publishing Group
        open access
Hypothesis
      Effect of single  nucleotide  polymorphisms on
         Affymetrix® match-mismatch  probe pairs
                    Eric Christian Rouchka1,*, Abhijit Waman Phatak1 and Amar Vir Singh2,3

 1Department of Computer Engineering and Computer Science, University of Louisville, Louisville, KY, USA; 2Department of Molecular, Cellular,
  and Craniofacial Biology, University of Louisville, Louisville, KY, USA; 3Department of Botany and Industrial Microbiology, JV College, CCS
       University, Meerut, UP, India; Eric Rouchka* - E-mail: eric.rouchka@louisville.edu; Phone: 502-852-1695; Fax: 502-852-4713;
                                             * Corresponding author

                      received January 06, 2008; revised June 17, 2008; accepted July 03, 2008; published July 14, 2008

Abstract:
Microarrays provide a means of studying expression level of tens of thousands of genes by providing one or more oligonucleotide
probe(s) for each transcript studied. Affymetrix® GeneChip™ platforms historically pair each 25-base perfect match (PM) probe
with a mismatch probe (MM) differing by a complementary base located in the  13th position to quantify  and deflate effects of cross-
hybridization. Analytical routines for analyzing these arrays take into account difference in expression levels of MM and PM probes
to determine which ones are useful for further study. If a single nucleotide polymorphism (SNP) occurs at the 13th base, a probe
with a higher MM expression level may be incorrectly omitted. In order to examine SNP effects on PM and MM expression levels,
known human SNPs from dbSNP were mapped to probe sets within the Affymetrix® HG-U133A platform. Probe sets containing
one or more probe pairs with a single SNP at the 13th position were extracted. A set of twelve microarray experiments was
analyzed for the PM and MM expression levels for these probe sets. Over 6,000,000 human SNPs and their flanking regions were
extracted from dbSNP. These sequences were aligned against each of the 247,965 probe pair sequences from the Affymetrix® HG-U133A
platform. A total of 915 probe sets containing a single probe sequence with a SNP mapped to the 13th base were extracted.
A subset containing 166 probe sets results in complementary base SNPs. Comparison of gene expression levels for the SNP to
non-SNP PM and MM probes does not yield a significant difference using χ2 analysis. Thus, omission of probes with MM expression
levels higher than PM  expression levels does not appear to result in a loss of information concerning SNPs for these regions.

Keywords: Affymetrix HG-U133A; single nucleotide polymorphism; microarray; probe; mismatch
Background:
Microarray technology
Technological  breakthroughs within  the  past  couple  of
decades  have  changed the face  of molecular biology by
allowing researchers to generate large volumes of biologically
relevant data in a short period of time. These advances have
led to the "omics"  era of research [1]  marked by genomics
(study of genomes), proteomics (study of protein expression),
cellomics (study of the cell), and transcriptomics (study of
transcribed regions). Transcriptomics has been aided by the
invention of the microarray [2] which allows researchers to
study patterns of gene expression across tens of thousands of
genes   simultaneously.   Major   companies   providing
commercial solutions include Affymetrix®, Agilent®,  and
CodeLink®. Each of these approaches  provides one or more
short oligonucleotide probe(s) sequence complementary to the
product of the transcript of interest.

The Affymetrix® oligonucleotide platforms are constructed to
allow multiple oligonucleotide probes  per probe set,  where
each  probe set represents  a single  gene or transcript. For the
HG-U133A platform for studying human transcripts, there are
over  22,000 different probe sets represented, with each non-
control probe  set  containing  11  25-base  oligonucleotide
sequence probes [3]. In order to help quantify and control the
ISSN  0973-2063                                       405
Bioinformation 2(9): 405-411 (2008)
              effects of cross-hybridization,  the  Affymetrix®  approach
              groups probes into pairs consisting of a perfect match probe
              (PM) and a mismatch probe (MM). The perfect match probe
              is a 25 base oligonucleotide complementary to the transcript
              and the mismatch probe is the same as the PM with the
              exception that the 13th base is complementary to the
              corresponding position in the PM probe. For example, one of the
              eleven probe pairs for the 206055_s_at probe set is as follows:
              PM GCACAGCTTGCAAAGGATATTGCCA
              MM GCACAGCTTGCATAGGATATTGCCA
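
The construction of a mismatch probe from a perfect match probe can be written out as a short Python sketch. The function name and the length check are choices made for this illustration; only the rule itself (complement the 13th base of a 25-base probe) and the example sequences come from the text above.

```python
# Watson-Crick complements for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def mismatch_probe(pm, position=13):
    """Return the MM probe: the PM probe with one base (1-based position) complemented."""
    if len(pm) != 25:
        raise ValueError("expected a 25-base probe")
    index = position - 1  # convert to 0-based indexing
    return pm[:index] + COMPLEMENT[pm[index]] + pm[index + 1:]


if __name__ == "__main__":
    pm = "GCACAGCTTGCAAAGGATATTGCCA"
    print("PM", pm)
    print("MM", mismatch_probe(pm))
```

Applied to the PM sequence quoted above, the function reproduces the quoted MM sequence, with the A at position 13 replaced by T.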

              Figure 1  shows an example of the expression levels of the
              MM and PM probes for three Affymetrix® probe sets found
              within the HG-U133A platform.

              Since  mismatch  data  allows  for   detection of  cross-
              hybridization, a probe set could be selected for inclusion or
              exclusion based on the corresponding match/mismatch values.
              For the probe set 206055_s_at in Figure 1, each of the probe
              pairs could be used since the expression values of the match is
              consistently  higher than the  value  of  the corresponding
              mismatch probe located directly  below. However, for probe
              set 219820_at in Figure 1, the fifth match/mismatch pair from
                                                                             Bioinformation, an open access forum
                                                                     © 2008 Biomedical Informatics Publishing Group
the left is potentially excluded since the mismatch expression
value is much greater than the match expression value. This
probe pair would be excluded since the resulting difference
in expression level is thought to be due to cross-hybridization.
[Figure 1 image: MATCH and MISMATCH expression-level rows for Affymetrix IDs 206055_s_at, 208913_at and 219820_at]
Figure 1: Affymetrix probe set pair expression levels. Shown are the expression levels for three separate Affymetrix probe sets
represented on a 0-255 color scale.  Each probe set contains eleven probe pairs, with each pair represented by a match and
mismatch sequence.
Single nucleotide polymorphisms (SNPs)
Single nucleotide polymorphisms (SNPs, often pronounced as
snips) are single  nucleotide base differences in a specific
position of genomic DNA between two individuals of
the same species. SNPs are the most common form of genetic
variation  that  helps to  differentiate  individuals  in  a
population.  A number of diseases  and  abnormalities,
including sickle cell anemia [4], cystic fibrosis [5], muscular
dystrophy [6], type II diabetes [7], and migraine headaches
[8] are influenced by the presence of SNPs occurring within
gene coding regions.

The  rate  of occurrence of SNPs in the human genome is
around one every  100 to 300 base pairs. The National Center
for Biotechnology Information (NCBI) maintains a publicly
available  database of annotated human  SNPs, known as
dbSNP [9]. The current build of dbSNP (build 127) contains
nearly 12 million annotated human SNPs.

While SNPs are important in disease association studies, their
presence becomes problematic for genome-wide analysis. As
an example, one of the difficulties with the Affymetrix®
microarray platforms is that each chip is designed to
be representative of all individuals within an organism-level
classification. However, with the high frequency of SNPs, it is
              possible that a SNP locus is found within a particular probe
              sequence. This becomes especially problematic if the  locus
              corresponds to the 13th base pair, and the SNP variant is the
              complementary base. Such a case  would  result in a higher
              hybridization rate for the mismatch probe as opposed to the
              match probe.
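
To make this scenario concrete, the following minimal Python sketch builds a mismatch probe by complementing the 13th base of a 25-base perfect match probe, then counts how many bases each probe differs from a variant allele. The probe sequence is borrowed from Table 1 purely for illustration. The function names are hypothetical, and for simplicity the variant is written in the same orientation as the probe rather than as its hybridization target.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def mismatch_probe(pm, pos=13):
    """Return the MM probe: the PM probe with base `pos` (1-based) complemented."""
    i = pos - 1
    return pm[:i] + COMPLEMENT[pm[i]] + pm[i + 1:]

def count_differences(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

pm = "CACCCAGCTGGTCCTGTGGATGGGA"   # 25-base perfect match probe (from Table 1)
mm = mismatch_probe(pm)            # mismatch probe designed from it

# Hypothetical individual whose allele at base 13 is the complementary base.
variant = pm[:12] + COMPLEMENT[pm[12]] + pm[13:]

print(count_differences(pm, variant))  # PM probe vs. variant allele
print(count_differences(mm, variant))  # MM probe vs. variant allele
```

Under these assumptions the MM probe is identical to the variant allele while the PM probe differs at one base, which is the situation in which MM hybridization could exceed PM hybridization.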

              The independent and dependent effects of both SNPs and copy
              number variants (CNVs) on gene expression are known
              to be an issue when studying microarrays [10]. The
              development of SNP chips [11] has made it possible to
              genotype SNPs and has led to the real possibility of Whole
              Genome Association Studies (WGAS). However, a large
              number of existing gene expression studies using microarray
              probe technology might label certain probes for exclusion
              because of higher MM hybridization rates that are actually
              due to the presence of complementary SNPs.

              In order to test for the effect of SNPs on probe hybridization,
              we looked at all 247,965 match probes within the
              Affymetrix® HG-U133A platform and compared them against
              dbSNP to see which probes contained SNP loci.
              For those where the sole SNP locus was found at the 13th base,
              we compared the expression levels of the PM probe to the
              MM probe for each of the probes within the probe set. We
specifically wanted to see if the MM probe with the SNP at
the 13th base varied more than those probes that  did  not
contain any SNP loci.

Methodology:
Data acquisition
For the purposes of this study, three key data components
were required: human genomic data, human SNP data, and
probe sequence data. Human genomic sequence was obtained
from the University of California-Santa Cruz's goldenpath
web site (http://goldenpath.ucsc.edu) [12, 13] for the hg17
build of the human genome. The resulting data were contained
in 27 files. Human SNPs were downloaded from build 124 of
the dbSNP database [14] maintained by the National Center
for Biotechnology Information (NCBI). These data are themselves
based on build 33 of the NCBI human genome. Probe
sequence data representing the 247,965 twenty-five base
perfect match oligomers from the HG-U133A microarray
manufactured by Affymetrix® were downloaded from the
NetAffx utility on the Affymetrix® web site
http://www.affymetrix.com/.

Expression  levels of probe sequences containing SNPs were
compared within a set of twelve samples for the GEO [15]
record GDS1758. This dataset originates from a study on the
developmental  pathway  involved in  pterygium, an ocular
surface disorder within a  sample set of Chinese patients [16].
Rather than focus on a large dataset with a mixture of patients
from different  ethnicities, this  smaller  dataset was chosen
from a single ethnic  group so that ethnic specific major and
minor allele frequencies could be determined. The individual
CEL    files    used    are    labeled    GSM48026.CEL,
GSM48027.CEL,    GSM48028.CEL,    GSM48029.CEL,
GSM48030.CEL,    GSM48031.CEL,    GSM48032.CEL,
GSM48033.CEL,    GSM48034.CEL,    GSM48035.CEL,
GSM48036.CEL, and GSM48037.CEL.

Preprocessing
Perfect match probe sequences for the HG-U133A platform
were stored in a tab-delimited format with information
concerning the probe set name, probe x position, probe y
position, interrogation position, probe sequence, and
strandedness (Table 1 under supplementary material). The
resulting tab-delimited files were parsed using Perl scripts to
reconstruct sequence files in FASTA format for sequence
comparison.
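
The conversion described above can be sketched as follows. This is a minimal Python illustration, not the authors' Perl script: it assumes six tab-separated columns in the order shown in Table 1 and no header row, and the FASTA header layout (`probe_set:x:y`) is a hypothetical naming choice.

```python
import csv
import io

def probes_to_fasta(tab_text):
    """Convert tab-delimited probe records into FASTA-formatted text."""
    records = []
    for row in csv.reader(io.StringIO(tab_text), delimiter="\t"):
        if len(row) != 6:
            continue  # skip blank or malformed lines
        probe_set, x, y, interrogation, sequence, strand = row
        # Hypothetical header layout; x/y make each probe name unique.
        header = f">{probe_set}:{x}:{y} pos={interrogation} strand={strand}"
        records.append(f"{header}\n{sequence}")
    return "\n".join(records) + "\n"

# Two records taken from Table 1.
sample = (
    "1007_s_at\t467\t181\t3330\tCACCCAGCTGGTCCTGTGGATGGGA\tAntisense\n"
    "1007_s_at\t531\t299\t3443\tGCCCCACTGGACAACACTGATTCCT\tAntisense\n"
)
fasta = probes_to_fasta(sample)
print(fasta)
```

Including the x and y positions in the header keeps the eleven or more probes of a single probe set distinguishable once they are used as query sequences.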

Sequences originating from the dbSNP database each
represent a single instance of a known SNP denoted by the
standard IUPAC-IUB code [17]. dbSNP sequences typically
range from a few hundred to a few thousand bases in length.
For the purpose of our study we were only interested in the
sequence immediately surrounding the SNP for alignment
with the twenty-five base oligomer sequence from the
HG-U133A microarray. A Perl script was created to extract a
forty-nine base segment from each dbSNP sequence, spanning
twenty-four bases upstream and downstream of the SNP
location, when available. In some cases, the allele position or
               length of the original sequence did not allow for all of the
               bases to be extracted.
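
A minimal Python sketch of this windowing step is given below. It assumes the SNP position is known as a 0-based index into the dbSNP sequence; the function name and the toy sequences are hypothetical. When the SNP lies near either end of the sequence, the segment is simply truncated, as described above.

```python
def snp_window(sequence, snp_index, flank=24):
    """Return (segment, offset): up to `flank` bases on each side of the SNP,
    plus the 0-based position of the SNP inside the returned segment."""
    start = max(0, snp_index - flank)
    end = min(len(sequence), snp_index + flank + 1)
    return sequence[start:end], snp_index - start

# Toy sequence with an IUPAC code (R = A or G) marking the SNP at index 30.
full = "A" * 30 + "R" + "C" * 30
segment, offset = snp_window(full, 30)
print(len(segment), offset)

# SNP close to the 5' end: fewer than 24 upstream bases are available.
short = "ACGT" + "Y" + "G" * 40
segment_short, offset_short = snp_window(short, 4)
print(len(segment_short), offset_short)
```

Returning the offset alongside the segment preserves the location of the SNP within the segment, which is later needed to decide whether the SNP falls on the 13th base of an aligned probe.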

               The original downloaded dbSNP sequences were soft-masked
               for low-complexity regions and tandem repeats. While this
               can be  beneficial in  order  to  remove regions  of low
               significance and to avoid spurious sequence hits, our study
               required us  to unmask  the source data in order to  produce
               exact alignments with microarray probe sequences. Sequences
               were  thus  restored  to  their original format for  sequence
               alignment purposes. The resulting unmasked data was verified
               by comparison between  the original and truncated data.

               Sequence Alignment
               Alignments between the microarray oligomer probes and the
               dbSNP sequences were  performed using  the nucleotide-
               nucleotide comparison tool wublastn from the WU-BLAST
               2.0 suite of programs  [18, 19].  The dbSNP database was
               formatted into  a BLASTable  database using the  xdformat
               utility, leaving the microarray probes as the query sequences.
               Since the sequences were expected to be exact matches with
               the exception of any SNPs present, ungapped alignments were
               performed.  This has the  additional benefit  of decreasing
               search time. A word size of eight was  used to allow for
               alignments with up to  two mismatches  within a  25  base
               alignment.  A  score  cutoff of 95 was used  to allow for  a
               combination of two mismatches/gaps  within a  25  base
               alignment   using  a   scoring   scheme  of +5/-4   for
               matches/mismatches and  -10  for  gap  open penalty. The
               remaining parameters   were set  at the  default values.  In
               summary, the  blast command  line was  as  follows: blastn
                  -nogaps  -S=95  -W=8.  To  further
               maintain the focus of the project, the parameterized wublastn
               results were filtered through a Perl script  to only store those
               alignments that were at least 22 bases in length.
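
The parameter choices above can be checked with a short calculation. The Python sketch below does two things. It enumerates every placement of two mismatches in a 25-base ungapped alignment to find the shortest possible "longest exact run", which is what a word size of eight must be able to seed. It also evaluates the +5/-4/-10 scoring scheme for a few cases. Reading the cutoff of 95 as 23 matches plus two gap-open penalties is an interpretation, not something stated in the text, and the sketch does not model how WU-BLAST extends or trims segments.

```python
from itertools import combinations

MATCH, MISMATCH, GAP_OPEN = 5, -4, -10
PROBE_LEN, WORD_SIZE, SCORE_CUTOFF = 25, 8, 95

def longest_exact_run(mismatch_positions, length=PROBE_LEN):
    """Longest run of consecutive matching bases, given 0-based mismatch positions."""
    edges = [-1] + sorted(mismatch_positions) + [length]
    return max(b - a - 1 for a, b in zip(edges, edges[1:]))

# Worst case over every placement of two mismatches in a 25-base alignment.
worst_run = min(longest_exact_run(p) for p in combinations(range(PROBE_LEN), 2))

def ungapped_score(matches, mismatches):
    """Raw score of an ungapped alignment under the +5/-4 scheme."""
    return matches * MATCH + mismatches * MISMATCH

perfect = ungapped_score(25, 0)
two_mismatches = ungapped_score(23, 2)
two_gap_opens = 23 * MATCH + 2 * GAP_OPEN  # assumed reading of the cutoff

print(worst_run, perfect, two_mismatches, two_gap_opens)
```

Because 23 matching bases split into at most three runs must include a run of at least eight, any 25-base alignment with two mismatches contains an exact word of length eight that can seed the search.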

               Parsing and storing the results
               The wublastn searches were conducted chromosome-wise,
               keeping the structure of the source data intact. The wublastn
               output was piped through Perl scripts to filter out the basic
               statistical information required for a database table. A Perl
               script incorporating BPlite [20] was used to further parse the
               output and store alignments of 22 or more bases in plain text
               files, with the following information:
                   1.   Reference  numbers   of  the  query and  target
                       sequences.
                   2.   Sequence locations on the microarray and on the
                       dbSNP and genomic databases.
                   3.   Location of the SNP within a dbSNP segment.
                   4.   Lengths of the query and target sequences.
                   5.   Start and end positions of the alignments found.
                   6.   Aligned segment pairs.
                   7.   Alignment string itself,  which is a  means  of
                       depicting the  matched and mismatched  base  pairs
                       within a sequence.   Matching  pairs have  a  '|'
                       between them, mismatches remain blank and where
                       a base in the query sequence matches any one of the
                       possible variations of the SNP, a plus sign ('+') is
                       used to show this 'partial' match.
                   8.   Length of the alignment found.
                   9.   Number of matches within the sequence.
                   10.  Percent identity of the matches (number of matches
                        divided by the alignment length).
                   11.  Raw score of the alignment from the standard
                        scoring scheme of wublastn.
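
The alignment string of item 7 and the percent identity of item 10 can be illustrated with a small Python sketch. The query is assumed to contain only A, C, G and T, while the target (dbSNP segment) may contain an IUPAC ambiguity code at the SNP. The function names and the toy sequences are hypothetical, and counting only '|' positions as matches for percent identity is an assumption.

```python
# IUPAC ambiguity codes and the bases each one stands for.
IUPAC = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def alignment_string(query, target):
    """'|' for an exact match, '+' when the query base is one of the SNP
    variants encoded by the target's IUPAC code, ' ' for a mismatch."""
    marks = []
    for q, t in zip(query, target):
        if q == t:
            marks.append("|")
        elif q in IUPAC.get(t, ""):
            marks.append("+")
        else:
            marks.append(" ")
    return "".join(marks)

def percent_identity(marks):
    """Exact matches divided by alignment length, as a percentage."""
    return 100.0 * marks.count("|") / len(marks)

query = "ACGTACGTAC"
target = "ACGTRCGTTC"   # R = A or G (partial match); T vs. A is a mismatch
marks = alignment_string(query, target)
print(query)
print(marks)
print(target)
print(percent_identity(marks))
```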
[Figure 2 plot: y-axis, percentage of observations of MM > PM; x-axis, number of MM > PM (0 to 12); series: ALL-NoSNP, ALL-SNP, COMPLEMENTARY NoSNP, COMPLEMENTARY SNP]
Figure 2: Percentage of observations of mismatch probe expression greater than perfect match probe expression.
A MySQL database was created to store the parsed results.
The database  schema consists of six  tables.  One  of these
tables captures the database information, and a second is used
for the organism records. The remaining four tables captured
alignment  data: one to  hold the identification  for  the
microarray  probes;  a  second  to  hold the identification
information for the segments from the dbSNP files; a third to
store the alignment data  such  as  the  source and target
alignment strings; and the fourth to store the statistical data
corresponding to the alignments. A shared key field was
generated for each of the tables using  the chromosome and
alignment number.
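
As a rough illustration of how a shared key can tie the four alignment tables together, the sketch below uses Python's built-in sqlite3 module in place of MySQL so that it is self-contained. All table names, column names, the key format and the inserted values are hypothetical; the source gives only the number and purpose of the tables. The two remaining tables (database information and organism records) are omitted.

```python
import sqlite3

def make_key(chromosome, alignment_number):
    """Hypothetical shared key built from chromosome and alignment number."""
    return f"{chromosome}_{alignment_number}"

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE probe     (align_key TEXT PRIMARY KEY, probe_id TEXT);
CREATE TABLE snp       (align_key TEXT PRIMARY KEY, snp_id TEXT);
CREATE TABLE alignment (align_key TEXT PRIMARY KEY, query_str TEXT, target_str TEXT);
CREATE TABLE stats     (align_key TEXT PRIMARY KEY, aln_length INTEGER, n_matches INTEGER);
""")

# Placeholder values for a single alignment record.
key = make_key("chr1", 42)
conn.execute("INSERT INTO probe VALUES (?, ?)", (key, "1007_s_at"))
conn.execute("INSERT INTO snp VALUES (?, ?)", (key, "rs_example"))
conn.execute("INSERT INTO alignment VALUES (?, ?, ?)", (key, "ACGT", "ACGR"))
conn.execute("INSERT INTO stats VALUES (?, ?, ?)", (key, 25, 24))

# The shared key lets one query pull together all parts of an alignment.
row = conn.execute("""
SELECT p.probe_id, s.snp_id, st.n_matches
FROM probe p
JOIN snp s USING (align_key)
JOIN stats st USING (align_key)
""").fetchone()
print(row)
```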

Discussion:
SNP and probe alignments
Over six million ungapped alignments were found between
microarray probes and SNP segments. These resulted in a
total of 45,984 perfect match sequences between the probes
and SNP segments. An additional 1,656 probe sequence
alignments result in a mismatch nucleotide in the 13th base.
Further filtering yields 915 alignments where the probe
sequence contains only a single SNP, and that probe is the
only one within its corresponding probe group to contain any
known SNPs. Of these 915, a subset of 166 results in a
                complementary base mismatch. Bearing in mind that
                Affymetrix® microarrays have pairs of probes where one half
                of the pair has the complement of the other's 13th base, these
                probes were marked for further analysis. A total of 58,505
                sequences with a single mismatch were detected but not
                included in our analysis.

                The 166 alignments resulting in a single, complementary base
                mismatch originate from unique probe sets. Probe pairs
                containing a region where a SNP is present have the potential
                to show a higher expression level in the mismatch probe than in
                the match probe, depending upon the individual's genotype. To
                test whether this was the case, each of the eleven probe
                pairs within each corresponding probe set was compared to
                see how frequently the match expression level was greater
                than the mismatch expression level, both in probe pairs
                without a SNP and in probe pairs where a SNP is mapped
                to the 13th position. Expression data were obtained using CEL
                files for 12 different experiments as discussed in the
                Methodology section. The resulting data set yields 1670
                non-SNP containing probes and 166 SNP containing probes. For
                the complete set of 1836 probes, the number of times that the
                mismatch expression level was higher than the match
                expression level was reported. Table 2 (see supplementary
material) is constructed from this dataset, noting the number
of times that the mismatch probe expression level is observed
to be greater than the match probe expression level.

Analysis
Table 2 (see supplementary material) indicates that MM probe
expression exceeding PM probe expression occurs with
less frequency in the probes with a SNP in the 13th base than
it does in probes without SNPs, contradicting our hypothesis. In
about 52% of probes without SNPs, the perfect match probe
expression level is always greater than the mismatch probe
expression level, and this occurs at approximately the same
rate (53%) in probes with a SNP in the 13th base. One
interesting observation is that the mismatch probe expression
level is always greater than the match expression level around
9% to 10% of the time in both groups. A graph of the frequency
of these events is shown in Figure 2. The numbers of
observations from the SNP and nonSNP data were compared using
χ² analysis. The resulting χ² value of 6.0108 with 12 degrees
of freedom has a p-value of 0.9155, so the null hypothesis that
the SNP and nonSNP distributions are the same cannot be rejected.
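
The test statistic can be recomputed from the counts in Table 2 with a short Python sketch. It uses the standard Pearson chi-square statistic for a 2 x 13 contingency table; whether the authors computed it exactly this way is an assumption, although the result should come out close to the reported value. Converting the statistic to a p-value requires the chi-square distribution function, which is left out to keep the example dependency-free.

```python
# Counts from Table 2: number of probes with 0..12 observations of MM > PM.
no_snp = [873, 123, 92, 64, 54, 60, 46, 34, 48, 39, 37, 54, 146]
snp = [88, 14, 8, 7, 2, 5, 5, 4, 3, 4, 6, 3, 17]

def chi_square(rows):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand = sum(row_totals)
    stat = 0.0
    for r, row in enumerate(rows):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / grand
            stat += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, dof

stat, dof = chi_square([no_snp, snp])
print(f"chi-square = {stat:.4f}, degrees of freedom = {dof}")
```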

Minor allele frequencies
The 166 unique complementary SNPs have been mapped
according to known SNPs from the dbSNP database, which
contains SNPs for all population types. However, since the
experiments selected are focused on the HapMap Han
Chinese in Beijing (HCB) population, it is possible the
observed results are skewed according to population-specific
SNPs. Each of the 166 dbSNP references was searched
against HapMap using HapMart release 23a. Sixty-four of the
166 SNP containing probes have been genotyped for the HCB
group by the HapMap project, each with between 72 and 90
allelic observations. However, only 20 of these have a minor
allele frequency of 5% or greater (Table 3 in supplementary
material). The SNP and nonSNP probe groups for this set of 20
SNPs were compared as previously discussed. A comparison
of the number of times the MM probe was greater than the
PM probe is given in Table 4 (shown under supplementary
material) for these 20 SNPs. Since the number of observations
is low, Fisher's exact test was performed on the SNP and
nonSNP groups, resulting in a p-value of 0.04113. This p-value
is much lower than before and indicates a significant
difference in the distributions at a p-value threshold of
0.05. This suggests that, with more observations, it could be
possible to distinguish differences in MM and PM arising from
allelic variation from those arising from cross-hybridization.

Conclusion:
The recent publications of the complete diploid genomes of
two individual humans indicate that the rate of SNP variation
within an individual is much larger than previously expected
[21]. The higher rate of variation in zygosity presents an
issue when looking at gene expression. Our hypothesis states
that we would expect to see higher hybridization rates for
mismatch probes in regions where a SNP is found in the 13th
base of a probe sequence. However, initial results on twelve
microarray experiments illustrate this is not the case, and in
fact, the opposite is true. Further analysis of the samples used,
including genotyping information, would be useful in
determining whether these discrepancies result from the frequency
of certain haplotypes within a population.

               When known haplotype frequencies are considered, it is still
               difficult to differentiate between true SNPs and cross-
               hybridization, although the distributions are more distinct. Part
               of this inability may be due to the low number of SNPs (20)
               falling into this category. As more haplotype frequency
               information becomes available for all 166 candidate SNPs
               through the HapMap project, it may become possible to
               differentiate between SNP effects and cross-hybridization.
               Additional haplotype information for the other HapMap
               populations may reveal additional alleles with higher minor
               allele frequencies.

               The  ability  to discern  between  cross-hybridization and
               infrequent  SNPs based on PM  and MM data is difficult at
               best.  SNPs remain a tricky  issue when microarray  probe
               design is considered. It is our conclusion that information is
               not lost when these probes are discarded, since the source of
               the discrepancy cannot be consistently determined.

               Acknowledgment:
               Support for this project was provided by NIH-NCRR grant
               P20RR16481 and NIH-NIEHS grant P30ES014443. The
               contents of this manuscript are solely the responsibility of the
               authors and may not represent the official views of the
               National Center for Research Resources, the National Institute
               of Environmental Health Sciences, or the National
               Institutes of Health. ECR and AWP contributed equally to this
               project. The authors would like to thank the University of
               Louisville Bioinformatics Research Group (BRG) and the
               University of Louisville Bioinformatics Laboratory for
               numerous fruitful discussions.

               References:
[01]   B. Palsson, Nat. Biotechnol., 20: 649 (2002) [PMID: 12089538]
[02]   M. Schena et al., Science, 270: 467 (1995) [PMID: 7569999]
[03]   http://www.affymetrix.com/
[04]   J. C. Chang and Y. W. Kan, Lancet, 2: 1127 (1981) [PMID: 6118575]
[05]   E. Mateu et al., Am. J. Hum. Genet., 68: 103 (2001) [PMID: 11104661]
[06]   M. Koenig et al., Am. J. Hum. Genet., 45: 498 (1989) [PMID: 2491009]
[07]   N. Vionnet et al., Nature, 356: 721 (1992) [PMID: 1570017]
[08]   M. Wessman et al., Am. J. Hum. Genet., 70: 652 (2002) [PMID: 11836652]
[09]   S. T. Sherry et al., Nucleic Acids Res., 29: 308 (2001) [PMID: 11125122]
[10]   B. E. Stranger et al., Science, 315: 848 (2007) [PMID: 17289997]
[11]   J. B. Fan et al., Genome Res., 10: 853 (2000) [PMID: 10854416]
[12]   W. J. Kent and D. Haussler, Genome Res., 11: 1541 (2001) [PMID: 11544197]
[13]   W. J. Kent et al., Genome Res., 12: 996 (2002) [PMID: 12045153]
[14]   S. T. Sherry et al., Genome Res., 9: 677 (1999) [PMID: 10447503]
[15]   T. Barrett et al., Nucleic Acids Res., 35: D760 (2007) [PMID: 17099226]
[16]   Y. W. Wong et al., Br. J. Ophthalmol., 90: 769 (2006) [PMID: 16488932]
[17]   IUPAC-IUB Commission on Biochemical Nomenclature (CBN), J. Mol. Biol., 55: 299 (1971) [PMID: 5551389]
[18]   S. F. Altschul et al., J. Mol. Biol., 215: 403 (1990) [PMID: 2231712]
[19]   E. C. Rouchka, Conversation with: W. Gish (2004)
[20]   E. C. Rouchka, Conversation with: I. Korf (1999)
[21]   S. Levy et al., PLoS Biol., 5: e254 (2007) [PMID: 17803354]

Edited by S. Datta
Citation: Rouchka et al., Bioinformation 2(9): 405-411 (2008)
License statement: This is an open-access article, which permits unrestricted use, distribution, and reproduction in
any medium, for non-commercial purposes, provided the original author and source are credited.

Supplementary material
Probe set    Probe X   Probe Y   Probe interrogation   Probe sequence              Target
name                             position                                          strandedness
1007_s_at    467       181       3330                  CACCCAGCTGGTCCTGTGGATGGGA   Antisense
1007_s_at    531       299       3443                  GCCCCACTGGACAACACTGATTCCT   Antisense
1007_s_at    86        557       3512                  TGGACCCCACTGGCTGAGAATCTGG   Antisense
1007_s_at    365       115       3563                  AAATGTTTCCTTGTGCCTGCTCCTG   Antisense
1007_s_at    207       605       3570                  TCCTTGTGCCTGCTCCTGTACTTGT   Antisense
1007_s_at    593       599       3576                  TGCCTGCTCCTGTACTTGTCCTCAG   Antisense
1007_s_at    425       607       3583                  TCCTGTACTTGTCCTCAGCTTGGGC   Antisense
1007_s_at    552       101       3589                  ACTTGTCCTCAGCTTGGGCTTCTTC   Antisense
1007_s_at    680       607       3615                  TCCTCCATCACCTGAAACACTGGAC   Antisense
1007_s_at    532       139       3713                  AAGCCTATACGTTTCTGTGGAGTAA   Antisense
1007_s_at    143       709       3786                  TTGGACATCTCTAGTGTAGCTGCCA   Antisense
1007_s_at    285       623       3793                  TCTCTAGTGTAGCTGCCACATTGAT   Antisense
1007_s_at    383       479       3799                  GTGTAGCTGCCACATTGATTTTTCT   Antisense
Table 1: Sample source data from the HG-U133A microarray in tab-delimited format.

# observations   Probes without SNP                    Probes with SNP
MM>PM            No. of occurrence   % Occurrence      No. of occurrence   % Occurrence
0                873                 52.3%             88                  53.0%
1                123                 7.4%              14                  8.4%
2                92                  5.5%              8                   4.8%
3                64                  3.8%              7                   4.2%
4                54                  3.2%              2                   1.2%
5                60                  3.6%              5                   3.0%
6                46                  2.8%              5                   3.0%
7                34                  2.0%              4                   2.4%
8                48                  2.9%              3                   1.8%
9                39                  2.3%              4                   2.4%
10               37                  2.2%              6                   3.6%
11               54                  3.2%              3                   1.8%
12               146                 8.7%              17                  10.2%
Table 2: Mismatch to match expression level results.

Probe ID       dbSNP reference   Major allele   Frequency   Minor allele   Frequency
202192_s_at    Rs9545            G              0.932       C              0.068
207611_at      Rs200485          G              0.911       C              0.089
203680_at      Rs257378          C              0.9         G              0.1
201678_x_at    Rs10712           G              0.875       C              0.125
206529_x_at    Rs272679          C              0.872       G              0.128
210732_s_at    Rs2273865         T              0.852       A              0.148
219502_at      Rs1055677         C              0.849       G              0.151
215261_at      Rs12198616        G              0.844       C              0.156
206226_at      Rs1042464         T              0.756       A              0.244
216811_at      Rs11009339        C              0.714       G              0.286
215986_at      Rs219307          G              0.689       C              0.311
221344_at      Rs1011985         G              0.659       C              0.341
214836_x_at    Rs232230          G              0.655       C              0.345
209313_at      Rs8731            G              0.633       C              0.367
210618_at      Rs4654973         C              0.622       G              0.378
217530_at      Rs7447593         C              0.622       G              0.378
219093_at      Rs3755302         T              0.589       A              0.411
216463_at      Rs16849300        G              0.578       C              0.422
207075_at      Rs10754558        G              0.5         C              0.5
219424_at      Rs6613            A              0.5         T              0.5
Table 3: Complementary SNP probes with Minor Allele Frequency > 5%.

#MM>PM   Expected % of SNPs*   Observed SNP probes (%)   Observed nonSNP probes (%)
0        5%                    6 (30%)                   77 (38.3%)
1        35%                   0 (0%)                    23 (11.4%)
2        10%                   0 (0%)                    10 (5.0%)
3        30%                   1 (5%)                    12 (6.0%)
4        5%                    1 (5%)                    9 (4.5%)
5        10%                   2 (10%)                   10 (5%)
6        0%                    1 (5%)                    5 (2.5%)
7        0%                    1 (5%)                    8 (4.0%)
8        0%                    0 (0%)                    11 (5.5%)
9        0%                    0 (0%)                    3 (1.5%)
10       0%                    5 (25%)                   6 (3%)
11       0%                    1 (5%)                    9 (4.5%)
12       0%                    2 (10%)                   18 (9%)
Table 4: Probes with MM > PM for SNPs with Minor Allele Frequency > 5%.

-------
                                                Aquatic Toxicology 92 (2009) 168-178
Endocrine disrupting chemicals in fish: Developing exposure indicators and
predictive models of effects based on mechanism of action

Gerald T. Ankley a,*, David C. Bencic b, Michael S. Breen c, Timothy W. Collette d, Rory B. Conolly c,
Nancy D. Denslow e, Stephen W. Edwards f, Drew R. Ekman d, Natalia Garcia-Reyero e,1,
Kathleen M. Jensen a, James M. Lazorchak b, Dalma Martinovic a,2, David H. Miller g,
Edward J. Perkins h, Edward F. Orlando i, Daniel L. Villeneuve a, Rong-Lin Wang b, Karen H. Watanabe j
a USEPA, National Health and Environmental Effects Research Lab, Duluth, MN, United States
b USEPA, National Exposure Research Lab, Cincinnati, OH, United States
c USEPA, National Center for Computational Toxicology, RTP, NC, United States
d USEPA, National Exposure Research Lab, Athens, GA, United States
e University of Florida, Gainesville, FL, United States
f USEPA, National Health and Environmental Effects Research Lab, RTP, NC, United States
g USEPA, National Health and Environmental Effects Research Lab, Grosse Ile, MI, United States
h US Engineer Research and Development Center, Vicksburg, MS, United States
i University of Maryland, College Park, MD, United States
j Oregon Health and Science University, Beaverton, OR, United States
ARTICLE   INFO

Article history:
Received 28 August 2008
Received in revised form 28 January 2009
Accepted 31 January 2009

Keywords:
Fish
EDC
Toxicity
MOA
Model
Genomics
                                        ABSTRACT
Knowledge of possible toxic mechanisms (or modes) of action (MOA) of chemicals can provide valuable
insights as to appropriate methods for assessing exposure and effects, thereby reducing uncertainties
related to extrapolation across species, endpoints and chemical structure. However, MOA-based testing
seldom has been used for assessing the ecological risk of chemicals. This is in part because past regulatory
mandates have focused more on adverse effects of chemicals (reductions in survival, growth
or reproduction) than the pathways through which these effects are elicited. A recent departure from
this involves endocrine-disrupting chemicals (EDCs), where there is a need to understand both MOA
and adverse outcomes. To achieve this understanding, advances in predictive approaches are required
whereby mechanistic changes caused by chemicals at the molecular level can be translated into apical
responses meaningful to ecological risk assessment. In this paper we provide an overview and illustrative
results from a large, integrated project that assesses the effects of EDCs on two small fish models, the fathead
minnow (Pimephales promelas) and zebrafish (Danio rerio). For this work a systems-based approach
is being used to delineate toxicity pathways for 12 model EDCs with different known or hypothesized
toxic MOA. The studies employ a combination of state-of-the-art genomic (transcriptomic, proteomic,
metabolomic), bioinformatic and modeling approaches, in conjunction with whole animal testing, to
develop response linkages across biological levels of organization. This understanding forms the basis for
predictive approaches for species, endpoint and chemical extrapolation. Although our project is focused
specifically on EDCs in fish, we believe that the basic conceptual approach has utility for systematically
assessing exposure and effects of chemicals with other MOA across a variety of biological systems.
                                                               Published by Elsevier B.V.
  * Corresponding author.
    E-mail address: ankley.gerald@epa.gov (G.T. Ankley).
  1 Current affiliation: Jackson State University, Jackson, MS, United States.
  2 Current affiliation: University of St. Thomas, St. Paul, MN, United States.

0166-445X/$ - see front matter. Published by Elsevier B.V.
doi:10.1016/j.aquatox.2009.01.013

1. Background

   Prospective ecological risk assessments of most chemicals typically
are conducted with little consideration for toxic mechanisms
(or modes) of action (MOA). Testing for ecological effects usually
includes a wide array of species and endpoints, with a focus primarily
on apical responses. When little is known about the properties
of a test chemical, this is a pragmatic approach; however, substantial
benefits can be realized by basing testing and subsequent risk
management decisions on known or probable MOA. For example, a
priori knowledge of MOA can lead to identification of mechanism-based
(and, hence, stressor-specific) molecular indicators that can
potentially be linked to environmental concentrations and used to
inform exposure assessments. Furthermore, knowledge of MOA can
serve as a basis for effective extrapolation of biological effects across
species, biological levels of organization, and chemical structures.
This information can help identify potentially sensitive responses,
and even species prior to extensive testing, thereby optimizing time
and resource use (Bradbury et al., 2004).

[Figure 1 image: HPG axis compartments, including androgen/estrogen responsive tissues (e.g., liver, fatpad, gonads), with chemical "probes": Fipronil (-), Muscimol (+), Apomorphine (+), Haloperidol (-), Trilostane (-), Ketoconazole (-), Fadrozole (-), Prochloraz (-,-), Vinclozolin (-), Flutamide (-), 17β-Trenbolone (+), 17α-Ethinylestradiol (+)]
Fig. 1. Overview of the fish hypothalamic-pituitary-gonadal (HPG) axis, and experimental chemical "probes" with different mechanisms of action. The "+" or "-" shown in
parentheses indicate, respectively, stimulation or inhibition of a particular target (enzyme or receptor) by the test chemical. See text for further details. Depiction of the HPG
axis is adapted from Villeneuve et al. (2007b).

   Endocrine-disrupting chemicals (EDCs) represent a compara-
tively recent departure from past regulatory activities with toxic
compounds in that there is a need to know both MOA and potential
adverse effects. There have been several definitions of EDCs from a
MOA perspective, ranging from (in the most limited sense) chemi-
cals which are estrogenic (specifically, estrogen receptor agonists)
to (in the broadest sense) "an exogenous agent that interferes with
the production, release, transport, metabolism, binding, action, or
elimination of natural hormones in the body responsible for the
maintenance of homeostasis and the regulation of developmental
processes" (Kavlock et al., 1996). From a regulatory perspective, the
definition currently most widely used for EDCs encompasses agents
that cause alterations in reproduction or development through
direct effects on the vertebrate hypothalamic-pituitary-thyroidal
or hypothalamic-pituitary-gonadal (HPG) axes (USEPA, 1998).
   Due to the emphasis on MOA, consideration of EDCs in current
testing and regulatory frameworks has been challenging. For exam-
      ple, it is important that tests include responses other than apical
      endpoints if the assays are to be indicative of specific MOA. How-
      ever, mechanism-specific endpoints are not necessarily predictive
      of an adverse biological outcome, and it is problematic to inten-
      sively regulate a chemical that does not cause adverse effects even
      if it does, for example, activate the estrogen receptor (ER). A com-
      mon (and logical) approach to addressing this seeming dilemma
      has been the development of tiered testing frameworks that use
      short-term assays first to identify chemicals as possessing a MOA
      of concern before proceeding with longer-term tests better suited
      to quantifying adverse effects (e.g., USEPA, 1998). However, even
      relatively efficient tiered testing programs for EDCs may not be
      sustainable in terms of resources (or timeliness) if hundreds or
      thousands of chemicals need to be assessed using long-term assays.
         The efficiency of EDC testing programs could  be enhanced
      through the use of emerging technologies in the areas of genomics
      and computational biology to provide mechanistic  insights as to
      exposures and possible adverse effects in animals, such as fish (e.g.,
      Ankley et al., 2006; Hoffmann et al., 2006, 2008; Hook et al., 2006;
      Samuelsson et al., 2006; Filby et al., 2007; Martyniuk et al., 2007).
      This type of approach is consistent with recent recommendations
      from the National Research Council (NRC, 2007), who suggest a

-------
[Figure 2 graphic not reproduced. Recoverable labels: "Increasing Diagnostic (Screening) Utility" and "Increasing Ecological Relevance" across levels of biological organization (through individual and population); Phase 1: fathead minnow 21 d reproduction test (e.g., fecundity, histology, vitellogenin, sex steroids); a further phase box (label garbled) listing transcriptomics and metabolomics; computational modeling; population modeling.]
Fig. 2. Conceptual linkages across biological levels of organization for effects of endocrine-disrupting chemicals in fish. Different research phases of the project are aligned
to reflect where/how they address these linkages and arrows reflect the flow of information. See text for further details.
shift toward greater use of short-term (e.g., in vitro) assays and
predictive toxicology tools for assessment of human health risks of
chemicals. In this paper we describe a research effort to support the
development of approaches for assessing chemicals with the potential
to impact the HPG axis of fish. These approaches, which could
encompass techniques ranging from computational models to in
vitro assays and short-term in vivo tests, would help provide regu-
latory agencies throughout the world with cost-effective, predictive
tools for monitoring and testing EDCs.
   This is a large, highly-integrated project that includes govern-
ment, academic and industry scientists from several laboratories
across North America. In this paper, we  describe the conceptual
basis of the approach  we have employed and present illustrative
results. The information provided herein is necessarily brief; for
further detail on methods  and results, interested readers should
consult the indicated citations  or contact us  directly,  as many
aspects of the data collection/analyses are ongoing.

2. Experimental overview

   The basic approach used for our work involves perturbation
of the HPG axis with chemical probes known or hypothesized
to impact different key control points, ranging from neurotrans-
mitter receptors in the brain to  steroid hormone  receptors  in
gonads (Fig. 1). Following  perturbation of the  axis by chemicals
with different MOA, information is collected at multiple biological
levels of organization, ranging from molecular changes to api-
cal responses (i.e., reproductive success), and even (via modeling)
to likely  population-level effects (Fig. 2). This type  of integrated
analysis facilitates a mechanistic understanding of the effects of
HPG-active chemicals from the molecular to whole-organism levels
from a toxicity pathway perspective (Bradbury et al., 2004).
                        2.1. Organisms

                           The experimental organisms for this research are small fish.
                        There are several different EDC testing programs  being  imple-
                        mented throughout the world, and most include fish assays (Ankley
                        and Johnson, 2004). A pragmatic reason for this is that there are
                        clearly documented adverse impacts of EDCs on fish populations
                        in the field; this differs from the situation in humans where expo-
                        sure to, and  subsequent effects of environmental EDCs tend to be
                        more uncertain (WHO, 2002). In addition, in terms of animal avail-
                        ability (e.g., generation of large numbers of high-quality organisms
                        at suitable life-stages), chemical exposure dynamics and biological
                        flexibility, small fish species are well suited for mechanistic stud-
                        ies with chemicals such as EDCs (Stoskopf, 2001, and references
                        therein; Ankley and Johnson, 2004). Significantly, although there
                        are some unique aspects of fish reproductive endocrinology, the
                        basic structure and function of the HPG  axis across all vertebrates
                        tends to be well conserved. Hence, the results of fish studies with
                        EDCs potentially can serve as the basis for effective  cross-species
                        extrapolation of potential effects.
                           Our research  utilizes the zebrafish (Danio rerio) and fathead
                        minnow (Pimephales promelas), two small cyprinids that have com-
                        plementary  attributes that make them  useful for this work. The
                        genome of the zebrafish is fully sequenced, thus reducing bioinfor-
                        matic challenges when evaluating alterations in gene and protein
                        expression (Hill  et al., 2005). As such,  the zebrafish is a useful
                        model for exploratory or hypothesis-generating work focused on
                        the effects of EDCs with different MOA on response profiles of genes
                        and proteins (Hoffmann et al., 2006, 2008). In contrast to zebrafish,
                        the fathead  minnow has a rich history  of use in regulatory pro-
                        grams in the US, including testing for EDCs (Ankley and Villeneuve,
                        2006). In addition to its relevance to regulatory activities, a fair

-------
amount is known about basic reproductive biology in the fathead
minnow, thus providing a basis for "anchoring" observed alterations
in gene, protein or metabolite expression caused by test chemicals
to phenotypic changes in gonad histology and reproductive success.

2.2. Test chemicals

   Test  chemicals used for the  work include  those that (could)
impact HPG function relatively  "high" in  the  axis, such as mus-
cimol (a pharmaceutical) and fipronil (an insecticide) which act,
respectively, as an agonist and antagonist of specific GABA (gamma-
amino butyric acid)  receptors  (Fig. 1). The drugs apomorphine
and haloperidol act as an agonist and antagonist, respectively, of
dopamine receptors (D2) involved in the release of gonadotrophic
hormones from the  pituitary. The fungicides ketoconazole and
prochloraz, and the pharmaceuticals trilostane and fadrozole
inhibit one or more enzymes involved in steroid biosynthesis in the
gonad, including reactions catalyzed by 3β-hydroxysteroid
dehydrogenase (3βHSD) and different cytochromes P450 (CYPs). Finally,
several chemicals that directly impact hormone receptors located
in the gonad and other steroid-responsive tissues are being tested,
including 17α-ethinylestradiol and 17β-trenbolone, potent synthetic
steroidal agonists of the ER and androgen receptor (AR),
respectively, and vinclozolin (a fungicide) and flutamide (a phar-
maceutical), which antagonize the AR (Fig. 1). Although several of
these chemicals do occur in the environment as contaminants (e.g.,
the pesticides and synthetic  steroids), others are less likely to do
so (some of the drugs). Overall, our strategy in selection of test
chemicals was not necessarily to focus on known environmental
contaminants, but to perturb HPG pathways of known (or potential)
biological relevance.

2.3. Phased testing

   Sexual development, including gonad differentiation, during lar-
val and juvenile life-stages, and reproduction in mature adults offer
two "windows" of enhanced sensitivity of fish to EDCs  (Ankley
and Johnson, 2004).  For this research, the adult  life-stage was
chosen because we felt that the substantial alterations  in gene
and protein  expression and metabolite profiles that occur during
early development might complicate understanding of the effects
produced by the test chemicals. However, due to the known sen-
sitivity of fish to EDCs during sexual development, studies of the
type described herein encompassing this life-stage also would be
desirable.
   For our research, three different types of studies—termed Phases
1, 2 and 3—are conducted on adult fish exposed to the various HPG-
active chemicals (Fig. 2). In Phase 1 studies, each chemical is tested
in a standardized  21-d reproduction assay with the fathead min-
now using flow-through (water) exposures and measured chemical
concentrations to  produce a high-quality exposure/effects dataset.
Endpoints measured in the  Phase  1  studies span a wide range
of biological levels of  organization, including determination  of
plasma concentrations of sex steroids (testosterone, 17β-estradiol,
11-ketotestosterone) and vitellogenin (Vtg; egg yolk protein precursor),
gonad size and histopathology, secondary sex characteristics,
reproductive behavior, fecundity, fertility and hatchability (Ankley
et al., 2001). In addition, the Phase 1 studies incorporate analyses of
a small complement of genes (measured via quantitative real-time
polymerase  chain reaction;  PCR) known  to be involved in HPG
function/control, and  hypothesized to be impacted by the chemical
exposure (Villeneuve et al., 2007a,b). Information from the Phase
1 studies  is subsequently used for three primary purposes, to:
(a) aid in design (e.g., selection of  test chemical concentrations)
for subsequent, shorter-term Phase 2 and 3 assays, (b) generate
information  for systems and population modeling, and (c) provide
      a robust phenotypic dataset for anchoring the various genomic
      responses collected in subsequent testing (Fig. 2).
         Phase 2 tests are short-term assays conducted with zebrafish in
      which samples from multiple tissues (gonad, liver and brain) are
      collected after 1, 2 and 4 d of exposure to the test chemicals. The
      samples are used for genomic measurements, with an emphasis
      on  gene  expression  determined using  commercially-available
      22,000 or 4 x 44,000 gene microarrays (Agilent, Palo Alto, CA, USA;
      Wang et al., 2008a,b) as well as hypothesis-driven and microarray-
      confirmatory PCR analyses.  A subset of the zebrafish samples
      also are analyzed for alterations in protein expression using two-
      dimensional  (2-D)  Fluorescence Difference Gel  Electrophoresis
      (Ettan™ DIGE) technology (G.E. Healthcare Bio-Sciences Corp.,
      Piscataway, NJ, USA). Overall, information from the Phase 2 studies
      provides insights  on relationships  between gene  and protein
      expression, and  helps identify candidate indicators/markers of
      EDC exposure and effects for  subsequent evaluation in Phase  3
      fathead minnow studies focused on temporal changes in the HPG
      axis (Fig. 2).
         The HPG axis is a highly dynamic system capable of respond-
      ing to environmental stressors, including contaminants, through
      various feedback mechanisms to maintain conditions conducive to
      reproduction. These types of compensatory responses can occur
      both during exposure to the stressor, and after the stressor has been
      removed. This, coupled with the fact that changes in some end-
      points (e.g., gene expression) can be rapid and/or transitory, dictates
      a need for temporal studies to develop robust exposure indicators
      and predictive models. The Phase 3 studies in our research address
      this through systematic time-course experiments with the fathead
      minnow. In these studies, animals are sampled  after  1, 2, 4 and 8
      d of exposure to the various test chemicals, as well as 1, 2, 4 and 8
      d after cessation of exposure. A variety of endpoints are examined,
      including a subset of those considered in the Phase 1  studies such
      as plasma steroid and Vtg concentrations, gonad histopathology
      and secondary sexual characteristics. Gene expression is evaluated
      using both targeted (PCR) assays (e.g., Villeneuve et al., 2007a),
      and through  microarray analysis, using custom microarrays con-
      structed on an Agilent platform (N.  Denslow, unpublished data).
      Among other uses, the time intensive microarray analyses in Phase
      3 studies provide data for reverse engineering of transcriptional
      networks within the HPG axis  (di Bernardo et al.,  2005). Changes
      in biological networks can be particularly useful in discerning MOA
      and mechanisms of response and compensation.  Finally, Phase 3
      samples are used for examination of changes in protein expression
      (based on targets from the Phase 2 studies), and metabolomic anal-
      yses via nuclear-magnetic resonance (NMR) and mass spectroscopy
      (MS) techniques  (Ekman et al., 2007, 2008, in press). Information
      from Phase 3 serves a number of overall  purposes, including (a)
      directly identifying linkages between changes in gene, protein and
      endogenous metabolite profiles; (b) relating these genomic changes
      to apical endpoints such as histopathology; (c) evaluating the con-
      sistency in responses to EDCs across  species (zebrafish, fathead
      minnows) exposed under the same conditions; (d)  providing infor-
      mation as to temporal alterations in a stressed (and unstressed
      or recovering) system as a basis for modeling HPG axis function;
      (e)  evaluating the rapidity and persistence of potential indicator
      responses identified in earlier phases of testing; and (f) developing
      dynamic models to understand feedback control and compensation
      for  stress (Fig. 2).
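
The Phase 3 time-course design described above can be laid out as a simple sampling schedule. The sketch below is illustrative only: the names `SamplingPoint` and `phase3_schedule` are hypothetical, and it assumes that exposure stops on the last exposure sampling day (day 8), so that recovery days are counted from that point.

```python
from dataclasses import dataclass

# Sampling days taken from the text: 1, 2, 4 and 8 d of exposure,
# then 1, 2, 4 and 8 d after cessation of exposure.
EXPOSURE_DAYS = (1, 2, 4, 8)
RECOVERY_DAYS = (1, 2, 4, 8)


@dataclass(frozen=True)
class SamplingPoint:
    phase: str         # "exposure" or "recovery"
    day_in_phase: int  # day counted within the phase
    study_day: int     # day counted from the start of exposure


def phase3_schedule(exposure_length=max(EXPOSURE_DAYS)):
    """Build the list of sampling points.

    Assumption (not stated in the text): exposure ends on the last
    exposure sampling day, so recovery day d falls on study day
    exposure_length + d.
    """
    points = [SamplingPoint("exposure", d, d) for d in EXPOSURE_DAYS]
    points += [
        SamplingPoint("recovery", d, exposure_length + d) for d in RECOVERY_DAYS
    ]
    return points


schedule = phase3_schedule()
for point in schedule:
    print(f"{point.phase:9s} day {point.day_in_phase}: study day {point.study_day}")
```

Under that assumption the design yields eight sampling points spanning 16 study days, which makes explicit why both direct and compensatory responses can be observed in the same experiment.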

      3. Insights from experimental work

      3.1.   Phase 1

         Table 1 summarizes the fathead minnow 21-d reproduction tests
      that have been conducted to date and, where available, provides ref-

-------
Table 1
Overview of reproductive toxicity to the fathead minnow of chemicals with differing MOA in the hypothalamic-pituitary-gonadal (HPG) axis.

Test chemical      Presumptive HPG target(s)(a)   Reproduction LOEC(b)   Reference
Fipronil           GABA receptor antagonist       >5                     Kahl et al. (2007)
Muscimol           GABA receptor agonist          NC(c)
Apomorphine        D2 receptor agonist            NC
Haloperidol        D2 receptor antagonist         >20                    Villeneuve et al. (in preparation)
Trilostane         3βHSD inhibitor                1500                   Villeneuve et al. (2008)
Ketoconazole       CYP11A/CYP17 inhibitor         25                     Ankley et al. (2007)
Fadrozole          CYP19 inhibitor                2                      Ankley et al. (2002)
Prochloraz         CYP17/19 inhibitor             100                    Ankley et al. (2005)
Vinclozolin        AR antagonist                  60                     Martinovic et al. (2008)
Flutamide          AR antagonist                  500                    Jensen et al. (2004)
Trenbolone         AR agonist                     0.05                   Ankley et al. (2003)
Ethinylestradiol   ER agonist                     NC

  (a) Abbreviations used: GABA, gamma-amino butyric acid; D2, dopamine; 3βHSD, 3β-hydroxysteroid dehydrogenase; CYP11A, cytochrome P450scc (side-chain-cleavage);
CYP17, cytochrome P450c17,20-lyase; CYP19, cytochrome P450 aromatase; AR, androgen receptor; ER, estrogen receptor.
  (b) Lowest-observable effect concentration (LOEC) for egg production in 21-d tests. Values are nominal water concentrations provided in µg/L.
  (c) NC, not conducted/completed.
erence information for the completed studies. In terms of exposure
concentrations that cause impacts on reproductive health, the test
chemicals span a wide range of potency and efficacy, ranging from
trenbolone which significantly decreased egg production at a water
concentration of 0.05 µg/L (Ankley et al., 2003), to trilostane which
affected egg production at a concentration of 1500 µg/L (Villeneuve
et al., 2008). Some of the test chemicals (e.g., fipronil, haloperidol)
did not cause marked effects on reproductive endocrine function,
even  when tested at concentrations at maximum water  solubil-
ity, or within a factor of five  of those that produced toxicity in
short-term range-finding assays. When effects were observed, bio-
chemical and apical responses in the 21-d test generally reflected
the anticipated MOA of the test chemicals. For example, consistent
with  activation of the AR, trenbolone caused morphological mas-
culinization of female fathead  minnows, while the AR antagonist,
vinclozolin, demasculinized males (Ankley et al., 2003; Martinovic
et al., 2008). Although different enzymes were affected, inhibitors
of steroidogenesis (fadrozole, prochloraz, trilostane)  all decreased
Vtg concentrations in female fish due to a depression in synthesis
of estradiol (Ankley et al., 2002, 2005; Villeneuve et al., 2008).
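
The potency range discussed above can be checked directly from the Table 1 values. The following is a minimal sketch, not part of the original analysis: the dictionary simply transcribes the reproduction LOECs, and the helper `parse_loec` and its category labels are hypothetical. It reads a ">" entry as "no effect on egg production up to the stated concentration", which is our interpretation of the table.

```python
# Reproduction LOECs transcribed from Table 1 (nominal water
# concentrations, µg/L). "NC" = not conducted/completed.
TABLE_1_LOEC = {
    "fipronil": ">5",
    "muscimol": "NC",
    "apomorphine": "NC",
    "haloperidol": ">20",
    "trilostane": "1500",
    "ketoconazole": "25",
    "fadrozole": "2",
    "prochloraz": "100",
    "vinclozolin": "60",
    "flutamide": "500",
    "trenbolone": "0.05",
    "ethinylestradiol": "NC",
}


def parse_loec(raw):
    """Return (value, kind) for one table entry.

    kind is "loec" for a defined LOEC, "no_effect_up_to" for ">" entries
    (an interpretation, see lead-in), or "not_conducted" for "NC".
    """
    if raw == "NC":
        return None, "not_conducted"
    if raw.startswith(">"):
        return float(raw[1:]), "no_effect_up_to"
    return float(raw), "loec"


# Keep only chemicals with a defined LOEC.
measured = {}
for name, raw in TABLE_1_LOEC.items():
    value, kind = parse_loec(raw)
    if kind == "loec":
        measured[name] = value

most_potent = min(measured, key=measured.get)
least_potent = max(measured, key=measured.get)
fold_range = measured[least_potent] / measured[most_potent]

print(f"most potent:  {most_potent} ({measured[most_potent]} µg/L)")
print(f"least potent: {least_potent} ({measured[least_potent]} µg/L)")
print(f"fold range:   {fold_range:,.0f}")
```

Across the seven chemicals with a defined LOEC, the span from trenbolone to trilostane covers more than four orders of magnitude in nominal water concentration.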
   A  critical role of the Phase 1 studies in the  overall project is
delineation of hypothesized toxicity pathways across biological lev-
els of organization, such that  the Phase 2 and  3 transcriptomic,
proteomic and metabolomic data can be mechanistically linked to
higher-level apical responses. The fadrozole data provide a partic-
ularly good example of how this is achieved (Fig. 3). Fadrozole was
developed to treat breast cancer as a relatively specific inhibitor of
CYP19 aromatase, the enzyme  that catalyzes conversion of testos-
terone to estradiol. The pharmaceutical decreases brain and ovarian
aromatase  activity in vitro and in vivo in the fathead  minnow, and
produces a corresponding decrease in circulating plasma estradiol
concentrations in female fish  (Ankley  et al., 2002; Villeneuve et
                                      al., 2006). This, in turn, translates into a decreased circulating con-
                                      centration of Vtg (which is produced in the liver via activation of
                                      the ER) in the females and, ultimately, decreased deposition of the
                                      lipoprotein in the developing oocytes. This corresponds with signif-
                                      icant reductions in fecundity of fadrozole-exposed fish, resulting in
                                      complete cessation of egg production at higher fadrozole exposure
                                      concentrations (Fig. 3). As described in greater detail below,  these
                                      laboratory fecundity data can then be taken, via modeling, one step
                                      further to predict likely population-level responses of fish exposed
                                      to HPG-active chemicals (Fig. 2).
                                         In addition to providing  baseline effects and toxicity pathway
                                      data for the various chemicals, several other significant observa-
                                      tions have been made in the Phase 1 studies. One of these involves
                                      indirect changes in the HPG axis in response to certain EDCs. For
                                      example, evidence for at least some degree of compensation within
                                      the axis comes from 21-d tests with three of the test chemicals,
                                      trenbolone, vinclozolin and ketoconazole,  with the latter providing
                                      the most complete demonstration of the  phenomenon (Ankley et
                                      al., 2007). Ketoconazole is a pharmaceutical that decreases fungal
                                      growth through inhibition of an ergosterol (cell wall component)
                                      biosynthesis step catalyzed by CYP51. However, ketoconazole  is not
                                      particularly specific to CYP51, and can inhibit a variety of vertebrate
                                      CYPs involved  in xenobiotic metabolism and steroid biosynthesis.
                                      In fact, the fungicide is considered a model inhibitor of testosterone
                                      production in mammals (Feldman,  1986). In  the fathead minnow,
                                      ketoconazole decreased fecundity in 21-d tests and, consistent with
                                      its anticipated MOA, inhibited  testosterone production by gonad
                                      tissue from both males and females. However, after a continuous
                                      21-d  exposure, this inhibition was not translated into decreased
                                      circulating testosterone (or estradiol) concentrations in vivo in the
                                      fish, suggesting that the animals were somehow able to compensate
                                      for effects of the fungicide. This response was manifested in several
[Figure 3 graphic not reproduced. Recoverable labels: panel labels (B)-(E); x-axes "Fadrozole (µg/L)" and "Exposure (d)".]
Fig. 3. Example of linkage of effects across biological levels of organization for a model endocrine-disrupting chemical, the aromatase inhibitor fadrozole (2, 10, 50 µg/L
water), tested in a 21-d reproduction assay with the fathead minnow. Panels (from left to right) depict (A) inhibition of aromatase activity in male and female fish, (B) decrease
in plasma estradiol (E2) concentrations in female fish, (C) depression in plasma vitellogenin (Vtg) concentrations in exposed females, (D) decreased Vtg deposition in the
ovary (compare the amount of dark staining material in the top [control] versus bottom [treated] fish), and (E) reduction in cumulative egg production in the fish. Results
from Ankley et al. (2002).

-------
different ways (Ankley et al., 2007). For example, in males there
was more than a two-fold increase in relative gonad weight, accom-
panied by a proliferation of testicular Leydig cells (responsible for
steroid production), and up-regulation of genes coding for two key
steroidogenic enzymes, CYP11A and CYP17, both of which could be
specific targets of ketoconazole. The net result of these alterations
was that circulating steroid concentrations in the fish did not differ
from controls after 21 d of exposure to the fungicide. Understand-
ing the basis of possible compensatory responses within the HPG
axis clearly is needed to identify reliable exposure indicators and
develop approaches to predict adverse effects of EDCs; achieving
this understanding is a critical component of the Phase 3 studies of
the overall project.

3.2. Phase 2

   Phase 2 zebrafish exposures and subsequent microarray mea-
surements have been  completed for all the chemicals shown in
Fig. 1. Initial analysis of the microarray data has been described
for three of the chemicals: ethinylestradiol, trenbolone and fadro-
zole (Wang et al., 2008a,b). One goal of the Phase 2 research was to
determine a flexible and efficient microarray experimental design
to (1) characterize the zebrafish transcriptome, and (2) identify
an optimal combination of gene feature selection/class prediction
algorithms for evaluating gene expression changes caused by EDCs
with different MOA. An unbalanced, incomplete block microarray
experimental design was tested using  various tissues of individ-
ual zebrafish exposed to fadrozole, trenbolone, or ethinylestradiol
(Wang et al., 2008a). Based on the high microarray reproducibil-
ity/low variability, low gene-specific dye bias, and good similarity
between microarray and PCR profiles observed, the design appears
well suited to these and other ecotoxicogenomic studies. Hyper-
spectral imaging identified a  cyanine 3-background contaminant,
and correction of  this fluorescence contamination reduced the
variability of weakly expressed genes, which constitute a signifi-
cant portion of the zebrafish transcriptome (Wang et al., 2008a).
Evaluation of several methods for gene classifier (indicator) discovery
determined that, of the methods tested, a genetic algorithm (GA) was the
optimal gene feature selection method for reducing the dimensionality of
the microarray data, and a support vector machine (SVM) was the best
prediction algorithm (Wang et al., 2008b). These algorithms are being applied in subsequent
microarray experiments to identify multi-gene expression profiles
(classifiers) capable of discriminating  exposures  to  EDCs acting
through varying MOA based on microarray responses. As an exam-
ple, the preliminary analysis with the  three chemicals identified
classifiers that discriminated exposures to fadrozole, trenbolone,
and ethinylestradiol, with the first two chemicals clustering more
closely to one another as chemicals that depress rather than elevate
plasma estrogen activity (Wang et al., 2008b).
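
To illustrate the general idea of pairing a genetic algorithm for feature selection with a class-prediction step, a toy sketch follows. It is not the implementation of Wang et al. (2008b): the data are synthetic, all names and parameter values are hypothetical, and a simple nearest-centroid classifier is substituted for the SVM so that the example needs only the Python standard library.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

N_FEATURES = 30          # hypothetical number of "genes"
INFORMATIVE = (0, 1, 2)  # features that truly differ between classes
CLASSES = ("fadrozole", "trenbolone", "ethinylestradiol")
K = 3                    # number of features in each candidate classifier


def make_sample(class_idx):
    """Synthetic profile: unit Gaussian noise plus a shift of 10 on the
    one informative feature tied to this class."""
    x = [random.gauss(0.0, 1.0) for _ in range(N_FEATURES)]
    x[INFORMATIVE[class_idx]] += 10.0
    return x


DATA = [(make_sample(c), c) for c in range(len(CLASSES)) for _ in range(10)]


def fitness(feats):
    """Leave-one-out accuracy of a nearest-centroid classifier on feats."""
    correct = 0
    for i, (x, y) in enumerate(DATA):
        train = DATA[:i] + DATA[i + 1:]
        best_c, best_d = None, float("inf")
        for c in range(len(CLASSES)):
            rows = [r for r, lab in train if lab == c]
            mu = [sum(r[f] for r in rows) / len(rows) for f in feats]
            d = sum((x[f] - m) ** 2 for f, m in zip(feats, mu))
            if d < best_d:
                best_c, best_d = c, d
        correct += best_c == y
    return correct / len(DATA)


def mutate(feats):
    """Swap one selected feature for a currently unselected one."""
    feats = list(feats)
    unused = [f for f in range(N_FEATURES) if f not in feats]
    feats[random.randrange(K)] = random.choice(unused)
    return tuple(sorted(feats))


def crossover(a, b):
    """Child draws K distinct features from the union of both parents."""
    return tuple(sorted(random.sample(sorted(set(a) | set(b)), K)))


def run_ga(pop_size=20, generations=25):
    pop = [tuple(sorted(random.sample(range(N_FEATURES), K)))
           for _ in range(pop_size)]
    history = []  # best fitness seen at each generation
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]  # truncation selection
        children = [mutate(crossover(*random.sample(parents, 2)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children        # parents survive (elitism)
    pop.sort(key=fitness, reverse=True)
    history.append(fitness(pop[0]))
    return pop[0], history


best, history = run_ga()
print("selected features:", best, "leave-one-out accuracy:", history[-1])
```

Because the parents are carried over unchanged, the best fitness cannot decrease from one generation to the next. In a real application the fitness function would wrap the chosen prediction algorithm (an SVM in the work cited) and an independent test set would be needed to guard against overfitting.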
   Beyond identification of effective  microarray experimental
design and analysis approaches, the Phase 2 zebrafish experiments
fulfill two critical roles. First, they provide  a means to  compre-
hensively interrogate the large number of transcripts that code for
proteins known or hypothesized to play key roles in the regulation
of the teleost HPG axis. As opposed to real-time PCR and similar
approaches that target a single or relative handful of genes, microar-
rays can be used to survey hundreds or thousands of components of
a biological system (represented by transcripts) and simultaneously
evaluate their response to various stressors. This type of approach
provides the basis for conducting a hypothesis-driven investigation
of the response of an entire system and a means to test and refine
biologically-based systems models that may ultimately be applied
to predictive risk assessment (e.g., Villeneuve et al., 2007b).
   The second critical role of the zebrafish experiments is discovery.
Whereas Phase 1 studies  examine only those endpoints selected
      by the investigators based on some prior knowledge or hypothe-
      ses, the microarray and proteomic analyses conducted as part of
      the Phase 2 experiments are unsupervised. Because of the ability
      to screen hundreds (in the case of proteomics) or thousands (in
      the case of microarrays) of endpoints/targets, data from  Phase 2
      studies can be used to identify novel responses to the stressors
      examined. By examining gene ontologies  and pathways associ-
      ated with differentially expressed genes or proteins, it is  possible
      to identify a broad spectrum of processes and/or targets  that are
      impacted either directly or indirectly by the chemical stressor. Such
      knowledge can lead to an improved understanding of the overall
      biological impact of the stressor, and may also aid the identifica-
      tion of novel indicators (biomarkers) of exposure and/or effects.
      Hypotheses and putative indicators that emerge from  the Phase 2
      analyses are being tested in a supervised fashion in the  subsequent
      Phase  3  experiments, examining the robustness of the observa-
      tions both between experiments and among species. Through the
      combination of hypothesis- and discovery-driven analyses, Phase 2
      experiments test our overall systems model, expand on the analyses
      conducted in Phase 1, and provide a foundation for novel hypothesis
      testing in Phase 3.

      3.3.  Phase 3

         Phase  3 tests have  been conducted with six  chemicals  to
      date: fadrozole, trenbolone, prochloraz, vinclozolin, flutamide and
      trilostane. In addition, a preliminary Phase 3-like exposure with
      ethinylestradiol using fewer sampling times and with a  primary
      focus on metabolomic measurements has been completed (Ekman
      et al., 2008). Although much of the information associated with
      these studies (e.g., gene expression) is still being assembled and
      analyzed, some intriguing observations have already emerged.
         Compared to transcriptomic measurements, metabolomic anal-
      yses have  received less attention in  the field of ecotoxicology
      (Lin et al., 2006). However, knowledge of profiles of endogenous
      metabolites can provide important information concerning chem-
      ical  MOA,  thereby helping to identify exposure indicators and
      define  toxicity pathways. In addition, compared to transcriptomics
      and  proteomics, metabolomic analyses are relatively inexpensive
      and amenable to high-throughput, which enables a comparatively
      large number of  samples to be processed.  This is an important
      attribute for time-course studies, such as the Phase 3 work. Ini-
      tial metabolomic studies by Ekman et al. (2007) demonstrated the
      feasibility of NMR-based analyses of urine  samples from  the fat-
      head minnow to assess impacts of the anti-androgen vinclozolin
      on metabolite profiles. The use  of urine  in such studies not only
      allows one to assess important metabolic endpoints, but also pro-
      vides the potential for non-invasive and repeated sampling from
      individual fish over time. In more recent work,  Ekman et al. (2008,
      in press) demonstrated the potential for metabolomic measure-
      ments  to provide novel insights about responses of the fish HPG
      axis to chemical stressors. Adult fathead minnows of both sexes
      were exposed to two different concentrations of ethinylestradiol,
      and  animals were sampled after 1, 4 and 8 d of exposure, and
      8 d after termination of the exposure. NMR evaluation  of polar
      metabolites in livers of the fish revealed a greater impact of the
      estrogen on males than females; in addition, the metabolite pro-
      file in  exposed males  reflected a "feminization" response, in that
      the profile assumed similarities to that of female fathead minnows
      (Ekman et al., 2008). Assessment of the metabolomic data using partial least-
      squares discriminant analysis revealed that response trajectories
      in the  males showed  evidence of compensation of the fish dur-
      ing the ethinylestradiol exposure, as well  as a marked recovery
      after cessation of exposure to the estrogen (Fig. 4).  Evaluation of
      other more traditional endpoints in the fish (changes in plasma Vtg
      concentrations and secondary sex characteristics) indicated feminization

G.T. Ankley et al. / Aquatic Toxicology 92 (2009) 168-178

[Figure 4: PLS-DA scores plot of PLS1 against PLS2; point labels recoverable from the plot include "8d" and "8d (post-exp)".]
Fig. 4. Exposure response trajectory plots for male fathead minnows exposed to
17α-ethinylestradiol for 1, 4, or 8 d, followed by 8 d of depuration (i.e., "post-
exp") during which the fish were maintained in water without test chemical.
Exposures were conducted using two ethinylestradiol concentrations (either 10 or
100 ng/L) delivered via a continuous flow-through system. The scores plot shown
was generated using the first two components (i.e., PLS1 and PLS2) of a validated
partial least-squares discriminant analysis (PLS-DA) model built using NMR spectral
data acquired from the livers of these fish. Each point represents the average score
value for a given class (n = 7 or 8), shown with its associated standard error. Note:
the controls across all time points showed relatively little variation and thus were
modeled as a single class. Results from Ekman et al. (2008).

of the males, as well as (in the case of secondary sex
characteristics) recovery following termination of exposure, con-
firming that alterations observed in metabolite profiles are a robust
indicator of the physiological state of the animals exposed to the
estrogen. Ekman et al. (in press) also evaluated the non-polar frac-
tion of hepatic metabolites  from the ethinylestradiol study, and
noted a number of alterations in lipid profiles associated with expo-
sure to the estrogen. Ongoing MS-based  metabolomic studies are
focused on assessing changes in sex steroids and steroid precursors
in fish exposed to EDCs for differing periods of time. Overall, the
types of temporally-intensive data collected from NMR- and MS-
based metabolomic analyses can be used to better understand how
exposure parameters—such as chemical concentration, frequency,
and duration—influence adverse outcomes. This new understand-
ing can help regulators differentiate chemical exposures that have
a lasting and detrimental biological effect from those that are either
not effective, or those to which an organism can adapt (albeit with
some potential cost to the organism).
   Data from the fadrozole  Phase 3 study also  provide insights
as to compensatory responses of the HPG axis (Villeneuve et al.,
in press). As would be expected based  on the MOA of fadrozole
(described above),  exposure to the drug caused rapid (within 1
d), concentration-dependent reductions in estradiol production in
ex vivo assays with ovary tissue held in culture (detailed meth-
ods for the ex vivo  assay can be found in Ankley et al. (2007) and
Martinovic et al. (2008)), and plasma concentrations of both estra-
diol and Vtg in female fish  (Fig. 5). However, by the  eighth day
of the exposure period, ex vivo estradiol production had returned
to control levels, and plasma estradiol  concentrations  had also
recovered to control levels in the fish exposed to 3 µg fadrozole/L,
albeit not in those exposed to 30 µg/L (Fig. 5). This apparent
compensation coincided with significant concentration-dependent
increases  in the abundance  of mRNA transcripts coding for aro-
matase (CYP19A isoform),  CYP11A, steroidogenic  acute regulatory
protein and follicle stimulating hormone receptor (Villeneuve et
al.,  in  press). Shortly after cessation of the  fadrozole exposure,
there was a rapid recovery of plasma estradiol concentrations in
the fish, and a gradual recovery of plasma Vtg concentrations, even
                                                [Figure 5: panels A-C plotted against Day (exposure period followed by recovery period).]
                                                Fig. 5. (A) Ex vivo estradiol (E2) production, (B) plasma E2, and (C) plasma
                                                vitellogenin (Vtg) measured in female fathead minnows exposed to 0, 3, or
                                                30 µg fadrozole/L and sampled after 1, 2, 4, or 8 d of exposure or 1, 2, 4, or 8 d after
                                                cessation of exposure (days 9, 10, 12, 16, respectively; recovery period). Data are
                                                expressed as fold change (log 2) relative to the control mean measured on a given
                                                day. Error bars indicate standard error. The * and # indicate statistically significant
                                                difference from the control for the 3 and 30 µg/L treatments, respectively (p<0.05).
                                                Results from Villeneuve et al. (in press).
                                                in the 30 µg/L group. In fact, in the 3 µg/L treatment, there was a
                                                brief period of elevated plasma estradiol accompanied by a seem-
                                                ing over-production of estradiol, ex vivo, relative to the  control
                                                group (Fig. 5). These data are consistent with the idea that estra-
                                                diol production rates had been increased as part of a compensatory
                                                response to the stressor. Of over a dozen transcript-level responses
                                                examined, expression of mRNAs coding for follicle-stimulating hor-
                                                mone  receptor appeared to have the greatest potential utility as
                                                an indicator of reproductive dysfunction mediated through the
                                                estradiol synthesis-disruption toxicity pathway, based on the rapid-
                                                ity, persistence, and  concentration-dependence of  the response
                                                (Villeneuve et al.,  in press). Thus, based on the preliminary Phase
                                                3  experiments substantively analyzed to date, the results  have
                                                shown excellent promise for identifying potentially useful expo-
                                                sure indicators, detailing toxicity pathway  characterization, and
                                                improving our understanding of compensatory responses of the
                                                HPG-axis to  chemical stressors, all of  which should enhance the
[Figure 6: steroidogenesis pathway diagram; labels recoverable from the diagram include CYP11A1, CYP17H, CYP17L, CHOL, PREG, HPREG and DHEA.]
Fig. 6. Steroidogenesis model for the female fathead minnow gonad based on in vitro data from control and fadrozole-treated fish. The model consists of two compartments,
medium and ovary tissue. Transport processes (black arrows) occur between the medium and ovary. Irreversible metabolic reactions (arrows with each pattern representing
a unique enzyme) occur in the ovary. Six enzymes labeled in italics next to reactions they catalyze are: cytochrome P450scc (side-chain-cleavage) (CYP11A1), cytochrome
P450c17α-hydroxylase (CYP17H), cytochrome P450c17,20-lyase (CYP17L), 3β-hydroxy-dehydrogenase (3βHSD), 17-beta-hydroxy-dehydrogenase (17βHSD), and cytochrome
P450 aromatase (CYP19). Steroids and their precursors are: cholesterol (CHOL), pregnenolone (PREG), 17α-hydroxypregnenolone (HPREG), dehydroepiandrosterone (DHEA),
progesterone (PROG), 17α-hydroxyprogesterone (HPROG), androstenedione (AD), testosterone (T), estrone (E1) and 17β-estradiol (E2). Fadrozole is depicted as an inhibitor
of CYP19. The steroidogenic metabolic pathway encompasses two ovarian cell types: theca cells and granulosa cells. In theca cells, cholesterol is converted to AD and T. In
granulosa cells, AD and T are converted to E1 and E2. Model from Breen et al. (2007).
ability to generate predictive models with utility for ecological risk
assessment.
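
As a small illustration of the data convention used in Fig. 5, the sketch below expresses individual measurements as log2 fold change relative to the mean of the control group sampled on the same day. The function name and all numerical values are hypothetical and are not taken from Villeneuve et al. (in press).

```python
import math
from statistics import mean


def log2_fold_change(treated, control):
    """Return log2(value / same-day control mean) for each treated value."""
    control_mean = mean(control)
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    return [math.log2(value / control_mean) for value in treated]


# Hypothetical plasma E2 values (arbitrary units) for one sampling day.
control_day1 = [4.0, 5.0, 3.0]
treated_day1 = [1.0, 2.0, 4.0]

print(log2_fold_change(treated_day1, control_day1))
```

On this scale a value of -1 corresponds to a halving and +1 to a doubling relative to the same-day control mean, so depression and over-production are displayed symmetrically about zero.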

4. Integrating the data: predictive modeling

   To help design the Phase 1, 2 and 3 studies and subsequently
interpret and integrate the large amounts of data collected,  we
are using a systems biology/toxicology approach. Villeneuve et
al. (2007b) described development of a graphical systems model
focused on defining the HPG axis of teleost fish, which  enables
consideration of the interactive nature of the system at multiple
levels of biological organization, ranging from changes in gene, pro-
tein and metabolite expression profiles to effects in cells/tissues
that directly influence  reproductive success. The model  plays  a
role both in terms of designing our studies (e.g., deciding where
to perturb the system), and interpreting the sometimes seemingly
disparate biological responses observed (e.g., those associated with
compensation), both from hypothesis- and discovery-driven per-
spectives. The model also enables consideration of the HPG axis in
an integrated manner, such that effects of mixtures of chemicals
with similar or dissimilar MOA can be more directly evaluated. The
overall framework,  which is written in open-source code (SBML;
Systems Biology  Markup Language) is not intended to be static
but, rather, to evolve as this project  (and the many  other stud-
ies on EDCs and fish reproductive endocrinology being conducted
throughout the world) generate mechanistic data to better inform
the model. Although intended to support prediction of the effects of
HPG-active chemicals with different MOA on reproductive function
in fish, the model does  not do so from a quantitative ("computa-
tional") perspective. Rather, the model described by Villeneuve et
al. (2007b) provides a framework for incorporation of more  focused
computational models into an integrated assessment of the poten-
tial ecological risk of EDCs. Several of these types of computational
models associated with our current effort are discussed further
below.
   Steroid hormones are critical to maintenance of HPG axis func-
tion, and feedback controls on the system are achieved largely
through alterations in steroid production. In addition, several estab-
       lished EDCs exert adverse effects through their ability to directly
       modulate (generally inhibit) different enzymes involved in steroid
       synthesis (Figs. 1 and 6). Despite the importance of steroid produc-
       tion, until recently there had been no mechanistic computational
       models for describing baseline and/or chemically-perturbed con-
       ditions in vertebrates. As part of our effort, Breen et al. (2007)
       developed a steady-state model to predict synthesis and release
       of testosterone and estradiol by ovarian tissue, and evaluated the
       model using data  generated  from the fathead minnow (Fig.  6).
       Model-predicted concentrations of the two steroids over time cor-
       responded well with both baseline (control) data, and information
       from experiments  in  which estradiol synthesis was  blocked  by
       fadrozole. A sensitivity analysis  of the model identified specific
       processes that most influenced production of testosterone and
       estradiol, thereby lending insights as to potential points of control
       in the HPG axis. We have further developed predictive capabilities
       for understanding steroidogenesis by integration of the graphical
       model of Villeneuve et al. (2007b), with the steady-state model of
       Breen et al. (2007), and the Hao et al. (2006) model of G protein,
       protein kinase A, and steroidogenic acute regulatory protein activation, to
       examine effects of chemicals on steroid production and regulation
       (Shoemaker et al.,  2008). In this expanded model, we examined
       the  role of local  regulation (within the ovary) and global regula-
       tion (between components of the HPG axis) in maintaining control
       of steroid synthesis. Incorporation of gene expression  data into
       the  Shoemaker et al. (2008) model suggests that local regulation
       reacts to fadrozole to increase gene expression of steroidogenesis
       enzymes. Higher enzymatic capability is then coupled with
       increased cholesterol transport due to testosterone and estradiol
       feedback regulation via the pituitary and hypothalamus. The fat-
       head minnow  appears to react  locally  in the  ovary  to  increase
       steroidogenic enzymes and inter-organ signaling reacts to increase
       cholesterol pools available to the enzymes. From this model we can
       formulate additional, testable hypotheses by which feedback reg-
       ulation, combined  with local gene expression, could compensate
       for low-dose chemical exposure. In addition, these mechanism-
       based models should facilitate the study of the effects of mixtures of
       EDCs with different MOA, by predicting inhibition constants of each
steroidogenic enzyme for the individual chemicals. As information
from additional EDCs is used for parameterization and validation,
the computational models will serve as an effective "module"
within the broader systems framework for making quantitative pre-
dictions of the impacts on endocrine function of either direct or
indirect chemical effects on steroidogenesis.
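
The kind of steady-state calculation and sensitivity analysis described above can be illustrated with a deliberately reduced sketch. It is not the model of Breen et al. (2007): it assumes a single testosterone pool fed at a constant rate, first-order conversion of testosterone to estradiol by aromatase, first-order release of both steroids to the medium, and competitive inhibition of aromatase in the low-substrate limit. All function names, parameter names and values are hypothetical.

```python
def steady_state(v_in, k_arom, k_t_out, k_e2_out, inhibitor=0.0, ki=1.0):
    """Steady-state pools and release rates for a two-steroid toy pathway.

    v_in      constant supply of testosterone (T) from upstream precursors
    k_arom    first-order rate constant for T -> estradiol (E2) via aromatase
    k_t_out   first-order release of T to the medium
    k_e2_out  first-order release of E2 to the medium
    inhibitor, ki  inhibitor concentration and its inhibition constant
    """
    # Competitive inhibition at low substrate scales the rate constant.
    k_eff = k_arom / (1.0 + inhibitor / ki)
    t_ss = v_in / (k_eff + k_t_out)       # dT/dt = 0
    e2_ss = k_eff * t_ss / k_e2_out       # dE2/dt = 0
    return {
        "T": t_ss,
        "E2": e2_ss,
        "T_release": k_t_out * t_ss,
        "E2_release": k_e2_out * e2_ss,
    }


def relative_sensitivity(params, name, rel_step=1e-6):
    """Finite-difference estimate of d ln(E2 release) / d ln(parameter)."""
    base = steady_state(**params)["E2_release"]
    bumped = dict(params)
    bumped[name] = params[name] * (1.0 + rel_step)
    return (steady_state(**bumped)["E2_release"] - base) / (base * rel_step)


params = {"v_in": 10.0, "k_arom": 1.0, "k_t_out": 1.0, "k_e2_out": 2.0}
control = steady_state(**params)
treated = steady_state(**params, inhibitor=3.0, ki=1.0)
print("E2 release, control vs inhibited:",
      control["E2_release"], treated["E2_release"])
for name in params:
    print(name, round(relative_sensitivity(params, name), 3))
```

In this toy system, estradiol release is proportional to the precursor supply, only partly sensitive to the aromatase rate constant, and insensitive to the estradiol release constant. This is the type of ranking a sensitivity analysis is meant to expose, although the ranking for the real pathway depends on the full model structure.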
   Documenting alterations in function of the HPG axis at molecu-
lar and biochemical levels is useful to regulatory decision-making
only if these changes can be translated into effects at higher biolog-
ical levels of organization (Fig. 2). To achieve this we are developing
two predictive models: one for male and another for female fathead
minnows. Both models  are physiologically-based computational
models that link exposure to EDCs with changes in measured repro-
ductive endpoints such as plasma concentrations of sex steroids and
Vtg. The models are parameterized using information both from
large-scale "control" datasets (Watanabe et al., 2007), and data from
Phase 1 and 3 studies with EDCs with differing MOA. In males, our
model currently accounts for endocrine  responses to estradiol or
ethinylestradiol based on their relative binding affinities to ERs and
salient downstream effects of ER activation, such as induction of
Vtg production (Watanabe et al., in press). The model is formulated
in such a way that other estrogenic chemicals (or mixtures of chem-
icals) can be simulated with only minor modifications. Our model
for female fathead minnows incorporates biological processes anal-
ogous to the male model, but extends the male model by including
the AR, and interactions  of the receptor with agonists and antago-
nists (K.H.W.,  unpublished data). Both models yield predictions of
plasma concentrations of sex steroids and Vtg which fit measured
data from unexposed and exposed fathead minnows. To connect the
physiologically-based model at the individual organism level with
a population model for predictions at a  higher level of biological
organization, predictions of changes in plasma Vtg concentrations
can be used as input into a dynamic population model (Murphy et
al., 2005).
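
To make concrete the idea of simulating different estrogens, or mixtures of them, from their relative binding affinities, the sketch below applies the standard equilibrium expression for several ligands competing for one receptor. It is an illustrative assumption rather than the formulation of Watanabe et al. (in press); the function name, the dissociation constant and the relative binding affinity assigned to ethinylestradiol are all hypothetical.

```python
def er_occupancy(concentrations, rba, kd_e2):
    """Fractional receptor occupancy for ligands competing for one receptor.

    concentrations  dict: ligand -> free concentration (same units as kd_e2)
    rba             dict: ligand -> binding affinity relative to estradiol
                    (estradiol = 1.0), so that Kd_ligand = kd_e2 / rba
    kd_e2           dissociation constant of estradiol
    """
    # Concentration of each ligand expressed in units of its own Kd.
    scaled = {name: conc * rba[name] / kd_e2
              for name, conc in concentrations.items()}
    denominator = 1.0 + sum(scaled.values())
    return {name: value / denominator for name, value in scaled.items()}


# Hypothetical mixture: estradiol (E2) plus ethinylestradiol (EE2).
occupancy = er_occupancy(
    concentrations={"E2": 1.0, "EE2": 1.0},
    rba={"E2": 1.0, "EE2": 2.0},   # assumed value, for illustration only
    kd_e2=1.0,
)
print(occupancy, "total:", sum(occupancy.values()))
```

Under this assumption, adding another estrogenic chemical requires only one further entry in each dictionary, which is one way a model could accommodate additional chemicals or mixtures with minor modification.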
   The inclusion of microarray,  proteomics,  and metabolomics
measurements in these studies also allows the  reverse engineering
of molecular networks (Schadt and Lum, 2006) which can then be
compared with the measured endpoints. The use of this approach
for disease characterization (Loscalzo et al., 2007) and subsequent
target discovery for drug development (Chen et al., 2008) has been
described, and application to environmental risk assessment has
been proposed (Edwards and  Preston, 2008). In this project, the
molecular networks derived from genome-wide  measurements
provide an unbiased assessment of the physiologically-based mod-
els discussed above. This aids in the interpretation of the existing
models since  the completeness of each model  can be estimated
based on the percentage of variation in the  molecular network
explained by the descriptive model. It also aids in further devel-
opment of the physiologically-based models by providing clues as
to the missing components of each model.
   The final component of the project involves development of tools
for  the prediction of population-level effects  of EDCs. Except for
instances in which threatened or endangered species are involved,
most ecological assessments of the risk of contaminants ultimately
are concerned with potential population-level responses. Kidd et
al. (2007) evaluated the effects of ethinylestradiol on fish popu-
lations in dosing studies with a whole-lake ecosystem; however,
the opportunity to conduct a controlled study of this  magnitude is
rare, so the only practical way to routinely link the effects of EDCs
in individuals to population-level impacts is via modeling (Gleason
and Nacci, 2001; Brown et al., 2003; Hurley et al., 2004; Gurney,
2006). Miller and Ankley (2004) describe a modeling approach for
predicting the status of  fathead minnow populations exposed to
trenbolone, based on fecundity data from the 21-d Phase 1 study
design. The basic model employs a Leslie matrix in conjunction with
the logistic equation (to account for density dependence) to trans-
    late laboratory toxicity information into prediction of population
    trajectories. Miller et al. (2007) expanded on this effort by first
    relating changes in Vtg to fecundity in female fathead minnows,
    and then using this relationship in the population model to predict
    population status in fish exposed to EDCs which inhibit produc-
    tion of the egg yolk protein, most notably compounds that depress
    steroid synthesis (e.g., fadrozole, prochloraz, trenbolone; Fig.  1).
    That analysis is unique in that it focuses on a biochemical endpoint,
    female Vtg, that reflects both toxic MOA of EDCs and has a functional
    relationship to reproductive success (formation of eggs). As such,
    within the overall systems framework for the project, the computa-
    tional model described by Miller et al. (2007) and Miller and Ankley
    (2004) can serve as the basis via which genomic information can
    be quantitatively translated to responses in populations.
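
A minimal sketch of how a Leslie matrix can be combined with logistic-type density dependence is given below. It assumes that each projection step is scaled by exp(-r N / K), where r is the intrinsic rate of increase of the unexposed population, N is the current total population size and K is the carrying capacity. The age structure, vital rates and 50% fecundity reduction are hypothetical and are not the values of Miller and Ankley (2004).

```python
import math


def leslie_matrix(fecundity, survival):
    """Build a Leslie matrix from age-specific fecundity and survival."""
    size = len(fecundity)
    matrix = [[0.0] * size for _ in range(size)]
    matrix[0] = list(fecundity)                 # first row: offspring
    for age, rate in enumerate(survival):       # sub-diagonal: survival
        matrix[age + 1][age] = rate
    return matrix


def mat_vec(matrix, vector):
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def dominant_eigenvalue(matrix, iterations=500):
    """Estimate the finite rate of increase (lambda) by power iteration."""
    vector = [1.0 / len(matrix)] * len(matrix)
    growth = 1.0
    for _ in range(iterations):
        nxt = mat_vec(matrix, vector)
        growth = sum(nxt)        # vector sums to 1, so this is the growth factor
        vector = [x / growth for x in nxt]
    return growth


def project(matrix, r_baseline, start, carrying_capacity, years):
    """Project total population size with density-dependent scaling."""
    population = list(start)
    totals = [sum(population)]
    for _ in range(years):
        scale = math.exp(-r_baseline * sum(population) / carrying_capacity)
        population = [scale * x for x in mat_vec(matrix, population)]
        totals.append(sum(population))
    return totals


survival = [0.5, 0.4]                            # hypothetical annual survival
baseline = leslie_matrix([0.0, 2.0, 3.0], survival)
exposed = leslie_matrix([0.0, 1.0, 1.5], survival)   # fecundity reduced by 50%

r = math.log(dominant_eigenvalue(baseline))
start = [300.0, 150.0, 50.0]
print("unexposed:", round(project(baseline, r, start, 1000.0, 200)[-1], 1))
print("exposed:  ", round(project(exposed, r, start, 1000.0, 200)[-1], 3))
```

With these assumed rates the unexposed population settles at the carrying capacity, whereas the fecundity reduction pushes the finite rate of increase below 1 and the projected population declines toward zero.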

    5. Prospectus

      In this paper we describe a MOA/systems-based research effort
    with HPG-active chemicals that will help provide the technical basis
    for development  of predictive toxicology tools  (models, in vitro
    and short-term in vivo assays) which could improve the efficiency
    of current testing and monitoring programs for EDCs. As we con-
    template informational needs for chemical risk assessments in the
    coming years, it is clear that historical toxicology approaches which
    focus  mostly  on generating empirical data cannot solely suffice.
    Toxicologists and risk assessors are being asked to do more with
    fewer resources, in a  sociopolitical environment that emphasizes
    reduced animal testing. Examples  of new testing  mandates that
    promise to require additional toxicity data for a large number of
    chemicals include the REACH (registration, evaluation,  authoriza-
    tion and restriction of chemicals) program in Europe, and the high
    production volume  challenge program in the US, in addition to a
    variety of EDC testing efforts throughout the world. In recognition
    of these informational needs, the NRC (2007) proposed a greater
    emphasis on predictive toxicology tools to support human health
    assessments. There is  an analogous requirement for advanced pre-
    dictive methods in ecotoxicology, and many of the tools discussed
    in the NRC report are applicable to ecological risk assessments.
    However, there are added challenges in ecological assessments; for
    example, in contrast to human health toxicology, ecological assess-
    ments need to extrapolate toxicity from a few  (sometimes one)
    species to many (sometimes thousands), and require  an under-
    standing of impacts of chemicals at the population (rather than
    individual) level. We  feel that the research approach presented
    herein provides a broad conceptual framework for developing
    mechanism-based, predictive approaches for effectively assessing
    the ecological risk of chemicals with a variety of MOA, in addition to
    EDCs.

    Acknowledgements

      We thank our many colleagues who have been involved in different
    aspects of this work, including L. Blake, J. Brodin, J. Cavallin, E.
    Durhan, K. Greene, M. Kahl, A. Linnum, E. Makynen and N. Mueller
    from the Duluth EPA lab; M. Henderson and Q. Teng from the Athens
    EPA lab; A. Biales, M. Kostich, D. Lattier and G. Toth from the Cincin-
    nati EPA lab; X. Guan,  C. Warner, L. Escalon, Y. Deng and S. Brasfield
    from the US Army Engineer Research and Development Center; J.
    Shoemaker, K. Gayen, and F.J. Doyle III from the University of Califor-
    nia at Santa Barbara; K. Kroll and C. Martyniuk from the University
    of Florida; and Z. Li from Oregon Health Sciences University.
      We also thank A. Miracle, who was involved  in initial concep-
    tualization of this project, and R. Kavlock and D. Hoff for helpful
    comments on an earlier version of the manuscript.
      This work  was supported in part by the USEPA National Cen-
    ter for Computational Toxicology, a grant from the USEPA National
Center  for Environmental  Research  to the University  of Florida,
and the US Army Environmental Quality Installations Program. The
manuscript has  been reviewed in accordance with USEPA guide-
lines; however, the views expressed are those of the authors and
do not necessarily reflect USEPA policy. Permission was granted by
the Chief of Engineers to publish this information.


References

Ankley, G.T., Johnson, R.D., 2004. Small fish models for identifying and assessing the
    effects of endocrine-disrupting chemicals. Inst. Lab. Anim. Res. J. 45, 469-483.
Ankley, G.T., Villeneuve, D.L., 2006. The fathead minnow in aquatic toxicology: past,
    present and future. Aquat. Toxicol. 78, 91-102.
Ankley, G.T., Jensen, K.M., Kahl, M.D., Korte, J.J., Makynen, E.A., 2001. Description
    and evaluation of a short-term reproduction test with the fathead minnow
    (Pimephales promelas). Environ. Toxicol. Chem. 20, 1276-1290.
Ankley, G.T., Kahl, M.D., Jensen, K.M., Hornung, M.W., Korte, J.J., Makynen, E.A., Leino,
    R.L., 2002. Evaluation of the aromatase inhibitor fadrozole in a short-term
    reproduction assay with the fathead minnow (Pimephales promelas). Toxicol. Sci. 67,
    121-130.
Ankley, G.T., Jensen, K.M., Makynen, E.A., Kahl, M.D., Korte, J.J., Hornung, M.W., Henry,
    T.R., Denny, J.S., Leino, R.L., Wilson, V.S., Cardon, M.C., Hartig, P.C., Gray, L.E.,
    2003. Effects of the androgenic growth promoter 17β-trenbolone on fecundity
    and reproductive endocrinology of the fathead minnow (Pimephales promelas).
    Environ. Toxicol. Chem. 22, 1350-1360.
Ankley, G.T., Jensen, K.M., Durhan, E.J., Makynen, E.A., Butterworth, B.C., Kahl, M.D.,
    Villeneuve, D.L., Linnum, A., Gray, L.E., Cardon, M., Wilson, V.S., 2005. Effects
    of two fungicides with multiple modes of action on reproductive endocrine
    function in the fathead minnow (Pimephales promelas). Toxicol. Sci. 86,
    300-308.
Ankley, G.T., Daston, G.P., Degitz, S.J., Denslow, N.D., Hoke, R.A., Kennedy, S.W.,
    Miracle, A.L., Perkins, E.J., Snape, J., Tillitt, D.E., Tyler, C.R., Versteeg, D., 2006.
    Toxicogenomics in regulatory ecotoxicology. Environ. Sci. Technol. 40, 4055-4065.
Ankley, G.T., Jensen, K.M., Kahl, M.D., Makynen, E.A., Blake, L.S., Greene, K.J., Johnson,
    R.D., Villeneuve, D.L., 2007. Ketoconazole in the fathead minnow (Pimephales
    promelas): reproductive toxicity and biological compensation. Environ. Toxicol.
    Chem. 26, 1214-1223.
Bradbury, S.P., Feijtel, T.C.J., Van Leeuwen, T.C.J., 2004. Meeting the scientific needs
    of ecological risk assessment in a regulatory context. Environ. Sci. Technol. 38,
    463A-470A.
Breen, M.S., Villeneuve, D.L., Breen, M., Ankley, G.T., Conolly, R.B., 2007. Mecha-
    nistic computational model of ovarian steroidogenesis to predict biochemical
    responses  to  endocrine active  compounds. Ann.  Biomed.  Eng.  35,  970-
    981.
Brown, A.R., Riddle, A.M., Cunningham, N.L., Kedwards, T.J., Shillabeer, N.S., Hutchin-
    son, T.H., 2003. Predicting the effects of endocrine disrupting chemicals on fish
    populations. Hum. Ecol. Risk Assess. 9, 761-788.
Chen, Y., Zhu, J., Lum, P.Y., Yang, X., Pinto, S., MacNeil, D.J., Zhang, C., Lamb, J., Edwards,
    S., et al., 2008. Variations in DNA elucidate molecular networks that cause
    disease. Nature 452, 429-435.
di Bernardo, D., Thompson, M.J., Gardner, T.S., Chobot, S.E., Eastwood, E.L., Wojtovich,
    A.P., Elliott, S.J., Schaus, S.E., Collins, J.J., 2005. Chemogenomic profiling on a
    genome-wide scale using reverse-engineered gene networks. Nat. Biotechnol.
    23, 377-383.
Edwards, S.W., Preston, R.J., 2008. Systems biology and mode of action based risk
    assessment. Toxicol. Sci. 106, 312-318.
Ekman, D.R., Teng, Q., Jensen, K.M., Martinovic, D., Villeneuve, D.L., Ankley, G.T.,
    Collette, T.W., 2007. NMR analysis of fathead minnow urinary metabolites: a
    potential approach for studying impacts of chemical exposures. Aquat. Toxicol.
    85, 104-112.
Ekman, D.R., Teng, Q., Villeneuve, D.L., Kahl, M.D., Jensen, K.M., Durhan, E.J., Ankley,
    G.T., Collette, T.W., 2008. Investigating compensation and recovery of fathead
    minnow (Pimephales promelas) exposed to 17α-ethynylestradiol with metabolite
    profiling. Environ. Sci. Technol. 42, 4188-4194.
Ekman, D.R., Teng, Q., Villeneuve, D.L., Kahl, D.L., Jensen, K.M., Durhan, E.J., Ankley,
    G.T., Collette, T.W., in press. Profiling lipid metabolites yields unique
    information on gender- and time-dependent responses of fathead minnows (Pimephales
    promelas) exposed to 17α-ethynylestradiol. Metabolomics.
Feldman, D., 1986. Ketoconazole and other imidazole derivatives as inhibitors of
    steroidogenesis. Endocr. Rev. 7, 409-420.
Filby, A.L., Thorpe, K.L., Maack, G., Tyler, C.R., 2007. Gene expression profiles revealing
    the mechanisms of anti-androgen- and estrogen-induced feminization in fish.
    Aquat. Toxicol. 81, 219-231.
Gleason, T.R., Nacci, D.E., 2001. Risks of endocrine-disrupting compounds to wildlife:
    extrapolating from effects on individuals to population response. Hum. Ecol.
    Risk Assess. 7, 1027-1042.
Gurney, W.S.C., 2006. Modeling the demographic effects of endocrine disrupters.
    Environ. Health Perspect. 114 (Suppl. 1), 122-126.
Hao, H., Zak, D.E., Sauter, T., Schwaber, J., Ogunnaike, B.A., 2006. Modeling the VPAC2-
    Activated cAMP/PKA signaling pathway: from receptor to circadian clock gene
    induction. Biophys. J. 90, 1560-1571.
Hill, A.J., Teraoka, H., Heideman, W., Peterson, R.E., 2005.  Zebrafish as a model ver-
    tebrate for investigating chemical toxicity. Toxicol. Sci. 86, 9-16.
Hoffmann, J.L., Torontali, S.P., Thomason, R.G., Lee, D.M., Brill, J.L., Price, B.B.,
    Carr, G.J., Versteeg, D.J., 2006. Hepatic gene expression profiling using
    GeneChips in zebrafish exposed to 17α-ethynylestradiol. Aquat. Toxicol. 79,
    233-246.
Hoffmann, J.L., Thomason, R.G., Lee, D.M., Brill, J.L., Price, B.B., Carr, G.J., Versteeg, D.J.,
    2008. Hepatic gene expression profiling using GeneChips in zebrafish exposed
    to 17α-methyldihydrotestosterone. Aquat. Toxicol. 87, 69-80.
Hook, S.E., Skillman, A.D., Small, J.A., Schultz, I.R., 2006. Gene expression patterns
    in rainbow trout, Oncorhynchus mykiss, exposed to a suite of model toxicants.
    Aquat. Toxicol. 77, 372-385.
Hurley, M.A., Matthiessen, P., Pickering, A.D., 2004. A model for environmental sex
    reversal in fish. J. Theoret. Biol. 227, 159-165.
Jensen, K.M., Kahl, M.D., Makynen, E.A., Korte, J.J., Leino, R.L., Butterworth, B.C., Ank-
    ley, G.T., 2004. Characterization of responses to the antiandrogen flutamide in
    a short-term reproduction assay with the fathead minnow. Aquat. Toxicol. 70,
    99-110.
Kahl, M.D., Bencic, D.C., Blake, L.S., Brodin, J.D., Durhan, E.J., Jensen, K.M., Ankley, G.T.,
    2007. Evaluation of a novel mechanism of endocrine disruption in the fathead
    minnow. In: Abstracts, 28th Annual Meeting of the Society of Environmental
    Toxicology and Chemistry, Milwaukee, WI.
Kavlock, R.J., Daston, G.P., DeRosa, C., Fenner-Crisp, P., Gray, L.E., Kaattari, S., et al.,
    1996. Research needs for the risk assessment of health and environmental effects
    of endocrine disrupters: a report of the U.S. EPA-sponsored workshop. Environ.
    Health Perspect. 104 (Suppl. 4), 715-740.
Kidd, K.A., Blanchfield, P.J., Mills, K.H., Palace, V.P., Evans, R.E., Lazorchak, J.L., Flick,
    R.W., 2007. Collapse of a fish population after exposure to a synthetic estrogen.
    Proc. Natl. Acad. Sci. 104 (21), 8897-8901.
Lin, C.Y., Viant, M.R., Tjeerdema, R.S., 2006. Metabolomics: methodologies and appli-
    cations in the environmental sciences. J. Pestic. Sci. 31, 245-251.
Loscalzo, J., Kohane, I., Barabasi, A.L., 2007. Human disease classification in the
    postgenomic era: a complex systems approach to human pathobiology. Mol.
    Syst. Biol. 3, 124.
Martinovic, D., Blake, L.S., Durhan, E.J., Greene, K.J., Kahl, M.D., Jensen, K.M., Makynen,
    E.A., Villeneuve, D.L., Ankley, G.T., 2008. Reproductive toxicity of vinclozolin in
    the fathead minnow: confirming an anti-androgenic mode of action. Environ.
    Toxicol. Chem. 27, 478-488.
Martyniuk, C.J., Gerrie, E.R., Popesku, J.T., Ekker, M., Trudeau, V.L., 2007. Microarray
    analysis in the zebrafish (Danio rerio) liver and telencephalon after exposure to
    low concentration of 17α-ethinylestradiol. Aquat. Toxicol. 84, 38-49.
Miller, D.H., Ankley, G.T., 2004. Modeling impacts on populations: fathead minnow
    (Pimephales promelas) exposure to the endocrine disrupter 17β-trenbolone as a
    case study. Ecotoxicol. Environ. Safety 59, 1-9.
Miller, D.H., Jensen, K.M., Villeneuve, D.L., Kahl, M.D., Makynen, E.A., Durhan, E.J.,
    Ankley, G.T., 2007. Linkage of biochemical responses to population-level effects:
    a case study with vitellogenin in the fathead minnow (Pimephales promelas).
    Environ. Toxicol. Chem. 26, 521-527.
Murphy, C.A., Rose, K.A., Thomas, P., 2005. Modeling vitellogenesis in female fish
    exposed to environmental stressors: predicting the effects of endocrine dis-
    turbance due to exposure to a PCB mixture and cadmium. Repro. Toxicol. 19,
    395-409.
National Research Council (NRC), 2007. Toxicity Testing in the 21st century: A Vision
    and a Strategy. National Academies Press, Washington, DC.
Samuelsson, L.M., Förlin, L., Karlsson, G., Adolfsson-Erici, M., Larsson, D.G.J., 2006.
    Using NMR metabolomics to identify responses of an environmental estrogen in
    blood plasma of fish. Aquat. Toxicol. 78, 341-349.
Schadt, E.E., Lum, P.Y., 2006. Thematic review series: Systems biology approaches
    to metabolic and cardiovascular disorders. Reverse engineering gene networks
    to identify key drivers of complex disease phenotypes. J. Lipid Res. 47, 2601-
    2613.
Shoemaker, J.E., Gayen, K., Garcia-Reyero, N., Perkins, E.J., Doyle III, F.J., 2008. Fathead
    minnow steroidogenesis: in vitro modeling and experimentation reveals global
    regulation of sex hormone synthesis. In: International Conference on Systems
    Biology, Gothenburg, Sweden, August 23-27.
Stoskopf, M.K., 2001. Introduction. Inst. Lab. Anim. Res. J. 42, 271-273.
US Environmental Protection Agency, 1998. Endocrine disrupter screening and test-
    ing advisory committee (EDSTAC) final report. Office of Prevention, Pesticides
    and Toxic Substances, US Environmental Protection Agency, Washington, DC.
    http://www.epa.gov/scipoly/oscpendo/pubs/edspoverview/finalrpt.htm.
Villeneuve, D.L., Knoebl, I., Kahl, M.D., Jensen, K.M., Hammermeister, D.E., Greene,
    K.J., Blake, L.S., Ankley, G.T., 2006. Relationship between brain and ovary aro-
    matase activity and isoform-specific aromatase mRNA expression in the fathead
    minnow (Pimephales promelas). Aquat. Toxicol. 76, 353-368.
Villeneuve, D.L., Blake, L.S., Brodin, J.D., Greene, K.J., Knoebl, I., Miracle, A.L., Mar-
    tinovic, D., Ankley, G.T., 2007a. Transcription of key genes regulating gonadal
    steroidogenesis in control and ketoconazole- or vinclozolin-exposed fathead
    minnows. Toxicol. Sci. 98, 395-407.
Villeneuve, D.L., Larkin, P., Knoebl, I., Miracle, A.L., Kahl, M.D., Jensen, K.M., Makynen,
    E.A., Durhan, E.J., Carter, B.J., Denslow, N.D., Ankley, G.T., 2007b. A graphical sys-
    tems model to facilitate hypothesis-driven ecotoxicogenomics research on the
    teleost brain-pituitary-gonadal axis. Environ. Sci. Technol. 41, 321-330.
Villeneuve, D.L., Blake, L.S., Brodin, J.D., Cavallin, J.E., Durhan, E.J., Jensen, K.M., Kahl,
    M.D., Makynen, E.A., Martinovic, D., Mueller, N.D., Ankley, G.T., 2008. Effects of a
    3β-hydroxysteroid dehydrogenase inhibitor, trilostane, on the fathead minnow
    reproductive axis. Toxicol. Sci. 104, 113-123.
Villeneuve, D.L., Mueller, N.D., Martinovic, D., Makynen, E.A., Kahl, M.D., Jensen, K.M.,
    Durhan, E.J., Cavallin, J.E., Bencic, D., Ankley, G.T., in press. Direct effects, com-
    pensation and recovery in female fathead minnows exposed to the aromatase
    inhibitor fadrozole. Environ. Health Perspect.
Wang, R.-L., Biales, A., Bencic, D., Lattier, D., Kostich, M., Villeneuve, D., Ankley,
    G.T., Lazorchak, J., Toth, G., 2008a. DNA microarray application in ecotoxicology:
    experimental design, microarray scanning, and factors affecting transcriptional
    profiles in a small fish species. Environ. Toxicol. Chem. 27, 652-663.
Wang, R.-L., Bencic, D., Biales, A., Lattier, D., Kostich, M., Villeneuve, D., Ank-
    ley, G.T., Lazorchak, J., Toth, G., 2008b. DNA microarray-based ecotoxicological
    biomarker discovery in a small fish model species. Environ. Toxicol. Chem. 27,
    664-675.
Watanabe, K.H., Jensen, K.M., Orlando, E.F., Ankley, G.T., 2007. What is normal? A
    characterization of the values and variability in reproductive endpoints of the
    fathead minnow, Pimephales promelas. Comp. Biochem. Physiol. 146, 348-356.
Watanabe, K.H., Li, Z., Kroll, K.J., Villeneuve, D.L., Garcia-Reyero, N., Orlando, E.F.,
    Sepulveda, M.S., Collette, T.W., Ekman, D.R., Ankley, G.T., Denslow, N.D., in press.
    A computational model of the hypothalamic-pituitary-gonadal axis in male
    fathead minnows. Toxicol. Sci.
World Health Organization (WHO), 2002. IPCS Global Assessment of the State-
    of-the-Science of Endocrine Disruptors. International Programme on Chemical
    Safety, WHO/PCS/EDC/02.2.



-------
                                                                                                              Commentary
Exposure as Part of a Systems Approach for Assessing Risk
Linda S. Sheldon1 and Elaine A. Cohen Hubal2
1National Exposure Research Laboratory, and 2National Center for Computational Toxicology, U.S. Environmental Protection Agency,
Research Triangle Park, North  Carolina, USA
 BACKGROUND: The U.S. Environmental Protection Agency is facing large challenges in managing
 environmental chemicals with increasingly complex requirements for assessing risk that push
 the limits of our current approaches. To address some of these challenges, the National Research
 Council (NRC) developed a new vision for toxicity testing. Although the report focused only on
 toxicity testing, it recognized that exposure science will play a crucial role in a new risk-based
 framework.
 OBJECTIVE: In this commentary we expand on the important role of exposure science in a fully inte-
 grated system for risk assessment. We also elaborate on the exposure research needed to achieve this
 vision.
 DISCUSSION: Exposure science, when applied in an integrated systems approach for risk assessment,
 can be  used to  inform and prioritize toxicity testing, describe risks, and verify the outcomes of
 testing. Exposure  research in several areas will be needed to achieve the NRC vision. For example,
 models are needed to screen chemicals based on exposure. Exposure, dose-response, and biological
 pathway models must be developed and linked. Advanced computational approaches are required
 for dose reconstruction. Monitoring methods are needed that easily measure exposure,  internal
 dose, susceptibility, and biological outcome. Finally, population monitoring studies are needed to
 interpret toxicity test results in terms of real-world risk.
 CONCLUSION: This commentary is a call for the exposure community to  step up to the challenge by
 developing a predictive science with the knowledge and tools for moving into the 21st century.
 KEY WORDS: computational biology, exposure science, modeling, risk assessment, systems  biology.
 Environ Health Perspect 117:1181-1184 (2009). doi:10.1289/ehp.0800407 available via
 http://dx.doi.org/ [Online 8 April 2009]
The U.S. Environmental Protection Agency
(EPA) and other regulatory agencies are respon-
sible for managing large numbers of environ-
mental chemicals. Although current regulatory
decisions are based on a wide range of tools and
information that represent the best available
science, often limited or no exposure or toxic-
ity data are available for making these decisions
(Judson et al. 2009). Recent statutory changes
require increasingly complex approaches for
evaluating the impact of life-stage vulnerabil-
ity, genetic susceptibility, varying exposure sce-
narios, and exposures to multiple stressors on
environmental health risks. These  new require-
ments push the limits of our current tools and
scientific understanding. Fortunately, the rapid
explosion of new computational, physical, and
biological science tools has the potential to
address these challenges and to transform the
ways in which exposure and  toxicity testing
come together to  assess health risks.
    Because  of  the  number of chemicals
involved and the increasing  complexity
of future assessments,  new approaches are
needed. To examine and address  these limita-
tions, the National Research Council (NRC)
evaluated the issues and developed a frame-
work for toxicity testing as it could be applied
to risk assessment. The report, Toxicity Testing
in the 21st Century: A Vision and Strategy
(NRC 2007), articulates a long-range vision
that applies systems biology, rapid assay tech-
nologies, and bioinformatic tools to improve
toxicity  testing. Although  the focus was
intentionally on toxicity testing,  the NRC
recognized that exposure must be a key com-
ponent if the intended goal is to evaluate risks
and inform public health decisions. Exposure
science, when incorporated throughout the
entire framework, will increase the efficiency
of the testing process, help inform toxicity
testing, describe risks,  and verify the  out-
comes of new risk assessment approaches. The
NRC recommended that exposure  science be
considered at  every step in the new testing
and risk assessment strategy.
   In this commentary, we establish a  stra-
tegic  framework for the  exposure research
needed  to achieve a new approach for risk
assessment. Crucial to this vision is the appli-
cation of a systems  approach that  fully inte-
grates exposure  and toxicity information in
a holistic framework for improved public
health decision making.  We also elaborate on
the exposure research needed to achieve this
new vision.
A Systems Approach for
Assessing Risk
The authors of Toxicity Testing in the 21st
Century (NRC 2007) proposed to use sys-
tems biology to serve as the basis for a new
toxicity-testing paradigm. The fundamen-
tal construct is to develop in vitro tests to
characterize toxicity pathway perturbations
and then predict health impacts that could
result from these perturbations. If we broaden
this vision, systems theory also provides the
required conceptual framework for linking
exposure science and toxicology in order to
study, characterize, and predict  the complex
interactions between humans and environ-
mental chemicals that lead to health risks.
    Toxicity pathways, as articulated by the
NRC, are normal pathways for maintain-
ing cellular functions that, when sufficiently
perturbed,  will lead to an adverse health out-
come. The consequences of a perturbation
depend on its magnitude,  which is related to
dose at the cellular level, the timing and dura-
tion of the perturbation, and the  susceptibility
and life stage of the host. Exposure science
provides information on the magnitude, tim-
ing, and duration of individual exposure as
well as the  resulting dose at the tissue, cellular,
and even molecular level (Cohen Hubal et al.
2008). Importantly, exposure information will
determine  whether toxicity pathways can be
perturbed and whether there is a risk.
    A fully integrated systems approach will
reduce many of the uncertainties with current
risk assessment  approaches. Understanding
the mechanisms of toxicity pathways will
reduce uncertainties associated with using
animal data to  predict human risk. When
integrated  with  exposure and dose informa-
tion, it also affords the opportunity to reduce
uncertainties associated with using high doses
to predict  risk at lower environmental expo-
sures,  predict cumulative risks, and predict
risk to susceptible populations.

Exposure  Science for the
21st Century
Because of the complex  nature of the human
system, health risk predictions associated with
chemical exposures will be  only as good as the
least resolved or  least understood component.
Advanced tools are  available to rapidly exam-
ine toxicity pathways at a  depth and breadth
not previously possible. For a fully integrated
system, a comparable set of advanced exposure
tools must be developed; these tools must be
rapid, efficient, and predictive.

Address correspondence to L.S. Sheldon, U.S. EPA,
National Exposure Research Laboratory, Mail Drop
D305-01, 109 Alexander Dr., Research Triangle
Park, NC 27711 USA. Telephone: (919) 541-2205.
Fax: (919) 541-0445. E-mail: sheldon.linda@epa.gov
  We thank L. Reiter, R. Kavlock, H. Zenick, and
J. Blancato for valuable discussions and suggestions.
  This work was reviewed by the U.S. EPA and
approved for publication, but does not necessarily
reflect official agency policy.
  The authors declare they have no competing
financial interests.
  Received 18 November 2008; accepted 8 April
2009.

    Exposure science provides the linkages
between what is present in the environment
and the internal dose that individuals and
populations receive.  A strategic long-term
program for exposure research must develop
predictive computation tools based on  a
mechanistic understanding of important (i.e.,
rate-limiting) exposure processes and determi-
nants. High-priority research needs include
the development and application of
• integrated modeling approaches to reliably
  predict exposure and dose
• highly efficient screening tools for chemical
  prioritization
• easily accessible exposure databases aligned
  with toxicity databases
• efficient and affordable tools for generating
  new exposure and dose data.
    Integrated modeling approaches for pre-
dicting exposure and dose. Computational
models that can be efficiently integrated to
predict exposure and dose at the toxicity path-
way are fundamental  to the new risk assess-
ment vision. These models, in turn, should be
integrated with dose-response and biological
pathways models to describe the entire source
to outcome continuum.
    Exposure models estimate concentrations of
chemicals in environmental media and describe
activities that bring individuals into contact
with the contaminated media.  Several mod-
els have been developed and applied for this
purpose (Williams et al., in press). The U.S.
EPA Stochastic Human Exposure and Dose
Simulation (SHEDS) model (U.S. EPA 2009a)
can track activities minute by minute through-
out the day and link these activities to environ-
mental concentrations to estimate exposures
by specific route and pathway (Zartarian et al.
2006). The longitudinal aspect of the model
provides the ability to estimate not only the
magnitude, but also the frequency and dura-
tion of exposure over the same time period.
When SHEDS model outputs are linked to a
physiologically based pharmacokinetic (PBPK)
model such as U.S. EPA's Exposure Related
Dose Estimating Model (ERDEM) (U.S. EPA
2007; Zhang et al. 2007), the magnitude, fre-
quency, and duration of internal dose can also
be predicted. Figure 1 illustrates this linkage
for methyl tert-butyl ether exposures.
    Integrated exposure/PBPK models can
be used in several ways. Outputs can be used
directly to inform toxicity testing as well as to
conduct quantitative risk assessments. Linked
models can simulate dose for multiple routes
(inhalation, ingestion, and dermal) and mul-
tiple chemicals simultaneously, thus provid-
ing the ability to evaluate cumulative risks.
Integrated models also provide the ability to
evaluate risks to susceptible populations by
considering differential activities that could
change exposures or differential physiology
that could affect absorption, distribution,
metabolism, or elimination characteristics.
Finally, the models can be used in reverse for
dose reconstruction as an alternative approach
for comparing toxicity testing results to popu-
lation exposures (Georgopoulos et al. 2008;
Tan et al. 2007).

Figure 1. Illustration of linked exposure (SHEDS) and dose (ERDEM) models for methyl tert-butyl ether. [Panels: SHEDS dermal, inhalation, and oral exposure versus time (min); ERDEM output versus time (min).]
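
The linkage between a time-resolved exposure model and a pharmacokinetic model can be illustrated with a deliberately simplified sketch. It is not SHEDS or ERDEM: the activity diary, the air concentrations, the breathing rate, and the one-compartment elimination model are all hypothetical placeholders chosen only to show how a minute-by-minute exposure series can drive an internal dose series.

```python
import math

# Hypothetical activity diary: (start_min, end_min, microenvironment).
DIARY = [(0, 480, "home"), (480, 540, "vehicle"), (540, 1440, "office")]
# Hypothetical air concentrations (mg/m^3) for each microenvironment.
AIR_CONC = {"home": 0.001, "vehicle": 0.05, "office": 0.002}
BREATHING_RATE = 0.01  # m^3/min, assumed constant for simplicity
K_ELIM = 0.005         # 1/min, assumed first-order elimination rate


def inhalation_intake(diary, conc, rate):
    """Return the inhaled mass rate (mg/min) for every minute of the diary."""
    intake = []
    for start, end, place in diary:
        intake.extend([conc[place] * rate] * (end - start))
    return intake


def body_burden(intake, k_elim, dt=1.0):
    """One-compartment model dA/dt = intake - k_elim * A.

    The intake is treated as constant within each minute, so each step
    uses the exact solution of the linear equation over that minute.
    """
    amount, series = 0.0, []
    decay = math.exp(-k_elim * dt)
    for rate in intake:
        amount = amount * decay + rate * (1.0 - decay) / k_elim
        series.append(amount)
    return series


intake = inhalation_intake(DIARY, AIR_CONC, BREATHING_RATE)
burden = body_burden(intake, K_ELIM)
# Index of the minute at which the simulated body burden is highest.
peak_minute = max(range(len(burden)), key=burden.__getitem__)
```

In this toy setup the short, high-concentration commute dominates the internal dose profile, which is the kind of magnitude, frequency, and duration information that a daily average exposure would hide. A real PBPK model would replace the single compartment with tissue-specific compartments and chemical-specific parameters.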
    Realizing the potential for integrated
modeling approaches requires a coordinated
and sustained research effort. PBPK models
need to be extended to allow dose estimation
at the cellular and molecular level. Integrated
exposure/PBPK models must be enhanced
to provide distributional outputs along with
uncertainty and variability. Developing sys-
tems that are efficient and generalizable must
be a part of this effort. New data and new
approaches are needed for exposure recon-
struction in order to reduce uncertainties
with current approaches. Current efforts
(Georgopoulos and Lioy 2006; Rosenbaum
et al. 2007) to provide models that use a com-
mon platform and/or common programming
language must continue.
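
One common way to obtain distributional outputs is Monte Carlo sampling over the population. The sketch below is a minimal illustration under assumed inputs: the distributions for air concentration, inhalation rate, and body weight are hypothetical and are not taken from SHEDS, ERDEM, or any cited study. It captures inter-individual variability only.

```python
import random
import statistics


def simulate_population(n, seed=0):
    """Draw n hypothetical individuals and return average daily doses (mg/kg/day)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    doses = []
    for _ in range(n):
        conc = rng.lognormvariate(-3.0, 0.8)      # air concentration, mg/m^3
        inhaled = rng.uniform(10.0, 20.0)         # inhalation rate, m^3/day
        weight = max(rng.gauss(70.0, 12.0), 30.0) # body weight, kg, floored at 30
        doses.append(conc * inhaled / weight)
    return doses


doses = simulate_population(5000)
# quantiles(n=20) returns 19 cut points: the 5th, 10th, ..., 95th percentiles.
cuts = statistics.quantiles(doses, n=20)
p5, p50, p95 = cuts[0], cuts[9], cuts[18]
```

Reporting the 5th, 50th, and 95th percentiles instead of a single value shows how widely dose varies across individuals. Uncertainty in the input distributions themselves could be explored by repeating the simulation in an outer loop with resampled distribution parameters, which is left out here for brevity.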
    At the same time, research is required
to develop approaches for estimating model
inputs and parameters without resource-inten-
sive and burdensome studies. For example,
environmental informatics, quantitative
structure-activity relationships (QSARs), and com-
putational chemistry approaches should be
developed to predict and quantify behaviors
such as environmental fate and transport or
metabolism. Development of metabolic pre-
dictors or simulators that can address single
chemicals, multiple chemicals, and the inter-
action among chemicals should be accelerated
(Mekenyan et al. 2005). Novel statistical and
informatic approaches should be applied to
extant exposure data to facilitate the identifica-
tion of critical metrics that represent personal
exposure through time, place, life stage, life-
style, or behavior.
    Exposure screening tools for accelerated
chemical prioritization. Current risk assess-
ment approaches cannot meet demands for
the large number of chemicals that must be
evaluated. Screening tools are needed that
reliably identify those chemicals that will
require more comprehensive risk assessments.
Chemical prioritization should consider
both exposure and hazard. The U.S. EPA,
through its ToxCast program (U.S. EPA
2009b), is developing rapid in vitro assays to
screen chemicals for further testing based on
toxicity (Dix et al. 2007). Innovative rapid-
screening tools based on exposure are also
needed. Predictive approaches for estimat-
ing important parameters for screening need
to be developed. Ideally, these tools should
account for chemical use, physical and chemi-
cal properties, occurrence and co-occurrence
of chemicals, potential exposure scenarios,
routes of exposure, and various exposure fac-
tors. This will include developing approaches
that describe a chemical's behavior in the
environment as well as approaches to identify
important human activities that will impact
exposure. Exposure prioritization approaches
will require easily accessible databases, as
described below.
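
The point that prioritization should consider both exposure and hazard can be made concrete with a small sketch. The ratio used here (estimated exposure divided by a hazard-based reference dose) is one simple way to combine the two and is an assumption of this example, not a method prescribed by ToxCast or by the text; the chemical names and values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chemical:
    name: str
    exposure: float    # estimated daily exposure, mg/kg/day (hypothetical)
    hazard_ref: float  # dose associated with bioactivity, mg/kg/day (hypothetical)


def priority_ratio(chem):
    """Exposure relative to the hazard reference dose; larger means higher priority."""
    return chem.exposure / chem.hazard_ref


def prioritize(chemicals):
    """Return the chemicals sorted from highest to lowest priority ratio."""
    return sorted(chemicals, key=priority_ratio, reverse=True)


CANDIDATES = [
    Chemical("chem_A", exposure=1e-4, hazard_ref=0.001),
    Chemical("chem_B", exposure=1e-2, hazard_ref=1.0),
    Chemical("chem_C", exposure=1e-3, hazard_ref=0.002),
]
ranked = [c.name for c in prioritize(CANDIDATES)]
```

With these illustrative numbers, ranking by exposure alone would put chem_B first and ranking by potency alone would put chem_A first, while the combined ratio puts chem_C first. The example is only meant to show that the two screens can disagree and that the combined ordering differs from either one.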
    One plausible approach may be to formu-
late an exposure classification index based on
a limited set of metrics designed to efficiently
cover exposure potential (Cohen Hubal et al.
2008). As  a first step, innovative approaches
for chemical prioritization  (e.g., Arnot  and
MacKay 2008; Hays et al.  2007)  as well as
indexing approaches from other fields should
be reviewed and mined. This index could be
"trained" on data-rich chemicals and products
and then validated  on a representative set of
chemicals  for which little exposure data are
available. In this way, a limited set of criti-
cal metrics could be identified for efficient
screening of new chemicals. Finally, because
consumer  products often incorporate mul-
tiple chemicals in a variety of forms,  rapid
experimental screening protocols that measure
the potential for availability or release of these
compounds into exposure media are under
early development and should be pursued fur-
ther (Little and Cohen Hubal, in press).
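
The idea of "training" an index on data-rich chemicals and then validating it on a separate set can be sketched as follows. Everything in this example is hypothetical: the three metrics, the synthetic training and validation values, and the choice of a linear index fitted by least squares are assumptions made for illustration, not the approach of the cited work.

```python
def fit_weights(features, targets, lr=0.1, steps=5000):
    """Least-squares fit of linear index weights by plain gradient descent."""
    n_feat = len(features[0])
    weights = [0.0] * n_feat
    for _ in range(steps):
        grad = [0.0] * n_feat
        for x, y in zip(features, targets):
            err = sum(w * xi for w, xi in zip(weights, x)) - y
            for j in range(n_feat):
                grad[j] += err * x[j] / len(features)
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


def exposure_index(weights, metrics):
    """Weighted sum of the screening metrics."""
    return sum(w * m for w, m in zip(weights, metrics))


def rank_order(scores):
    """Indices of the items sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical metrics scaled to [0, 1]:
# (production volume, consumer-use flag, persistence).
TRAIN_X = [(0.9, 1.0, 0.2), (0.1, 0.0, 0.8), (0.5, 1.0, 0.5),
           (0.3, 0.0, 0.1), (0.8, 0.0, 0.9), (0.2, 1.0, 0.4)]
# Synthetic exposure scores for the data-rich training chemicals.
TRAIN_Y = [2.9, 0.6, 2.25, 0.65, 2.05, 1.6]
# Held-out chemicals used only to check the trained index.
VALID_X = [(0.7, 1.0, 0.3), (0.2, 0.0, 0.2), (0.4, 0.0, 0.9)]
VALID_Y = [2.55, 0.5, 1.25]

WEIGHTS = fit_weights(TRAIN_X, TRAIN_Y)
PREDICTED = [exposure_index(WEIGHTS, x) for x in VALID_X]
ranks_agree = rank_order(PREDICTED) == rank_order(VALID_Y)
```

For a screening index, agreement in rank order on the held-out chemicals is often a more relevant check than agreement in absolute value, since the goal is to decide which chemicals to examine first. With real data the fit would be noisy, and a rank correlation over a larger validation set would be a more informative measure than the exact match used here.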
    Significant research and model develop-
ment activities are currently under way
within the U.S. EPA (U.S. EPA 2005)
as well as  in Canada and Europe (Bridges
2007;  Environment Canada 1999; Van der
Wielen 2007). Partnerships with these groups
should be  fostered  to leverage and establish
collaborative exposure science research for
future chemical screening and prioritization.
    Exposure databases. Easily accessible
exposure databases that can be linked to each
other and with toxicity databases can  and
should be  developed immediately. Data on
chemical manufacture, product use, environ-
mental  fate, media concentrations, biomarker
levels,  and metabolism should be identified.
International standards for exposure data rep-
resentation should be discussed. Approaches
for improving access to human exposure data
and for facilitating links between exposure
and toxicity data should be  implemented.
Existing tools  and platforms that are cur-
rently being implemented with environmen-
tal toxicity information should be adapted
for exposure information to  provide the most
useful links to existing toxicity data. Chemical
structure annotation of exposure-related data,
such as could be provided by the Distributed
Structure Searchable Toxicity (DSSTox) data-
base (Richard et al. 2006, 2008; U.S. EPA
2009c),  and incorporation of such data into
the new Aggregate Computational Toxicology
Resource (ACToR)  (Judson et al. 2008)  will
greatly  enhance linkages between these data
and toxicity-related human health end points.
For maximum impact, this activity should
be conducted in collaboration  with inter-
national partners working to achieve similar
goals (Environment Canada 1999; Van der
Wielen 2007).
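
Linking exposure and toxicity data depends on a shared chemical identifier. The sketch below shows the basic join and how it exposes data gaps. The identifiers, field names, and values are hypothetical and do not reflect the actual schema of DSSTox or ACToR.

```python
# Hypothetical exposure records keyed by a shared chemical identifier.
EXPOSURE = {
    "ID-0001": {"use": "consumer product", "median_biomarker_ug_L": 2.1},
    "ID-0002": {"use": "industrial solvent", "median_biomarker_ug_L": 0.3},
}
# Hypothetical toxicity records keyed by the same kind of identifier.
TOXICITY = {
    "ID-0001": {"assay_hits": 12},
    "ID-0003": {"assay_hits": 4},
}


def link_records(exposure, toxicity):
    """Outer join on chemical identifier; a missing side is recorded as None."""
    linked = {}
    for chem_id in sorted(set(exposure) | set(toxicity)):
        linked[chem_id] = {
            "exposure": exposure.get(chem_id),
            "toxicity": toxicity.get(chem_id),
        }
    return linked


def data_gaps(linked):
    """Identifiers for which either exposure or toxicity data are missing."""
    return [chem_id for chem_id, rec in linked.items()
            if rec["exposure"] is None or rec["toxicity"] is None]


LINKED = link_records(EXPOSURE, TOXICITY)
GAPS = data_gaps(LINKED)
```

An outer join is used so that chemicals with only one kind of data remain visible. In practice the hard part is agreeing on the identifier itself, which is why chemical structure annotation and common data standards matter for this kind of linkage.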
    Efficient monitoring methods for assessing
risk. Population-based and surveillance studies
will provide the ability to link the results from
toxicity testing to the real world and to track
our progress in protecting public health. To be
feasible, new low-cost, low-burden  methods
and approaches for conducting these studies
will be  needed. New technologies need to be
applied to develop a toolbox of methods for
assessing exposure, susceptibility, and biologi-
cal response in large surveillance studies. New
sensor  technologies, applications of nano-
technology, geographic information systems,
and genomics assays need to be developed and
put into use for this purpose.
   Emerging tools in molecular biology
provide the potential to develop  cellular and
molecular indicators of exposure and bio-
logical  response.  Better understanding of
genomic expression may also provide insight
into factors impacting differences in suscep-
tibility  to chemical  exposure in the human
population (Oberemm et al. 2005).  "Omics"
should be explored as a way to  identify
expression patterns associated with  exposure
to individual chemicals or chemical mixtures.
Such technologies could then provide link-
ages between exposure and health outcomes
in population studies. Development of envi-
ronmental and/or molecular  indicators of
exposure combined with development of
novel sensor-based monitoring tools will pres-
ent the opportunity for simultaneous, near-
real-time measurement of exposure and dose
to multiple real-world stressors  in mixtures
(Weis et al. 2005). A strong  and immedi-
ate research effort  is required for novel tech-
nologies that will generate the data required
for risk assessments  and decisions that truly
protect  public health.

Conclusions
Exposure science is crucial for addressing many
of our important and complex environmen-
tal health issues. As discussed here,  exposure
science  is essential for toxicity testing to be
valuable in public health protection. A systems
approach is required that fully integrates expo-
sure and toxicity into a holistic framework for
risk assessment.
   The exposure  community must step up
to the challenge to develop a robust and pre-
dictive  science that can be used to address
the complex problems in the 21st century. A
research program that provides the necessary
exposure data and tools within an integrated
framework will need to be multidisciplinary
and take advantage of collaborative opportuni-
ties. Key to this work are strong collaborations
within  the exposure community and with
those researchers  who  are developing infor-
mation  on toxicity pathways and conducting
toxicity testing. Multiple collaborations are
needed  to ensure that
• chemical prioritization considers both
  exposure and toxicity
• databases are developed
• analysis is conducted using these databases
  to understand exposures, doses, and toxicity
• new information on biological interactions
  and pathways is used to develop the appro-
  priate indicators for exposure and surveillance
  studies
• models on exposure and dose are linked for
  extrapolations
• feedback loops are developed to inform
  future planned research.
Collaborations should include partnerships
between governmental and nongovernmental
research groups as well as academia and indus-
try for developing and applying new exposure
tools. Just as a new vision and initiatives have
been  developed for  toxicity testing,  it is now
time for the exposure community to dedicate
itself to engaging in similar activities to move
our science into the 21st century.

                 REFERENCES

Arnot JA, MacKay D. 2008. Policies for chemical hazard and
   risk priority setting: can persistence, bioaccumulation,
   toxicity, and quantity information be combined? Environ
   Sci Technol 42(13):4648-4654.
Bridges J. 2007. REACH epilogue. J Expo Sci Environ Epidemiol
   17(suppl):S101-S104.
Cohen Hubal EA, Richard AM, Imran S, Gallagher J, Kavlock R,
   Blancato J, et al. 2008.  Exposure science and the U.S. EPA
   National Center for Computational Toxicology. J Expo
   Sci Environ Epidemiol; doi: 10.1038/jes.2008.70 [Online
   5 November 2008].
Dix D, Houck K, Martin M, Richard A, Setzer R, Kavlock R. 2007.
   The ToxCast program for prioritizing toxicity testing of
   environmental chemicals. Toxicol Sci 95(1):5-12.
Environment  Canada. 1999. Domestic  Substances List:
   Categorization and Screening Program. Gatineau, Quebec.
   Available: http://www.ec.gc.ca/substances/ese/eng/dsl/
   dslprog.cfm [accessed  23 June 2009].
Georgopoulos P, Lioy P. 2006. Theoretical aspects of human
   exposure and dose assessment to computational model
   implementation: the Modeling ENvironment for TOtal Risk
   Studies (MENTOR). J Toxicol Environ Health Part B Crit
   Rev 9(6):457-483.
Georgopoulos P, Sasso A, Isukapalli S, Lioy P, Vallero  D,
   Okino M, et al. 2008. Reconstructing population exposures to
   environmental chemicals from biomarkers: challenges and
   opportunities. J Expo Sci Environ Epidemiol; doi:10.1038/
   jes.2008.9 [Online 26 March 2008].
Hays SM, Becker R, Leung H, Aylward L, Pyatt D. 2007.
   Biomonitoring equivalents: a screening approach for inter-
   preting biomonitoring results from a public health risk
   perspective. Regul Toxicol Pharmacol 47(1):96-109.
Judson R, Richard A, Dix  D, Houck K, Elloumi F, Martin M,
   et al. 2008. ACToR—Aggregated Computational Toxicology
   Resource. Toxicol Appl Pharmacol 233(1):7-13.
Judson R, Richard A, Dix DJ, Houck K, Martin M, Kavlock R,
   et al. 2009. The toxicity data landscape for environmental
   chemicals. Environ Health Perspect 117:685-695.
Little J, Cohen Hubal E. In press. Rapid screening to estimate
   health risks of materials and products. In: Proceedings
   of  the 11th  International Conference on Indoor Air
   Quality and Climate, 17-22 August 2008, Copenhagen,
   Denmark.
Mekenyan O, Jones J, Schmieder P, Kotov S, Pavlov T,
   Dimitrov S. 2005. Performance, reliability, and improvement
   of a tissue-specific metabolic simulator. In: Abstract Book.
   SETAC North America 26th Annual Meeting "Environmental
   Science in a Global Society: SETAC's Role in the Next
   25  Years," 13-17 November, Baltimore, MD. Pensacola,
   FL:SETAC North America—Society of Environmental
   Toxicology and Chemistry.
NRC (National Research Council). 2007. Toxicity Testing in
   the 21st Century: A Vision and a Strategy. Washington,
   DC:National Academies Press.
Environmental Health Perspectives • VOLUME 117 | NUMBER 8 | August 2009

-------
Sheldon and Cohen Hubal
Oberemm A, Onyon L, Gundert-Remy U. 2005. How can toxicogenomics
    inform risk assessment? Toxicol Appl Pharmacol
    207(suppl 2):592-598.
Richard AM,  Gold LS, Nicklaus MC. 2006. Chemical structure
    indexing  of toxicity data on the internet: moving toward a
    flat world. Curr Opin Drug Discov Devel 9(3):314-325.
Richard A, Yang C, Judson R. 2008. Toxicity data informatics:
    supporting  a new paradigm for toxicity prediction. Toxicol
    Mech Methods 18:103-118.
Rosenbaum R, Margni M, Jolliet O. 2007. A flexible algebra
    framework for the multimedia multipathway modeling of
    emissions to impacts. Environ Int 33:624-634.
Tan YM, Liao KH, Clewell HJ III. 2007. Reverse dosimetry: interpreting
    trihalomethanes biomonitoring data using physiologically
    based pharmacokinetic modeling. J Expo Sci
    Environ Epidemiol 17(7):591-603.
U.S. EPA (U.S. Environmental Protection Agency). 2005. Chemical
    Assessment and Management Program (ChAMP). Available:
    http://www.epa.gov/champ/ [accessed 23 June 2009].
U.S. EPA (U.S. Environmental Protection Agency). 2007.
    Exposure Related Dose Estimating  Model (ERDEM).
    Available: http://www.epa.gov/heasd/products/erdem/
    erdem.htm [accessed 23 June 2009].
U.S. EPA (U.S. Environmental Protection Agency). 2009a.
    SHEDS-Multimedia. Details of SHEDS-Multimedia ver-
    sion 3: ORD/NERL's Model to Estimate Aggregate and
    Cumulative Exposures to Chemicals. Available: http://
    www.epa.gov/heasd/products/sheds_multimedia/sheds_
    mm.html [accessed 23 June 2009].
U.S. EPA (U.S. Environmental Protection Agency). 2009b.
    ToxCast™ Program: Predicting Hazard, Characterizing
    Toxicity Pathways, and Prioritizing the Toxicity Testing of
    Environmental Chemicals. Available: http://www.epa.gov/
    ncct/toxcast/ [accessed 23 June 2009].
U.S. EPA (U.S. Environmental Protection Agency). 2009c.
    DSSTox: Distributed Structure-Searchable Toxicity
    (DSSTox) Database Network. Available: http://www.epa.
    gov/NCCT/dsstox/ [accessed 23 June 2009].
Van der Wielen A. 2007. REACH: next step to a sound chemicals
    management. J Expo Sci Environ Epidemiol 17(suppl 1):
    S2-S6.
Weis BK, Balshaw J, Barr D, Brown A, Ellisman M, Lioy P. 2005.
    Personalized exposure assessment: promising approaches
    for human environmental health  research. Environ Health
    Perspect 113:840-848.
Williams PRD, Hubbell BJ, Weber E, Fehrenbacher C, Hrdy D,
    Zartarian  V. In press. An overview of exposure assess-
    ment models used by the U.S. Environmental Protection
    Agency. J Expo Sci Environ Epidemiol.

-------
               Genetic Basis  for Adverse Events after  Smallpox
               Vaccination
               David M. Reif,1,3,a Brett A. McKinney,7 Alison A. Motsinger,3,b Stephen J. Chanock,8 Kathryn M. Edwards,5
               Michael T. Rock,5 Jason H. Moore,1,2 and James E. Crowe, Jr.4,5,6
               1Computational Genetics Laboratory and 2Department of Genetics, Dartmouth Medical School, Lebanon, New Hampshire; 3Center for Human
               Genetics Research, Vanderbilt University, and 4Program in Vaccine Sciences and Departments of 5Pediatrics and 6Microbiology and Immunology,
               Vanderbilt University Medical Center, Nashville, Tennessee; 7Department of Genetics, University of Alabama School of Medicine, Birmingham;
               and 8Center for Cancer Research and Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health,
               Bethesda, Maryland
               (See the editorial commentary by Relman, on pages 4-5.)

               Identifying genetic factors associated with the development of adverse events might allow screening before vac-
               cinia virus administration. Two independent clinical trials of the smallpox vaccine (Aventis Pasteur) were con-
               ducted in healthy, vaccinia virus-naive adult volunteers. Volunteers were assessed repeatedly for local and sys-
               temic adverse events (AEs) associated with the receipt of vaccine and underwent genotyping for 1442 single-
               nucleotide polymorphisms (SNPs). In the first study, 36 SNPs in 26 genes were associated with systemic AEs
               (P ≤ .05); these 26 genes were tested in the second study. In the final analysis, 3 SNPs were consistently associated
               with AEs in both studies. The presence of a nonsynonymous SNP in the methylenetetrahydrofolate reductase
               (MTHFR) gene was associated with the risk of AE in both trials (odds ratio [OR], 2.3 [95% confidence interval {CI},
               1.1-5.2] [P = .04] and OR, 4.1 [95% CI, 1.4-11.4] [P<.01]). Two SNPs in the interferon regulatory factor-1
               (IRF1) gene were associated with the risk of AE in both sample sets (OR, 3.2  [95% CI, 1.1-9.8] [P = .03] and OR,
               3.0 [95% CI, 1.1-8.3] [P = .03]). Genetic polymorphisms in genes expressing an enzyme previously associated
               with adverse reactions to a variety of pharmacologic agents (MTHFR) and an immunological transcription factor
               (IRF1) were associated with AEs after smallpox vaccination in 2 independent study samples.
               Although  reactions occurring  after  inoculation of
               vaccinia virus  were  commonly observed  in recent
               population-wide vaccination programs [1], the biologi-
               cal basis for these adverse events (AEs) is not well under-
               stood. Performance of 2 independent clinical studies of a
               single vaccinia virus vaccine at our study site afforded us
                 Received  14 September 2007;  accepted 28 December 2007; electronically
               published 2 May 2008.
                 Potential conflicts of interest: J.E.C. received research funding from Sanofi-
               Aventis and Vaxgen and a joint Small Business Technology Transfer award (with
               Mapp Pharmaceuticals). He has consulted for MedImmune, Vaxin, Evogenix,
               Symphogen, and Syngenta. K.M.E. received research funding from Sanofi-Aventis,
               MedImmune, Vaxgen, Merck, and Wyeth. She has also consulted for MedImmune
               and Wyeth. All other authors: no conflicts.
                 Financial support: Vaccine Trials and Evaluation Unit, National Institutes of
               Health (NIH)/National Institute of Allergy and Infectious Diseases (NIAID) (contract
               N01-AI-25462 for Division of Microbiology and Infectious Diseases (DMID) studies
               02-054 and 03-044); NIH/NIAID (grants K25-AI-64625, R21-AI-59365, and R01-AI-
               59694); and NIH/National  Institute of General Medical Sciences (grant R01
               GM-62758). The clinical study was supported in part by the National Center for
               Research Resources, NIH (grant M01 RR-00095).
               The Journal of Infectious Diseases  2008; 198:16-22
               © 2008 by the Infectious Diseases Society of America.  All rights reserved.
               0022-1899/2008/19801 -0006$15.00
               DOI:  10.1086/588670
the unique opportunity to assess genetic factors  that
might predict systemic AEs. All of the vaccinia virus-
naive subjects who were enrolled in the study developed
pock formation at the vaccination site, and a subset ex-
perienced systemic reactions that included fever, rash, or
regional lymphadenopathy. Because poxviruses  have
developed multiple mechanisms by which to evade host
immune responses, such as targeting of primary innate
immunity and manipulation  of intracellular  signal
transduction pathways [2], we questioned whether sub-
jects experiencing AEs exhibited unique genetic poly-
morphisms in these pathways that made them more sus-
ceptible to these reactions.
  The funding organizations played no role in the design and conduct of the study;
collection, management, analysis, and interpretation of the data; or in the prep-
aration, review, or approval of the manuscript.
  a Present affiliation: National Center for Computational Toxicology, US Envi-
ronmental Protection Agency, Research Triangle Park, North Carolina.
  b Present affiliation: Bioinformatics Research Center, Department of Statistics,
North Carolina State University, Raleigh.
  Reprints or correspondence: Dr. James E. Crowe, Jr., T2220 Medical Center N,
1161 21st Ave. S, Nashville, TN 37232-2905 (james.crowe@vanderbilt.edu).
16 • JID 2008:198 (1 July) • Reif et al.

-------
  In earlier studies, we characterized humoral and cellular im-
mune responses and outlined patterns of systemic cytokine ex-
pression after smallpox vaccination [3-8]. In the present study,
we utilized data collected during 2 independent studies to iden-
tify stable genetic factors associated with AEs. Because there is
failure to replicate the results of many genetic association studies
during subsequent studies, we sought to repeat the assessment in
an additional study group [9,10]. The fact that the results of our
first study were independently replicated in the second study
strengthens the plausibility of these genetic associations. An
identical panel of candidate single-nucleotide polymorphisms
(SNPs) was evaluated in each of the studies. Subjects with sys-
temic AEs, including fever, lymphadenopathy, or generalized ac-
neiform rash, were compared with subjects who did not experi-
ence these reactions. The data used in both studies were the
genotypes at 1442 SNPs across at least 386 candidate genes. The
present investigation provides, for 2 independent data sets, im-
portant  preliminary findings addressing the  contribution of
common  genetic variants  to a  complex clinical phenotype,
which also is of substantial importance with respect to public
health.



Study subjects.   The vaccines, study subjects, and study design
used in both of the clinical trials have been previously described
in detail. Both trials were conducted at the National Institutes of
Health (NIH)-funded Vaccine and Treatment Evaluation Unit
at Vanderbilt University (Nashville, Tennessee) [4, 8, 11]. The
first study [7] enrolled 85 healthy, vaccinia virus-naive adults in
genotyping studies, and  the  second study  [11]  enrolled 46
healthy, vaccinia virus-naive adults. In both studies, individuals
were asked to self-identify their ethnic background. Both studies
complied  with the  policies of the internal review boards of
Vanderbilt University and the NIH, and written informed con-
sent was obtained from all individuals.
  Clinical assessments.  For both studies, the  same team of
trained physicians and  nurses used the same forms to obtain a
medical history and to record local and systemic AEs occurring
after vaccination. Subjects were examined at regular intervals
(on days 3-5, 6-8, 9-11, 12-15, and 26-30 after vaccination).
Local and systemic AEs were recorded. Subjects who had an oral
temperature >38.3°C at any time during the study, generalized
skin eruptions on areas noncontiguous to the site of vaccination
[11], or enlarged or tender regional lymph nodes associated with
vaccination were defined as  subjects experiencing systemic AEs.
  Identification of genetic polymorphisms.   We used a previ-
ously described custom  SNP panel based on the National Cancer
Institute (NCI) SNP500 Cancer Project [12]; this panel targets in-
vestigation of soluble-factor mediators  and signaling  pathways,
many of which have known  immunological significance [13]. In
this panel, there is heavy weighting toward nonsynonymous SNPs
      (i.e., SNPs that result in an amino acid substitution). Genotyping
      for SNPs was performed using DNA directly amplified from Ep-
      stein Barr virus (EBV)-transformed B cells generated from periph-
      eral blood samples collected from each subject. Genotyping was
      performed at the Core Genotyping Facility of the NCI in Gaithers-
      burg, Maryland. Genotypes were generated using Illumina Golden-
      Gate  assay technology. Of the  1536 SNPs assayed, a total of 1442
      genotypes passed quality control filters for both the first and second
      sample sets.
        Statistical analysis.   The  clinical and demographic charac-
      teristics, including age, sex, and race, noted in the  first and sec-
      ond studies were compared using Student's t test (for age) and
      2-sample  tests of proportions (for  AE status and for sex and
      race). Allele frequencies were estimated  by dividing the total
      number of  copies of individual alleles by the total number of
      alleles in the sample, and the  frequencies noted in  the 2 studies
      were compared using a 2-sample test of proportions. Deviations
      in the fitness  for Hardy-Weinberg  proportion were evaluated
      using the exact test as described by Wigginton et al. [14].
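
The allele-frequency estimate and the 2-sample comparison described above can be sketched as follows. This is a minimal illustration, not the authors' R code: the function names are hypothetical, the genotype counts are the MTHFR (rs1801133) counts reported in table 3, and the pooled z test is only one common form of a "2-sample test of proportions" (the paper does not say which variant was used).

```python
import math

def allele_frequencies(genotype_counts):
    """Estimate allele frequencies from diploid genotype counts.

    genotype_counts maps a 2-letter genotype (e.g. "CT") to the number
    of subjects carrying it. Returns (frequencies, allele copies, total alleles).
    """
    copies = {}
    for genotype, n_subjects in genotype_counts.items():
        for allele in genotype:  # each genotype contributes two allele copies
            copies[allele] = copies.get(allele, 0) + n_subjects
    total = sum(copies.values())  # equals 2 x number of genotyped subjects
    freqs = {allele: count / total for allele, count in copies.items()}
    return freqs, copies, total

def two_sample_proportion_test(x1, n1, x2, n2):
    """Pooled two-sided z test for equality of two proportions (assumed variant)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value

# MTHFR rs1801133 genotype counts as listed in table 3.
first_study = {"CC": 36, "CT": 39, "TT": 10}
second_study = {"CC": 18, "CT": 21, "TT": 7}

freqs1, copies1, total1 = allele_frequencies(first_study)
freqs2, copies2, total2 = allele_frequencies(second_study)
z_stat, p_val = two_sample_proportion_test(copies1["T"], total1, copies2["T"], total2)

print("T allele frequency, first study:", round(freqs1["T"], 3))
print("T allele frequency, second study:", round(freqs2["T"], 3))
print("z =", round(z_stat, 3), "two-sided P =", round(p_val, 3))
```

The Hardy-Weinberg exact test of Wigginton et al. is not reproduced here; it would be applied to the same genotype counts.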
        We chose a 2-stage design for identifying and replicating
      genetic associations in the independent clinical trials. This
      study design was selected with the goal of minimizing type I
      errors (false-positive results).  For comparison, we  also per-
      formed genetic association analysis in a single pooled sample.
      In the first study, we tested for potential associations between
      each of the 1442 SNPs passing quality control filters and the
      occurrence of AEs, by use of logistic regression. For
      each SNP in the first sample set, we recorded the odds ratio
      (OR)  estimate and P value of the  likelihood ratio test for a
      univariate logistic model. No  correction for multiple com-
      parisons was made in the first sample set, because we reserved
      the second  study sample set for determination  of probable
      true-positive  results. In the second sample set, we tested only
      those SNPs that had an AE-associated P value of ≤.05 in the
      first study. A significant SNP association in the first study was
      considered  to have been  replicated if  it met the following
      criteria in the second study:  (1) an  OR that consistently asso-
      ciated the risk of AE with the same genotypes and (2) a P value
      ≤0.05. To obtain an empirical probability of meeting our
      replication  criteria purely by chance, we generated 1000 sim-
      ulated data sets  from both study sample sets by permuting
      case-control  labels.  An additional association for which
      P = .06 is discussed below because of its high biological plausi-
      bility.
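
The 2-stage replication rule and the permutation check described above might be sketched as follows. Several things here are assumptions made for illustration only: each SNP is coded as a binary carrier indicator (the paper does not state its genotype coding), for which the likelihood ratio test of a univariate logistic model reduces to the G test on a 2 x 2 table; all function names are hypothetical; and the subject counts are invented, chosen only to match the study sizes (85 and 46).

```python
import math
import random

def expand(a, b, c, d):
    """Build per-subject lists from 2 x 2 counts:
    a = carrier with AE, b = carrier without AE,
    c = noncarrier with AE, d = noncarrier without AE."""
    carrier = [1] * (a + b) + [0] * (c + d)
    case = [1] * a + [0] * b + [1] * c + [0] * d
    return carrier, case

def odds_ratio_and_lrt(carrier, case):
    """Odds ratio and likelihood ratio P value (1 df) for a binary predictor."""
    pairs = list(zip(carrier, case))
    a = sum(1 for g, y in pairs if g and y)
    b = sum(1 for g, y in pairs if g and not y)
    c = sum(1 for g, y in pairs if not g and y)
    d = sum(1 for g, y in pairs if not g and not y)
    n = a + b + c + d
    if b * c > 0:
        odds_ratio = (a * d) / (b * c)
    else:
        odds_ratio = math.inf if a * d > 0 else float("nan")
    # G statistic: 2 * sum(observed * ln(observed / expected)); empty cells add 0.
    g = 0.0
    for obs, row, col in ((a, a + b, a + c), (b, a + b, b + d),
                          (c, c + d, a + c), (d, c + d, b + d)):
        if obs > 0:
            g += 2 * obs * math.log(obs * n / (row * col))
    p_value = math.erfc(math.sqrt(max(g, 0.0) / 2))  # chi-square tail, 1 df
    return odds_ratio, p_value

def replicated(study1, study2, alpha=0.05):
    """Replication rule: P <= alpha in both studies and ORs in the same direction."""
    or1, p1 = odds_ratio_and_lrt(*study1)
    or2, p2 = odds_ratio_and_lrt(*study2)
    same_direction = (or1 > 1) == (or2 > 1)
    return p1 <= alpha and p2 <= alpha and same_direction

def chance_replication_rate(study1, study2, n_perm=1000, seed=0):
    """Fraction of label-permuted data sets that meet the replication rule."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        permuted = []
        for carrier, case in (study1, study2):
            labels = list(case)
            rng.shuffle(labels)  # break any genotype-phenotype link
            permuted.append((carrier, labels))
        hits += replicated(*permuted)
    return hits / n_perm

# Invented counts for one hypothetical SNP in each study.
study1 = expand(8, 15, 8, 54)   # 85 subjects
study2 = expand(13, 4, 11, 18)  # 46 subjects
or1, p1 = odds_ratio_and_lrt(*study1)
print("first-study OR:", round(or1, 2), "P:", round(p1, 3))
print("replicated:", replicated(study1, study2))
print("chance replication rate:", chance_replication_rate(study1, study2))
```

Permuting the case-control labels within each study keeps the genotype distribution and the number of AEs fixed, so the resulting rate estimates how often a single SNP would pass both stages when no true association exists.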
        Patterns  of linkage disequilibrium (LD) between replicated
      SNPs on the same chromosome were assessed using Haploview
      software (version 3.32; Broad Institute)  [15].  Haplotypes were
      inferred for SNPs in high LD, by use of the iterative approach
      described by  Lake et al. [16]. The resulting  haplotypes were
      tested for association with AEs by use of univariate logistic mod-
      els. Statistical analyses and simulations were  performed using
      software that  we wrote using the R language  (version 2.5.1; R

-------
Table 1.  Summary of data on adverse event (AE) status and subject age, sex, and race in both studies.

Characteristic           First study (n = 85)   Second study (n = 46)   P (a)
AE status (b)            16/69                  24/22
Age, mean ± SD, years    23.2 ± 3.9             24.2 ± 3.8
Sex (c)                  40/45                  27/19
Race (d)                 84/0/1                 44/1/1

-------
Table 3.  Distribution of genotypes at single-nucleotide polymorphisms (SNPs) in MTHFR, IRF1, and IL4.

                                                     No. (%) of subjects with genotype, by study
Gene, SNP (a)      SNP location (b), bp   Genotype   First (n = 85)   Second (n = 46)
MTHFR, 1801133     6393745                CC         36 (42)          18 (39)
                                          CT         39 (46)          21 (46)
                                          TT         10 (12)          7 (15)
IRF1
  9282763          34237146               AA         39 (46)          17 (37)
                                          AG         43 (51)          24 (52)
                                          GG         3 (4)            5 (11)
  839              34234139               GG         39 (46)          17 (37)
                                          AA         43 (51)          24 (52)
                                          AG         3 (4)            5 (11)
IL4
  2070874          34424723               CC         52 (62)          34 (74)
                                          CT         28 (33)          12 (26)
                                          TT         4 (5)            0 (0)
  2243268          34428976               AA         52 (62)          34 (74)
                                          AC         27 (32)          12 (26)
                                          CC         5 (6)            0 (0)
  2243290          34433182               CC         53 (62)          34 (74)
                                          AA         26 (31)          12 (26)
                                          AC         6 (7)            0 (0)
  a By rs number.
  b As determined according to dbSNP (Human Genome Build 36.1; National
Center for Biotechnology Information).
tion with ≥1 functional variants in that region. Because of the
close physical proximity of the associated variants in the 2 genes,
Haploview software (version 3.32) [15] was used to examine the
patterns of LD among those  variants in each sample. Figure 1
shows that the LD plots for SNPs in the 2 genes follow the same
pattern in each study sample. Although there is strong LD be-
tween the SNPs within the 2 genes, there is little evidence for LD
between the 2  genes, indicating that the associations for  each
gene are statistically separate  signals.
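
The pairwise LD measure used in this comparison, r2, can be illustrated with a short sketch. It assumes phased haplotype counts are already available, whereas Haploview estimates them from unphased genotypes; the function name and the counts are hypothetical and are not taken from the study data.

```python
def r_squared(hap_counts):
    """Squared correlation (r2) between two biallelic SNPs.

    hap_counts maps (allele_at_snp1, allele_at_snp2) to the number of
    chromosomes carrying that 2-SNP haplotype.
    """
    total = sum(hap_counts.values())
    ref1, ref2 = next(iter(hap_counts))  # any haplotype serves as the reference
    p_a = sum(n for (a, _), n in hap_counts.items() if a == ref1) / total
    p_b = sum(n for (_, b), n in hap_counts.items() if b == ref2) / total
    p_ab = hap_counts.get((ref1, ref2), 0) / total
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r2 is undefined when either SNP is monomorphic")
    return d * d / denom

# Two SNPs that always travel together (complete LD), hypothetical counts.
within_gene = {("A", "G"): 60, ("G", "A"): 40}
# Two SNPs whose alleles combine independently (no LD), hypothetical counts.
between_genes = {("A", "C"): 25, ("A", "T"): 25, ("G", "C"): 25, ("G", "T"): 25}

print("within-gene r2:", r_squared(within_gene))
print("between-gene r2:", r_squared(between_genes))
```

A value of r2 near 1 means that one SNP carries almost all the information in the other, which is why SNPs in the same block are not counted as separate signals, whereas a value near 0 between the 2 genes supports treating their associations separately.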
  This region of chromosome 5q31 contains discrete haplotype
blocks [22]. Accordingly, haplotypes  were inferred for AE-
associated SNPs  in  IRF1 (rs839  and rs9282763)  and  IL4
(rs2070874, rs2243268,  and rs2243290). In both studies, 2 IRF1
haplotypes accounted for all subjects. The common IRF1 haplo-
type listed in table 4 was found in 71% of the first sample set and
63% of the second sample set. The rare IRF1 haplotype was sig-
nificantly associated with AEs in both studies (P = .03). Across
both studies, 2 different 3-SNP haplotypes in IL4 were found
among 99% of subjects. The  common IL4 haplotype shown in
table 4 was found in 78% of the first sample set and 87% of the
second sample set. The  rare IL4 haplotype was significantly as-
     sociated with the risk of AEs in the first study (P = .05); the
     association was similar in the second study (P = .06).

     DISCUSSION

     MTHFR and IRF1.   The candidate genes noted  to have the
     strongest association with AEs in both studies include a metab-
     olism gene previously associated with adverse reactions to a va-
     riety of pharmacologic agents (MTHFR) and an immunological
     transcription factor (IRF1) gene. The statistical  results  from
     these studies have strong biological plausibility and are in agree-
     ment with previous work  on the immune response to poxvi-
     ruses.
       An SNP in the 5,10-methylenetetrahydrofolate reductase gene
     (MTHFR; rs1801133) was associated strongly with the risk of AE
     in both studies. This  nonsynonymous SNP in exon 5 causes
     an amino acid  change from alanine to valine, and functional
     characterization of this SNP demonstrated that it is thermola-
     bile and affects both the quantity and activity of the MTHFR
     enzyme [23]. The enzyme catalyzes the conversion  of  5,10-
     methylenetetrahydrofolate to  5-methyltetrahydrofolate, which
     is a cosubstrate  for homocysteine remethylation to methionine.
     MTHFR function provides pools of methyl groups that are cru-
     cial for the control  of DNA synthesis and  repair mechanisms
     [24].  MTHFR is  a key enzyme in  homocysteine metabolism,
     which plays a major role in regulating endothelial function. In
     the future, it may be of interest to examine the association of
     genetic variation in  this gene with the rare cardiac events that
     occur after vaccination.
       Genetic variation  of MTHFR has been associated with a range
     of clinical outcomes, including altered cardiovascular function,
     organ  transplantation, toxicity of immunosuppressive drugs,
     and systemic inflammation [25-28]. Elevated plasma  levels of
     homocysteine  stimulate endothelial inflammatory responses,
     which could contribute to the development of systemic AEs. Al-
     ternatively, because vaccination elicits immune  responses in-
     volving the rapid proliferation of cells, demand for DNA synthe-
     sis metabolites would be elevated, and alterations in the level or
     activity of the MTHFR enzyme may exert significant influence
     over this process.
       Interferon (IFN) regulatory factor-1 (IRF-1).  The  IRF1
     gene is part of the immunological gene cluster on chromosome
     5q31. We found 2 SNPs in IRF1 that were significantly associated
     with AEs in both study samples. The IRF1 gene encodes an im-
     portant member of the IFN regulatory transcription factor (IRF)
     family. The IRF family regulates IFNs and IFN-inducible genes.
     IRF1 activates transcription of type 1 IFN-α and IFN-β, as well
     as genes induced by type 2 IFN-γ [29]. Many viruses target IRFs
     to evade host immune responses by binding to cellular IRFs and
     blocking transcriptional activation of IRF targets [30].
       Polymorphisms in the gene coding for a transcription fac-
     tor with such  far-reaching effects as IRF1  could have  pro-

-------
[Figure 1 image not reproduced: Haploview LD plots, panels A and B, each showing Block 1 (3 kb) and Block 2 (8 kb).]
Figure 1.   Haploview plot of single-nucleotide polymorphisms (SNPs) at chromosome 5q31.1. A, First study. B, Second study. Squares are shaded to
denote the strength of evidence of linkage disequilibrium (LD) between the pairwise markers. Black squares, strong evidence of LD (r2 >0.90); light-gray
squares, weak evidence of LD (r2 <0.10); and white squares, no evidence of LD (r2 <0.0). The same 2 LD blocks are apparent in both studies,
encompassing SNPs in IRF1 (rs839 and rs9282763) or IL4 (rs2070874, rs2243268, and rs2243290).
found effects on the proper immune response to and clear-
ance  of vaccinia virus. Mice deficient in IFN receptors are
especially susceptible to vaccinia virus infection, suggesting
an important role for these molecules in controlling vaccinia
virus infection [31]. Vaccinia virus dedicates several host-modifying
genes to counteracting IFNs. For example, the viral
gene B18R encodes a protein that serves as a viral IFN-α/β-binding
protein that binds IFNs from several species [32].
This protein also can bind to the cell surface  after secretion,
thus preventing host IFN from binding to cellular IFN recep-
tors [33]. Although the SNPs identified in  IRF1 and IL4 do
not change amino  acids in the  encoded proteins, recent evi-
dence suggests  that  synonymous SNPs, such as rs839, can
alter regulation of mRNA or splice junctions [34,35]. It is also
                       plausible that one or both SNPs are in LD with the causal
                       variant not tested in this study.
                         Interleukin-4.   Genetic polymorphisms in this major cyto-
                       kine gene involved in adaptive immunity to viruses also may be
                       associated with AEs, albeit with a P value of .06 in our relatively
                       small replication  study. We found 3 SNPs in IL4 that may be
                       associated with AEs in both studies. There was high intragenic
                       LD (r2 > 0.9) between the tested SNPs within each gene (IRF1
                       and IL4)  and haplotypes inferred separately for  each  of these
                       genes mirrored the significant risk patterns of the SNPs observed
                       individually. Thus, the fact that multiple SNPs that were in high
                       LD were identified in regions of IRF1 and IL4 strongly suggests
                       that there are additional markers in LD, several of which could
                       functionally contribute to the risk of AEs.
Table 4.  Haplotypes inferred for adverse event (AE)-associated single-nucleotide
polymorphisms (SNPs) in IRF1 (rs839 and rs9282763) and IL4 (rs2070874, rs2243268, and
rs2243290).

                 Haplotype         Risk            First study               Second study
Gene, SNP (a)    at baseline (b)   haplotype (c)   OR (95% CI) (d)   P (e)   OR (95% CI) (d)   P (e)
IRF1                                               3.2 (1.0-10.2)    .03     3.0 (1.0-9.0)     .03
  9282763                          G
  839                              A
IL4                                                2.4 (1.0-5.7)     .05     3.8 (1.0-14.4)    .06
  2070874        C                 T
  2243268        A                 C
  2243290        C                 A
  NOTE.  CI, confidence interval; OR, odds ratio.
  a By rs number.
  b Most common haplotype, considering 2 SNPs in IRF1 or 3 SNPs in IL4.
  c Rare (variant) haplotype, considering 2 SNPs in IRF1 or 3 SNPs in IL4.
  d Estimated OR comparing the risk haplotype with the haplotype at baseline (95% CI).
  e By likelihood ratio χ2 test with 1 df.

-------
  The IL4 gene encodes a pleiotropic cytokine produced by a
variety of immune cells, especially activated T cells. IL4 con-
trols humoral immune responses, isotype switching, and sup-
pression of cytotoxic T cell function and expansion. Thus,
genetic polymorphisms related to inappropriate regulation of
IL4 expression and/or activity of IL-4 cytokine could be associated
with overstimulated inflammatory responses leading
to the development of clinical AEs.  Previous studies  of the
role of IL4 in the pathogenesis of poxvirus have shown that
IL4 has a central role in altering the adaptive immune re-
sponse.  Overexpression of IL4 during infection with recom-
binant poxviruses encoding IL4 suppresses the induction of
cytotoxic T cell activity by inhibiting CD8+ T cell proliferation,
which increased the pathogenicity of such recombinant
viruses even in previously immunized animals [36]. IL4 also
plays a role in preventing optimal innate immune responses
to poxviruses. Secretion of IL-4 during vaccinia virus infec-
tion in individuals with atopic dermatitis alters the cytokine
milieu, resulting in blocking of production of the antimicro-
bial peptide LL-37; this accounts, in part, for the increased
risk of vaccinia virus infection in subjects with atopic derma-
titis [37].
  Model of pathogenesis.   Because the  outcome of interest
here was the aggregation of specific AEs, it is logical that > 1 gene
may be involved. The genes with variants  for which we discov-
ered an association with AEs are all potentially involved in path-
ways that are  in line with our previously hypothesized mecha-
nism  of AEs involving  excess  stimulation  of inflammatory
pathways and the imbalance of tissue damage repair pathways.
This model was developed from studies of circulating cytokines
and relevant immunological effector cells [3-5]. For subjects ex-
periencing AEs, vaccination appears to trigger an acute inflam-
matory response that is excessive. Antigen presentation to T cells
in the dermis leads to the release of T cell cytokines that trigger a
cascade of cytokines and chemokines whose release enhances the
inflammatory response by  (1)  promoting the migration of
monocytes into the lesion and their maturation into macro-
phages and (2) further attracting T cells [38, 39]. Taken together,
these findings suggest that systemic AEs occurring after small-
pox vaccination may be consistent with low-grade macrophage
activation syndrome caused by virus replication and vigorous
tissue injury and repair.
  There are limitations to the present study. The numbers of
subjects are too small for a  genetic association study of low-
penetrance, high-frequency alleles. The  association between
IL4 variations and AEs was weaker than that between varia-
tions in  other genes and AEs. Nevertheless, the observation of
the same variants in  2 independent clinical trials, the high
biological plausibility of these associations in light of what is
known about the biological  profile of poxvirus, and the po-
tential public health significance suggest that the findings are
of interest.
        Conclusions and future directions.   These data  provide
      the rare opportunity to  (1) study 2 independent cohorts of
      smallpox vaccine recipients and (2) attempt to identify asso-
      ciations between common genetic variations and the occur-
      rence of AEs after vaccination. Statistical analysis of the re-
      sults of the   first  study revealed potentially  significant
      associations between SNPs in biologically interesting candi-
      date genes. Of the AE-associated genes identified in  the first
      study, 2 were replicated in an independent study, with one
      additional candidate gene having results just beyond  the cut-
      off value used to denote  statistical significance but neverthe-
      less having a high level of biological plausibility. It is possible
      that our findings  could  be due to chance, but we  avoided
      multiple testing issues by testing only the most promising
      results  in the  validation sample. Although all SNPs were
      tested in the first study, only those SNPs that were signifi-
      cantly associated with AEs were tested in the second study,
      and our empirically derived probability of  replication  by
      chance alone was <0.1%. The association of SNPs in 3 genes
      across both studies and the biological plausibility that these
      SNPs were associated with the development of AEs lend cre-
      dence to the reproducibility of these associations.
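
To make the replication argument concrete, the following is a minimal sketch of how a chance-replication probability could be bounded under a simple binomial model. It is not the authors' empirical procedure, which the text does not describe. The number of SNPs carried forward (`m`), the number replicating (`k`), and the per-test false-positive rate (`alpha`) are hypothetical inputs, and the tests are assumed to be independent, which SNPs in linkage disequilibrium would violate.

```python
from math import comb


def prob_at_least_k_replicate(m: int, k: int, alpha: float) -> float:
    """P(at least k of m independent null tests are significant at level alpha)."""
    return sum(
        comb(m, i) * alpha**i * (1.0 - alpha) ** (m - i)
        for i in range(k, m + 1)
    )


# Hypothetical example (values NOT taken from the study): 10 SNPs carried
# into the validation sample, 3 of them replicating at a nominal 5% level.
p_chance = prob_at_least_k_replicate(m=10, k=3, alpha=0.05)
print(f"Probability of >=3 chance replications: {p_chance:.4f}")
```

Restricting the second study to the SNPs that were significant in the first keeps `m` small, which is what keeps this tail probability low.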
        As with any statistical association, follow-up studies in ad-
      ditional populations are  needed to identify the particular ge-
      netic susceptibility variants and examine the functional con-
      sequences of polymorphisms in the AE-associated genes. The
      polymorphisms that were identified show consistently high
      heterozygosity across Hispanic, African, African-American,
      Asian,  and white population samples [40].  Therefore,  al-
      though  future population samples may reveal population-
      specific differences in allele frequencies that require analytical
      consideration, the variability in these SNPs makes them rea-
      sonable candidates  for association studies  in more racially
      diverse populations. Because we found multiple AE-
      associated SNPs in regions of IRF1 and IL4, focused studies
      should be undertaken to characterize the genetic variability in
      these candidate regions. Indeed, haplotypes in IRF1 and IL4
      displayed altered susceptibility to a specific systemic AE (i.e.,
      fever) after smallpox vaccination [41]. Although the  associa-
      tion between AEs and a nonsynonymous polymorphism in
      the gene for MTHFR points toward functional significance of
      this SNP,  fine mapping  of  this  locus  should  determine
      whether this  is the case.  For all  3 candidate genes, both
      follow-up replication  and functional  studies  are needed to
      establish the plausibility of the association of common  ge-
      netic polymorphisms with the hypothesized etiological path-
      ways.

      Acknowledgments
        We thank Jennifer Hicks, Karen Adkins (Vanderbilt Pediatric Clinical
      Research Office, Vanderbilt University Medical Center, Nashville, Tennessee),
      and the staff at the Vanderbilt General Clinical Research Center
      (Vanderbilt University Medical Center), for nursing support.

References
 1.  Kemper AR, Davis MM, Freed GL. Expected adverse events in a mass
    smallpox vaccination campaign. Eff Clin Pract 2002; 5:84-90.
 2.  Seet BT, Johnston JB, Brunetti CR, et al. Poxviruses and immune eva-
    sion. Annu Rev Immunol 2003; 21:377-423.
 3.  McKinney BA, Reif DM, Rock MT, et al. Cytokine expression patterns
    associated with systemic adverse events following smallpox immuniza-
    tion. J Infect Dis 2006; 194:444-53.
 4.  Rock MT, Yoder SM, Talbot TR, Edwards KM, Crowe JE Jr. Adverse
    events after smallpox immunizations are associated with alterations in
    systemic cytokine levels. J Infect Dis 2004; 189:1401-10.
 5.  Rock MT, Yoder SM, Wright PF, Talbot TR, Edwards KM, Crowe JE Jr.
    Differential regulation of granzyme and perforin in effector and mem-
    ory T cells following smallpox immunization. J Immunol 2005; 174:
    3757-64.
 6.  Rock MT, Yoder SM, Talbot TR, Edwards KM, Crowe JE Jr. Cellular
    immune responses to diluted and undiluted Aventis Pasteur smallpox
    vaccine. J Infect Dis 2006; 194:435-43.
 7.  Shaklee JF, Talbot TR, Muldowney JA III, et al. Smallpox vaccination
    does not elevate systemic levels of prothrombotic proteins associated
    with ischemic cardiac events. J Infect Dis 2005; 191:724-30.
 8.  Talbot TR, Stapleton JT, Brady RC, et al. Vaccination success rate and
    reaction profile with diluted and undiluted smallpox vaccine: a random-
    ized controlled trial. JAMA 2004; 292:1205-12.
 9.  Hirschhorn JN, Lohmueller K, Byrne E, Hirschhorn K. A comprehensive
    review of genetic association studies. Genet Med 2002; 4:45-61.
10.  Lohmueller KE, Pearce CL,  Pike M, Lander ES, Hirschhorn JN. Meta-
    analysis of genetic association  studies supports a contribution of com-
    mon variants to susceptibility to common disease. Nat Genet 2003; 33:
    177-82.
11.  Talbot TR, Bredenberg HK, Smith M, LaFleur BJ, Boyd A, Edwards KM.
    Focal and generalized folliculitis following smallpox vaccination among
    vaccinia-naive recipients. JAMA 2003; 289:3290-4.
12.  Garcia-Closas M, Malats N, Real FX, et al. Large-scale evaluation of
    candidate genes identifies associations between VEGF polymorphisms
    and bladder cancer risk. PLoS Genet 2007;  3:e29.
13.  Packer BR, Yeager M, Burdett L, et al. SNP500Cancer: a public resource for
    sequence validation, assay development, and frequency analysis for genetic
    variation in candidate genes. Nucleic Acids Res 2006; 34:D617-21.
14.  Wigginton JE, Cutler DJ, Abecasis GR. A note on exact tests of Hardy-
    Weinberg equilibrium. Am J Hum Genet 2005; 76:887-93.
15.  Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualiza-
    tion of LD and haplotype maps. Bioinformatics 2005; 21:263-5.
16.  Lake SL, Lyon H, Tantisira K, et al. Estimation and tests of haplotype-
    environment interaction when linkage phase is ambiguous. Hum Hered
    2003; 55:56-65.
17.  Ihaka R, Gentleman R. R: a language for data analysis and graphics.
    Journal of Computational and Graphical Statistics 1996; 5:299-314.
18.  R Development Core Team. R: a language and environment for statisti-
    cal computing. R Foundation for Statistical Computing. Available at:
    http://www.R-project.org. Accessed 31 August 2007.
19.  Howell MD, Gallo RL, Boguniewicz M, et al. Cytokine milieu of atopic
    dermatitis skin subverts the innate immune response to vaccinia virus.
    Immunity 2006; 24:341-8.
20.  Jackson RJ, Ramsay AJ, Christensen CD, Beaton S, Hall DF, Ramshaw
    IA. Expression of mouse interleukin-4 by a recombinant ectromelia vi-
    rus suppresses cytolytic lymphocyte responses and overcomes genetic
    resistance to mousepox. J Virol 2001; 75:1205-10.
21.  Kerr PJ, Perkins HD, Inglis B, et al. Expression of rabbit IL-4 by recom-
    binant myxoma viruses enhances virulence and overcomes genetic re-
    sistance to myxomatosis. Virology 2004; 324:117-28.
      22. Daly MJ, Rioux JD, Schaffner SF, Hudson TJ, Lander ES. High-
          resolution haplotype structure in the human genome. Nat Genet 2001;
          29:229-32.
      23. Martin YN, Salavaggione OE, Eckloff BW, Wieben ED, Schaid DJ,
          Weinshilboum RM. Human methylenetetrahydrofolate reductase phar-
          macogenomics: gene resequencing and functional genomics. Pharma-
          cogenet Genomics 2006; 16:265-77.
      24. Friso S, Girelli D, Trabetti E, et al. The MTHFR 1298A>C polymor-
          phism and genomic DNA methylation in human lymphocytes. Cancer
          Epidemiol Biomarkers Prev 2005; 14:938-43.
      25. Dedoussis GV, Panagiotakos DB, Pitsavos C, et al. An association be-
          tween the methylenetetrahydrofolate reductase (MTHFR) C677T mu-
          tation and inflammation markers related to cardiovascular disease. Int
          J Cardiol 2005; 100:409-14.
      26. Lim U, Peng K, Shane B, et al. Polymorphisms in cytoplasmic serine
          hydroxymethyltransferase and methylenetetrahydrofolate reductase
          affect the  risk of cardiovascular disease in men. J Nutr 2005; 135:
          1989-94.
      27. Murphy N, Diviney M, Szer J, et al. Donor methylenetetrahydrofolate
          reductase genotype is associated with graft-versus-host disease in hema-
          topoietic stem cell transplant patients treated with methotrexate. Bone
          Marrow Transplant 2006; 37:773-9.
      28. Urano W, Taniguchi A, Yamanaka H, et al. Polymorphisms in the meth-
          ylenetetrahydrofolate reductase gene were associated with both the effi-
          cacy and the toxicity of methotrexate used for the treatment of rheuma-
          toid arthritis, as evidenced by single locus and haplotype analyses.
          Pharmacogenetics 2002; 12:183-90.
      29. Harada H, Fujita T, Miyamoto M, et al. Structurally similar but func-
          tionally distinct factors, IRF-1 and IRF-2, bind to the same regulatory
          elements of IFN and IFN-inducible genes. Cell 1989; 58:729-39.
      30. Goodbourn  S, Didcock L, Randall RE. Interferons: cell signalling, im-
          mune modulation, antiviral response and virus countermeasures. J Gen
          Virol 2000; 81:2341-64.
      31. van den Broek MF, Müller U, Huang S, Aguet M, Zinkernagel RM.
          Antiviral defense in mice lacking both alpha/beta and gamma interferon
          receptors. J Virol 1995; 69:4792-6.
      32. Symons JA, Alcami A, Smith GL. Vaccinia virus encodes a soluble type I
          interferon receptor of novel structure and broad species specificity. Cell
          1995; 81:551-60.
      33. Alcami A, Symons JA, Smith GL. The vaccinia virus soluble alpha/beta
          interferon (IFN) receptor binds to the cell surface and protects cells from
          the antiviral effects of IFN. J Virol 2000; 74:11230-9.
      34. Crawford DC, Nickerson DA. Definition and clinical importance of
          haplotypes. Annu Rev Med 2005; 56:303-20.
      35. Duan J, Wainwright MS, Comeron JM, et al. Synonymous mutations in
          the human dopamine receptor D2 (DRD2) affect mRNA stability and
          synthesis of the receptor. Hum Mol Genet 2003; 12:205-16.
      36. Jackson RJ, Ramsay AJ, Christensen CD, Beaton S, Hall DF, Ramshaw
          IA. Expression of mouse interleukin-4 by a recombinant ectromelia vi-
          rus suppresses cytolytic lymphocyte responses and overcomes genetic
          resistance to mousepox. J Virol 2001; 75:1205-10.
      37. Howell MD, Gallo RL, Boguniewicz M, et al. Cytokine milieu of atopic
          dermatitis skin subverts the innate immune response to vaccinia virus.
          Immunity 2006; 24:341-8.
      38. Fong TA, Mosmann TR. The role of IFN-gamma in delayed-type
          hypersensitivity mediated by Th1 clones. J Immunol 1989; 143:
          2887-93.
      39. Grom AA, Passo M. Macrophage activation syndrome in systemic juve-
          nile rheumatoid arthritis. J Pediatr 1996; 129:630-2.
      40. Sherry ST, Ward MH, Kholodov M, et al. dbSNP: the NCBI database of
          genetic variation. Nucleic Acids Res 2001; 29:308-11.
      41. Stanley SL Jr, Frey SE, Taillon-Miller P, et al. Immunogenetics of small-
          pox vaccination. J Infect Dis 2007; 196:212-9.
22 • JID 2008:198 (1 July) • Reif et al.

-------
Hepatol Int (2008) 2:39-49
DOI 10.1007/s12072-007-9025-2
Genome-wide transcriptome  expression in the liver of a  mouse
model  of high  carbohydrate diet-induced liver  steatosis and its
significance for the disease

Ion V. Deaciuc • Zhenyuan Song • Xuejun Peng • Shirish S. Barve •
Ming Song • Qiang He • Thomas B. Knudsen • Amar V. Singh •
Craig J. McClain
Received: 11 April 2007/Accepted: 8 August 2007/Published online: 27 November 2007
© Asian Pacific Association for the Study of the Liver 2007
Abstract
Purpose   To perform a large-scale gene profiling of the
liver in a  mouse  model of fatty liver induced by  high
carbohydrate (sucrose) diet (HCD) to gain a deeper insight
into potential mechanisms of diet-induced hepatic steatosis.
Methods  C57BL/6 male mice were fed either a purified,
control diet or a HCD for  16 weeks. HCD feeding led to
marked liver steatosis without inflammation or necrosis.
The expression of 42,500 genes/sequences was assessed.
Results   A number of genes (471) underwent significant
expression changes in HCD- as compared to standard diet-
fed mice (n = 5/group; P < 0.01). Of these genes, 211 were
down- and 260 up-regulated. The latter group includes 20
genes encoding enzymes involved in carbohydrate conver-
sion to fat. The genes that underwent expression changes
perform a large variety of molecular functions, and the vast
majority of these have  never been tested before in non-
alcoholic fatty liver of nutritional origin. They reveal novel
aspects of the disease and  allow identification of candidate
genes that may underlie the initiation of hepatic steatosis and
progression to non-alcoholic steatohepatitis.
Conclusions  HCD-fed laboratory animals provide a
model of early non-alcoholic fatty liver disease resembling
the disease in humans. The genome-wide gene profiling of
the liver reveals the complexity of the disease, unravels
novel aspects of HCD-induced hepatic steatosis, and helps
elucidate its nature and mechanisms.

Keywords  Fatty liver • Gene profiling •
High carbohydrate diet • Mouse

Abbreviations
ALT      Alanine-2-oxoglutarate aminotransferase (EC 2.6.1.2)
HCD      High-carbohydrate diet
NAFLD    Nonalcoholic fatty liver disease
NASH     Nonalcoholic steatohepatitis
TBARS    Thiobarbituric acid reactive substances

Electronic supplementary material  Supplementary material is
available for this article at
http://dx.doi.org/10.1007/s12072-007-9025-2 and is accessible
for authorized users.

Ion V. Deaciuc and Zhenyuan Song contributed equally to this
study.

I. V. Deaciuc (✉) • Z. Song • S. S. Barve • M. Song •
C. J. McClain
Division of Gastroenterology/Hepatology, Department
of Medicine, University of Louisville School of Medicine,
550 S. Jackson Street, ACB Bldg., Third Floor, Louisville,
KY 40202, USA
e-mail: ion.deaciuc@louisville.edu

Z. Song
e-mail: z0song02@louisville.edu

S. S. Barve
e-mail: shirish.barve@louisville.edu

M. Song
e-mail: m0song03@louisville.edu

C. J. McClain
e-mail: craig.mcclain@louisville.edu

X. Peng
Biometrics and Data Management Department, Takeda Global
Research and Development Center, Inc., Lincolnshire, IL 60069,
USA
e-mail: xpeng@tgrd.com

S. S. Barve • C. J. McClain
Department of Pharmacology and Toxicology, University of
Louisville School of Medicine, Louisville, KY 40202, USA

Q. He
Department of Biochemistry and Molecular Biology, University
of Louisville School of Medicine, Louisville, KY 40202, USA
e-mail: q0he0002@louisville.edu

T. B. Knudsen • A. V. Singh
Department of Molecular, Cellular and Craniofacial Biology,
University of Louisville, Louisville, KY 40202, USA

T. B. Knudsen
e-mail: thomas.knudsen@louisville.edu

A. V. Singh
e-mail: avsing01@louisville.edu

C. J. McClain
Louisville Veterans Administration Medical Center, Louisville,
KY 40202, USA

Introduction

Nonalcoholic fatty liver disease (NAFLD) is a spectrum of
pathology ranging from simple steatosis to nonalcoholic
steatohepatitis (NASH), and in some instances progressing
to cirrhosis and even hepatocellular carcinoma [1]. NAFLD
is by far the most frequent cause of abnormal liver enzymes
in the United States. Therefore, there is great interest in the
potential pathogenesis, prevention, and/or treatment of this
disease. Multiple factors have been considered and iden-
tified as causes  of hepatic  steatosis, and they  can be
classified in two major groups: exogenous and endogenous.
Among exogenous factors, hepatotoxic drugs, hepatitis C
infection, malnutrition,  and, perhaps  the most  frequently
encountered factor, composition and amount of food, have
been  confirmed  as  causes of  nonalcoholic  steatosis.
Because  of its high prevalence, the disease has become a
focus of intensive research and, although  some 1,000
studies dealing with the disease have been published in the
last 25 years,  the mechanisms  underlying  its occurrence
and progression to NASH are not fully elucidated.
  Experimental  models based  on voluntary food intake
have been  developed to induce hepatic steatosis in labo-
ratory  animals   [2-8].  The  nutritional  models  using
complete diets resemble the human condition in that they
contain amounts of lipids or carbohydrates that exceed the
energy needs  of  the  body.  As a consequence,  the body
processes  and  deposits the excessive  nutrients  as fat
regardless of their original chemical nature. No matter the
source, the body deposits the fat preferentially in subcu-
taneous and visceral  areas and, to a  lesser extent, in the
liver. The biochemical pathways involved in fat processing
and deposition differ as a function of the origin of fat.  The
deposition  of  the excessive dietary fat  involves  several
biochemical events, including digestion, reconstitution in the
intestinal epithelium, assembling,  transport,  and deposition
[9]. However, excessive amounts of ingested carbohydrates,
mainly hexoses, can be stored only as glycogen and, in sig-
nificant amounts, only in skeletal muscle and the liver. The
glycogen content of these organs at the saturating levels is
around 5-6% of their mass. Therefore, after such levels are
attained, the excessive carbohydrates cannot be stored further
as glycogen; instead the body converts them into fat. The
metabolic pathways accomplishing carbohydrate conversion
into fat are well known and they were well characterized in
earlier studies [10,  11].
   During the last 15 years  it has become increasingly
apparent that macronutrients such as long-chain fatty  acids
and glucose act as signaling molecules leading to changes
in gene expression. Therefore, gene profiling of organs as
affected by macronutrients may provide important infor-
mation on the mechanisms underlying disturbances such as
liver steatosis, overweight,  obesity, insulin resistance, and
others. To our knowledge, no comprehensive,  genome-
wide gene profiling of hepatic steatosis induced by a high-
carbohydrate diet  (HCD),  without the complications of
steatohepatitis,  has been  reported  in either  animals or
humans. Therefore, this study was undertaken to  (i) gain a
deeper insight into the biochemical and cell physiological
mechanisms associated with  HCD-induced liver  steatosis
and (ii) identify potential  "hidden"  genes/pathways that
may contribute to the progression of liver steatosis to
NASH.
   The model of HCD-induced liver steatosis used in this
study consists of long-term (16 weeks) feeding of an  HCD
to mice, and resembles  the  disease  in humans. Human
clinical investigations have demonstrated that a diet low in
fat and rich in carbohydrates (closely resembling the  HCD
used in our mouse study), even when administered for short
periods of time, for example,  5 or 25 days, can lead to
occurrence of uncomplicated fatty liver [12-15]. Thus, this
mouse model is a highly relevant means of investigating
mechanisms of hepatic steatosis.
   Comprehensive gene profiling of the liver was per-
formed using DNA microarray technology, which
allows simultaneous assessment of the expression of
42,500 genes/sequences.  A number of genes  that under-
went significant changes were classified according to their
function  and  selected  genes  were  analyzed from the
viewpoint of their potential participation in various cellular
processes related to nonalcoholic hepatic steatosis.
Materials and methods

Animals and their treatment

The animals were treated in accordance with the Guide for
the Care  and  Use  of Laboratory Animals (National
Research Council, USA, 1996) as approved by the Insti-
tutional Animal Care and Use Committee of the University
of Louisville (Louisville, KY). Male C57BL/6 mice (Har-
lan,  Indianapolis,  IN),  weighing  23.5 ± 0.8 g  were
maintained under  standard conditions  for  7 days  before
initiation of study diets. Thereafter, the mice were divided
into two groups of 10 individuals each and started on two
different diets: a high-carbohydrate diet (HCD) and a puri-
fied, control diet named herein standard diet (SD; both
from Harlan Teklad, Madison, WI). The composition of the
HCD was identical to that used by Feldstein et al. [8], that
is (in g kg⁻¹): 650 sucrose, 200 casein, 50 corn oil, 40
mineral mixture (AIN-93G-MX), 10 vitamin mixture
(AIN-93-VX), 2.5 choline bitartrate, 3.0 DL-methionine,
and 10 cellulose.
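
As a quick check on the diet composition listed above, the short sketch below tabulates the stated components and expresses each as a percentage of diet mass. The dictionary keys and variable names are illustrative only. Only the quantities come from the text, and they are assumed to be grams per kilogram of diet.

```python
# HCD components as listed in the text, in g per kg of diet.
hcd_g_per_kg = {
    "sucrose": 650.0,
    "casein": 200.0,
    "corn oil": 50.0,
    "mineral mixture (AIN-93G-MX)": 40.0,
    "vitamin mixture (AIN-93-VX)": 10.0,
    "choline bitartrate": 2.5,
    "DL-methionine": 3.0,
    "cellulose": 10.0,
}

# Sum of the listed components (g per kg of diet).
listed_total = sum(hcd_g_per_kg.values())

# Percentage of total diet mass contributed by each component.
percent_of_diet = {name: 100.0 * grams / 1000.0 for name, grams in hcd_g_per_kg.items()}
sucrose_pct_of_diet = percent_of_diet["sucrose"]

print(f"Listed components: {listed_total} g/kg")
print(f"Sucrose: {sucrose_pct_of_diet}% of diet mass")
```

The listed components add up to slightly less than 1,000 g kg⁻¹. The text does not say what makes up the remainder, so the sketch leaves it unassigned.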
Animal killing and tissue sampling

After 16 weeks of feeding HCD or SD, the mice were
fasted from 2200 to 0800 h, anesthetized with urethane
(100 mg kg⁻¹ body weight, intraperitoneally), and the
abdominal cavity was opened. Blood (0.5-0.7 mL) was drawn
from the inferior vena  cava with citrate-containing syrin-
ges,  immediately  centrifuged,  and  the  plasma  was
collected. The liver was perfused through the portal vein
with 3-5 mL of ice-chilled phosphate-buffered saline
(pH 7.4), with the inferior vena cava severed to remove the
blood. The  left one-third of the left lobe was immersed in
formalin  while the rest was placed immediately in liquid
nitrogen.
Liver histology

Liver sections were stained with hematoxylin-eosin and
analyzed for the presence of fat, polymorphonuclear infil-
tration, and necrotic areas.
Biochemical assays

The following assays were performed using commercial
kits. In plasma: glucose, triacylglycerols (TAG), free fatty
acids, alanine-2-oxoglutarate aminotransferase (ALT)
(Infinity, Thermo Electron Corp., Melbourne, Australia),
adiponectin and TNF-α (R&D Systems, Minneapolis, MN),
and insulin (Crystal Chem. Inc., Downers Grove, IL). In
the liver: TBARS according to Quintanilha et al. [16] and
TAG as above. Total RNA was extracted from the liver
using a kit (Ambion, Austin, TX, Cat. No. 1924) and
purified with Qiagen minicolumns (Qiagen, Valencia, CA;
Cat. No. 74104). RNA quality was assessed using an Agilent
2100 bioanalyzer and reagents supplied by the manufac-
turer (Agilent Technologies, Inc., Palo Alto, CA). About
10 µg of total RNA was processed for mRNA expression
using the Affymetrix GeneChip MGU 430 2.0 Array
(Affymetrix, Santa Clara, CA) and Affymetrix technology.
This chip allows testing 42,500 transcripts for their
expression.
Quantitative real-time PCR

The amount of mRNA for 10 randomly selected genes was
measured by quantitative real-time (RT) PCR for five mice
in each group.  The Taqman Gold RT-PCR kit  (Applied
Biosystems, Inc., Foster City, CA) was used for all of the
reactions and the manufacturer's  protocol for  GAPDH
control was followed. For each mouse, 2 µg of total RNA
and random hexamer primers were used in the initial
reverse transcription reaction. Each gene was detected by a
prevalidated Taqman Gene Expression Assay probe set that
was labeled with 6FAM, and the amplification step was
done in triplicate for each gene in three variants: (i) template
and reverse transcriptase present, (ii) no template present,
and (iii) no reverse transcriptase present. The PCR ampli-
fication was analyzed by an iCycler iQ Real-Time
Detection System (BioRad Laboratories, Inc., Hercules,
CA) and the resulting expression ratios were calculated by
the $2^{-\Delta\Delta C_t}$ method as described in the Technical Bulletin of
Gene Expression (Applied Biosystems, 2002).
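
The comparative threshold-cycle calculation named above can be sketched as follows. This is a minimal illustration, not the authors' analysis script. It assumes GAPDH serves as the reference gene, that amplification efficiency is close to 100% (so each cycle doubles the product), and that the Ct values passed in are means of the triplicate wells. The function name and the example Ct values are hypothetical.

```python
def relative_expression(ct_target_treated: float, ct_ref_treated: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Return the expression ratio (treated / control) as 2^-(ddCt)."""
    # Normalize the target gene to the reference gene within each group.
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the normalized values between groups.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: target gene in HCD-fed vs SD-fed liver,
# each normalized to GAPDH.
ratio = relative_expression(
    ct_target_treated=22.0, ct_ref_treated=18.0,
    ct_target_control=24.0, ct_ref_control=18.0,
)
print(ratio)  # 4.0: the target crosses threshold two cycles earlier in the treated group
```

A ratio above 1 indicates higher expression in the treated (HCD) group and a ratio below 1 indicates lower expression.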
Assay of protein abundance

The  total  liver protein extraction,  gel electrophoresis,
immunoblotting and band visualization  were performed
using reagents  and technology provided by Santa Cruz
Biotechnology,  Inc. (Santa  Cruz,  CA).  The  following
antibodies were used: GCK (H-88)—sc7908 (Santa Cruz),
EGFR  (Cell Signaling Technology, Inc., Cat. No.  2232,
Danvers, MA), and cytochrome P450 reductase (Abcam,
Cat. No. ab13513, Cambridge, MA).
Gene data processing and statistics

Gene data were analyzed with the Affymetrix Microarray
Suite 5.0 algorithm to generate signal value and detection
label. Only  genes  that generated 5 present calls  in  each
group (n = 5 in each group, in a one-chip, one-animal
design) were taken into consideration for further statistics
and classification.  Also, a false discovery rate was set at
10% and calculated for all probe  sets. The genes whose
expression was changed 1.5-fold or greater in either
direction (up or down) in the HCD-fed mice as compared
to the SD-fed mice were given priority in ascribing a
potential significance. Finally, the comparison between SD-
and HCD-fed mice was made on the basis of a P-value of
0.01. A detailed presentation of the gene statistics proce-
dure used in this study was given in an earlier publication
from our laboratory [17].

Fig. 1 Liver sections of standard diet- (a) and high
carbohydrate diet- (b) fed mice. Note the normal appearance of
the liver in a and fat accumulation in b. Note that fat
accumulation in the hepatocyte has the shape of macrovacuoles
(arrows). Magnification: 200×
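
The selection criteria described under "Gene data processing and statistics" (present calls on all five chips in each group, a P-value threshold of 0.01, and a 1.5-fold change in either direction) can be sketched as a simple filter. This is an illustrative reconstruction under stated assumptions, not the procedure of reference [17]. The `ProbeSet` record, its field names, and the example values are hypothetical, the P-values are taken as already computed, and the 10% false discovery rate step is not modeled.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Tuple


@dataclass
class ProbeSet:
    probe_id: str
    sd_signals: List[float]    # signal values, one per SD-fed mouse
    hcd_signals: List[float]   # signal values, one per HCD-fed mouse
    sd_present_calls: int      # number of "present" detection calls in the SD group
    hcd_present_calls: int     # number of "present" detection calls in the HCD group
    p_value: float             # precomputed SD vs HCD comparison


def signed_fold_change(probe: ProbeSet) -> float:
    """Positive for up-regulation in HCD, negative for down-regulation."""
    sd_mean = mean(probe.sd_signals)
    hcd_mean = mean(probe.hcd_signals)
    if hcd_mean >= sd_mean:
        return hcd_mean / sd_mean
    return -(sd_mean / hcd_mean)


def select_changed(probes: List[ProbeSet], n_per_group: int = 5,
                   min_fold: float = 1.5, alpha: float = 0.01) -> List[Tuple[str, float]]:
    selected = []
    for probe in probes:
        # Require a present call on every chip in both groups.
        if probe.sd_present_calls < n_per_group or probe.hcd_present_calls < n_per_group:
            continue
        if probe.p_value >= alpha:
            continue
        fold = signed_fold_change(probe)
        if abs(fold) >= min_fold:
            selected.append((probe.probe_id, fold))
    return selected


# Hypothetical probe sets for illustration only.
probes = [
    ProbeSet("A", [100.0] * 5, [200.0] * 5, 5, 5, 0.001),  # up 2-fold: kept
    ProbeSet("B", [300.0] * 5, [100.0] * 5, 5, 5, 0.005),  # down 3-fold: kept
    ProbeSet("C", [100.0] * 5, [400.0] * 5, 4, 5, 0.001),  # a missing present call: dropped
    ProbeSet("D", [100.0] * 5, [120.0] * 5, 5, 5, 0.001),  # below 1.5-fold: dropped
    ProbeSet("E", [100.0] * 5, [300.0] * 5, 5, 5, 0.050),  # P >= 0.01: dropped
]
print(select_changed(probes))
```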
Results

Body weight

At the end of the feeding period, the body weight of the
HCD-fed mice was 29% greater than that of the  SD-fed
mice  (P < 0.05).  SD-fed mice gained 28% body weight
over the initial time point while the HCD-fed mice gained
65% (P < 0.05; Table 1).
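
The percentages quoted above follow from the group means reported in Table 1 and the initial body weight given in its footnote. The sketch below is a small arithmetic check using only those means; the variable and function names are illustrative.

```python
# Group means in grams, as reported in Table 1 and its footnote.
initial_weight = 23.5
sd_final_weight = 30.1
hcd_final_weight = 38.8


def percent_increase(new: float, old: float) -> float:
    """Percentage by which `new` exceeds `old`."""
    return 100.0 * (new - old) / old


hcd_vs_sd = percent_increase(hcd_final_weight, sd_final_weight)
sd_gain = percent_increase(sd_final_weight, initial_weight)
hcd_gain = percent_increase(hcd_final_weight, initial_weight)

print(f"HCD vs SD at end of feeding: {hcd_vs_sd:.0f}%")
print(f"SD gain over initial weight: {sd_gain:.0f}%")
print(f"HCD gain over initial weight: {hcd_gain:.0f}%")
```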
Blood biochemistry

The data in Table 1 show that, at the time of killing, the
HCD-fed animals had increased levels of glucose, choles-
terol, and insulin in plasma. Free fatty acids in plasma were
significantly lower than in control, SD-fed mice. No sig-
nificant changes were observed in plasma adiponectin,
TNF-α, TAG, and ALT levels.
Liver histology

The histological appearance of the livers was normal in
SD-fed mice, while in HCD-fed mice the liver displayed fat
infiltration with no inflammation or necrosis. The fat
infiltration score was estimated to be 60% (3+, according to
the method of Jarvelainen et al. [18]) (Fig. 1a and b).
Liver biochemistry

TAG content in the liver of HCD-fed mice was signifi-
cantly higher (P < 0.001) than in the livers of SD-fed
mice. TBARS were also increased (P < 0.01; Table 1).
  All these changes demonstrate that the HCD used in
these experiments induced typical hepatic steatosis and, for
the feeding period employed, the liver did not display
markers of hepatitis (inflammation, necrosis, and others).
Some of these data resemble, in part, the results reported by
Feldstein et al. using HCD feeding [8]. Whether feeding
this diet for longer periods of time may lead to NASH
remains to be established.

Table 1  Body weight and biochemical parameters of the plasma and liver
in SD- and HCD-fed mice at the end of the feeding period

Parameter/marker                        Standard diet     HCD               P
                                        (mean ± SEM*)     (mean ± SEM*)
Body weight (g)                         30.1 ± 0.3**      38.8 ± 1.2**      <0.05
Glucose (mM)                            9.3 ± 0.25        14.0 ± 0.75       <0.001
ALT (mU mL⁻¹)                           28.3 ± 3.0        31.2 ± 2.9        NS
Free fatty acids (plasma, mEq dL⁻¹)     0.50 ± 0.02       0.33 ± 0.021      <0.001
Cholesterol (plasma, mg dL⁻¹)           36.4 ± 0.86       67.4 ± 2.6        <0.001
Insulin (pg mL⁻¹)                       582 ± 54          2,037 ± 360       <0.001
Triacylglycerols (liver, mg g⁻¹)        60.2 ± 6.7        159.4 ± 14.6      <0.001
Triacylglycerols (plasma, mg dL⁻¹)      101.0 ± 9.8       92.9 ± 8.3        NS
TBARS (nmol g⁻¹ wet weight)             90.4 ± 14.5       170.9 ± 16.8      <0.010
TNF-α (pg mL⁻¹)                         40.3 ± 13.4       37.6 ± 18.7       NS
Adiponectin (ng mL⁻¹)                   23.7 ± 0.48       23.2 ± 2.0        NS

* Means ± SEM were calculated for 5 animals in each group
** The initial body weight was 23.5 ± 0.8 g (n = 10)
Liver genomics

A numerical account of the genomics data obtained in our
experiments is given in the self-explanatory diagram  of
Fig. 2. Selected  genes that underwent a change in expres-
sion of 1.5-fold or more in either direction are presented in
Tables 2 and  3. The  genes  in  Table 2 were  classified
according to Bulera et al. [19], with slight modifications
[20].  Also  a group of genes was selected and tabulated,
comprising  several glutathione S-transferases, because  of
their  potential participation  in the progression  of liver
steatosis to NASH (Table 4). Two tables (I and II) are
presented in Microsoft Excel as supplementary material.
Table I presents all genes changed by HCD feeding, while
Table II contains an expanded listing of genes involved in
carbohydrate and fat metabolism processed using the
DAVID Functional Annotation Chart for gene ontology
(GO) and KEGG.
  The  QRT-PCR data  (Table 5)  confirmed changes  in
gene expression  (for 10  genes in each experimental group)
obtained with microarray technology. Only minor quanti-
tative differences were observed between the two methods.
Such  differences are routinely observed in many studies.
Protein abundance

The gel images in Fig. 3 show that three proteins randomly
selected to be tested for their abundance, namely glucokinase
(BC011139.1), cytochrome-P450 oxidoreductase
(NM_008898.1), and Emr4 (AY032690.1; EGF-like mod-
ule containing mucin-like, hormone receptor-like sequence
4), changed in the same direction as their transcripts. A
fourth protein, MCP-1 (AF128196.1), tested using both the
Western blot method and ELISA, could not be detected.
This cytokine, however, was not assayed in plasma.
Interpretation of genomic data is mainly based on changes
in transcriptome expression rather than in protein abun-
dance. It is generally assumed that changes in protein
abundance parallel changes in gene expression. However,
if a gene or a set of genes is studied closely, the assess-
ment of their protein products must be performed before
functional significance is ascribed to expression changes.

Fig. 2 Flow chart of numeric distribution of genes and sequences
detected in the liver. We propose that the number of genes in the dark
ovals should be taken into consideration for future analysis of the
genomics in the experimental model used in this study
[Flow chart labels: 42,449 unique GenBank IDs on the chip; 260 genes
up-regulated in HCD; 211 genes down-regulated in HCD]
Discussion

In this study, 471 genes (Fig. 2) were identified whose
expression was changed in the livers of HCD-fed animals.
These genes include those that encode the approximately 20
enzymes involved in glucose and fructose metabolism and their
conversion to fat [10, 12, 15] (Table 3 and Table I of
supplementary material, and Fig. 4). The large number of genes
that underwent changes in expression, taken together with the
direction of change and the functional diversity of the groups
to which they belong, demonstrates that fat accumulation in the
hepatocyte in HCD-induced liver steatosis is associated with
alterations of a much wider spectrum of biochemical and
molecular processes than expected on the basis of the data
available thus far. Such a conclusion could have emerged only
from large-scale gene profiling data, and it supports the
usefulness of this tool in unraveling multifaceted mechanisms of
disease.
   Importantly, a large number of genes involved in carbohydrate
conversion to fat were upregulated by the HCD. A group of genes
that are of particular interest for understanding potential
mechanisms of HCD-induced hepatic steatosis is presented in
Table 3. The data in this table demonstrate that, as expected,
HCD upregulates several genes directly involved in carbohydrate
conversion to fat [11, 21, 22]. These genes have been displayed
within the context of the metabolic pathways to which they
belong (Fig. 4) in an attempt to facilitate understanding of
their role(s) in HCD-induced liver steatosis. In addition, one
gene, sterol regulatory element-binding protein (SREBP)-1c,
which controls the transcription of genes involved in fatty acid
synthesis [23], and whose transcription is regulated by insulin
[24], was upregulated. Another gene, peroxisome
proliferator-activated receptor (PPAR)-α, also a transcription
factor, but one that controls genes encoding enzymes involved in
fatty acid oxidation [25], was likewise upregulated.

-------
Table 2 Selected genes that underwent 2-fold or higher change in their expression in either direction (up or down) in the liver of HCD-fed mice
as compared to the liver of SD-fed mice
Functional group and gene name                                            Gene bank       Fold change   Expression*
                                                                          accession code  (Up or Down)
Apoptosis
Phosphatidylserine receptor                                               AK017622.1      2.5           489 ± 40
Cell motility
Eps8 (Epidermal growth factor receptor pathway substrate 8                NM_007945       2.0           67 ± 12
  or EGF receptor kinase substrate 8)
Cell proliferation
Cold inducible RNA binding protein                                        NM_007705.1     2.5           328 ± 19
Channels/Transporters
Amiloride-sensitive cation channel 5, intestinal                          NM_021370.1     2.4           182 ± 6
Lipocalin 13 (precursor) (retinoid carrier protein)                       BC027556.1      2.4           256 ± 24
Complex lipid metabolism
Galactocerebrosidase (galactosylceramidase; precursor) (EC 3.2.1.46)      NM_008079.1     3.1           71 ± 6
Cytochromes P450
Cytochrome P450, family 17, subfamily a, polypeptide 1                    NM_007809.1     2.6           326 ± 23
  (catalyzes 17-α hydroxylase and 17,20-lyase activities) (EC 1.14.99.9)
P450 (cytochrome) oxidoreductase (EC 1.6.2.4)                             NM_008898.1     26            4,896 ± 306
Cytokines/Cytokine receptors
Emr4 (EGF-like module containing, mucin-like, hormone receptor-like       AY032690.1      2.9           85 ± 20
  sequence 4)
Chemokine (C-C) motif ligand 9 (CCL-9; small inducible cytokine A9;       AF128196.1      2.2           2,034 ± 130
  macrophage inflammatory protein 1-gamma)
Chemokine (C-C) motif ligand 2                                            AF065933.1      10.2          218 ± 26
Macrophage inflammatory protein-related protein-2 (MRP-2)                 NM_011338       2.2           4,897 ± 231
Glutathione metabolism
Glutathione S-transferase, μ 3 (EC 2.5.1.18)                              J03953.1        3.7           5,970 ± 490
Glutathione S-transferase, alpha 2 (Yc2) (EC 2.5.1.18)                    NM_008182       8.3           9,268 ± 966
Nucleic acid metabolism
Deoxyribonuclease II alpha (EC 3.1.22.1)                                  NM_010062.1     3.4           8,015 ± 529
  A role in DNA degradation in apoptosis.
Nucleotide metabolism
Cytidine deaminase (EC 3.5.4.5)                                           AK008793.1      2.2           475 ± 41
Protein metabolism
Ubiquitin specific protease 18                                            NM_011909.1     2.5           332 ± 27
Serine (or cysteine) proteinase inhibitor, clade B, member 1a             AB030426        3.4           520 ± 122
Secretory products
Intestinal trefoil factor (TFF3 precursor)                                NM_011575.1     4.5           256 ± 36
Signaling/signal transduction
Membrane anchored glycoprotein RECK                                       NM_016678.1     4.5           187 ± 23
  (inhibitor of tumour invasion, regulator of MMP-9)
Heat shock protein 1                                                      NM_013560.1     2.1           5,538 ± 995
Guanine nucleotide binding protein, alpha 14                              NM_008137.1     2.3           187 ± 27
Macrophage expressed gene 1 (shares a distant ancestry to perforin)       L20315.1        2.5           1,453 ± 79
Calcium/calmodulin-dependent kinase II gamma (Camk2g)                     BC025597.1      2.1           133 ± 6
LIM homeobox protein 2 (Lhx2)                                             NM_010710.1     2.1           63 ± 5
Methyl-CpG binding domain protein 1                                       AK007371.1      4.5           385 ± 37
Retinoic acid early transcript 1, alpha (Rae-1 alpha)                     NM_009016.1     3.5           2,613 ± 460
  (early mammalian embryogenesis)
STAT-induced STAT inhibitor-2 mRNA                                        BB244736        2.3           1,545 ± 87
DNAJ (Hsp40) homolog, subfamily B, member 1                               NM_018808.1     3.7           8,175 ± 314
  (Heat shock 40 kDa protein 1)

-------
Table 2 continued
Functional group and gene name                                            Gene bank       Fold change   Expression*
                                                                          accession code  (Up or Down)
SH3-binding kinase 1 (Sbk)                                                BC025837.1      2.3           137 ± 9
Inhibin E (INHBE)                                                         NM_007945.1     2.7           67 ± 11
Xenobiotic metabolism
Sulfotransferase 1C1 (cytosolic)                                          NM_026935.1     2.1           214 ± 19
* Expressed in intensity reading units (Mean ± SEM, for five animals in each group), for the SD-fed mice. To find the absolute value for the high
carbohydrate-diet fed mice, the value given in this column should be multiplied by the value in the column Up or divided by the value in the column
Down. The difference between the two groups with regard to gene expression is significant at P < 0.01. A complete table (Table I) comprising all 451
genes that underwent changes of 1.5-fold or more is available as supplemental material to this article
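The conversion described in the footnote above can be sketched in a few lines of Python. This is only an illustration: the function name `hcd_expression` and its arguments are hypothetical, the inputs are the SD-group means and fold changes tabulated for lipocalin 13 and methyl-CpG binding domain protein 1, and the result is a point estimate that ignores the SEM.

```python
def hcd_expression(sd_expression: float, fold: float, direction: str) -> float:
    """Estimate the HCD-group intensity from the SD-group mean and a fold change.

    Hypothetical helper: multiplies for an "up" change, divides for a "down" change,
    as the table footnote describes.
    """
    if fold <= 0:
        raise ValueError("fold change must be positive")
    if direction == "up":
        return sd_expression * fold
    if direction == "down":
        return sd_expression / fold
    raise ValueError("direction must be 'up' or 'down'")


# Lipocalin 13 (BC027556.1): SD mean 256, upregulated 2.4-fold
lipocalin_hcd = hcd_expression(256, 2.4, "up")

# Methyl-CpG binding domain protein 1 (AK007371.1): SD mean 385, downregulated 4.5-fold
mbd1_hcd = hcd_expression(385, 4.5, "down")

print(f"Lipocalin 13, estimated HCD intensity: {lipocalin_hcd:.1f}")
print(f"MBD1, estimated HCD intensity: {mbd1_hcd:.1f}")
```

Because the fold changes are rounded to one decimal place in the table, values recovered this way are approximate.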
Table 3  Selected genes that likely have direct relevance to the biochemical mechanisms underlying HCD-induced fatty liver in the mouse
Functional group and gene name                                             Gene bank           Fold change          Expression*
                                                                           accession code      (Up)
Carbohydrate metabolism
Solute carrier family 2 (facilitated glucose transporter, member 5)              NM_019741.1         2.7                   165 ± 8
Glucokinase (Hexokinase D, EC 2.7.1.1)                                     BC011139.1          2.4                  1,614 ± 202
Phosphoglucomutase 3 (EC 5.4.2.6)                                          AK013402.1          1.6                   492 ± 53
UDP-glucose pyrophosphorylase 2 (EC 2.7.7.9)                               AF424698.1          1.6                 10,426 ± 455
Ketohexokinase (EC 2.7.1.3)                                                BC013464.1          1.7                 11,278 ± 326
Malic enzyme, supernatant (cytosolic) (EC 1.1.1.40)                           NM_008615.1         1.8                  4,292 ± 440
Glucose phosphate isomerase (EC 5.3.1.9)                                    NM_008155.1         1.8                  3,742 ± 147
Pyruvate dehydrogenase kinase, isoenzyme 4 (EC 2.7.1.99)                    NM_013743.1         1.8                   472 ± 24
Pyruvate dehydrogenase kinase, isoenzyme 1 (EC 2.7.1.99)                    BC027196.1          1.7                  1,261 ± 61
Glycerolphosphate dehydrogenase 2, mitochondrial (EC 1.1.1.8)                NM_010274.1         1.6                  2,412 ± 240
Dihydrolipoamide S-acetyltransferase (E2 component of pyruvate               BC026680.1          1.6                  1,815 ± 126
  dehydrogenase complex) (EC 2.3.1.12)
Glucose 6-phosphate dehydrogenase X-linked (EC 1.1.1.49)                    NM_008062.1         1.8                   265 ± 38
Fatty acid and complex lipid metabolism
ATP citrate lyase  (EC 4.1.3.8)                                               BI456232             1.9                  9,127 ± 1072
Fatty acid desaturase 2  (EC 1.14.99.6)                                        BB430611            1.9                  5,419 ± 390
Glycerol-3-phosphate acyltransferase, mitochondrial (EC 2.3.1.15)              NM_008149.1         1.9                  8,022 ± 395
Monoglyceride lipase (EC 3.1.1.23)                                          NM_011844.2         1.6                  2,362 ± 128
Peroxisome proliferator-activated receptor alpha                              BC016892.1          1.7                  6,062 ± 714
Stearoyl-CoA desaturase (EC 1.14.99.5)                                      NM_009127.1         1.9                  9,793 ± 1,908
Fatty acid desaturase 2  (Delta-6 desaturase)  (EC 1.14.99.5)                    NM_019699.1         2.5                  6,486 ± 270
Sterol regulatory element binding factor  1 (SREBP-1)                         AI326423             1.6                  5,679 ± 257
Lipocalin 13 (precursor) (retinoid carrier protein)                             BC027556.1          2.4                   256 ± 24
Fatty acid binding protein 5, epidermal                                       BC002008.1          2.5                  3,714 ± 448
Adiponutrin (a triacylglycerol lipase and acylglycerol O-acyltransferase)         NM_054088.1         5.6                   144 ± 8
  (EC 3.1.1.3, and EC 2.3.1.-)
ELOVL family member 6 (elongation of very long chain  fatty acids;            NM_130450.1         2.4                  2,787 ± 185
   a lipogenic enzyme regulated by SREBPs)
* Expressed in intensity reading units (Mean ± SEM, for five animals in each group), for the SD-fed mice. To find the absolute value for the
high carbohydrate-diet fed mice, the value given in this column should be multiplied by the value in the column Up. The difference between the
two groups with regard to gene expression is significant at P < 0.01. An expanded list of genes involved in carbohydrate and fat metabolism is
given in Table I as supplementary material

-------
Table 4  Down-regulation of glutathione S-transferases in the liver of
mice fed a high carbohydrate diet
Gene name                               Access code       Change (-fold)
                                                          in HCD group
Glutathione S-transferase μ 1           J03952.1          Down 1.8
Glutathione S-transferase θ 3           BC003903.1        Down 2.0
Glutathione S-transferase μ 3           J03953.1          Down 3.7
Glutathione S-transferase, α 2 (Yc2)    NM_008182.1       Down 8.0
Glutathione S-transferase α 4           NM_010357.1       Down 3.4

   Of interest, the vast majority of genes identified in this
study (Tables 2  and 5,  and Table  I of supplementary
material) cannot  be directly  linked to the biochemical or
molecular  processes leading to fatty  liver.  Changes in
expression of these  genes likely reflect alterations in cel-
lular processes caused by, rather than leading to, hepatic
steatosis. Owing  to space limitations, these genes will not
be discussed in any  detail.
   NASH is thought to evolve through a 2-hit process in which
the first hit is steatosis. The second hit or hits include
multiple factors such as oxidative stress, proinflammatory
cytokines, mitochondrial dysfunction, insulin resistance, and
even industrial exposures [1]. An important issue is whether the
data reported herein provide insights into genes that may
predispose the liver to a "second hit," thus leading to NASH.
Since no changes in transcript expression of proinflammatory or
profibrotic cytokines (e.g., TNF-α, IL-1β, IL-18, TGF-β, and
others), classically thought to mediate liver injury including
fibrosis, were identified in the HCD-fed mouse liver, it seems
that these classic proinflammatory cytokines, at least those
secreted in the liver, may not be critically involved in the
early aspects of this disease. The lack of TNF-α participation
in dietary-induced NASH was recently suggested by Deng et al.
[7], who demonstrated that knocking out the TNF-α receptor 1
does not prevent NASH induced by force-feeding a fat-enriched
diet. Similar studies by Dela Pena et al. [26] showed that TNFR1
knockout mice still develop hepatic steatosis when fed a
methionine-restricted, choline-deficient diet. Moreover, studies
in children with NAFLD demonstrate normal serum TNF but
decreased adiponectin as early events [27]. Our study does show
that the transcripts of two macrophage inflammatory proteins,
MCP-1 and MCP-2 (Table 2), were upregulated, which may predict
potential facilitation of extrahepatic cell infiltration into
the liver and the initiation of inflammation. Several genes,
other than proinflammatory cytokines, may be considered as
plausible candidates for a potential role in the progression to
NASH. One of these is the macrophage-expressed gene 1
(L20315.1), a relative of perforin (granzyme B), which was
upregulated 2.5-fold (Table 2). Another gene is
methyl-CpG-binding domain protein 1 (AK007371.1, known as MBD1),
a member of a family of five mammalian methyl-CpG-recognizing
proteins, which plays a key role in maintaining a
transcriptionally inactive state of methylated promoters
[28, 29]. This gene was downregulated 4.5-fold. Its
downregulation may facilitate expression of genes that otherwise
would be in a state of restricted transcription.
   Another group of genes of interest for potential progression
of the steatotic liver to NASH is represented by glutathione
S-transferases (Table 4), which were downregulated in the
steatotic liver. Downregulation of these enzymes may lead to a
decreased capacity of the liver to detoxify xenobiotics, thereby
increasing the susceptibility
Table 5  Comparison of gene expression changes for control and HCD-fed mice as determined by quantitative RT-PCR and cDNA microarray
Gene code     Applied Biosystems      Gene name                                                     Change in expression
              assay identification                                                                  Microarray  Q-RT-PCR
AK003441.1    Mm00614943_m1           Ankyrin repeat and KH domain containing 1                     2.5 ↑       2.6 ↑
AW489168      Mm00519268_m1           Bcl-2 binding component 3                                     2.1 ↑       2.8 ↑
AF065933.1    Mm00441242_m1           Chemokine (C-C motif) ligand 2                                10.2 ↓      2.7 ↓
NM_009998.1   Mm00456591_m1           Cytochrome P450, family 2, subfamily b, polypeptide 10        20.0 ↓      ND*
NM_010062.1   Mm00438463_m1           Deoxyribonuclease II alpha                                    3.4 ↓       2.1 ↓
NM_018808.1   Mm00444519_m1           DnaJ (HSP40) homolog, subfamily b, member 1                   3.7 ↓       5.8 ↓
NM_007945     Mm00514752_m1           Eps8 (Epidermal growth factor receptor pathway substrate 8)   2.0 ↓       1.8 ↓
NM_008182     Mm00833353_m1           Glutathione S-transferase α 2                                 8.3 ↓       12.5 ↓
U72881.1      Mm00803317_m1           Regulator of G protein signaling 16                           7.3 ↑       7.2 ↑
NM_008898.1   Mm00435876_m1           P450 cytochrome oxidoreductase                                26.0 ↓      ND*
* ND, not detected. The relative expression for Q-RT-PCR was normalized to the 18S rRNA copy level. Arrows indicate the direction of change

-------
[Fig. 3 image: Western blot bands for GCK, EGFR, Cyp450 reductase, and β-actin in the SD and HCD groups]
Fig. 3  Gel images illustrating the Western blot assay of protein
abundance. The following genes were tested for their protein
abundance: GCK, glucokinase; EGFR, epidermal growth factor-like
module containing mucin-like, hormone receptor-like receptor
sequence 4 (or Emr4); Cyp450 reductase, and β-actin. Other
abbreviations: SD, standard diet; HCD, high carbohydrate diet. The
following values apply for the HCD-to-SD ratio: GCK, 2.00 ± 0.18;
Cyp450 reductase, 0.16 ± 0.08; EGFR, 1.23 ± 0.28
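For comparison with the fold changes reported for transcripts, the HCD-to-SD protein ratios quoted in the Fig. 3 caption can be restated as a direction plus a fold change. The sketch below assumes the usual convention that a ratio below 1 corresponds to a decrease of 1/ratio-fold; the function name `ratio_to_fold` is hypothetical, and only the mean ratios are used (the SEM is omitted).

```python
from typing import Tuple


def ratio_to_fold(ratio: float) -> Tuple[str, float]:
    """Convert an HCD-to-SD abundance ratio into (direction, fold change)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if ratio >= 1:
        return ("up", ratio)
    # A ratio below 1 is reported as a decrease of 1/ratio-fold.
    return ("down", 1.0 / ratio)


# Mean HCD-to-SD ratios quoted in the Fig. 3 caption.
ratios = {"GCK": 2.00, "Cyp450 reductase": 0.16, "EGFR (Emr4)": 1.23}

for protein, ratio in ratios.items():
    direction, fold = ratio_to_fold(ratio)
    print(f"{protein}: {fold:.2f}-fold {direction}")
```

Under this convention, the Cyp450 reductase ratio of 0.16 corresponds to a 6.25-fold decrease in protein abundance.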

of  the liver to  undergo  pathologic  changes including
necrosis.   Such  changes may  be  triggered by  chemical
agents  from  the environment,  and there are well-docu-
mented examples of industrial NASH, such as that caused
by petrochemical exposure [30, 31].
   Lastly, the increased circulating levels of both glucose and
insulin recorded in this study suggest the existence of a
certain degree of insulin resistance. These findings raise the
question of whether the steatotic liver induced by HCD feeding
is insulin resistant. The concept of an obligatory association
of the steatotic liver with insulin resistance has been
challenged by experimental and clinical data. Thus, it has been
demonstrated that NASH can occur in the absence of overt insulin
resistance [32-36]. On the basis of the data presented in this
and other studies, we surmise that, in the model of hepatic
steatosis used in this study, and at the time the animals were
killed, the liver is not insulin resistant. Thus, (i) the
insulin response element-binding protein (IREBP-1) [37, 38], a
target of insulin signal transduction downstream of the
PI-3K/protein kinase B (Akt) pathway, regulates the expression
of many enzymes involved in carbohydrate conversion to fat; this
factor can only be active in the presence of an intact insulin
signaling cascade, (ii) the enzymes involved in carbohydrate
conversion to fat, including glucokinase (EC 2.7.1.1),
ketohexokinase (EC 2.7.1.3), glucose-6-phosphate dehydrogenase
(EC 1.1.1.49), and others, were upregulated in the liver of the
mouse model used in our study (Table 3);
[Fig. 4 diagram: metabolic pathway map showing the extracellular space and the hepatocyte]
Fig. 4 Schematic representation of the metabolic pathways involved
in carbohydrate conversion to fat in the liver. Represented are
glycolysis, part of the citric acid cycle, citrate cleavage enzyme, fatty
acid synthase, fatty acyl-CoA desaturase, pentosephosphate pathway
and triacylglycerol synthesis. The following nonstandard abbreviations
are used: Fru, fructose; Gluc, glucose; Fru-1P, fructose 1-phosphate;
Gluc-6-P, glucose 6-phosphate; TrP, triosephosphates;
OxAc, oxaloacetate; Citr, citrate; Mal, malonyl-; CAC, citric acid
cycle; Facyl-, fatty acyl; -(C=C)-, monounsaturated, long-chain fatty
acid; TAG, triacylglycerol; PPP, pentosephosphate pathway. Red
triangles denote enzymes or other proteins whose gene expression
was upregulated at least 1.5-fold, and they are as follows: 1, glucose
transporter 5; 2, ketohexokinase; 3, glucokinase; 4, components of
pyruvate dehydrogenase complex (pyruvate dehydrogenase kinase
isoenzymes); 5, citrate cleavage enzyme; 6, glucose 6-phosphate
dehydrogenase; 7, stearoyl (fatty acyl)-CoA desaturase;
8, acylglycerol O-acyltransferase; 9, malic enzyme (cytosolic and
NADP-dependent). The pentosephosphate pathway (PPP) is represented here
at the lower right side of the figure by the reaction catalyzed by
glucose 6-phosphate dehydrogenase, which was upregulated in HCD-fed
animals. Genes encoding enzymes involved in fatty acid β-oxidation,
a pathway that may contribute to triacylglycerol accumulation
in the liver, were not found to be changed. Also, some of the
genes listed in Table 4 are not represented in the figure in order to
keep a certain degree of simplicity. Enzyme classification (EC) for the
enzymes in the map is given in Table 3

-------
such an upregulation could not be accomplished otherwise
than through an  adequate response of the liver cells  to
insulin, (iii)  nonenzymatic factors involved in  lipid syn-
thesis,  such  as SREBP-1, are also  under  the control  of
insulin [24, 39]; the expression of this factor that, in turn,
controls  several  major  enzymes involved  in fatty acid
synthesis  [40], was upregulated 1.6-fold (Table 3). Taken
together,  the gene expression data  of this and of  cited
studies are not compatible with an  insulin-resistant liver
during the phase of NAFLD  in  which our animals  were
killed. Further research is required to study the evolution of
the  fatty  liver,  including  potential  progression  to steato-
hepatitis,  in the model used in this study.
   In conclusion,  our study (i) demonstrates the usefulness
of the mouse model of HCD-induced hepatic  steatosis  for
the  study of  the  fatty  liver of  nutritional  origin,  (ii)
emphasizes the importance of using large-scale gene pro-
filing of  the  liver in identifying  potential  causes  and
understanding the mechanisms underlying the  disease, and
(iii)  offers  a  database  for further  investigation  of  the
mechanisms  underlying  the hepatic  steatosis of  dietary
origin and its potential progression to NASH.

Acknowledgements  The work reported in this study  was supported
by National Institutes of Health grants (to I.V.D., Z.S., S.S.B., T.B.K.,
A.V.S.,  and C.J.M.) and a Department of Veterans Affairs grant (to
C.J.M.).


References

 1. McClain CJ, Mokshagundam PL,  Barve SS, Song Z, Hill DB,
    Chen T, Deaciuc I. Mechanisms of non-alcoholic steatohepatitis.
    Alcohol 2004;34:67-79.
 2. Den Boer M, Voshol PG, Kuipers F,  Havekes LM,  Romijn JA.
    Hepatic steatosis: a mediator of the metabolic syndrome. Lessons
    from animal models. Arterioscler Thromb Vasc Biol
    2004;24:644-9.
 3. Koteish A, Diehl AM. Animal models of steatosis. Semin Liver
    Dis 2001;21:89-104.
 4. Lieber  CS, Leo MA, Mak KM, Xu  Y, Cao  Q,  Ren C, et al.
    Model of non-alcoholic steatohepatitis. Am J Clin Nutr 2004;79:
    502-9.
 5. Nanji AA.  Animal models of nonalcoholic fatty liver disease and
    steatohepatitis. Clin Liver Dis 2004;8:559-74.
 6. Portincasa P, Grattagliano I, Palmieri VO, Palasciano G. Nonal-
    coholic  steatohepatitis:   recent advances from  experimental
    models to clinical management. Clin Biochem  2005;38:203-17.
 7. Deng QG, She H, Cheng JH, French SW, Koop DR, Xion S, et al.
    Steatohepatitis  induced  by  intragastric  overfeeding in  mice.
    Hepatology 2005;42:905-14.
 8. Feldstein AE, Canbay A, Guicciardi ME, Higuchi H, Bronk SF,
    Gores GJ.  Diet  associated steatosis sensitizes to Fas mediated
    liver injury in mice. J Hepatol  2003;39:978-83.
 9. Friedman HA, Nylund B. Intestinal fat digestion, absorption and
    transport. Am J  Clin Nutr 1980;33:1108-39.
10. Foufelle F, Girard J, Ferre P.  Regulation of lipogenic enzyme
    expression by glucose in liver and adipose tissue: a review of the
    potential cellular and molecular mechanisms. Adv Enzyme Regul
    1996;36:199-226.
11.  Flatt JP. Conversion of carbohydrate to fat in the adipose tissue:
    an energy-yielding and, therefore, self-limiting process. J Lip Res
12.  Basciano H,  Federico L, Adeli K. Fructose, insulin resistance,
    and metabolic dyslipidemia. Nutr Metab 2005;2:5. doi:
    10.1186/1743-7075-2-5.
13.  Diraison F, Yankah V, Letexier D, Dusserre E, Jones P, Beylot
    M. Differences  in  the regulation  of adipose tissue and  liver
    lipogenesis  by   carbohydrates  in  humans.   J   Lipid  Res
    2003;44:846-53.
14.  Hudgins L, Hellerstein M, Seidman C,  Seidman C, Neese  R,
    Diakun J, et al. Human fatty acid synthesis is stimulated by eu-
    caloric   low   fat,  high   carbohydrate  diet.   J  Clin  Invest
    1996;97:2081-91.
15.  Schwartz JM, Linfoot P, Dare D, Aghajanian K. Hepatic de  novo
    lipogenesis in normoinsulinemic and hyperinsulinemic  subjects
    consuming high-fat,  low  carbohydrate and low-fat,  high carbo-
    hydrate isoenergetic diets. Am J Clin Nutr 2003;77:43-50.
16.  Quintanilha AT,  Packer L, Davies JM, Racanelli TM, Davies KJ.
    Membrane effects of  vitamin E  deficiency: bioenergetic and
    surface charge  density studies of skeletal  muscle and  liver
    mitochondria. Ann NY Acad Sci 1982;393:32-47.
17.  Deaciuc IV, Peng X, D'Souza NB,  Shedlofsky SI, Burikhanov R.
    Microarray gene analysis of the liver in a rat model of chronic,
    voluntary alcohol uptake. Alcohol 2004;32:113-27.
18.  Jarvelainen HA, Fang  C, Ingelman-Sundberg M, Lindros KO.
    Effect of chronic administration of endotoxin and ethanol on rat
    liver  pathology  and  proinflammatory  and  antiinflammatory
    cytokines.  Hepatology 1999;29:1602-14.
19.  Bulera  SJ, Eddy  SM, Ferguson  E, Jatkoe TA,  Reindel JF,
    Bleavins MR, et  al. RNA expression in the early characterization
    of hepatotoxicants in Wistar rats by high-density DNA microarrays.
    Hepatology 2002;33:1239-58.
20.  Deaciuc IV, Doherty DE, Burikhanov R, Lee EY, Stromberg AJ,
    Peng X, et al. Large-scale gene profiling  of the liver in a mouse
    model  of chronic,  intragastric  ethanol infusion. J  Hepatol
    2004;40:219-27.
21.  Spencer AF,  Lowenstein JM. The  supply of precursors  for the
    synthesis of fatty acids. J Biol Chem 1962;237:3640-8.
22.  Wise EM, Ball  EG. Malic enzyme and  lipogenesis. Proc Natl
    Acad Sci USA 1964;52:1255-63.
23.  Müller-Wieland D, Kotzka J. SREBP-1: gene regulatory key to
    syndrome X? Ann NY  Acad Sci 2002;967:19-27.
24.  Eberle D,  Hegarty B, Bossard P,  Ferre  P, Foufelle F. SREBP
    transcription  factors: master regulators  of lipid  homeostasis.
    Biochimie 2004;86:839-48.
25.  Tan NS, Michalik L, Desvergne B, Wahli W. Multiple expression
    control mechanisms  of peroxisome proliferator-activated recep-
    tors  and  their  target  genes.  J  Steroid Biochem Mol  Biol
    2005;93:99-105.
26.  Dela Pena A, Leclerq I, Field J, George J, Jones B, Farrell G.
    NF-kappaB activation, rather than TNF, mediates hepatic inflammation
    in a murine dietary model of steatohepatitis. Gastroenterology
    2005;129:1663-74.
27.  Louthan MV, Barve S, McClain CJ, Joshi-Barve S. Decreased
    serum adiponectin: an early event in pediatric non-alcoholic fatty
    liver disease.  J Pediatr 2005;147:835-8.
28.  Ballestar E, Wolffe AP. Methyl-CpG-binding proteins. Targeting
    specific gene repression. Eur J Biochem 2001;268:1-6.
29.  El-Osta A, Wolffe AP. DNA methylation and histone deacety-
    lation in the  control of gene expression: basic biochemistry to
    human development and disease. Gene Expr 2000;9:63-75.
30.  Mehlman  MA.  Dangerous  and cancer-causing  properties  of
    products and chemicals in the oil refining and petrochemical
    industry. VIII. Health effects of motor fuels: carcinogenicity of
    gasoline - scientific update. Environ Res 1992;59(1):238-49.

-------
31.  Cave M, Deaciuc IV, Mendez C, Song Z, Joshi-Barve S, Barve S,
    et al. Nonalcoholic fatty liver disease:  predisposing factors and
    the role of nutrition. J Nutr Biochem 2007;18:184-95.
32.  Adams LA, Angulo P, Lindor KD. Nonalcoholic fatty liver dis-
    ease. Can Med Assoc J 2005;172:899-905.
33.  Browning JD, Horton JD. Molecular mediators of hepatic steatosis
    and liver injury. J Clin Invest 2004;114:147-52.
34.  Browning JD, Szczepaniak LS, Dobbins R, Nurember P, Horton
    JD,  Cohen JC,  et al. Prevalence of hepatic steatosis in an urban
    population in the United States: impact of ethnicity. Hepatology
    2004;40:1387-95.
35.  Evans  RM,  Barish GD, Wang YX. PPARs and the complex
    journey to obesity. Nat Med 2004;10:1-7.
36.  Hamaguchi M,  Kojima T,  Takeda N, Taniguchi H, Fujii K,
    Omatsu T, et al. The metabolic syndrome as a predictor of nonalcoholic
    fatty liver disease. Ann Intern Med 2005;143:722-8.
37. Villafuerte BC, Phillips LS, Rane MJ, Zhao W. Insulin-response
    element-binding protein 1.  A novel Akt substrate involved in
    transcriptional action of insulin. J Biol Chem 2004;279:36650-9.
38. Villafuerte BC, Kaytor EN. An insulin-response element-binding
    protein that ameliorates hyperglycemia in diabetes. J Biol Chem
    2005;280:20010-20.
39. Foufelle F, Ferre P. Regulation of carbohydrate metabolism by
    insulin: role of transcription factor SREBP-1c in the hepatic
    transcriptional effects of the hormone. J Soc Biol 2001;195:243-8.
40. Griffin MJ, Sul HS. Insulin regulation of fatty acid synthase gene
    transcription: roles of USF and SREBP-1c. Life 2004;56:595-600.

-------
TOXICOLOGICAL SCIENCES 110(2), 449-462 (2009)
doi:10.1093/toxsci/kfp098
Advance Access publication May 7, 2009
   Mode of Action for  Reproductive and  Hepatic Toxicity  Inferred  from
                          a  Genomic  Study of  Triazole  Antifungals
                                         Amber K. Goetz*,† and David J. Dix*,1

 *National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North
     Carolina 27711; and †Department of Environmental and Molecular Toxicology, North Carolina State University, Raleigh, North Carolina 27695

                                        Received January 16, 2009; accepted April 21, 2009
  The mode of action  for the  reproductive toxicity  of some
triazole antifungals has been characterized as an increase in serum
testosterone and hepatic  response, and reduced insemination and
fertility indices. In order to refine our mechanistic understanding
of these potential modes of action, gene expression profiling was
conducted on liver and  testis from  male  Wistar Han  IGS rats
exposed to myclobutanil (500, 2000 ppm), propiconazole (500,
2500 ppm), or triadimefon (500,1800 ppm) from gestation day six
to postnatal day 92. Gene expression profiles indicated that all
three  triazoles significantly perturbed the fatty acid, steroid, and
xenobiotic metabolism pathways in the male rat liver. In  addition,
triadimefon modulated expression of genes in the liver  from the
sterol biosynthesis  pathway. Although expression  of individual
genes were affected, there were no common pathways modulated
by all three triazoles in the testis. The pathways identified in the
liver included numerous genes involved in phase I-III metabolism
(Aldh1a1, Cyp1a1, Cyp2b2, Cyp3a1, Cyp3a2, Slco1a4, Udpgtr2),
fatty acid metabolism (Cyp4a10, PCX, Ppap2b), and steroid
metabolism (Ugt1a1, Ugt2a1) for which expression was altered
by the triazoles. These differentially expressed genes form part of
a network involving lipid, sterol, and steroid homeostatic path-
ways regulated by the constitutive androstane (CAR), pregnane X
(PXR),  peroxisome  proliferator-activated  alpha, and   other
nuclear receptors in liver. These relatively high dose and long-
term exposures to triazole antifungals appeared to perturb fatty
acid and steroid metabolism in the male rat liver predominantly
through the CAR and PXR signaling pathways. These  toxicoge-
nomic effects describe a plausible series of key  events contributing
to the disruption in steroid homeostasis and reproductive toxicity
of select triazole antifungals.
  Key Words: myclobutanil;  propiconazole;  triadimefon;
toxicogenomics; steroid metabolism.
  Disclaimer: The United States Environmental Protection Agency through its
Office of Research and Development funded and managed the research
described here. It has been subjected to Agency  administrative review and
approved for publication.
  1 To whom correspondence should be addressed at National Center for
Computational Toxicology,  Mail Drop  D343-03, U.S.  Environmental
Protection Agency, Research Triangle Park, NC 27711. Fax: (919) 541-1194.
E-mail: dix.david@epa.gov.

Published by Oxford University Press 2009.
        The ability of triazole antifungals to bind and inhibit fungal
      lanosterol-14α-demethylase activity (cyp51) makes this class
      of compounds an effective tool in controlling many species and
      strains of fungi (Ghannoum and Rice, 1999). Disruption in
      ergosterol biosynthesis leads to a build up of toxic intermediate
      sterols in the fungal cell membrane, increasing membrane
      permeability and inhibiting fungal growth (Vanden Bossche
      et al., 1990). Hence, triazole antifungals are used for their
      pesticidal mode of action, inhibition of fungal cyp51, and
      have proven valuable in controlling many types of fungal
      disease.
        Cytochrome P450 51 (Cyp51) is a conserved gene in fungi,
      protists, mammals,  and plants required  for sterol biosynthesis
      in all eukaryotic systems. The ability of triazoles to bind to the
      heme  protein  and  inhibit  CYP-dependent enzymes  raises
      concerns over triazole effects on hormone synthesis and drug
      metabolism  (Barton et al., 2006;  Goetz et al., 2007; Hester
      et al.,  2006; Sun et al., 2007; Tully et  al., 2006; Wolf et al.,
      2006). In rodents, reproductive  toxicity  has  been  reported
      following administration of myclobutanil or triadimefon, but
      not propiconazole;  and carcinogenicity following administra-
      tion of propiconazole or triadimefon,  but not myclobutanil
      (Goetz et al., 2007; U.S. EPA, 1995, 1996, 2001, 2005a,b,c,
      2006). These data prompted interest in gaining a better
      understanding of the modes and mechanisms of action for
      triazole-related reproductive toxicity and whether there are
      common modes of action for triazole fungicides.
        In a 14-day oral (gavage) toxicity study, adult male Sprague
      Dawley rats were administered fluconazole (0, 2, 25, or 50 mg/
      kg/day), myclobutanil (0, 10, 75, or 150 mg/kg/day), propiconazole
      (0, 10, 75, or 150 mg/kg/day), or triadimefon (0, 10, 50, or
      115 mg/kg/day). Only myclobutanil (150 mg/kg/day) produced
      a statistically significant increase in serum testosterone levels
      (Tully et al., 2006). In contrast, results from a reproduction and
      fertility study examining developmental  and adult reproductive
      effects in Wistar Han rats exposed via feed to myclobutanil (ca.
      6.1, 32.9,  or 133.9 mg/kg/day), propiconazole (ca. 6.7, 31.9, or
      169.7 mg/kg/day) or triadimefon (ca. 6.5, 33.1, or 139.1 mg/kg/
      day) from gestation day 6 to postnatal day 92, demonstrated that
      all  three  triazoles  caused  a significant  increase in  serum

-------
450
                                                          GOETZ AND DIX
testosterone  levels.  This  evidence  combined  with reduced
fertility,  hepatomegaly,   and  changes  in  pituitary weights
(myclobutanil  only)  suggested  a  disruption  in  testosterone
homeostasis  was  a key mode of  action in the  reproductive
toxicity (Goetz  et  al.,  2007). However, the modes of triazole
related toxicity to the testis and liver, disruption in testosterone
homeostasis, and reduced fertility were unclear.
  Previous research on triazoles has focused on the metabolic,
hepatic  and  thyroid response  to myclobutanil,  propiconazole
and triadimefon following short term  (4  days) to subchronic
(90 days) exposure in the adult rat  and/or mouse (Allen et al.,
2006; Barton et al., 2006; Chen et al., 2009; Goetz et al., 2006;
Hester and Nesnow, 2008; Hester et al., 2006; Sun et al., 2007;
Tully et al., 2006; Ward et al., 2006; Wolf et al., 2006). This
work focused  on  delineating  common  modes  of triazole
toxicity in the  adult animal,  with  an emphasis on defining
biological  pathways  relevant  for  both  carcinogenic   and
noncarcinogenic effects.  Toxicogenomics studies have  been
conducted  in liver,  thyroid, and testis following  4 days to
subchronic exposures to these three triazoles. The goal of the
present study was to gain a better understanding of the  modes
and mechanisms of action for the triazole reproductive toxicity
observed following gestation  to adulthood exposure  (Goetz
et al., 2007). The specific aim was to identify key biological
pathways affected by triazoles in the liver and testis  following
exposure from gestation through adulthood, in  order to better
understand the disruptions in testosterone homeostasis.
  Using tissue  samples from  the previously published study
(Goetz et al., 2007), gene expression changes were assessed in
the male rat liver  and testis following exposure  to  myclobu-
tanil, propiconazole, or triadimefon from gestational day (GD)
6  through  postnatal  day  (PND)  92 in  order to  test the
hypothesis  that triazoles  disrupt testosterone homeostasis by
increasing  expression  of   genes  involved  in  testosterone
synthesis in the testis and decreasing expression of genes
involved in  testosterone metabolism  within  the liver.  The
PND92 time point was chosen for  this study due to increased
testosterone  levels  and compromised fertility in young  adult
males  at this  time  point  following  exposure  to triazole
antifungals (Goetz et al., 2007). We also hypothesized that if
triazoles  share  common  modes of action,  exposure to these
various triazoles would result in similar expression profiles of
steroidogenic and steroid metabolism related genes.  The  gene
expression   profiling   in  this  study,  in   addition   to   prior
toxicological assessments, was expected to  provide mechanis-
tic  insights  into  potentially   shared  modes   of action for
myclobutanil, propiconazole, and triadimefon.
                MATERIALS AND METHODS

  Animal husbandry and dosing regimen.  A full description of animal
husbandry and dosing is given in Goetz et al. (2007). Briefly, animal care,
handling, and treatment were conducted  in an American Association for
   Accreditation of Laboratory Animal Care-International accredited facility,
   and all procedures were approved by the U.S. Environmental Protection
   Agency  (EPA)  National  Human  and Environmental  Effects  Research
   Laboratories Institutional Animal Care and Use Committee. Timed pregnant
   Wistar Han IGS rats were received from Charles River Laboratories  (Raleigh,
   NC) on GD1-3, single housed, and allowed to acclimate for 3 days prior to
   the start of the treatment. Dams delivered naturally with day of delivery
   designated as PND0 for the F1 generation. F1 offspring were housed with their
   respective mothers until weaning on PND23. Males were removed from the
   dams and housed by treatment group in pairs until PND50. Males were single
   housed after PND50.
     Feed was prepared by Bayer CropScience (Kansas City, MO) as part of
   a Materials Cooperative Research and Development Agreement between the
   U.S. EPA and the U.S. Triazole Task Force. Control animals were  fed 5002
   Certified Rodent Diet with acetone vehicle added. Treatment groups used in
   this study received feed containing a dietary concentration of myclobutanil
   (M: 500 or 2000 ppm), propiconazole (P: 500 or 2500 ppm), or triadimefon
   (T: 500 or 1800 ppm). Dose levels used for this study were selected to match
   dose levels used in regulatory studies for registering these triazoles with the
   U.S. EPA. Dams began treated feeds on GD6; feed intake and body weights
   were measured weekly. The F1 generation continued on the same treated feed
   diets; feed intake and body weights were measured weekly until necropsy.
   Refer to Goetz et al. (2007) for achieved dose levels on a week by week basis.
   One male from each litter was necropsied on PND92  for transcriptional
   profiling analysis.

     RNA isolation. Total RNA was extracted from PND92 liver and testis of
   control and treatment groups  using  TRI Reagent (Molecular Research Center
   Inc., Cincinnati, OH) according to the manufacturer's protocol and subjected to
   quality control measures before application to microarrays. For quality control,
   RNA A260/A280 ratios were assessed via NanoDrop Fluorospectrometer
   (NanoDrop Technologies, Inc.,  Wilmington, DE). RNA absorbance readings
   with a range 1.8-2.1 were followed with DNase treatment, Total RNA Cleanup
   (Qiagen RNeasy), and  checked for RNA quality using the model  2100
   Bioanalyzer (Agilent Technologies,  Inc., Palo Alto, CA). Samples with a ratio
   of 28S: 18S rRNA > 1.6 were accepted for subsequent use in DNA microarrays.
   RNA was stored at -80°C until labeling for microarray hybridization.
     Microarray  hybridization  and scanning.  Microarrays and reagents were
   provided by Affymetrix as part of a Materials  Cooperative Research and
   Development Agreement. Microarray processing was conducted for EPA by
   Expression  Analysis  Inc. (Durham, NC).  Five micrograms of purified  total
   RNA from each liver or testis of three to seven individual rats per  treatment
   group  was  hybridized to Affymetrix GeneChip Rat Genome 230  2.0  plus
   microarrays  according  to  the  Affymetrix GeneChip Expression  Analysis
   Technical Manual (www.affymetrix.com).
     Microarray  and probe set analysis.  To minimize nonbiological factors,
   for example, total amount of target hybridized to each array, signal values from
   each microarray were multiplied by a scaling factor to achieve a mean intensity
   equal to 500. Converted .cel files were loaded into the JMP Genomics program
   (SAS, Inc., Cary, NC), log2 transformed, normalized using interquartile
   normalization, and analyzed for significant changes in transcript levels through
   row-by-row modeling using one-way ANOVA. For initial exploratory analysis,
   principal component analysis (PCA) was applied using JMP Genomics.
   Comparisons were made between  controls and each treatment group  with
   statistical cut-offs applied at a p value adjusted  false discovery rate  (FDR) of
   10% for liver (p < 0.000724)  or FDR of 25% for testis (p < 0.000229), and an
   absolute difference of |1.2| or greater. Probe sets representing transcribed loci,
   unknown genes, and image clones were removed from the final list of each
   analysis; probe sets with predicted annotations were kept in the analysis. The
   Affymetrix .cel files can be accessed through Gene Expression Omnibus
   (www.ncbi.nlm.nih.gov/geo); series accession numbers GSE10411 and
   GSE10412.
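The scaling and cut-off steps described above can be sketched as follows. This is a minimal illustration only, not the JMP Genomics workflow: the function names are hypothetical, the interquartile normalization and ANOVA steps are omitted, and the "absolute difference" is assumed to be a magnitude supplied by the caller.

```python
import math
from statistics import mean

TARGET_MEAN = 500.0         # target mean intensity per array (from the text)
MIN_ABS_DIFF = 1.2          # absolute-difference cut-off (from the text)
P_CUTOFF_LIVER = 0.000724   # liver p value cut-off at FDR 10% (from the text)


def scale_and_log2(signals, target=TARGET_MEAN):
    """Scale one array so its mean intensity equals `target`, then log2-transform.

    Hypothetical helper: assumes all signal values are positive.
    """
    factor = target / mean(signals)
    return [math.log2(s * factor) for s in signals]


def passes_cutoffs(p_value, abs_diff,
                   p_cutoff=P_CUTOFF_LIVER, min_abs_diff=MIN_ABS_DIFF):
    """Return True if a probe set meets both the p value and magnitude cut-offs."""
    return p_value < p_cutoff and abs_diff >= min_abs_diff
```

For the testis data set, the same filter would be called with `p_cutoff=0.000229`, the cut-off reported for an FDR of 25%.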
     Pathway analysis.  Ingenuity Pathways Analysis (IPA; Ingenuity Systems,
   www.ingenuity.com)  was used for initial pathway level analysis. Genes from

-------
                                                        TRIAZOLE TOXICOGENOMICS
                                                                                                                                            451
the data set that met the absolute difference cut-off of |1.2|, p value cut-off
based on the data set p value adjusted FDR, and associated with a canonical
pathway in the Ingenuity Pathways Knowledge Base (IPKB) were considered
for the IPA-based analysis. Canonical pathways were identified from the IPA
library that were impacted most significantly by the triazoles. The significance
of the association between  the  data set  and the  canonical pathway was
measured using the ratio of genes from the data set that mapped to the pathways
divided by the total number  of genes that mapped to the canonical  pathway.
Significance  was calculated using the right-tailed  Fisher's Exact Test by
comparing the number of focus genes that participated in  a given  pathway,
relative to the total number  of  occurrences  of these genes in  all path-
way annotations in the IPKB. Using this methodology,  over-represented path-
ways were identified containing more focus genes than  expected by  chance.
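A right-tailed Fisher's exact test of this kind is equivalent to an upper-tail hypergeometric probability, which can be sketched as below. This is an illustration under stated assumptions, not IPA's implementation: the function name and argument layout are hypothetical, and the choice of background gene set is left to the caller.

```python
from math import comb


def right_tailed_fisher(k, n_focus, n_pathway, n_total):
    """Upper-tail hypergeometric probability P(X >= k).

    k         : focus genes observed in the pathway
    n_focus   : total focus genes
    n_pathway : genes annotated to the pathway in the background
    n_total   : genes in the background (assumed reference set)
    """
    upper = min(n_focus, n_pathway)
    denom = comb(n_total, n_focus)
    tail = sum(
        comb(n_pathway, i) * comb(n_total - n_pathway, n_focus - i)
        for i in range(k, upper + 1)
    )
    return tail / denom
```

A small p value indicates that the pathway contains more focus genes than expected by chance. For example, `right_tailed_fisher(2, 2, 2, 4)` is the probability that both of two focus genes fall in a two-gene pathway drawn from a four-gene background.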
   Further  analysis of a broader set  of genes  and pathways  included those
identified  by IPA, in  combination with  relevant pathways  from the  Kyoto
Encyclopedia of Genes and Genomes (http://www.genome.jp/kegg/pathway.html)
and  other references to associate  genes  with  the most complete  biological
pathways possible. For Figure  1 and the fatty acid metabolism pathway, additional
references included Coleman et al. (2000), Nordlie and Foster (1999), and Sul and
Wang (1998). The sterol biosynthesis pathway in Figure 2  was supplemented with
information from Shibata et al. (2001) and Tansey and Shechter (2000). Figure 3,
the cholesterol and bile acid biosynthetic pathway utilized Pandak et al. (2002),
Russell (1999), Schwarz et al. (2001), Staudinger et al. (2001), and Wang et al.
(2005). Figures  4 and  5 on nuclear receptor signaling pathways depended on
Baldan et  al. (2006), Dixit et al. (2005), Guzelian et al. (2006), Jigorel et al.
(2005,  2006), Kretschmer and Baldwin (2005), Maglich et al. (2002), Shenoy
et al. (2004), Yoshikawa et al. (2003), and You (2004).
          Quantitative  PCR.  TaqMan-based  quantitative RT-PCR  was used  to
       determine the relative levels of Abcb1, Cyp1a1, Cyp2b1/Cyp2b2, Cyp3a1,
       Cyp3a2, Cyp4a1, and Ugt1a1 mRNA in the samples from each treatment group.
       Primer/probe sets specific for each gene were obtained from Applied Biosystems
       (Foster City, CA) for Abcb1 (Rn00561753_m1), Cyp1a1 (Custom assay),
       Cyp2b1/2 (Custom assay), Cyp3a1 (Rn01640761_g1), Cyp3a2 (Rn00756461_m1),
       Cyp4a1 (Rn00598510_m1), and Ugt1a1 (Rn00754947_m1). The exception to
       this was for Cyp2b1 and Cyp2b2, for which the primer/probe set could detect
       either gene transcript; these results are therefore referred to hereafter as
       Cyp2b1/2. A two-step RT-PCR process was performed by initial reverse
       transcription of ca. 200 ng of total RNA in a 60-µl reaction using the High
       Capacity cDNA Archive Kit (Applied Biosystems, Foster City, CA), followed
       by quantitative PCR amplification with isoform-specific primer/probe sets on
       2 µl of each reverse transcribed cDNA. The reactions were characterized by the
       point during PCR amplification at which fluorescence of the product crossed
       a defined threshold (CT, automatically determined by the PE Applied Biosystems
       ABI 7900HT Sequencer software), and inspected to ensure all CT values were
       within the linear phase (log scale) of exponential growth for all targets. CT values
       were determined for target CYP genes and an endogenous control gene, β-actin.
       Each sample was normalized to both the β-actin control and to a vehicle control.
       A difference of one CT was considered equivalent to a twofold difference in gene
       expression (exponential relationship, i.e., RQ = 2^(-ΔΔCT)). Sample means for each
       replicate were determined along with the standard error of the mean if
       appropriate and percent of adjusted positive control. Relative fold changes in
       mRNA content were analyzed using the Kruskal-Wallis nonparametric ANOVA
       with Dunn's multiple comparisons post-test; measures with p < 0.05 were
       considered  significant.
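The comparative CT calculation described above can be sketched as follows. This is a minimal illustration with a hypothetical function name and made-up CT values; it assumes the twofold-per-cycle relationship stated in the text and is not the instrument software's implementation.

```python
def relative_quantity(ct_target, ct_actin, ct_target_vehicle, ct_actin_vehicle):
    """Relative quantity RQ = 2^(-ddCT) for one treated sample.

    Each CT is first normalized to the endogenous control (beta-actin),
    then the treated sample is normalized to the vehicle control.
    """
    delta_sample = ct_target - ct_actin                   # dCT, treated sample
    delta_vehicle = ct_target_vehicle - ct_actin_vehicle  # dCT, vehicle control
    ddct = delta_sample - delta_vehicle
    return 2.0 ** (-ddct)
```

With illustrative values, a target that crosses the threshold one cycle earlier in the treated sample than in the vehicle control (same β-actin CT in both) gives `relative_quantity(24.0, 18.0, 25.0, 18.0) == 2.0`, that is, a twofold increase.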
[Figure 1 graphic: fatty acid metabolism pathway diagram. Legible labels include glucose, glucose-6-phosphate, acetyl-CoA, acetoacetyl-CoA, (S)-3-hydroxy-3-methylglutaryl-CoA, palmitate, 1,2-diacylglycerol, triacylglycerol, phospholipid, fatty acid synthesis, fatty acid β-oxidation, and sterol biosynthesis, with the genes Pklr, Hmgcs2, Echdc1, Fasn, Gpam, Ppap2b, Agpat4, Cyp4a1, Cyp4a10, and Cyp4a12.]
  FIG. 1.  Effects of three triazoles on the expression of genes in rat liver from the fatty acid metabolism pathway. Key in the lower left corner indicates the order
of presentation for the six treatment groups, and up- or downregulated genes.

-------
[Figure 2 graphic: sterol biosynthesis pathway diagram. Intermediates shown: acetoacetyl-CoA, (S)-3-hydroxy-3-methylglutaryl-CoA, mevalonate, mevalonate-5P, mevalonate-5PP, isopentenyl-PP, farnesyl-PP, presqualene-PP, squalene, (S)-squalene-2,3-epoxide, lanosterol, 24,25-dihydrolanosterol, FF-MAS, T-MAS, zymosterol, lathosterol, 7-dehydrocholesterol, desmosterol, and cholesterol. Genes and enzymes shown: Hmgcs2, Hmgcr (HMG CoA reductase), Mvk, Mvd, Fdps, Fdft1, Sqle, Cyp51 (lanosterol 14α-demethylase), Tm7sf2 (sterol-Δ14-reductase), sterol-4,4-demethylase, Ebp (sterol-Δ8,Δ7-isomerase), Sc5d (sterol-C5-desaturase), Dhcr7 (sterol-Δ7-reductase), and sterol-Δ24-reductase.]
  FIG. 2.  Triadimefon effects on the expression of genes in rat liver from the sterol biosynthesis pathway. Key in the lower left corner indicates the order of
presentation for the six treatment groups, and up- or downregulated genes.
                         RESULTS

GeneChip Quality Analysis
  Microarrays of poor quality, with a scaling factor of greater
than 15.0 were removed from the analysis. Three GeneChips
from the liver and three GeneChips from the testis  data sets
were removed prior to normalization and statistical analysis of
the two separate data sets. Each treatment group  had three to
seven  GeneChips  available  for robust  analysis  following
removal of microarray chips with weak intensity readings.

Probe Set Analysis in the Liver
  PCA applied to the normalized data set grouped microarrays
by treatment group, with largest variation occurring within the
controls, M500, and T1800 groups. It is unlikely this variation
is  due  to sample size (Controls:   7;  M500:  4; T1800:  3
microarray chips).  Gene expression changes were determined
using one-way ANOVA and an FDR of 10% as the multiple
testing correction method to control the familywise error rate.
The FDR of 10% (α = 0.10) generated a p value cut-off of
7.24E-4 and was considered an ideal cut-off in order to obtain
a subset of 1,043 differentially expressed probe sets for use in
pathway and gene-level analyses. Of those probe sets, 455 had
an absolute difference of |1.2| or greater. Removing probe sets
that interrogated unknown genes or transcribed loci, the final
                                                    list of probe sets identified 308 genes up- or downregulated by
                                                    myclobutanil, propiconazole, or triadimefon (Table 1). The 16
                                                    genes differentially expressed in response to all three triazoles,
                                                    as detected by microarray, are listed in Table 2. Although four
                                                    isoforms of Cyp2B (2b2, 2b3, 2b13, 2b15) were assessed using
                                                    the Affymetrix  Rat  230  2.0  GeneChip,  only Cyp2b2 was
                                                    induced  in all the triazole treatment groups.  The majority of
                                                    genes modulated by all three triazoles function in lipid or fatty
                                                    acid  metabolism,  transporter,  and xenobiotic  metabolism
                                                    pathways;  thus a  pathway based  approach was  the focus of
                                                    subsequent analysis and  interpretation.
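How an FDR level translates into a p value cut-off can be illustrated with the Benjamini-Hochberg step-up procedure. The text does not name the FDR method used in JMP Genomics, so the choice of Benjamini-Hochberg here is an assumption, and the function name is hypothetical.

```python
def benjamini_hochberg_cutoff(p_values, fdr=0.10):
    """Return the largest p value declared significant at the given FDR.

    Benjamini-Hochberg step-up rule (assumed method): sort the m p values,
    find the largest rank i with p_(i) <= (i / m) * fdr, and use p_(i)
    as the cut-off. Returns None if no p value qualifies.
    """
    m = len(p_values)
    cutoff = None
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= rank / m * fdr:
            cutoff = p  # keep the largest qualifying p value
    return cutoff
```

Every probe set with a p value at or below the returned cut-off would be called differentially expressed; the cut-off therefore depends on the whole distribution of p values in the data set, which is consistent with the liver and testis data sets having different cut-offs.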

                                                    Pathway Analysis  in the Liver
                                                      Pathway  analysis identified common biological pathways
                                                    and processes affected by the three triazoles  in rat liver. For
                                                    initial analysis of pathways, the entire liver data set (31,099
                                                    probe sets) was uploaded into IPA, and an absolute difference of
                                                    |1.2| or greater and p value < 7.24E-4 were used to identify
                                                    differentially expressed genes. In the liver data set, 180 of the
                                                    308  significant probe  sets mapped to the IPKB. These  focus
                                                    genes were overlaid onto a molecular network developed from
                                                    information within  the IPKB. Table 3  shows  the pathways
                                                    identified by IPA  as  being  affected by the three triazoles.
                                                    Several  metabolic  pathways  are   common  across the  three
                                                    triazoles including androgen  and  estrogen, arachidonic  acid,

-------
[Figure 3 graphic (cholesterol and bile acid biosynthetic pathway): legible labels include C21-steroid hormone, bile acid biosynthesis, cholesterol, 27-hydroxycholesterol, 7α-hydroxylated oxysterol, 17α-hydroxypregnenolone, and estriol 16-glucuronide, with the genes Cyp7a1, Cyp7b1, Cyp11a1, Cyp17a1, Hsd3b1, Hsd17b7, Ugt1a1, and Ugt2a1. Diagram note: Cyp7b1 is regulated by cholesterol and bile acids; a decrease in cholesterol gives a decrease in Cyp7b1.]
-------
[Figure 4 graphic: nuclear receptor signaling diagram spanning cytoplasm and nucleus. Ligands shown: triazole, fatty acids, oxysterols, bile acids. Receptors and cofactors shown: AhR/Arnt, PPARα/RXR, LXRα (Nr1h3)/RXR, FXR (Nr1h4)/RXR, CAR/RXR, and PXR/RXR. Target genes are grouped as CAR specific, CAR/PXR, and PXR specific, and include phase I (Cyp1a1, Cyp2b2, Cyp2b1/2, Aldh1a1, Cyp3a1, Cyp3a2), phase II (Gstm4, Udpgtr2, Ugt1a1, Ugt2a1), and phase III (Abcb1a, Abcc3, Slco1a4) genes, together with Cyp4a1, Cyp4a10, Cyp4a12, Alas1, and Por. Diagram notes: Alas1, heme synthesis; Por, critical for all Cyp reactions.]
  FIG. 4.  Impact of triazoles on nuclear receptor regulated gene expression in the rat liver. Key in the lower left corner indicates the order of presentation for the
six treatment groups, and up- or downregulated genes.

-------
[Figure 5 graphic: schematic of nuclear receptor regulation in the liver. Ligands shown: androstanes, xenobiotics, bilirubin. Gene groups shown: phase I (Aldh1a1, Cyp1a1, Cyp2b1/2, Cyp3a1/2), phase II (Gstm4, Ugt1a1, Ugt2a1, Udpgtr2), and phase III (Abcc3, Abcb1a, Slco1a4), plus Alas1 and Por; Acat2 (VLDL assembly and secretion); Sqle, Lss, Cyp51; fatty acid β-oxidation genes (PC, Ppap2b, Aldh1a1); and Cd36 (FA translocase). PPARα/RXR targets shown: apolipoproteins, lipoprotein lipases, FA transport proteins, and peroxisomal and mitochondrial FA MEs. Outcomes shown: cell growth, hepatomegaly, detoxification, elimination, and control of plasma lipid transport.]
  FIG. 5.  Nuclear receptor regulation of genes, enzymes, pathways, and processes in the liver. Genes listed represent effects of triazoles on expression in rat
liver from this study. Expression of these genes is regulated by the nuclear receptors CAR, PXR, and PPAR-α, and the activity of these receptors, or transcription
factors, is modulated by various endobiotics and xenobiotics. Perturbations of CAR, PXR, and PPAR-α signaling pathways can alter lipid and steroid homeostasis
and promote hepatotoxicity.
treatment, respectively. The two triazoles target different gene
transcripts; however, both enzymes' activities are involved in
primary bile acid synthesis. Propiconazole also decreased the
transcript levels of inositol 1,4,5-triphosphate receptor 2 (Itpr2),
which is involved in intracellular calcium homeostasis. The
transcript levels for several uridine diphosphate glucuronosyl-
transferases (Ugt1a1, Ugt2a1, Udpgtr2) were increased by
all three triazoles, indicating increased metabolism of steroids
and xenobiotics. Increased expression of hydroxysteroid
17β-dehydrogenase (Hsd17b) by triadimefon also indicated
elevated androgen and estrogen metabolism in the liver.

  Nuclear  receptor regulated genes.  Figure  4 shows the
impact of triazoles on nuclear receptor regulated genes within
the rat liver. There were several genes differentially expressed
that are regulated by the constitutive androstane receptor (CAR),
pregnane X receptor (PXR), aryl hydrocarbon receptor (AhR),
as well as the peroxisome proliferator-activated receptor alpha
(PPAR-α) and liver X receptor (LXR). The genes regulated by
CAR and PXR are phase I, II, and III enzymes that are part of
fatty  acid  and  xenobiotic  metabolism,  sterol  biosynthesis,
steroid  metabolism,  and  cell  cycle  pathways.  Increased
                                                   transcript levels of most of these CAR/PXR-regulated genes
                                                   indicate activation of CAR and/or PXR. Also consistent with
                                                   CAR activation, the genes PCX and Ppap2b were downregulated.
                                                   Results for Cyp2b2 from the  arrays and for Cyp2bl/Cyp2b2
                                                   from PCR  are  both listed (Table  4).  CAR-specific,  PXR-
                                                   specific, and genes coregulated by CAR and PXR are identified
                                                   in Figure 4. Additional genes with overlapping regulation by
                                                   multiple receptors, modulated by triazoles in rat liver; included
                                                   Alasl  expression  regulated by CAR, PXR, and  LXR,  and
                                                   Cyplal expression regulated  by AhR and CAR.  Effects on
                                                   nuclear receptor regulated genes included increased expression
                                                   of  steroid and  xenobiotic  metabolism  genes  aldehyde de-
                                                   hydrogenase (Aldhlal), Cyplal, and Cyp2b2, the previously
                                                   mentioned  glucuronide   and   glucoside  conjugation  genes
                                                   Ugtlal, Ugt2al, and Udpgtr2, glutathione conjugator Gstm4
                                                   and phase III transporter Abcc3; all changes were likely to have
                                                   altered metabolism and excretion of steroids and Xenobiotics.
                                                      A more integrated representation  of  the  genes, enzymes,
                                                   pathways and processes regulated by nuclear receptors  in the
                                                   liver is presented in Figure 5. Various agonists and antagonists
                                                   of the CAR, PXR, and PPAR-a receptors are  indicated, as well
                                                   as the  regulated genes which  in this study were differentially
                         TABLE 1
     Number of Affymetrix Probe Sets Signaling Significant
 Treatment-Related Gene Expression Changes in Rat Liver and
                           Testis

Triazole        Tissue   Dose level            Downregulated   Upregulated     Total number
                         (ppm) (mg/kg/day)     probe setsᵃ     probe setsᵃ     of probe sets
Myclobutanil    Liver     500  (32.9)                4               1               5
                Liver    2000  (133.9)              64               9              73
                Testis    500  (32.9)                0              16              16
                Testis   2000  (133.9)               0               0               0
Propiconazole   Liver     500  (31.9)                2               5               7
                Liver    2500  (169.7)              44              45              89
                Testis    500  (31.9)                0               0               0
                Testis   2500  (169.7)               1              49              50
Triadimefon     Liver     500  (33.1)               46               8              54
                Liver    1800  (139.1)              23             154             177
                Testis    500  (33.1)                1               4               5
                Testis   1800  (139.1)               0               6               6
  ᵃ Probe sets significantly changed with a fold change greater than |1.2|.


expressed in response to triazole exposure. This included
upregulation of the antiapoptotic genes Gadd45b and Mdm2.
Fatty acid metabolism, sterol biosynthesis, steroid metabolism,
bile acid metabolism, and cellular growth pathways are all
coordinately regulated by these nuclear receptors. Based on
these findings, it is postulated that alterations in these pathways
from exposure to xenobiotics like the triazoles, mediated by
these nuclear receptors, resulted in effects on biological
processes including hepatomegaly, detoxification and elimination,
and plasma lipid transport.

Quantitative PCR
  To further examine and confirm changes in liver gene
expression, a set of genes modulated by all three triazoles, as
well as additional CAR regulated genes, was analyzed by
PCR. Quantitative PCR confirmed the increased expression of
Cyp1a1 and Cyp2b1/Cyp2b2, and yielded more definitive
results for Cyp3a1 and Ugt1a1 (Table 4). Based on the PCR
results, Cyp3a1 is the 17th gene for which expression is
modulated by all three triazoles (see Table 2 for the other 16).
Cyp3a2 was not represented on the microarray. PCR detected
an increase in Cyp3a2 mRNA in response to myclobutanil and
propiconazole, which, together with the increase in Cyp3a1 mRNA
caused by all three triazoles, suggests PXR activation by triazoles in
the rat liver. The magnitude of the increase in Cyp2b1/Cyp2b2
indicated a strong activation of CAR, consistent with a maladaptive
toxic response as a result of long term triazole exposure.
The one clear case of discordance between microarray and
PCR results, for Abcb1 and triadimefon, may be due to the
sequences used for probes in the separate assays.
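
The array-versus-PCR comparison in Table 4 can be illustrated with a simple direction check. The Python below is a minimal sketch and not the authors' analysis: the fold changes are transcribed from Table 4 as read column by column, the 1.2 magnitude threshold is borrowed from the microarray filter, and the names `TABLE4` and `discordant_pairs` are hypothetical. Cyp3a2 is omitted because it had no probe set on the array. The check ignores statistical significance (marked in bold in the original table), so it can flag more pairs than the single clear case discussed above.

```python
# High-dose fold changes as (array, qPCR) pairs, transcribed from Table 4.
# Pair order in each list: myclobutanil, propiconazole, triadimefon.
TRIAZOLES = ("myclobutanil", "propiconazole", "triadimefon")
TABLE4 = {
    "Abcb1":  [(1.24, -2.37), (1.48, 1.64), (1.41, -4.99)],
    "Cyp1a1": [(1.76, 5.82), (3.02, 21.31), (5.60, 79.99)],
    "Cyp2b2": [(3.29, 64.57), (4.67, 132.12), (7.63, 63.13)],
    "Cyp3a1": [(-1.12, 2.05), (1.08, 2.04), (1.98, 18.20)],
    "Cyp4a1": [(-1.73, -1.91), (-1.59, -1.53), (-1.30, -2.70)],
    "Ugt1a1": [(1.50, 1.44), (1.82, 1.32), (1.78, 7.51)],
}


def discordant_pairs(table, threshold=1.2):
    """Return (gene, triazole) pairs whose array and qPCR fold changes both
    reach the magnitude threshold but point in opposite directions."""
    hits = []
    for gene, pairs in table.items():
        for triazole, (array_fc, qpcr_fc) in zip(TRIAZOLES, pairs):
            both_large = abs(array_fc) >= threshold and abs(qpcr_fc) >= threshold
            opposite = (array_fc > 0) != (qpcr_fc > 0)
            if both_large and opposite:
                hits.append((gene, triazole))
    return hits


print(discordant_pairs(TABLE4))
```

Under these assumptions only Abcb1 is flagged, for triadimefon and also for myclobutanil; whether the myclobutanil pair counts as a real discordance depends on the significance calls that this sketch does not model.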

Probe Set Analysis in the Testis
  Gene expression changes were determined using a one-way
ANOVA and an FDR of 25% (α = 0.25), which generated a
p value cut-off of 2.29E-4, yielding 169 differentially expressed
probe sets. Of those probe sets, 77 had an absolute difference of
|1.2| or greater. After removal of probe sets interrogating unknown or
transcribed loci, the final list of probe sets equaled 70 (Table 1).
An ANOVA analysis using an FDR of 10% defined one unknown
                                                        TABLE 2
            Expression Changes Common to all Three Triazoles for 16 Genes in Rat Liver, as Detected by Microarray

Accession  Gene             Myclobutanil (mg/kg/day)   Propiconazole (mg/kg/day)   Triadimefon (mg/kg/day)
number     symbol           32.9         133.9         31.9         169.7          33.1         139.1
NM_017161  Adora2b          -1.24302     -1.29105      -1.20012     -1.30677       -1.16624     -1.293
BM392091   Ahctf1           -1.12196     -1.30861      -1.13063     -1.23386       -1.14095     -1.26832
NM_022407  Aldh1a1           1.60289      1.999272      1.310263     2.880486       1.532281     4.841574
NM_012737  Apoa4            -1.09288     -1.87156      -1.45817     -1.99054       -1.66115     -2.79911
BM986220   App               1.08894      1.471356      1.359173     1.753045       1.256155     1.986423
NM_133586  Ces2              1.159467     1.650327      1.347478     2.317871       1.325125     3.361062
AI137640   Cldn1            -1.172       -1.36187      -1.20889     -1.42721       -1.32804     -1.38697
BF564195   Crem             -1.20309     -1.40519      -1.20393     -1.32225       -1.1299      -1.39254
AI454613   Cyp2b2            2.728842     3.292202      2.793866     4.672197       2.810611     7.625469
BG378579   Htatip2           1.047174     1.232627      1.15738      1.369729       1.120655     1.344915
BM390462   RGD1310209_pre   -1.1009      -1.25043      -1.18184     -1.32877       -1.37249     -1.30658
AI010233   Ccdc126          -1.29462     -1.38855      -1.28179     -1.35011       -1.24059     -1.30313
BI296089   RGD1562101_pre   -1.11631     -1.22294      -1.11001     -1.27771       -1.14867     -1.27933
U95011     Slco1a4           1.212303     1.954614      1.616718     2.129518       2.048135     1.973023
NM_031741  Slc2a5           -1.0778      -1.22666      -1.16297     -1.23688       -1.16947     -1.18458
M13506     Udpgtr2           1.688198     2.04723       1.678868     2.385029       1.614087     3.023706
  Note. Values given as fold change relative to control. Bold: transcript level changes were significant. Suffix _pre represents probe sets with predicted annotation.
                                                             TABLE 3
         Biological Pathways Containing Significant Changes in Rat Liver Gene Expression following Exposure to Triazoles

Myclobutanil
    Aminosugars metabolism
    Androgen and estrogen metabolism
    Arachidonic acid metabolism
    Ascorbate and aldarate metabolism
    Fatty acid metabolism
    Fructose and mannose metabolism
    Galactose metabolism
    Glycerolipid metabolism
    Glycolysis/gluconeogenesis
    Linoleic acid metabolism
    Metabolism of xenobiotics by cytochrome P450
    p38 MAPK signaling
    Pentose and glucuronate interconversions
    Retinol metabolism
    Starch and sucrose metabolism
    Tryptophan metabolism
    Wnt/β-catenin signaling
    Xenobiotic metabolism signaling

Propiconazole
    Amyloid processing
    Androgen and estrogen metabolism
    Arachidonic acid metabolism
    Arginine and proline metabolism
    Ascorbate and aldarate metabolism
    Fatty acid metabolism
    Fc Epsilon RI signaling
    Glutamate metabolism
    Glutathione metabolism
    Glycerolipid metabolism
    Glycerophospholipid metabolism
    Linoleic acid metabolism
    Metabolism of xenobiotics by cytochrome P450
    Pentose and glucuronate interconversions
    Phospholipid degradation
    Pyruvate metabolism
    Sphingolipid metabolism
    Starch and sucrose metabolism
    Tryptophan metabolism
    Xenobiotic metabolism signaling

Triadimefon
    Androgen and estrogen metabolism
    Antigen presentation pathway
    Arachidonic acid metabolism
    Butanoate metabolism
    Cyanoamino acid metabolism
    Cysteine metabolism
    Fatty acid metabolism
    Glutathione metabolism
    Glycine, serine and threonine metabolism
    Glycolysis/gluconeogenesis
    Histidine metabolism
    IL-6 signaling
    Keratan sulfate biosynthesis
    Linoleic acid metabolism
    Metabolism of xenobiotics by cytochrome P450
    N-Glycan degradation
    Nitrogen metabolism
    Propanoate metabolism
    Pyruvate metabolism
    Sterol biosynthesis
    Toll-like receptor signaling
    Tryptophan metabolism
    Xenobiotic metabolism signaling

  Note. Pathways listed were affected by mid and/or high dose of each triazole. Bold: pathways affected by two or more triazoles.
gene; an FDR of 15 or 20% defined a list of 10 genes. The liberal
cut-off was used in order to obtain a subset of genes for pathway
analysis. It is clear from the ANOVA, however, that there were
either large variations within treatment groups or very small
changes in gene expression within the testis.
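
The two-step probe-set filter described above (an FDR-controlled p value cut-off followed by a fold-change threshold) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the text does not name the FDR procedure, so the Benjamini-Hochberg step-up rule is assumed here; the per-probe ANOVA p values are taken as already computed; and `bh_cutoff`, `select_probe_sets`, and the toy values are hypothetical.

```python
def bh_cutoff(pvalues, alpha):
    """Benjamini-Hochberg step-up rule (an assumed choice of FDR method):
    return the largest sorted p value p_(k) with p_(k) <= (k / m) * alpha,
    or None if no p value qualifies."""
    m = len(pvalues)
    cutoff = None
    for rank, p in enumerate(sorted(pvalues), start=1):
        if p <= rank / m * alpha:
            cutoff = p
    return cutoff


def select_probe_sets(results, alpha=0.25, min_abs_fold=1.2):
    """results maps probe-set id -> (ANOVA p value, signed fold change).
    Keep probe sets that pass the FDR cut-off and whose absolute fold
    change is at least min_abs_fold."""
    cutoff = bh_cutoff([p for p, _ in results.values()], alpha)
    if cutoff is None:
        return []
    return sorted(
        probe
        for probe, (p, fold) in results.items()
        if p <= cutoff and abs(fold) >= min_abs_fold
    )


# Made-up values for illustration only; they are not data from this study.
toy = {
    "probe_a": (0.0001, 1.9),
    "probe_b": (0.0400, -1.5),
    "probe_c": (0.0900, 1.1),
    "probe_d": (0.6000, -2.4),
}
print(select_probe_sets(toy))
```

Raising `alpha` from 0.10 to 0.25 admits more probe sets at the cost of more expected false positives, which is the trade-off behind the liberal cut-off chosen for the testis data.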

Pathway Analysis in the Testis
   IPA was used to identify common pathways affected by the
three triazoles. From the entire data set (31,099 probe sets), an
absolute difference of |1.2| or greater and a p value < 2.29E-4 were
used to identify genes whose expression was differentially
regulated. In the testis data set, 50 of the 72 significant probe sets
mapped to the IPKB. There were no common pathways affected
by all three triazoles; however, there were five pathways
modulated by at least two triazoles (Table 5). The IPA-based
analysis of altered pathways in the testis did identify 11
potentially significant matches to pathways affected by triazoles
in the liver. Many of these pathways common to testis and liver
                                                             TABLE 4
 Comparisons between Microarray and Quantitative PCR Measurement of Gene Expression in Rat Liver following Triazole Exposure

                              Abcb1           Cyp1a1          Cyp2b2           Cyp3a1          Cyp3a2          Cyp4a1          Ugt1a1
Treatment      mg/kg/day   Array   qPCR    Array   qPCR    Arrayᵃ   qPCR    Array   qPCR    Arrayᵇ  qPCR    Arrayᶜ  qPCR    Array   qPCR
Myclobutanil      134       1.24   -2.37    1.76    5.82    3.29    64.57   -1.12    2.05    n/a    3.22    -1.73   -1.91    1.50    1.44
Propiconazole     170       1.48    1.64    3.02   21.31    4.67   132.12    1.08    2.04    n/a    4.01    -1.59   -1.53    1.82    1.32
Triadimefon       139       1.41   -4.99    5.60   79.99    7.63    63.13    1.98   18.20    n/a    1.57    -1.30   -2.70    1.78    7.51
  Note. Values given as fold change relative to control. Bold: significant transcript level or fold changes.
  ᵃ Probe set representing Cyp2b2 on GeneChip.
  ᵇ No representative probe set on GeneChip.
  ᶜ Probe set representing Cyp4a10 on GeneChip.
                                                         TABLE 5
        Biological Pathways Containing Significant Changes in Rat Testis Gene Expression following Exposure to Triazoles

Myclobutanil
    Androgen and estrogen metabolism
    C21-Steroid hormone metabolism
    Nitrogen metabolism
    Propanoate metabolism
    Sterol biosynthesis
    Valine, leucine, and isoleucine degradation
    β-Alanine metabolism

Propiconazole
    Androgen and estrogen metabolism
    Arachidonic acid metabolism
    Complement and coagulation cascades
    Cysteine metabolism
    Fatty acid metabolism
    Linoleic acid metabolism
    Metabolism of xenobiotics by cytochrome P450
    Methionine metabolism
    Pentose and glucuronate interconversions
    Phenylalanine metabolism
    Retinol metabolism
    Selenoamino acid metabolism
    Starch and sucrose metabolism
    Taurine and hypotaurine metabolism
    Tryptophan metabolism
    Urea cycle and metabolism of amino groups
    Xenobiotic metabolism signaling

Triadimefon
    Fatty acid metabolism
    Neuregulin signaling
    Propanoate metabolism
    Valine, leucine, and isoleucine degradation
    β-Alanine metabolism

  Note. Pathways listed were affected by mid and/or high dose of each triazole. Bold: pathways affected by two or more triazoles.
are critical to reproduction, including androgen and estrogen
metabolism, C21-steroid hormone metabolism, and sterol biosynthesis.
The other common pathways are significant in relation
to how testis and liver respond to triazole exposures: metabolism
of xenobiotics by cytochrome P450, and xenobiotic metabolism
signaling. Several additional pathways recognized in the testis
were common and robust responders to all three triazoles in the
liver: arachidonic acid metabolism, fatty acid metabolism, linoleic
acid metabolism, and tryptophan metabolism.
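
The overlap counts reported for the testis (no pathway shared by all three triazoles, five shared by at least two) can be checked by tallying the Table 5 lists. The Python below is a small sketch of that tally; the pathway names are transcribed from Table 5, "β" is spelled "beta" for portability, and `TESTIS_PATHWAYS` and `shared_pathways` are hypothetical names introduced for illustration.

```python
from collections import Counter

# Pathway lists transcribed from Table 5 (testis), one set per triazole.
TESTIS_PATHWAYS = {
    "myclobutanil": {
        "Androgen and estrogen metabolism",
        "C21-Steroid hormone metabolism",
        "Nitrogen metabolism",
        "Propanoate metabolism",
        "Sterol biosynthesis",
        "Valine, leucine, and isoleucine degradation",
        "beta-Alanine metabolism",
    },
    "propiconazole": {
        "Androgen and estrogen metabolism",
        "Arachidonic acid metabolism",
        "Complement and coagulation cascades",
        "Cysteine metabolism",
        "Fatty acid metabolism",
        "Linoleic acid metabolism",
        "Metabolism of xenobiotics by cytochrome P450",
        "Methionine metabolism",
        "Pentose and glucuronate interconversions",
        "Phenylalanine metabolism",
        "Retinol metabolism",
        "Selenoamino acid metabolism",
        "Starch and sucrose metabolism",
        "Taurine and hypotaurine metabolism",
        "Tryptophan metabolism",
        "Urea cycle and metabolism of amino groups",
        "Xenobiotic metabolism signaling",
    },
    "triadimefon": {
        "Fatty acid metabolism",
        "Neuregulin signaling",
        "Propanoate metabolism",
        "Valine, leucine, and isoleucine degradation",
        "beta-Alanine metabolism",
    },
}


def shared_pathways(by_triazole, min_triazoles=2):
    """Return the pathways listed for at least min_triazoles triazoles."""
    counts = Counter(p for pathways in by_triazole.values() for p in pathways)
    return sorted(p for p, n in counts.items() if n >= min_triazoles)


print(shared_pathways(TESTIS_PATHWAYS))      # shared by two or more triazoles
print(shared_pathways(TESTIS_PATHWAYS, 3))   # shared by all three triazoles
```

With the lists as transcribed, the first call returns five pathways and the second returns an empty list, which agrees with the counts stated for Table 5.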
   Expanding on the pathways highlighted by the IPA analysis,
the larger set of 72 differentially expressed genes was
examined based on relevant biological processes or pathways.
No one major biological process or pathway stood out in this
analysis. This set of genes did map to the lipid, fatty acid, and
C21-steroid hormone metabolism, inflammatory response,
intracellular signaling, or xenobiotic metabolism pathways.
However, due to the liberal threshold (FDR of 25%) used
during the ANOVA, and the limited number of genes
differentially expressed, it was difficult to have confidence in
these results for defining specific modes of triazole-related
toxicity within the testis. This was exacerbated by the fact that
propiconazole, which demonstrated no overt reproductive
toxicity, caused a greater number of differentially expressed
genes in the testis compared with the two reproductive
toxicants, myclobutanil and triadimefon.
                       DISCUSSION

  Toxicological endpoints from our previous assessment of the
reproductive toxicity of myclobutanil, propiconazole, and
triadimefon identified a potential mode of action for the
reproductive toxicity of triazole antifungals (Goetz et al.,
2007). The combination of increased serum testosterone levels
by all three triazoles, increased anogenital distance and testis
weight, hepatomegaly, and decreased insemination and fertility
indices strongly suggested disruption in testosterone homeostasis
as a mode of action for triazole toxicity. However, the
molecular mechanisms underlying these effects remained
indeterminate. This study was designed to test the hypothesis
that disruption in testosterone homeostasis was a result of
changes in gene expression leading to increased steroidogenesis
in the testis and decreased steroid metabolism in the liver.
Furthermore, it was designed to identify whether this putative
mode of action was common to the triazoles, and if so, to gain
mechanistic understanding of the common biological pathways
perturbed by triazoles that lead to toxicity.
  Analyses based on biological pathways were used to interpret
the significant gene expression changes in the liver and testis
and to provide context for interpreting these changes.
Numerous common gene transcripts and biological pathways
were identified for triazoles in the liver, defining common
biological processes modulated by all three triazoles and
supporting the interpretation of a common mode of action. In
contrast, the small number of differentially expressed genes
and affected pathways in the testis indicated that the observed
reproductive effects were not due to modulations of gene
expression within the testis, and that the testis was not a target
organ for triazole reproductive toxicity.
  Common metabolic pathways for all three triazoles included
androgen and estrogen, arachidonic acid, fatty acid, glycerolipid,
linoleic acid, tryptophan, and xenobiotic metabolism. Several of the
perturbed pathways common to the triazoles formed a large
interconnected network between glycolysis and fatty acid
catabolism, sterol biosynthesis, and bile acid biosynthesis or
steroid metabolism. All three triazoles had a significant impact
on lipid metabolism, including fatty acid and steroid metabolism,
as well as lipid transport. The genomic data in this study
demonstrate triazole disruption of pathways of key biological
functions in the liver, including energy homeostasis, biological
membrane fluidity, and CYP and other metabolic activities.
Moreover, these pathways are critical to liver-mediated steroid
homeostasis, and we propose that it is the perturbation of these
critical pathways that leads to the observed reproductive and
hepatic toxicity of the triazoles.

Hepatic Fatty Acid Metabolism
  Many phase I, II, and III metabolic genes perturbed by
triazole exposures are regulated by the nuclear receptors PPAR,
CAR, PXR, LXR, FXR, and AhR. Nuclear receptors within the
liver regulate specific and overlapping subsets of genes
(Honkakoski and Negishi, 2000; Wei et al., 2002; Yoshinari
et al., 2008), and respond to a variety of endogenous
metabolites. As an example, fatty acids activate PPAR-α
through a ligand-induced conformational structure change.
Downregulation of Cyp4a10 and Cyp4a1, and upregulation of
Cyp4a12, indicate changes in fatty acid levels and PPAR-α
modulation by the triazoles (Gonzalez and Shah, 2008).
Triazoles affected multiple fatty acid metabolic genes, such
as Acsl3 and Acsl5, which encode enzymes residing in
different subcellular locations and regulating different steps
in the metabolic pathway (Lewin et al., 2001). Overall, there
appeared to be a shift from insulin-stimulated glucose
metabolism and fatty acid synthesis and storage (lipogenesis)
over to fatty acid oxidation, similar to what has been reported
for PPAR-α agonists (Xu et al., 2006). The triazoles also
seemed to modulate LXR regulated genes and pathways; the
upregulation of Alas1, Ces2, and Scd1 expression caused by
triazoles indicated oxysterol activation of LXRα (Chu et al.,
2006) and promotion of bile acid biosynthesis and secretion.

Constitutive Androstane Receptor
  The most robust nuclear receptor mediated response to
triazoles in the rat liver is the induction of Cyp2b2, which is
regulated by CAR (Wei et al., 2000). Cyp2b2 was the one gene
differentially expressed in all the triazole treatment groups in
the rat liver. Expression levels of Cyp2b, like Cyp4a, are
increased by ketone bodies and fatty acids, and decrease
following mitochondrial β-oxidation due to decreased
intercellular fatty acids. Cyp2b2 catalyzes the oxidation of
testosterone, arachidonic acid, lauric acid, and numerous
environmental agents. Of the five Cyp2b isoforms assessed
by microarray (2b2, 2b3, 2b13, 2b15) or PCR (2b1 and 2b2) in
the present study, only Cyp2b2 was clearly and strongly
induced by all three triazoles. The biological functions of
Cyp2b3 and Cyp2b13 have not been determined; however,
Cyp2b3 is not phenobarbital inducible (Jean et al., 1994). Cyp2b2
and Cyp2b15 are both induced by phenobarbital and regulated
by CAR, yet only the array probe set designed for Cyp2b2 was
consistently positive.
  PCR confirmed that Cyp2b2 was highly induced by the
triazoles, and the sensitivity of Cyp2b2 to xenobiotics, elevated
levels of testosterone, and fatty acids suggests multiple
functions across several metabolic pathways. It should be
noted that triazoles may not be metabolized by rat Cyp2b1 or
human CYP2B6 (Barton et al., 2006). However, Barton et al.
did not test rat Cyp2b2 metabolism of triazoles to determine if
it might be directly involved in triazole biotransformation. It is
likely that the ability of Cyp2b2 to metabolize androgens, and its
impact on related metabolic pathways in the liver, is more
critical to understanding triazole reproductive and hepatic
toxicity.
  CAR regulates multiple metabolic enzyme and transporter
genes modulated by triazoles, including Alas1, Cyp1a1,
Cyp2b2, Lss, Abcc3, Slco1a4, PCX, and Ugt1a1. Results for
these and other genes are consistent with CAR activation by
triazoles, demonstrating the multifunctionality of this receptor
and a direct or indirect responsiveness to triazoles. Other CAR
activators, like triazoles, also induce hepatomegaly, hepatocyte
hypertrophy, and induction of CYP and other xenobiotic
metabolizing enzymes in rodent liver. In the case of this
triazole study in rats, repeated exposures to triazoles also led to
disruption of steroid homeostasis and infertility. Chronic,
high-dose exposures to these and other triazoles can also lead to
hepatic tumors and carcinogenesis in rodents, similar to what
has been reported for other CAR activators (Huang et al.,
2005). The mode of action behind these CAR mediated tumor
and cancer outcomes appears to be increased cell proliferation
and suppression of apoptosis (Huang et al., 2005).
Upregulation of cell growth and antiapoptotic genes, as well as
well-established CAR regulated genes such as Cyp2b2,
following the triazole treatments in this study suggests that
activation of rat CAR is a key event in the observed
hepatomegaly and other hepatotoxicity reported (Goetz et al.,
2007). Using CAR knockout mice, Yamamoto et al. (2004)
have demonstrated that CAR is essential for at least some cases
of mouse hepatotoxicity and tumor formation, and additional
studies have proven these CAR-dependent mechanisms
relevant to at least cyproconazole (Peffer et al., 2007). It is
worth noting that the antiapoptotic gene Gadd45b, which was
upregulated by triadimefon in the present study, has recently
been reported as a CAR coactivator (Yamamoto and Negishi,
2008). It appears that triazole modulation of Gadd45b is
dependent on CAR (Peffer et al., 2007).

Pregnane X Receptor
  Activation of genes regulated by PXR was not as robust as
the effects on CAR genes, but it was another common effect of
triazoles in rat liver. Cyp3a1 and Cyp3a2, Aldh1a1, as well as
some of the genes coregulated by CAR or other nuclear
receptors were induced by the triazoles. Rat Cyp3a1 and
Cyp3a2 both metabolize myclobutanil, and perhaps triadimefon
also (Barton et al., 2006). Human CYP3A4 appears to do
the same, so this PXR mediated induction of Cyp3a is likely
enhancing triazole biotransformation. However, because of the
many ligands shared by PXR and CAR (Moore et al., 2000),
overlapping regulation of hepatic metabolism by these
receptors (Maglich et al., 2002; Tien and Negishi, 2006), and
the promiscuity of PXR binding to many xenobiotics (Orans
et al., 2005), it was difficult to determine the specificity and
significance of PXR in the liver following triazole exposure.
PXR regulates numerous metabolic pathways and genes,
including several altered by triazoles in the present study.
Results with transgenic mice have provided evidence that PXR
response to xenobiotics is an important determinant in both
rodent and human hepatocarcinogenesis (Ma et al., 2007).
Future studies will need to clearly distinguish the functions of
PXR from those of CAR and other nuclear receptors in
regulating genes and pathways. These studies will be crucial in
defining triazole modes of action relative to hepatic and
reproductive toxicity, and determining the relevance of these
mechanistic insights to human health risk.

Hepatic Steroid Metabolism
   Many of the genes in the steroid metabolism pathway
perturbed by triazoles in the present study are regulated by
CAR and PXR. This genomic response indicated an attempt by
the rat liver to respond to increased serum testosterone levels
following long term and relatively high-dose exposures to
triazoles. Many of these triazole-induced changes are likely to
have altered metabolism and excretion of steroids and xenobiotics.
As an example, changes in Hsd17b were likely to be an
adaptive response to the elevated circulating testosterone seen
with all three triazoles, part of a hepatic attempt to regain
steroid homeostasis (Mustonen et al., 1997). In contrast,
downregulation of Srd5a1, which helps eliminate excess androgens
and is typically positively regulated by testosterone and
dihydrotestosterone (Torres and Ortega, 2003), appears to be
a maladaptive response to triazole exposure.

Conclusions
   The molecular events measured in this toxicogenomic
study demonstrate that myclobutanil, propiconazole, and
triadimefon perturb common biological pathways, many of
which are regulated by nuclear receptors. By defining the
common changes in gene expression and their associated
biological pathways, strong inferences can be made about the
causative factors leading up to a disruption in testosterone
homeostasis and associated reproductive toxicity of triazoles:
triazoles increased fatty acid catabolism, reduced bile acid
biosynthesis, induced cholesterol biosynthesis, and impaired
steroid metabolism. The induction of CAR and PXR nuclear
receptors drove many of these changes in gene expression,
and subsequent changes in fatty acid, steroid, and xenobiotic
metabolism. It is likely that modulation of CAR and PXR by
the triazoles in this long term exposure study led to the
observed hepatomegaly.
   The observed disruption in testosterone homeostasis by
triazoles was not due to modulation of steroidogenic genes in
the testis. Instead, there appeared to be disruption of normal
hepatic testosterone metabolism, leading to increased expression
of genes in the steroid metabolism and sterol biosynthesis
pathways as an adaptive response. For reasons not currently
understood, negative feedback mechanisms in the
hypothalamus-pituitary-gonadal axis did not compensate for these
increases in circulating steroids, and changes in hepatic
metabolism were not able to maintain steroid homeostasis. In
this rat model, where exposure started gestationally and
continued to adulthood, disruption of systemic steroid
homeostasis was accompanied by reproductive toxicity and
infertility. The gene expression profiles in this study have
provided strong support for a mode of action for reproductive
and hepatic toxicity, mediated through the CAR and PXR
signaling pathways, that is common to the triazole antifungals.
                          FUNDING

     EPA/North Carolina State University Cooperative Training
  Agreement (#CT826512010) supported A.K.G.
                    ACKNOWLEDGMENTS

     We thank Drs. Hongzu Ren and Indira Thillainadarajah
  (EPA) for excellent technical support. We also thank
  Dr. Douglas Wolf (EPA) for technical review of this manuscript.
  Microarrays and reagents for a portion of this study were
  provided by Affymetrix as part of a Materials Cooperative
  Research and Development Agreement with EPA.
                         REFERENCES

  Allen, J. W., Wolf, D. C., George, M. H., Hester, S. D., Sun, G., Thai, S.-F.,
    Delker, D. A., Moore, T., Jones, C., Nelson, G., et al. (2006). Toxicity
    profiles in mice treated with hepatotumorigenic and non-hepatotumorigenic
    triazole conazoles fungicides: Propiconazole, triadimefon, and myclobutanil.
    Toxicol. Pathol. 34, 853-862.
  Baldan, A., Tarr, P., Lee, R., and Edwards, P. A. (2006). ATP-binding cassette
    transporter G1 and lipid homeostasis. Curr. Opin. Lipidol. 17, 227-232.
  Barton, H. A., Tang, J., Sey, Y. M., Stanko, J. P., Murrell, R. N., Rockett, J. C.,
    and Dix, D. J. (2006). Metabolism of myclobutanil and triadimefon by
    human and rat cytochrome P450 enzymes and liver microsomes. Xenobiotica
    36, 793-806.
  Chen, P.-J., Padgett, W. T., Moore, T., Winnik, W., Lambert, G. R., Thai, S.-F.,
    Hester, S. D., and Nesnow, S. (2009). Three conazoles increase hepatic
  microsomal retinoic acid metabolism and decrease mouse hepatic retinoic
  acid levels in vivo. Toxicol. Appl. Pharmacol. 234, 143-155.
Chu,  K., Miyazaki, M., Man, W. C., and Ntambi, J.  M. (2006). Stearoyl-
  coenzyme A desaturase 1 deficiency protects against hypertriglyceridemia
  and increases plasma high-density lipoprotein cholesterol induced by liver X
  receptor activation. Mol. Cell. Biol. 26, 6786-6798.
Coleman, R. A., Lewin, T. M., and Muoio, D. M. (2000). Physiological and
  nutritional regulation  of enzymes of triacylglycerol synthesis.  Annu. Rev.
  Nutr. 20, 77-103.
Dixit, S. G., Tirona, R. G., and Kim, R. B. (2005). Beyond CAR and PXR.
  Curr. Drug Metab. 6, 385-397.
Ghannoum, M. A., and Rice, L. B. (1999). Antifungal agents: Mode of action,
  mechanisms of resistance, and correlation of these mechanisms with bacterial
  resistance. Clin. Microbiol. Rev. 12, 501-517.
Goetz, A. K., Bao, W., Ren, H., Schmid, J. E., Tully, D. B., Wood, C.,
  Rockett, J. C., Narotsky, M. G., Sun, G., Lambert, G. R., et al. (2006). Gene
  expression profiling in the liver of CD-1 mice to characterize the
  hepatotoxicity of  triazole fungicides.  Toxicol.  Appl.  Pharmacol.  215,
  274-284.
Goetz, A. K., Ren, H., Schmid, J. E., Blystone, C. R., Thillainadarajah, I.,
  Best, D. S., Nichols, H. P., Strader, L. F., Wolf, D. C., Narotsky, M. G., et al.
  (2007). Disruption of testosterone homeostasis  as  a mode of action for the
  reproductive toxicity of triazole fungicides in the male rat. Toxicol. Sci. 95,
  227-239.
Gonzalez, F. J., and Shah, Y. M. (2008). PPARα: Mechanism of species
  differences and hepatocarcinogenesis of peroxisome proliferators.
  Toxicology 246, 2-8.
Guzelian, J., Barwick, J. L., Hunter, L., Phang, T. L., Quattrochi, L. C., and
  Guzelian, P. S. (2006). Identification of genes controlled by the pregnane X
  receptor by microarray analysis of mRNAs from pregnenolone
  16alpha-carbonitrile-treated rats. Toxicol. Sci. 94, 379-387.
Hester,  S. D., and Nesnow, S. (2008). Transcriptional responses in thyroid
  tissues from rats treated with a tumorigenic and a non-tumorigenic conazole
  fungicide. Toxicol. Appl. Pharmacol. 227, 357-369.
Hester, S. D., Wolf, D. C., Nesnow, S., and Thai, S. F. (2006). Transcriptional
  profiles in  liver from rats treated with  tumorigenic and non-tumorigenic
  triazole conazole fungicides: Propiconazole, triadimefon, and myclobutanil.
  Toxicol. Pathol. 34, 879-894.
Honkakoski, P., and Negishi, M. (2000). Regulation  of  cytochrome P450
  (CYP) genes by nuclear receptors. Biochem. J.  347, 321-337.
Huang,  W., Zhang, J., Washington, M., Liu, J., Parant, J. M., Lozano, G., and
  Moore,  D. D. (2005). Xenobiotic stress induces  hepatomegaly and liver
  tumors  via  the  nuclear  receptor  constitutive androstane receptor.  Mol.
  Endocrinol. 19, 1646-1653.
Jean,  A., Reiss, A.,  Desrochers,  M., Dubois, S., Trottier, E.,  Trottier, Y.,
  Wirtanen, L., Adesnik, M., Waxman, D. J., and Anderson, A.  (1994). Rat
  liver  cytochrome  P450  2B3:  Structure  of  the  CYP2B3   gene  and
  immunological identification of a constitutive P450 2B3-like protein in rat
  liver. DNA Cell Biol.  13, 781-792.
Jigorel,  E., Le Vee,  M., Boursier-Neyret,  C., Bertrand, M., and  Fardel,  O.
  (2005).  Functional expression of sinusoidal drug transporters in primary
  human and rat hepatocytes. Drug Metab. Dispos. 33, 1418-1422.
Jigorel,  E., Le Vee,  M., Boursier-Neyret,  C., Bertrand, M., and  Fardel,  O.
  (2006).  Differential regulation of sinusoidal and  canalicular hepatic drug
  transporter expression by  xenobiotics  activating drug-sensing receptors in
  primary human hepatocytes. Drug Metab. Dispos.  34, 1756-1763.
Kretschmer, X. C., and Baldwin, W. S. (2005).  CAR and PXR: Xenosensors of
  endocrine disrupters? Chem. Biol. Interact. 155, 111-128.
Lewin, T. M., Kim, J. H., Granger, D. A., Vance, J. E., and  Coleman, R. A.
  (2001). Acyl-CoA synthetase isoforms 1, 4, and 5 are present in different
  subcellular membranes in rat liver and can be inhibited independently.
  J. Biol. Chem. 276, 24674-24679.
       Ma, X., Shah, Y., Cheung, C., Guo, G. L., Feigenbaum, L., Krausz, K. W.,
          Idle, J. R., and Gonzalez, F. J.  (2007). The pregnane X receptor gene-
          humanized  mouse:  A  model  for  investigating  drug-drug-interactions
          mediated by cytochromes P450 3A. Drug Metab. Dispos. 35, 194-200.
       Maglich, J. M., Stoltz, C. M., Goodwin, B., Hawkins-Brown, D., Moore, J. T.,
          and Kliewer, S. A. (2002).  Nuclear pregnane X receptor  and constitutive
          androstane receptor regulate overlapping but distinct sets of genes involved
          in xenobiotic detoxification.  Mol. Pharmacol. 62, 638-646.
       Moore, L. B., Parks, D. J., Jones, S. A., Bledsoe,  R. K.,  Consler, T.  G.,
          Stimmel, J. B., Goodwin, B., Liddle, C., Blanchard, S. G., Willson, T. M.,
          et al. (2000). Orphan nuclear receptors constitutive androstane receptor and
          pregnane X receptor share xenobiotic and steroid ligands. J. Biol. Chem. 275,
          15122-15127.
       Mustonen, M. V. J., Poutanen,  M. H., Isomaa, V.  V., and Vihko, R. K. (1997).
          Cloning of mouse 17β-hydroxysteroid dehydrogenase type 2, and analyzing
          expression of the mRNAs for types 1, 2, 3, 4 and 5 in mouse embryos and
          adult tissues. Biochem. J. 325, 199-205.
       Nordlie, R. C., and Foster, J. D. (1999). Regulation of glucose production by
          the liver. Annu. Rev. Nutr. 19, 379-406.
       Orans, J., Teotico, D. G., and Redinbo, M. R. (2005). The nuclear xenobiotic
          receptor pregnane X receptor: Recent insights and new challenges. Mol.
          Endocrinol. 19, 2891-2900.
       Pandak, W. M., Hylemon, P. B., Ren, S., Marques, D., Gil, G., Redford, K.,
          Mallonee, D., and Vlahcevic, Z. R. (2002).  Regulation of oxysterol 7alpha-
          hydroxylase (CYP7B1) in primary cultures of rat hepatocytes. Hepatology
          35, 1400-1408.
       Peffer, R. C., Moggs, J. G., Pastoor, T., Currie, R. A., Wright, J., Milburn, G.,
          Waechter, F., and Rusyn, I. (2007). Mouse liver effects of cyproconazole,
          a triazole fungicide: Role of the  constitutive androstane receptor. Toxicol.
          Sci. 99, 315-325.
       Russell, D. W. (1999). Nuclear orphan receptors control cholesterol catabolism.
          Cell 97, 539-542.
       Schwarz,  M., Russell, D. W., Dietschy,  J.  M.,  and  Turley,  S. D.  (2001).
          Alternate pathways  of  bile  acid  synthesis  in the  cholesterol 7alpha-
          hydroxylase knockout mouse are not upregulated by either cholesterol or
          cholestyramine feeding. J. Lipid Res. 42, 1594-1603.
       Shenoy,   S.  D.,  Spencer,  T. A.,  Mercer-Haines,  N. A.,  Alipour,  M.,
          Gargano, M. D., Runge-Morris, M., and Kocarek, T.  A.  (2004). CYP3A
          Induction by liver X receptor ligands in primary cultured rat and mouse
          hepatocytes is mediated by the pregnane X receptor. Drug Metab. Dispos.
          32, 66-71.
       Shibata, N., Arita, M., Misaki,  Y., Dohmae, N., Takio, K., Ono, T., Inoue, K.,
          and Arai,  H.  (2001). Supernatant protein factor, which stimulates  the
          conversion of squalene to lanosterol, is a cytosolic squalene transfer protein
          and enhances cholesterol biosynthesis. Proc. Natl. Acad. Sci. U. S. A. 98,
          2244-2249.
       Staudinger, J., Liu,  Y., Madan, A., Habeebu,  S.,  and Klaassen, C. D. (2001).
          Coordinate regulation of xenobiotic and bile acid homeostasis by pregnane X
          receptor. Drug Metab. Dispos. 29, 1467-1472.
       Sul, H. S., and Wang, D.  (1998). Nutritional and hormonal regulation of
          enzymes in fat synthesis:  Studies of fatty acid synthase and mitochondrial
          glycerol-3-phosphate acyltransferase gene transcription. Annu. Rev. Nutr. 18,
          331-351.
       Sun, G., Grindstaff, R. D., Thai, S. F., Lambert, G. R., Tully, D. B., Dix, D. J.,
          and Nesnow, S. (2007). Induction of cytochrome P450 enzymes in rat liver
          by two conazoles, myclobutanil and triadimefon. Xenobiotica 37, 180-193.
       Tansey, T. R., and Shechter, I.  (2000). Structure and regulation of mammalian
          squalene synthase. Biochim.  Biophys. Acta 1529, 49-62.
Tien, E. S., and Negishi, M. (2006). Nuclear receptors CAR and PXR in the
  regulation of hepatic metabolism. Xenobiotica 36, 1152-1163.
Torres, J. M., and Ortega, E. (2003). Precise quantitation of 5α-reductase type 1
  mRNA by RT-PCR in rat liver and its positive regulation by testosterone and
  dihydrotestosterone. Biochem. Biophys. Res. Commun.  308, 469-473.
Tully, D. B., Bao, W., Goetz, A. K., Blystone, C. R.,  Ren,  H., Schmid, J. E.,
  Strader, L. F., Wood, C. R., Best, D. S., Narotsky, M. G., et al. (2006). Gene
  expression profiling in liver and testis  of rats to characterize the toxicity of
  triazole fungicides. Toxicol. Appl. Pharmacol. 215,  260-273.
U.S. Environmental Protection Agency,  Office of Prevention,  Pesticides and
  Toxic Substances. (1995). Myclobutanil; Pesticide tolerances. Fed. Regist.
  60, 40500-40503.
U.S. Environmental Protection Agency,  Office of Prevention,  Pesticides and
  Toxic Substances. (1996). Carcinogenicity Peer Review  of Bayleton.  U.S.
  EPA, Washington, DC.
U.S. Environmental Protection Agency,  Office of Prevention,  Pesticides and
  Toxic Substances. (2001). Myclobutanil, pesticide tolerance  for emergency
  exemptions. Fed.  Regist.  66, 298-306.
U.S. Environmental Protection Agency,  Office of Prevention,  Pesticides and
  Toxic Substances. (2005a). Myclobutanil; pesticide tolerances for emergency
  exemptions. Fed. Regist. 70, 49499-49507.
U.S. Environmental Protection Agency,  Office of Prevention,  Pesticides and
  Toxic Substances. (2005b). Propiconazole; pesticide tolerances for
  emergency exemptions. Fed. Regist. 70, 43284-43292.
U.S. Environmental Protection Agency,  Office of Prevention,  Pesticides and
  Toxic Substances. (2005c). Memorandum. Triadimefon:  Occupational and
  Residential Exposure Assessment for  the Registration Eligibility Decision
  Document.  PC Code  109901, DP  Barcode  314814, 315040.  Office  of
  Prevention, Pesticides, and Toxic Substances, Washington, DC.
U.S. Environmental Protection Agency,  Office of Prevention,  Pesticides and
  Toxic Substances. (2006).  Triadimefon. Preliminary Human Health  Risk
  Assessment revised. Office of Prevention, Pesticides, and  Toxic Substances,
  Washington, DC.
Vanden Bossche, H., Marichal,  P., Gorrens,  J., and Coene,  M.-C. (1990).
  Biochemical basis for the activity and selectivity of oral antifungal drugs.
  Br. J. Clin. Pract. Suppl. 71, 41-46.
Wang, X. J., Chamberlain, M., Vassieva, O., Henderson, C. J., and Wolf, C. R.
  (2005).  Relationship  between hepatic phenotype  and  changes in gene
  expression in cytochrome P450 reductase (POR) null mice. Biochem. J. 388,
  857-867.
   Ward,  W.  O.,  Delker,  D. A.,  Hester, S. D., Thai,  S.-F.,  Wolf, D.  C.,
     Allen, J.  W.,  and Nesnow, S. (2006). Transcriptional profiles in liver from
     mice treated  with hepatotumorigenic and  nonhepatotumorigenic triazole
     conazole fungicides: Propiconazole, triadimefon, and myclobutanil.
     Toxicol. Pathol. 34, 863-878.
   Wei, P., Zhang, J., Egan-Hafley, M., Liang, S., and Moore, D. D. (2000). The
     nuclear  receptor  CAR  mediates specific  xenobiotic induction  of drug
     metabolism. Nature 407, 920-923.
   Wei, P., Zhang,  J., Dowhan, D.  H., Han, Y., and Moore, D. D. (2002). Specific
     and overlapping functions of the nuclear hormone receptors CAR and PXR
     in xenobiotic response. Pharmacogenomics J. 2, 117-126.
   Wolf, D. C., Allen,  J. W., George, M. H., Hester, S. D., Sun, G., Moore, T.,
     Thai, S.  F., Delker, D., Winkfield, E., Leavitt, S., et al.  (2006). Toxicity
     profiles   in rats   treated with tumorigenic  and  nontumorigenic  triazole
     conazole fungicides: Propiconazole, triadimefon, and myclobutanil. Toxicol.
     Pathol. 34, 895-902.
   Xu,  J.,  Christian, B., and Jump, D.  B. (2006). Regulation  of rat hepatic
     L-pyruvate kinase  promoter composition  and  activity  by  glucose, n-3
     polyunsaturated fatty acids, and peroxisome proliferator-activated receptor-α
     agonist. J. Biol. Chem. 281, 18351-18362.
   Yamamoto,  Y.,  Moore,  R.,  Goldsworthy,  T.  L.,  Negishi,  M.,  and
     Maronpot, R. R.  (2004). The  orphan nuclear receptor constitutive active/
     androstane receptor is essential for liver tumor promotion by phenobarbital in
     mice. Cancer Res. 64, 7197-7200.
   Yamamoto, Y.,  and  Negishi, M. (2008). The antiapoptotic factor growth arrest
     and   DNA-damage-inducible  45  beta  regulates  the  nuclear  receptor
     constitutive active/androstane receptor-mediated transcription. Drug Metab.
     Dispos. 36, 1189-1193.
   Yoshikawa, T., Ide, T., Shimano, H., Yahagi, N., Amemiya-Kudo, M.,
     Matsuzaka, T., Yatoh, S., Kitamine, T., Okazaki, H., Tamura, Y., et al.
     (2003). Cross-talk between peroxisome proliferator-activated receptor
     (PPAR) α and liver X receptor (LXR) in nutritional regulation of fatty acid
     metabolism. I. PPARs suppress sterol regulatory element binding protein-1c
     promoter  through inhibition  of  LXR  signaling.  Mol.  Endocrinol.  17,
     1240-1254.
   Yoshinari, K., Tien, E.,  Negishi, M., and Honkakoski, P. (2008).  Receptor-
     mediated regulation of cytochromes P450. In Cytochromes P450: Role in the
     Metabolism and Toxicity of Drugs and Other Xenobiotics (C. Ioannides,
     Ed.), pp. 417-448. RSC Publishing, Cambridge.
   You, L. (2004).  Steroid hormone biotransformation and xenobiotic induction of
     hepatic steroid metabolizing  enzymes. Chem. Biol. Interact. 147, 233-246.
-------
TOXICOLOGICAL SCIENCES 107(2), 331-341 (2009)
doi:10.1093/toxsci/kfh234
Advance Access publication November 12, 2008
Modeling Single and Repeated  Dose Pharmacokinetics of PFOA  in  Mice

    Inchio Lou,* John F. Wambaugh,* Christopher Lau,†,1 Roger G. Hanson,† Andrew B. Lindstrom,‡ Mark J. Strynar,‡
                              R. Dan Zehr,† R. Woodrow Setzer,* and Hugh A. Barton*,2
 *National Center for Computational Toxicology; †Reproductive Toxicology Division, National Health and Environmental Effects Research Laboratory; and
  ‡Human Exposure and Atmospheric Science Division, National Exposure Research Laboratory, Office of Research and Development, U.S. Environmental
                                 Protection Agency, Research Triangle Park, North Carolina 27711

                                       Received July 28, 2008; accepted October 31, 2008
  Perfluorooctanoic acid (PFOA) displays complicated pharma-
cokinetics in that serum concentrations indicate long half-lives
despite which steady state appears to be achieved rapidly. In this
study, serum and tissue concentration time-courses were obtained
for male  and female CD1 mice after single, oral doses of 1 and
10 mg/kg of PFOA. When using one- and two-compartment models,
the  pharmacokinetics for these two dosages are not consistent
with serum time-course data from female CD1 mice administered
60 mg/kg, or with serum concentrations following repeated daily
doses of 20 mg/kg PFOA. Some consistency between dose regimens
could be achieved using the saturable resorption model of Andersen
et al. In this model PFOA is cleared from the serum into a filtrate
compartment from  which it  is  either excreted  or resorbed
into the serum by a process presumed transporter mediated with
a Michaelis-Menten form. Maximum likelihood estimation found
a transport maximum of Tm = 860.9 (1298.3) mg/l/h and
half-maximum concentration of KT = 0.0015 (0.0022) mg/l, where the
estimated standard errors (in parentheses) indicated large
uncertainty. The estimated rate of flow into and out of the filtrate
compartment, 0.6830 (1.0131) l/h, was too large to be consistent with
a biological interpretation. For these model parameters a single dose
greater than 40 mg/kg, or a daily dose in excess of 5 mg/kg, was
necessary to observe nonlinear pharmacokinetics for PFOA in
female CD1  mice. These data and modeling analyses more fully
characterize  PFOA in mice for purposes of estimating internal
exposure for use in risk assessment.
  Key  Words:  perfluorooctanoic  acid  (PFOA);  compartment
model; resorption model; pharmacokinetic parameters; statistical
analysis; CD1 mice.

  1 To whom correspondence should be addressed at Mail Drop 67, U.S.
Environmental Protection Agency, Research Triangle Park, NC 27711. Fax:
(919) 541-4017. E-mail: lau.christopher@epa.gov.
  2 Current address: Pfizer, Inc., PDM PK/PD Modeling, Eastern Point Rd.,
MS 8220-4328, Groton, CT 06340.

Published by Oxford University Press 2008.

  Perfluorooctanoic acid (PFOA) and related compounds are
used primarily as surface-active agents in the production of
various fluoropolymers and fluoroelastomers (Kudo and
Kawashima, 2003). Because of the strength of the carbon-fluorine
bond, PFOA is stable to metabolic and environmental
degradation (Butenhoff et al., 2004). PFOA is widespread in
wildlife and humans, from polar bears living in Greenland
(Dietz et al., 2008), to giant pandas in China (Dai et al., 2006),
from the general population to occupationally exposed workers
(Betts, 2007; Olsen et al., 2007). Average blood levels from the
general population in the United States are approximately 4-5
parts per billion (Calafat et al., 2007).
        The toxicology of PFOA has been extensively reviewed
      (Andersen et al., 2008; Lau et al., 2007; Kennedy et al., 2004).
      PFOA is associated with liver enlargement in rodents and
      nonhuman primates. Hepatocellular adenomas, Leydig cell
      tumors, and pancreatic acinar cell tumors occurred in rats
      (Biegel et al., 2001; Cook et al., 1992). Exposure to a high
      dose of PFOA (20 mg/kg) for two days late in gestation was
      sufficient to produce neonatal mortality and birth weight
      reduction in mice (Wolf et al., 2007). Further investigations
      showed that daily PFOA treatment with 5 mg/kg and lower
      doses during gestation was associated with effects (White et al.,
      2007; Abbott et al., 2007).
        PFOA is found in human blood and breast milk from the
      general population in countries worldwide (Butenhoff et al.,
      2004). Workers occupationally exposed to fluorochemicals
      have serum levels of PFOA approximately one order of
      magnitude higher than those reported in the general
      population. The PFOA serum elimination half-life in workers
      was estimated as 3.8 years (Olsen et al., 2007). This
      is much longer than in laboratory animals, for example,
      hours for the female rat to days for the male rat to weeks for
      the monkey (Lau et al., 2007). Gender differences are
      particularly notable in rats, with limited differences in other
      animals (Kudo and Kawashima, 2003). The basis for the
      species and gender differences in elimination of PFOA is
      still not well understood. PFOA is highly bound to plasma
      proteins and this does not appear to differ substantially
      across species (Kudo and Kawashima, 2003). Differential
      expression of transporter proteins in the kidney may be one
      explanation and is clearly a major factor in the sex difference
      observed in rats (Kudo et al., 2002). Transporter activity has
      been confirmed in rat, in which organic anion transporters
      1 and 3, organic anion transporting polypeptide 1, and
perhaps others mediate PFOA cross-membrane transport
(Kudo et al., 2002; Nakagawa et al., 2008). Recent studies
with expressed human and rat organic anion transporters
1 and 2 found similar activity (Nakagawa et al., 2008). In
addition, liver distribution in rats is dose-dependent (Kudo
et al., 2007) and transporter-dependent (Han et al., 2008),
whereas studies in mice have demonstrated that both uptake
and efflux transporters in liver are regulated by PFOA
(Cheng and Klaassen, 2008; Maher et al., 2008). These kinds
of effects presumably underlie the need to
incorporate time-dependent changes in pharmacokinetic
modeling for PFOA and PFOS (Harris and Barton, 2008;
Tan et al., 2008).
  A margin of exposure approach was used in the U.S. EPA's
PFOA preliminary risk assessment, which compared measured
human blood levels with laboratory animal blood levels
associated with toxic effects (U.S. EPA, 2005). The area
under the blood concentration-time curve (AUC), concentration
at steady state (Css), or peak concentration (Cmax) were
dose metrics for evaluating effects in this draft assessment.
The cross-species pharmacokinetic extrapolation using AUC
or Css (e.g., Css,human/Css,mouse) and a one-compartment
pharmacokinetic model is estimated by the ratio of half-lives
assuming the volume of distribution is a similar fraction of
body weight across species, for example, Css = dose rate
(mg/kg/day)/[volume of distribution (l/kg) × elimination rate
constant (1/day)], where half-life equals ln 2/elimination rate
constant. To date, the one-compartment model has been used
for PFOA pharmacokinetic analysis in rat and monkey
(Harada et al., 2005; U.S. EPA, 2005; Washburn et al.,
2005). However, the half-life estimated in humans and
animals may not be constant, and animal half-lives estimated
by following blood levels after a single dose may not be
comparable with estimates from humans who have had
chronic exposure. Monkey and rat data have been interpreted
as indicating that the volume of distribution changes with
concentration (Trudel et al., 2008; Washburn et al., 2005).
Large volumes of distribution do not seem likely, however,
because PFOA is known to rapidly achieve quasi-steady-state
in blood (Andersen et al., 2006; Lau et al., 2006).
Alternatively, monkey data suggest that excretion is
concentration dependent (faster elimination rate at higher
concentrations), that is, half-lives are not constant (Andersen
et al., 2006). These issues increase the difficulty in
extrapolating from one species to another. A recently
developed biologically motivated pharmacokinetic model of
saturable, renal resorption that depends on kinetic factors for
transport successfully described the monkey data (Andersen
et al., 2006). The difference in apparent elimination rates with
increasing dose indicated that capacity-limited, saturable
transport processes may be involved in the kinetic behavior
of PFOA.
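
The steady-state relation quoted in the paragraph above (Css = dose rate / [volume of distribution × elimination rate constant], with half-life = ln 2 / elimination rate constant) can be sketched numerically. This is a minimal illustration, not a calculation from the study: the mouse half-life, the volume of distribution, and the dose rate below are hypothetical placeholders, and only the 3.8-year human half-life is taken from the text. It shows why, when dose rate and volume of distribution per kg are assumed equal, the Css ratio between species reduces to the ratio of half-lives.

```python
import math


def elimination_rate_constant(half_life_days):
    """First-order elimination rate constant (1/day) from a half-life in days."""
    return math.log(2) / half_life_days


def steady_state_concentration(dose_rate, vd, ke):
    """Css (mg/l) = dose rate (mg/kg/day) / [Vd (l/kg) * ke (1/day)]."""
    return dose_rate / (vd * ke)


# The human half-life is the 3.8-year estimate quoted in the text.
HUMAN_HALF_LIFE_DAYS = 3.8 * 365.0
# Everything below is a hypothetical placeholder chosen for illustration.
MOUSE_HALF_LIFE_DAYS = 20.0
VD_L_PER_KG = 0.2          # assumed equal across species
DOSE_RATE_MG_KG_DAY = 1.0  # assumed equal across species

css_human = steady_state_concentration(
    DOSE_RATE_MG_KG_DAY, VD_L_PER_KG,
    elimination_rate_constant(HUMAN_HALF_LIFE_DAYS))
css_mouse = steady_state_concentration(
    DOSE_RATE_MG_KG_DAY, VD_L_PER_KG,
    elimination_rate_constant(MOUSE_HALF_LIFE_DAYS))

# With equal dose rate and Vd, the Css ratio equals the half-life ratio.
css_ratio = css_human / css_mouse
print(f"Css ratio (human/mouse): {css_ratio:.2f}")
```

The sketch only holds under the one-compartment, constant-half-life assumptions that the paragraph goes on to question.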
  In this study, one- and two-compartment models with
first-order absorption and clearance were fit to
   PFOA time course data to estimate the pharmacokinetic
   parameters (volume of distribution, absorption rate, and
   elimination rate) for female and male CD-1 mice based
   upon PFOA concentrations following single and repeated
   doses. The saturable resorption model, which elaborates the
   description of elimination, also was applied to investigate the
   kinetic behaviors of PFOA in mice. These analyses characterize
   models for mice that provide initial estimates of
   dosimetry that could be applied in risk assessment, though
   they may also be considered intermediate steps in the
   development of a more complete physiologically based
   pharmacokinetic model that would better characterize PFOA
   tissue distribution, which is not directly addressed by any of
   these current models.
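
The saturable resorption described in the abstract has a Michaelis-Menten form, which can be illustrated with the point estimates reported there (Tm = 860.9 mg/l/h, KT = 0.0015 mg/l). The sketch below shows only that single rate term and not the full compartmental model of Andersen et al.; the function names are hypothetical, and treating the driving concentration as the filtrate concentration is an assumption made for illustration. The abstract also notes that both estimates carry large standard errors.

```python
T_MAX = 860.9  # transport maximum, mg/l/h (point estimate from the abstract)
K_T = 0.0015   # half-maximum concentration, mg/l (point estimate from the abstract)


def resorption_rate(conc, t_max=T_MAX, k_t=K_T):
    """Michaelis-Menten resorption rate at a given concentration (mg/l)."""
    if conc < 0:
        raise ValueError("concentration must be non-negative")
    return t_max * conc / (k_t + conc)


def saturation_fraction(conc, k_t=K_T):
    """Fraction of the transport maximum in use at a given concentration."""
    if conc < 0:
        raise ValueError("concentration must be non-negative")
    return conc / (k_t + conc)


# Hypothetical concentrations spanning several orders of magnitude.
for conc in (0.0001, 0.0015, 0.1, 10.0):
    print(f"{conc:>8} mg/l -> {saturation_fraction(conc):.3f} of Tm")
```

At concentrations well below KT the term is nearly linear in concentration, and well above KT it approaches Tm, which is the behavior that produces dose-dependent apparent half-lives.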
                  MATERIALS AND METHODS

     PFOA (ammonium salt; >98% pure) was purchased from Fluka Chemical
   (Steinheim, Switzerland). Nuclear magnetic resonance analysis kindly provided
   by 3M Company (St Paul, MN) indicated that approximately 98.9% of the
   chemical was straight-chain and the remaining 1.1% was  branched isomers.
   [1,2-13C]-PFOA was purchased from Perkin-Elmer (Wellesley, MA) and used
   as an internal standard in the quantitative analysis. For all studies, PFOA dosing
   solutions were dissolved in deionized water and prepared fresh daily.
     Complete data tables are available as an online supplement.

     Animal treatment.  All animal studies were conducted in accordance with
   the Institutional Animal Care and Use Committee guidelines established by
   the U.S. Environmental  Protection Agency's  Office  of Research  and
   Development/National Health and Environmental Effects  Research Labora-
   tory. Procedures and facilities were consistent with the  recommendations
   of the 1996 National Research Council's "Guide for the Care and Use of
   Laboratory Animals," the Animal Welfare Act,  and Public Health Service
   Policy on the Humane Care and Use of Laboratory Animals. Animal facilities
   were controlled for temperature (20-24°C) and relative humidity (40-60%)
   and kept under a 12-h light-dark cycle. Mature male and female CD-1 mice
   (70-80 days of age) were  purchased  from Charles River  Laboratories
   (Raleigh, NC) and shipped by truck to our facilities, with a transit time of
   less than one hour. Animals were segregated by sex, housed in polypropylene
   cages (three  per cage), and provided  pellet chow (LabDiet  5001, PMI
   Nutrition International, St. Louis, MO) and tap water ad libitum. Mice were
   allowed several days for acclimation and randomly assigned to treatment
   groups.  Several studies were  undertaken involving single  or repeated
   dosing. Two studies with very  similar designs were carried out in which
   mice were given a single oral gavage treatment of either 1 mg/kg or 10 mg/kg
   PFOA. In the first study (PK1), three males and three females from each dose
   group were sacrificed by decapitation at the following time intervals: 4, 8, or
   12 h, and 1, 3, 6, 9, 13, 20, 27, 34, 42, or 48 days. Trunk blood was collected for
   serum preparation and stored at -20°C; liver and kidney were dissected,
   flash-frozen on dry ice, and stored at -80°C until being processed for analysis. For
   the second study (PK2), the evaluation time points were extended to include
   55, 62, 70, and 80 days. Serum, liver, and kidneys were collected and stored as
   described previously. Based upon initial modeling efforts, a study at a higher
   dose was carried out in which female mice were given 60 mg/kg PFOA (6 mg/ml
   dosing solution, 10 ml/kg dosing volume) and three mice were sacrificed at
   each of the following time intervals: 2, 4, 6, 8, 12, 24, or 36 h, or 2, 4, 6, 8, 11, 14,
   21, or 28 days. Only serum was analyzed for these animals. Finally,
   a repeated dose study was carried out in which five animals received 20 mg/kg/day
   for 17 days and serum was obtained 24 h after the final dose as previously
   described (Lau et al., 2006).
   PFOA determination.  Serum samples were thawed and mixed well by
vortexing; an aliquot (25-100 µl) was removed for analysis. The volume of
serum assayed was varied to optimize detection of PFOA because levels were
very high at early time points and very low at the latest. Liver and kidney
were thawed, weighed, and homogenized (polytron) in 5 volumes of
deionized, distilled water. Analysis of PFOA in serum and tissues was
performed using a modification of a method originally developed by Hansen
et al. (2001). Briefly, 25-100 µl of serum or 25 µl of tissue homogenate was
combined with 1 ml of 0.5M tetrabutylammonium hydrogen sulfate (pH 10)
and 2 ml of 0.25M sodium carbonate and then vortexed for 20 min in a 15-ml
polypropylene tube. Three hundred microliters of this mixture was then
transferred to a fresh 15-ml polypropylene tube and 25 µl of a 1 ng/µl
solution of 13C-PFOA was added as an internal standard. Five milliliters of
methyl tert-butyl ether (MTBE) was then added and the mixture vortexed again for
20 min. The sample was centrifuged at 2000 × g for 3 min to separate the
aqueous and organic phases, and 1 ml of the MTBE layer was transferred
to a 5-ml polypropylene tube where it was evaporated to dryness at 45°C
under a gentle stream of dry nitrogen. The residue was then solubilized in
400 µl of a 2mM ammonium acetate/acetonitrile (1:1 by vol) solution and
transferred to a polypropylene autosampler vial. No pH adjustments were
made for this solution. Extracts were analyzed using an Agilent 1100
high-performance liquid chromatograph (Agilent Technology, Palo Alto, CA)
coupled with an API 3000 triple quadrupole mass spectrometer (Applied
Biosystems, Foster City, CA) (LC/MS/MS). Ten microliters of the extract was
injected onto a Luna C18(2) 3 × 50 mm, 5-µm column (Phenomenex,
Torrance, CA) using an isocratic mobile phase consisting of 30% 2mM
ammonium acetate solution and 70% acetonitrile at a flow rate of 200 µl/min.
PFOA and 13C2-PFOA were monitored using parent and daughter ion
transitions of 413 → 369 and 415 → 370, respectively. Peak integrations and
areas were determined using Analyst software (Applied Biosystems Version
1.4.2, Foster City, CA). For each analytical batch, matrix-matched calibration
curves  were prepared as described above using mouse  serum spiked with
varying levels of PFOA. For quality control, check standards were prepared
by spiking large volumes of mouse serum at several arbitrary levels. These
check standards were stored frozen and aliquots analyzed with each analytical
set. Different  preparations of standards  were used for  each experimental
study; the concentration of each newly prepared standard was compared with
the previous batch to ensure consistency. In addition, control  mouse serum
samples  were fortified  at two  or  three levels  in  duplicate with  known
quantities of PFOA during the preparation of each analytical set. Duplicate
fortified  and  several check standards were run in each  analytical batch to
assess precision and accuracy. The limit of quantification (LOQ) was set as
the lowest calibration point on the standard curve. Analytical batches were
considered to be acceptable if: matrix and reagent blanks had no significant
PFOA  peaks  approaching the LOQ, the standard curve had a correlation
coefficient > 0.98, and all standard curve points, fortified, and check samples
were within 70-130% of the theoretical and previously  determined values,
respectively.
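
The batch acceptance criteria listed above can be written as a small check. This is a minimal sketch and not the authors' software: the function names and data layout are hypothetical, and the `blank_fraction` threshold is an assumption introduced to quantify "approaching the LOQ", which the text does not define numerically. The 0.98 correlation cutoff and the 70-130% recovery window are taken from the text.

```python
def recovery_ok(measured, expected, low=0.70, high=1.30):
    """True if measured is within 70-130% of the expected value."""
    return low <= measured / expected <= high


def batch_is_acceptable(blank_signals, loq, correlation, measured_expected_pairs,
                        blank_fraction=0.5):
    """Apply the three acceptance criteria to one analytical batch.

    blank_signals: apparent PFOA levels in matrix and reagent blanks.
    loq: limit of quantification (lowest calibration point).
    correlation: correlation coefficient of the standard curve.
    measured_expected_pairs: (measured, expected) for curve points,
        fortified samples, and check standards.
    blank_fraction: ASSUMED cutoff; a blank counts as "approaching the
        LOQ" if it reaches this fraction of the LOQ.
    """
    blanks_ok = all(signal < blank_fraction * loq for signal in blank_signals)
    curve_ok = correlation > 0.98
    recoveries_ok = all(recovery_ok(m, e) for m, e in measured_expected_pairs)
    return blanks_ok and curve_ok and recoveries_ok


# Hypothetical batch, values in arbitrary concentration units.
example_pairs = [(9.1, 10.0), (52.0, 50.0), (118.0, 100.0)]
print(batch_is_acceptable([0.1, 0.2], loq=1.0, correlation=0.995,
                          measured_expected_pairs=example_pairs))
```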

   One- and two-compartment pharmacokinetic analysis of 1 and 10 mg/kg
data.  The PFOA  single  oral dose time course data at the two lower doses
included  three tissues (blood  sera, liver and kidney), two genders (female and
male mice), and two doses (1 and 10 mg/kg) collected in two experimental
blocks  (PK1  and PK2). Thus, there are  24 data  sets  in  all. We  estimated
parameters using R (version  2.4.1,  R Development Core  Team, 2007). One-
and two-compartment models were fit to blood sera, liver, and kidney time-
course data for each gender, dose, and block.
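
The count of 24 data sets follows from crossing the four design factors. A short enumeration, using hypothetical labels for the factor levels, makes the structure explicit:

```python
from itertools import product

# Factor levels as described in the text; the label strings are illustrative.
TISSUES = ("serum", "liver", "kidney")
GENDERS = ("female", "male")
DOSES_MG_PER_KG = (1, 10)
BLOCKS = ("PK1", "PK2")

datasets = list(product(TISSUES, GENDERS, DOSES_MG_PER_KG, BLOCKS))
print(len(datasets))  # 3 tissues x 2 genders x 2 doses x 2 blocks
```

Within each tissue this leaves eight gender-dose-block combinations, which is the set over which the parameter simplification tests described later are run.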
   The one- and two-compartment models with  first order absorption and first
order elimination can be described as:
$$c(t) = \frac{D\,k_a}{V_d\,(k_a - k_e)}\left(e^{-k_e t} - e^{-k_a t}\right)$$
   One-compartment model, where D, dose; V_d, volume of distribution; k_a,
absorption rate constant; k_e, elimination rate constant.
          Two-compartment model, central compartment, where D and &a are as above
       and Vi, volume of central compartment; fc12, rate constant  for transfer from
       compartment  1  to  compartment 2; k2i,  rate  constant for transfer  from
       compartment 2 to compartment 1;  a, agglomerate rate constant representing
       net loss  from  the  central  compartment  during  the  distribution  phase;
       P, agglomerate rate constant representing net loss from the central compartment
       after the distributional phase is complete.
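The closed-form models described above can be evaluated numerically. The following is a minimal Python sketch, not the authors' R code. It assumes complete absorption and the textbook biexponential form for the central compartment; the central elimination rate constant `k10` is a name introduced here for illustration, from which α and β are derived together with k12 and k21.

```python
import math


def one_compartment(t, dose, vd, ka, ke):
    """Concentration at time t with first-order absorption and elimination."""
    if ka == ke:
        raise ValueError("ka and ke must differ for this closed form")
    scale = dose * ka / (vd * (ka - ke))
    return scale * (math.exp(-ke * t) - math.exp(-ka * t))


def two_compartment(t, dose, v1, ka, k12, k21, k10):
    """Central-compartment concentration at time t.

    k10 (elimination from the central compartment) is an assumed
    micro-constant; alpha and beta are the roots of
    x^2 - (k12 + k21 + k10) x + k21 k10 = 0, with alpha > beta.
    """
    s = k12 + k21 + k10
    root = math.sqrt(s * s - 4.0 * k21 * k10)
    alpha = 0.5 * (s + root)
    beta = 0.5 * (s - root)
    terms = (
        (k21 - alpha) * math.exp(-alpha * t) / ((ka - alpha) * (beta - alpha)),
        (k21 - beta) * math.exp(-beta * t) / ((ka - beta) * (alpha - beta)),
        (k21 - ka) * math.exp(-ka * t) / ((alpha - ka) * (beta - ka)),
    )
    return dose * ka / v1 * sum(terms)


# Female serum estimates quoted in the Results (Vd in L/kg, rates in 1/h).
print(one_compartment(24.0, dose=1.0, vd=0.135, ka=0.537, ke=0.00185))
```

With k12 set to zero the second compartment is never filled, so `two_compartment` reduces to `one_compartment`, which is a convenient sanity check on the formulas.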
          The models were fit by generalized nonlinear least squares, using the
       R function gnls in package nlme (Pinheiro et al., 2007) to estimate the
       parameters. A likelihood ratio test was applied to compare the one- and two-
       compartment models to determine which model better described the data.
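As an illustration of the model comparison, here is a minimal Python sketch of a likelihood ratio test for nested models. It is not the authors' R code. It assumes the usual asymptotic chi-square reference distribution and assumes the two-compartment model adds two parameters to the one-compartment model. The log-likelihood values in the usage line are invented for the example.

```python
import math


def likelihood_ratio_test(loglik_reduced, loglik_full, extra_params):
    """Return (statistic, p_value) for a pair of nested models.

    The p-value uses the closed-form chi-square survival function, which
    holds only for an even number of degrees of freedom.
    """
    if extra_params <= 0 or extra_params % 2 != 0:
        raise ValueError("this sketch supports only positive, even degrees of freedom")
    stat = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    half = stat / 2.0
    # P(X > x) for chi-square with 2m df: exp(-x/2) * sum_{j<m} (x/2)^j / j!
    tail = sum(half ** j / math.factorial(j) for j in range(extra_params // 2))
    return stat, math.exp(-half) * tail


# Hypothetical log-likelihoods for one- and two-compartment fits.
stat, p_value = likelihood_ratio_test(-41.2, -40.1, extra_params=2)
print(stat, p_value)
```

A p-value above 0.05 here would mean the extra compartment does not significantly improve the fit.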
          Initially, one-compartment models were fit with a separate parameter value
       for each gender, dose-level,  and experimental block in each tissue. Conditional
       F-tests (Pinheiro and Bates, 2000) on linear combinations of the dose-block-
       gender specific parameters were then used to  determine the extent to which
       each  parameter  could be  simplified.  An  orthogonal  series  of  contrasts,
       analogous to those in multi-factorial analysis of variance, was developed, so
       that interaction terms were tested first (in order of decreasing complexity),
       followed by main-effects terms. That is, for a given parameter type (e.g.,
       volume of distribution) we first tested whether there was a significant
       three-way interaction (dose × block × gender); if the three-way interaction
       was not significant, we then tested dose × block, dose × gender, and
       block × gender, and then gender, block, and dose. Using the results of these tests, a new
       statistical model was constructed  by collapsing over the effects that were not
       significant. For example, if only gender effects remained significant for volume
       of distribution, a new model would  be constructed  in which volume of
       distribution was allowed to vary among genders, but not across blocks or dose
       levels. When block was found to be significant, it was incorporated as a random
       effect in a nonlinear mixed-effects model, fit using the function nlme.
          One- and two-compartment pharmacokinetic  analysis  of 60  mg/kg
       data.  Subsequent to the analysis of 1 and 10 mg/kg data, an oral time course
       in serum  of female mice  exposed to  60 mg/kg was  collected based upon
       preliminary model predictions that the time course should be biphasic. These
       data were also evaluated for one- and two-compartment model fits.
          The data differ from those collected for the 1 and 10 mg/kg doses in having
       replicate values for about half the measurements, so a hierarchical statistical
       model was fit to the data, with Vd (in the one-compartment model) and V1 (in
       the two-compartment model) varying among subjects. ke and ka were found not to
       be statistically identifiable as individually varying parameters. In this
       model, the distribution of the log of the compartment volume is assumed to be
       Gaussian, and the population mean and variance are additional parameters to be
       estimated. Estimation was via the method  of Lindstrom and Bates (1990), as
       implemented in the package nlme  (Pinheiro et al., 2007) for R (R Development
       Core Team,  2007).

          Saturable resorption model analysis.  Our saturable resorption model was
       adapted  with minor modifications  from  Andersen  et al.   (2006),  and
       implemented using Matlab  (version R2007a, The Mathworks, Natick,  MA)
       (see Appendix) to simulate and predict the single and repeated oral dose data
       for blood sera in female  mice. Solutions  were obtained  using a stiff solver
       (ode23s) that implemented the  modified  Rosenbrock  (2,3)  pair  approach
       (Shampine and Reichelt, 1997). All the simulations were run  on a computer
       equipped with 3-GHz Dual Core Pentium 4 processor and the Windows XP
       operating system.
          The salient feature of the Andersen et al. (2006) model is that the free
       concentration of PFOA in the central compartment (given by free × C1) is
       cleared to a filtrate compartment, where it is either excreted or resorbed via
       a saturable process with a Michaelis-Menten form. We examined the original
       three-compartment (two body, one filtrate) model with eight model parameters

-------
as well as a simplified, two-compartment (one body, one filtrate) model with six
parameters (Fig. 1). Oral uptake was assumed to be first order with the same
rate we determined for the one-compartment model.

  FIG. 1.  A schematic for the renal saturable resorption pharmacokinetic
model. Labels in the schematic: Oral Dose (D); Central Compartment (Vc, C1, free);
Second Compartment (Vt, C2); Filtrate Compartment (Vfil, C3); Qfil; Tm, KT.

  We determined the model parameters  for female mice via  Maximum
Likelihood Estimation. Values of the likelihood function were calculated for
eight sets of observations in sera: the two blocks each of 1 mg/kg and 10 mg/kg
single doses, the 60 mg/kg single dose, the 17-day repeated 20 mg/kg/day dose
observations, and the repeated dose data from Lau et al. (2006) for 7 and
17 days also at 20 mg/kg/day. We  allowed the coefficient of variation to be
different for each of these eight sets of observations so that our likelihood
function depended upon either six or eight model parameters and eight variance
parameters. The contribution to the likelihood for each animal was
calculated; for the few animals where replicate measures had been performed,
the results were averaged. We used a Nelder-Mead optimizer (Lagarias et al.,
1998) to find the combination of parameters that maximized the likelihood. We
numerically approximated the  second derivative (D'Errico, 2007) of the
likelihood function at the optimized parameter values to obtain the standard
error for each parameter estimate.
                          RESULTS

Experimental Data
   Serum, and in some cases tissue concentrations, of PFOA in
mice were obtained in  several studies including time course
data following  a single dose at three different dosages and
single time points following two durations of repeated dosing
(Table 1).
   The blood sera, liver  and kidney concentration time-courses
after single oral doses of 1 and 10 mg/kg are plotted in Figure 2
for the male  and female  mice.  Male and female  mice  fairly
rapidly absorbed PFOA, as judged by the time of maximum
   observed concentration  (4  or 8 h).  The  liver concentrations
   were  often  higher than those in  sera,  whereas  both were
   substantially higher than the  kidney concentrations. The data
   from  sera,  liver, and kidney and  plots for all the male  and
   female mice dosed with 1 and  10  mg/kg  are presented in the
   Supplemental Materials.
     For single doses of 1  and 10 mg/kg the pharmacokinetics are
   essentially linear, as illustrated in Figure 3 in which the female
   serum time courses  collapse,  when  scaled  by dose,  onto
   approximately the  same line on a semilogarithmic plot. The
   pharmacokinetics were  quite  different  at the highest dose, 60
   mg/kg,  appearing roughly bi-exponential with  low  concen-
   trations, as a fraction of total dose, achieved more rapidly.

   Statistical Analysis of Compartmental Pharmacokinetic
     Models
     Pharmacokinetic  data are routinely analyzed using classical
   one and two-compartmental models and serum concentration data
   (Table 2).  To   compare  clearance  and  apparent  volume of
   distribution, as reflected by liver and kidney concentrations, data
   for each of these tissues were also fitted using the compartmental
   models. For the 1 and  10 mg/kg data, all kinetic parameters were
   identified and estimated in the one-compartment model. With the
   two-compartment model, it was possible to estimate parameters in
   only  six of the twenty-four 1 and 10  mg/kg data sets for blood
   sera,  liver and kidney,  due to failures of convergence. These
   failures are likely to be related to the inability to uniquely estimate
   some of the parameters.  We  compared  the  one- and  two-
   compartment models for estimating the  available corresponding
   identified parameters using the likelihood ratio test and found that
   none of the results  were significant (p  > 0.05), that is, adding
   a second compartment did not significantly improve the ability of
   the model to account for the data. Thus, we at first focused on the
   one-compartment model for further parameter estimation studies
   with  the 1 and 10 mg/kg data.
     Vd and ke differed significantly between males and females
   for all three tissues, although the differences generally are not
   large (Table 2). Vd differed between doses in kidney. Vd also
   varied significantly between the two blocks in sera, and block was
   therefore included in the estimation of experimental error (all
   p < 0.05). The absorption rate constant, ka, was only marginally
   estimable with these data, so it was estimated as a single value
   across all data sets for each tissue.
                                                          TABLE 1
                                          Pharmacokinetic Studies of PFOA in Mice

Study                             Dose                      Sex            Tissues                Sampling
Single dose (PK 1)                1, 10 mg/kg               Male, female   Serum, liver, kidney   Time course
Single dose (PK 2)                1, 10 mg/kg               Male, female   Serum, liver, kidney   Time course
Single dose                       60 mg/kg                  Female         Serum                  Time course
Repeated dose                     20 mg/kg, 17 days         Female         Serum                  24 h after final dose
Repeated dose; Lau et al. (2006)  20 mg/kg, 7 and 17 days   Female         Serum                  24 h after final dose

-------
  FIG. 2.  Experimental data plots for different doses, genders, and blocks.
Panels: Female and Male; x-axis: Day; y-axis: PFOA concentration; symbols
distinguish 1 and 10 mg/kg doses in plasma, liver, and kidney.

  The mean and 95% confidence intervals for each parameter
are shown in Table 2. Because the initial sampling time point
(4 h) is too late to capture the absorption process, the 95%
confidence intervals for ka for blood sera and liver are quite
wide. We cannot identify ka for kidney from our data. To address
this problem, we first explored the sensitivity of the estimates of
Vd and ke to the value of ka by estimating Vd and ke while fixing ka
at 0.3, 0.5, 1, and 1.5/h. The estimates and 95% confidence
intervals were insensitive to the particular value chosen for ka,
so ka for kidney was set to the mean (0.527/h) of the ka values in
blood sera and liver. As shown in Table 2, female mice have
lower values of Vd in blood sera and kidney, but higher values
in liver, than male mice. In all three tissues, ke values are
modestly higher in female mice than in male mice.

  FIG. 3.  Serum concentrations scaled by dose for females administered
single doses of 1, 10, and 60 mg/kg, plotted against time (Day). Points are
means; error bars are 95% confidence intervals for the means. The 1 and
10 mg/kg dose groups are largely superimposed and linear in time on this
semi-log plot, suggesting linear first-order kinetics at these doses. The
60 mg/kg group has a substantially different shape and time course.

  The one-compartment model was successful in describing the
1 and 10 mg/kg single dose serum data sets with estimated
values of Vd (= 0.135 L/kg) and ke (= 0.00185/h) for female
mice. These same two parameter values fail to predict serum
concentrations for both the higher, 60 mg/kg single dose and
the repeated dose (20 mg/kg) blood sera data. The serum
concentrations resulting from doses of 10 and 60 mg/kg
converge after roughly a day, indicating that linear
pharmacokinetic models, including both one- and two-compartment
models, cannot reproduce the observed pharmacokinetics.
  Though the 60 mg/kg single dose data were well described by
a two-compartment analysis (Table 3), as is shown in Figures 4a
and 4b, predictions made using the two-compartment parameter
estimates from the 60 mg/kg data were not consistent with the
1 and 10 mg/kg blood sera data. Jointly analyzing all the
available data to optimize the two-compartment model, also
shown in Figure 4, resulted in predictions that did not
reproduce any of the dose regimens.
  After 7 and 17 days of dosing female CD1 mice with 20 mg/kg
PFOA, Lau et al. (2006) found that the measured serum levels
were approximately equal (176 ± 56 vs. 172 ± 34 mg/L,
respectively). We use recalculated values based on the original,
individual mouse data that are somewhat different from what was
reported by Lau et al. (2006). Additional 17-day, 20 mg/kg
repeated dose experiments were performed for five female CD1
mice, and a serum PFOA concentration of 130 ± 23 mg/L was
measured. The one-compartment model only fit the repeated dose
data if the elimination rate ke was increased from 0.00185/h to
0.0255/h, that is, if the half-life of PFOA in blood sera
decreased from 15 to 1.2 days. This results in the contradiction
that different kinetics were observed when using the one-compartment


-------
                                                       TABLE 2
                           One-compartment model parameters for 1 and 10 mg/kg doses of PFOA

Tissue       Parameter     Female (95% confidence interval)   Male (95% confidence interval)
Blood sera   Vd (L/kg)
             ka (1/h)
             ke (1/h)
             t1/2 (day)
Liver        Vd (L/kg)
             ka (1/h)
             ke (1/h)
Kidney       Vd
-------
  FIG. 4.  Comparing predictions for the two-compartment model when fit to all the available data (dashed line) with a fit to just the 60 mg/kg data (dotted line).
Neither model does a good job of describing all of the data, whereas the saturable resorption model (solid line) is more consistent between doses.
Panels include 1 mg/kg, 20 mg/kg repeated, and 60 mg/kg; x-axis: Days.

(0.6830 L/h) is roughly two thirds of the total output. For
comparison, in monkeys Andersen et al. (2006) assumed that
Qfil was 10% of cardiac output to the kidney, corresponding to
Qfil = 0.0943 L/h. For a standard body weight of 0.025 kg,
glomerular filtration for female C57BL/6J mice is much
smaller, 0.00945 L/h (Qi et al., 2004), whereas in adult male
CD1 mice the urinary flow rate is smaller still, 0.000076 L/h
(Luippold et al., 2002). Thus, our optimized value for Qfil
does not appear to have a direct physiologic analog.
  Using our calibrated saturable resorption model, we investigated
what doses of PFOA are predicted to show the two different kinetic
behaviors. For selected dose levels, we predicted changes in kinetics
from low dose to high dose (Fig. 7). Only one phase was predicted at
low doses (< 40 mg/kg), whereas two phases occurred at the high doses
(> 40 mg/kg), with a fast initial elimination rate giving way to a much
slower rate after roughly one day. Simulations using an earlier version
of the model were the basis for selecting the 60 mg/kg dose, which did
demonstrate the biphasic behavior predicted by the saturable resorption
model but not previously observed in the serum time course data with
lower doses. For repeated doses, serum concentrations for daily doses of
0.01, 0.1, and 1 mg/kg saturated after about two weeks, whereas
for 5 mg/kg the serum concentration quickly saturated. Within
a day of daily doses of 50 and 500 mg/kg, serum concentration
saturated at the same concentration as with 5 mg/kg.

  FIG. 5.  Predictions and quantiles for the saturable resorption model when optimized using all available data (single doses of 1, 10, and 60 mg/kg, and repeated dosing; x-axis: Day). The predictions for the maximum likelihood
estimated parameters are indicated by a solid line, with open squares indicating where model predictions should be compared with observations for the repeated
dose data. Dashed lines indicate the 95% upper and lower quantiles using the estimated parameter uncertainty. The Lau et al. (2006) 7- and 17-day observations as
well as our new 17-day observations are indicated by solid triangles.

-------
                          TABLE 4
 Assumed and Optimized PFOA Resorption Model Parameters

Parameter                                                Value (SE)               Source
Body weight (BW)                                         25 g                     Assumed standard value
Cardiac output                                           16.5 L/h for mice        Barbee et al. (1992)
Absorption rate (ka)                                     0.537 /h                 Estimated from single dose data using one-compartment model
Volume of distribution of the central compartment (Vc)   0.0027 (0.0002) L        Optimized
Volume of renal filtrate (Vfil)                          0.01 L                   Assumed, Andersen et al. (2006)
Renal blood filtrate rate (Qfil)                         0.6830 (1.0131) L/h      Optimized
Volume of distribution of second body compartment (Vt)   0.0545 (0.0151) L        Optimized
Intercompartmental clearance (Qd)                        0.00059 (0.00037) L/h    Optimized
Transport maximum (Tm)                                   860.9 (1298.3) mg/L/h    Optimized
Transport affinity constant (KT)                         0.0015 (0.0022) mg/L     Optimized
Proportion of PFOA free in serum (free)                  0.02                     Assumed, Andersen et al. (2006)

  FIG. 6.  Saturable resorption model predictions (x-axis: Day) using parameters obtained
with a maximum likelihood estimate show that the concentration in the filtrate
compartment (dashed) spikes early on, allowing the concentration in the primary
compartment (solid) to rapidly converge for doses of 10 and 60 mg/kg.

  Normalized sensitivity  coefficients, defined as  (change of
output/output)/(change of input/input), were used to test the
parameter  sensitivity at different days after a single dose of
20 mg/kg (Fig. 8). After 1 day, the most sensitive parameters
were Qfil, Tm, and KT; that is, the kinetics of the resorption
process, because they dictate clearance, are the most important
for predicting long-term concentrations.
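The normalized sensitivity coefficient defined above can be computed by finite differences. The following is a minimal Python sketch under stated assumptions: it uses the one-compartment model as a stand-in for the saturable resorption model that was actually analyzed, and the function names are hypothetical. The 1% perturbation matches the one described for Figure 8.

```python
import math


def one_compartment(t, dose, vd, ka, ke):
    """Stand-in model: first-order absorption and elimination."""
    scale = dose * ka / (vd * (ka - ke))
    return scale * (math.exp(-ke * t) - math.exp(-ka * t))


def normalized_sensitivity(model, params, name, t, rel_step=0.01):
    """(change of output / output) / (change of input / input)."""
    base = model(t, **params)
    bumped = dict(params)
    bumped[name] = params[name] * (1.0 + rel_step)
    return ((model(t, **bumped) - base) / base) / rel_step


# Female serum estimates from the Results; 20 mg/kg single dose, 10 days.
params = {"dose": 20.0, "vd": 0.135, "ka": 0.537, "ke": 0.00185}
for name in ("vd", "ka", "ke"):
    print(name, normalized_sensitivity(one_compartment, params, name, t=240.0))
```

A coefficient near zero means the prediction at that time barely depends on the parameter, whereas a magnitude near one means a 1% change in the parameter shifts the prediction by about 1%.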

                         DISCUSSION

    For 1 and 10 mg/kg single doses, kinetic parameters differ
  significantly between genders, but the magnitude of the
  differences is small, indicating that PFOA pharmacokinetic
  behaviors are similar in female and male mice, in contrast
  to rats. The values of the parameter ka were not well
  estimated, and the 95% confidence intervals were wide. This
  is because PFOA absorption in mice was fairly rapid, and
  absorption was almost finished before the initial
  sampling time point (4 h). Because of the uncertain estimation
  of ka values, we used only one ka for female and male mice
  for each tissue.
  FIG. 7.  Delineation of predictions for the PFOA concentration (mg/L) in the central compartment, plotted against time (Day). For the single dose (top) solid lines depict doses of 0.1, 1,
10, 100, and 1000 mg/kg. The dashed line indicates a dose of 40 mg/kg, which is roughly where the onset of nonlinearity occurs. For the repeated dose (bottom)
solid lines depict repeated daily doses of 0.001, 0.1, 1, 50, and 500 mg/kg. The dashed line indicates a daily dose of 5 mg/kg.

-------
  FIG. 8.  Analysis of the parameter sensitivity by increasing each parameter
in turn by 1% and comparing predicted concentrations for a 20 mg/kg single
dose with those for the optimized/assumed value (x-axis: Day; legend entries
include Qfil, ka, Vfil, and free). Note that plot points for
several parameters are on top of each other near zero.
  The one-compartment model is successful in describing the
1 and 10 mg/kg single dose data sets with the estimated values
of Vd,  ka, and ke, but fails in predicting the higher, 60 mg/kg
single dose and the repeated dose data.  Similarly, although the
60 mg/kg data can be described by a two-compartment model,
for the optimized parameters that model  overestimates the 1 and
10 mg/kg single dose data. Neither model predicts the repeated
dose observations without  changing  some model  parameters
drastically from the single dose case.
  The saturable resorption model of Andersen et al. (2006)
reconciles the lower two doses with the high single dose by
allowing the clearance to change with exposure level, in
place of the first-order (proportional) clearance of the one-
and two-compartment models. At low dose, or in the early period of
repeated dosing for our data, the PFOA concentration in the filtrate
compartment is low and proportional to dose, so most filtered PFOA is
resorbed and the net urinary elimination rate is low. At high dose (including
the pseudosteady state of repeated dosing for our data), the PFOA
concentration in the filtrate compartment is high and resorption
is saturated, which results in a high net urinary elimination rate.
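The saturation argument can be illustrated with the Michaelis-Menten term on its own. The following is a minimal Python sketch, not the full model: it evaluates only the resorption term, using the optimized Tm and KT values from Table 4, and the function names are hypothetical.

```python
def resorption_rate(c_filtrate, tm, kt):
    """Michaelis-Menten resorption rate (mg/L/h) at filtrate concentration c_filtrate (mg/L)."""
    return tm * c_filtrate / (kt + c_filtrate)


def apparent_rate_constant(c_filtrate, tm, kt):
    """Resorption rate per unit concentration (1/h); it falls as the transporter saturates."""
    return tm / (kt + c_filtrate)


TM = 860.9   # mg/L/h, optimized value from Table 4
KT = 0.0015  # mg/L, optimized value from Table 4

for c in (0.0001, 0.0015, 0.15, 15.0):
    fraction_of_max = resorption_rate(c, TM, KT) / TM
    print(c, fraction_of_max, apparent_rate_constant(c, TM, KT))
```

At filtrate concentrations well below KT the resorption rate is nearly proportional to concentration, so clearance looks first order. Well above KT the rate is capped near Tm, so additional filtered PFOA is excreted rather than resorbed.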
  The saturable resorption model does not, however, completely
reconcile the single dose data with the repeated dose
concentrations. Though parameters can be estimated such that
most data are within the 95% confidence intervals, the concentrations
observed after repeated doses by Lau et al. (2006) seem
to be systematically higher than predicted. This may reflect
experimental variability, given that the repeated dose data
reported here were lower than those previously measured.
  The modeling analyses presented here can be used to
estimate initial internal dose metrics for toxicity studies carried
out in adult mice. Characterization of the uncertainties in the
parameter estimates permits some description of the uncertainties
in predicted dose metrics. However, none of these models
describes tissue dosimetry, which would require a physiologically
based pharmacokinetic model structure. The recent demonstrations
that PFOA exposures in mice alter transporter expression
in at least the liver (Cheng and Klaassen, 2008; Maher et al.,
2008) raise further questions about the time course and dose-response
for these changes and how they affect serum, liver, or
other tissue concentrations. Additional experimental work
would also benefit from quantifying the fecal and urinary
elimination from mice, because current analyses assume nearly
complete absorption and do not distinguish elimination routes.
Models that are more physiologically based would potentially
need to explicitly address these issues.


                        FUNDING

   Interagency Agreement (RW-75-92207501) with the
National Toxicology Program at the National Institute of
Environmental Health Sciences was a source of partial funding.
                                                                            ACKNOWLEDGMENTS

                                                             The United States Environmental Protection Agency through
                                                           its Office of Research and Development funded and managed
                                                           the research described here. This research has been subjected to
                                                           the Agency's administrative review and approved for peer review.
                                                           We appreciate technical assistance from Kaberi Das.
                        APPENDIX

function dCdt = pfoa_ode_new2compab(t, C, P)
% Right-hand side of the PFOA compartmental model.
% C(1)-C(3) are concentrations; C(4) is the amount in the gut.
% From one-compartment analysis of 1 and 10 mg/kg data:
ka = 0.537;
% From Andersen et al. (2006):
Vfil = 0.01;       % [Vfil] = L
free = 0.02;       % [free] = 1
% Parse the parameter vector P
Vc = P(1);         % [Vc] = L
Vt = P(2);         % [Vt] = L
kd = P(3);         % [kd] = 1/h
Qd = kd*Vc;        % [Qd] = L/h
Tm = P(4);         % [Tm] = mg/L/h
KT = P(5);         % [KT] = mg/L
kfil = P(6);       % [kfil] = 1/h
Qfil = kfil*Vc;    % [Qfil] = L/h
% Note that gut compartment has different units:
dCdt = zeros(4,1);
dCdt(1) = ka/Vc*C(4) - Qd/Vc*free*C(1) + Qd/Vc*C(2) ...
          - Qfil/Vc*C(1)*free + Tm*C(3)/(KT+C(3));    % [C(1)] = mg/L
dCdt(2) = 1/Vt*(free*Qd*C(1) - Qd*C(2));              % [C(2)] = mg/L
dCdt(3) = 1/Vfil*(Qfil*C(1)*free - Vc*Tm*C(3)/(KT+C(3)) ...
          - Qfil*C(3));                               % [C(3)] = mg/L
dCdt(4) = -ka*C(4);                                   % [C(4)] = mg
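
The appendix equations can also be exercised outside MATLAB. The following is a minimal Python sketch, not the authors' code: it mirrors the four derivative expressions term by term and advances them with a fixed-step classical Runge-Kutta scheme. The values of ka, Vfil, and free come from the appendix; the parameter vector P_DEMO, the dose DOSE_MG, the step count, and the name "resorption" for the saturable term are illustrative assumptions.

```python
import math

KA = 0.537     # absorption rate constant from the appendix (1/h assumed)
V_FIL = 0.01   # filtrate volume, L (appendix value)
FREE = 0.02    # free fraction, unitless (appendix value)


def pfoa_rhs(C, P):
    """Derivatives of C(1)-C(3) (mg/L per h) and of the gut amount C(4) (mg per h)."""
    Vc, Vt, kd, Tm, KT, kfil = P
    Qd = kd * Vc        # L/h
    Qfil = kfil * Vc    # L/h
    c1, c2, c3, gut = C
    # Saturable term Tm*C(3)/(KT+C(3)); calling it "resorption" is an assumption.
    resorption = Tm * c3 / (KT + c3)
    return [
        KA / Vc * gut - Qd / Vc * FREE * c1 + Qd / Vc * c2
        - Qfil / Vc * c1 * FREE + resorption,
        (FREE * Qd * c1 - Qd * c2) / Vt,
        (Qfil * c1 * FREE - Vc * resorption - Qfil * c3) / V_FIL,
        -KA * gut,
    ]


def rk4_step(C, P, h):
    """Advance the state by one classical Runge-Kutta step of size h (hours)."""
    def shifted(k, scale):
        return [c + scale * ki for c, ki in zip(C, k)]

    k1 = pfoa_rhs(C, P)
    k2 = pfoa_rhs(shifted(k1, h / 2), P)
    k3 = pfoa_rhs(shifted(k2, h / 2), P)
    k4 = pfoa_rhs(shifted(k3, h), P)
    return [c + h / 6 * (a + 2 * b + 2 * d + e)
            for c, a, b, d, e in zip(C, k1, k2, k3, k4)]


def simulate(C0, P, hours, steps):
    """Integrate from the initial state C0 over `hours` using `steps` equal steps."""
    C = list(C0)
    h = hours / steps
    for _ in range(steps):
        C = rk4_step(C, P, h)
    return C


# Hypothetical parameter vector [Vc, Vt, kd, Tm, KT, kfil] and oral dose (mg).
P_DEMO = [0.005, 0.02, 0.1, 1.0, 0.1, 1.0]
DOSE_MG = 0.025
state_1h = simulate([0.0, 0.0, 0.0, DOSE_MG], P_DEMO, hours=1.0, steps=100)
```

Two properties follow directly from the equations and make convenient sanity checks: the gut amount decays as exp(-ka t) regardless of P, and the total mass Vc*C(1) + Vt*C(2) + Vfil*C(3) + C(4) changes only through the -Qfil*C(3) term. A fixed-step scheme is used here only to keep the sketch dependency-free; a stiff or adaptive solver may be preferable for other parameter values.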

-------
340
                                                                  LOU ET AL.
                           REFERENCES

Abbott, B. D., Wolf, C. J., Schmid, J. E., Das, K. P., Zehr, R. D., Helfant, L.,
  Nakayama,  S., Lindstrom, A.  B., Strynar, M. J., and Lau, C. S. (2007).
  Perfluorooctanoic acid (PFOA)-induced developmental toxicity in the mouse
  is dependent on expression of peroxisome proliferator activated
  receptor-alpha (PPARα). Toxicol. Sci. 98, 571-581.
Akaike, H. (1974).  A new look  at the statistical model identification. IEEE
  Trans. Automat. Contr. 19, 716-723.
Andersen, M. E., Butenhoff, J. L., Chang, S. C., Farrar, D. G., Kennedy, G. L.,
  Jr.,  Lau,  C.,  Olsen,  G. W., Seed,  J.,  and Wallace,  K.  B.  (2008).
  Perfluoroalkyl acids and related chemistries—Toxicokinetics and modes of
  action. Toxicol. Sci. 102, 3-14.
Andersen, M. E., Clewell, H. J.,  Tan, Y.,  Butenhoff, J. L., and Olsen, G. W.
  (2006). Pharmacokinetic  modeling of saturable,   renal  resorption  of
  perfluoroalkylacids in monkeys—Probing the determinants of long plasma
  half-lives. Toxicology 227, 156-164.
Barbee, R. W., Perry, B. D., Re, R. N., and Murgo, J. P. (1992). Microsphere
  and dilution techniques for the  determination of blood flows and volumes in
  conscious mice. Am.  J. Physiol. 263(3 Pt 2),  R728-R733.
Betts, K. S. (2007). Perfluoroalkyl acids: what is the evidence telling us?
  Environ. Health Perspect. 115, A250-A256.
Biegel, L. B., Hurtt, M. E., Frame, S. R., O'Connor, J. C., and Cook, J.  C.
  (2001). Mechanisms of  extrahepatic  tumor induction  by  peroxisome
  proliferators in male CD rats. Toxicol. Sci.  60, 44-55.
Butenhoff, J. L.,  Gaylor, D. W., Moore,  J. A., Olsen,  G. W., Rodricks,  J.,
  Mandel, J. H., and Zobel, L. R. (2004). Characterization of risk for general
  population exposure to perfluorooctanoate. Regul. Toxicol.  Pharmacol. 39,
  363-380.
Calafat, A. M., Wong, L.-Y., Kuklenyik, Z., Reidy, J. A., and Needham, L. L.
  (2007). Polyfluoroalkyl  chemicals in the U.S. population: Data  from the
  National Health and Nutrition  Examination Survey (NHANES) 2003-2004
  and comparisons with NHANES 1999-2000. Environ. Health Perspect. 115,
  1596-1602.
Cheng, X., and Klaassen, C. D. (2008). Critical role of PPAR(alpha) in
  perfluorooctanoic acid- and perfluorodecanoic acid-induced down-regulation
  of Oatp uptake transporters in mouse livers. Toxicol. Sci. 106, 37-45.
Cook, J. C., Murray, S. M., Frame, S. R., and Hurtt, M. E. (1992). Induction of
  Leydig cell adenomas  by  ammonium perfluorooctanoate:  A  possible
  endocrine-related  mechanism. Toxicol. Appl. Pharmacol. 113, 209-217.
Dai, J., Li, M., Jin, Y., Saito, N., Xu, M., and Wei, F. (2006).
  Perfluorooctanesulfonate and perfluorooctanoate in red panda and giant
  panda from China. Environ. Sci. Technol. 40, 5647-5652.
D'Errico, J. R. (2007). Automatic numerical differentiation, MatlabCentral.
  Available from:
  http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=13490.
  Accessed January 31, 2008.
Dietz, R., Bossi, R., Riget, F. F., Sonne, C., and Born, E. W. (2008). Increasing
  perfluoroalkyl contaminants in east Greenland polar bears (Ursus maritimus):
  A new toxic threat to the Arctic bears. Environ. Sci. Technol. 42, 2701-2707.
Han, X., Yang, C. H., Snajdr, S. I., Nabb, D. L., and Mingoia, R. T. (2008).
  Uptake of perfluorooctanoate in freshly  isolated hepatocytes from male and
  female rats. Toxicol. Lett. 181, 81-86.
Harada,  K.,  Inoue, K.,  Morikawa,  A., Yoshinaga,  T.,  Saito,  N.,  and
  Koizumi,  A.  (2005). Renal clearance  of perfluorooctane sulfonate and
  perfluorooctanoate in humans and their species-specific excretion.  Environ.
  Res. 99, 253-261.
Hansen, K. J., Clemen, L. A., Ellefson, M. E., and Johnson, H. O. (2001).
  Compound-Specific, Quantitative Characterization of Organic
  Fluorochemicals in Biological Matrices. Environ. Sci. Technol. 35, 766-770.
Harris, L. A., and Barton, H. A. (2008). Comparing single and repeated dosimetry
  data for perfluorooctane sulfonate in rats. Toxicol. Lett. 181, 148-156.
Kennedy, G. L., Butenhoff, J. L., Olsen, G. W., O'Connor, J. C., Seacat, A. M.,
  Perkins, R. G., Biegel, L. B., Murphy, S. R., and Farrar, D. G. (2004). The
  toxicology of perfluorooctanoate. Crit. Rev. Toxicol. 34, 351-384.
Kudo, N., Katakura, M., Sato, Y., and Kawashima, Y. (2002). Sex
  hormone-regulated renal transport of perfluorooctanoic acid. Chem. Biol.
  Interact. 139, 301-316.
Kudo, N., and Kawashima, Y. (2003). Toxicity and toxicokinetics of
  perfluorooctanoic acid in humans and animals. J. Toxicol. Sci. 28, 49-57.
Kudo, N., Sakai, A., Mitsumoto, A., Hibino, Y., Tsuda, T., and Kawashima, Y.
  (2007). Tissue distribution and hepatic subcellular distribution of
  perfluorooctanoic acid at low dose are different from those at high dose in
  rats. Biol. Pharm. Bull. 30, 1535-1540.
Lagarias, J. C., Reeds, J. A., Wright, M. H., and Wright, P. E. (1998).
  Convergence properties of the Nelder-Mead Simplex Method in Low
  Dimensions. SIAM J. Optim. 9, 112-147.
Lau, C., Anitole, K., Hodes, C., Lai, D., Pfahles-Hutchens, A., and Seed, J.
  (2007). Perfluoroalkyl acids: A review of monitoring and toxicological
  findings. Toxicol. Sci. 99, 366-394.
Lau, C., Thibodeaux, J. R., Hanson, R. G., Narotsky, M. G., Rogers, J. M.,
  Lindstrom, A. B., and Strynar, M. J. (2006). Effects of perfluorooctanoic acid
  exposure during pregnancy in the mouse. Toxicol. Sci. 90, 510-518.
Lindstrom, M. J., and Bates, D. M. (1990). Nonlinear mixed effects models for
  repeated measures data. Biometrics 46, 673-687.
Luippold, G., Pech, B., Schneider, S., Osswald, H., and Mühlbauer, B. (2002).
  Age dependency of renal function in CD-1 mice. Am. J. Physiol. Renal
  Physiol. 282, 886-890.
Maher, J. M., Aleksunes, L. M., Dieter, M. Z., Tanaka, Y., Peters, J. M.,
  Manautou, J. E., and Klaassen, C. D. (2008). Nrf2 and PPAR(alpha)-
  Mediated Regulation of Hepatic Mrp Transporters after Exposure to
  Perfluorooctanoic Acid and Perfluorodecanoic Acid. Toxicol. Sci. 106,
  319-328.
Nakagawa, H., Hirata, T., Terada, T., Jutabha, P., Miura, D., Harada, K. H.,
  Inoue, K., Anzai, N., Endou, H., Inui, K., et al. (2008). Roles of organic
  anion transporters in the renal excretion of perfluorooctanoic acid. Basic
  Clin. Pharmacol. Toxicol. 103, 1-8.
Olsen, G. W., Burris, J. M., Ehresman, D. J., Froehlich, J. W., Seacat, A. M.,
  Butenhoff, J. L., and Zobel, L. R. (2007). Half-life of serum elimination of
  perfluorooctanesulfonate, perfluorohexanesulfonate, and perfluorooctanoate in
  retired fluorochemical production workers. Environ. Health Perspect. 115,
  1298-1305.
Pinheiro, J. C., and Bates, D. M. (2000). In Mixed-Effects Models in S and
  S-PLUS. Springer, New York.
Pinheiro, J. C., Bates, D. M., DebRoy, S., Sarkar, D., and R Core Team
  (2007). In Nlme: Linear and Nonlinear Mixed Effects Models. R package
  version 3, pp. 1-84. The R Foundation for Statistical Computing, Vienna,
  Austria.
Qi, Z., Whitt, I., Mehta, A., Jin, J., Zhao, M., Harris, R. C., Fogo, A. B., and
  Breyer, M. D. (2004). Serial determination of glomerular filtration rate in
  conscious mice using FITC-inulin clearance. Am. J. Physiol. Renal Physiol.
  286, 590-596.
R Development Core Team. (2007). R: A Language and Environment for
  Statistical Computing. R Foundation for Statistical Computing, Vienna,
  Austria. ISBN 3-900051-07-0. http://www.R-project.org.
Shampine, L. F., and Reichelt, M. W. (1997). The Matlab ODE suite. SIAM J.
  Sci. Comput. 18(1), 1-22.
Tan, Y. M., Clewell, H. J., 3rd, and Andersen, M. E. (2008). Time
  dependencies in perfluorooctylacids disposition in rat and monkeys: A
  kinetic analysis. Toxicol. Lett. 177, 38-47.
Trudel, D., Horowitz, L., Wormuth, M., Scheringer, M., Cousins, I. T., and
  Hungerbühler, K. (2008). Estimating consumer exposure to PFOS and
  PFOA. Risk Anal. 28, 251-269.

-------
                                       SINGLE AND REPEATED DOSE PHARMACOKINETICS OF PFOA
                                                                                                                                          341
U.S. EPA. (2005). Draft risk assessment of the potential human health effects
  associated with exposure to perfluorooctanoic acid and its salts (PFOA).
  OPPT, Washington,  DC. Available  from:  http://www.epa.gov/oppt/pfoa/
  pubs/pfoarisk.pdf. Accessed September 28, 2008.
Washburn,  S.  T.,  Bingman,  T. S.,  Braithwaite,  S.  K.,  Buck,  R.  C.,
  Buxton, L. W., Clewell, H. J., Haroun, L. A., Kester, J. E., Rickard, R. W.,
  and Shipp, A. M. (2005). Exposure assessment and risk characterization for
  perfluorooctanoate in selected consumer articles. Environ.  Sci.  Technol.  39,
  3904-3910.
White, S. S., Calafat, A. M., Kuklenyik, Z., Villanueva, L., Zehr, R. D., Helfant, L.,
  Strynar, M. J., Lindstrom, A. B., Thibodeaux, J. R., Wood, C., et al. (2007).
  Gestational PFOA exposure of mice is associated with altered mammary gland
  development in dams and female offspring. Toxicol. Sci. 96, 133-144.
Wolf, C. J., Fenton, S. E., Schmid, J. E., Calafat, A. M., Kuklenyik, Z.,
  Bryant, X. A., Thibodeaux, J. R., Das, K. P., White, S. S., Lau, C. S., et al.
  (2007). Developmental toxicity of perfluorooctanoic acid in the CD-1 mouse
  after cross-foster and restricted gestational exposure. Toxicol. Sci. 95,
  461-473.

-------
Modulation  of TLR2 Protein  Expression by miR-105  in Human
Oral Keratinocytes*
Received for publication, January 28, 2009, and in revised form, May 22, 2009. Published, JBC Papers in Press, June 9, 2009, DOI 10.1074/jbc.M109.013862
Manjunatha R. Benakanakere‡, Qiyan Li‡, Mehmet A. Eskan‡, Amar V. Singh§‖, Jiawei Zhao‡, Johnah C. Galicia‡,
Panagiota Stathopoulou‡, Thomas B. Knudsen§¶, and Denis F. Kinane‡1
From the ‡Center for Oral Health and Systemic Disease and the §Department of Molecular, Cellular, and Craniofacial Biology,
University of Louisville School of Dentistry, Louisville, Kentucky 40202, the ¶National Center for Computational Toxicology,
Environmental Protection Agency, Research Triangle Park, North Carolina 27711, and ‖Lockheed Martin,
Research Triangle Park, North Carolina 27711
  Mammalian biological processes such as inflammation involve
regulation of hundreds of genes controlling onset and termination.
MicroRNAs (miRNAs) can translationally repress target mRNAs and
regulate innate immune responses. Our model system comprised
primary human keratinocytes, which exhibited robust differences in
inflammatory cytokine production (interleukin-6 and tumor necrosis
factor-α) following specific Toll-like receptor 2 and 4 (TLR-2/TLR-4)
agonist challenge. We challenged these primary cells with
Porphyromonas gingivalis (a Gram-negative bacterium that triggers
TLR-2 and TLR-4) and performed miRNA expression profiling. We
identified miRNA (miR)-105 as a modulator of TLR-2 protein
translation in human gingival keratinocytes. There was a strong
inverse correlation between cytokine responses following TLR-2
agonist challenge and miR-105 levels. Knock-in and knock-down of
miR-105 confirmed this inverse relationship. In silico analysis
predicted that miR-105 had complementarity for TLR-2 mRNA, and the
luciferase reporter assay verified this. Further understanding of the
role of miRNA in host responses may elucidate disease susceptibility
and suggest new anti-inflammatory therapeutics.
  The innate immune response is a crucial first line of
defense against pathogens. Host detection of microbes
occurs through pattern recognition receptors, including
Toll-like receptors (TLRs)2 that are expressed on many cells,
including macrophages, monocytes (1), and keratinocytes
(2). To date, 11 TLRs have been identified in humans,
recognizing a range of distinct and conserved microbial molecules
(3). TLRs responding to particular pathogens may activate
complex networks of pathways and interactions, positive
and negative feedback loops, and multifunctional
transcriptional responses (4). Among the key downstream targets of
these networks are NF-κB, mitogen-activated protein
kinases, and members of the IRF family (5). Proper regulation
of the gene products comprising these networks by
transcriptional and post-transcriptional processing is important not
only for selective pathogen elimination but also for preventing
excessive accumulation of cytokines such as interferon-β,
interferon-γ, IL-6, and TNF-α that initiate the host
defense against microbial attack (6). Deregulated expression of
these cytokines has been implicated in cancer, autoimmunity,
and hyper-inflammatory states (7-9).
* This work was supported, in whole or in part, by National Institutes of Health
  Grant DE017384 (to D. F. K.) from the United States Public Health Service,
  NIDCR.
1 To whom correspondence should be addressed: University of Louisville
  School of Dentistry, 501 South Preston St., Rm. 204, Louisville, KY 40202.
  Tel.: 502-852-3175; Fax: 502-852-5572; E-mail: dfkina01@louisville.edu.
2 The abbreviations used are: TLR, Toll-like receptor; HGEC, human gingival
  epithelial cell; IL-6, interleukin 6; TNF-α, tumor necrosis factor α; miRNA,
  microRNA; hsa-miR, Homo sapiens microRNA; pre-miR, precursor
  microRNA; UTR, untranslated region; SOCS3, suppressor of cytokine
  signaling 3; FSL-1, Pam2CGDPKHPKSF (a synthetic diacylated lipoprotein and a
  specific ligand for TLR-2).
  MicroRNAs (miRNAs)  have been implicated in pathway-
level regulation of complex biological processes (10). The role
of miRNA-based regulation of the innate immune responses is
a current topic of investigation (11). Mammalian miRNAs are a
class of conserved, small noncoding RNA oligonucleotides that
function as negative regulators of translation for multiple target
transcripts (12). As many as 5000 distinct miRNAs may be transcribed
and processed in mammalian cells (13-17). Mature
miRNAs bind to specific cognate sequences in the 3'-UTRs of
target transcripts, resulting in  either mRNA degradation or
inhibition of translation (12).
  In mammalian cells, the miRNAs provide a key level of bio-
logical regulation  in developmental and differentiation path-
ways (18). Deregulation of specific miRNA abundance has been
associated with malignancies in the colon, breast, and lung (19,
20). Recently, miRNAs have been shown to modulate the
NF-κB pathway (miR-146a) (21) and negatively regulate
TRAF6, IRAK1 (miR-155) (22), or SOCS3 (miR-203) (23). It is
presently unclear how miRNAs regulate cellular pathways in
innate  and inflammatory processes, where precise control of
complex networks is needed to engage an appropriate response
to microbes that avoids a cytokine storm.
  Periodontitis is a common chronic inflammatory condition
affecting 50% of humans that results in loss of bone and teeth
(24). This disease is initiated by dental plaque, a microbial bio-
film composed mainly of Gram-negative anaerobic bacilli (25,
26), including the pathogen Porphyromonas gingivalis. Individ-
ual human variability in susceptibility to periodontitis is recog-
nized (27) and may involve individual variation in the immune
response  (25, 26). We  identified innate  immune variations
within a bank of over 30 primary human gingival cell cultures
(25) based on variations in cytokine response following TLR
agonist challenge.
AUGUST 21, 2009 - VOLUME 284 - NUMBER 34          JOURNAL OF BIOLOGICAL CHEMISTRY 23107

-------
  The present study tested the hypothesis  that differential
expression of miRNAs may account for some of the variability
in innate immunity. To test this hypothesis, we selected three
"normal" and three "diminished"  cytokine-response pheno-
types. We subjected the corresponding primary human gingival
cell cultures to TLR agonist challenge and profiled the expres-
sion of 600 miRNAs. We found strong up-regulation of hsa-
miR-105 specifically in the diminished cytokine-response phe-
notype and furthermore showed that TLR-2 protein levels were
depressed. This implies  a concordant logic  circuit in which
miR-105 inversely regulates TLR-2 function. A computational
(in silico) search of the miRNA database revealed that TLR-2
transcript is a potential target for  miR-105 regulation at the
3'-UTR. This binding was confirmed using a linked luciferase
reporter gene, and small interference RNA and inhibitor (antagomir)
studies demonstrated a functional association with cell surface
TLR-2 expression. We also confirmed this complementarity.
We conclude that cell surface TLR-2 expression is
inversely regulated by miR-105 expression in human gingival
epithelial cells. This  mechanism may reduce  inflammatory
cytokine production and provide a novel target for therapeutic
intervention.

EXPERIMENTAL PROCEDURES
  Cell Culture and Challenge Assays—A total of 13 human gingival
epithelial cell (keratinocyte) cultures were obtained, with University
of Louisville IRB approval, from healthy patients after third molar
extraction. They were grown as previously described (28) to
sub-confluence, sub-cultured, and challenged as described (2, 29, 30).
At confluence, they were challenged with heat-inactivated
P. gingivalis (strain 33277) or 1 μg/ml FSL-1 (Pam2CGDPKHPKSF, a
synthetic diacylated lipoprotein and a specific ligand for TLR-2)
(InvivoGen, CA). Cells were challenged for 24 h, and IL-6 and TNF-α
levels in the culture supernatants were measured by enzyme-linked
immunosorbent assay (BD Biosciences). The transcription factor NF-κB
assay was performed using a modified electrophoretic mobility shift
assay technique with the TransAM™ NF-κB enzyme-linked immunosorbent
assay kit from Active Motif (Carlsbad, CA) according to the
manufacturer's instructions. HEK-293 (ATCC number: CRL-1573) cells
were cultured following the ATCC protocol. Briefly, the cell monolayer
was washed and incubated with 2-3 ml of trypsin-EDTA solution added to
the flask, and the trypsin was neutralized with trypsin inhibitor after
5 min. The cells were centrifuged and suspended in ATCC-formulated
Eagle's minimum essential medium (catalogue no. 30-2003) with 10%
fetal bovine serum (complete medium). The cells were propagated in
complete medium until they were ready for transfection.
  miRNA Array Profiling/Analysis—Total RNA was collected by the
TRIzol method and purified with a Qiagen purification kit (Qiagen),
and total RNA quality was analyzed using a Bioanalyzer 2100 (Agilent).
Equal amounts of each sample were used to generate a reference pool.
For each array to be hybridized, 2 μg of total RNA from each sample
and from the reference pool were labeled with Hy3™ and Hy5™
fluorescent label, respectively, using the miRCURY™ LNA Array labeling
kit (Exiqon, Denmark) following the manufacturer's instructions. The
Hy3™-labeled sample and the Hy5™-labeled reference pool RNA were mixed
and hybridized to the miRCURY™ LNA array version 8.1 (Exiqon). The
hybridization was performed according to the miRCURY™ LNA array manual
using a Tecan HS4800 hybridization station (Tecan, Austria). The
miRCURY™ LNA array microarray slides were scanned by a ScanArray 4000
XL scanner (Packard Biochip Technologies), and the image analysis was
carried out using the ImaGene 7.0 software (BioDiscovery, Inc.).
Expression ratios were determined for microarray data by computing the
background-corrected fluorescent signal from the query sample
(Q)/reference sample (R). Ratiometric data were transformed to log2 to
produce a continuous spectrum of up- and down-regulated values. Data
were normalized by plotting the difference, log2(Q/R), against the
average, (1/2) log2(Q*R), followed by the application of locally
weighted regression (lowess) to smooth intensity-dependent ratios. The
log2(Hy3/Hy5) intensity data were uploaded into GeneSpring v7.3 for
two-way analysis of variance (factors = cell-line and second
treatment), using a parametric test with variances assumed equal and a
cutoff (p = 0.05), to generate a heat-map through bi-directional
hierarchical clustering (31, 32).
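
The normalization step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure implemented in ImaGene or GeneSpring: the function names are hypothetical, the smoother is a plain locally weighted linear fit with tricube weights and no robustness iterations, and the span `frac` is an arbitrary choice. It computes the difference M = log2(Q/R) and the average A = (1/2) log2(Q*R) for each spot, fits the intensity-dependent trend of M on A, and subtracts it.

```python
import math


def ma_transform(q, r):
    """Return (M, A) lists for paired background-corrected intensities Q and R."""
    m = [math.log2(qi / ri) for qi, ri in zip(q, r)]
    a = [0.5 * math.log2(qi * ri) for qi, ri in zip(q, r)]
    return m, a


def lowess_fit(x, y, frac=0.5):
    """Locally weighted linear regression evaluated at every x (tricube weights)."""
    n = len(x)
    k = max(2, math.ceil(frac * n))        # number of neighbours in each window
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12          # window half-width, guarded against 0
        w = [max(0.0, 1.0 - (abs(xi - x0) / h) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def normalize_log_ratios(q, r, frac=0.5):
    """Subtract the smoothed intensity-dependent trend from log2(Q/R)."""
    m, a = ma_transform(q, r)
    trend = lowess_fit(a, m, frac)
    return [mi - ti for mi, ti in zip(m, trend)]


# Made-up intensities with a uniform two-fold dye bias (Q = 2 * R).
reference = [100.0, 200.0, 400.0, 800.0, 1600.0, 3200.0]
query = [2.0 * value for value in reference]
normalized = normalize_log_ratios(query, reference)
```

With a constant two-fold bias every raw log ratio equals 1, the fitted trend equals 1 as well, and the normalized values come out at zero, which is the behaviour the normalization is meant to produce for spots that do not differ between sample and reference.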
  TLR-2 mRNA and miR-105 Real-time PCR—Total RNA was extracted
from cultured cells using TRIzol reagent (Invitrogen). The isolated
total RNA samples were used for first strand cDNA synthesis with
specific miR-105 hairpin loop primers (Applied Biosystems, Foster
City, CA). Real-time PCR was performed by using 1 ng of cDNA with an
miR-105-specific primer and probe on an ABI 7500 system (Applied
Biosystems) in the presence of TaqMan DNA polymerase. The data were
analyzed by normalizing the miRNA level to RNU48 (a small nucleolar
RNA used as internal control, which has the least variability across
the cell types and challenges). For TLR-2 mRNA quantification, the
total RNA was converted to single-stranded cDNA using a cDNA archive
kit (Applied Biosystems), and 100 ng of cDNA was used to quantify
TLR-2 mRNA using the TaqMan method (Applied Biosystems).
Glyceraldehyde-3-phosphate dehydrogenase was the internal control,
and -fold increase was calculated as described (33).
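
The text does not spell out the fold-increase calculation beyond the citation to reference 33. One widely used option for TaqMan data normalized to an internal control is the comparative Ct calculation, fold = 2^-(ΔCt_treated - ΔCt_control); whether this is the calculation the authors used is an assumption. The sketch below uses hypothetical Ct values and assumes 100% amplification efficiency.

```python
def fold_increase(ct_target_treated, ct_control_gene_treated,
                  ct_target_untreated, ct_control_gene_untreated):
    """Comparative Ct fold change, assuming perfect doubling per cycle."""
    delta_treated = ct_target_treated - ct_control_gene_treated
    delta_untreated = ct_target_untreated - ct_control_gene_untreated
    return 2.0 ** -(delta_treated - delta_untreated)


# Hypothetical cycle thresholds: target versus internal control,
# in challenged and unchallenged cells.
example = fold_increase(24.0, 18.0, 26.0, 18.0)   # two cycles earlier -> 4.0
```

A target that crosses threshold two cycles earlier after challenge, with an unchanged internal control, corresponds to a four-fold increase.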
  Transfection of miRNA—Epithelial cells were transfected with
100 pmol of miR-105 mimic (UCAAAUGCUCAGACUCCUGUGGU) and miR-105
inhibitor (AGTTTACGAGTCTGAGGACACCA) (Dharmacon, CA) and co-transfected
with 100 pmol of small interference RNA control, labeled with
6-carboxyfluorescein to monitor transfection efficiency. The
transfection reaction was performed using FuGENE 6 reagent (Roche
Applied Science). Cells were challenged with P. gingivalis/FSL-1 for
24 h following transfection.
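
The relationship between the two oligonucleotides listed above can be checked directly. The short Python sketch below (function and constant names are illustrative) shows that the inhibitor sequence, as printed, is the position-by-position complement of the mimic sequence when the complement is written in the DNA alphabet. The orientation in which the inhibitor is printed is not stated in the text, so the sketch compares the strings exactly as given.

```python
# Watson-Crick partner of each base, with the result written as DNA.
DNA_PARTNER = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def complement_as_dna(sequence):
    """Return the base-by-base complement of an RNA or DNA string, as DNA."""
    return "".join(DNA_PARTNER[base] for base in sequence.upper())


MIMIC = "UCAAAUGCUCAGACUCCUGUGGU"       # miR-105 mimic, from the text
INHIBITOR = "AGTTTACGAGTCTGAGGACACCA"   # miR-105 inhibitor, from the text

inhibitor_matches = complement_as_dna(MIMIC) == INHIBITOR
```

Both sequences are 23 bases long and the comparison holds at every position, which is consistent with the inhibitor being designed to pair with the full length of mature miR-105.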
  Immunohistochemistry—The cells were seeded onto collagen-coated
chamber glass slides (Lab-Tek™ II Chamber Slide®, Nalgene Nunc
International, Rochester, NY). At 50-60% confluence, the cells were
transfected either with miR-105 mimic or miR-105 inhibitor or with
scrambled small interference RNA using FuGENE 6 transfection reagent
as described above. The transfection reaction was performed for up to
24 h, and the medium was then replaced with fresh medium. The
challenge assay with FSL-1 (0.5 μg/ml, InvitroGen) was performed after
48 h of transfection; the cells were then fixed in 4%
paraformaldehyde, permeabilized, and stained

-------
[Figure 1 image not reproduced. Panel A: heat map of miRNA expression.
Panel B: hsa-miR-105 expression in normal type and diminished type cells.
Panel C: stem loop of pre-miR-105, with the following sequences:
  Mature miR-105    UCAAAUGCUCAGACUCCUGUGGU
  miR-105           UGUCCUCA--GACUCGUAAAC
  3' UTR of TLR2    ATAAGAGTGGCATAGTATTTG
  Antagomir-105     AGTTTACGAGTCTGAGGACACCA]
FIGURE 1. Normal and diminished cytokine response cells were challenged with heat-inactivated P. gingivalis (strain 33277) at 100 multiplicity of
infection for 24 h, and RNA was miRNA microarray profiled using the miRCURY LNA Array (Exiqon). The heat map shows two-way hierarchical clustering
of genes and samples (rows = miRNA, columns = sample). The color scale indicates relative expression: yellow, above mean; blue, below mean; and black, below
background. Global microarray expression revealed distinct profiles for 109 miRNAs expressed among these cells (p = 0.05) out of which, 26 well annotated
miRNAs revealed distinct patterns (p = 0.0037), and miR-105 is represented with a rectangular dotted box (A). This signal is up-regulated in the diminished
cytokine response phenotype after challenging with heat-inactivated P. gingivalis and down-regulated in normal response cells after challenge (p = 0.0017) (B).
The stem loop of miR-105 with complementarity to TLR-2 mRNA and sequence of antagomir used for the present study are shown (C).
       with anti-human TLR-2 antibody overnight at 4 °C followed by
       Alexa Fluor® 488 anti-mouse IgG in 3% bovine serum albumin
       (1/500, InvitroGen) for 1 h at room temperature and SYTO®
       83-orange for 15 min. The stained cultures were photographed
       using a Confocal Laser Scanning Microscope (FV500, Olym-
       pus, Melville, NY).
         Western Blotting—Total protein was extracted from cells
       using radioimmune precipitation assay buffer after 24 h of
       challenge. The Western blot was performed by loading 25 μg of
       total protein onto each lane. Blotted membranes were blocked
       using 5% nonfat milk and incubated at 4 °C overnight in
       anti-TLR-2 antibody (Cell Signaling Technology, Danvers, MA).
       The membranes were washed and incubated in anti-mouse IgG
       conjugated with horseradish peroxidase (Cell Signaling
       Technology) secondary antibody, and signal was developed using
       ECL plus™ Western blotting detection reagent (Amersham
       Biosciences). The ratiometric analysis of band intensity was
       performed using FluorChem HD software (Alpha Innotech).
  Luciferase Reporter Assay—The putative miRNA-105 target site of
52 bp within the 3'-UTR region of human TLR-2 mRNA (Ensembl
transcript ID: ENST00000260010) was synthesized with flanking SpeI and
HindIII restriction enzyme sites. In addition, primers with the
putative binding site mutated were also synthesized by Integrated DNA
Technologies Inc. The primers were as follows:
  sense primer: 5'-gctgactagtCATAGATGATCAAGTCCCTTATAAGAGTGGCATAGTATTTGCATATAACaagcttggac-3';
  antisense primer: 5'-gtccaagcttGTTATATGCAAATACTATGCCACTCTTATAAGGGACTTGATCATCTATGactagtcagc-3';
  mutated sense primer: 5'-gctgactagtCATAGATGATCAAGTCCCTTATAAGAGTGGCATAGTCATATAACaagcttggac-3';
  mutated antisense primer: 5'-gtccaagcttGTTATATGACTATGCCACTCTTATAAGGGACTTGATCATCTATGactagtcagc-3'.
The sense and antisense strands of the oligonucleotides were annealed
(34). The annealed oligonucleotides were digested with SpeI and
HindIII and ligated into the multiple cloning site of the pMIR-REPORT
Luciferase vector (Ambion, Inc.). Expression from the pMIR-REPORT
luciferase vector was thus potentially subject to post-transcriptional
regulation by miRNA interactions with the TLR-2 3'-UTR. We then
transfected cultured HEK293 cells with each of these reporter
constructs (pMIR-TLR2 or pMIR-mutTLR-2), as well as co-transfecting
them with pMIF-cGFP-Zeo-miR-105 (pMIF-miR-105)

-------

[Figure 2 image not reproduced. Panels A-C compare normal type and diminished type cells.]
FIGURE 2. TLR-2 gene, IL-6, and TNF-a cytokine expression in both normal type and diminished cytokine response cells. Human gingival keratinocytes
were challenged with heat-inactivated P. gingivalis (100 MOI) or FSL-1 (1 /ng/ml) for 24 h. Total RNA was isolated for real-time PCR. The TLR-2 receptor mRNA
abundance increased upon TLR-2 ligand challenge in normal cells and was unchanged in diminished response cells after challenge (A), compared with control
(p< 0.001). An enzyme-linked immunosorbent assay was performed on the supernatants, and IL-6 (6) and TNF-a (Q were found to be up-regulated in normal
cells and remained unchanged in diminished response cells after challenge (p< 0.001). Results are mean ± S.D.fn = 3) using the same primary cells. Statistical
comparisons are shown by horizontal bars with asterisks above them (*** indicates p < 0.001 ;NS = no significant difference).
plasmid (System Biosciences) following transfection with
FuGENE 6 as noted above. Luciferase expression was assessed
by confocal microscopy 24 h after transfection by use of
anti-Luciferase antibody (Abcam). The protein extract from the
transfected cells was collected using radioimmune precipitation
assay buffer and equal amounts of protein were tested by a
Luciferase activity assay kit following the manufacturer's
instructions (Stratagene).
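
The annealing step described for the reporter oligonucleotides can be sanity-checked computationally. The following is a minimal sketch, not part of the published protocol: it copies the four primer sequences listed above, confirms that each antisense strand is the reverse complement of its sense strand, checks that the flanking arms carry the SpeI (ACTAGT) and HindIII (AAGCTT) recognition sites, and reports the stretch missing from the mutated sense primer. The function and variable names are illustrative assumptions.

```python
# Illustrative check of the reporter oligonucleotides (names are hypothetical).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, preserving letter case."""
    return seq.translate(COMPLEMENT)[::-1]


def anneals(sense: str, antisense: str) -> bool:
    """True if the two 5'->3' strands are perfect reverse complements."""
    return reverse_complement(sense).upper() == antisense.upper()


def deleted_segment(wild_type: str, mutant: str) -> str:
    """Return the stretch present in wild_type but absent from mutant,
    assuming the two differ by a single contiguous deletion."""
    i = 0
    while i < len(mutant) and wild_type[i] == mutant[i]:
        i += 1
    return wild_type[i:i + len(wild_type) - len(mutant)]


# Sequences as printed in the text (5'->3'); lowercase letters are the
# flanking arms, uppercase letters are the TLR-2 3'-UTR insert.
SENSE = "gctgactagtCATAGATGATCAAGTCCCTTATAAGAGTGGCATAGTATTTGCATATAACaagcttggac"
ANTISENSE = "gtccaagcttGTTATATGCAAATACTATGCCACTCTTATAAGGGACTTGATCATCTATGactagtcagc"
MUT_SENSE = "gctgactagtCATAGATGATCAAGTCCCTTATAAGAGTGGCATAGTCATATAACaagcttggac"
MUT_ANTISENSE = "gtccaagcttGTTATATGACTATGCCACTCTTATAAGGGACTTGATCATCTATGactagtcagc"

SPEI_SITE = "ACTAGT"     # SpeI recognition sequence
HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence

wild_type_ok = anneals(SENSE, ANTISENSE)
mutant_ok = anneals(MUT_SENSE, MUT_ANTISENSE)
has_sites = SPEI_SITE in SENSE.upper() and HINDIII_SITE in SENSE.upper()
missing = deleted_segment(SENSE, MUT_SENSE)
```

The check compares strands case-insensitively because the printed primers use case only to distinguish the cloning arms from the insert.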
  Statistical Analysis—The microarray statistical analysis is
detailed under "miRNA Array Profiling/Analysis" above. The
mRNA fold-increase data were calculated according to the
ΔΔCT method (33). Cytokine data were evaluated by analysis of
variance using the InStat program (GraphPad, San Diego, CA)
with Bonferroni corrections applied. Statistical differences
were declared significant at the p < 0.05 level. Statistically
significant data are indicated by asterisks (p < 0.05 (*),
p < 0.01 (**), and p < 0.001 (***)).
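
As a rough illustration of the calculations named in this paragraph, the sketch below computes a fold change by the ΔΔCT method, applies a Bonferroni adjustment, and maps p values to the asterisk labels used in the figures. It assumes roughly 100% amplification efficiency (the usual premise of the 2^(-ΔΔCT) form). The CT values are made-up example numbers, not data from this study, and the function names are hypothetical.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCT) method.

    Each CT is first normalized to the reference transcript (dCT), and the
    treated sample is then compared with the untreated control (ddCT).
    """
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)


def bonferroni(p_values):
    """Bonferroni-adjusted p values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def significance_label(p):
    """Asterisk convention stated in the text; NS = no significant difference."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "NS"


# Hypothetical CT values: the target crosses threshold 3 cycles earlier
# (relative to the reference) after challenge than in the control.
example_fold = fold_change_ddct(24.0, 18.0, 27.0, 18.0)
example_labels = [significance_label(p) for p in bonferroni([0.0002, 0.004, 0.4])]
```

The Bonferroni step here is the simple multiply-by-m form; the InStat program may implement the correction differently in detail.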

                                                             RESULTS
                                                               MiR-105 Is Up-regulated in Human Gingival Keratinocytes
                                                             with a Diminished Inflammatory Response—We hypothesized
                                                             that miRNAs may play a role in innate immune response vari-
                                                             ation (25) and could differentiate periodontitis disease-suscep-
                                                             tible and disease-resistant subjects. From a bank of over 30 pri-
                                                             mary cell cultures of human gingival epithelial cells (HGECs)
                                                             (2) we selected cultures having a diminished cytokine response
                                                             type and "normal" response type as reported previously. Briefly,
                                                             the rule specification for the latter selected cells that up-regulated
                                                             IL-6 and TNF-α production by at least 2-fold after 4-h
                                                             challenge  with TLR  agonists,  and  the rule  for the former

-------
FIGURE 3. miR-105 and TLR-2 expression in normal and diminished cytokine response cells. The cells were
subjected to P. gingivalis and FSL-1 treatment for 24 h, after which miR-105 expression was quantitated and
TLR-2 was assessed by Western blot. Total RNA was amplified with specific miR-105 hairpin loop primers and
subjected to real-time PCR. miR-105 was up-regulated in diminished cytokine response cells after challenge. The
miR-105 expression showed minimal or no change in normal cells (A) (p = 0.05). Three normal and three diminished
cell types were compared after P. gingivalis challenge for miR-105 gene expression (B) and TLR-2 protein
expression, using Western blot (C) with ratiometric analysis (β-Actin/TLR-2) (D). Statistical comparisons are
shown by bars with asterisks above them (* indicates p < 0.05; NS = no significant difference).
(diminished) phenotype was no significant increase in pro-in-
flammatory cytokine production after challenge. We thus
tested in depth the 3 diminished cytokine response primary
cultures against 3 representatives of the normal HGEC type
chosen from the median of the range of the 13 available cultures
(2).
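
The phenotype rules stated above can be written out as a small decision function. This is only a sketch of the selection logic as described in the text: the 2-fold threshold comes from the rule for normal cells, the 0.05 significance level is borrowed from the Statistical Analysis section, and the "unclassified" outcome and all names are assumptions added for illustration.

```python
def classify_response(il6_fold, tnfa_fold, il6_p, tnfa_p,
                      min_fold=2.0, alpha=0.05):
    """Assign a culture to a cytokine response phenotype.

    normal:      IL-6 and TNF-alpha both up-regulated at least min_fold
                 after TLR agonist challenge.
    diminished:  no significant increase in either cytokine.
    Anything else is left unclassified (a hypothetical third outcome).
    """
    if il6_fold >= min_fold and tnfa_fold >= min_fold:
        return "normal"
    if il6_p >= alpha and tnfa_p >= alpha:
        return "diminished"
    return "unclassified"


# Made-up example inputs (fold change after challenge, p value vs. control).
example_normal = classify_response(3.1, 2.4, 0.0004, 0.002)
example_diminished = classify_response(1.1, 0.9, 0.6, 0.8)
```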
  We performed chip-based miRNA profiling from  normal
and diminished cytokine response cells challenged with heat-
inactivated P. gingivalis, a TLR-2 and TLR-4 agonist. The data
were deposited in the Gene Expression Omnibus data base
under   the  platform   GPL7423  and  series   GSE13042
(www.ncbi.nlm.nih.gov/geo). In  the  preliminary  statistical
analysis, the normalized miRNA  data were  analyzed by two-
way analysis of variance (phenotype, treatment, and interac-
tion) for all miRNA chips. Of the 600 miRNA species tested on
the platform, 95 miRNAs were significantly different between
phenotypes and  45 between treatments.  Among the 109
miRNAs that were differentially expressed only 26 were well
annotated. These  26 were significantly altered by challenge
with  heat-inactivated  P. gingivalis when compared with
unstimulated cells (p = 0.0038). The miR-105 signal was mark-
edly  down-regulated in normal versus diminished cells after
TLR agonist challenge (Fig. 1A) (p = 0.0017).
  In silico analyses revealed 1060 hits for miR-105 targets
and predicted complementarity to TLR-2 (Fig. 1C). Because
we noted miR-105 up-regulation in the diminished cytokine
response cells, we constructed a computational matrix for
miR-105 targets, which revealed 37 potential mRNA targets
across species. In the matrix, hsa-miR-577 and hsa-miR-19b
were also differentially expressed between normal and
diminished cells but to a lesser degree than miR-105 (see
dotted box in Fig. 1A). Numerical expression values were
plotted to show the difference between normal and diminished
cell types (Fig. 1B). For this reason, we focused on miR-105.
  TLR-2 mRNA Up-regulation Correlates with Pro-inflammatory
Cytokines in Gingival Keratinocytes—Because P. gingivalis
signals through both TLR-2 and TLR-4 (2, 35) we evaluated
TLR-2 mRNA by quantitative real-time PCR after challenging
both cell types with FSL-1, a specific agonist to TLR-2.
Analysis of TLR-2 mRNA abundance between phenotypes revealed
a 2.5-fold up-regulation with P. gingivalis challenge and a
4.5-fold up-regulation with FSL-1 challenge relative to the
unstimulated control (p = 0.001) (Fig. 2A). In striking
contrast, diminished cytokine response cells did not show
this up-regulation (Fig. 2A). TLR-2 induction upon P.
gingivalis or FSL-1 stimulation is consistent with TLR-2
recognition of P. gingivalis (36, 37). Furthermore, IL-6
cytokine protein levels corresponded to the respective mRNA
levels. Normal response cells up-regulated their IL-6
(Fig. 2B) and TNF-α (Fig. 2C) production following P.
gingivalis or FSL-1 challenge, and again this response was
not evident in diminished response cells (Fig. 2, B and C).
We also explored IL-12p40 secretion in the gingival
epithelial cells after P. gingivalis and FSL-1 challenge.
The primary gingival epithelial cells do not secrete
IL-12p40.³ This has to be further verified and hence is not
included in the present study.
  Modulation of TLR-2 Protein Expression by miR-105—Con-
firmation of the differential expression of miR-105 was sought
by a non-array method. The real-time PCR data indicated an
8-fold up-regulation of miR-105 following P. gingivalis chal-
lenge, and an 11-fold increase following FSL-1 challenge, in
diminished-response cells relative to  an internal benchmark,
miRNA RNU48 (Fig. 3, A and B). Consistent with the microarray
data, the normal response cells did not show significant
up-regulation of miR-105  (Fig. 3).
³ M. R. Benakanakere, Q. Li, M. A. Eskan, A. V. Singh, J. Zhao, J. C. Galicia, P.
  Stathopoulou, T. B. Knudsen, and D. F. Kinane, unpublished data.

-------
FIGURE 4. The level of TLR-2 protein by Western blot after transfecting
miR-105 antagomir. The diminished type cells transfected with miR-105
antagomir up-regulated TLR-2 protein expression upon TLR-2 agonist
challenge (A). The ratiometric analysis (β-Actin/TLR-2) of Western blot
intensity showing TLR-2 protein expression (B). Similarly, IL-6 (C) and
TNF-α (D) were up-regulated in diminished cell types when miR-105
antagomir was transfected and challenged either with P. gingivalis or
FSL-1. Results are mean ± S.D. (n = 3) representative of two independent
experiments. Statistical comparisons are shown by bars with asterisks
above them (*** indicates p < 0.001; NS = no significant difference).

   Search  of the open miRNA data base revealed over 1060
potential target transcripts for miR-105. These potential target
genes were loaded onto Ingenuity pathway analysis (Ingenuity
Systems Inc.) software to discover the most  significant  path-
ways associated  with the global miR-105  target genes. This
analysis revealed a preponderance of immune diseases (data
not shown) and identified TLR-2 as one of the important targets.
Because microarray profiling revealed miR-105 as the
most significant miRNA discriminating between normal and
diminished HGEC phenotypes, we tested for evidence that
miR-105 expression would down-regulate TLR-2 as well as
pro-inflammatory cytokines in the 3 selected cell types, as well as
across the broader panel of 13 HGECs available. This analysis
revealed  a generalized,  strong inverse correlation between
TLR2 protein and miR-105 gene expression (Fig. 3, B and C).
   We  next sought to determine  whether miR-105  directly
modulated TLR-2 mRNA and/or protein expression in HGECs.
To test this, we transfected miR-105 mimic (same sequence as
mature miR-105) and miR-105 antagomir (sequence comple-
mentary to miR-105, which blocks its effect) to diminished
cytokine cell phenotypes. After miR-105 transfection, the cells
were challenged for 24 h with either heat-inactivated P. gingi-
valis or FSL-1. The diminished cytokine response cells trans-
fected with miR-105 antagomir up-regulated the TLR-2 protein
levels in either challenge (Fig. 4A) compared with mock trans-
fected cells. This indicates a role for miR-105 in modulating
TLR-2 protein expression and implies  a post-transcriptional
repression of  TLR-2  translation.  However, the  miR-105
antagomir control cells expressed higher TLR-2 protein versus
non-transfected cells. This implies a role for miR-105 in mod-
ulating basal receptor expression, an inference supported by
the correlation of TLR-2 and miR-105  (Fig. 3, B  and C). Evi-
dence for a functional relationship was  sought by transfecting
miR-105 antagomir  into diminished cytokine response cells
challenged with P. gingivalis and FSL-1. This scenario induced
significantly higher IL-6 and TNF-α compared with mock control,
confirming that miR-105 overexpression is biologically relevant
(Fig. 4, C and D). Because we previously showed that P.
gingivalis activates NF-κB (38) and induces cytokines (39), the
induction of IL-6 in miR-105 mimic and antagomir-transfected
cells was further verified by measuring NF-κB activity by a
modified electrophoretic mobility shift assay technique (FACE kit,
Active Motif). The miR-105 inhibitor-transfected cells exhibited
increased NF-κB activity upon P. gingivalis and FSL-1 ligand
challenge (data not shown). The normal cell phenotype transfected
with miR-105 mimic down-regulated NF-κB activation
after P. gingivalis or FSL-1 challenge, suggesting that miR-105
induction is dependent on NF-κB.
  MiR-105 Modulated Surface TLR-2 Expression—The func-
tional relationship was further confirmed by Western blot for
TLR-2 and immunohistochemistry  after inhibiting  miR-105.
Overexpression of miR-105 mimic in normal  cells challenged
with FSL-1  suppressed TLR-2 protein levels  compared with
scrambled miRNA target or mock transfection (Fig. 5A). We
then sought to determine the effect of reducing the miR-105
levels by transfecting miR-105 antagomir into the cell type with
miR-105 up-regulation. Lysates of cells  challenged with FSL-1
had increased TLR-2 protein (Fig. 5B). The antagomir control
had increased TLR-2  protein level confirming  our  TLR-2
mRNA  observations (Fig.  3A), which correlated  with the
expression of miR-105.
  Surface expression of TLR-2 immunoreactivity was  con-
firmed by confocal microscopy. TLR-2  expression was down-
regulated in  cells transfected with miR-105 mimic and chal-
lenged with FSL-1 (Fig. 6D). In contrast, TLR-2 expression was
retained in cells transfected with miR-105 antagomir (Fig. 6C).
This confirmed that miR-105  up-regulation suppressed TLR-2
protein expression on the cell surface. These data are consistent
with observations of another miRNA, let-7i, and the link
between surface TLR-4 expression, NF-κB activation, and
cytokine modulation.
  To test if we had predicted correctly the binding site for miR-
105 on TLR-2, miR-105 binding to the  3'-UTR of TLR-2 was
assessed by cloning a putative cognate 22-bp fragment from 332
bp  away from the  stop  codon  (Ensembl  transcript  ID:
ENST00000260010) to the multiple cloning site located at the
3'-UTR of the Luciferase reporter gene in pMIRREPORTER. A
mutant vector was also constructed by deleting the predicted
AGTTTA binding site of miR-105 and cloning the  fragment
into a multiple cloning site located in the 3'-UTR of the Lucif-
erase reporter gene. Co-transfection of HEK293 cells  with
Luciferase construct (pMIR-TLR2) and a vector overexpressing
miR-105 precursor (pMIF-miR-105) inhibited Luciferase activity
(Fig. 7A, panel C). The mutant Luciferase vector (pMIR-mutTLR2)
co-transfected with pMIF-miR-105 retained luciferase activity
(Fig. 7A, panel D). Cell lysates had significantly less
luciferase from samples co-transfected with pMIR-TLR2
and pMIF-miR-105 (Fig. 7B). This confirms our in silico
prediction of the binding site. We may conclude that modulation
of TLR-2 expression by miR-105 occurs through binding to the
3'-UTR of TLR-2 mRNA, thus inhibiting TLR-2 translation.

-------
FIGURE 5. The expression of TLR-2 in epithelial cells following miR-105 mimic and antagomir transfection.
The normal cells were transfected with miR-105 mimic, and diminished cells were transfected with miR-105
antagomir and challenged with FSL-1 (1 μg/ml) for 24 h. Protein (20 μg) was loaded onto each well and detected by
anti-TLR-2 antibody for chemiluminescence detection. The normal cells transfected with miR-105 down-regulated
TLR-2 protein expression (A), and the diminished cell type transfected with miR-105 antagomir up-regulated TLR-2
protein level (B). Ratios of Western blot intensity data (β-Actin/TLR-2) are represented in C and D, respectively.
FIGURE 6. Surface expression of TLR-2 in gingival keratinocytes. Normal cells were transfected with either miR-105
mimic or antagomir and stained with anti-TLR-2 antibody clone TL2.3 (eBiosciences) with proper isotype control
(mouse IgG2a). TLR-2 was detected by immunohistochemistry and photographed by confocal microscopy. Control
(A). FSL-1-treated cells increased TLR-2 expression (B). miR-105 inhibitor did not affect the surface expression of
TLR-2, but the expression was maximized after FSL-1 challenge (C). miR-105 mimic suppressed TLR-2 surface
expression even after challenging with FSL-1 as seen by confocal microscopy (D). Bright field overlay with merged SYTO®
83-orange-stained nucleus and Alexa Fluor® 488-stained TLR-2 (i), SYTO® 83-orange (ii), Alexa Fluor® 488 (iii), and
merged image (iv).
                                            DISCUSSION
                                              This study identified miR-105 as a
                                            modulator of TLR-2 protein transla-
                                            tion in human gingival keratinocytes.
                                            There was a strong inverse correla-
                                            tion between cells that naturally had
                                            high cytokine  responses  following
                                            TLR-2 agonist challenge and  miR-
                                            105 levels. Knock-in and knock-down
                                            of miR-105 confirmed this inverse
                                            relationship.  In  silico analysis pre-
                                            dicted that miR-105 had complemen-
                                            tarity for TLR-2 mRNA, and the lucif-
                                            erase reporter assay verified this.
                                              Recently,  miRNAs  have  been
                                            shown to fine-tune innate immune
                                            responses (40). For example,
                                            miR-146a/b was up-regulated in an
                                            NF-κB-dependent manner (21). In
                                            another study, IL-6 induced let-7a-
                                            modulated  apoptosis  in  cholan-
                                            giocytes (41) and up-regulated miR-
                                            19a and -19b while down-regulating
                                            SOCS1 (suppressor of cytokine sig-
                                            naling 1), a gene important in nega-
                                            tive  regulation   of  TLR  signaling
                                            (42). The present study adds miR-
                                            105 to the panel of miRNA species
                                            known to influence innate immune
                                            function.
                                              Human miR-105  is  located  on
                                            the intronic region  of  GABRA3A
                                            (γ-aminobutyric acid receptor 3a),
                                            which resides on the X chromosome.
                                            Certain types of tumor cells have been
                                            shown to transcribe miR-105 but lack
                                            processing machinery in  the nucleus
                                            to form mature miRNA (43). It is still
                                            unclear how miR-105 is processed
                                            and exported out of the nucleus in
                                            gingival epithelial cells. Perhaps it is
                                            analogous to miR-155, which is pres-
                                            ent on the B-cell integration cluster
                                            transcript up-regulated with
                                            polyriboinosinic-polyribocytidylic acid
                                            or the cytokine interferon-β challenge.
                                            The up-regulated B-cell integration
                                            cluster transcript with miR-155 pre-
                  cursor undergoes processing to export mature miR-155 out of
                  the nucleus, which suppresses the macrophage inflammatory
                  response via c-Jun NH2-terminal kinase pathway (21).
-------
Panel B schematic: 5'-CMV | Luciferase | 3'-UTR of TLR2
   pMIR-TLR2:    GUUAUAAGAGUGGCAUAGUAUUUG
   pMIR-mutTLR2: GUUAUAAGAGUGGCAUAG(del)
FIGURE 7. The putative miRNA-105 target site within the 3'-UTR of
human TLR-2 mRNA (Ensembl transcript ID: ENST00000260010). Primers
containing the predicted binding site and primers with the binding site
mutated were synthesized, annealed, digested with SpeI and HindIII, and
ligated into the multiple cloning site of the pMIRREPORT Luciferase vector.
Cultured HEK293 cells with each of these reporter constructs were
co-transfected with pMIF-cGFP-Zeo-miR-105 plasmid and assessed for
Luciferase expression by confocal microscopy. The positive control had only
the pMIRREPORTER Luciferase construct (A (A)). The negative control had
pMIRREPORTER β-galactosidase co-transfected with pMIF-cGFP-Zeo-miR-105
(A (B)). Cells with pMIRREPORTER Luciferase co-transfected with
pMIF-cGFP-Zeo-miR-105 plasmid showed decreased Luciferase expression
(A (C)), and cells with a mutated vector pMIRREPORTER Luciferase
co-transfected with pMIF-cGFP-Zeo-miR-105 vector retained Luciferase
expression (A (D)). A Luciferase reporter vector with the potential binding
site is represented in B. The activity of Luciferase was measured using a
Luciferase assay kit (Stratagene). The Luciferase activity was significantly
down-regulated in cells co-transfected with pMIF-cGFP-Zeo-miR-105
(pMIF-miR-105) and pMIRREPORTER-TLR2 (plasmid containing the putative
miR-105 binding region of TLR-2) but not in cells co-transfected with
pMIF-miR-105 and pMIRREPORTER-mutTLR2 (plasmid containing the mutated
miR-105 binding region of TLR-2) (C). Results are mean ± S.D. of triplicates
and are representative of three independent experiments. Statistical
comparisons are shown by bars with asterisks above them (** indicates
p < 0.01; NS = no significant difference).
  High basal secretion of IL-6 seen in deficient cell types may
be explained by multiple factors such as activation of NF-κB at
the basal level or cAMP signaling. IL-6 production is not solely
dependent on NF-κB, and the IL-6 gene in epithelial cells contains
cAMP-responsive elements on the promoter that are
important for its transcriptional regulation (35, 44).
  Agonist availability and receptor compartmentalization are
pivotal  in regulating TLR  signaling (45). TLR  itself may be
degraded, preventing ligand activation, or its protein expres-
sion may be inhibited (45). It has been shown that expression of
TLR-4 is necessary for intestinal homeostasis (46), increases in
TLR-4 are pathogenic in lupus-like autoimmune disease (47),
and increased TLR-2 levels are associated with the response to
vaccinia virus (48). Taken together, the receptor surface levels
regulate pathogen recognition and inflammatory responses.
  Up-regulation of miR-203 has been shown to inhibit the suppressor
of cytokine signaling 3 (SOCS3) involved in inflammatory
responses and keratinocyte function (23), repress the
expression of p63-promoting differentiation of epithelium (49),
and repress stemness by targeting DeltaNp63 (50). In contrast
to miR-105, the TLR ligands tested did not up-regulate miR-155
in our epithelial cell model, although both miR-146a/b
were up-regulated (data not shown). This suggests that miR-155
is not involved in the down-regulation of epithelial cell
cytokines, but miR-146a/b might have a role in inhibiting
IRAK1 and TRAF6 protein expression as previously observed
(21).
  It is unclear how miRNA-105 itself is regulated  in gingival
epithelial cells, because its precursor resides on intron I of the
GABRA3A  gene within the X chromosome and has  a neuro-
transmitter function (51). Lithium has been used as  a potent
drug against affective neurological disorders (52). By inhibiting
glycogen synthase kinase 3 (53), lithium invokes an anti-inflammatory
response (54). The γ-aminobutyric acid receptor has
been shown to inhibit immune responses of T cells, and mod-
ulation of this receptor influences T-cell responses and autoim-
mune  diseases  (55). These data  also suggest  that  intronic
regions are involved in regulating cell function.
  Our findings in cells with a diminished cytokine response reflect the
observation made with TLR-2 knock-out mice (37) in that the
low inflammatory response correlates with the low level of
TLR-2 expression in these cells. These data also confirm other
studies (21,22,34) showing that miRNAs may play a crucial role
in modulating immune function. Although miR-105 targets
TLR-2 and can induce down-regulation of cytokine production
in primary epithelial cell cultures, it is unlikely that low cytokine
response  is solely  explained by  miR-105  regulation. Instead,
miR-105  may play an important role in fine-tuning TLR-2
response  to control  excessive inflammation. Further under-
standing of the mechanism of miR-105 and other targets may
lead to  a  better understanding of variations in  inflammatory
responses within the oral mucosa and to new anti-inflamma-
tory therapeutics.
REFERENCES
 1.  Akira, S., Uematsu, S., and Takeuchi, O. (2006) Cell 124, 783-801
 2.  Kinane, D. F., Shiba, H., Stathopoulou, P. G., Zhao, H., Lappin, D. F., Singh,
     A., Eskan, M. A., Beckers, S., Waigel, S., Alpert, B., and Knudsen, T. B.
     (2006) Genes Immun. 7, 190-200
23114  JOURNAL OF BIOLOGICAL CHEMISTRY  VOLUME 284 - NUMBER 34 - AUGUST 21, 2009

-------
 3.  Akira, S., and Takeda, K. (2004) Nat. Rev. Immunol. 4, 499-511
 4.  Bartel, D. P. (2004) Cell 116, 281-297
 5.  Liew, F. Y., Xu, D., Brint, E. K., and O'Neill, L. A. (2005) Nat. Rev. Immunol.
     5, 446-458
 6.  Lynn, D. J., Winsor, G. L., Chan, C., Richard, N., Laird, M. R., Barsky, A.,
     Gardy, J. L., Roche, F. M., Chan, T. H., Shah, N., Lo, R., Naseer, M., Que, J.,
     Yau, M., Acab, M., Tulpan, D., Whiteside, M. D., Chikatamarla, A., Mah,
     B., Munzner, T., Hokamp, K., Hancock, R. E., and Brinkman, F. S. (2008)
     Mol. Syst. Biol. 4, 218
 7.  Farh, K. K., Grimson, A., Jan, C., Lewis, B. P., Johnston, W. K., Lim, L. P.,
     Burge, C. B., and Bartel, D. P. (2005) Science 310, 1817-1821
 8.  Gregory, R. I., Yan, K. P., Amuthan, G., Chendrimada, T., Doratotaj, B.,
     Cooch, N., and Shiekhattar, R. (2004) Nature 432, 235-240
 9.  Chendrimada, T. P., Gregory, R. I., Kumaraswamy, E., Norman, J., Cooch,
     N., Nishikura, K., and Shiekhattar, R. (2005) Nature 436, 740-744
10.  Ambros, V. (2004) Nature 431, 350-355
11.  Fritz, J. H., Girardin, S. E., and Philpott, D. J. (2006) Sci STKE 2006, pe27
12.  Lee, Y., Kim, M., Han, J., Yeom, K. H., Lee, S., Baek, S. H., and Kim, V. N.
     (2004) EMBO J. 23, 4051-4060
13.  Fazi, F., Rosa, A., Fatica, A., Gelmetti, V., De Marchis, M. L., Nervi, C., and
     Bozzoni, I. (2005) Cell 123, 819-831
14.  O'Donnell, K. A., Wentzel, E. A., Zeller, K. I., Dang, C. V., and Mendell,
     J. T. (2005) Nature 435, 839-843
15.  Pasquinelli, A. E., Hunter, S., and Bracht, J. (2005) Curr. Opin. Genet. Dev.
     15, 200-205
16.  Chen, C. Z., Li, L., Lodish, H. F., and Bartel, D. P. (2004) Science 303,
     83-86
17.  Monticelli, S., Ansel, K. M., Xiao, C., Socci, N. D., Krichevsky, A. M., Thai,
     T. H., Rajewsky, N., Marks, D. S., Sander, C., Rajewsky, K., Rao, A., and
     Kosik, K. S. (2005) Genome Biol. 6, R71
18.  Esau, C., Kang, X., Peralta, E., Hanson, E., Marcusson, E. G., Ravichandran,
     L. V., Sun, Y., Koo, S., Perera, R. J., Jain, R., Dean, N. M., Freier, S. M.,
     Bennett, C. F., Lollo, B., and Griffey, R. (2004) J. Biol. Chem. 279,
     52361-52365
19.  Calin, G. A., Sevignani, C., Dumitru, C. D., Hyslop, T., Noch, E., Yendamuri,
     S., Shimizu, M., Rattan, S., Bullrich, F., Negrini, M., and Croce,
     C. M. (2004) Proc. Natl. Acad. Sci. U.S.A. 101, 2999-3004
20.  Lu, J., Getz, G., Miska, E. A., Alvarez-Saavedra, E., Lamb, J., Peck, D.,
     Sweet-Cordero, A., Ebert, B. L., Mak, R. H., Ferrando, A. A., Downing, J. R.,
     Jacks, T., Horvitz, H. R., and Golub, T. R. (2005) Nature 435, 834-838
21.  Taganov, K. D., Boldin, M. P., Chang, K. J., and Baltimore, D. (2006) Proc.
     Natl. Acad. Sci. U.S.A. 103, 12481-12486
22.  O'Connell, R. M., Taganov, K. D., Boldin, M. P., Cheng, G., and Baltimore,
     D. (2007) Proc. Natl. Acad. Sci. U.S.A. 104, 1604-1609
23.  Sonkoly, E., Wei, T., Janson, P. C., Saaf, A., Lundeberg, L., Tengvall-Linder,
     M., Norstedt, G., Alenius, H., Homey, B., Scheynius, A., Ståhle, M., and
     Pivarcsi, A. (2007) PLoS ONE 2, e610
24.  Kinane, D. F. (2000) Ann. R. Australas Coll. Dent. Surg. 15, 42-50
25.  Kinane, D. F., Galicia, J. C., Gorr, S. U., Stathopoulou, P. G., and
     Benakanakere, M. (2008) Front. Biosci. 13, 966-984
26.  Spratt, D. (2003) in Medical Biofilms: Detection, Prevention, and Control
     (Jass, J., Surman, S., and Walker, J., eds) pp. 175-198, Wiley and Sons, Ltd.,
     London
27.  Kinane, D. F., and Hart, T. C. (2003) Crit. Rev. Oral Biol. Med. 14, 430-449
28.  Shiba, H., Venkatesh, S. G., Gorr, S. U., Barbieri, G., Kurihara, H., and
     Kinane, D. F. (2005) J. Periodontal Res. 40, 153-157
29.  Eskan, M. A., Rose, B. G., Benakanakere, M. R., Zeng, Q., Fujioka, D.,
     Martin, M. H., Lee, M. J., and Kinane, D. F. (2008) Eur. J. Immunol. 38,
     1138-1147
30.  Eskan, M. A., Benakanakere, M. R., Rose, B. G., Zhang, P., Zhao, J.,
     Stathopoulou, P., Fujioka, D., and Kinane, D. F. (2008) Infect. Immun. 76,
     2080-2089
31.  Eisen, M. B., Spellman, P. T., Brown, P. O., and Botstein, D. (1998) Proc.
     Natl. Acad. Sci. U.S.A. 95, 14863-14868
32.  Singh, A. V., Green, M., States, J. C., and Knudsen, T. B. (2007) Birth
     Defects Res. Part A. Clin. Mol. Teratol. 79, 319-350
33.  Livak, K. J., and Schmittgen, T. D. (2001) Methods 25, 402-408
34.  Chen, X. M., Splinter, P. L., O'Hara, S. P., and LaRusso, N. F. (2007) J. Biol.
     Chem. 282, 28929-28938
35.  Eskan, M. A., Hajishengallis, G., and Kinane, D. F. (2007) Infect. Immun. 75,
     892-898
36.  Darveau, R. P., Pham, T. T., Lemley, K., Reife, R. A., Bainbridge, B. W.,
     Coats, S. R., Howald, W. N., Way, S. S., and Hajjar, A. M. (2004) Infect.
     Immun. 72, 5041-5051
37.  Burns, E., Bachrach, G., Shapira, L., and Nussbaum, G. (2006) J. Immunol.
     177, 8296-8300
38.  Brozovic, S., Sahoo, R., Barve, S., Shiba, H., Uriarte, S., Blumberg, R. S., and
     Kinane, D. F. (2006) Microbiology 152, 797-806
39.  Sandros, J., Karlsson, C., Lappin, D. F., Madianos, P. N., Kinane, D. F., and
     Papapanou, P. N. (2000) J. Dent. Res. 79, 1808-1814
40.  Gantier, M. P., Sadler, A. J., and Williams, B. R. (2007) Immunol. Cell Biol.
     85, 458-462
41.  Meng, F., Henson, R., Wehbe-Janek, H., Smith, H., Ueno, Y., and Patel, T.
     (2007) J. Biol. Chem. 282, 8256-8264
42.  Pichiorri, F., Suh, S. S., Ladetto, M., Kuehl, M., Palumbo, T., Drandi, D.,
     Taccioli, C., Zanesi, N., Alder, H., Hagan, J. P., Munker, R., Volinia, S.,
     Boccadoro, M., Garzon, R., Palumbo, A., Aqeilan, R. I., and Croce, C. M.
     (2008) Proc. Natl. Acad. Sci. U.S.A. 105, 12885-12890
43.  Lee, E. J., Baek, M., Gusev, Y., Brackett, D. J., Nuovo, G. J., and Schmittgen,
     T. D. (2008) RNA 14, 35-42
44.  Krueger, J., Ray, A., Tamm, I., and Sehgal, P. B. (1991) J. Cell. Biochem. 45,
     327-334
45.  Miggin, S. M., and O'Neill, L. A. (2006) J. Leukoc. Biol. 80, 220-226
46.  Rakoff-Nahoum, S., Paglino, J., Eslami-Varzaneh, F., Edberg, S., and
     Medzhitov, R. (2004) Cell 118, 229-241
47.  Liu, B., Yang, Y., Dai, J., Medzhitov, R., Freudenberg, M. A., Zhang, P. L.,
     and Li, Z. (2006) J. Immunol. 177, 6880-6888
48.  Zhu, J., Martinez, J., Huang, X., and Yang, Y. (2007) Blood 109, 619-625
49.  Yi, R., Poy, M. N., Stoffel, M., and Fuchs, E. (2008) Nature 452, 225-229
50.  Lena, A. M., Shalom-Feuerstein, R., di Val Cervo, P., Aberdam, D., Knight,
     R. A., Melino, G., and Candi, E. (2008) Cell Death Differ. 15, 1187-1195
51.  Bell, M. V., Bloomfield, J., McKinley, M., Patterson, M. N., Darlison, M. G.,
     Barnard, E. A., and Davies, K. E. (1989) Am. J. Hum. Genet. 45, 883-888
52.  Rowe, M. K., and Chuang, D. M. (2004) Expert Rev. Mol. Med. 6, 1-18
53.  Phiel, C. J., and Klein, P. S. (2001) Annu. Rev. Pharmacol. Toxicol. 41,
     789-813
54.  Martin, M., Rehani, K., Jope, R. S., and Michalek, S. M. (2005) Nat.
     Immunol. 6, 777-784
55.  Tian, J., Chau, C., Hales, T. G., and Kaufman, D. L. (1999) J. Neuroimmunol.
     96, 21-28

-------
                                               Reproductive Toxicology 27 (2009) 373-386
                                              Contents lists available at ScienceDirect
                                             Reproductive Toxicology
journal homepage: www.elsevier.com/locate/reprotox
Pharmacokinetic modeling of perfluorooctanoic acid during gestation and
lactation in the mouse*

Chester E. Rodriguez*, R. Woodrow Setzer, Hugh A. Barton1
US Environmental Protection Agency, Office of Research and Development, National Center for Computational Toxicology, Research Triangle Park, NC 27711, United States
ARTICLE   INFO

Article history:
Received 14 November 2008
Received in revised form 3 February 2009
Accepted 19 February 2009
Available online 4 March 2009

Keywords:
Perfluorooctanoic acid
Pharmacokinetics
Modeling
Mice
Rats
Developmental toxicity
Gestation
Lactation
                                         ABSTRACT
       Perfluorooctanoic acid (PFOA) is a processing aid for the polymerization of commercially valuable flu-
       oropolymers. Its widespread environmental distribution, presence in human blood, and adverse effects
       in animal toxicity studies have triggered attention to its potential adverse effects to humans. PFOA is
       not metabolized and exhibits dramatically different serum/plasma half-lives across species. Estimated
       half-lives for humans, monkeys, mice, and female rats are 3-5 years, 20-30 days, 12-20 days, and 2-4 h,
       respectively. Developmental toxicity is one of the most sensitive adverse effects associated with PFOA
       exposure in rodents, but its interpretation for risk assessment is currently hampered by the lack of
       understanding of the inter-species pharmacokinetics of PFOA. To address this uncertainty, a biologically
       supported dynamic model was developed whereby a two-compartment system linked via placental blood
       flow described gestation and milk production linked a lactating dam to a growing pup litter compartment.
       Postnatal serum levels of PFOA for 129S1/SvImJ mice at doses of 1 mg/kg or less were reasonably
       simulated while prenatal and postnatal measurements for CD-1 mice at doses of 1 mg/kg or greater were
       simulated via the addition of a biologically based saturable renal resorption description. Our results sug-
       gest that at low doses a linear model may  suffice for describing the pharmacokinetics of PFOA while a
       more complex model may be needed at higher doses. Although mice may appear more sensitive based
       on administered dose of PFOA, the internal dose metrics estimated in this analysis indicate that they may
       be equal or less sensitive than rats.
                                                                        Published by Elsevier Inc.
 * The United States Environmental Protection Agency through its Office of
Research and Development funded and managed the research described here. It
has been subjected to Agency administrative review and approved for publication.
Approval does not signify that the contents reflect the views of the Agency, nor
does mention of trade names or commercial products constitute endorsement or
recommendation for use.
 * Corresponding author at: The National Center for Computational Toxicology
(B205-1), Office of Research and Development, US Environmental Protection Agency,
109 T.W. Alexander Dr., Research Triangle Park, NC 27711, United States.
Tel.: +1 919 541 0447; fax: +1 919 541 1194.
   E-mail addresses: rodriguez.chester@epa.gov (C.E. Rodriguez),
habarton@alum.mit.edu (H.A. Barton).
 1 Current address: Pfizer, Inc., PDM PK/PD Modeling, Eastern Point Road,
MS 8220-4328, Groton, CT 06340, United States.

0890-6238/$ - see front matter. Published by Elsevier Inc.
doi:10.1016/j.reprotox.2009.02.009

1. Introduction

   Perfluorooctanoic acid (PFOA) is a synthetic, fully fluorinated
alkyl acid that has been produced industrially for several decades
for use primarily as an emulsifier in the aqueous polymerization
of fluoropolymers such as polytetrafluoroethylene (Teflon®).
Fluoropolymers made through the use of PFOA exhibit valuable
commercial properties that include water and oil repellency,
high stability, and inertness, and thus find extensive utilization
in numerous sectors including aerospace, automotive,
building/construction, chemical processing, electrical and electronics,
semiconductor, textile, biomedical, and others. The manufacture
and use of PFOA have been accompanied by considerable direct
emissions of this chemical to the environment. Since 1951 when
PFOA began to be used industrially, it has been estimated that
cumulative global emissions are in the range of 2400-5200 tonnes
[1].
   Emissions of PFOA may also be indirect in nature, through the
manufacture and use of precursor compounds that can be degraded
to PFOA under normal environmental conditions. One of these
compounds is 8-2 fluorotelomer alcohol (8:2-FTOH) which has been
produced for surface protection applications. 8:2-FTOH is volatile
and has been shown to readily undergo atmospheric oxidation to
PFOA [2,3]. The biotic degradation of 8:2-FTOH in aquatic media has
also been shown to yield PFOA as a terminal product [4].
   The chemical properties exhibited by PFOA include exceptional
stability, low volatility, and inertness which are ideal for
its intended commercial applications, but have also resulted in a
non-biodegradable and persistent environmental pollutant [5,6].
Recent reports of widespread environmental distribution, presence
in human blood and wildlife samples, and toxic effects in laboratory
animal studies have generated significant toxicological and regulatory
interest.

-------
C.E. Rodriguez et al. / Reproductive Toxicology 27 (2009) 373-386
Particularly concerning is the long serum half-life
observed in humans (estimated at ~3-5 years) for whom reported
serum levels range from low parts per billion for the general US pop-
ulation to low parts per million for occupationally exposed workers
and other highly exposed populations [7-10]. Thus, even though
human serum levels of PFOA may be considered low, a long half-life
could be indicative of the  tendency of this  chemical to bioaccu-
mulate, potentially leading to higher body burdens and associated
long-term health risks.
   The pharmacokinetics of PFOA is unusual  in that serum/plasma
clearance can vary dramatically across species, and for some, across
gender.  Among the species examined to date, humans exhibit
the longest  plasma half-life at 3-5 years, followed by  monkeys
at 21-30 days,  and mice at 12-20 days [9-11]. The suggestion
of concentration-dependent changes in renal excretion,  however,
means these half-life estimates may reflect clearance only at rela-
tively low plasma concentrations [12]. In the case of the rat, a very
dramatic gender difference in plasma  clearance is observed. The
male adult rat clears PFOA with a plasma half-life of about 6 days
while a half-life that is more than 30-fold shorter, at 2-4 h, is observed
for its female counterpart [9,10]. It should be noted that clearance
of PFOA likely reflects removal of only the parent compound since
the carbon-fluorine bond is too strong for mammalian metabolic
systems to cleave and  no metabolites have been reported to date
[13]. The underlying mechanisms for the differential retention of
PFOA across  species/sexes remain to be conclusively demonstrated
but evidence suggests  a role for organic anion transporters in the
liver and kidneys [12,14,15].
   Developmental  toxicity has been demonstrated in rodent
species to be one of the  most sensitive adverse  effects associated
with PFOA exposure. When coupled to human studies reporting low
levels of PFOA in umbilical cord blood [16], neonatal blood collected
immediately after birth [17], and breast milk [18,19], the need to
understand the significance and relevance of animal developmen-
tal toxicity studies becomes more critical for doing human health
risk assessment. In particular, the implications of the differences in
plasma clearance on findings from developmental toxicity studies
carried out in rodents have been largely unaddressed and represent
a substantial source of uncertainty in risk analysis of PFOA. In the
rat, reported developmental effects from two generation reproduc-
tive toxicity studies at an administered daily oral gavage dose of
30 mg/kg-d PFOA include deleterious effects in the F1 generation
in the form  of increased post-weaning mortality, pre- and post-
weaning body weight (BW) deficits, and delayed sexual maturation.
Fewer developmental effects were reported at 10 mg/kg-d, and little
or no effects at 3 mg/kg-d [20]. In an inbred mouse strain, increased
neonatal mortality has been reported at an administered oral gav-
age dose as low as 0.6 mg/kg-d [21], and neonatal BW deficits and
eye opening delays are among the effects reported at 1 mg/kg-d
[21,22]. At higher administered  doses, mammary gland  develop-
ment alterations in both lactating dams and female pups have
been reported [23] along with prenatal effects that can range from
maternal weight gain deficits and delayed parturition to  full-litter
resorption [22,24]. For all  developmental  endpoints examined,
mice were more sensitive than rats based on administered doses of
PFOA.
   The interpretation  of animal developmental toxicity studies
can greatly benefit from information on the pharmacokinetics of
PFOA during gestation and lactation particularly for reconciling the
disparities of the large interspecies differences in plasma  clear-
ance. In the case of the female rat, for instance, it is unclear if
the higher administered doses required to induce developmental
toxicity as compared to mice are  reflective of  the faster plasma
clearance of PFOA in this species and/or lower inherent sensitivity.
Indeed, with a half-life of 2-4 h, the female  rat  being dosed daily
can be expected to clear most  of the chemical  by  24 h,  resulting
in a repeated episodic pharmacokinetic profile that would con-
trast with the bioaccumulative profile expected for other species
with much slower plasma clearance. Consequently, toxicity findings
from rat studies are difficult to interpret especially for any type of
cross-species extrapolation. In contrast, the female mouse has been
proposed to be more amenable for interpretation since its pharma-
cokinetic profile  should resemble that of other species including
humans in its bioaccumulative nature [22]. From a risk assessment
perspective, it is  critical to select the most appropriate laboratory
species for cross-species extrapolation of toxic effects, and pharma-
cokinetic information can be very helpful to that end in the case of
PFOA.
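
   The contrast between episodic and bioaccumulative profiles described above
can be illustrated with a short calculation. The sketch below is not part of
the published model: it assumes simple one-compartment, first-order
elimination and once-daily bolus dosing, and the function names are
hypothetical. The half-lives are those quoted in the text.

```python
def fraction_remaining(half_life_h: float, elapsed_h: float) -> float:
    """Fraction of a dose still present after elapsed_h hours,
    assuming first-order (exponential) elimination."""
    return 0.5 ** (elapsed_h / half_life_h)


def accumulation_ratio(half_life_h: float, dosing_interval_h: float = 24.0) -> float:
    """Steady-state peak amount relative to the first-dose peak for
    repeated bolus doses given every dosing_interval_h hours."""
    return 1.0 / (1.0 - fraction_remaining(half_life_h, dosing_interval_h))


# Half-lives taken from the text; the dosing scheme is an assumption.
half_lives_h = {
    "female rat, 2 h": 2.0,
    "female rat, 4 h": 4.0,
    "male rat, 6 d": 6 * 24.0,
    "mouse, 12 d": 12 * 24.0,
    "mouse, 20 d": 20 * 24.0,
}

for label, t_half in half_lives_h.items():
    left = fraction_remaining(t_half, 24.0)
    ratio = accumulation_ratio(t_half)
    print(f"{label:>16}: {left:7.2%} left at 24 h, accumulation ratio {ratio:5.1f}")
```

Under these assumptions the female rat retains under 2% of a dose at 24 h and
shows essentially no accumulation, whereas a 12-day half-life leaves about 94%
at 24 h and gives a roughly 18-fold accumulation at steady state.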
   Biologically based pharmacokinetic modeling may provide the
best approach for addressing the lack of understanding of the inter-
species pharmacokinetics of PFOA during gestation and lactation in
animal developmental toxicity studies. A pharmacokinetic model
that describes the  critical physiological and anatomical changes
associated with  gestation and  lactation can be very useful for
estimating internal dose as a means  of reconciling the large  dif-
ferences in plasma  clearance across species. Such a model can also
be used for estimating the dose to the offspring in order to eval-
uate the  impact  of the default risk assessment approach that is
based solely on the maternal dose, even when the adverse effects
are observed in the offspring. Since the critical window of expo-
sure for postnatally observed developmental effects (e.g., neonatal
BW deficits and  neonatal lethality)  may be during the prenatal
period, a  pharmacokinetic model can also  be used for estimat-
ing the appropriate dose metric (prenatal versus postnatal internal
dose) for a given  adverse effect. At a minimum, a pharmacokinetic
model of gestation  and lactation can be used to maximize the util-
ity of the limited  information available and help prioritize research
needs.
   This report describes the  development  and application  of a
biologically based  two-compartment model for  describing  the
pharmacokinetics of PFOA during gestation and lactation in  the
mouse. The model is based on  limited data available but incor-
porates  the critical  changes in growth, placental blood  flow,
PFOA-partitioning,  and milk production. Gestation was described
with a pregnant dam and a concepti compartment linked via
placental  blood  flow  while  milk production  linked a  lactating
dam to a growing  pup litter compartment. The overall goal for
developing the model was to  address some of the uncertain-
ties  associated  with the  pharmacokinetics of PFOA in animal
developmental toxicity  studies. The  specific objectives were as
follows:
  - to explore the possibilities in model structure that can be
    supported by serum information reported from mouse developmental
    toxicity studies;
  - to provide estimates of internal dose for dam and offspring during
    gestation and lactation;
  - to compare internal dose estimates to those reported for the rat as
    a means for addressing differences in plasma clearance in these
    two species;
  - to help identify research needs for the development of a more
    elaborate pharmacokinetic model.
   The results suggest that at lower doses of PFOA a linear model
may suffice for describing the pharmacokinetic behavior of PFOA
in gestation and lactation, while a renal resorption component may
be necessary to describe the non-linear behavior at higher doses.
The model was used to derive estimates of internal dosimetry for
dams as well as pups. Some of the implications of these findings as
well as the information gaps encountered are discussed.

-------
Table 1
Mathematical expressions describing the growth of maternal tissues during gestation.

Tissue            Pre-pregnancy fraction of BW (a)   Mathematical expression (a,b,c)          Duration
Uterus (vu)       0.002                              pvu = vu × (1 + 0.077 × (gd - 3)^1.6)    gd3-18
Mammary (vmg)     0.01                               pvmg = vmg × (1 + 0.27 × gd)             gd0-18
Carcass fat (vf)  0.07                               pvf = vf × (1 + 0.0165 × gd)             gd0-18
Liver (vl)        0.04                               pvl = vl × (1 + 0.0255 × (gd - 6))       gd6-18

(a) Taken from [25]; gd denotes gestation day.
(b) The designation p denotes tissue weight during pregnancy.
(c) vu, vmg, vf, and vl are the respective pre-pregnancy tissue weights.
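
The expressions in Table 1 can be evaluated directly. The following Python
sketch is illustrative only (the original model was coded in acslX). It reads
the uterus exponent as 1.6, and it assumes that each tissue stays at its
pre-pregnancy weight before its listed start day and stops growing after gd18;
the function names are hypothetical.

```python
def _clamp(gd: float, start: float, end: float = 18.0) -> float:
    """Restrict gd to the duration over which an expression applies (assumption)."""
    return min(max(gd, start), end)


def maternal_tissue_weights(bw_g: float, gd: float) -> dict:
    """Tissue weights (g) at gestation day gd for a dam whose
    pre-pregnancy body weight is bw_g, following Table 1."""
    vu = 0.002 * bw_g   # uterus, pre-pregnancy
    vmg = 0.01 * bw_g   # mammary tissue, pre-pregnancy
    vf = 0.07 * bw_g    # carcass fat, pre-pregnancy
    vl = 0.04 * bw_g    # liver, pre-pregnancy
    return {
        "uterus": vu * (1 + 0.077 * (_clamp(gd, 3) - 3) ** 1.6),
        "mammary": vmg * (1 + 0.27 * _clamp(gd, 0)),
        "fat": vf * (1 + 0.0165 * _clamp(gd, 0)),
        "liver": vl * (1 + 0.0255 * (_clamp(gd, 6) - 6)),
    }


# Example: a 23.0 g non-pregnant female, the 129S1/SvImJ value used in the text.
for day in (0, 6, 12, 18):
    weights = maternal_tissue_weights(23.0, day)
    summary = ", ".join(f"{name} {w:.3f} g" for name, w in weights.items())
    print(f"gd{day}: {summary}")
```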
2.  Materials and methods

2.1.  Model code

   All pharmacokinetic models were coded and implemented in acslX (Version
2.04, Aegis Technologies, Huntsville, AL). The total simulation time was set to 39 full
days (or 936 h). Gestation constituted the first 18 full days of simulation with the first
24 h designating gestation day 1 (gd1). Parturition was set to take place at the end of
gd18 with the subsequent 24 h constituting postnatal day 1 (pnd1). The appropriate
24 h adjustment was accounted for in cases where experimental data were obtained
using the designation of gd0 and/or pnd0 for the 24 h following mating and birth,
respectively.
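
The day-numbering convention above can be made explicit with a small helper.
This is a sketch in Python rather than the acslX code used by the authors; the
function names are hypothetical, and assigning the instant t = 432 h to pnd1 is
an assumption about the boundary.

```python
HOURS_PER_DAY = 24
GESTATION_DAYS = 18
TOTAL_DAYS = 39  # 18 gestation days + 21 postnatal days = 936 h


def study_day(t_h: float) -> tuple:
    """Map simulation time in hours to ('gd', n) or ('pnd', n), both 1-based."""
    if not 0 <= t_h < TOTAL_DAYS * HOURS_PER_DAY:
        raise ValueError("time outside the 936 h simulation window")
    day = int(t_h // HOURS_PER_DAY) + 1
    if day <= GESTATION_DAYS:
        return ("gd", day)
    return ("pnd", day - GESTATION_DAYS)


def to_model_day(reported_day: int, zero_based: bool) -> int:
    """Apply the 24 h adjustment for data reported with gd0/pnd0 numbering."""
    return reported_day + 1 if zero_based else reported_day


print(study_day(0.0))     # first hour of gestation
print(study_day(432.0))   # first hour after parturition
print(to_model_day(17, zero_based=True))
```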
3. Model structure

3.1. Gestation

   Gestation was described as a two-compartment system consisting
of pregnant dam and concepti linked by placental plasma flow
(Qcon). The pregnant dam compartment accounted for maternal
tissues that increase in growth during gestation, namely uterus,
mammary tissue, carcass fat, and liver. Mathematical expressions
describing the growth of these tissues during gestation have been
previously reported and are listed along with their respective
pre-pregnancy fractions of BW in Table 1 [25]. BW values of 23.0 and
26.7 g were used for 129S1/SvImJ and CD-1 adult non-pregnant
female mice, respectively, based on information from respective
studies [21,24].
   The concepti compartment consists of embryo/fetuses and all
of the associated tissues including placentas, and its growth was
estimated as the difference between total pregnant female BW
and the contribution of maternal tissues beginning with the
non-pregnant adult female mouse BW. Raw BW measurements for
129S1/SvImJ pregnant mice throughout gestation were kindly provided
by Abbott et al. [21] and analyzed using the R statistical
software package [26]. The resulting mathematical expression that
describes the BW changes during gestation is listed in Table 2.
Similarly for CD-1 mice, raw BW measurements kindly provided
by Lau et al. [22] were analyzed for fit using R to the
expression listed in Table 2. BW information of only untreated
groups was considered for this analysis since maternal weight
gain deficits were not observed in 129S1/SvImJ pregnant mice
[21] and only at the higher administered doses of PFOA in pregnant
CD-1 mice [22]. Furthermore, in modeling treated groups,
only those administered doses that did not affect litter size are
being considered, namely 0.1-1.0 and 1-10 mg/kg-d PFOA for
129S1/SvImJ and CD-1 mice, respectively [21,22]. Since gestation
is being described strictly as a flow-limited model, the concepti
was modeled beginning with the onset of placenta blood flow and
any growth before this time (gd0-5) was attributed to maternal
tissues.
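
   A flow-limited exchange between two compartments of the kind described
above could be sketched as follows. This is not the authors' acslX
implementation: the volumes, flow, partition coefficient, and first-order
elimination term are hypothetical placeholders, the compartments are held at
fixed size, and a simple Euler step is used for clarity.

```python
def simulate_exchange(dose, v_dam, v_con, q_con, p_con, k_el, t_end_h, dt_h=0.01):
    """Euler integration of a flow-limited dam/concepti exchange.

    dose   amount placed in the dam compartment at t = 0
    v_dam  dam distribution volume (L), hypothetical
    v_con  concepti volume (L), hypothetical
    q_con  placental plasma flow (L/h), hypothetical
    p_con  concepti:plasma partition coefficient, hypothetical
    k_el   first-order elimination rate from the dam (1/h), hypothetical
    """
    a_dam, a_con = dose, 0.0
    for _ in range(int(round(t_end_h / dt_h))):
        c_dam = a_dam / v_dam
        c_con = a_con / v_con
        # Net transfer from dam to concepti, driven by plasma flow.
        transfer = q_con * (c_dam - c_con / p_con)
        a_dam += dt_h * (-transfer - k_el * a_dam)
        a_con += dt_h * transfer
    return a_dam, a_con


# With no elimination the total amount is conserved and the concentration
# ratio approaches the partition coefficient.
a_dam, a_con = simulate_exchange(
    dose=1.0, v_dam=0.02, v_con=0.005, q_con=0.003, p_con=0.5, k_el=0.0, t_end_h=24.0
)
print(a_dam + a_con, (a_con / 0.005) / (a_dam / 0.02))
```

In the model described in the text, compartment sizes and placental flow change
with gestation day, so constant values are used here only to keep the example
short.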
   The development of placental blood flow was described by a
mathematical expression beginning at gd6 as previously reported
[25]. Accounting for differences in gestation times, the predicted
profile was in agreement with a measurement reported in the
published literature of 1.26 ± 0.54 mL/min/g placenta for pregnant
Balb/c mice at gd16 [27]. If the 21-day gestation period of Balb/c
mice is assumed to scale proportionally to the 18-day gestation
period used in this model for 129S1/SvImJ and CD-1 mice, gd16
would be equivalent to the modeled gd13. Such scaling has been
described previously for addressing differences in gestation times
for rodents [25]. Using a placental weight of 0.0858 g for gd13 [28],
the reported placental blood flow measurement was expressed as
0.108 ± 0.046 mL/min and plotted along with the predicted profile
(Fig. 2B). Reported serum levels of PFOA were simulated assuming
a hematocrit fraction of 0.45 to obtain the corresponding plasma
flow rate.
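
The arithmetic behind this comparison can be checked with a short sketch. The numbers are those quoted in the paragraph above; the function and variable names are illustrative only, and the conversion from blood flow to plasma flow as blood flow x (1 - hematocrit) is an assumption about how the hematocrit fraction was applied.

```python
# Worked arithmetic for the placental blood flow comparison.
# Values come from the text; names are hypothetical.

def scale_gestational_day(gd, from_days, to_days):
    """Scale a gestational day proportionally between gestation lengths."""
    return gd * to_days / from_days

# Balb/c gd16 (21-day gestation) mapped onto the 18-day model gestation.
# The result is fractional; the text treats it as the modeled gd13.
scaled_gd = scale_gestational_day(16, from_days=21, to_days=18)

flow_per_g_mean = 1.26      # mL/min/g placenta [27]
flow_per_g_sd = 0.54        # mL/min/g placenta [27]
placenta_weight_g = 0.0858  # g at gd13 [28]

# Express the measurement per placenta rather than per gram.
flow_mean = flow_per_g_mean * placenta_weight_g
flow_sd = flow_per_g_sd * placenta_weight_g

# ASSUMPTION: plasma flow = blood flow x (1 - hematocrit).
hematocrit = 0.45
plasma_flow_mean = flow_mean * (1.0 - hematocrit)

print(f"scaled gd: {scaled_gd:.2f}")
print(f"blood flow: {flow_mean:.3f} +/- {flow_sd:.3f} mL/min")
print(f"plasma flow (assumed conversion): {plasma_flow_mean:.4f} mL/min")
```

The per-placenta values reproduce the 0.108 ± 0.046 mL/min quoted in the text to three decimal places.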

       3.2.  Lactation

   Lactation was described as a dynamic dam and pup litter
compartment linked by a milk compartment (Fig. 3). BW information
for lactating dams was derived from a study with MF1 mice [29] and
implemented in the model by applying the same percent increase
to the predicted BW of pregnant 129S1/SvImJ and CD-1 mice at
the end of gestation (excluding concepti). This manipulation of the
data avoids discontinuities in the model that may result from strain
differences. The resulting BW values for 129S1/SvImJ and CD-1
lactating dams were analyzed for fit to the mathematical expressions
listed in Table 2 (GraphPad Prism, San Diego, CA).
   Raw BW measurements for 129S1/SvImJ and CD-1 pups
throughout lactation were obtained from the respective toxicity
studies [21,22] and analyzed for fit to the corresponding
expressions listed in Table 2 using the R statistical software
package [26]. Pup BW information of only the untreated group was
considered for this analysis since BW gain deficits are less than
25% at the higher administered doses to the dam (i.e., 1 mg/kg-d and
5 mg/kg-d or higher for 129S1/SvImJ and CD-1 mice, respectively)
[21,22,24]. Moreover, it is unknown at this time if the observed BW
gain deficits of the nursing pups are reflective of lower milk
consumption, which would decrease the lactational transfer of PFOA.
In the absence of
Table 2
Mathematical expressions used to simulate body weight changes in pregnant/lactating dams and nursing pups.

                             Mathematical expression (g)
CD-1 pregnant dam            BW = 31.08 + 1.07 x (gd-10) + 0.136 x (gd-10)^2 + 0.00582 x (gd-10)^3 - 0.000379 x (gd-10)^4
CD-1 lactating dam           BW = 30.40 + 0.78 x pnd - 0.028 x pnd^2
CD-1 pup                     BW = 1.235 + 0.22 x pnd + 0.0654 x pnd^2 - 0.0055 x pnd^3 + 0.000145 x pnd^4
129S1/SvImJ pregnant dam     BW = 24.96 + 0.56 x (gd-10) + 0.0653 x (gd-10)^2 + 0.000846 x (gd-10)^3 - 0.000268 x (gd-10)^4
129S1/SvImJ lactating dam    BW = 25.2 + 0.548 x pnd + 0.00234 x pnd^2 - 0.000963 x pnd^3
129S1/SvImJ pup              BW = 1.42 - 0.189 x pnd + 0.114 x pnd^2 - 0.00673 x pnd^3 + 0.000129 x pnd^4

-------
C.E. Rodriguez et al. / Reproductive Toxicology 27 (2009) 373-386
such information, pup BW changes are being modeled as those of
the untreated group for both mouse strains.
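
As an illustration, the Table 2 expressions can be evaluated with a small helper. This is a minimal sketch: the coefficients are transcribed from Table 2, while the dictionary layout and function names are hypothetical and not part of the original model code.

```python
# Evaluate the Table 2 body weight polynomials (result in grams).
# Coefficients are listed from the constant term upward.

def poly(coeffs, x):
    """Evaluate c0 + c1*x + c2*x**2 + ... using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

BW_COEFFS = {
    ("CD-1", "pregnant dam"): (31.08, 1.07, 0.136, 0.00582, -0.000379),
    ("CD-1", "lactating dam"): (30.40, 0.78, -0.028),
    ("CD-1", "pup"): (1.235, 0.22, 0.0654, -0.0055, 0.000145),
    ("129S1/SvImJ", "pregnant dam"): (24.96, 0.56, 0.0653, 0.000846, -0.000268),
    ("129S1/SvImJ", "lactating dam"): (25.2, 0.548, 0.00234, -0.000963),
    ("129S1/SvImJ", "pup"): (1.42, -0.189, 0.114, -0.00673, 0.000129),
}

def body_weight_g(strain, stage, day):
    """Body weight in grams.

    `day` is the gestational day (gd) for pregnant dams and the
    postnatal day (pnd) otherwise. Pregnant-dam expressions are
    written in terms of (gd - 10), as in Table 2.
    """
    x = day - 10 if stage == "pregnant dam" else day
    return poly(BW_COEFFS[(strain, stage)], x)

for strain in ("CD-1", "129S1/SvImJ"):
    print(strain, "dam at gd18:", round(body_weight_g(strain, "pregnant dam", 18), 2))
    print(strain, "pup at pnd10:", round(body_weight_g(strain, "pup", 10), 2))
```

Keeping the coefficients in one table makes it straightforward to compare the transcription against Table 2 line by line.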
   Mouse milk yield information for a litter size of 10 was obtained
from a report by Knight et al. [30], expressed on a per pup basis,
and fitted to the following cubic expression (GraphPad Prism):

$$ MY(pnd) = 0.484 + 0.124 \times pnd - 0.00510 \times pnd^{2} + 0.0000570 \times pnd^{3} $$

MY and pnd denote milk yield in gram/day/pup and postnatal day,
respectively. In the absence of strain-specific MY information, the
expression of MY on a per pup basis is a reasonable way of
minimizing the impact of strain-dependent differences in litter size.
   It was assumed that all the milk produced was consumed by the
pups without loss or delay. The rate of milk production by the dam
(and suckling rate of nursing pups) was assumed constant and
modeled as continuous without any circadian variation, as previously
modeled for rats [31].

3.3. Pup excreta recirculation

   Neonatal rodents are reportedly unable to eliminate waste without
maternal stimulation, which involves the dams consuming pup
urine [32,33]. This process was modeled as previously reported [31]
as an additional input to the dam, whereby the amount of PFOA
eliminated by the pups during the first two weeks of life would be
transferred back to the lactating dam.

3.4. Renal resorption

   The kidney resorption model was adapted from a report by
Russel et al. [34]. It was first implemented for 129S1/SvImJ adult
non-pregnant female mice assuming a constant BW of 21 g and the
physiological parameters listed in Table 3. The sum of glomerular
filtration rate and renal plasma flow rate equals the total kidney
plasma flow for the mouse (assuming a hematocrit fraction of 0.45)
[35].

4. PFOA-specific pharmacokinetic parameters

   Absorption and elimination of PFOA were described as first
order processes. PFOA is well-absorbed with a time of maximum
observed concentration of less than 4 h in adult mice [11]. An
absorption rate constant of 0.537/h estimated for the adult CD-1
mouse was assumed to be the same for all pregnant and lactating
dams [11]. In the absence of any absorption information for nursing
pups, all of the PFOA contained in milk was assumed to be absorbed
without loss or delay.
   A volume of distribution of 0.135 L/kg BW estimated for adult
non-pregnant CD-1 mice [11] was assumed to also apply to all mice
including concepti and nursing pups.
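
To make the first order description concrete, the sketch below evaluates the textbook closed-form solution of a linear one-compartment model after a single oral dose. It is an illustration only and not the model used in the paper: complete bioavailability, a single dose, and constant body weight are assumptions, and the function name is hypothetical. The rate constants and volume of distribution are the adult CD-1 values reported in the text [11].

```python
import math

KA_PER_H = 0.537     # first order absorption rate constant [11]
KE_PER_H = 0.00185   # first order elimination rate constant, adult CD-1 [11]
VD_L_PER_KG = 0.135  # volume of distribution [11]

def serum_conc_mg_per_l(dose_mg_per_kg, t_h,
                        ka=KA_PER_H, ke=KE_PER_H, vd=VD_L_PER_KG):
    """Serum concentration after one oral dose (linear one-compartment model).

    ASSUMPTIONS: complete absorption of the dose and ka != ke.
    """
    if t_h < 0:
        raise ValueError("time must be non-negative")
    scale = dose_mg_per_kg * ka / (vd * (ka - ke))
    return scale * (math.exp(-ke * t_h) - math.exp(-ka * t_h))

for t in (0, 4, 24, 24 * 7):
    print(f"t = {t:4d} h: {serum_conc_mg_per_l(1.0, t):.2f} mg/L")
```

Because the elimination rate constant is small relative to the absorption rate constant, the simulated concentration declines slowly after absorption, which is consistent with the accumulation seen under daily dosing.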
Table 3
Physiological parameters used to describe renal resorption in the mouse.

Renal resorption parameter                      Value
Cardiac output (QCC, L/h/kg^0.75)               16.5 a
Kidney blood flow (QRC, fraction of QC)         0.091 a
Glomerular filtration rate (GFR, L/h/kg BW)     0.378 b
Urine flow rate (L/h/kg BW)                     0.000303 c
Volume of renal plasma (fraction of BW)         0.00067 d
Volume of renal filtrate (fraction of BW)       0.000097 d
 a Obtained from [35].
 b Obtained from [47].
 c Obtained from [48].
 d Estimated by BW-scaling of values reported for the dog [34].

Table 4
Embryo/fetus:maternal serum partition coefficients (Pe/f) for PFOA a.

            gd9     gd10    gd13    gd15    gd18
Pe/f        0.05    0.07    0.06    0.10    0.20
 a Estimated from [36].


            4.1. Gestation

   The elimination rate constant for 129S1/SvImJ pregnant mice
was estimated to be 0.00233/h by optimization (acslX, Version 2.04,
Aegis Technologies) against serum levels of PFOA in non-lactating
dams (pregnant dams that did not give birth to viable pups or
whose pups died shortly after birth) at pnd22. These pregnant dams
were dosed daily from gd1-17 with 0.1, 0.3, 0.6, and 1.0 mg/kg PFOA
[21]. The estimated elimination rate constant was scaled
allometrically to account for BW changes associated with gestation
and lactation.
   An elimination rate constant of 0.00185/h estimated for adult
CD-1 mice [11] was also assumed to apply to CD-1 nursing pups.
   Disposition of PFOA during gestation was described using linear
interpolation (TABLE function, acslX) of embryo/fetus:maternal
serum partition coefficients (designated as Pe/f) that were
estimated from serum information kindly provided by M. Henderson
[36] (Table 4). In the absence of any other information, the same
Pe/f value as gd9 was assumed for gd6-8.
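
The interpolation of Pe/f can be sketched as follows. The breakpoints are the Table 4 values plus the stated gd6-8 assumption; the function name is hypothetical, and holding the end values constant outside gd6-18 is an assumption, since the behavior of the acslX TABLE function outside the tabulated range is not described in the text.

```python
import bisect

# Table 4 values, with gd6 added at the gd9 value per the stated assumption.
GD_POINTS = [6, 9, 10, 13, 15, 18]
PEF_POINTS = [0.05, 0.05, 0.07, 0.06, 0.10, 0.20]

def pef(gd):
    """Linearly interpolated embryo/fetus:maternal serum partition coefficient.

    ASSUMPTION: values are held constant outside the gd6-gd18 range.
    """
    if gd <= GD_POINTS[0]:
        return PEF_POINTS[0]
    if gd >= GD_POINTS[-1]:
        return PEF_POINTS[-1]
    i = bisect.bisect_right(GD_POINTS, gd)
    x0, x1 = GD_POINTS[i - 1], GD_POINTS[i]
    y0, y1 = PEF_POINTS[i - 1], PEF_POINTS[i]
    return y0 + (gd - x0) * (y1 - y0) / (x1 - x0)

for day in (7, 9.5, 11.5, 16.5):
    print(f"gd{day}: Pe/f = {pef(day):.3f}")
```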

            4.2. Lactation

   Lactational transfer was described as a first order process. The
first order rate constant was estimated as previously described for
rats [31] via the following expression:

$$ k_{lac} = \frac{P_m \times V_m}{24 \times V_{dam}} $$

where Pm refers to the milk:maternal serum partition coefficient
(assumed constant throughout lactation). Vm is the volume of milk
in liters produced in 24 h obtained from mouse milk yield
information already described [37]. Vdam is the volume of
distribution of PFOA in liters for the lactating dam assuming a value
of 0.135 L/kg BW [11]. The only unknown in the klac expression is Pm
which, in the absence of any milk-partitioning information, was
estimated to be 0.0285 by optimization against serum levels of
lactating dams and nursing pups [21]. The same value of Pm was
assumed for CD-1 mice. A Pm value of 0.0285 is much lower than that
estimated for rats at 0.1 [38].
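
The klac expression can be evaluated as in the sketch below, which combines it with the per-pup milk yield expression and the 129S1/SvImJ lactating dam BW expression from Table 2. Two inputs are assumptions made only for illustration: the litter size used to convert per-pup yield to total milk volume, and a milk density of 1 g/mL used to convert grams to liters. All function names are hypothetical.

```python
P_M = 0.0285         # milk:maternal serum partition coefficient (optimized value)
VD_L_PER_KG = 0.135  # volume of distribution [11]

def milk_yield_g_per_day_per_pup(pnd):
    """Cubic milk yield expression given in the text."""
    return 0.484 + 0.124 * pnd - 0.00510 * pnd**2 + 0.0000570 * pnd**3

def dam_bw_g_129s1(pnd):
    """129S1/SvImJ lactating dam body weight (Table 2), in grams."""
    return 25.2 + 0.548 * pnd + 0.00234 * pnd**2 - 0.000963 * pnd**3

def k_lac_per_h(pnd, litter_size, milk_density_g_per_ml=1.0):
    """First order lactational transfer rate constant (1/h).

    ASSUMPTIONS: `litter_size` and a milk density of 1 g/mL are
    illustrative choices, not values taken from the paper.
    """
    # Vm: liters of milk produced by the dam in 24 h.
    vm_l = (milk_yield_g_per_day_per_pup(pnd) * litter_size
            / milk_density_g_per_ml / 1000.0)
    # Vdam: volume of distribution of the dam in liters.
    vdam_l = VD_L_PER_KG * dam_bw_g_129s1(pnd) / 1000.0
    return P_M * vm_l / (24.0 * vdam_l)

print(f"klac at pnd10, litter of 10: {k_lac_per_h(10, 10):.5f} per h")
```

Because both Vm and Vdam change with postnatal day, klac varies over lactation even though Pm is held constant.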

4.3. Renal resorption

   The transport affinity constant (Kt) and transport maximum
(Tmc) for 129S1/SvImJ adult non-pregnant mice were estimated
by optimization (acslX, Version 2.04, Aegis Technologies) against
serum levels of PFOA for mice dosed daily from gd1-17 with 0.1,
0.3, 0.6, and 1.0 mg/kg PFOA but whose litters were fully resorbed
early in pregnancy. The pharmacokinetics of PFOA in these mice
can be considered as those of adult non-pregnant mice since they
did not undergo the BW changes associated with pregnancy and
lactation. Serum measurements were kindly provided by Abbott et al.
[21]. The resulting values for Kt and Tmc are listed in Table 5.
With the exception of Kt and Tmc, the same parameter values were
used for modeling pregnant and lactating CD-1 mice, but accounting
for the respective BW changes. Both Kt and Tmc were first optimized
against reported maternal serum levels of PFOA measured at term
corresponding to administered doses of 1, 3, 5 and 10 mg/kg-d PFOA
[22]. Since postnatal serum measurements only included two doses,
namely 3 and 5 mg/kg-d [24], only Tmc was optimized and the same

-------
Table 5
Renal resorption parameters used to simulate serum levels of PFOA in 129S1/SvImJ
and CD-1 mice.
Reference    Mouse strain    Serum sampling time
-------
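Fig. 3. Pharmacokinetic model of gestation and lactation used in the analysis of 129S1/SvImJ mice.
[Diagram: oral gavage dose (mg/kg) enters the dam compartment (Cdam, Vdam), which exchanges with the concepti compartment (Ccon, Vcon) via Qcon and is cleared to urine (ked); the dam is linked by klac to a milk compartment (Cmilk, Vm) and then to the pups (Cpup, Vpup), which are cleared to urine (kep); pup excreta recirculation to the dam from birth to PND14.]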
   Lactation was similarly described following parturition at the
end of gd18 as dynamic lactating dam and pup litter compartments,
but linked to a milk compartment (Fig. 3, bottom portion).
Lactational exposure of PFOA to nursing pups was modeled assuming
that all of the milk was produced at a constant daily rate and
consumed by nursing pups without loss or delay, as previously
described for rats [31]. The BW changes for lactating dams and
nursing pups are shown in Fig. 4A-D, respectively, and the
corresponding mathematical expressions that describe them are listed
in Table 2. Since the inability of neonatal mice in the early
postnatal period to eliminate waste without maternal stimulation
likely results in maternal re-exposure to PFOA [32], this phenomenon
was modeled postnatally as an additional dose to the lactating dam,
as previously described for rats (Fig. 3) [31].
Fig. 4. Simulation of body weight changes for pregnant/lactating dams and nursing pups.
[Plots: x-axis, postnatal day; weaning marked; data from Abbott et al. 2007 and Lau et al. 2006.]

-------
Fig. 5. Simulation of serum levels of PFOA in 129S1/SvImJ mice.
[Panels: (A) 1.0 mg/kg/day, (B) 0.6 mg/kg/day, (C) 0.3 mg/kg/day, (D) 0.1 mg/kg/day. Simulated curves and data (Abbott et al. 2007) for non-pregnant mice, pregnant/non-lactating dams, pregnant/lactating dams, and pups; x-axis, time (hr), spanning gestation and lactation.]
5.2. Comparison of model-simulated profiles

   Three mouse developmental toxicity studies have been reported,
and some serum information is available to evaluate the
pharmacokinetic model of gestation and lactation [21,22,24]. The
studies have been carried out with pregnant 129S1/SvImJ and CD-1
mice. In both cases, the pregnant dam was dosed daily with different
concentrations of PFOA from gd1-17 and serum levels of PFOA were
measured either prenatally after 1 day (i.e., at term) or
postnatally at weaning (i.e., 23 days following the last treatment).
No single study provided both prenatal and postnatal serum
measurements of PFOA. Nonetheless, the study with 129S1/SvImJ
pregnant mice [21] provided the most thorough serum information for
examining the pharmacokinetic behavior of PFOA during gestation and
lactation. Reported serum measurements included mice that were
non-pregnant, pregnant but did not lactate (pregnant/non-lactating),
pregnant then lactating (pregnant/lactating), and corresponding
nursing pups. Pregnant/non-lactating mice were those that were
pregnant but whose pups were stillborn or died soon after birth,
and therefore did not lactate or their lactation was not considered
significant [21]. Thus, although serum levels of PFOA were measured
at only one time point, namely pnd22, in combination these limited
data can be used to examine the pharmacokinetic behavior of PFOA
during gestation and lactation within a given model structure. Thus,
the model structure in Fig. 3, which accounts for the critical
changes in BW, placental blood flow, PFOA partitioning, and milk
production, was evaluated for its ability to simulate serum levels
of PFOA reported in 129S1/SvImJ mice [21]. As shown in Fig. 5A-D,
the model reasonably simulated serum levels of PFOA for all pregnant
mice and nursing pups. Only serum levels corresponding to doses of
0.1-1 mg/kg-d were available for simulation since full-litter
resorption was observed very early in gestation at higher doses [21].
   The serum information provided for non-pregnant female mice
is of significance because in addition to allowing comparison with
pregnant mice at doses <1 mg/kg-d, it also allows an examination of
the pharmacokinetic behavior of PFOA at higher doses (>1 mg/kg-d)
that would not be possible in pregnant mice because of the
induction of full litter resorption resulting from PFOA exposure
[21]. Thus, a linear one-compartment model based on a constant
BW of 21 g was initially evaluated for its ability to simulate serum
levels of PFOA collected for non-pregnant mice at 23 days following
the last treatment (the equivalent of pnd22), but this model
structure failed to simulate the non-linear behavior of serum levels,
which became more apparent at the higher doses (data not shown).
Extrapolating from previous findings that implicate transporters in
the renal clearance of PFOA [14] and modeling efforts that involve
renal resorption to explain the non-linear pharmacokinetic behavior
of PFOA in monkeys [12], a biologically based saturable renal
resorption component was implemented in the one-compartment
model for 129S1/SvImJ non-pregnant mice. In this hypothetical kidney
compartment depicted in Fig. 6, the unbound fraction of PFOA
from the central compartment would get filtered by the glomerulus
into a filtrate compartment from which PFOA would be either
excreted through the urine or resorbed via saturable renal
transporter(s), with a transport affinity Kt and transport maximum
Tm, into a renal plasma compartment for entry back into the systemic
circulation. This biologically based modeling of renal clearance has
been used previously to describe active tubular secretion in renal
clearance [34], but was adapted in this case to describe renal
resorption of PFOA. This non-linear model reasonably simulated serum
levels for 129S1/SvImJ non-pregnant mice at all doses. Simulations
for doses of 0.1-1.0 mg/kg-d are shown for comparison with pregnant
129S1/SvImJ mice in Fig. 5A-D, and those for doses greater
than 1 mg/kg-d are shown in Fig. 7.

Fig. 6. Pharmacokinetic model used in the analysis of 129S1/SvImJ non-pregnant
adult mice. [Diagram labels: kidney, urine.]

Fig. 7. Simulation of serum levels of PFOA in 129S1/SvImJ non-pregnant adult mice
at reported doses greater than 1 mg/kg-d. [Simulated curves and data (Abbott et al. 2007) for 5.0, 10.0, and 20.0 mg/kg/day; x-axis, time (hr).]
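
One common way to write a saturable transport process with an affinity constant and a transport maximum is the Michaelis-Menten form sketched below. Whether the model in the paper uses exactly this expression is an assumption, and the parameter values here are arbitrary placeholders chosen for illustration, not the optimized values of Table 5.

```python
# Saturable (Michaelis-Menten type) resorption from the filtrate compartment.
# ASSUMPTION: this functional form and the values below are illustrative only.

TM_HYPOTHETICAL = 1.0  # transport maximum (amount per unit time), placeholder
KT_HYPOTHETICAL = 0.5  # transport affinity constant (concentration), placeholder

def resorption_rate(c_filtrate, tm=TM_HYPOTHETICAL, kt=KT_HYPOTHETICAL):
    """Rate of resorption as a function of filtrate concentration."""
    if c_filtrate < 0:
        raise ValueError("concentration must be non-negative")
    return tm * c_filtrate / (kt + c_filtrate)

# At low concentration, resorption is nearly proportional to concentration;
# at high concentration it approaches Tm, so a smaller share of the filtered
# chemical is resorbed and proportionally more is excreted in urine.
for c in (0.01, 0.5, 5.0, 50.0):
    rate = resorption_rate(c)
    print(f"C = {c:6.2f}: rate = {rate:.3f}, rate/C = {rate / c:.3f}")
```

The falling rate/C ratio is the mechanism by which such a component produces less-than-proportional increases in serum levels at higher doses.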
   In comparison, 129S1/SvImJ adult non-pregnant mice
achieved the highest serum concentrations of PFOA, followed
by pregnant/non-lactating dams, pregnant/lactating dams, and nursing
pups (Fig. 5A-D). Based on the magnitude of serum measurements,
gestation and lactation in combination act to lower maternal
serum levels of PFOA by about 3-fold, and thus represent important
clearance pathways for the dam and correspondingly major
sources of exposure for the offspring.
       The ability of models  to  predict serum levels of PFOA  was
    quantitatively evaluated using Pearson residual analysis for each
    dose modeled. As shown in Fig.  8A,  the model predictions for
Fig. 8. Evaluation of model predictions for 129S1/SvImJ mice. The dots represent the residual values (measured - predicted) divided by the standard error of the mean (SEM)
for each dose and the lines depict the deviation from the measured value.
[Recovered panel titles: (A) non-pregnant 129S1/SvImJ mice, (B) pregnant/non-lactating 129S1/SvImJ mice; x-axis, dose (mg/kg).]

-------
129S1/SvImJ adult non-pregnant mice can deviate from the serum
measurements by as much as 40% at the lowest and highest doses
of 0.1 and 20 mg/kg-d, respectively, but the deviations are less than
20% for all other doses. In contrast, the predictions by the linear
pharmacokinetic model of gestation and lactation are generally
better, with deviations ranging from 35% to less than 10% (Fig. 8B-D).
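
The two quantities used in this evaluation can be computed as in the following sketch. The residual definition follows the Fig. 8 caption, (measured - predicted) / SEM; expressing the deviation as a percentage of the measured value is an assumption about how the percentages in the text were obtained. The numerical inputs are made-up placeholders, not data from the studies.

```python
def standardized_residual(measured, predicted, sem):
    """(measured - predicted) / SEM, as described in the Fig. 8 caption."""
    if sem <= 0:
        raise ValueError("SEM must be positive")
    return (measured - predicted) / sem

def percent_deviation(measured, predicted):
    """Absolute deviation as a percentage of the measured value (assumed definition)."""
    if measured == 0:
        raise ValueError("measured value must be non-zero")
    return 100.0 * abs(measured - predicted) / abs(measured)

# Placeholder values for illustration only: (dose, measured, predicted, SEM).
examples = [
    (0.1, 10.0, 8.0, 1.0),
    (1.0, 50.0, 55.0, 2.5),
]
for dose, measured, predicted, sem in examples:
    r = standardized_residual(measured, predicted, sem)
    d = percent_deviation(measured, predicted)
    print(f"dose {dose}: residual = {r:+.2f} SEM, deviation = {d:.0f}%")
```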
   The serum information from two developmental toxicity studies
carried out in CD-1 mice is also very limited. In one of the studies,
serum levels of PFOA were measured at term, i.e., 24 h following
the last treatment [22], while the other study reported serum
measurements at pnd22 [24]. The lowest dose of PFOA examined in
CD-1 mice was 1 mg/kg-d [22], reflecting an apparent lower
sensitivity based upon administered dose for this strain as compared
to 129S1/SvImJ [21,22]. In an initial effort to examine the
pharmacokinetics of PFOA in pregnant CD-1 mice, the linear model
of gestation and lactation in Fig. 3 was parameterized for CD-1
mice and evaluated for its ability to simulate serum levels of PFOA
measured at term. Absorption and elimination rate constants were
those estimated for adult non-pregnant female CD-1 mice [11]. As
shown in Fig. 9, even at the lowest dose reported, the linear model
over-predicted serum levels by nearly four-fold. The discrepancy
increased at higher doses, presumably indicating the inability of
the model to simulate non-linear behavior (data not shown). In
another effort, the renal resorption component previously used to
describe the non-linear pharmacokinetic behavior of PFOA in adult
non-pregnant 129S1/SvImJ mice was implemented for CD-1 mice.
The resulting model structure is shown in Fig. 10. The saturable
renal resorption component significantly improved the simulation
of serum levels of PFOA at term for all doses reported (Fig. 11A-D)
[22]. No serum information for concepti or pups was reported, but
the model can predict the corresponding levels in this compartment
(Fig. 11A-D). The renal resorption parameters used to simulate CD-1
serum levels of PFOA are listed in Table 5 along with those used
with adult non-pregnant 129S1/SvImJ mice.
   Serum levels of PFOA measured in CD-1 mice postnatally were
also simulated via the incorporation of renal resorption. These
serum measurements were performed at pnd22 and corresponded
Fig. 9. Predicted serum profile of PFOA at 1 mg/kg-d in pregnant CD-1 mice by linear
pharmacokinetic model of gestation and lactation. [Simulation by linear model with data from Lau et al. (2006); parturition marked; x-axis, time (hr).]
to only two administered doses, namely 3 and 5 mg/kg-d, but
included dam and pups [24]. When compared to prenatal serum
measurements at term at the same administered doses [22], these
pnd22 measurements do not reflect any impact of parturition and
lactation in decreasing serum levels of PFOA. In fact, they are
only about 27% and 48% lower for 3 and 5 mg/kg-d, respectively
(40.50 ± 1.89 mg/L versus 29.47 ± 2.55 mg/L and 71.91 ± 8.33 mg/L
versus 36.90 ± 4.75 mg/L). Thus, based solely on magnitude, a
slower serum clearance of PFOA (i.e., greater accumulation) would
be necessary to simulate the pnd22 serum measurements using
the same pharmacokinetic model in Fig. 10. As shown in Fig. 12,
maternal and pup serum levels of PFOA at pnd22 were successfully
simulated, but required an approximate increase in Tmc of
10-fold (while maintaining Kt constant due to the absence of an
adequate range of doses). The predicted prenatal serum profiles of
PFOA were as much as 5-fold higher than those predicted based
on serum measurements at term (Figs. 11 and 12). The basis for
Fig. 10. Renal resorption pharmacokinetic model of gestation and lactation used in the analysis of CD-1 mice.
[Recovered diagram labels: urine; pup excreta recirculation (from birth to PND14); milk (Cmilk, Vm); klac; pups (Cpup, Vpup); kep.]
-------
C.E. Rodriguez et al. / Reproductive Toxicology 27 (2009) 373-386
[Figure 11 plots. Panels: (A) 1 mg/kg/day, (B) 3 mg/kg/day, (C) 5 mg/kg/day, (D) 10 mg/kg/day. x-axis: time (hr), 0 to 900, with gestation and lactation periods marked. Series: pregnant/lactating dam; concepti/pups; measurements from Lau et al. 2006.]
Fig. 11. Simulation of prenatal serum levels of PFOA in CD-1 mice.
the large inconsistency in the predicted bioaccumulative prenatal
profiles is unclear since both studies were performed in the same
mouse strain using the same doses, dosing regimen, and supplier
of PFOA [22,24]. The profiles were not expected to vary by more
than the reported inter-laboratory analytical coefficient of
variation of 20.3% for measurements of PFOA in serum samples [42].
With the exception of serum measurements at 3 mg/kg-d measured
at term, model simulations were in good agreement with
reported measurements in CD-1 mice, deviating by at most 20%
from measurements (Fig. 13A-C).
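
The percent differences quoted above for the pnd22 versus term serum measurements can be checked with a few lines of arithmetic. The sketch below assumes that the first value in each quoted pair is the measurement at term and the second is the pnd22 measurement; the function and variable names are illustrative only.

```python
# Serum PFOA (mg/L) as quoted in the text: (value at term, value at pnd22).
# Assumption: the first number of each quoted pair is the term measurement.
serum = {
    3: (40.50, 29.47),  # administered dose of 3 mg/kg-d
    5: (71.91, 36.90),  # administered dose of 5 mg/kg-d
}


def percent_lower(term, pnd22):
    """Percent by which the pnd22 value falls below the value at term."""
    return 100.0 * (term - pnd22) / term


for dose, (term, pnd22) in serum.items():
    print(f"{dose} mg/kg-d: {percent_lower(term, pnd22):.1f}% lower at pnd22")
```

The results land close to the rounded percentages given in the text, and both exceed the 20.3% inter-laboratory coefficient of variation cited for serum PFOA measurements.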
[Figure 12 plots. Panels: 3 mg/kg/day and 5 mg/kg/day. x-axis: time (hr), 0 to 900, with gestation and lactation periods marked. Series: pregnant/lactating dam; concepti/pups; measurements from Wolf et al. 2007 for dam and for pups.]
Fig. 12. Simulation of postnatal serum levels of PFOA in CD-1 mice.
                       6. Discussion

   This modeling effort was aimed at addressing some of the
uncertainties associated with the pharmacokinetics of PFOA in
animal developmental toxicity studies. To this end, a
pharmacokinetic model of gestation and lactation was developed
for the mouse based on the limited information available. A
two-compartment system linked by placental blood flow described
gestation, and milk production sequentially linked a lactating dam
to a growing pup compartment. The model incorporated the
critical changes in BW (for dam and offspring), placental blood flow,
embryo/fetus:maternal serum partition coefficients for PFOA, and
milk production. The ability of the model to simulate serum
levels of PFOA from developmental toxicity studies involving two
different strains of mice was evaluated. The results indicate that
a linear clearance description was sufficient to reasonably
simulate serum levels of PFOA at administered doses of 1 mg/kg-d
or less for 129S1/SvImJ mice that were pregnant/non-lactating or
pregnant/lactating, and for the respective nursing pups. In contrast,
the non-linear behavior exhibited by serum levels of PFOA in CD-1 mice
at the only reported doses of 1 mg/kg or greater was simulated by
the incorporation of a saturable renal resorption description. The
manipulation of renal clearance of PFOA in rats through hormones
and inhibitors of organic anion transporters served as precedent
for the renal resorption component [14]. Previous modeling efforts
have also made use of a different mathematical formulation for
renal resorption to explain the non-linear pharmacokinetic
behavior of PFOA in monkeys [12].
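
To illustrate why a saturable resorption term produces dose-dependent (non-linear) clearance, the sketch below uses a generic one-compartment form in which filtered PFOA is partly returned to serum by a Michaelis-Menten process. This is an assumed, simplified formulation for illustration and is not the model equations of this study; the parameter names (q_fil, t_m, k_t) and all numerical values are hypothetical.

```python
def net_renal_clearance(c_serum, q_fil, t_m, k_t):
    """Effective clearance (L/h): filtration minus saturable resorption.

    Assumed form: excretion rate = q_fil * C - t_m * C / (k_t + C).
    Dividing by C gives the concentration-dependent clearance below.
    """
    return q_fil - t_m / (k_t + c_serum)


def simulate(c0, hours, dt, volume, q_fil, t_m, k_t):
    """Forward-Euler integration of dC/dt = -CL_eff(C) * C / V."""
    c = c0
    for _ in range(int(round(hours / dt))):
        c -= dt * net_renal_clearance(c, q_fil, t_m, k_t) * c / volume
    return c


# Hypothetical values, chosen only so that t_m / k_t < q_fil
# (net clearance stays positive at every concentration).
PARAMS = dict(q_fil=0.010, t_m=0.09, k_t=10.0)  # L/h, mg/h, mg/L

low = net_renal_clearance(0.1, **PARAMS)     # resorption far from saturation
high = net_renal_clearance(100.0, **PARAMS)  # resorption largely saturated
print(f"effective clearance at 0.1 mg/L: {low:.5f} L/h")
print(f"effective clearance at 100 mg/L: {high:.5f} L/h")
```

Under these assumptions, clearance is nearly constant at low serum concentrations, which is consistent with a linear description being adequate at low doses, and rises as resorption saturates at higher concentrations.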
   The implementation of renal resorption also allowed
simulation of serum levels of PFOA for 129S1/SvImJ non-pregnant
-------
Table 6
Estimates of internal dose metrics for developmental toxicity studies in rodents.
Dose       Species (strain)        Life   Reference              AUC average daily   Cmax    tmax
(mg/kg-d)                          stage                         (mg/L-hr)           (mg/L)  (h)
                                                                 G       G + L
0.1        Mouse (129S1/SvImJ)     Dam    [21]                   118     94          8       390
                                   Pups   [21]                   14      25          2       498
0.3        Mouse (129S1/SvImJ)     Dam    [21]                   355     283         25      390
                                   Pups   [21]                   41      76          7       498
0.6        Mouse (129S1/SvImJ)     Dam    [21]                   711     566         49      390
                                   Pups   [21]                   82      153         14      498
1.0        Mouse (129S1/SvImJ)     Dam    [21]                   1185    943         82      390
                                   Pups   [21]                   137     254         24      498
1.0        Mouse (CD-1)            Dam    [22]                   548     296         34      388
                                   Pups   [22]                   55      48          5       390
3.0        Rat (Sprague Dawley)    Dam    Draft risk assessment  -       83          13      2
                                   Pups   Draft risk assessment  N/A     N/A         N/A     N/A
3.0        Mouse (CD-1)            Dam    [22]                   1429    755         87      388
                                   Pups   [22]                   138     115         12      390
3.0        Mouse (CD-1)            Dam    [24]                   3811    2981        270     390
                                   Pups   [24]                   447     678         49      458
5.0        Mouse (CD-1)            Dam    [22]                   2160    1127        131     388
                                   Pups   [22]                   205     165         18      390
5.0        Mouse (CD-1)            Dam    [24]                   6120    4644        423     390
                                   Pups   [24]                   707     1033        75      458

Toxic effect (column entries, in source order):
None; Liver hypertrophy; None; Liver hypertrophy; Neonatal lethality; Liver hypertrophy; Liver hypertrophy; Neonatal lethality; Neonatal BW deficit; Eye opening delay; Liver hypertrophy; Liver hypertrophy; Neonatal lethality; Neonatal BW deficit; Eye opening delay; Liver hypertrophy; Parturition delay; Neonatal BW deficit; Eye opening delay; None; Neonatal BW deficit; Post-weaning lethality; Sexual maturation delay; Post-weaning BW deficits
Liver hypertrophy; Parturition delay; Neonatal BW deficit; Eye opening delay; Neonatal lethality; Liver hypertrophy; Full litter resorption; Neonatal BW deficit; Eye opening delay; Liver hypertrophy; Neonatal lethality; Liver hypertrophy; Parturition delay; Full litter resorption; Neonatal BW deficit; Eye opening delay; Neonatal lethality; Liver hypertrophy; Full litter resorption; Neonatal BW deficit; Eye opening delay; Liver hypertrophy; Neonatal lethality

NOAEL or LOAEL (column entries, in source order):
LOAEL; -; NOAEL; NOAEL; -; LOAEL; NOAEL; NOAEL; LOAEL; -; LOAEL; LOAEL; LOAEL; NOAEL; LOAEL; LOAEL; -; NOAEL; NOAEL; NOAEL; NOAEL
LOAEL; -; -; NOAEL; LOAEL; NOAEL; LOAEL; LOAEL; LOAEL; NOAEL; -; -; NOAEL; -; LOAEL; LOAEL; -; LOAEL

Abbreviations: G = gestation only; G + L = gestation and lactation; N/A = not available.
-------
[Figure 13 plots. Panels: (A) Pregnant CD-1 mice, (B) Pregnant/lactating CD-1 mice, (C) Nursing CD-1 pups. x-axis: Dose (mg/kg/day).]
Fig. 13. Evaluation of model predictions for CD-1 mice. The dots represent the
residual values (measured - predicted) divided by the standard error of the mean (SEM)
for each dose and the lines depict the deviation from the measured value.
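
The quantity plotted in Fig. 13 can be written out as a short calculation. The sketch below follows the caption's definition of the standardized residual; the numerical inputs are hypothetical and are not values from the study.

```python
def standardized_residual(measured, predicted, sem):
    """(measured - predicted) / SEM, as defined in the Fig. 13 caption."""
    return (measured - predicted) / sem


def percent_deviation(measured, predicted):
    """Deviation of the prediction from the measured value, in percent."""
    return 100.0 * (measured - predicted) / measured


# Hypothetical serum values (mg/L), for illustration only.
measured, sem, predicted = 40.0, 2.0, 36.0

z = standardized_residual(measured, predicted, sem)
dev = percent_deviation(measured, predicted)
print(f"standardized residual: {z:.1f} SEM, deviation: {dev:.0f}%")
```

A prediction can sit within a modest percent deviation of the measurement and still lie several SEM away when the SEM is small, which is why both views are useful.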
adult mice, which exhibited non-linear behavior at doses greater
than 1 mg/kg-d, at which full litter resorption was observed in
the respective pregnant mice [21]. Modeling of 129S1/SvImJ
non-pregnant, pregnant/non-lactating, and pregnant/lactating mice
at the same administered doses of PFOA confirms that gestation
and lactation serve as significant clearance pathways for
the dam (and correspondingly as sources of exposure for the
offspring).
   The pharmacokinetic behavior of PFOA at doses less than
1 mg/kg-d in pregnant CD-1 mice is unclear. Nevertheless, the
analysis with 129S1/SvImJ mice indicates that a linear
pharmacokinetic model may be appropriate in the analysis of gestational
and lactational exposures to PFOA at low doses, which may be more
relevant to environmental exposures.
   Another objective of this modeling study was to use the respective
pharmacokinetic models to derive initial estimates of internal
dose as a means for reconciling large interspecies differences in
plasma clearance. Although different model structures and parameter
values were necessary in the analyses for PFOA, estimates
of internal dose can still be derived to make initial intra- and
inter-species comparisons and to estimate the relative contribution
of gestational and lactational exposures. Intra-species comparisons
may be important given that a single set of parameter values failed
to describe different data sets even when they were obtained using
the same species and strain, doses, dosing regimen, and source of
PFOA. From a risk assessment perspective, interspecies comparisons
are of particular interest since developmental toxicity studies
carried out in rats versus mice have yet to be reconciled given the
observed large difference in plasma clearance between these two
species (i.e., half-life of 2-4 h versus 12-20 days, respectively). Thus,
the average daily area under the serum concentration-time curve
(AUC) and the maximum concentration (Cmax) of PFOA for both dam
and offspring were selected as measures of internal dose and
estimated from the respective models. These estimates were compared
to those of the rat at an administered dose of 3 mg/kg-d estimated in
the US EPA draft risk assessment for PFOA [43]. As listed in Table 6,
at the lowest dose of 0.1 mg/kg-d reported for 129S1/SvImJ mice,
the AUC values of 118 and 94 mg h/L for gestation only (G) and
gestation and lactation (G + L), respectively, for the dam are
comparable to the value of 83 mg h/L estimated for the pregnant rat at
the 30-fold higher administered dose of 3 mg/kg-d. The estimated
value for Cmax of 8 mg/L is also comparable to the 13 mg/L reported
for the rat. Since liver hypertrophy in 129S1/SvImJ nursing pups
was the only adverse effect reported at this dose and its relevance
to humans remains the subject of debate, the next higher dose of
0.3 mg/kg-d represented the no observed adverse effect level (NOAEL)
for neonatal lethality for 129S1/SvImJ nursing pups. The estimated
G + L AUC value of 283 mg h/L for 129S1/SvImJ dams does not reflect
the 10-fold decrease in administered dose as compared to the rat and
is in fact more than 3-fold higher. It should be noted that internal
dose estimates for pups are not part of the default risk assessment
approach for analyzing developmental toxicity studies and therefore
were not estimated in the draft risk assessment for PFOA. However,
our estimates for 129S1/SvImJ nursing pups can still be compared to
those for rat dams. At 0.3 mg/kg-d, the G + L AUC value of 76 mg h/L for
129S1/SvImJ pups is comparable to that of the rat dams at 83 mg h/L.
Based on these internal dose estimates, irrespective of model
structure, 129S1/SvImJ mice do not appear to be more sensitive than
the rat, and the seemingly greater sensitivity to PFOA may be due
to a much higher internal dose achieved for a given administered
dose.
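
The internal dose metrics named above (average daily AUC, Cmax, and the time of the maximum, tmax) can be computed from any simulated serum concentration-time profile. The sketch below uses the trapezoidal rule and assumes that "average daily" means total AUC divided by the duration of the profile in days, which is an interpretation rather than a definition given in the text; the profile itself is hypothetical.

```python
def internal_dose_metrics(times_h, conc_mg_per_l):
    """Return (average daily AUC in mg*h/L, Cmax in mg/L, tmax in h).

    AUC is integrated with the trapezoidal rule. Averaging per day is an
    assumed definition: total AUC divided by the profile length in days.
    """
    auc = 0.0
    for i in range(1, len(times_h)):
        dt = times_h[i] - times_h[i - 1]
        auc += 0.5 * dt * (conc_mg_per_l[i] + conc_mg_per_l[i - 1])
    days = (times_h[-1] - times_h[0]) / 24.0
    c_max = max(conc_mg_per_l)
    t_max = times_h[conc_mg_per_l.index(c_max)]
    return auc / days, c_max, t_max


# Hypothetical serum profile (time in h, concentration in mg/L).
times = [0, 24, 48, 72, 96]
conc = [0.0, 4.0, 8.0, 6.0, 2.0]

auc_daily, c_max, t_max = internal_dose_metrics(times, conc)
print(f"average daily AUC = {auc_daily:.1f} mg*h/L, "
      f"Cmax = {c_max:.1f} mg/L, tmax = {t_max} h")
```

Applying the same function separately to the gestation segment and to the full gestation-plus-lactation profile would give metrics analogous to the G and G + L columns of Table 6.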
   Also listed in Table 6 are the AUC and Cmax estimates for CD-1
mice. At the lowest administered dose of 1 mg/kg-d, the G + L
AUC for CD-1 dams is about 3.6-fold greater than the value
estimated for the rat at 3 mg/kg-d. The G + L AUC estimates for the
corresponding pups are about 1.7-fold lower, but still do not reflect
the 3-fold decrease in administered dose relative to the rat. Also shown
are the estimates at doses of 3 and 5 mg/kg-d reported in both
studies with CD-1 mice [22,24]. Although these doses and internal
dose estimates are probably too high for any type of risk
assessment application, they are nonetheless useful for intraspecies and
interlaboratory comparisons. The AUC estimates for the reported
postnatal serum measurements [24] are about 2.8- and 4.5-fold
higher than the reported prenatal measurements for both dam and
pups at 3 and 5 mg/kg-d PFOA, respectively. The inconsistency may
be analytical in nature since similar adverse effects were observed
in both studies [22,24].
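
The fold differences relative to the rat can be reproduced directly from the G + L AUC values in Table 6. The sketch below is a plain arithmetic check; the helper name and the dictionary labels are illustrative.

```python
RAT_DAM_AUC = 83.0  # mg*h/L, G + L, rat dam at 3 mg/kg-d (Table 6)

# G + L average daily AUC values (mg*h/L) taken from Table 6.
g_plus_l_auc = {
    "129S1/SvImJ dam, 0.3 mg/kg-d": 283.0,
    "CD-1 dam, 1 mg/kg-d": 296.0,
    "CD-1 pups, 1 mg/kg-d": 48.0,
}


def fold_difference(value, reference):
    """Ratio of the larger to the smaller value, with its direction."""
    if value >= reference:
        return value / reference, "higher"
    return reference / value, "lower"


for label, auc in g_plus_l_auc.items():
    fold, direction = fold_difference(auc, RAT_DAM_AUC)
    print(f"{label}: {fold:.1f}-fold {direction} than the rat dam")
```

In each case the internal dose differs from the rat value by far less than the ratio of administered doses would suggest, which is the basis for the comparison made in the text.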
   The predicted AUC estimates for 129S1/SvImJ mice suggest that
the relative contributions of gestation and lactation as sources of
exposure for the offspring are comparable when exposure to the
dam occurs during gestation without continued exposure during
lactation.
-------
   In the case of CD-1 mice, the relative contributions vary
depending on the study analyzed. Based on the prenatal serum
measurements of PFOA [22], the contribution of gestation far
outweighs that of lactation. The contribution of lactation to pup
exposure is predicted to increase significantly based on postnatal
serum measurements [24]. It should be noted that windows
of susceptibility may render gestation more toxicologically
important for organs/tissues that undergo critical development before
birth. Results from cross-fostering studies suggest that exposure to
PFOA in utero may be sufficient to produce postnatal BW deficits
and developmental delays in the pups, and a window of
susceptibility for some of these effects may exist early in gestation
[24].
   The results of this modeling effort are based, due to the limited
data available, on several assumptions that need to be emphasized.
First, gestation was modeled strictly as a flow-limited process; any
diffusion-limited transfer of PFOA was not included. Second, the
milk:maternal serum partition coefficient was modeled as constant
throughout lactation, which may not be accurate since milk
composition changes as a function of postnatal day [44,45]. Third, in
the absence of data, the first-order elimination rate constant for the
pups was either optimized based on pup serum levels (129S1/SvImJ
mice) or assumed to be the same as that of the respective dam (CD-1 mice).
The identity and ontogeny of the renal transporters involved in the
renal clearance of PFOA remain an area of investigation and
therefore cannot be modeled at this time. Fourth, our models are based
on a classical compartmental approach augmented with biological
information, and therefore are not meant to be used for any type of
extrapolation. Cross-species and high-to-low dose extrapolations
may be dependent on additional biological processes such as
dose-dependent changes in liver distribution [46] and may need a whole
body PBPK model approach.
   Several research needs were identified throughout the
development and evaluation of the models. First, longitudinal
prenatal and postnatal serum measurements of PFOA in the same
mouse strain within the same study are lacking. The available prenatal and
postnatal serum measurements for CD-1 mice from different
studies seem inconsistent and do not complement each other. Second,
there were no milk measurements of PFOA available in the mouse.
Ideally, such measurements should be carried out in parallel
with maternal serum measurements from the same dam so that
changes in partition can be properly monitored. In the absence
of such data, the milk:maternal serum partition coefficient for
PFOA was optimized based on maternal and pup serum
information [21] and assumed constant throughout lactation. Third, pup
BW deficits resulting from PFOA exposure are not being
modeled at this time since it is unclear whether or not they reflect
a lower consumption of milk, which would decrease the
lactational transfer of PFOA. Pup BW deficits can be as much as
20-25% at the higher doses administered to the dam [21,22], and thus
are not likely to dramatically change our estimates of internal dose
metrics.
   Although administered doses of PFOA less than 1 mg/kg-d may
exhibit little or no phenotypic toxicity in CD-1 mice, serum
information for these low doses may further support a linear model for the
pharmacokinetics of PFOA at low doses that may be more reflective
of environmental exposures.
   In conclusion, this effort at a minimum provides initial
pharmacokinetic model structures for further exploration of the
pharmacokinetic behavior of PFOA during gestation and lactation
relevant to developmental toxicity studies, which involve
different exposures (e.g., in utero and lactational exposures) but whose
current analysis for risk is based solely on the maternal dose.
Furthermore, our estimates of internal dose suggest that the mouse
achieves a much higher internal dose of PFOA and may not be
more sensitive than the rat to the developmental toxicity induced
by PFOA. These results may be important in selecting the most
appropriate laboratory species for risk assessment and/or further
investigations associated with PFOA.


       Conflict of interest

          The authors declare that there are no conflicts of interest.


       Acknowledgements

   The authors greatly appreciate the invaluable assistance, through
sharing of unpublished data and discussions, of Drs. Christopher Lau,
Barbara Abbott, and Suzanne Fenton throughout this modeling effort.
Input from Drs. Andrew Lindstrom and Mark Strynar with regard to
analytical chemistry issues is appreciated, along with comments
from Dr. Jennifer Seed (internal reviewers).
This project was partially funded by Interagency Agreement
RW-75-92207501 with the National Toxicology Program at the National
Institute of Environmental Health Sciences.

       References

 [1] Prevedouros K, Cousins IT, Buck RC, Korzeniowski SH. Sources, fate and
     transport of perfluorocarboxylates. Environ Sci Technol 2006;40(January (1)):32-44.
 [2] Wallington TJ, Hurley MD, Xia J, Wuebbles DJ, Sillman S, Ito A, et al.
     Formation of C7F15COOH (PFOA) and other perfluorocarboxylic acids during
     the atmospheric oxidation of 8:2 fluorotelomer alcohol. Environ Sci Technol
     2006;40(February (3)):924-30.
 [3] Nabb DL, Szostek B, Himmelstein MW, Mawn MP, Gargas ML, Sweeney LM, et al.
     In vitro metabolism of 8-2 fluorotelomer alcohol: interspecies comparisons and
     metabolic pathway refinement. Toxicol Sci 2007;100(December (2)):333-44.
 [4] Dinglasan MJ, Ye Y, Edwards EA, Mabury SA. Fluorotelomer alcohol
     biodegradation yields poly- and perfluorinated acids. Environ Sci Technol
     2004;38(May (10)):2857-64.
 [5] Hori H, Nagaoka Y, Yamamoto A, Sano T, Yamashita N, Taniyasu S, et al.
     Efficient decomposition of environmentally persistent perfluorooctanesulfonate
     and related fluorochemicals using zerovalent iron in subcritical water. Environ
     Sci Technol 2006;40(February (3)):1049-54.
 [6] Key BD, Howell RD, Criddle CS. Fluorinated organics in the biosphere. Environ
     Sci Technol 1997;31:2445-54.
 [7] Calafat AM, Wong LY, Kuklenyik Z, Reidy JA, Needham LL. Polyfluoroalkyl
     chemicals in the U.S. population: data from the National Health and
     Nutrition Examination Survey (NHANES) 2003-2004 and comparisons with NHANES
     1999-2000. Environ Health Perspect 2007;115(November (11)):1596-602.
 [8] Olsen GW, Burris JM, Ehresman DJ, Froehlich JW, Seacat AM, Butenhoff JL,
     et al. Half-life of serum elimination of perfluorooctanesulfonate,
     perfluorohexanesulfonate, and perfluorooctanoate in retired fluorochemical
     production workers. Environ Health Perspect 2007;115(September (9)):1298-305.
 [9] Lau C, Anitole K, Hodes C, Lai D, Pfahles-Hutchens A, Seed J. Perfluoroalkyl acids:
     a review of monitoring and toxicological findings. Toxicol Sci 2007;99(October
     (2)):366-94.
[10] Wambaugh JF, Barton HA, Setzer RW. Comparing models for perfluorooctanoic
     acid pharmacokinetics using Bayesian analysis. J Pharmacokinet Pharmacodyn
     2008;35(December (6)):683-712.
[11] Lou I, Wambaugh JF, Lau C, Hanson RG, Lindstrom AB, Strynar MJ, et al.
     Modeling single and repeated dose pharmacokinetics of PFOA in mice. Toxicol Sci
     2009;107(February (2)):331-41.
[12] Andersen ME, Clewell 3rd HJ, Tan YM, Butenhoff JL, Olsen GW.
     Pharmacokinetic modeling of saturable, renal resorption of perfluoroalkylacids in
     monkeys--probing the determinants of long plasma half-lives. Toxicology
     2006;227(October (1-2)):156-64.
[13] Kudo N, Kawashima Y. Toxicity and toxicokinetics of perfluorooctanoic acid in
     humans and animals. J Toxicol Sci 2003;28(May (2)):49-57.
[14] Kudo N, Katakura M, Sato Y, Kawashima Y. Sex hormone-regulated renal
     transport of perfluorooctanoic acid. Chem Biol Interact 2002;139(March (3)):301-16.
[15] Han X, Yang CH, Snajdr SI, Nabb DL, Mingoia RT. Uptake of
     perfluorooctanoate in freshly isolated hepatocytes from male and female rats. Toxicol Lett
     2008;181(2):81-6.
[16] Apelberg BJ, Witter FR, Herbstman JB, Calafat AM, Halden RU, Needham LL,
     et al. Cord serum concentrations of perfluorooctane sulfonate (PFOS) and
     perfluorooctanoate (PFOA) in relation to weight and size at birth. Environ Health
     Perspect 2007;115(November (11)):1670-6.
[17] Spliethoff HM, Tao L, Shaver SM, Aldous KM, Pass KA, Kannan K, et al. Use
     of newborn screening program blood spots for exposure assessment:
     declining levels of perfluorinated compounds in New York State infants. Environ Sci
     Technol 2008;42(July (14)):5361-7.
[18] Tao L, Kannan K, Wong CM, Arcaro KF, Butenhoff JL. Perfluorinated compounds
     in human milk from Massachusetts, U.S.A. Environ Sci Technol 2008;42(April
     (8)):3096-101.
-------
[19] So MK, Yamashita N, Taniyasu S, Jiang Q, Giesy JP, Chen K, et al. Health risks in
     infants associated with exposure to perfluorinated compounds in human breast
     milk from Zhoushan, China. Environ Sci Technol 2006;40(May (9)):2924-9.
[20] Butenhoff JL, Kennedy Jr GL, Frame SR, O'Connor JC, York RG. The
     reproductive toxicology of ammonium perfluorooctanoate (APFO) in the rat. Toxicology
     2004;196(March (1-2)):95-116.
[21] Abbott BD, Wolf CJ, Schmid JE, Das KP, Zehr RD, Helfant L, et al.
     Perfluorooctanoic acid induced developmental toxicity in the mouse is dependent
     on expression of peroxisome proliferator activated receptor-alpha. Toxicol Sci
     2007;98(August (2)):571-81.
[22] Lau C, Thibodeaux JR, Hanson RG, Narotsky MG, Rogers JM, Lindstrom AB, et
     al. Effects of perfluorooctanoic acid exposure during pregnancy in the mouse.
     Toxicol Sci 2006;90(April (2)):510-8.
[23] White SS, Calafat AM, Kuklenyik Z, Villanueva L, Zehr RD, Helfant L, et al.
     Gestational PFOA exposure of mice is associated with altered mammary
     gland development in dams and female offspring. Toxicol Sci 2007;96(March
     (1)):133-44.
[24] Wolf CJ, Fenton SE, Schmid JE, Calafat AM, Kuklenyik Z, Bryant XA, et al.
     Developmental toxicity of perfluorooctanoic acid in the CD-1 mouse after cross-foster
     and restricted gestational exposures. Toxicol Sci 2007;95(February (2)):462-73.
[25] O'Flaherty EJ, Scott W, Schreiner C, Beliles RP. A physiologically based kinetic
     model of rat and mouse gestation: disposition of a weak acid. Toxicol Appl
     Pharmacol 1992;112(February (2)):245-56.
[26] R Development Core Team. R: a language and environment for statistical
     computing. Vienna, Austria: R Foundation for Statistical Computing; 2008.
     ISBN 3-900051-07-0. http://www.R-project.org.
[27] Taillieu F, Salomon LJ, Siauve N, Clement O, Faye N, Balvay D, et al.
     Placental perfusion and permeability: simultaneous assessment with
     dual-echo contrast-enhanced MR imaging in mice. Radiology 2006;241(December
     (3)):737-45.
[28] Iguchi T, Tani N, Sato T, Fukatsu N, Ohta Y. Developmental changes in mouse
     placental cells from several stages of pregnancy in vivo and in vitro. Biol Reprod
     1993;48(January (1)):188-96.
[29] Johnson MS, Thomson SC, Speakman JR. Limits to sustained energy intake.
     III. Effects of concurrent pregnancy and lactation in Mus musculus. J Exp Biol
     2001;204(June (Pt 11)):1947-56.
[30] Knight CH, Maltz E, Docherty AH. Milk yield and composition in mice: effects of
     litter size and lactation number. Comp Biochem Physiol A 1986;84(1):127-33.
[31] Yoon M, Barton HA. Predicting maternal rat and pup exposures: how different
     are they? Toxicol Sci 2008;102(March (1)):15-32.
[32] Gubernick DJ, Alberts JR. Resource exchange in the biparental California mouse
     (Peromyscus californicus): water transfer from pups to parents. J Comp Psychol
     1987;101(December (4)):328-34.
[33] Henning SJ. Postnatal development: coordination of feeding, digestion, and
     metabolism. Am J Physiol 1981;241(September (3)):G199-214.
[34] Russel FG, Wouterse AC, Van Ginneken CA. Physiologically based
     pharmacokinetic model for the renal clearance of iodopyracet and the interaction
     with probenecid in the dog. Biopharm Drug Dispos 1989;10(March-April
     (2)):137-52.
[35] Brown RP, Delp MD, Lindstedt SL, Rhomberg LR, Beliles RP. Physiological
     parameter values for physiologically based pharmacokinetic models. Toxicol Ind
     Health 1997;13(July-August (4)):407-84.
[36] Henderson WM, Smith MA. Perfluorooctanoic acid and perfluorononanoic acid
     in fetal and neonatal mice following in utero exposure to 8-2 fluorotelomer
     alcohol. Toxicol Sci 2007;95(February (2)):452-61.
[37] Knight CH, Maltz E, Docherty AH. Milk yield and composition in mice: effects
     of litter size and lactation number. Comp Biochem Physiol A Comp Physiol
     1986;84(1):127-33.
[38] Hinderliter PM, Mylchreest E, Gannon SA, Butenhoff JL, Kennedy Jr GL.
     Perfluorooctanoate: placental and lactational transport pharmacokinetics in rats.
     Toxicology 2005;211(July (1-2)):139-48.
[39] McCullagh P, Nelder JA. Generalized linear models. 2nd ed. London xix + 511:
     Chapman and Hall; 1989. p. 37.
[40] Clarke DO, Elswick BA, Welsch F, Conolly RB. Pharmacokinetics of
     2-methoxyethanol and 2-methoxyacetic acid in the pregnant mouse: a
     physiologically based mathematical model. Toxicol Appl Pharmacol 1993;121(August
     (2)):239-52.
[41] Salomon LJ, Siauve N, Taillieu F, Balvay D, Vayssettes C, Frija G, et al. In vivo
     dynamic MRI measurement of the noradrenaline-induced reduction in
     placental blood flow in mice. Placenta 2006;27(September-October (9-10)):1007-13.
[42] Longnecker MP, Smith CS, Kissling GE, Hoppin JA, Butenhoff JL, Decker E, et
     al. An interlaboratory study of perfluorinated alkyl compound levels in human
     plasma. Environ Res 2008;107(June (2)):152-9.
[43] USEPA. Draft Risk Assessment for the Potential Human Health Effects
     Associated with Exposure to Perfluorooctanoic Acid and Its Salts.
     http://www.epa.gov/oppt/pfoa/pubs/pfoarisk. Access date: 11/01/08.
[44] Godbole VY, Grundleger ML, Pasquine TA, Thenen SW. Composition of rat milk
     from day 5 to 20 of lactation and milk intake of lean and preobese Zucker pups.
     J Nutr 1981;111(March (3)):480-7.
[45] McMullin TS, Lowe ER, Bartels MJ, Marty MS. Dynamic changes in lipids and
     proteins of maternal, fetal, and pup blood and milk during perinatal development
     in CD and Wistar rats. Toxicol Sci 2008;105(October (2)):260-74.
[46] Kudo N, Sakai A, Mitsumoto A, Hibino Y, Tsuda T, Kawashima Y. Tissue
     distribution and hepatic subcellular distribution of perfluorooctanoic acid at low dose
     are different from those at high dose in rats. Biol Pharm Bull 2007;30(August
     (8)):1535-40.
[47] Qi Z, Whitt I, Mehta A, Jin J, Zhao M, Harris RC, et al. Serial determination of
     glomerular filtration rate in conscious mice using FITC-inulin clearance. Am J
     Physiol Renal Physiol 2004;286(March (3)):F590-6.
[48] Luippold G, Pech B, Schneider S, Osswald H, Muhlbauer B. Age dependency
     of renal function in CD-1 mice. Am J Physiol Renal Physiol 2002;282(May
     (5)):F886-90.
[49] Han X, Snow TA, Kemper RA, Jepson GW. Binding of perfluorooctanoic acid to
     rat and human plasma proteins. Chem Res Toxicol 2003;16(June (6)):775-81.

-------
                                      Environ. Sci. Technol. 2009, 43, 2374-2380
Predicting Residential Exposure to Phthalate Plasticizer Emitted from
Vinyl Flooring: A Mechanistic Analysis
YING XU,† ELAINE A. COHEN HUBAL,‡
PER A. CLAUSEN,§ AND
JOHN C. LITTLE*,†
Department of Civil and Environmental Engineering, Virginia
Tech, Blacksburg, Virginia, National Center for
Computational Toxicology, Environmental Protection Agency,
Research Triangle Park, North Carolina, and New
Technologies Group, National Research Centre for the
Working Environment, Lersø Parkallé 105, DK-2100
Copenhagen Ø, Denmark
Received May 24, 2008. Revised manuscript received
January 5, 2009. Accepted January 5, 2009.
A two-room model is developed to estimate the emission
rate of di-2-ethylhexyl phthalate (DEHP) from vinyl flooring and
the evolving gas-phase and adsorbed surface concentrations
in a realistic indoor environment. Because the DEHP emission
rate measured in a test chamber may be quite different from
the emission rate from the same material in the indoor environment,
the model provides a convenient means to predict emissions
and transport in a more realistic setting. Adsorption isotherms for
phthalates and plasticizers on interior surfaces, such as
carpet, wood, dust, and human skin, are derived  from previous
field and laboratory studies. Log-linear relationships  between
equilibrium parameters and chemical vapor pressure are obtained.
The predicted indoor air DEHP concentration at steady state
is 0.15 μg/m3. Room 1 reaches steady state within about one year,
while the adjacent room reaches  steady state  about three
months later. Ventilation rate  has  a strong influence  on  DEHP
emission rate while total suspended particle concentration
has a substantial impact on gas-phase concentration. Exposure
to  DEHP via inhalation, dermal absorption, and oral ingestion
of  dust is evaluated. The model clarifies the mechanisms that
govern the release of DEHP from  vinyl flooring and the
subsequent interactions with  interior surfaces, airborne
particles, dust, and human  skin. Although further model
development, parameter identification, and model validation
are needed, our preliminary model provides a mechanistic
framework that elucidates exposure pathways for phthalate
plasticizers, and  can most likely be adapted to predict emissions
and transport of other semivolatile organic compounds,  such
as brominated flame retardants and biocides, in  a residential
environment.

Introduction
Since the 1930s, phthalates have been used as plasticizers to
enhance the flexibility of rigid  polyvinylchloride  (PVC)

   * Corresponding author fax: (540) 231-7916; e-mail: jcl@vt.edu.
   † Virginia Tech.
   ‡ Environmental Protection Agency.
   § National Research Centre for the Working Environment.

2374 « ENVIRONMENTAL SCIENCE & TECHNOLOGY / VOL. 43, NO. 7, 2009
products (1), with worldwide phthalate production exceeding
3.5 million tons/year (2). About 90% of phthalates are used
as plasticizers in polymers (e.g., PVC) and  are found in a
wide range of consumer products including floor and wall
covering, toys, car interior trim, clothing, gloves, footwear,
and artificial leather (3). Di(2-ethylhexyl) phthalate (DEHP)
is the most widely used and accounts for more than 50% of total
phthalate production (3). The main use of DEHP is in PVC
products such as vinyl flooring, where it is typically present
at concentrations of ~20-40% (w/w) (4, 5). Other common
phthalates are dibutyl phthalate (DBP), benzyl butyl phthalate
(BBP), di-isononyl phthalate (DINP), and di-isodecyl phthalate
(DIDP).
   Because phthalates are not chemically bound in polymers,
slow emission from the products to air or other media usually
occurs. Adverse health effects of phthalates are briefly
reviewed in the Supporting Information (SI). Phthalate esters
have been recognized as major indoor pollutants (3, 4, 6). By
sampling in 120 homes  and  analyzing  for 89 organic
chemicals, Rudel et al.  (7) revealed that phthalates are one
of the most abundant  contaminants in indoor air. In the
recent EPA-sponsored CTEPP (Children's Total Exposure to
Persistent Pesticides and Other Persistent Organic Pollutants)
study (8), concentrations of over 50 target compounds were
measured in multimedia samples  from the homes and
daycare centers of 260 preschool age children. The two
phthalates targeted in  the CTEPP  study were detected in
residential air and house dust, and on interior surfaces and
dermal wipe samples. As in Rudel et al.'s study, measured
phthalate concentrations were among the highest of any of
the targeted compounds, including pesticides, PAHs, and
PCBs. Despite this, only a few studies of phthalate emission
characteristics and exposure are available.  Uhde et al.  (9)
measured emission of several phthalates from PVC-coated
wall-coverings  in  test  chambers  under  standard room
conditions. Clausen et al. (5) measured emissions of DEHP
from vinyl flooring for  more than a year in both the FLEC
(field  and  laboratory  emission  cell) and  the  CLIMPAQ
(chamber for laboratory investigations of materials, pollution,
and air quality). In addition, the effect of humidity on the
emission of DEHP from vinyl flooring was studied for one
year in the FLEC (10), emission of phthalates from different
types of plasticized product was studied for 150 days in the
CLIMPAQ (11), and emission of phthalates from different
types of plasticized materials was studied using a passive
flux sampler (12).
   Based on the Clausen et al. (5)  experiments, Xu and Little
(13) developed  a model to predict the emission  rate of
phthalates from polymer materials. Their analysis revealed
that emissions of the  very low volatility semivolatile organic
compounds  (SVOCs) (such as DEHP) are subject to "external"
control (partitioning into the gas phase, the convective mass-
transfer coefficient, and adsorption onto chamber surfaces).
The tendency of phthalates to adsorb strongly to surfaces is
most likely similar to that of other SVOCs. Gebefügi (14)
showed that SVOCs were sorbed by cotton and Van Loy et
al. (15) found that more than 99% of recovered nicotine was
adsorbed to the walls  of  their stainless steel  chamber.
Compared  to these  experimental  chamber  systems, the
indoor environment  has many other types of surface that
will adsorb SVOCs such as DEHP to different extents. The
emission rate measured in a test chamber may therefore be
quite different from the emission rate from the same material
in the indoor environment.
   The model developed by Xu and Little (13) provides a
convenient means to estimate the emission rate and gas-phase
                                                                 10.1021/es801354f CCC: $40.75
                                                                                        © 2009 American Chemical Society
                                                                                              Published on Web 02/19/2009

-------


FIGURE 1. Schematic representation of the two-room model.
TABLE 1. Conditions for Two-Room Model

                                          room 1                          room 2
 volume (m x m x m)                       [illegible] x 3 x [illegible]   [illegible] x 3 x 3
 ventilation rate (m3/h)                  13.3                            13.3
 area of vinyl flooring, Av (m2)          9                               -
 area of carpet, Ac (m2)                  -                               9
 area of glass window, Ag (m2)            1.7 a                           1.7 a
 area of furniture, Af (m2)               -                               20.3 a
 area of ceiling and wall, Acw (m2)       41                              41
 total suspended particles, TSP (μg/m3)   20 b                            20 b

  a According to typical surface to volume ratio in residences (16).
  b Typical TSP in residential environment (17).
and adsorbed surface concentrations likely to occur in more
realistic indoor environments. In this paper we both simplify
and extend the Xu and Little model to investigate potential
emission and distribution of DEHP in a residential environ-
ment. Field data collected in  the CTEPP study as well as
recent laboratory data are used to parameterize the extended
model. The model is then used to estimate the emission rate
and gas-phase DEHP concentration following the installation
of vinyl flooring in a room. Finally, we examine the influence
of two key parameters (air exchange rate and airborne particle
concentration) on DEHP emissions, and estimate the po-
tential exposure through inhalation, dermal absorption, and
oral ingestion of dust.

Two-Room  Model
To better estimate DEHP emissions in a residential environ-
ment the SVOC emissions model (13) was extended from a
one-compartment description of an experimental chamber
to a two-compartment representation of two adjacent rooms
in a home (Figure 1). Vinyl flooring, the only source of DEHP
considered, is placed in room 1, while carpet and wooden
furniture are arranged in room 2. The room conditions are
provided in Table 1.
   Mass Balance. With reference to Figure 1, the accumula-
tion of phthalate in room 1 obeys the following mass balance:

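$$V\,\frac{dy_1(t)}{dt} = -A_g\,\frac{dC_g(t)}{dt} - A_{cw}\,\frac{dC_{cw}(t)}{dt} - [\text{one derivative term illegible}] + A_v\,\dot{m}(t) - Q\,y_1(t) - Q\,F_1(t) \qquad (1)$$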
where yi (/
-------
FIGURE 2. Schematic of sorption process. Note that the four individual materials shown for illustrative purposes in Figure 2a do not
comprise a layered structure.
where Csurf is the surface concentration (/
-------
FIGURE 3. Linear regression results for DBP: (a) human skin (R2 = 0.68, p-value < 2e-16); (b) dust. Each panel shows the raw data,
the linear regression, and the 95% prediction and confidence intervals, plotted against gas-phase concentration, y.
FIGURE 4. Linear regression between log (Vp) and log (Ksurf): (a) human skin (log Ksurf = -1.06 log Vp - 3.30, R2 = 0.97); (b) dust.
The horizontal axis is log Vp (mmHg).
TABLE 2. Partition Coefficients for DEHP

        surface                          partition coefficient, K      isotherm exponent, n
        furniture, wall and ceiling a    2500 (m)
        carpet b                         1700 (m)
        glass c                          3800 (μg/m2)/(μg/m3)^n        1.5
        skin d                           9500 (m)
        airborne particles e             0.25 (m3/μg)
        dust f                           21100 (m3/g)

  a Calculated using log Ksurf = -0.779 log Vp - 1.93, Figure S6.
  b Calculated using log Ksurf = -0.627 log Vp - 1.08, Figure S6.
  c Xu and Little fitted the Freundlich isotherm for glass.
  d Calculated using log Ksurf = -1.06 log Vp - 3.30, Figure 4.
  e Calculated using log Kparticle = -0.860 log Vp - 4.67, eq S3.
  f Regression result of Figure S5.
we developed simple correlations between the equilibrium
parameters and the vapor pressure of the target chemicals.
   Correlation of Equilibrium Parameters with Vapor
Pressure. Correlations were obtained between vapor pressure
(Vp) and sorption parameters (Ksurf) for different interior
surfaces, including settled dust. Linear relationships between
log (Vp) and log (Ksurf) were found (e.g., human skin and
dust) as shown in Figure 4. Results for all surfaces are shown
in Figure S6. Data for child hand wipe and adult hand wipe
were combined to get the relationship between the human
skin partition coefficient and vapor pressure. The partition
coefficient of BPA for dust did not conform well to this
relationship, thus Figure 4b only shows the relationship
between phthalate vapor pressures and partition coefficients
for dust.
   While using only three chemicals does not provide a
conclusive relationship, the overall results suggest that it is
possible to relate the equilibrium partition coefficients to
vapor pressures. Finally, we used the new correlations to
obtain the partition coefficient for DEHP on different interior
surfaces, as shown in Table 2. The isotherm used for glass
is based on a previous study (13), although the nonlinear
nature of the isotherm may have been due to the data fitting
procedure. In the SI, the skin/air partition coefficient is
checked using a completely different procedure and shown
to be acceptable.
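
As a rough numerical check on how these correlations are applied, the sketch below evaluates the log-linear fits quoted in the Table 2 footnotes. The function names are illustrative and not from the paper. The DEHP vapor pressure is not stated in this section, so rather than assume a value the code back-calculates the log Vp implied by each tabulated K.

```python
import math

# Slope and intercept of log10(Ksurf) = slope * log10(Vp) + intercept,
# copied from the Table 2 footnotes (Vp in mmHg, Ksurf in m).
FITS = {
    "furniture, wall and ceiling": (-0.779, -1.93),
    "carpet": (-0.627, -1.08),
    "skin": (-1.06, -3.30),
}

# Partition coefficients for DEHP as listed in Table 2 (m).
TABLE_2_K = {
    "furniture, wall and ceiling": 2500.0,
    "carpet": 1700.0,
    "skin": 9500.0,
}


def k_surf(log_vp, surface):
    """Partition coefficient predicted by the log-linear fit."""
    slope, intercept = FITS[surface]
    return 10.0 ** (slope * log_vp + intercept)


def implied_log_vp(k, surface):
    """Invert the fit: the log10(Vp) that reproduces a given K."""
    slope, intercept = FITS[surface]
    return (math.log10(k) - intercept) / slope


if __name__ == "__main__":
    for surface, k in TABLE_2_K.items():
        print(f"{surface}: implied log10(Vp) = {implied_log_vp(k, surface):.2f}")
```

The three surfaces imply log Vp values that fall within a narrow band (roughly -6.9 to -6.8), so the tabulated coefficients appear mutually consistent with a single DEHP vapor pressure.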
   Mass Transfer Coefficient. The value of hm, the mass-
transfer coefficient for the boundary layer adjacent to the
various surfaces, was estimated using correlation equations
(21), which express hm as a function of Reynolds number
and Schmidt number. Huang et al. (22) measured air velocities
in a typical house in the United States. They found velocities
with a range of 0.01-0.16 m/s and showed that values near
the floor are higher than those in the center of the room. In

-------
FIGURE 5. (a) Effect of air exchange rate and TSP concentration on DEHP concentration and emission rate. (b) Predicted DEHP
concentration on interior surfaces. Panel a plots the concentration in room 1, the concentration in room 2, and the emission rate
against time (0-1000 days) for air exchange rates of 0.25 and 0.5 h-1 and TSP concentrations of 20 and 40 μg/m3; panel b plots the
surface concentration on human skin, furniture, and glass window over the same period.
TABLE 3. Concentration of Phthalates in Indoor Air and Household Dust Samples

                                                       DEHP
                             references      n      mean     max     our study
gas phase concn (μg/m3)      (25)            40     0.48     1.6     0.15
                             (26)            125    0.14
                             (7)             102    0.07     1.0
                             (4)             59     0.19     0.4
dust phase concn (μg/g)      (25)            12     950      3100    3000
                             (27)            600    1200     3500
                             (7)             101    340      7700
                             (4) and (28)    30     776      1542
the CLIMPAQ chamber (5), which roughly approximates
conditions in a real room, the velocity at the test piece surface
was estimated to be 0.15 m/s and this value is used to estimate
hm. Odum et al. (23) measured mass transfer of PAHs and
other SVOCs to and from combustion aerosols at 25 °C, and
their result is used here as hmp, the mass-transfer coefficient
for particles.
   We now have estimates of all the partition coefficients for
DEHP between indoor air and interior surfaces (hardwood
floor, carpet,  human skin, and  particles), as well as the
associated mass transfer coefficients. We are therefore able
to use the two-room model to estimate the emission rate
and evolving gas-phase DEHP concentration following the
installation of vinyl flooring in room 1.
   Because no other data for DEHP concentrations on real
interior surfaces are available, the CTEPP study provides the
only available data that we can  use  to estimate DEHP
adsorption isotherms. Even though the values of the partition
coefficient  for DEHP  on interior surfaces  can only be
considered rough estimates, we showed in  a sensitivity
analysis (24) that they do not have a strong influence on the
steady state indoor air DEHP concentration, which is the
basis for our exposure analysis (although they do influence
the time it takes to reach steady state). We therefore believe
that our emissions and transport model represents a reason-
able first step.

Results  and Discussion
For baseline conditions (Table 1), the indoor air DEHP
concentration at steady state is 0.15 μg/m3. As shown in Table
3, this value is similar to that measured within homes in
both the United States and Europe, although it should be
emphasized that vinyl flooring is the only source of DEHP
considered here. Room 1 reaches steady state within about
one year, while the adjacent room reaches steady state about
     three months later. Airborne particles increase the rate at
     which DEHP is transported between rooms by a factor of 5
     relative to gas-phase transport. The boundary layer sur-
     rounding the airborne particles is much thinner than the
     boundary layer adjacent to the other indoor surfaces and the
     suspended particles reach equilibrium with the gas-phase
     much more rapidly than the larger surfaces.  Suspended
     particles are therefore very effective at transporting DEHP
     from one room to another, because DEHP also desorbs very
     rapidly from the particles.
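
The factor of 5 can be reproduced from values given elsewhere in the paper if the particle-bound concentration is assumed to be in equilibrium with the gas phase (F = Kp x TSP x y). The sketch below is an illustrative check with hypothetical function names, using Kp = 0.25 m3/μg from Table 2 and TSP = 20 μg/m3 from Table 1.

```python
def particle_to_gas_ratio(kp_m3_per_ug, tsp_ug_per_m3):
    """Ratio F/y of particle-bound to gas-phase concentration at equilibrium."""
    return kp_m3_per_ug * tsp_ug_per_m3


def particle_fraction(ratio):
    """Share of total airborne DEHP (gas + particle) carried on particles."""
    return ratio / (1.0 + ratio)


ratio = particle_to_gas_ratio(0.25, 20.0)  # Table 2 Kp, Table 1 TSP
print(ratio, round(particle_fraction(ratio), 2))
```

Under this equilibrium assumption roughly five-sixths of airborne DEHP is particle-bound, which is in line with the approximately 80% particle contribution to inhalation exposure reported in the exposure discussion.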
        In Figure 5a, the impact of air exchange rate and total
     suspended particle concentration on the DEHP emission rate
     and the DEHP concentration in rooms 1 and 2 is examined.
     Increasing air exchange rate will increase the DEHP emission
     rate from the vinyl flooring significantly while an increase in
     the TSP concentration causes a substantial decrease in the
     gas-phase concentration in  both rooms, but increases the
     emission rate in Room 1. An increase in the air exchange rate
     was assumed to double the velocity of the air above the vinyl
     flooring (from 0.15 to 0.30 m/s) and this higher value was
     used to calculate the hm associated with the flooring.
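
At steady state the accumulation and sorption terms in the room 1 mass balance vanish, so emission from the flooring is balanced by ventilation losses of gas-phase and particle-bound DEHP: Av x m = Q x (y1 + F1). The sketch below evaluates that balance for the baseline values in Tables 1 and 2. It assumes equilibrium particle partitioning (F1 = Kp x TSP x y1) and DEHP-free incoming air; the function name is illustrative, and the result is a consistency check rather than output from the authors' dynamic model.

```python
def steady_state_emission_rate(y1, q, a_v, kp, tsp):
    """Emission rate (ug/m2/h) that balances ventilation losses at steady state.

    y1  : gas-phase concentration in room 1 (ug/m3)
    q   : ventilation rate (m3/h)
    a_v : area of vinyl flooring (m2)
    kp  : particle/gas partition coefficient (m3/ug)
    tsp : total suspended particle concentration (ug/m3)
    """
    f1 = kp * tsp * y1  # particle-bound DEHP, assumed at equilibrium with the gas
    return q * (y1 + f1) / a_v


m = steady_state_emission_rate(y1=0.15, q=13.3, a_v=9.0, kp=0.25, tsp=20.0)
print(round(m, 2))  # ug/m2/h
```

With the reported steady-state concentration of 0.15 μg/m3, this balance gives an emission rate a little above 1 μg/m2/h. Because the loss term scales with Q x (1 + Kp x TSP), the same expression shows why both ventilation and particle loading pull more DEHP out of the source.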
        Figure  5b shows the predicted DEHP concentration
     change with time on various interior surfaces in room 2. The
     predicted DEHP concentration on human skin is 5-7 times
     higher  than on the other surfaces due to the high skin/air
     partition coefficient for DEHP. The skin/air partition coef-
     ficient was obtained from hand-wipe samples in the CTEPP
     study. It is generally believed that these hand-wipe  samples
     are measuring chemicals transferred from indoor  surfaces
     onto the hands directly. However, the fact that the skin/air
     isotherms determined for both adult and child are almost
     identical for DBP, BBP, and BPA (see, for example, Figure
     4a), suggests that SVOCs may be transferring directly from
     the air to the skin, or that if large amounts are picked up by
     direct dermal transfer,  that some desorbs to re-establish

-------
equilibrium with the air. Indeed, in a subsequent paper (24),
we show that there is a strong correlation between the
concentrations of DBP, BBP, and BPA on skin and those in
the gas phase, but almost no correlation with those on interior
surfaces. This further suggests that certain SVOCs may reach
the skin through the gas phase, and not via dermal transfer
as is commonly suspected.
   Estimating Exposure to DEHP from  Vinyl Flooring.
Based on the model results,  we are interested in evaluating
exposure to vapor phase DEHP in air, particle bound DEHP
in air, and DEHP in settled dust. The exposure pathways of
interest are inhalation of vapor, inhalation of particles, dermal
absorption of DEHP deposited on the skin, and oral ingestion
via household dust.
   The detailed exposure calculations are shown in the SI.
For dermal exposure, the overall skin permeability coefficient,
P, is controlled by permeation through the skin (Pskin/air) as
well as by permeation through the air boundary layer adjacent
to the skin (Pair), or:

$$P = \frac{1}{\dfrac{1}{P_{skin/air}} + \dfrac{1}{P_{air}}} \qquad (10)$$

where Pskin/air (cm/hr) is vapor to skin permeability, and Pair
(cm/hr) is permeability of the boundary layer. For low
volatility compounds, convective mass transfer through the
air boundary layer adjacent to the skin may become the rate
limiting factor, and this is the case for DEHP. As detailed in
the SI, the estimated value of P is 580 cm/hr.
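
The statement that the air boundary layer becomes rate limiting follows from treating the two permeabilities as resistances in series. The sketch below assumes that series form; the input values are hypothetical and chosen only to show the limiting behaviour, since the actual Pskin/air and Pair for DEHP are given in the SI and not in this section.

```python
def overall_permeability(p_skin_air, p_air):
    """Combine two permeabilities in series (same units in, same units out)."""
    return 1.0 / (1.0 / p_skin_air + 1.0 / p_air)


# Hypothetical inputs (cm/hr), not values from the paper or its SI.
p = overall_permeability(p_skin_air=10000.0, p_air=600.0)
print(round(p, 1))
```

When Pskin/air is much larger than Pair, the overall value sits just below Pair, so the boundary layer controls uptake and further increases in skin permeability change the result very little.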
   As shown in Table S8, the reference dose (RfD) for DEHP
is 20 μg/kg/d according to the U.S. EPA. Airborne particles
contribute 80% of the inhalation exposure, although the
highest value of total inhalation exposure is less than 0.6
μg/kg/d, which is much lower than the RfD. For infants,
exposure through oral intake via dust is 1.6 times higher
than the RfD, although the estimate for dust intake rate of
10.3 mg/kg/d (29) may be high. Exposure via these two
pathways is similar to other study results (6, 30, 31). Dermal
absorption of DEHP deposited on skin is greater than that
taken up through inhalation. For DEHP, the primary route
of exposure is oral ingestion of dust. Overall, children
experience 2-10 times higher exposure risk than adults based
on all exposure pathways.
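
The infant dust-ingestion figure can be approximated from numbers quoted in the text and tables. The sketch below assumes that settled dust is in equilibrium with the gas phase (dust concentration = Kdust x y) and that all ingested DEHP counts toward the dose; the detailed calculation in the SI may use additional factors, so this is an order-of-magnitude check with illustrative names.

```python
RFD = 20.0             # reference dose for DEHP, ug/kg/d (from the text)
K_DUST = 21100.0       # dust/air partition coefficient, m3/g (Table 2)
Y_GAS = 0.15           # steady-state gas-phase concentration, ug/m3
DUST_INTAKE = 10.3e-3  # infant dust intake, g/kg/d (10.3 mg/kg/d, ref 29)


def dust_concentration(k_dust, y_gas):
    """DEHP in settled dust (ug/g), assuming equilibrium with the gas phase."""
    return k_dust * y_gas


def oral_dose(dust_conc, intake):
    """Daily oral dose via dust ingestion (ug/kg/d)."""
    return dust_conc * intake


c_dust = dust_concentration(K_DUST, Y_GAS)
dose = oral_dose(c_dust, DUST_INTAKE)
print(round(c_dust), round(dose, 1), round(dose / RFD, 2))
```

The resulting dust concentration is close to the 3000 μg/g listed for this study in Table 3, and the dose-to-RfD ratio comes out near the factor of 1.6 stated above.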

Acknowledgments
Financial support was provided by the  National  Science
Foundation (CBET 0504167). We thank Peter  Egeghy at EPA's
National Exposure Research Laboratory for his assistance
with the CTEPP database, and John Kissel, Linsey Marr, Bill
Nazaroff, and Charlie Weschler for their  useful comments
on the draft  manuscript. This paper has been subjected to
U.S. EPA Office of Research and Development review and
approved for publication.

Supporting Information Available
Further details on the regression results for the CTEPP data,
modification of surface partition coefficients, and detailed
exposure calculations. This material is available free of charge
via the Internet at http://pubs.acs.org.

Literature Cited
 (1) Latini, G.; Felice, C. D.; Verrotti, A. Plasticizers, infant nutrition
     and reproductive health. Reprod. Toxicol. 2004, 19, 27-33.
 (2) Cadogan, D. F.; Howick, C. J. Plasticizers. Kirk-Othmer Ency-
     clopedia of Chemical Technology 1996, 19, 258-290.
 (3) Bornehag, C.-G.; Lundgren, B.; Weschler, C. J.; Sigsgaard, T.;
     Hagerhed-Engman, L.; Sundell, J. Phthalates in Indoor Dust
     and Their Association with Building Characteristics. Environ.
     Health Perspect. 2005, 113 (10), 1399-1404.
 (4) Fromme, H.; Lahrz, T.; Piloty, M.; Gebhart, H.; Oddoy, A.; Rüden,
     H. Occurrence of phthalates and musk fragrances in indoor air
     and dust from apartments and kindergartens in Berlin (Ger-
     many). Indoor Air 2004, 14, 188-195.
 (5) Clausen, P. A.; Hansen, V.; Gunnarsen, L.; Afshari, A.; Wolkoff,
     P. Emission of di-2-ethylhexyl phthalate from PVC flooring into
     air and uptake in dust: Emission and sorption experiments in
     FLEC and CLIMPAQ. Environ. Sci. Technol. 2004, 38 (9), 2531-
     2537.
 (6) Wensing, M.; Uhde, E.; Salthammer, T. Plastics additives in the
     indoor environment - flame retardants and plasticizers. Sci.
     Total Environ. 2005, 339, 19-40.
 (7) Rudel, R.; Camann, D.; Spengler, J.; Korn, L.; Brody, J. Phthalates,
     alkylphenols, pesticides, polybrominated diphenyl ethers, and
     other endocrine-disrupting compounds in indoor air and dust.
     Environ. Sci. Technol. 2003, 37 (20), 4543-4553.
 (8) U.S. EPA. A Pilot Study of Children's Total Exposure to Persistent
     Pesticides and Other Persistent Organic Pollutants (CTEPP); 2005;
     http://www.epa.gov/heasd/ctepp/index.htm (accessed May 13,
     2008).
 (9) Uhde, E.; Bednarek, M.; Fuhrmann, F.; Salthammer, T. Phthalic
     Esters in the Indoor Environment - Test chamber studies on
     PVC-coated wallcoverings. Indoor Air 2001, 11, 150-155.
(10) Clausen, P. A.; Xu, Y.; Kofoed-Sørensen, V.; Little, J. C.; Wolkoff,
     P. The influence of humidity on the emission of di-(2-ethylhexyl)
     phthalate (DEHP) from vinyl flooring in the emission cell FLEC.
     Atmos. Environ. 2007, 41, 3217-3224.
(11) Afshari, A.; Gunnarsen, L.; Clausen, P. A.; Hansen, V. Emission
     of phthalates from PVC and other materials. Indoor Air 2004,
     14, 120-128.
(12) Fujii, M.; Shinohara, N.; Lim, A.; Otake, T.; Kumagai, K.;
     Yanagisawa, Y. A study on emission of phthalate esters from
     plastic materials using a passive flux sampler. Atmos. Environ.
     2003, 37, 5495-5504.
(13) Xu, Y.; Little, J. Predicting emissions of SVOCs from polymeric
     materials and their interaction with airborne particles. Environ.
     Sci. Technol. 2006, 40 (2), 456-461.
(14) Gebefügi, I. Chemical exposure in enclosed environments.
     Toxicol. Environ. Chem. 1989, 20-21, 121-127.
(15) Van Loy, M. D.; Lee, V. C.; Gundel, L. A.; Daisey, J. M.; Sextro,
     R. G.; Nazaroff, W. W. Dynamic behavior of semivolatile organic
     compounds in indoor air. 1. Nicotine in a stainless steel chamber.
     Environ. Sci. Technol. 1997, 31, 2554-2561.
(16) Hodgson, A. T.; Ming, K. Y.; Singer, B. C. Quantifying Object and
     Material Surface Areas in Residences; LBNL-56786; Lawrence
     Berkeley National Laboratory, 2005.
(17) Weschler, C. J. Indoor/outdoor connections exemplified by
     processes that depend on an organic compound's saturation
     vapor pressure. Atmos. Environ. 2003, 37, 5455-5465.
(18) Van Loy, M.; Riley, W.; Daisey, J.; Nazaroff, W. Dynamic behavior
     of semivolatile organic compounds in indoor air. 2. Nicotine
     and phenanthrene with carpet and wallboard. Environ. Sci.
     Technol. 2001, 35 (3), 560-567.
(19) Won, D.; Corsi, R. L.; Rynes, M. Sorptive Interactions between
     VOCs and Indoor Materials. Indoor Air 2001, 11, 246-256.
(20) Pankow, J. Application of common y-intercept regression
     parameters for log Kp vs 1/T for predicting gas-particle partitioning
     in the urban environment. Atmos. Environ. Part A 1992, 26 (14),
     2489-2497.
(21) Axley, J. W. Adsorption modeling for building contaminant
     dispersal analysis. Indoor Air 1991, 2, 147-171.
(22) Huang, J. M.; Chen, Q.; Ribot, B.; Rivoalen, H. Modeling
     contaminant exposure in a single-family house. Indoor Built
     Environ. 2004, 13 (1), 5-19.
(23) Odum, J.; Yu, J.; Kamens, R. Modeling the mass transfer of
     semivolatile organics in combustion aerosols. Environ. Sci.
     Technol. 1994, 28, 2278-2285.
(24) Xu, Y.; Hubal, E. C.; Little, J. C. Predicting Residential Exposure
     to Phthalate Plasticizer Emitted from Vinyl Flooring-Sensitivity,
     Uncertainty, and Implications for Biomonitoring. Submitted.
(25) BAUCH. Analyse und Bewertung der in Raumluft und Hausstaub
     vorhandenen Konzentrationen der Weichmacherbestandteile
     Diethylhexylphthalat (DEHP) und Dibutylphthalat (DBP); Berlin,
     Germany, 1991.
(26) Sheldon, L.; Clayton, A.; Keever, J.; Perritt, R.; Whitaker, D.
     PTEAM: Monitoring of phthalates and PAHs in indoor and
     outdoor air. In Samples in Riverside California; Research Note
     94-10; California Environmental Protection Agency, Air Re-
     sources Board: Sacramento, CA, 1994.
(27) Mattulat, A. (SOFIA GmbH Berlin) (2002) Konzentration von
     mittel- und schwerflüchtigen organischen Verbindungen in Staub
     aus Innenräumen - Belastungssituation im Jahr 2001.
                                                                VOL. 43, NO. 7, 2009 / ENVIRONMENTAL SCIENCE & TECHNOLOGY « 2379

-------
(28) Weschler, C. J.; Salthammer, T.; Fromme, H. Partitioning of
    phthalates among the gas phase, airborne particles and settled
    dust in indoor environments. Atmos. Environ. 2008, 42 (7), 1449-
    1460.
(29) Stubenrauch, S.; Hempfling, R.; Doetsch, P. D. G. Vorschläge
    zur Charakterisierung und Quantifizierung pfadübergreifender
    Schadstoffexposition. Z. Umweltchem. Ökotoxikol. 1999, 11,
    219-226.
(30) Fromme, H.; Gruber, L.; Schlummer, M.; Wolz, G.; Böhmer, S.;
    Angerer, J.; Mayer, R.; Liebl, B.; Bolte, G. Intake of phthalates
    and di(2-ethylhexyl)adipate: Results of the Integrated Exposure
    Assessment Survey based on duplicate diet samples and
    biomonitoring data. Environ. Int. 2007, 33 (8), 1012-1020.
(31) Wormuth, M.; Scheringer, M.; Vollenweider, M.; Hungerbühler,
    K. What Are the Sources of Exposure to Eight Frequently Used
    Phthalic Acid Esters in Europeans? Risk Anal. 2006, 26 (3),
    803-824.
      ES801354F

-------
Research

Profiling Chemicals Based on Chronic Toxicity Results from the U.S. EPA
ToxRef Database
Matthew T. Martin, Richard S. Judson, David M. Reif, Robert J. Kavlock, and David J. Dix
National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Research
Triangle Park, North Carolina, USA
 BACKGROUND: Thirty years of pesticide registration toxicity data have been historically stored as
 hardcopy and scanned documents by the U.S. Environmental Protection Agency (EPA). A signifi-
 cant portion of these data have now been processed into standardized and structured toxicity data
 within the EPA's Toxicity Reference Database (ToxRefDB), including chronic, cancer, develop-
 mental, and reproductive studies from laboratory animals. These data are now accessible and mine-
 able within ToxRefDB and are serving as a primary source of validation for U.S. EPA's ToxCast
 research program in predictive toxicology.
 OBJECTIVES: We profiled in vivo toxicities across 310 chemicals as a model application of
 ToxRefDB, meeting the need for detailed anchoring end points for development of ToxCast predic-
 tive signatures.
 METHODS: Using query and structured data-mining approaches, we generated toxicity profiles from
 ToxRefDB based on long-term rodent bioassays. These chronic/cancer data were analyzed for suit-
 ability as anchoring end points based on incidence, target organ, severity, potency, and significance.
 RESULTS: Under conditions of the bioassays, we observed pathologies for 273 of 310 chemicals, with
 greater preponderance (> 90%) occurring in the liver, kidney, thyroid, lung, testis, and spleen. We
 observed  proliferative lesions for 225 chemicals, and  167 chemicals caused progression to cancer-
 related pathologies.
 CONCLUSIONS: Based on incidence, severity, and potency, we selected 26  primarily tissue-specific
 pathology end points to uniformly classify the 310 chemicals. The resulting toxicity profile
 classifications demonstrate the utility of structuring legacy toxicity information and facilitating the
 computation of these data within ToxRefDB for ToxCast and other applications.
 KEY WORDS: cancer, chronic toxicity, pesticides, relational database, toxicity profile. Environ Health
 Perspect 117:392-399 (2009). doi:10.1289/ehp.0800074 available via http://dx.doi.org/ [Online
 20 October 2008]
The U.S. Environmental Protection Agency
(EPA) and other regulatory agencies are inves-
tigating novel approaches to predict chemical
toxicity, with the major goals being to enable
the rapid screening of thousands of chemi-
cals that have not previously been character-
ized, to  increase mechanistic understanding of
chemical toxicity, and  to reduce the number
of animals required for toxicity testing. All of
these goals initially require high-quality in vivo
toxicity  data in order to test and validate these
new approaches.  To  support  U.S. EPA's
ToxCast effort  (Dix et al.  2007), we have
created  the structured and curated Toxicity
Reference Database (ToxRefDB) to  tabulate
information from guideline  in vivo toxicity
studies.  ToxRefDB and related databases will
help support computational analysis and mod-
eling of the links from  molecular interactions
through cellular and organ phenotypes all the
way to whole-animal toxicity. This transfor-
mation of existing toxicity data will facilitate a
transition to the National Research Council's
(NRC)  vision for Toxicity Testing in the 21st
Century (Collins et al. 2008; NRC 2007). The
NRC envisions a focus on  toxicity pathways
that will link molecular assays to toxicity out-
comes in humans and ecological species.
   Traditional toxicity testing for risk assess-
ment of single compounds  or limited groups
of compounds can cost millions  of dollars
per chemical and years of effort. Since  1970,
the U.S. EPA has accumulated a vast store of
high-quality regulatory toxicity information
on thousands of compounds, most of which
has been inaccessible for computational  analy-
ses. The curation and structuring of chemical
toxicity information into the readily accessible
ToxRefDB have created a valuable resource for
both retrospective and prospective toxicologic
studies. ToxRefDB initially focused  on captur-
ing developmental rat and rabbit, multigenera-
tion reproduction rat, and chronic/cancer rat
and cancer mouse studies. In addition  to the
data model,  we developed a detailed toxicity-
based controlled vocabulary for all the study
types spanning clinical chemistry, pathology,
reproductive, and developmental effects.
    An important initial application  of
ToxRefDB is to provide anchoring of in vivo
toxicity data for the U.S. EPA's ToxCast
research program, which has been designed to
address the agency's needs for chemical  prior-
itization by  using state-of-the-art approaches
in high-throughput screening (HTS) and
toxicogenomics (U.S.  EPA 2008b). Nearly
all of the ToxCast phase I chemicals are food-
use pesticide active ingredients that have
undergone a full  suite of mammalian  toxic-
ity  tests, creating an unparalleled reference
set of toxicologic information. The complete
and highly standardized data set provided by
ToxRefDB facilitates analysis of the ToxCast
phase I chemicals across chemical, study type,
species, target organ, and effect. Additionally,
ToxRefDB serves as a model for other efforts
to capture quantitative, tabular toxicology
data from legacy and new studies and to make
these data useful for cross-chemical computa-
tional toxicology analysis.

Methods
Data characteristics. We  collected reviews
of registrant-submitted  toxicity studies,
known as data evaluation records (DERs),  for
roughly 400 chemicals from the U.S. EPA's
Office of Pesticide Programs (OPP) within
the Office of Pollution Prevention  and Toxic
Substances  (OPPTS). The file  types of the
DERs include TIFF, Microsoft Word, WordPerfect,
and PDF formats, some of which are
not directly text-readable. We indexed every
DER file based on a file name convention
that consisted of the pesticide chemical (PC)
code, study identification number (MRID),
study type identification number  [based  on
870 series OPPTS  harmonized  health effect
guidelines (U.S. EPA 1996)], species code,
review identification  number (TXR), and a
review version code. The latter  code identi-
fied the review as a primary review, secondary
review, supplemental review, updated execu-
tive summary, or a deficient review.
    For the initial  build of ToxRefDB,  we
collected and  indexed a total of 4,620 DERs
from OPP. These included five types of studies

Address correspondence to M.T.  Martin, U.S.
Environmental Protection Agency, MD D343-03,
109 TW Alexander Dr., Research Triangle Park, NC
27711 USA. Telephone: (919) 541-4104. Fax: (919)
685-3399. E-mail: Martin.Matt@epa.gov
  Supplemental Material is available online at http://
www.ehponline.org/members/2008/0800074/
suppl.pdf
  We thank the Office of Pesticide Programs  for
contributions to the ToxRefDB project, including
access to toxicity data evaluation records, scientific
consultation, and review of the manuscript by V.
Dellarco. We also thank D. Corum, K. McLaurin,
L. Peck, D. Rotroff, and D. Scoville for excellent
work entering data into ToxRefDB.
  The U.S. EPA, through its Office of Research and
Development funded and managed the research
described here. It has been subjected to agency
review and approved for publication.
  The authors  declare they have no  competing
financial interests.
  Received 6 August 2008; accepted 20 October 2008.
                           VOLUME 117 | NUMBER 3 | March 2009  • Environmental Health Perspectives

-------
                                                                                    Profiling toxicity of chemicals using ToxRefDB
from a variety of species: developmental in rat
and rabbit, reproductive in rat, subchronic in
mouse and rat, and chronic or cancer  in rat
and mouse. Approximately 1,000 DERs pro-
vided chronic and cancer data, and we selected
a subset of these for curation into the data-
base to yield data on 310 unique chemicals:
rat chronic/cancer studies on 283 chemicals,
and mouse cancer studies on 267 chemicals.
Each study assessed a single technical-grade
chemical's toxicity potential in a single species
and study type. The first portion of the DER
outlines the test substance, purity, lot/batch
numbers, MRID, study citation, OPPTS test
guideline, and reviewers of the study. The
executive summary captures all of the basic
study design information, including species
and  strain,  doses, number of animals per
treatment group, and any deficiencies in study
protocol.
   Dose levels are listed in parts per million
and  through food consumption and body
weight calculation or standard conversion
as milligrams per kilogram body weight per
day. Where possible, dose levels were listed as
milligrams per kilogram body weight  per day
in ToxRefDB. The executive summary also
describes adverse effects observed at all dose
levels in the study. No observed adverse effect
level (NOAEL) and lowest observed adverse
effect level (LOAEL) are established based
on adverse effects. The adverse effects used to
derive  NOAEL and LOAEL are referred to
as "critical effects" in this article, regardless of
their role in establishing reference dose levels
in regulatory determinations for a chemical.
   The body of the DERs provides detailed
test material,  animal information, and full
dose-response data in text and tables for a
variety of "effect types", including  mortality,
clinical signs, clinical chemistry, hematology,
urinalysis, gross pathology, nonneoplastic
pathology, and neoplastic pathology. For each
effect type, we also specified an "effect target"
(e.g., liver as target organ) and "effect descrip-
tion" (e.g., hypertrophy).
   ToxCast phase I chemicals also included
nonpesticidal chemicals such as perfluorinated
compounds, phthalates, and other industrial
chemicals. Although DERs and pesticide regis-
tration studies were not available for these
chemicals, there were often high-quality and
standardized chronic and other types of toxicity
studies available from the National Toxicology
Program,  peer-reviewed literature, or  other
sources. We organized and evaluated data from
these study reports and publications consistent
with the information from the DERs.
   Information  on chemical  identity and
structure was  provided by  the  U.S. EPA
DSSTox (Distributed Structure-Searchable
Toxicity) program  (U.S.  EPA 2007).
ToxRefDB outputs are linked to informa-
tion from other sources through the U.S. EPA
ACToR  (Aggregated  Computational
Toxicology Resource) database (Judson et al.
2008b;  U.S. EPA 2008a). ACToR will also
serve as  the primary portal for public access to
ToxRefDB and related outputs. ACToR stores
the HTS data being generated by the ToxCast
program and will link these HTS data with
traditional toxicity data from ToxRefDB and
other sources.
    Relational model. In  the development of
ToxRefDB, a relational model approach was
taken with input from other toxicity data-
base standards, including  ToxML (Yang et al.
2006).  The resulting data model is semi-
hierarchical in  nature: a single compound can
be tested in multiple studies, each study can
contain  multiple treatment groups, and mul-
tiple effects can be observed in each treatment
group. The data model is organized from a
chemical-centric viewpoint to allow data inte-
gration  and exchange with other data sources
and to facilitate the linkage of the reference
toxicity  information to chemical-specific data
generated using in vitro technologies (i.e.,
ToxCast). The relational model was then
implemented into a table structure with estab-
lished relationships that  ensure data integ-
rity, updateability, and standardization [see
Supplemental Material, Figure 1  (http://www.
ehponline.org/members/2008/0800074/
suppl.pdf)].
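
The hierarchy described above (chemical, study, treatment group, effect) can be sketched as a small relational schema. This is an illustrative sketch only: the table names, column names, and sample rows are assumptions made for the example and are not taken from the actual ToxRefDB schema, which is shown in the Supplemental Material. SQLite is used here purely for convenience in place of MySQL.

```python
import sqlite3

# Hypothetical table and column names chosen for illustration; the actual
# ToxRefDB schema (Supplemental Material, Figure 1) may differ.
SCHEMA = """
CREATE TABLE chemical (
    chemical_id INTEGER PRIMARY KEY,
    casrn       TEXT UNIQUE NOT NULL,
    name        TEXT NOT NULL
);
CREATE TABLE study (
    study_id    INTEGER PRIMARY KEY,
    chemical_id INTEGER NOT NULL REFERENCES chemical(chemical_id),
    study_type  TEXT NOT NULL,
    species     TEXT NOT NULL
);
CREATE TABLE treatment_group (
    group_id       INTEGER PRIMARY KEY,
    study_id       INTEGER NOT NULL REFERENCES study(study_id),
    sex            TEXT NOT NULL,
    dose_mg_kg_day REAL NOT NULL
);
CREATE TABLE effect (
    effect_id          INTEGER PRIMARY KEY,
    group_id           INTEGER NOT NULL REFERENCES treatment_group(group_id),
    effect_type        TEXT NOT NULL,
    effect_target      TEXT NOT NULL,
    effect_description TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database holding one made-up chemical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO chemical VALUES (1, 'demo-casrn', 'Chemical X')")
    conn.execute("INSERT INTO study VALUES (1, 1, 'chronic/cancer', 'rat')")
    # Two dose groups; only the higher dose group has recorded effects.
    conn.executemany(
        "INSERT INTO treatment_group VALUES (?, 1, 'M', ?)",
        [(1, 5.0), (2, 50.0)],
    )
    conn.executemany(
        "INSERT INTO effect VALUES (?, 2, ?, 'liver', ?)",
        [(1, "nonneoplastic pathology", "hypertrophy"),
         (2, "neoplastic pathology", "adenoma")],
    )
    conn.commit()
    return conn


def effects_for_chemical(conn, casrn):
    """Walk chemical -> study -> treatment group -> effect for one chemical."""
    query = """
        SELECT s.species, g.dose_mg_kg_day,
               e.effect_type, e.effect_target, e.effect_description
        FROM chemical c
        JOIN study s           ON s.chemical_id = c.chemical_id
        JOIN treatment_group g ON g.study_id = s.study_id
        JOIN effect e          ON e.group_id = g.group_id
        WHERE c.casrn = ?
        ORDER BY e.effect_id
    """
    return conn.execute(query, (casrn,)).fetchall()
```

Keying every query on the chemical table mirrors the chemical-centric organization described in the text, which is what makes linking to chemical-specific in vitro data straightforward.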
    Development of a toxicity-based controlled
vocabulary. The development of a controlled
vocabulary within ToxRefDB was  neces-
sary for the standardization of data captured
across various studies and study types per-
formed  over roughly 30 years. The nonredun-
dant list of terms across various information
domains provided data integrity and search-
ability. We based study type terminology on
the unique study types harmonized  by the
Organisation for Economic Co-operation and
Development and the OPPTS (U.S. EPA
1996). Specific standardized terminology for
study design was established for species/strain,
method/route of administration,  and units for
dose and dosing duration. Treatment  group-
related vocabularies were  developed to estab-
lish the generation, gender, and dosing period.
    A primary goal in evaluating the registrant-
submitted toxicity  studies is  to establish
NOAEL and LOAEL values  for a variety of
categorical end points, including  systemic, off-
spring, maternal, parental, developmental, and
reproductive toxicity across the various study
types. These categorical end points are captured
and normalized across studies for each effect
responsible for deriving the NOAEL/LOAEL.
    The  development of a toxicologic effect
vocabulary was approached in a domain-
specific manner. For example, we derived
clinical  pathology terms from OPPTS guide-
lines and  collected clinical pathology labo-
ratories and organ pathology terms from
various  public resources, including the
National Toxicology Program's Pathology
Code Tables (2007). The vocabulary under-
went further standardization  by mapping
all synonymous terms to a single nonre-
dundant value. We took a taxonomical
approach for establishing the finalized effect
vocabulary based on a three-tiered hierarchi-
cal model, with the effect type at the  top,
followed by effect target and then effect
description. Examples of effect type include
clinical chemistry, hematology, urinalysis,
body weight, mortality, gross pathology,
nonneoplastic pathology, neoplastic pathol-
ogy, and developmental and reproductive
effects. Subclasses of these types include spe-
cific  target organs (e.g., liver, lung, spleen) or
measured analytes (e.g., alanine aminotrans-
ferase, aspartate aminotransferase, choles-
terol). The specific combinations of effect
type and target are then further subclassed
based on a  nonredundant descriptive term
(e.g., increase,  decrease, hypertrophy,  atro-
phy). For organ pathology terms, each target
organ has a set of regions, zones, and cell
types that  characterize the site of toxicity.
The  full effect vocabulary is available on the
ToxRefDB home page (U.S. EPA 2008c).
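
The mapping of synonymous terms onto a three-tiered vocabulary (effect type, effect target, effect description) can be illustrated with a minimal sketch. The standardized terms below are drawn from the examples in the text; the raw spellings mapped onto them, and the function names, are hypothetical and do not reproduce the actual ToxRefDB vocabulary.

```python
# Illustrative synonym tables, one per tier of the effect vocabulary.
# The raw spellings (dictionary keys) are hypothetical examples of what a
# study report might contain; they are not taken from ToxRefDB.
EFFECT_TYPES = {
    "clinical chemistry": "clinical chemistry",
    "clin chem": "clinical chemistry",
    "nonneoplastic pathology": "nonneoplastic pathology",
    "non-neoplastic pathology": "nonneoplastic pathology",
}
EFFECT_TARGETS = {
    "liver": "liver",
    "hepatic": "liver",
    "alanine aminotransferase": "alanine aminotransferase",
    "alt": "alanine aminotransferase",
}
EFFECT_DESCRIPTIONS = {
    "increase": "increase",
    "increased": "increase",
    "elevated": "increase",
    "hypertrophy": "hypertrophy",
    "hypertrophic": "hypertrophy",
}


def _lookup(term, table):
    """Normalize case and whitespace, then map to the single standard term."""
    key = " ".join(term.lower().split())
    if key not in table:
        raise KeyError(f"term not in controlled vocabulary: {term!r}")
    return table[key]


def standardize_effect(effect_type, target, description):
    """Return the (type, target, description) triple in standardized terms."""
    return (
        _lookup(effect_type, EFFECT_TYPES),
        _lookup(target, EFFECT_TARGETS),
        _lookup(description, EFFECT_DESCRIPTIONS),
    )
```

Rejecting unknown terms, rather than storing them as entered, is one way to keep the vocabulary nonredundant; whether the Data Entry Tool behaves this way is not stated in the text.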
   Data input.  The ToxRefDB Data Entry
Tool was developed with Microsoft Access
providing the user interface for all initial data
input and is also available at the ToxRefDB
home page  (U.S. EPA 2008c). After the initial
quality control (QC) steps discussed below,
the data are migrated to ToxRefDB, which is
implemented using the open-source MySQL
platform. Data entry followed a series of pro-
tocols outlined  in the  ToxRefDB Standard
Operating Procedure (SOP) documents that
define mapping of toxicologic information
to standardized  fields, use of a standardized
vocabulary, and extraction of biologically and
statistically significant treatment-related effects.
   Data QC and management. QC consisted
of 100% cross-checking of studies,
systematic  updates of ToxRefDB to ensure
consistency across the studies, expert review
of data outputs,  and external review by stake-
holders. All data entered into ToxRefDB have
undergone cross-checking,  which entailed a
second person validating each entered value
based on the source information (primarily
DERs). Systematic QC involved querying the
database for potential  inconsistencies  (e.g.,
male-only  effects being assigned to female
treatment groups, or systemic LOAEL being
set at multiple dose levels) along with updat-
ing vocabularies and related records. Expert
review was  performed on data outputs of the
chronic/cancer rat or mouse studies, includ-
ing all of the end points captured in the data
tables of this publication. In addition to inter-
nal QC,  an ongoing process allowing stake-
holders the opportunity to review ToxRefDB

-------
Martin et al.
records is in place. The companies or regis-
trants that sponsor the data or support the
registration of the chemical are reviewing the
accuracy of the data relative to DERs and
other risk assessment documents. To  date,
studies on 235 chemicals have been reviewed
by registrants, and comments  from these
reviews indicate greater than 99% accuracy
in capturing treatment-related effects  from
DERs. The stakeholder review process has
also enabled information from further
studies, DERs, and other risk assessment
documents to be collected and entered
into ToxRefDB.
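
The two systematic QC checks given as examples above (male-only effects assigned to female treatment groups, and a systemic LOAEL set at more than one dose level) can be sketched as simple consistency queries. The record layout, field names, and the list of male-only targets are assumptions made for this illustration; the actual checks were run as queries against the database.

```python
from collections import defaultdict

# Illustrative list of targets that can only occur in males (an assumption).
MALE_ONLY_TARGETS = {"testis", "prostate", "epididymis"}


def sex_mismatches(effects):
    """Return effect records on male-only targets assigned to female groups."""
    return [e for e in effects
            if e["sex"] == "F" and e["target"] in MALE_ONLY_TARGETS]


def multiple_loael_doses(loaels):
    """Return (study_id, category) keys whose LOAEL is set at several doses."""
    doses = defaultdict(set)
    for rec in loaels:
        doses[(rec["study_id"], rec["category"])].add(rec["dose"])
    return {key: sorted(vals) for key, vals in doses.items() if len(vals) > 1}
```

Records flagged by either function would then be checked by hand against the source DER.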
   Data output and analysis. The structured
toxicity information stored within ToxRefDB
can be extracted in various  formats using
MySQL queries.  For the purpose of provid-
ing computable  outputs,  that is, quantita-
tive outputs amenable  to statistical  analysis,
we used a consistent data output. The cross-
tabulated data output  consisted of rows  of
chemical information (e.g., CAS  registry
number and chemical  name)  and columns
of end points or effects, with the cross sec-
tion being the lowest dose at which the  effect
or end point was observed, that is, lowest
effect level (LEL) in mg/kg/day. Even though
NOAEL/LOAEL values can be queried from
the database, the current analysis uses LELs,
which do not  reflect the NOAEL/LOAEL
regulatory determinations  derived from
the studies and refer only to the minimum
dose at which a specific effect or group  of
effects occurs. We used administered dose
levels rather than molar concentrations  to
represent the chemically induced effects and
end points, because of uncertainties in the
pharmacokinetics linking administered dose
to tissue concentrations, reinforcing the fact
that molecular weight alone  cannot substi-
tute for dosimetry. Additional transformation
of the dosing information was performed,
including log-based and binning methods for
potency. For example, we developed a bin-
ning method for illustrating relative potency
to provide insight into the sensitivity of
the end point from the perspective of treat-
ment dose. To derive nonarbitrary dosing
intervals, LEL for body weight changes were
analyzed and separated into equivalent quin-
tile bins (data not shown). The resulting bins,
< 15, < 50, < 150, < 500, and > 500 mg/kg/
day, were then applied to all end points. For
instance, a chemical that caused liver hyper-
trophy at 5 mg/kg/day would be assigned a 5,
at 25 mg/kg/day a 4, and so on. If the effect
was not observed, then a zero was assigned.
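
As a worked illustration of the cross-tabulated output and the potency bins just described, the following sketch derives the LEL per chemical and end point and converts it to a score from 0 to 5. The function names and input layout are assumptions. The text gives the bins as < 15, < 50, < 150, < 500, and > 500 mg/kg/day; treating a dose of exactly 500 mg/kg/day as belonging to the lowest-potency bin is also an assumption.

```python
def lowest_effect_levels(observations):
    """Map (chemical, end point) to the minimum dose at which it was observed.

    observations: iterable of (chemical, endpoint, dose_mg_kg_day) tuples,
    one per treatment group in which the end point was observed.
    """
    lel = {}
    for chemical, endpoint, dose in observations:
        key = (chemical, endpoint)
        if key not in lel or dose < lel[key]:
            lel[key] = dose
    return lel


def potency_bin(lel):
    """Score an LEL from 5 (most potent) to 1; 0 means effect not observed."""
    if lel is None:
        return 0
    for upper_bound, score in ((15, 5), (50, 4), (150, 3), (500, 2)):
        if lel < upper_bound:
            return score
    return 1  # 500 mg/kg/day and above (boundary handling is an assumption)


def cross_tab(observations, chemicals, endpoints):
    """Rows are chemicals, columns are end points, cells are potency bins."""
    lel = lowest_effect_levels(observations)
    return {c: [potency_bin(lel.get((c, e))) for e in endpoints]
            for c in chemicals}
```

With this scoring, liver hypertrophy at 5 mg/kg/day receives a 5 and at 25 mg/kg/day a 4, matching the example in the text.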
Table 1. Summary statistics for chronic/cancer rat and mouse studies entered into ToxRefDB.
                                  No. of   Treatment  Treatment groups             Critical
Study                  Chemicals  studies  groups     with effects      Effects^a  effects^b
Total chronic/cancer   310        577      7,340      3,082             19,537     3,119
Rat                    283        298      4,228      1,721             12,215     1,816
Mouse                  267        279      3,059      1,344             7,416      1,303
^a Total number of effect type, target, and description combinations assigned to any treatment group.
^b Effects that are criteria for establishing the study-specific NOAEL/LOAEL.
[Figure 1 legend. Rat (A) chemical clusters: low incidence of toxicity; thyroid and liver toxicants; liver
toxicants; kidney toxicants; spleen and anemia toxicants; testicular toxicants; cholinesterase inhibitors;
body weight decrease. Mouse (B) chemical clusters: low incidence of toxicity; lung tumorigens; liver
tumorigens; spleen and anemia toxicants; liver toxicants (general); cholinesterase inhibitors; body weight
decrease. Axis: chemical. Color scale: lowest effect level (LEL), shown as -log2(LEL), from no effect
through 2,048 mg/kg/day to 0.015625 mg/kg/day.]

Figure 1. Unsupervised two-way hierarchical clustering of 207 effects in rat (A) and 112 effects in
mouse (B) with incidence > 5, for 310 chemicals with chronic/cancer toxicity data in ToxRefDB. Specific
clusters or classes based on associated toxicities are indicated by the color-coded chemical dendrogram:
seven clusters for rat, and six for mouse.
Additionally, log-transformed potency values
were derived using -log2 of LEL. We used
log2 to reflect the minimal dose spacing, that
is, doubling, typically used for in vivo
toxicology studies. A constant value of 12 was
then added to zero-center the data, allow-
ing for zero to represent no observed effect.
Therefore, a value  of 1 would be equivalent
to an effect at 2,048 mg/kg/day and 18 would
be equivalent to  0.015625 mg/kg/day. The
resulting data formats are highly amenable to
statistical data analysis, including descriptive
and predictive data-mining algorithms.
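
The log transformation above amounts to computing 12 - log2(LEL), with zero reserved for "no observed effect". A minimal sketch follows; the function name and the use of None for an unobserved effect are assumptions made for the example.

```python
import math


def log_potency(lel_mg_kg_day, offset=12):
    """Return offset - log2(LEL); 0.0 when the effect was not observed."""
    if lel_mg_kg_day is None:
        return 0.0
    return offset - math.log2(lel_mg_kg_day)
```

This reproduces the two anchor values in the text: 2,048 mg/kg/day maps to 1 and 0.015625 mg/kg/day maps to 18. Each unit step on this scale corresponds to a doubling or halving of dose. Note that under this formula an LEL of 4,096 mg/kg/day or more would give a value of zero or below, so the scale implicitly assumes LELs below that level.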
    We carried out unsupervised two-way hier-
archical clustering  across all chemicals of all
effects with incidence greater than 5, as well as
selected end points, based on log-transformed
potency  values using Pearson's dissimilarity
measure  for both chemicals and effects. This
analysis used Ward's method for linkage (Ward
1963) and the agglomerative clustering method
as implemented in the Partek Discovery Suite
(Partek Inc., St. Louis, MO). In order to assess
statistically significant species  concordance
across different effects, a permutation study
was carried out. For each effect, the associa-
tion between chemical and effect for the cor-
responding rat and mouse study was randomly
permuted 1,000 times. We  recorded the cross-
species concordance for all simulations (per-
mutations) and compared it with the observed
concordance, thus  giving an estimate of the
concordance due purely to chance. Analyses
were carried out using R version 2.6.1  (Ihaka
and Gentleman 1996).
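
The permutation approach can be sketched as follows. This is an illustration under stated assumptions, not the authors' R code: concordance is taken to be the number of chemicals positive in both species divided by the number positive in either species, the permutation shuffles the mouse outcomes across chemicals while holding the rat outcomes fixed, and the add-one p-value estimate is a common convention that the text does not specify.

```python
import random


def concordance(rat, mouse):
    """Fraction of chemicals positive in both species among those positive in either."""
    both = sum(1 for r, m in zip(rat, mouse) if r and m)
    either = sum(1 for r, m in zip(rat, mouse) if r or m)
    return both / either if either else 0.0


def permutation_p_value(rat, mouse, n_perm=1000, seed=0):
    """Estimate how often chance alone yields concordance >= the observed value.

    rat, mouse: equal-length sequences of 0/1 flags, one entry per chemical.
    Returns (observed concordance, estimated p-value).
    """
    rng = random.Random(seed)
    observed = concordance(rat, mouse)
    shuffled = list(mouse)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the chemical-to-outcome association
        if concordance(rat, shuffled) >= observed:
            hits += 1
    # Add-one estimate keeps the p-value strictly positive (a convention
    # chosen for this sketch).
    return observed, (hits + 1) / (n_perm + 1)
```

Shuffling preserves the number of positive chemicals in each species, so the null distribution reflects only the concordance expected from the two incidence rates.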
    An initial 10%  incidence cutoff was used
to filter out individual and groups of effects
for potential use in predictive modeling. This
cutoff was chosen  following the results of a
related simulation  study that demonstrated
high levels of sensitivity and specificity for
various machine learning methods on data
with at least a 10% hit rate for predicted end
points (Judson et al. 2008a). For other appli-
cations, it may be useful to  add less frequently
occurring effects and end points.
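
The incidence filter itself is simple to express. In the sketch below, the input layout (a mapping from end point to the set of positive chemicals) and the function name are assumptions; "at least a 10% hit rate" is read as greater than or equal to 10%.

```python
def filter_by_incidence(positives, n_chemicals, cutoff_percent=10):
    """Keep end points observed for at least cutoff_percent of chemicals.

    positives: dict mapping end point name -> set of positive chemical IDs.
    Integer arithmetic avoids floating-point surprises at the boundary.
    """
    return {
        endpoint: chemicals
        for endpoint, chemicals in positives.items()
        if len(chemicals) * 100 >= cutoff_percent * n_chemicals
    }
```

Lowering `cutoff_percent` is the obvious way to admit the less frequent effects mentioned above for other applications.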

Results
Summary profiles of the ToxRefDB chronic/
cancer data set. To date, ToxRefDB has
captured in vivo mammalian toxicity study
information from DERs for 411 conventional
pesticide active  ingredients.  This present
analysis focuses on the systemic toxicity and
cancer end points culled from chronic/cancer
rat or mouse studies on 310 of the chemicals
entered into ToxRefDB. ToxRefDB enabled
analysis to be performed along toxicologically
related axes, including by chemical, study
type, species, and effect. Study duration, dos-
ing methods, data  quality, guideline adher-
ence, and sex were additional parameters for
data filtering.  In looking across all chronic/
cancer rat and mouse studies, we assigned

-------
19,537 effects to  3,082 different treatment
groups in a total of 577 studies on 310 chemi-
cals (Table  1). Effects are a combination of
study type,  species, effect type, effect target,
and effect description for a given chemical,
for example, chronic/cancer, rat, neoplastic
pathology,  liver, and adenoma. Across  the
19,537 effects, 1,135  unique effects were
observed, of which 484 were  deemed criti-
cal effects,  that is, criteria for establishing
NOAEL/LOAEL,  in at least a single study.
   The ToxRefDB chronic/cancer data set on
310 chemicals contained approximately 20,000
observed effects in rat or mouse studies.  We
achieved a high-level view of a subset of these
data, and the relationships among chemical,
effect, and potency, by unsupervised two-way
hierarchical  clustering of 207 rat (Figure  1A)
and 112 mouse (Figure 1B) effects. For the
rat, the 283  chemicals separated into seven  dis-
tinct clusters or classes of the chemicals  based
on these toxicity profiles. Approximately 70
chemicals formed a cluster with an overall  low
incidence of toxicity, whereas the remaining
chemicals displayed a unique set of toxicologic
properties. More than 80 chemicals clustered
as hepatotoxicants, and a subset of these also
caused thyroid toxicity. Ten of the 15 conazole
fungicides analyzed were in this hepatotoxicity
cluster.  Clusters of chemicals exhibiting kid-
ney, spleen/anemia, or testicular toxicities were
not enriched for a  specific chemical structural
class. Cholinesterase inhibitors clustered sepa-
rately from other chemicals and were enriched
for organophosphates.  In mouse, the 267
chemicals included clusters of cholinesterase
inhibitors, spleen/anemia toxicants, and  hepa-
totoxicants comparable with that observed for
rat. Of the  112 total effects clustered in  the
mouse, 28 of these were liver toxicities, dem-
onstrating the predominance of the liver as a
target organ in the mouse. The  unsupervised
clustering of rat and mouse effects identified
concentrations of effects and chemicals that
were emphasized in subsequent,  expert-driven
approaches to chemical classification.
    Toxicity-based classification of chemi-
cals. The distribution of effects across effect
types (Figure 2A) revealed that nonneoplastic
pathologies dominate determination  of sys-
temic NOAEL/LOAEL, demonstrating the
potential importance of this  class of effects
or end points to chemical regulation. The
percentage of chemicals positive  for an end
point in  both rat and mouse, over the total
positive  for the same end  point in only the
rat or mouse, was defined as "species concor-
dance." Species concordance for nonneoplastic
pathology was 68%. Of the  167 chemicals
that caused neoplastic lesions in  rat or  mouse
chronic/cancer studies, 35% caused neoplastic
lesions in both rat and  mouse. We observed
one or more pathologies in 273  of the 310
chemicals. The incidence of pathologic
response, analyzed by target organ and species,
was used to identify target organs for further
investigation (Figure 2B). More  than 90%  of
those 273 chemicals caused pathologies in the
liver, kidney, thyroid, lung, testis, or spleen.
   Whereas individual effects relating  to
highly detailed pathologic  outcomes  would
provide classifications with the  highest bio-
logical specificity, the limitations of classifying
chemicals based solely on specific individual
effects were apparent early in the analysis
of ToxRefDB  data. Only 11 specific, indi-
vidual pathologic effects were  observed for
more than 10% of the  chemicals (Table 2).
[Figure 2 axes: percent chemicals with observed effect type (A); percent chemicals with observed pathology by target organ (B).]
Liver hypertrophy is the only common effect
across both  species based on a 10% inci-
dence cutoff. In  addition  to low incidences
of detailed pathologic effects, biases based on
study design and pathology nomenclature
limited the overall ability to compare chemi-
cal toxicities when we used individual effects.
Grouping or aggregating related or near-syn-
onymous terms, such as liver adenoma, com-
bined adenoma/carcinoma, and carcinoma,
resulted in more  informative and statistically
powerful sets of effects. Thus, the limitations
of classifying chemicals based solely on spe-
cific individual effects were addressed by cre-
ating biologically  related groupings of effects.
    Grouping tumor end points and extending
to include proliferative lesions. This  aggre-
gative approach  was illustrated by creating
groups of neoplastic end points and the exten-
sion of these  groups to include nonneoplastic
proliferative lesions. The aggregation of neo-
plastic effects for each target organ resulted
in an increase in the number of useful group-
ings beyond the individual  mouse liver tumor
effects shown in  Table 2.  However, the end
points were still limited to mouse liver and rat
thyroid neoplasia, based on an initial > 10%
incidence cutoff. Associating the neoplastic
end points with proliferative lesions increased
the number of target organs to include liver,
kidney,  thyroid, lung, and testes. In general,
only neoplastic lesions are  considered indica-
tive of rodent carcinogenicity. However,
including nonneoplastic proliferative lesions
provides a conservative  model for assessing
and predicting rodent tumorigenic poten-
tial, based on the assumption that prolonged
proliferative response leads  to eventual tumor
formation. A simulation study was performed
to assess whether the concordance between rat
and mouse effects occurred at a  rate greater
than chance  across neoplastic and  prolifera-
tive classifications. Extending tumorigenicity
groupings to include proliferative lesions
significantly increased species concordance
across numerous  target organs, including the
liver and kidney  [see Supplemental Material,

Figure 2. ToxRefDB chronic/cancer incidence data summarized by effect type (A) and by target organ
pathology (B) for 310 chemicals with rat or mouse studies. Blue bars, total percentage of chemicals with
that observed effect; black bars, percentage of chemicals for which that effect was used to derive systemic
NOAEL/LOAEL levels.

Table 2. Pathology observed for > 10% of ToxRefDB chemicals in chronic/cancer rat and mouse studies.
Target organ  Effect                        Percent observed
Rat
 Liver        Hypertrophy                   25
 Kidney       Nephropathy                   14
 Liver        Vacuolization                 12
 Thyroid      Adenoma                       11
 Thyroid      Hyperplasia                   11
Mouse
 Liver        Hypertrophy                   25
 Liver        Adenoma                       21
 Liver        Necrosis                      16
 Liver        Adenoma/carcinoma combined    14
 Liver        Pigmentation                  14
 Liver        Carcinoma                     12
Environmental Health Perspectives •  VOLUME 117 I NUMBER 3 I March 2009
                                         Previous
                  TOC
                                                                                    395

-------
Martin et al.
Figure2 (http://www.ehponline.org/members/
2008/0800074/suppl.pdf)].
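
The simulation study is not described in detail in this section. As a minimal sketch of one way such a check could be run, assuming a binary positive/negative call per chemical in each species, the permutation test below shuffles the mouse calls and counts how often chance alone reaches the observed number of chemicals positive in both species. The function names and the toy data are hypothetical and are not taken from ToxRefDB.

```python
import random


def concordance(rat, mouse):
    """Count chemicals that are positive in both species."""
    return sum(1 for r, m in zip(rat, mouse) if r and m)


def permutation_p_value(rat, mouse, n_iter=10000, seed=0):
    """Estimate how often shuffled mouse calls match or exceed the observed concordance."""
    if len(rat) != len(mouse):
        raise ValueError("rat and mouse must cover the same chemicals")
    rng = random.Random(seed)
    observed = concordance(rat, mouse)
    shuffled = list(mouse)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(shuffled)  # breaks any real rat-mouse association
        if concordance(rat, shuffled) >= observed:
            hits += 1
    # Add-one correction avoids reporting an impossible p-value of exactly 0.
    return (hits + 1) / (n_iter + 1)


# Toy calls (hypothetical): 1 = end point observed, 0 = not observed.
rat_calls = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
mouse_calls = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
p_value = permutation_p_value(rat_calls, mouse_calls)
```

A small p-value here would indicate that the two species agree more often than shuffled labels do; the actual study may have used a different statistic or null model.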
   Mapping of toxicity end points to a cancer
progression schema. Relationships between
effects and the relative severity of those effects
are not inherent to the database structure.
Figure 3A presents a conceptualization of the
end point progression schema in which chemi-
cals were scored from 0 to  5 for each target
organ, based on the severity of the effect, rang-
ing from no observed pathology (0) to neoplas-
tic lesions (5). End-point progression scoring
reduced  the possible chemical classifications
to a single ordinal score (i.e., scores 0-5) for
each target organ. Figure 3B presents the dis-
tribution of end-point progression scores for
rat and mouse, liver and kidney. Examples
of the impact of this scoring system include
resmethrin, which caused treatment-related
increases in a preneoplastic lesion (i.e., hyper-
plastic nodules) in the liver without progressing
to a tumor. In contrast, metaldehyde caused
treatment-related increases in liver tumors but
was not identified as causing any preneoplas-
tic lesions, even though preneoplastic lesions
can be assumed to have occurred as a precur-
sor event to liver tumor formation. Using the
end-point progression scoring system allowed
reasonable comparison of these two chemicals,
if desired, by linking the preneoplastic score
of 4 for resmethrin, to the neoplastic  score of
5 for  metaldehyde, along the continuum of
end-point progression. The  incidence of liver
pathology between rats and mice was com-
parable when we grouped end-point progres-
sion scores. More than  50% of the chemicals
tested resulted in a range  of nonneoplastic to
neoplastic lesions (i.e., scores 2-5). However,
the relative severity for liver pathologic pro-
gression  in mice was higher than in rats: 25
chemicals caused rat liver tumors, whereas 80
chemicals caused mouse liver tumors.
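
The scoring rule described above (take the most severe effect seen in a target organ) can be sketched as follows. This is an illustration only: the text states the ranks for no observed pathology (0), preneoplastic lesions (4), and neoplastic lesions (5), while the category names used here for ranks 1-3 are assumptions, as is the hypothetical chemical_X.

```python
# Severity ranks on the 0-5 end-point progression scale.
# Ranks 0, 4, and 5 follow the text; labels for ranks 1-3 are assumed.
SEVERITY = {
    "none": 0,
    "clinical_chemistry": 1,  # assumed label
    "nonproliferative": 2,    # assumed label
    "proliferative": 3,       # assumed label
    "preneoplastic": 4,
    "neoplastic": 5,
}


def progression_scores(observations):
    """Return {(chemical, organ): highest severity rank observed}."""
    scores = {}
    for chemical, organ, category in observations:
        key = (chemical, organ)
        scores[key] = max(scores.get(key, 0), SEVERITY[category])
    return scores


observations = [
    ("resmethrin", "liver", "preneoplastic"),   # hyperplastic nodules, no tumor
    ("metaldehyde", "liver", "neoplastic"),     # liver tumors
    ("chemical_X", "kidney", "clinical_chemistry"),  # hypothetical
    ("chemical_X", "kidney", "proliferative"),       # hypothetical
]
scores = progression_scores(observations)
```

Because each chemical collapses to a single ordinal value per organ, resmethrin (4) and metaldehyde (5) land next to each other on the same scale even though they share no individual effect term.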
    Selected end points for predictive mod-
eling. In addition  to end points specific to
various target organs, chemicals were classi-
fied with respect to multigender,  multisite,
or multispecies tumorigenicity (Table 3). Of
the 310 chemicals in the chronic/cancer data
set, of which 240 were tested in
both species, 167 chemicals were classified
as tumorigens; 109 of those chemicals were
multigender, multisite, or multispecies tumori-
gens. Of the 283 chemicals tested  in the rat,
42 chemicals  were classified as multigender
and multisite  tumorigens. Of 267 chemicals
tested in the mouse, 57 and 25 chemicals were
classified as multigender and multisite tumori-
gens, respectively. Of the 240 chemicals tested
in both species, 49 chemicals were classified as
multispecies tumorigens. The distribution of
relative potency values indicated that the rat
was commonly more sensitive than  the mouse
for multigender and multisite tumorigenicity.
In the  rat,  38% of  the multigender and 45%
of the  multisite incidences were at < 50 mg/
kg/day (i.e., relative potency values of 4-5),
compared with 23% and 28% in the mouse.
Conversely, 39% multigender and 28% multi-
site tumorigenicity occurred in the mouse at
> 500  mg/kg/day (i.e., relative potency value
1), compared  with  17% and 10%  in the rat.
Multispecies tumorigenicity was not achieved
at doses  <  15 mg/kg/day, and 41% of inci-
dences occurred at > 500 mg/kg/day.
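
The relative potency values quoted above follow the dose bins given in the Table 3 footnote. A minimal sketch of that binning is shown below; the function name is hypothetical, and the treatment of a dose of exactly 500 mg/kg/day (assigned to bin 1 here) is an assumption, because the footnote specifies only "< 500" and "> 500".

```python
def relative_potency(dose_mg_kg_day):
    """Map a dose in mg/kg/day to a relative potency value from 5 (most potent) to 1."""
    if dose_mg_kg_day <= 0:
        raise ValueError("dose must be positive")
    if dose_mg_kg_day < 15:
        return 5
    if dose_mg_kg_day < 50:
        return 4
    if dose_mg_kg_day < 150:
        return 3
    if dose_mg_kg_day < 500:
        return 2
    return 1  # assumption: exactly 500 mg/kg/day is grouped with > 500


example_bins = [relative_potency(d) for d in (10, 30, 100, 300, 1000)]
```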
[Figure 3 labels. (A) End point progression. (B) x-axis: No. of chemicals; bars: Mouse, Rat.]
Figure 3. (A) ToxRefDB systemic toxicity and cancer outcomes represented along an end-point progression
continuum. This schema was used to derive a severity score for each chemical based on the maximum
value within a target organ. (B) Based on end-point progression, 310 chemicals were scored for liver and
kidney pathology in rat and mouse chronic/cancer studies. Clinical chemistry used in this analysis is limited
to target-organ-specific analytes (e.g., alanine aminotransferase for liver, and urea nitrogen for kidney).
    Unsupervised and expert-driven approaches
to end-point selection and subsequent chemical
classification yielded nearly identical sets of target
organs from which to select specific effects or
aggregated effects. Based on incidence, severity,
potency, and significance, 25 end points from
chronic/cancer rat and mouse studies were
selected for subsequent ToxCast predictive
modeling (Figure 4A). The addition of multi-
species tumorigens raised the total to 26 end
points, each caused by 20 or more chemicals.
Besides the multispecies tumorigen end point,
16 of the end points were from rat studies
and 9 end points were from mouse. The same
four end points were characterized in both rat
and mouse liver, affording direct comparisons
across species for tumors, proliferative lesions,
apoptosis/necrosis, and hypertrophy. The only
other frequent target organ common to both
species was the kidney. Frequent rat-specific
target organs included thyroid,  testis, and
spleen, whereas the only target organ specific to
mouse was the lung. Unsupervised hierarchical
clustering of the 16 rat end points (Figure 4B)
and the 9 mouse end points (Figure 4C) dis-
played the relative distribution  of the selected
end points and chemicals. Of the 283  chemi-
cals with a rat chronic/cancer study, 218 were
positive in at least one of the selected end
points, whereas 155 of 267 chemicals with a
mouse cancer study were positive in at least
one selected end point. Rat and mouse end
points clustered primarily by target organ, with
distinct clusters of thyroid, spleen, kidney, and
liver toxicants in the rat. The high incidence of
liver tumorigens in the mouse drives chemical
groupings. However, chemicals  causing or not
causing liver hypertrophy and necrosis appear
to segregate into two large groups of liver
toxicants. In both species, the selected chronic/cancer
end points represent the robust patterns of
toxicologic response shown in Figure 2A and B.
A full listing of the chronic/cancer end points
derived from ToxRefDB for ToxCast predictive
modeling, with their associated LELs,
log-transformed potency, and relative potency
values, is available on the ToxRefDB home
page (U.S. EPA 2008c).

Discussion
Advancing alternative testing methods for
assessing chemical safety requires an informed
transition from the current toxicity testing to
systems that are higher throughput, more pre-
dictive, and not as dependent on the extensive
use of animals. To  support this transition,
we created ToxRefDB to capture a rich set
of existing in vivo laboratory animal toxicity
data on a group of environmentally relevant,
well-studied chemicals.  Pesticide active ingre-
dients have comprehensive  toxicity profiles
that are opportune data sets  for creating a
bridge from in vivo to in vitro toxicology.
ToxRefDB digitizes and stores toxicity data
in a structured and searchable format, and
using structured data mining methods makes
these data a computable resource for predic-
tive toxicology efforts such as the U.S. EPA's
ToxCast program (U.S. EPA 2008b).
    Individual toxicity effects based on unique
type, target, and description yielded only a small
number of in vivo end points across a significant
number of chemicals supportive of robust pre-
dictive modeling. However, grouping effects by
effect type and target often collapsed hundreds
of individual effects into a single end point,
common to dozens of chemicals. The goal was
to strike a balance between maintaining biologi-
cal specificity across a group of related effects
while increasing total incidence for effects
across a critical mass of chemicals. For example,
extending tumor end points to include prolif-
erative lesions increased not only total incidence
but also species concordance and thus  increased
confidence in characterizing a chemical's poten-
tial toxicity. Grouping  proliferative lesions
also addressed other potential factors, such as
changes in pathology nomenclature over time
(Wolf and Mann 2005) and reporting incon-
sistencies. Deriving end points based on groups
of effects yielded organ- and species-specific
end points in the liver, kidney, thyroid, testis,
spleen, and  lung in rats or mice with a high
enough incidence across ToxRefDB chemicals
to support predictive modeling.
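
The grouping step described above can be sketched as a lookup from individual effect terms to aggregated end points, followed by a count of distinct chemicals per end point. The mapping table, record layout, and function name below are hypothetical; the default cutoff of 20 chemicals mirrors the "20 or more chemicals" criterion stated for the selected end points.

```python
from collections import defaultdict

# Hypothetical lookup: each individual effect term feeds one or more grouped end points.
# Tumor terms also count toward the broader proliferative-lesion group.
EFFECT_GROUPS = {
    "adenoma": ("tumors", "proliferative lesions"),
    "carcinoma": ("tumors", "proliferative lesions"),
    "adenoma/carcinoma combined": ("tumors", "proliferative lesions"),
    "hyperplasia": ("proliferative lesions",),
}


def chemicals_per_end_point(records, min_chemicals=20):
    """Aggregate effects into end points and keep those seen in enough chemicals."""
    grouped = defaultdict(set)
    for chemical, species, organ, effect in records:
        for group in EFFECT_GROUPS.get(effect.lower(), ()):
            grouped[(species, organ, group)].add(chemical)
    return {key: chems for key, chems in grouped.items() if len(chems) >= min_chemicals}


# Toy records (hypothetical chemicals); a low cutoff is used so the example is small.
records = [
    ("chem_A", "mouse", "liver", "Adenoma"),
    ("chem_B", "mouse", "liver", "Carcinoma"),
    ("chem_C", "mouse", "liver", "Hyperplasia"),
]
end_points = chemicals_per_end_point(records, min_chemicals=2)
```

In this toy case three separate effect terms, each seen for a single chemical, collapse into a tumor end point covering two chemicals and a proliferative-lesion end point covering three.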
Table 3. Multigender, multisite, and multispecies tumorigens in ToxRefDB.

Chemical                     Rat          Rat        Mouse                   Multispecies
                             multigender  multisite  multigender, multisite
Carbaryl                     2            2          1, 1                    2
Dipropyl isocinchomeronate   1            1          1, 1                    1
Fentin                       5            5          4                       4
Dazomet                      5            5          3                       4
Clodinafop-propargyl         4            4          4                       4
Lactofen                     3            3          5                       4
Dimethoate                   -            5          4, 4                    4
Malathion                    4            4          1                       1
Diuron                       4            3          1                       1
Dacthal                      2            2          1                       1
Isoxaflutole                 2            2          1                       1
Spirodiclofen                2            2          1                       1
Diclofop-methyl              4            4          -                       4
Cinmethylin                  -            3          3                       3
Imazalil                     -            3          3                       3
Nitrapyrin                   -            -          3, 3                    3
Propoxur                     2            2          -                       2
Daminozide                   -            -          2, 1                    2
Thiacloprid                  4            4          -                       1
Vinclozolin                  3            4          -                       1
Di(2-ethylhexyl)phthalate    1            -          1                       1
Folpet                       -            5          1                       1
MGK (octacide 264)           -            2          1                       1
Iprodione                    -            -          1, 1                    1
Cacodylic acid               5            -          -                       3
Propyzamide                  -            4          -                       3
Oxadiazon                    -            -          5                       3
Resmethrin                   2            -          -                       2
Pyrithiobac-sodium           1            -          -                       2
Bentazone                    -            2          -                       2
Fluthiacet-methyl            -            -          5                       2
Metaldehyde                  -            -          2                       2
Triflusulfuron-methyl        -            -          2                       2
Fludioxonil                  -            -          2                       2
Prodiamine                   1            -          -                       1
Tepraloxydim                 -            2          -                       1
Clofencet-potassium          -            1          -                       1
Isoxaben                     -            -          1                       1
Pymetrozine                  -            -          1                       1
Topramezone                  -            -          1                       1
Triadimefon                  -            -          1                       1
Oryzalin                     4            4          -                       -
Simazine                     4            4          -                       -
Tebufenpyrad                 4            4          -                       -
Dichloran                    3            3          -                       -
Dimethenamid                 3            3          -                       -
Prosulfuron                  3            3          -                       -
Acetochlor                   3            2          -                       -
Ametryn                      2            3          -                       -
Oxytetracycline HCl          1            1          -                       -
Bifenthrin                   -            -          5, 5                    -
Disulfoton                   -            -          5, 5                    -
Metam-sodium                 -            -          4, 5                    -
Quizalofop-ethyl             -            -          4, 4                    -

Chemical                     Rat          Rat        Mouse        Mouse      Multispecies
                             multigender  multisite  multigender  multisite
Tribufos                     -            -          3            4          -
Amitraz                      -            -          3            3          -
Fenoxycarb                   -            -          3            3          -
Spiroxamine                  -            -          3            3          -
Tefluthrin                   -            -          3            3          -
Permethrin                   -            -          2            2          -
Trifloxystrobin              -            -          2            2          -
Chloridazon                  -            -          1            1          -
Triforine                    -            -          1            1          -
Dichlorvos                   5            5          N            N          N
Pyraclostrobin               5            5          N            N          N
Alachlor                     4            3          N            N          N
Captan                       N            N          3            3          N
Maneb                        N            N          2            2          N
Azafenidin                   -            -          -            -          4
Lindane                      -            -          -            -          4
Fluazinam                    -            -          -            -          3
Paclobutrazol                -            -          -            -          3
Acephate                     -            -          -            -          2
Linuron                      -            -          -            -          2
Propanil                     -            -          -            -          2
Triasulfuron                 -            -          -            -          1
Fipronil                     4            -          -            -          -
Thiabendazole                3            -          -            -          -
Boscalid                     2            -          -            -          -
Pendimethalin                2            -          -            -          -
Pyrimethanil                 2            -          -            -          -
5,5-Dimethylhydantoin        1            -          -            -          -
Cyazofamid                   1            -          -            -          -
Chloropicrin                 -            5          -            -          -
Fenamiphos                   -            5          -            -          -
Molinate                     -            5          -            -          -
Chlorpyrifos-methyl          -            4          -            -          -
Fluoxastrobin                -            1          -            -          -
Fenitrothion                 -            -          5            -          -
Cyproconazole                -            -          4            -          -
Prochloraz                   -            -          4            -          -
Thiamethoxam                 -            -          3            -          -
Bispyribac-sodium            -            -          2            -          -
Piperonyl butoxide           -            -          2            -          -
Propiconazole                -            -          2            -          -
Acifluorfen-sodium           -            -          1            -          -
Difenoconazole               -            -          1            -          -
Primisulfuron-methyl         -            -          1            -          -
Pyraflufen-ethyl             -            -          1            -          -
Thiodicarb                   -            -          1            -          -
Fenoxaprop-ethyl             -            -          -            4          -
Buprofezin                   -            -          -            2          -
Propargite                   4            -          N            N          N
Dichlobenil                  2            -          N            N          N
Quintozene                   2            -          N            N          N
Tralkoxydim                  -            3          N            N          N
Benomyl                      N            N          3            -          N
Cloprop                      N            N          2            -          N
Thiophanate-methyl           N            N          1            -          N

Relative potency: 5, < 15 mg/kg/day; 4, < 50 mg/kg/day; 3, < 150 mg/kg/day; 2, < 500 mg/kg/day; 1, > 500 mg/kg/day; N, not assessed (no study available); -, no entry.

   Another approach for addressing the
limitations of profiling chemicals based on
individual toxicity effects was to compare the
severity of these effects across a continuum of
pathophysiology. Because the progression to
cancer (Hanahan and Weinberg 2000) and
organ-specific progression  to tumorigenicity
(Cohen  and  Arnold 2008) have been well
characterized, we created a five-point sever-
ity scoring system to encode this. Using this
approach, ToxRefDB provides a quantitative
value associated with the  key events in the
progression to tumor formation and cancer.
Incorporating additional information on the
severity of in vivo effects in ToxRefDB may
be fruitful in  future  modeling and predictive
toxicology efforts. Additional  data not cur-
rently in ToxRefDB,  including incidence data,
would have to be added for more detailed
dose-response analyses and assessment of the
magnitude of change  for specific effects.
    Because many of the tumors caused  by
chemical exposure  in ToxRefDB  occur at
high doses that are many orders of magnitude
removed from potential human exposures, it
is useful to also consider multigender, multi-
site, and multispecies tumorigenicity in the
course of evaluating  chemicals. Current U.S.
EPA cancer risk assessments use multisite and
multispecies tumorigenicity as indicators of
increased significance for tumor findings (U.S.
EPA 2005). Thus, the tumorigenic end points
selected for ToxCast predictive modeling
included multigender, multisite, and multi-
species tumorigens. Additional analyses of
these multiplicities in the tumorigenicity data
of ToxRefDB are under way, with the goal of
improving hazard assessments, chronic/cancer
study protocols, and future data requirements.
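
As an illustration of how these multiplicity flags could be derived from tumor findings, the sketch below assumes that multigender means tumors in both sexes of a species, multisite means tumors at two or more sites within a species, and multispecies means tumors in both rat and mouse. These definitions, the record layout, and the function name are assumptions made for the example; the criteria actually applied in ToxRefDB may be stricter (for instance, requiring the same tumor type in both sexes).

```python
def classify_tumorigen(findings):
    """Derive multiplicity flags from (species, sex, site) tumor findings."""
    findings = set(findings)
    species_seen = {species for species, _, _ in findings}
    multigender = {}
    multisite = {}
    for species in species_seen:
        sexes = {sex for sp, sex, _ in findings if sp == species}
        sites = {site for sp, _, site in findings if sp == species}
        multigender[species] = len(sexes) >= 2
        multisite[species] = len(sites) >= 2
    return {
        "tumorigen": bool(findings),
        "multigender": multigender,
        "multisite": multisite,
        "multispecies": len(species_seen) >= 2,
    }


# Toy findings for a hypothetical chemical.
example = [
    ("rat", "male", "thyroid"),
    ("rat", "female", "thyroid"),
    ("mouse", "male", "liver"),
    ("mouse", "male", "lung"),
]
flags = classify_tumorigen(example)
```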
    Success in predicting target-organ-specific
effects in ToxCast will depend on numerous
factors, including the target, species,  and dose
response of the effects that are being predicted.
In the present analysis of ToxRefDB, we iden-
tified effects in the liver, kidney, thyroid, tes-
tis, spleen, and lung in rats or mice that we
will now attempt to predict using in vitro data
from ToxCast.  Because species concordance
of the in  vivo effects in  ToxRefDB was fairly
limited, success in predicting species-specific
versus multispecies effects will be an interest-
ing outcome of ToxCast. The dose responses
for selected end points are also provided by
ToxRefDB,  including  log-transformed
potency values conducive  to  computational
analysis, and relative potency values that facili-
tate comparisons across chemicals  and end
points. These quantitative data should facilitate
development of new in vitro and in silico
methods to predict in vivo chemical toxicity.
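
The Figure 4 color scale is labeled -Log2(LEL), which suggests the log-transformed potency is the negative base-2 logarithm of the lowest effect level in mg/kg/day. A minimal sketch under that assumption follows; the function name and the use of None for "no effect" are hypothetical choices for illustration.

```python
import math


def log_potency(lel_mg_kg_day):
    """Return -log2(LEL) so that lower effect levels give higher potency values."""
    if lel_mg_kg_day is None:
        return None  # assumption: no effect observed, so no potency value
    if lel_mg_kg_day <= 0:
        raise ValueError("LEL must be positive")
    return -math.log2(lel_mg_kg_day)


# The two ends of the Figure 4 scale.
least_potent = log_potency(2048)      # LEL of 2,048 mg/kg/day
most_potent = log_potency(0.015625)   # LEL of 0.015625 mg/kg/day
```

On this scale each unit corresponds to a twofold change in dose, so the two ends of the Figure 4 legend are 17 doublings apart.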
    Although numerous studies have evalu-
ated the use of biochemical, cell-based, and
genomic assays to build predictive models of
toxicity, these efforts have usually been lim-
ited to only a partial view of the complex biol-
ogy underlying tissue, organ, or whole-animal
toxicity. By probing such a  broad spectrum of
biology in the hundreds of ToxCast assays, the
"toxicity signatures" will be optimally pre-
dictive and representative of a broad range
of in vivo toxicity end points. A variety of
statistical techniques and machine learning
approaches will be used to  mine this com-
plex data set for toxicity signatures with high
sensitivity and specificity. These include
linear discriminant analysis, support vector
machines, and  neural networks. In  addition
to these automated approaches, more hypoth-
esis-driven, biologically based signatures will
assist in filling the large gap between molecu-
lar and phenotypic end points. It is  expected
that assays of multiple types, probing multiple
pathways, will be required to predict  in vivo
toxicity across  a wide range of chemicals—
this is the approach taken within ToxCast
and ToxRefDB.
    ToxRefDB continues  to develop, add-
ing toxicity end points from additional study
types, including multigeneration reproductive
and prenatal developmental tests, for predictive
modeling in the ToxCast research program.
Besides expanding toxicity coverage to other
study types, ToxRefDB will expand in  chemi-
cal coverage  to include more nonpesticide
chemicals. As each of these ToxRefDB data
sets passes through U.S. EPA quality and clearance
processes, it will be made publicly
available through peer-reviewed publications,
the ToxRefDB home page, and ACToR. The contents
of the entire database will be viewable and
searchable in the future through a Web-based
query tool located on the ToxRefDB website
(U.S. EPA 2008c).

[Figure 4 labels. (A) Percent chemicals with observed end point (rat: 283 chemicals; mouse: 267 chemicals). Rat end points: liver tumors, liver proliferative lesions, liver apoptosis/necrosis, liver hypertrophy, kidney nephropathy, kidney proliferative lesions, thyroid proliferative lesions, thyroid tumors, thyroid hyperplasia, testicular tumors, testicular atrophy, spleen pathology, cholinesterase inhibition, tumorigen, multisite tumorigen, multigender tumorigen. Mouse end points: liver tumors, liver proliferative lesions, liver apoptosis/necrosis, liver hypertrophy, kidney pathology, lung tumors, tumorigen, multisite tumorigen, multigender tumorigen. (B, C) x-axis: chemical; color scale: lowest effect level (LEL) as -Log2(LEL), from 2,048 mg/kg/day to 0.015625 mg/kg/day, with a separate "No effect" category.]
Figure 4. (A) The 16 rat and 9 mouse ToxRefDB end points from chronic/cancer studies selected for ToxCast predictive modeling. Two-way hierarchical clustering
of the rat (B) and mouse (C) end points based on log-transformed potency values. Dose and potency values for all chemicals relative to these 25 end points are
provided on the ToxRefDB home page (U.S. EPA 2008c).
    ToxRefDB offers unparalleled amounts
of legacy toxicity  information on environ-
mental chemicals  captured in a structured
format, providing a platform for repeated and
updated chemical characterizations. Creating
the ability to search and filter across 30 years'
worth of toxicity data required extensive
amounts of data normalization, annotation,
and curation and was made possible through
the  development of a robust standardized
vocabulary for the fields and data elements
within ToxRefDB.  In the  present study, we
used chronic toxicity data in ToxRefDB  to
derive toxicity profiles for the ToxCast phase I
chemicals, yielding a set of toxicity-based and
predictable end points.  In  future applications
of ToxRefDB, researchers,  risk assessors, and
regulators will use the database for retrospec-
tive and modeling projects looking across a
large landscape of chemical  and toxicity space.
                   REFERENCES

Cohen SM, Arnold LL. 2008. Cell proliferation and carcino-
    genesis. J Toxicol Pathol 21:1-7.
Collins FS, Gray GM, Bucher JR. 2008. Transforming environ-
    mental health protection. Science 319:906-907.
Dix DJ, Houck KA, Martin MT, Richard AM, Setzer RW, Kavlock
    RJ. 2007. The ToxCast program for prioritizing toxicity test-
    ing of environmental chemicals. Toxicol Sci 95:5-12.
Hanahan  D, Weinberg RA. 2000. The hallmarks of cancer. Cell
    100:57-70.
Ihaka R, Gentleman R. 1996. R: a language for data analysis and
    graphics. J Comput Graph Stat 5: 299-314.
Judson RS, Elloumi F, Setzer RW, Li Z, Shah I. 2008a. A com-
    parison of machine learning algorithms for chemical toxic-
    ity classification using a simulated multi-scale data model.
    BMC Bioinformatics 9:241 doi:10.1186/1471-2105-9-241
    [Online 19 May 2008].
Judson RS, Richard AM, Dix DJ, Houck K, Elloumi F, Martin
    MT et  al. 2008b. ACToR—Aggregated  Computational
    Toxicology Resource. Toxicol Appl Pharmacol doi:10.1016/j.
    taap.2007.12.037 [Online 11 June 2008].
National  Research Council. 2007. Toxicity Testing  in the 21st
    Century: A Vision and a Strategy. Washington, DC:National
    Academies Press.
National  Toxicology Program. 2007.  Pathology Code Tables.
    Available: http://jones.niehs.nih.gov:8080/pct/ [accessed
    7 February 2008].
U.S. EPA (U.S. Environmental Protection Agency). 1996. OPPTS
    Harmonized Test Guidelines. Available:
    http://www.epa.gov/opptsfrs/publications/OPPTS_Harmonized/870_Health_Effects_Test_Guidelines/
    [accessed 10 January 2008].
U.S. EPA. 2005. Guidelines for Carcinogen Risk Assessment. Risk
    Assessment Forum. Washington, DC:U.S. Environmental
    Protection Agency.
U.S. EPA (U.S. Environmental Protection Agency). 2007.
    Distributed Structure-Searchable Toxicity Database.
    Available: http://www.epa.gov/ncct/dsstox/ [accessed
    14 February 2008].
U.S. EPA (U.S. Environmental Protection Agency). 2008a.
    ACToR Home. Available: http://www.epa.gov/ncct/actor/
    [accessed 29 September 2008].
U.S. EPA (U.S. Environmental Protection Agency). 2008b.
    ToxCast Program. Available: http://www.epa.gov/ncct/toxcast/
    [accessed 12 June 2008].
U.S. EPA (U.S. Environmental Protection Agency). 2008c.
    ToxRefDB Home. Available: http://www.epa.gov/ncct/toxrefdb/
    [accessed 6 October 2008].
Ward JH. 1963. Hierarchical grouping to optimize an objective
    function. J Am Stat Assoc 58:236-244.
Wolf DC, Mann PC. 2005. Confounders in interpreting pathology
    for safety and risk assessment. Toxicol Appl Pharmacol
    202:302-308.
Yang C, Benz RD, Cheeseman MA. 2006. Landscape of current
    toxicity databases and database standards. Curr Opin
    Drug Discov Dev 9:124-133.

-------
                                                Reproductive Toxicology 28 (2009) 209-219
                                              Contents lists available at ScienceDirect
                                             Reproductive Toxicology
journal homepage: www.elsevier.com/locate/reprotox
Profiling the activity of environmental chemicals in prenatal developmental
toxicity studies using the U.S. EPA's ToxRefDB

Thomas B. Knudsen (a,*), Matthew T. Martin (a), Robert J. Kavlock (a), Richard S. Judson (a),
David J. Dix (a), Amar V. Singh (b)
(a) National Center for Computational Toxicology (NCCT), Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC 27711, United States
(b) Lockheed Martin, contractor to NCCT, Research Triangle Park, NC, United States
ARTICLE   INFO

Article history:
Received 20 March 2009
Received in revised form 31 March 2009
Accepted 31 March 2009
Available online 10 April 2009

Keywords:
Database
Environmental chemicals
Pesticides
Developmental toxicity
                                         ABSTRACT
       As the primary source for regulatory developmental toxicity information, prenatal studies characterize
       maternal effects and fetal endpoints including malformations, resorptions, and fetal weight reduction.
       Results from 383 rat and 368 rabbit prenatal studies on 387 chemicals, mostly pesticides, were entered
       into the U.S. Environmental Protection Agency's (EPA) Toxicity Reference Database (ToxRefDB) using
       harmonized terminology. An initial assessment of these data was performed with the goal of profiling
       environmental chemicals based on maternal and fetal endpoints for anchoring in vitro data provided in the
       EPA's ToxCast™ research program. Using 30 years' worth of standard prenatal studies, maternal and fetal
       effects were culled from the database and analyzed by target-description fields and lowest effect levels
       (LELs). Focusing on inter-species comparison, the complexity of fetal target organ response to maternal
       dosing with environmental chemicals during the period of major organogenesis revealed hierarchical
       relationships. Of 283 chemicals tested in both species, 53 chemicals (18.7%) had LELs on development
       (dLEL) that were either specific, with no maternal toxicity (mLEL), or sensitive (dLEL
-------
hood. Standard practice for assessing disruptions in embryogenesis
involves  testing pregnant  laboratory  animals  of two  species,
typically rats  and  rabbits, exposed during the  period  of major
organogenesis  and evaluated just prior to term along with mon-
itoring maternal status throughout pregnancy. Under this design
the major manifestations of developmental toxicity may express as
one or more of a number of possible endpoints such as intrauterine
death, fetal growth retardation, structural variations and abnormal-
ities [5,6]. Predictive modeling of developmental toxicity requires a
computational framework that can integrate mechanistic data with
high-quality toxicity data from in vivo studies.
   EPA's Toxicity Reference Database  (ToxRefDB) has been imple-
mented with animal-based  toxicity data  from chronic/cancer rat
and cancer mouse studies [7], multi-generation  reproduction rat
studies [8], and prenatal developmental toxicity studies in rats and
rabbits (described here). The data has been manually entered from
source documents representing EPA's Office of Pesticide Programs
(OPP) reviews  of registrant-submitted guideline studies known
as Data Evaluation Records  (DERs). The initial build of ToxRefDB
entered these data for 387 chemicals that include 280 chemicals
within  Phase-I of EPA's ToxCast™  research program [4]. Here we
describe the implementation of ToxRefDB for prenatal developmen-
tal toxicity studies. Experimental  protocols in general follow EPA
Health Effects Test Guidelines OPPTS 870.3700 [9] or the preceding
OPP 83-3 guideline [10], and are similar to the OECD guideline for
prenatal developmental toxicity testing [11].
   Databases of birth defect registries [12,13], developmental toxi-
cology literature (http://rsi.ilsi.org/Projects/devtoxsar.htm) [14,15],
and animal studies (http://ntp.niehs.nih.gov/) [6,16] have generally
looked to identify relationships within classes of toxic agents, devel-
opmental outcomes, test species and human populations (see [17]).
ToxRefDB represents the first large-scale implementation of its kind
for profiling the activity of environmental chemicals based on a
comprehensive analysis of source data for a broad  range of end-
points relevant to EPA risk assessments, including developmental
toxicity.
   The present study describes the initial build of ToxRefDB for
prenatal  developmental  toxicity  studies  (herein  referred to as
ToxRefDB_prenatal). Through this implementation, a detailed anal-
ysis is possible to link observational relationships within classes
of toxic agents and  developmental outcomes in  rats  and rabbits.
Rather than an exhaustive analysis of chemical-endpoint linkages,
the present study was designed to identify and evaluate  key hier-
archical relationships  that represent the primary determinants of
developmental toxicity. Focusing on inter-species comparison, the
study goals were to evaluate: (1) the complexity of fetal target organ
response to maternal dosing with environmental chemicals during
the period of major organogenesis, e.g., how many times target
endpoints were affected by chemical; (2) the relative sensitivity
and specificity of maternal and fetal parameters in comparing these
responses between rat and rabbit test species; and (3) how many
times each  chemical  was counted  by target  endpoint, e.g., how
many chemicals in the set produce a certain effect. Profiling devel-
opmental toxicity in this manner has revealed a number of findings
that are consistent with previous database studies of developmen-
tal toxicology, some that differ with those studies, and some novel
relationships. The novel data model reported here is envisaged to
provide an important public resource for mechanistic modeling and
predictive understanding  of developmental processes and toxici-
ties.

2.  Implementation and methods

2.1.  Data sources

   The initial build of ToxRefDB [7] indexed 4618 DERs of different study types
(chronic/cancer, sub-chronic, reproductive, developmental) and test species (mostly
    rat, mouse, and rabbit). The DERs were from EPA's Office of Pesticide Programs
    (OPP) within the Office of Prevention, Pesticides and Toxic Substances (OPPTS).
    Source DERs consisted of 3775 printed documents optically scanned to ".tiff" files;
    837 WordPerfect documents; and 6 documents in other electronic file formats.
    Each DER was indexed by filename convention of Pesticide Chemical (PC) Code,
    Master Record Identification (MRID) number, study type identification number
    (based on most relevant 870 series OPPTS harmonized health effect guidelines),
    species code, and review identification number and version code [7,8]. Information
    on chemical identity and structure was provided by the EPA DSSTox program
    (www.epa.gov/ncct/dsstox/index.html). The work described here specifically covers
    a subset of 1318 DERs indexed for 'prenatal developmental toxicity' denoted by the
    filename extension 3700, and subsequently referred to here as ToxRefDB_prenatal.

    2.2. Source vocabulary

       The use of standardized nomenclature is essential for ToxRefDB operations. An
    internationally harmonized terminology for developmental toxicology was estab-
    lished in 1997 by the International Federation of Teratology Societies (IFTS) [18].
    A subsequent series of workshops on terminology development eliminated certain
    ambiguities and established working definitions for malformations and variations
    [19-21]. The DevTox lexicon was downloaded from www.DevTox.org. An enhanced
    annotation system was used by ToxRefDB in which 895 terms from the harmonized
    nomenclature were joined with standardized terms from the OECD-OPPTS vocabulary
    [11] to generate a thesaurus of 988 non-redundant terms that apply to maternal
    and developmental endpoints. In the enhanced system, 'description' annotates the
    particular apical endpoint or phenotype (observation) and 'target' annotates coarse
    regional anatomy (localization). The description-target fields represent the basic
    observational effects entered into ToxRefDB_prenatal.

    2.3. Data entry and  quality assurance

       The data entry tool was developed in Microsoft Access® and implemented using
    an open source MySQL™ platform [7]. The relational model took inputs from ToxML
    [22] and included metrics for data integrity, quality, updateability, and standardiza-
    tion. Quality control (QC)  consisted of 100% cross-checking of studies, systematic
    updates of ToxRefDB to ensure consistency across the studies, expert review of data
    outputs, and  external review by registrants. All data entered into ToxRefDB have
    undergone cross-checking, which entailed a second person validating each entered
    value based on the source  information (primarily DERs). Systematic quality control
    involved querying the database for potential inconsistencies (e.g., fetal effects being
    assigned to the maternal treatment group) along with updating vocabularies and
    related records [7].

    2.4. Source information

       The 1318 DERs  for prenatal developmental studies encompassed 4896 dose
    groups. Doses were expressed as mg/kg-d where available since the studies were
    conducted via gavage. Endpoint parameters entered as 'adult' included maternal
    body weight gain, food and water consumption, fertility and pregnancy, and other
    general maternal effects. Parameters entered as 'fetal' included fetal weight
    reduction, skeletal variations, malformations and other pathologies. Because two
    annotation systems were used to enter data into ToxRefDB (DevTox.org and OECD-OPPTS),
    different expressions of fetal wastage cross-reference maternal and conceptal
    features. For that reason, the present analysis lumped all expressions of fetal wastage,
    including pre-implantation loss, implantation failure, resorptions, fetal death, and
    pregnancy loss as a maternal feature, under the category of 'pregnancy-related losses'.
    Any maternal or fetal outcome tagged as a 'critical effect' in ToxRefDB was reported in
    the DER to have occurred at the minimum dose for which any specific effect or group
    of effects had been observed (LEL, Lowest Effect Level); the  next lowest dose being
    the NEL (No Effect Level). Although some of these dose levels may represent the
    NOAEL and LOAEL (No Observed Adverse Effect Level, and Lowest Observed Adverse
    Effect Level) used for risk assessment purposes across different study types, the ter-
    minology used here (NEL,  LEL) is intended specifically to rank chemical endpoints
    and endpoint-combinations on maternal and fetal parameters. As such, these terms
    are used without regulatory implications.
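
    The LEL/NEL convention described above can be sketched in a few lines of Python.
    This is a minimal illustration only, assuming each study has been reduced to a list of
    (dose, effect-observed) pairs for the treated groups; the function name and the input
    layout are hypothetical and are not part of ToxRefDB.

```python
from typing import List, Optional, Tuple


def lel_and_nel(
    dose_groups: List[Tuple[float, bool]]
) -> Tuple[Optional[float], Optional[float]]:
    """Return (LEL, NEL) in mg/kg-d for one study.

    dose_groups holds (dose, treatment_related_effect_observed) pairs.
    LEL is the lowest dose at which any effect was observed; NEL is the
    next lowest tested dose below the LEL.
    """
    effect_doses = [dose for dose, has_effect in dose_groups if has_effect]
    if not effect_doses:
        # Assumption: with no observed effect, no LEL (or NEL) is assigned.
        return None, None
    lel = min(effect_doses)
    lower_doses = [dose for dose, _ in dose_groups if dose < lel]
    # Assumption: NEL is undefined when the LEL is the lowest dose tested.
    nel = max(lower_doses) if lower_doses else None
    return lel, nel


# Hypothetical dose groups, for illustration only.
print(lel_and_nel([(10.0, False), (50.0, True), (250.0, True)]))
```

    Returning None rather than a placeholder dose keeps "not established" cases distinct
    from true numeric levels when chemicals are later ranked.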

    2.5. ToxRefDB data extraction

       Relational data  were expressed using specific SQL™ queries and global data
    dump to a sortable  data grid having rows of exposure conditions and columns of
    input/output criteria. Input source information included details on study design such
    as unique study identifier (MRID), chemical CAS registry number (CASRN), route of
    administration, exposure window, and dose level (mg/kg-d). Output endpoint effects
    included details on  evaluation criteria such as the biological compartment (adult,
    fetus), type of effects (developmental, systemic), their localization and phenotype
    (target, description), and any LEL noted (maternal mLEL, or developmental dLEL).
    Because ToxRefDB entered data for description-target effects individually when
    more than one effect may have occurred within the same  fetus or litter, the data
    grid replicated rows if more than one treatment-related effect was entered for the
    same dose group in  a particular study.

-------
                                             T.B. Knudsen et al. / Reproductive Toxicology 28 (2009) 209-219
Table 1
Summary statistics for prenatal developmental toxicity studies entered into ToxRefDB (December 31, 2008).

ToxRefDB_prenatal^a                                Rat    Rabbit   Ratio^f   Normalized^g   Species bias^h
A. Input source information
  Number of studies entered (MRID)                 383    368      1.04      0.00
  > Studies passing acceptability criteria^b       357    325      1.10      0.08
  Number of chemicals represented (CASRN)          372    320      1.16      0.16
  Number of dose groups (mg/kg-d) represented^c    2469   2327     1.06      0.03
  > Dose groups by oral administration^d           2463   2307     1.07      0.04
B. Output endpoint effects
  Number of dose-effect groups recorded^e          5592   4749     1.18      0.15
  > Maternal endpoint effects (pregnancy)          2429   2462     0.99      -0.10          Rabbit
  > Reduced maternal weight gain                   596    482      1.24      0.23
  > Resorptions/fetal loss                         262    498      0.53      -1.00          Rabbit
  > Fetal endpoint effects (developmental)         1588   716      2.22      1.06           Rat
  > Fetal weight reduction                         182    95       1.92      0.86           Rat
  > Developmental defects                          1383   611      2.26      1.09           Rat

   a Denotes prenatal developmental toxicity studies.
   b Acceptable guideline pre-1998, acceptable guideline post-1998, acceptable non-guideline.
   c Adult and fetal evaluation in the same study counts as separate dose groups.
   d Includes gavage, intubation, feed; residual routes of administration were dermal, subcutaneous, or not indicated.
   e Total number of endpoint effects for mother, fetus or pups recorded across all dose groups.
   f Rat to rabbit ratio.
   g Inputs normalized to studies entered and outputs normalized to dose groups (mg/kg-d) represented and log2-transformed.
   h Rule for species bias = normalized ratio falls outside the range -0.02 < ratio < +0.46 based on 95% confidence interval on the mean.
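
The normalization and species-bias rule in the Table 1 footnotes can be illustrated with a short Python sketch. It assumes that "normalized" means dividing a row's rat-to-rabbit ratio by a reference ratio (studies entered for inputs, dose groups for outputs) before taking log2; that reading reproduces the tabulated values to within rounding, but it is an inference from the footnotes rather than a formula stated in the text. Function names are hypothetical.

```python
import math


def normalized_log2_ratio(rat, rabbit, ref_rat, ref_rabbit):
    """log2 of the rat-to-rabbit ratio after dividing by a reference ratio."""
    ratio = rat / rabbit
    reference = ref_rat / ref_rabbit
    return math.log2(ratio / reference)


def species_bias(normalized, lower, upper):
    """Flag a row whose normalized ratio falls outside (lower, upper)."""
    if normalized < lower:
        return "Rabbit"
    if normalized > upper:
        return "Rat"
    return ""


# Table 1B rows, normalized to dose groups represented (2469 rat, 2327 rabbit).
rows = {
    "Fetal endpoint effects (developmental)": (1588, 716),
    "Resorptions/fetal loss": (262, 498),
    "Reduced maternal weight gain": (596, 482),
}
for label, (rat, rabbit) in rows.items():
    value = normalized_log2_ratio(rat, rabbit, 2469, 2327)
    print(label, round(value, 2), species_bias(value, -0.02, 0.46))
```

Small differences from the published column are expected because the table appears to have been computed from ratios already rounded to two decimals.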
2.6. Data analysis and visualization

   Associations for exposure-effect and effect-effect were built across dose groups
for all prenatal developmental studies. Summary statistics for each treatment-
related effect extracted from ToxRefDB were based on target-description entity and
its higher level classification by embryological system. Dose values in mg/kg-d were
used to calculate potency for each chemical in rat and/or rabbit at the maternal,
developmental, and categorical LEL. For consistency, the data transformation rule
used here was the same one applied in the chronic/cancer and multi-generation
reproduction studies [7,8]: LEL mg/kg-d extracted when available for maternal and
developmental effects, else blank; compute -log2(LEL mg/kg-d), else zero (LTD
present) or blank; and add constant = 12 to scale the data between ~0.0 (very high
LEL dose, low potency) and ~20 (very low LEL dose, high potency). Unsupervised two-way hierarchical clustering used
Pearson's dissimilarity measure for both chemicals and effects. This analysis used
Ward's method for linkage [23] and the agglomerative clustering method and was
implemented in R version 2.6.1 [24]. Clusters of chemicals were identified based on
a distance height cutoff of four.
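
The LEL transformation rule above amounts to a potency score of 12 - log2(LEL). The sketch below is a minimal Python rendering of that arithmetic; the function name is hypothetical, and mapping a missing LEL to zero is an assumption based on the "else zero" clause of the rule.

```python
import math
from typing import Optional


def potency_score(lel_mg_per_kg_day: Optional[float], offset: float = 12.0) -> float:
    """Potency score = offset - log2(LEL); higher scores mean lower LEL doses."""
    if lel_mg_per_kg_day is None:
        # Assumption: a missing LEL is scored as zero ("else zero" in the rule).
        return 0.0
    if lel_mg_per_kg_day <= 0:
        raise ValueError("LEL must be a positive dose in mg/kg-d")
    return offset - math.log2(lel_mg_per_kg_day)


for lel in (2048, 62.5, 0.015625, None):
    print(lel, round(potency_score(lel), 2))
```

With the constant of 12, a LEL of 2048 mg/kg-d scores 1.0 and a LEL of 0.015625 mg/kg-d scores 18.0, so each unit of score corresponds to a 2-fold change in dose.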

3. Results

3.1. Input source information

   Table 1A summarizes several fields of input source information
for ToxRefDB_prenatal. With few exceptions (<2% DERs) the
information entered derived from studies that evaluated chem-
ical effects in  pregnant rats or  rabbits.  The  number  of studies
entered by MRID was 383 for rat and 368 for rabbit, with the few
exceptions being mouse, hamster and other species. Due to the pre-
dominance of rat and rabbit studies we focus here on these two
species. OPP acceptability criteria [10] designated 79.1% of studies
as acceptable/guideline (pre-1998);  5.5% as acceptable/guideline
(post-1998); 6.3% as acceptable/non-guideline; 3.5% with deficient
evaluation; and 5.6% as unacceptable. No attempt was made to re-
evaluate the acceptability or deficiencies of the studies relative to
the different guidelines; as such, the results for  all studies were
considered as presented in DERs.
   The number of  chemicals  by  CASRN represented  across rat
or rabbit studies was 387. About  280  of these overlap with 320
ToxCast™  phase-I chemicals;  therefore,  a number of chemicals
entered into ToxRefDB_prenatal are not currently represented in
ToxCast™ and some ToxCast™ chemicals cannot draw on ToxRefDB
for direct information on developmental effects. Although the pregnant
female was the usual exposure unit for these studies, we
emphasize total number of dose groups (mg/kg-d) counted across
chemicals: 2469 for rat studies and 2327 for rabbit studies (Table 1).
As such, dose groups are replicated in adult and fetal evaluations
for the same study. The usual (>97.7%) route of exposure was oral
(gavage, intubation, feed).
   Information about dosing interval (start, finish, duration) is summarized
schematically in Fig. 1. This plots the exposure design in
ToxRefDB_prenatal based on cumulative start and completion dates
[Fig. 1 graphic: dosing-period histograms for RAT and RABBIT plotted against gestation days.]
Fig. 1. Exposure design summarized for all dose groups entered into ToxRefDB prenatal
developmental toxicity studies. Gestational period for rat and rabbit mapped by
gestation days (GD) from fertilization (GD 0) through usual parturition, GD 21-22
in rat and GD 30-32 in rabbit. The gray shaded histogram graphs the cumulative
number of dosing groups across DERs for rat (2469) and rabbit (2327). The fuzzy
exposure window across ToxRefDB is indicated by the most frequent start and finish
days for dosing (e.g., GD 6-17 in rat and GD 6-20 in rabbit). Superimposed is the
period of dosing for ICH 4.1.3 Segment II study covering primitive streak formation
through palatal closure. The shaded arrowhead denotes the usual time of evaluation
in guideline rat (GD 20) and rabbit (GD 29) studies. A few ToxRefDB studies extended
into the postnatal period for rats (postnatal day 3, line) and rabbits (postnatal day
42, not indicated).

-------
Table 2
Distribution of developmental effects across ToxRefDB dose groups.

ToxRefDB_prenatal                                         Rat    Rabbit   Ratio^b   Normalized^c   Species bias^d
Number of dose groups (mg/kg-d) represented               2469   2327     1.06      -
Dose-effect groups with fetal (developmental) effects^a   1404   709      1.98      0.90
Skeletal defects                                          956    366      2.61      1.30
> Appendicular                                            185    77       2.40      1.18
> Axial                                                   640    241      2.66      1.32
> Cranial                                                 126    48       2.63      1.31
Orofacial defects                                         41     18       2.28      1.10
> Cleft lip/cleft palate                                  19     5        3.80      1.84
Neurosensory defects                                      28     22       1.27      0.26           Rabbit
> Brain                                                   15     13       1.15      0.12           Rabbit
> Optic                                                   13     9        1.44      0.45           Rabbit
Cardiovascular defects                                    8      10       0.80      -0.41          Rabbit
> Heart                                                   6      5        1.20      0.18           Rabbit
> Major vessels                                           2      5        0.40      -1.41          Rabbit
Urogenital defects                                        86     5        17.20     4.02           Rat
> Renal                                                   42     2        21.00     4.31           Rat
> Ureter                                                  40     2        20.00     4.24           Rat
> Genital                                                 4      1        4.00      1.92           (Rat)
Other visceral defects (splanchnic)                       16     23       0.70      -0.61          Rabbit
Body wall defects (somatic)                               53     14       3.79      1.84

   a Description-target terms for developmental effects (DevTox) integrated by systems-based ontology.
   b Rat to rabbit ratio.
   c Normalized to number of dose groups (mg/kg-d) represented and log2-transformed.
   d Rule for species bias = normalized ratio falls outside the range 0.60 < ratio < 1.91 based on 95% confidence interval on the mean.
of oral dosing across gestation in 2469 dose groups for the rat and
2327 dose groups for the rabbit. In general, the dosing period for
OECD guidelines coincided with the ICH 4.1.3 Segment II study
guidelines to cover major events in morphogenesis and organogen-
esis for these species; however, a number of studies in rabbit had an
earlier onset than these guidelines and in both species a minority
of studies extended treatment to near term or postnatal days.

3.2. Output endpoint effects

   Table 1B summarizes several fields of treatment-related effects.
These are sorted by adult and fetal compartments (Table 1B). The
endpoint effects recorded for the pregnant dam or doe included
general maternal parameters, food and water consumption, and
body weight gain as well as pregnancy-related indicators such as
increased resorptions and fetal wastage (pre-/post-implantation
losses, resorptions, intrauterine deaths). Fetal endpoints included
fetal weight reduction, structural abnormalities or variations, and
general fetal pathology. A small number of newborn observations
are not considered here. It should also be noted that the post-
treatment gestational interval was relatively longer in rabbits than
rats (Fig. 1); hence, any treatment-related effects that might have
been reversible or associated with developmental delay are more
likely to be detected in rats.
   Summarizing endpoint effects by dose group is a logical way to
array the response between rat and rabbit across a large number
of conditions. In this approach, each endpoint effect is linked to
a discrete condition (dose, chemical, species). Counting the num-
ber of effect-condition linkages in a class or subclass of endpoints
provides a qualitative measure of that response by its representa-
tion across the chemical spectrum. It is important to recognize that
ToxRefDB data entry tracks individual  target-description entities
for each discrete condition. As such, this read-out can artificially
amplify or de-emphasize specific classes of endpoints. ToxRefDB
registered 5592 effect-condition linkages in rat (average 2.3 effects
per non-zero dose group) and 4749 effect-linkages in rabbit (aver-
age 2.0 effects per non-zero dose group). Aggregating these effects
into higher  level classifications  revealed obvious species differ-
    ences in the representation of endpoints sorted by adult and fetal
    compartment (Table 1B). Using a simple rule to compute over-representation,
    resorptions and fetal losses were more prevalent
    in rabbit, and fetal weight reduction and developmental defects in
    the rat. A species bias could arise from complex factors such as bio-
    logical variation in embryology, differences in maternal behavior
    or physiology, sensitivity to various xenobiotic disturbances, or the
    time between dosing and evaluation.

    3.3. Developmental defects

       To gain deeper insight into the species  response we next exam-
    ined the spectrum of developmental (fetal) effects across all studies.
    Individual effect-condition linkages were  counted for 988 features
    in the enhanced DevTox thesaurus. This iterated 1404 and 709
    developmental  (fetal) effects across dose groups in rat and rab-
    bit, respectively and covered 293 of 988 (29.7%) target-description
    terms. Representation of individual effects and their occurrences
    in ToxRefDB is dependent on the nature of the embryological sys-
    tems  from which the observation was originally made. Skeletal
    defects, for example, are highly represented in part because most
    bone elements  are entered into the database as individual targets
    (vertebrae, ribs, femur and so forth) and then further annotated
    by a range of elementary descriptions (absent, incomplete ossifi-
    cation, misshapen, bent and so forth). Other systems with isolated
    occurrences of malformation such as the  heart, brain or eye  have
    relatively low representation in part because they are annotated as
    individual targets.
       Given this caveat, we aggregated defects into specific embry-
    ological systems and focused on this representation across species
    (Table 2). Cross-species differences exist for some effects or groups
    of effects aggregated by target system. Although we did not analyze
    each skeletal element by abnormality or variation, aggregating the
    individual occurrences into regional anatomy showed a similar dis-
    tribution of response across species (axial > appendicular > cranial).
    For regional orofacial defects (palate, jaw, hyoid) we find over-
    representation of cleft palate in the rat. Urogenital defects (renal,
    ureter, reproductive) are also highly over-represented in rat and

-------
Table 3
Distribution of LEL effects across ToxRefDB dose groups.

ToxRefDB_prenatal                                     Rat    Rabbit   Ratio    Normalized^a   Species bias^b
Dose-effect groups at mLEL or dLEL (total effects)    1711   1363     1.26     -
Dose-effect groups at mLEL (maternal)                 996    1013     0.98     -0.36          Rabbit
Maternal body weight gain                             976    991      0.98     -0.36          Rabbit
Maternal-pregnancy losses                             28     115      0.24     -2.37          Rabbit
Embryo-fetal losses                                   45     91       0.49     -1.35          Rabbit
Dose-effect groups at dLEL (fetal)                    715    350      1.62     0.70
Fetal weight reduction                                95     57       1.32     0.40
Variations and abnormalities                          609    288      1.68     0.75
> Skeletal defects                                    492    189      2.07     1.05           Rat
> Appendicular                                        97     36       2.14     1.10           Rat
> Axial                                               320    120      2.12     1.08           Rat
> Cranial                                             66     25       2.10     1.07           Rat
> Urogenital defects                                  34     4        6.75     2.75           Rat
> Renal                                               17     2        6.75     2.75           Rat
> Ureter                                              10     2        3.97     1.99           Rat
> Genital                                             7      0        >5.56    >2.47          Rat
> Orofacial defects                                   20     11       1.44     0.53
> Cleft lip / cleft palate                            10     2        3.97     1.99           Rat
> Neurosensory defects                                6      11       0.43     -1.21          Rabbit
> Brain                                               4      8        0.40     -1.33          Rabbit
> Eye                                                 2      3        0.53     -0.92          Rabbit
> Cardiovascular defects                              4      7        0.45     -1.14          Rabbit
> Heart                                               2      3        0.53     -0.92          Rabbit
> Major vessels                                       2      4        0.40     -1.33          Rabbit
> Other visceral defects (splanchnic)                 3      10       0.24     -2.07          Rabbit
> Body wall defects (somatic)                         9      0        >7.14    >2.84          Rat

   a Normalized to number of dose groups with critical effects and log2-transformed.
   b Rule for species bias = normalized ratio falls outside the range -0.26 < ratio < 0.88 based on 95% confidence interval on the mean.
somatic body wall defects  (e.g., umbilical hernia, diaphragmatic
hernia) nearly so. In contrast, neurosensory defects (brain, eye),
cardiovascular defects (heart, great vessels)  and defects  of the
splanchnic body wall (abdominal and thoracic viscera) are over-
represented in rabbit (Table 2). Incomplete ossification and missing
skeletal elements are the most frequent observational terms in both
species (not shown).
   Would a similar discordance follow endpoint effects limited to the
dLEL? We  looked at  the defects that occurred at the LEL for any
effect versus all effects no matter what dose (above). ToxRefDB iter-
ated dLEL effects in 609 and 288 effect-conditions for rat and rabbit
fetuses, respectively and covered 212 of 988 (21.5%) description-
target terms.  LELs were recorded for 202 of 383 (52.7%) chemicals
tested in rat (3.2 effects per chemical on average) and 193  of 368
(52.4%) chemicals tested in rabbit (2.3 effects per chemical on aver-
age). Although total occurrence of effects or groups of effects was
predictably lower at the dLEL than when all dose groups are con-
sidered, the species  dissymmetry was  evident  (Table 3). Similar
to the analysis noted above, skeletal defects, cleft palate, urogen-
ital defects, and somatic body wall defects predominated  in the
rat, whereas  neurosensory defects, defects of the cardiovascular
system and splanchnic (visceral body wall) defects sided toward
rabbit. Effects on maternal body weight gain and fetal weight reduc-
tion were species-neutral although maternal-pregnancy losses and
embryo-fetal losses were more evident in rabbits (Table 3). We may
conclude from these findings that patterns of effects seen at the LEL
are also manifested at higher doses.

3.4. Developmental activity

   Profiling chemicals by developmental toxicity is an important
output from ToxRefDB; however, this analysis must consider that
developmental effects may not  be the most sensitive endpoint  in
      the database. For example, any particular chemical may be highly
      ranked based on dLEL but express an even lower NOAEL/LOAEL in
      other types of studies. Capacity for developmental activity does
      not necessarily indicate that the endpoint identified in a prenatal
      study is the most sensitive endpoint in the database or for that
      matter the most appropriate for the exposure scenarios being eval-
      uated. Furthermore, developmental activity does not necessarily
      equate to risk since this  analysis  does not take into considera-
      tion exposure  to the human population, a key element in the
      determination of risk. To  evaluate the developmental  activity of
      chemical responses, we applied a  rules-based approach  adapted
      from ToxRefDB chronic/cancer and  multi-generation  reproduc-
      tion studies [7,8]. A chemical response was ranked by LEL dose
      level for mLEL and dLEL effects using the value, Θ, representing
      -log2(mg/kg-d) computed for each chemical, median-centered
      and scaled (e.g., Θ = 1.0 when LEL = 2048 mg/kg-d and Θ = 18.0
      when LEL = 0.015625 mg/kg-d). This derived parameter (Θ) was
      useful as a general metric for representing chemical activity
      in a computable form, based on the administered dose at the
      LEL. Fig. 2 correlates mLEL and dLEL within and between
      species; 283 chemicals had computable Θ based on mLEL and
      dLEL. This implies  any sort of treatment-related effect whether
      developmental or not. We  generally considered correlations inside
      2-fold as  concordant within studies (maternal versus fetal) and
      10-fold between  studies  (rat versus  rabbit).  The  correlation
      ranged slightly toward the maternal  field in  rabbits  (Fig. 2).
      For a subset of chemicals, however, the LEL effects were sensi-
      tive or specific for developmental endpoints  and are described
      below.
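
      The 2-fold and 10-fold concordance criteria can be expressed as a simple fold-difference
      check on two LEL doses. The Python sketch below is illustrative only: the function names
      are hypothetical, and treating the limits as strict ("less than") follows the wording of
      the Fig. 2 caption rather than any stated implementation.

```python
def fold_difference(lel_a: float, lel_b: float) -> float:
    """Fold difference between two LEL doses (always >= 1)."""
    if lel_a <= 0 or lel_b <= 0:
        raise ValueError("LEL doses must be positive")
    return max(lel_a, lel_b) / min(lel_a, lel_b)


def is_concordant(lel_a: float, lel_b: float, comparison: str) -> bool:
    """Apply the 2-fold (within study) or 10-fold (between studies) criterion."""
    limits = {"within": 2.0, "between": 10.0}
    return fold_difference(lel_a, lel_b) < limits[comparison]


# Prodiamine values quoted in the text: rat mLEL 300, rat dLEL 100, rabbit mLEL 100 mg/kg-d.
print(is_concordant(300, 100, "within"))   # maternal versus fetal, rat
print(is_concordant(300, 100, "between"))  # rat versus rabbit, maternal
```

      Because the potency score is a log2 transform of dose, a 2-fold limit corresponds to a
      difference of one score unit and a 10-fold limit to about 3.3 units.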
         Which chemicals had LEL effects that were developmentally specific?
      A response was considered 'specific' for developmental
      toxicity if an effect or class of effects was recorded at the dLEL
      (Θ > 2.0) without maternal toxicity. Benomyl, for example, was

-------
[Fig. 2 graphic: four scatter-plot panels comparing Maternal and Developmental Lowest Effect Levels (mg/kg/day) for Rat and Rabbit; axes run from 0.03125 to 2048 mg/kg/day on a doubling scale; NE = Not Established.]
Fig. 2. Chemical potency for maternal and developmental toxicity. Maternal and developmental lowest effect levels (LELs) extracted from ToxRefDB were compared for rat
and rabbit for 386 distinct chemicals. Data were graphed between studies to correlate rat and rabbit studies for maternal LEL (top left panel) and developmental LEL (top right
panel), as well as within studies to compare maternal and developmental LELs for rat studies (lower left panel) and rabbit studies (lower right panel). Points within amber
lines indicate less than 2-fold difference and points within orange lines indicate less than 10-fold difference. NE, not established due to lack of observation of effects.
flagged for specific developmental toxicity in the rat because
dLEL = 62.5 mg/kg-d (Θ = 6.03) without maternal effects. Overall
this rule flagged 16 unique chemicals of 283 chemicals tested
(5.7%) in both species. In 11 of the 16 instances, the chemical
produced maternal toxicity in the other test species. Benomyl had
no maternal toxicity in the rat but mLEL = 180 mg/kg-d in the rabbit.
   Which chemicals had LEL effects that were developmentally sensitive?
A response was considered 'sensitive' for developmental
toxicity if the dLEL dose for any fetal endpoint was lower than the
corresponding mLEL for the chemical within a species. Prodiamine,
for example, was flagged for developmental sensitivity in the rat
because dLEL = 100 mg/kg-d (Θ = 5.36) versus mLEL = 300 mg/kg-d
(Θ = 3.77). Overall this rule flagged 38 of 283 chemicals (13.4%)
in rat (30) or rabbit (11). Only 4 of them matched (dLEL < mLEL)
across species. Prodiamine showed greater potency
toward maternal effects in the rabbit (mLEL = 100 mg/kg-d).
   In total, 53 of 283 chemicals (18.7%) had critical effects on
development that were either specific (no maternal toxicity) or
sensitive (dLEL < mLEL) to exposure in one species or another; in
43 cases a Θ-value was computed for maternal and developmental
effects in both species (see Table 4). The complete list of m/d-LEL
and Θ-values for all ToxRefDB chemicals can be downloaded from
http://www.epa.gov/ncct/toxrefdb/.
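
The 'specific' and 'sensitive' rules described in this section can be sketched as a small Python classifier for one chemical in one species. This is a hedged illustration: the function names are hypothetical, a missing LEL is represented as None, and the potency score is taken as 12 - log2(LEL), which reproduces the values quoted in the text.

```python
import math
from typing import Optional


def potency_score(lel: float) -> float:
    """Potency score for a LEL dose in mg/kg-d (12 - log2(LEL))."""
    return 12.0 - math.log2(lel)


def developmental_flag(mlel: Optional[float], dlel: Optional[float]) -> Optional[str]:
    """Classify one chemical in one species from its maternal and developmental LELs."""
    if dlel is None:
        return None  # no developmental LEL recorded
    if mlel is None:
        # 'Specific': dLEL recorded with score > 2.0 and no maternal toxicity.
        return "specific" if potency_score(dlel) > 2.0 else None
    # 'Sensitive': developmental LEL lower than the maternal LEL.
    return "sensitive" if dlel < mlel else None


print(developmental_flag(None, 62.5))  # benomyl, rat
print(developmental_flag(300, 100))    # prodiamine, rat
print(developmental_flag(180, 180))    # equal maternal and developmental LELs
```

Keeping the two rules mutually exclusive (specific requires no maternal LEL, sensitive requires one) matches how the text counts them separately before combining the totals.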
    3.5. Chemical-phenotype linkages

       Linkage classification clustered the LEL effects from 283 chemicals
    utilizing maternal and developmental potency scores (Θ), as
    well as categorical LELs (cLEL) registered for each class of developmental
    effect. Hierarchical clustering is shown in Fig. 3. The primary
    division segmented 90 chemicals based on maternal toxicity. This
    grouping included chemicals with high Θ scores for reduced maternal
    body weight gain, pregnancy-related losses and resorptions.
    Distinct responses in rabbit and rat further grouped these 90 chemicals
    into subclasses of 40 and 50 chemicals, respectively. There
    were 9 chemicals without maternal or fetal effects. All remaining
    chemicals shared relatively high Θ(dLEL) scores: 80 chemicals positive
    for developmental toxicity in the rat, 50 chemicals positive
    in the rabbit, and 54 chemicals with developmental toxicity across
    species.
       How well do the different endpoint effects align with developmental
    potency? Most chemicals with higher Θ(dLEL) scores had
    high Θ(mLEL) scores as well. The overlap between chemicals in
    these clusters and the 43 chemicals flagged for developmental
    activity is given in Table 4. Whereas reduction in maternal body
    weight correlated with Θ(mLEL), fetal weight reduction correlated
    with Θ(dLEL). We did not observe any correlation between weight
    changes in the pregnant mother and developing fetus at term for
    either species. It is clear from Fig. 3 that a second level of clustering is

-------
Table 4
Chemicals (rat and rabbit data) tagged for developmental sensitivity or specificity.
                                     RAT (mg/kg-d)     RABBIT (mg/kg-d)   RAT (Θ)         RABBIT (Θ)
CAS no.       Chemical name          mLEL     dLEL     mLEL     dLEL      MAT     DEV     MAT     DEV
CLUSTER-80 developmental toxicity in rat (subcluster of 17 chemicals)
34256-82-1    Acetochlor             600      150                         2.77    4.77    0.00    0.00
17804-35-2    Benomyl                         62.5     180      180       0.00    6.03    4.51    4.51
3691-35-8     Chlorophacinone        0.1      0.0125   0.025    0.025     15.32   18.32   17.32   17.32
120116-88-3   Cyazofamid                      1000                        0.00    2.03    0.00    0.00
83657-24-3    Diniconazole           20       1                           7.68    12.00   0.00    0.00
131341-86-1   Fludioxonil            1000     100      100                2.03    5.36    5.36    0.00
103361-09-7   Flumioxazin            30       10       3000               7.09    8.68    0.45    0.00
361377-29-9   Fluoxastrobin          1000     300      400                2.03    3.77    3.36    0.00
85509-19-9    Flusilazole            50       0.4      35       35        6.36    13.32   6.87    6.87
93-65-2       MCPP acid                       125      75                 0.00    5.03    5.77    0.00
76738-62-0    Paclobutrazol                   40       125      125       0.00    6.68    5.03    5.03
86209-51-0    Primisulfuron-methyl            500      300                0.00    3.03    3.77    0.00
29091-21-2    Prodiamine             300      100      100                3.77    5.36    5.36    0.00
28434-00-6    S-Bioallethrin         195      50       300                4.39    6.36    3.77    0.00
148477-71-8   Spirodiclofen                   1000     300                0.00    2.03    3.77    0.00
149979-41-9   Tepraloxydim           360      120      180                3.51    5.09    4.51    0.00
55219-65-3    Triadimenol            15       5        125                8.09    9.68    5.03    0.00
CWSTER-54 developmental toxicity in rats and rabbits (subcluster of 17 chemicals)
1912-24-9
57966-95-7
85-00-7
79241-46-6
79622-59-6
117337-19-6
79983-71-4
36734-19-7
16484-77-8
141112-29-0
2212-67-1
123312-89-0
118134-30-8
4151-50-2
111988-49-9
210631-68-8
87820-88-0
Atrazine
Cymoxanil
Diquat dibromide
Fluazifop-P-butyl
Fluazinam
Fluthiacet-methyl
Hexaconazole
Iprodione
Mecoprop-P
Isoxaflutole
Molinate
Pymetrozine
Spiroxamine
Sulfluramid
Thiacloprid
Topramezone
Tralkoxydim
25
75
4
300
250

250

50
500
140
100
100
3.3
50
100
200
5
25
40
5
50

2.5
200
100
100
35
30
10
13.3
10
100
3
75

3
50
7

100
60
50
100
200
75
80
3
10

100
75
8
1
50
4
1000
50
200
20
5
200
75
80
0.3
10
50
100
7.36
5.77
10.00
3.77
4.03
0.00
4.03
0.00
6.36
3.03
4.87
5.36
5.36
10.28
6.36
5.36
4.36
9.68
7.36
6.68
9.68
6.36
0.00
10.68
4.36
5.36
5.36
6.87
7.09
8.68
8.27
8.68
5.36
10.42
5.77
0.00
10.42
6.36
9.19
0.00
5.36
6.09
6.36
5.36
4.36
5.77
5.68
10.42
8.68
0.00
5.36
5.77
9.00
12.00
6.36
10.00
2.03
6.36
4.36
7.68
9.68
4.36
5.77
5.68
13.74
8.68
6.36
5.36
CLUSTER-SO developmental toxicity in rabbits (subcluster of 9 chemicals)
61-82-5
99607-70-2
33629-47-9
82697-71-0
94361-06-5
142-59-6
82-68-8
43121-43-3
199119-58-9
3-Aminotriazole
AA 5-C-8-Q"
Butralin
Clofencet
Cyproconazole
EBC'
Quintozene
Triadimefon
Trifloxysulfuron-sodium

400
1250

12
75

25
1000
1000
400
500
1000
12
7.5
750
50
1000
80
300
135
500
50
32.8
125
120
250
80
60
45
500
10
2.62
250
50
100
0.00
3.36
1.71
0.00
8.42
5.77
0.00
7.36
2.03
2.03
3.36
3.03
2.03
8.42
9.09
2.45
6.36
2.03
5.68
3.77
4.92
3.03
6.36
6.96
5.03
5.09
4.03
5.68
6.09
6.51
3.03
8.68
10.61
4.03
6.36
5.36
   Abbreviations: EBC: ethylenebisdithiocarbamate, disodium: AA 5-C-8-Q: acetic acid, {(5-chloro-8-quinolinyl)oxy}-,l-methylhexyl ester: potency score 0 = -log2(LEL):
clusters correspond to chemical numbers in Fig. 3. Blank cells mean no effects reported.
patterned by θ(cLEL) scores for fetal weight reduction and skeletal
defects. The θ(cLEL) for embryonic-fetal loss was linked between
species, although based on critical effect counts this endpoint
was over-represented in rabbits. The most strongly correlated variables
among embryonic targets overall were rat appendicular-cranial
skeleton (correlation coefficient = 0.811) and rat kidney-ureter (cor-
relation = 0.843). Fig. 4 plots the chemical counts for each endpoint
variable at the dLEL and across all dose groups, dLEL or higher. As can
be seen, the dLEL is sufficient to pick up categorical effects for most
but not all chemicals within each test species. A potentially interesting
association is the over-representation of somatic body wall defects
in rats and of splanchnic (visceral) body wall defects in rabbit.

4. Discussion

   Mining 30 years' worth of guideline animal studies using
ToxRefDB classified chemical-phenotype relationships for hun-
dreds of chemicals and endpoints related to pregnancy outcome.
The process of gathering, curating, and integrating these data con-
       stitutes a considerable effort that now for the first time provides a
       common data model for mining the prenatal developmental tox-
       icity of environmental chemicals. This repurposes  study reviews
       from their original use in regulatory toxicology decisions to a novel
       use to anchor high-throughput screening assays in  ToxCast™ [4].
       ToxRefDB derives maternal and fetal data  comprehensively from
       reviews of prenatal studies on pregnant animals. The  present imple-
       mentation captured data on 387 environmental chemicals, mostly
       pesticides, from 751 studies in pregnant rats or rabbits. This imple-
       mentation adds to the considerable body of reference toxicity data
       for these chemicals for chronic/cancer rodent endpoints [7] and
       multi-generation reproduction rat endpoints [8].
         Focusing on inter-species comparison, the complexity of fetal
       target organ response  to maternal dosing with environmental
       chemicals during the period of major organogenesis revealed hier-
       archical relationships. There was a clear hierarchy to the sensitivity
       and specificity of maternal and fetal LELs in comparing responses
       between chemicals and inter-species, with rats being more sensi-
       tive to developmental effects than rabbits. The dependence of any
Fig. 3. Hierarchical relationship of ToxRefDB chemicals to developmental effects. Chemicals (columns) by effects (rows) for 283 chemicals with potency scores (θ) in rat (A,
dark green) and rabbit (C, light green). Clustering by Pearson's and Ward linkage; color intensity represents potency score. Variables are maternal LEL (mLEL), developmental
LEL (dLEL), and categorical LELs. Other effects classes are grouped by system: cardiovascular (CV), general (GN), neurosensory (NS), orofacial (OF), pregnancy-related (PR),
skeletal (SK), trunk (TR) and urogenital (UG). Abbreviations for specific effects targets: reduced maternal body weight gain (MBW), fetal weight reduction (FWR), maternal-
pregnancy losses (PRL), embryo-fetal losses (RES), general fetal pathology (GRL); skeletal defects - axial (AXL), appendicular (APP), cranial (CRN); orofacial defects - cleft
lip/palate (CLP), altered jaw/hyoid bone (JWH); defects of the fetal brain (BRN) and eye (EYE); renal (REN), ureteric (URT), genital (GEN); body wall defects (SOM) and
abnormal splanchnic viscera (SPL); heart (HRT) and major vessels (VAS).
LEL on the study design is clearly a matter of choice by the orig-
inal investigators. These dose selections may not have been ideal,
but merely reflect practice over the era in which the studies were
performed. The primary factors underlying developmental toxicity
were fetal weight reduction, skeletal variations and abnormalities,
and fetal urogenital defects in rats, and general pregnancy/fetal
losses and structural malformations to the visceral body wall and
CNS in rabbits. Many aspects were consistent with other database
studies, indicating that the key relationships are likely driven by the
biology of test species.

4.1. Chemical activity

   The spectrum of chemical activity based on the  administered
dose was primarily resolved by the relative values of the mLEL
and fetal dLEL dosages. Reduced maternal body weight gain dur-
ing gestation and fetal weight reduction were the most common
endpoints. Weight changes would  have been expected to underlie
determination of the mLEL and dLEL for a large number of chem-
ical structures. This was evident  for fetal weight reduction and
the dLEL although neither weight change correlated  with mLEL. A
recent study [16] examined general relationships between maternal
and fetal toxicity using a dataset of 56 rat, 46 mouse and 25 rab-
bit studies compiled from the National Toxicology Program (NTP).
That study found weight changes to be the primary factors in deter-
mining  levels of maternal and fetal toxicity and noted that the
degree of association between maternal and fetal weight changes
    followed the rank order: mouse (91% concordance, P < 0.001) > rat
    (41%, P < 0.01) > rabbit (24%, not significant). They attributed the
    inter-species difference to time lapse between dosing and evalu-
    ation (mouse < rat < rabbit) and amount of time that the fetus has
    to recover [16].  Although guideline study designs used to build
    ToxRefDB.prenatal had the same inter-species time lapse, there was
    no correlation between doses that caused maternal and fetal weight
    changes (correlation coefficient < 0.01) in spite of a modest inter-
    species correlation for mLELs (correlation coefficient = 0.59). Data
    localizing maternal body weight changes to specific  gestational
    stages  may improve the correlation [16]; however, this informa-
    tion cannot be obtained from the current build of ToxRefDB which
    has term body weight information only.
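
    The correlation coefficients quoted above can be illustrated with a plain Pearson calculation. The Python sketch below is only indicative: it assumes Pearson's r was the statistic used for these comparisons (the text does not say so explicitly), the helper name is hypothetical, and the input is a handful of rat and rabbit θ(mLEL) pairs copied from Table 4 rather than the full 283-chemical set, so its result should not be expected to match the reported 0.59.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# theta(mLEL) pairs (rat, rabbit) for atrazine, cymoxanil, diquat dibromide,
# fluazifop-P-butyl and fluazinam, copied from Table 4.
rat_theta_mlel = [7.36, 5.77, 10.00, 3.77, 4.03]
rabbit_theta_mlel = [5.77, 0.00, 10.42, 6.36, 9.19]

r = pearson_r(rat_theta_mlel, rabbit_theta_mlel)
print(f"inter-species r on this small subset: {r:.2f}")
```

    With so few points the coefficient is unstable; the sketch is meant to show the calculation, not to reproduce a result.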
       The fraction of the 387 ToxRefDB chemicals perturbing fetal weight
    was much higher in rats (35.7%) than rabbits (19.2%). Consistent
    with that finding, a high incidence of fetal weight reduction was the
    lone endpoint effect in defining fetal LOAELs for >71% of NTP rodent
    studies [16]. Although the fraction of ToxRefDB chemicals that pro-
    duced fetal weight change at the dLEL (71.2%) was also consistent
    with NTP rat studies, the high incidence of fetal weight changes
    also holds for the rabbit (86.9%). This contrasts with the NTP study,
    in which fewer than half of the rabbit studies with a fetal LOAEL
    involved fetal weight reduction [16]. Because pesticides predominate
    in the initial build of ToxRefDB, it may be that
    these are more bioactive chemicals than the many industrial types
    of chemicals that NTP tested. Another disparate finding is the minor
    subset of ToxRefDB chemicals that produced fetal weight change as
[Fig. 4 plot not reproduced. Horizontal axis labels: RABBIT (317 chemicals), 0, RAT (350 chemicals); vertical axis lists the effect and class abbreviations defined in Fig. 3.]
Fig. 4. Percentage of chemical counts distributed across phenotype classes. Each phenotype is plotted by a normalized bar that shows the %-distribution of the response
in each species. Hatched area (dLEL) relative to solid area (cLEL) indicates the number of chemical counts at the dLEL relative to any dose, dLEL or higher. Rat studies are
plotted from left to right and rabbit studies from right to left on the same scale. Total chemical hits are listed aside each species bar. Effects (19) and classes of effects (8) are
abbreviated as in Fig. 3.
the sole determinant of the dLEL (~6% versus ~70%). Collectively,
these data support the notion that fetal weight reduction is a key
parameter in profiling developmental toxicity [16]. At the same time
the data suggest that for a substantial number of environmental
chemicals the fetal weight change correlates more strongly with
malformations than does maternal weight change. Indeed, reductions in
fetal weight in the absence of maternal weight change have been found
for several NTP chemicals that are 'preferentially toxic to both the
developing embryo and the fetus' [16].
   On the basis of mLEL and dLEL, a subset of ToxRefDB chemicals
showed increased sensitivity or specificity of the developing fetus
as compared to the pregnant mother (dLEL < mLEL) or in one species
versus another. This list included 53 of 387 chemicals (13.7%) with
data in rat or rabbit, and 43 of 283 chemicals (15.2%) where data
existed in both species. Although these chemicals provide inter-
esting prototypes for mechanistic study, a number of them were
active only at very high doses as defined on a mg/kg-d basis. The
proportions are slightly higher than observed in the NTP data anal-
ysis, which reported adverse effects in the fetus at lower doses of
exposure as compared to the mother in 5/62 (8.1%) cases across
rat-rabbit studies; however, that study also noted adverse fetal
effects and overt maternal toxicity at the same dose level in 19/62
(30.6%) rat-rabbit studies [16]. Further analysis is needed to learn
whether any developmental effects occurred only in the presence
of a specific form and/or severity of maternal toxicity.
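
   To make the flagging rule and the quoted proportions concrete, here is a small Python sketch. The rule implemented (a developmental LEL below the maternal LEL, or a developmental LEL with no maternal effect reported) is a simplified reading of the criterion above, not the authors' full procedure, and the function names are hypothetical. The counts are the ones given in the text.

```python
def fetus_more_sensitive(mlel, dlel):
    """Flag a chemical whose developmental LEL lies below its maternal LEL.

    LELs are in mg/kg-d; None means no effect was reported.
    Assumption: a dLEL with no reported mLEL is also treated as a flag.
    """
    if dlel is None:
        return False
    return mlel is None or dlel < mlel

def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

# Rat values from Table 4: acetochlor (mLEL 600, dLEL 150) and
# triadimefon (mLEL 25, dLEL 50).
print(fetus_more_sensitive(600, 150))  # True: fetus affected at a lower dose
print(fetus_more_sensitive(25, 50))    # False: maternal effect at a lower dose

# Proportions quoted in the text.
print(percent(53, 387))  # 13.7
print(percent(43, 283))  # 15.2
print(percent(5, 62))    # 8.1
print(percent(19, 62))   # 30.6
```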
   Across the 43 chemicals flagged for developmental sensitivity or
specificity, only a subset of cases had fetal weight effects. Apart from
the consistencies with NTP studies [16], the role of maternal under-
nutrition as a primary determinant of fetal weight at term could
not be substantiated through ToxRefDB profiling of primarily pes-
ticidal chemicals for developmental toxicity. The broader analysis
of 387 chemicals across 751 studies suggests a chemical-phenotype
association with specific variations or malformations that reflects direct
(mechanistic) rather than indirect (maternal) factors, especially in
pregnant rats.

       4.2.  Phenotype representation

         In current practice, the guideline prenatal developmental tox-
       icity studies are  used to  identify NOAELs and LOAELs based on
       maternal and  fetal  endpoints, rather than to estimate specific
       developmental phenotypes in humans [25]. Analysis of chemical
       counts for most developmental endpoints in ToxRefDB captured the
       majority of actives at the  dLEL and many of the toxicants altered
       development at doses near the mLEL. The inter-relationships of
       developmental toxicity endpoints may,  however, provide  useful
       information that can be mined from guideline studies [16,26,27].
       A comprehensive weight-of-evidence model for reproductive and
       developmental toxicity hazard identification has been constructed
       by the U.S. Food and Drug Administration (FDA) to predict toxic-
       ities based on  quantitative structure-activity relationship (QSAR)
       across large blocks of chemicals  and chemical classes  [15]. That
       database  derives secondary data  for many chemicals (2000) and
       studies (10,000). In contrast, ToxRefDB structures data from origi-
       nal guideline studies. This  enables profiling developmental toxicity
       from high-quality source data annotated by internationally harmo-
       nized target-description effects [18].
         Most if not all of the 988 possible DevTox endpoints might
       be expected to appear in a survey of 751 source studies; how-
       ever, only 29.7% of these terms were represented in the initial
       build of ToxRefDB. This proved to be sufficient to classify targets
       into individual defect 'categories' and then analyze their distribu-
       tion by chemical count by species. Skeletal defects, in correlation
       with fetal weight reduction in rats, were the strongest factors pro-
       filing developmental toxicity. Incomplete ossification and missing
bone elements are easily recognized in fetal evaluation protocols.
It is therefore not surprising to see these common findings over-
represented in the ToxRefDB source data and to have the response
amplified by aggregating individual skeletal elements by system. On
the other hand, some relatively common defects such as hypospa-
dias and ocular colobomata may have been under-represented in
ToxRefDB. These  phenotypes may be difficult to detect in a fetal
rat, or their under-representation may very well indicate that these
endpoints are not associated with the pesticides evaluated here;
hence, the need for a multi-generational study. Finally, the extent
to which low-frequency malformations induced by exposure  are
captured as being present in ToxRefDB has not been examined here.
   Mapping the  aggregation of less frequent defects to the tar-
get organ can improve the statistical power of representation
analysis [12,13,15]. For ToxRefDB,  the aggregation of fundamental
DevTox observations revealed chemical effects on 19 generalized
target classes and 8 higher level embryonic systems. Although
the percentage of active chemicals was low for individual cate-
gories of defects and observations (skeletal  changes excluded), a
large number of environmental chemicals had significant findings
among the 19 categories. Similarly, results from the QSAR repro-
tox database [15] showed the majority of agent-induced structural
defects could be aggregated into a relatively small number (11) of
defect categories. For both databases the major classes of malfor-
mations included cleft palate, CNS, craniofacial, eye, skeletal and
urogenital defects. Therefore, specific defects in addition to fetal
weight changes contribute to the phenotype spectrum for profiling
the developmental toxicity of agents in general and environmen-
tal chemicals in  particular. Because the  phenotype  spectrum is
well-represented at the dLEL, and because the percentage of active
chemicals becomes  only slightly broader at high-dose  levels,  we
speculate that chemical profiling for developmental toxicity has its
strongest predictive value of specific fetal target systems at the dLEL.
It is axiomatic that a dLEL assignment in ToxRefDB was dependent
on the sensitivity of anatomical methods used to identify a fetal
change. This implies an underlying biology leading to apical end-
points that can serve as an in vivo anchor to the bioactivity profiles
generated in HTS in vitro and QSAR models [28].
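
   The aggregation step described above can be sketched as a lookup from individual effect targets to embryonic systems, followed by a count of distinct chemicals per system. The target-to-system map below follows the abbreviations in the Fig. 3 caption; placing MBW under pregnancy-related (PR) and FWR and GRL under general (GN) is an assumption, and the observation records are invented for illustration, not ToxRefDB data.

```python
from collections import defaultdict

# 19 effect targets grouped into 8 systems (abbreviations as in Fig. 3).
# Assumption: the placement of MBW, FWR and GRL is a guess.
SYSTEM_OF_TARGET = {
    "MBW": "PR", "PRL": "PR", "RES": "PR",
    "FWR": "GN", "GRL": "GN",
    "AXL": "SK", "APP": "SK", "CRN": "SK",
    "CLP": "OF", "JWH": "OF",
    "BRN": "NS", "EYE": "NS",
    "REN": "UG", "URT": "UG", "GEN": "UG",
    "SOM": "TR", "SPL": "TR",
    "HRT": "CV", "VAS": "CV",
}

def chemicals_per_system(observations):
    """Count distinct chemicals with at least one effect in each system.

    observations: iterable of (chemical, target) pairs.
    """
    chemicals = defaultdict(set)
    for chemical, target in observations:
        chemicals[SYSTEM_OF_TARGET[target]].add(chemical)
    return {system: len(names) for system, names in chemicals.items()}

# Hypothetical records: three skeletal findings spread over three targets,
# plus two urogenital findings for a single chemical.
records = [
    ("chem-A", "AXL"), ("chem-B", "APP"), ("chem-C", "CRN"),
    ("chem-A", "REN"), ("chem-A", "URT"),
]
counts = chemicals_per_system(records)
print(counts)  # {'SK': 3, 'UG': 1}
```

   Each skeletal target has a single active chemical in this toy input, yet the skeletal system has three, which is the sense in which aggregation raises the counts available for representation analysis.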

4.3. Inter-species concordance

   Although many environmental chemicals had significant find-
ings among the 19 categories of effects, the  percentage of actives
for individual categories  varied  between species. The stronger
response of rats  in terms of skeletal defects, fetal weight reduc-
tion and urogenital defects perhaps reflects an underlying biological
susceptibility to  the chemical, or may be  simply explained by
nuances in the examination techniques. Of particular importance
is the longer period between exposure and evaluation that can
allow rabbit fetuses more time to recover from transitory delays
than might be detected in the rat fetus [14]. Incomplete ossification
or fetal weight reduction, for example, may be less evident in the
rabbit due to a longer recovery interval.
   On the other hand, study design factors are less compelling
explanations for some of the ToxRefDB findings. Rat renal defects
and resorptions in rabbits, for example, may be more directly
attributable to species differences. FDA's weight-of-evidence QSAR
database concluded from among  936 chemicals that the rabbit
was about 6-fold less susceptible than rat to  chemicals  causing
fetal dysmorphogenesis [13]. Results from ToxRefDB indicate that
inter-species differences depend on the target organ since some
endpoints were over-represented  in percentages for rabbits. Since
rabbit is noted as having more developmental variations than rat
it follows  that more uncertainty  can be anticipated in assessing
treatment-related malformations in this species; however, the His-
torical Control Database [29] did not reveal an inter-species bias for
    either renal defects (rat) or eye defects (rabbit). This further implies
    an underlying biology for ToxRefDB endpoints.
      The  importance of placental differences between rat and rab-
    bit embryos as a potential reason for inter-species differences has
    been emphasized [30]. Development of the chorioallantoic placenta
    is precocious in rabbit versus rat; consequently, visceral yolk sac
    expansion occurs relatively late in rabbits and the volume of exo-
    coelomic fluid is much higher than  in rat gestation. These factors
    may influence the transport and concentration of chemical reach-
    ing the embryo at critical times during organogenesis, which in turn
    may account for some of the inter-species differences in suscep-
    tibility. Because unique attributes of placentation in rabbits resemble
    the human condition more closely than those in rats [30], testing in both
    species has implications in estimating human risk [6].
      Not all inter-species findings were consistent between ToxRefDB
    and the weight-of-evidence QSAR database. A higher percentage of
    ToxRefDB chemicals with significant activity on fetal death param-
    eters in the rabbit, and the  higher percentage of chemicals with
    greater fetal weight reduction in the rat, were not noted in the QSAR
    training set [15]. Again, ToxRefDB chemicals are likely to be more
    bioactive in general because they comprise many pesticides.
    Analysis of data for 54 potential developmental toxicants and 73
    substances considered to be teratogenic in the rabbit and not the
    rat showed generally similar sensitivity between species, although
    for some chemicals the rat study is more sensitive and for others the
    rabbit study is more sensitive [25]. Those authors suggested that differ-
    ences between rat and rabbit studies in terms of classification of
    developmental toxicity may reflect consequences of maternal toxi-
    city between the species, rather than direct developmental toxicity
    [31].
      Aside from a relatively longer gestational period and the higher
    frequency of developmental variation in rabbits, the doe is less tol-
    erant of chemical  treatment than the rat dam [15]. Clearly, some
    ToxRefDB chemicals showing developmental activity in rats pro-
    duced maternal toxicity in rabbits at the same (dLEL)  dose level.
    Among 91 substances with teratogenicity information reviewed [6]
    a lack of concordance between  rat and rabbit was observed in 18
    of 91 (20%) compounds tested in both species. Chemical profiling
    of 283 ToxRefDB chemicals with an evaluation of developmental
    toxicity in both species identified clusters of about 130 chemicals
    with developmental effects in either species alone; however, chem-
    icals may have multiple effects on maternal and fetal parameters
    and the interaction between mother-conceptus  may differ across
    species and chemicals. Selection of rabbit as a test species is primar-
    ily driven by historical interest in thalidomide-induced limb defects
    observed in humans, monkeys and rabbits, but not rats  [32].
      The  present  study shows that specific developmental effects
    differ between  species,  and we know this to be true for the
    comparison with the human condition as experience with some
    chemicals shows [32]. The added value of rabbit studies for prena-
    tal developmental toxicity evaluation has been recently questioned
    based on NOAEL comparison and developmental outcomes [25]
    and the weight-of-evidence QSAR database finding "no evidence
    of trans-species tissue specific dysmorphogenic findings" [15]. Ret-
    rospective analysis of several hundred pharmaceuticals tested in
    both rodent and non-rodent species for general toxicological end-
    points showed an overall 71% concordance with true positives in
    human populations; concordance was lower when non-rodents
    (63%) and rodents (43%) were considered separately [33]. For devel-
    opmental  toxicity, rat studies  alone predicted  teratogenicity in
    61% of chemicals that showed teratogenicity in rat, mouse or rab-
    bit,  whereas a rat study and a rabbit study together identified
    teratogenicity in 100% of these  chemicals [6]. Taking  the devel-
    opmental toxicity alone, without regarding maternal  toxicity as
    strictly causal and without extrapolating the nature of effects equiv-
    alently between species, the question remains open whether the rat
as the only in vivo model would not detect almost all developmental
toxicants.
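
   The coverage argument in this section amounts to set arithmetic over per-species positive calls. The following Python sketch uses invented chemical identifiers (none are from ToxRefDB or the cited reviews) to show how the fraction of known positives detected by rat alone, and by rat and rabbit together, could be tallied; the function name is hypothetical.

```python
def detection_rate(detected, reference):
    """Fraction of reference positives that also appear in `detected`."""
    if not reference:
        raise ValueError("reference set must not be empty")
    return len(detected & reference) / len(reference)

# Hypothetical positive calls per species.
rat_positive = {"c1", "c2", "c3"}
rabbit_positive = {"c3", "c4", "c5"}
# Reference set: positive in any species tested.
any_species = rat_positive | rabbit_positive

print(detection_rate(rat_positive, any_species))                    # 0.6
print(detection_rate(rat_positive | rabbit_positive, any_species))  # 1.0

# Discordant chemicals: positive in one species but not the other.
discordant = rat_positive ^ rabbit_positive
print(sorted(discordant))  # ['c1', 'c2', 'c4', 'c5']
```

   When the reference set is defined as "positive in any tested species", the combined rate is 1.0 by construction, so the informative quantity is how far the single-species rate falls below it.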

4.4.  Conclusion

   Results from analysis  of 387  chemicals in the  EPA ToxRefDB
support  the  value  of  a  traditional  two  species  paradigm  for
identification of developmental toxicity. Manifestations of direct
(mechanism-based) developmental toxicity with or without indi-
rect (maternal-mediated) effects underscore the need for improved
methods of assessing the dynamical relationship between devel-
opmental processes  and  maternal  health  status.  In the  future,
data from alternative methods and HTS in vitro assays that enable
'pathway-based risk assessment'  may increase confidence in test-
ing strategies while limiting required animal testing [1,34]. For this
to occur, public data models are needed that structure conventional
in vivo toxicity data into computable forms.  ToxRefDB provides
such a novel data model for relational assessment  of source data
from guideline (in vivo) prenatal developmental toxicity studies.
We  envisage high value in these animal studies to anchor cross-
scale  modeling and  predictive understanding of developmental
processes and toxicities [17].

Conflict of interest

   The authors declare they have no competing financial interest.

Acknowledgements

   We thank the United States Environmental Protection Agency's
Office of Pesticide Programs (OPP) for contributions to the ToxRefDB
project, including access to toxicity data evaluation records, scien-
tific consultation, and the review of this manuscript (Dr. Elizabeth
Mendez  and  Dr.  Vicki  Dellarco). We  also  thank Daniel Corum,
Daniel Rotroff and Jeffrey Finn for excellent work entering data into
ToxRefDB.

References

 [1]  National Research Council. Toxicity testing in the 21st century: a vision and a
     strategy. Washington, DC: The National Academies Press: 2007,196 pp.
 [2]  Collins FS, Gray CM, Bucher JR. Transforming environmental health protection.
     Science 2008:319:906-7.
 [3] Judson R, Richard A, Dix DJ, Houck  K, Martin M, Kavlock RJ, et al. The Tox-
     icity data landscape  for environmental chemicals. Environ Health Perspect
     2009:117:685-95.
 [4]  Dix DJ, Houck KA, Martin MT, Richard AM, Setzer RW, Kavlock RJ. The ToxCast
     program for prioritizing toxicity testing of environmental chemicals. ToxicolSci
     2007:95:5-12.
 [5]  Kimmel CA, Kimmel GL. Principles of developmental toxicology risk assess-
     ment. In: Hood RD, editor. Handbook of developmental toxicology. Boca Raton:
     CRC Press: 1996. p. 667-93.
 [6]  Hurtt ME, Cappon CD, Browning  A.  Proposal  for a  tiered approach to
     developmental toxicity testing for veterinary pharmaceutical products for food-
     producing animals. Food ChemToxicol 2003:41:611-9.
 [7[  Martin MT, Judson RS, Reif DM, Kavlock RJ, Dix DJ. Profiling chemicals based
     on chronic toxicity results from the U.S.  EPA ToxRef Database. Environ Health
     Perspect 2009:117:1-8.
 [8[  Martin MT,  Mendez E, Corum DC, Judson RS, Kavlock RJ, Rotroff DM, Dix
     DJ. Profiling the Reproductive Toxicity of Chemicals from Multigeneration
     Studies in the Toxicity  Reference Database  (ToxRefDB). Toxicol Sci 2009:
     doi:10.1093/toxsci/kfp080.[online 10 April 2009[.
 [9]  U.S.  Environmental  Protection  Agency.  Health  effects  test  guidelines
     OPPTS  870.3700  prenatal  developmental  toxicity study.  Washington,  DC:
     Office of Prevention,  Pesticides and  Toxic Substances.  EPA Publication
     712-C-98-207, 1998. Available at http://www.epa.gov/opptsfrs/publications/
     OPPTS _Harmonized/870_Health_Effects_Test_Guidelines/Series/870-3700.pdf
     [accessed 10 January 2009].
[10]  U.S. Environmental Protection Agency. Guideline 83-3: teratogenicity study,
     pesticide assessment guidelines, subdivision F, hazard evaluation: human and
            domestic animals. Washington, DC: Office of Pesticides and Toxic Substances:
            1982. EPA Publication 540.9-82-025.
        [11[ Organisation for Economic Cooperation and Development. OECD guideline for
            the testing of chemicals, No. 414: prenatal developmental toxicity study. Paris,
            France: Organisation for Economic Cooperation and Development: 2001,11 pp.
        [12] Scheuerle A, Tilson  H. Birth defect classification by organ system: a novel
            approach to heighten teratogenic signaling in a pregnancy registry. Pharma-
            coepidemiol Drug Safety 2002:11:465-75.
        |13] Correa-Villasenor A, Cragan J, Kucik J, O'Leary  L,  Siffel C, Williams L. The
            metropolitan Atlanta congenital defects program:  35 years of birth defects
            surveillance at the centers for disease control and  prevention. Birth Def Res
            (Part A) 2003:67:617-24.
        [14] Julien E, Willhite CC, Richard AM, DeSesso JM. Challenges in constructing sta-
            tistically based structure-activity models for developmental toxicity. Birth Def
            Res (Part A) 2004:70:902-11.
        [15] Matthews EJ, Kruhlak NL,  Benz RD,  Contrera JF.  A comprehensive model
            for reproductive  and developmental toxicity hazard  identification: I. Devel-
            opment of a  weight of evidence QSAR database. Regul Toxiol Pharmacol
            2007:47:115-35.
        [16] Chernoff NA, Rogers EA, Gage MI, Francis BM. The  relationship of maternal and
            fetal toxicity in developmental toxicology bioassays  with notes on the bio-
            logical significance of the "no observed adverse  effect level". Reprod Toxicol
            2008:25:192-202.
        [17] Knudsen TB,  Kavlock  RJ. Comparative  bioinformatics  and  computational
            toxicology.  In:  Abbott B, Hansen D, editors. Developmental toxicology vol-
            ume 3, target organ toxicology series. New York: Taylor and Francis: 2008.
            p. 311-60.
        118] Wise LD, Beck SL, Beltrame D, Beyer BK, Chahoud I, Clark RL, et al. Terminology
            of developmental abnormalities in common laboratory mammals (version 1).
            Teratology 1997:55:249-92.
        [19] Chahoud I, Buschmann J, Clark R, Druga A, Falke H, Faqi A, et al. Classification
            terms in developmental toxicology: need  for harmonisation. Report of the sec-
            ond workshop on the terminology in developmental toxicology. Reprod Toxicol
            1999:13:77-82.
        [20] Solecki R, Burgin H,  Buschmann J, Clark R, Duverger M, Fialkowski O, et al.
            Harmonisation of rat fetal skeletal terminology and classification. Report of
            the third workshop on the terminology in developmental toxicology. Reprod
            Toxicol 2001:15:713-21.
        [21] Solecki R, Bergmann B, Burgin H, Buschmann J, Clark R, Druga A, et al. Har-
            monization of rat fetal external and visceral terminology and classification.
            Report of the fourth workshop on the terminology in developmental toxicology.
            Reprod Toxicol 2003:17:625-37.
        [22] Yang C, Benz RD, Cheeseman MA. Landscape of current toxicity databases and
            database standards. Curr Opin Drug Discov Dev 2006:9:124-33.
        [23] Ward JH. Hierarchical grouping to optimize an objective function. J Am Stat
            Assoc 1963:58:236-44.
        [24] Ihaka R, Gentleman RR. A language for data analysis  and graphics. J Comput
            Graph Stats 1996:5:299-314.
        [25] Janer G, Slob W,  Hakkert BC,  Vermeire T, Piersma AH. A retrospective anal-
            ysis of developmental toxicity studies in rat and rabbit: what  is the added
            value of the rabbit as an additional test species? Regulat Toxicol Pharmacol
            2008:50:206-17.
        [26] Guittina  P,  Elefant  E, Saint-Salvic B.  Hierarchization of animal teratology
            findings for improving the human risk evaluation of drugs.  Reprod Toxicol
            2000:14:369-75.
        [27] Piersma AH, Janer G, Wolterink G, Bessems JG, Hakkert BC, Slob W. Quanti-
            tative extrapolation of in vitro whole embryo culture embryotoxicity data to
            developmental toxicity in vivo  using the Benchmark Dose approach. Toxicol Sci
            2008:101:91-100.
        [28] Judson RS, Richard  AM, Dix DJ,  Houck K, Elloumi F,  Martin MT, et al.
            ACToR—aggregated computational toxicology resource. Toxicol Appl Pharm
            2008:233:7-13.
        [29] Historical Control Database of Preclinical Developmental and Reproductive
            Toxicology Parameters. A Joint Project of Middle Atlantic Reproduction and
            Teratology Association (MARTA) & Midwest Teratology Association (MTA)
            (http://www.hcd.org/) [accessed January 14, 2009].
        [30] Carney EW, Tornesi B, Keller C, Findlay HA, Nowland WS, Marshall VA, et al.
            Refinement of a morphological scoring system for postimplantation rabbit con-
            ceptuses. Birth Def Res (Part B) 2007:80:213-22.
        [31] Khera  KS. Maternal  toxicity:  a possible etiological  factor in embryo-fetal
            deaths and  fetal malformations of rodent-rabbit species. Teratology 1985:31:
            129-53.
        [32] Brent R, Holmes L. Clinical and basic  science lessons from the thalidomide
            tragedy: what  have we learned about the cause  of limb defects? Teratology
            1998:38:241-51.
        [33] Olson H, Betton G, Robinson D, Thomas K, Monro A, Kolaja G, et al. Concordance
            of the toxicity of pharmaceuticals in humans and in animals. Regulat Toxicol
            Pharmacol 2000:32:56-67.
        [34] Bremer S, Pellizzer C, Hoffmann S, Seidle T, Hartung T. The development of new
            concepts for assessing reproductive toxicity applicable to large scale toxicolog-
            ical programmes. Curr Pharm Des 2007:13:3047-58.

-------
TOXICOLOGICAL SCIENCES 110(1), 181-190 (2009)
doi:10.1093/toxsci/kfp080
Advance Access publication April 10, 2009
  Profiling  the Reproductive Toxicity of Chemicals from Multigeneration
                        Studies  in the  Toxicity  Reference Database

Matthew T. Martin,*,1 Elizabeth Mendez,† Daniel G. Corum,* Richard S. Judson,* Robert J. Kavlock,* Daniel M. Rotroff,* and
                                                     David J. Dix*
 *National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North
 Carolina 27711; and †Health Effects Division, Office of Pesticide Programs, U.S. Environmental Protection Agency, Washington, District of Columbia 20460

                                        Received January 26, 2009; accepted April 8, 2009
  Multigeneration reproduction studies are used to characterize
parental and offspring systemic toxicity, as well as reproductive
toxicity of pesticides, industrial chemicals and pharmaceuticals.
Results from 329 multigeneration studies on 316 chemicals have
been digitized into standardized  and  structured  toxicity data
within  the Toxicity Reference Database (ToxRefDB).  An initial
assessment of data quality and consistency was performed prior to
profiling  these environmental chemicals based on reproductive
toxicity and  associated toxicity endpoints. The pattern  of toxicity
across 75 effects for all 316 chemicals provided sets of chemicals
with similar in  vivo  toxicity  for  future predictive  modeling.
Comparative analysis across the 329 studies  identified chemicals
with sensitive reproductive  effects, based  on comparisons  to
chronic and subchronic  toxicity  studies,  as did  the cross-
generational comparisons within the multigeneration study. The
general pattern of toxicity across  all  chemicals and the more
focused comparative analyses identified 19 parental, offspring and
reproductive effects with a high enough  incidence to serve  as
targets for predictive modeling that  will eventually serve  as
a chemical prioritization tool  spanning reproductive toxicities.
These  toxicity endpoints included  specific reproductive perfor-
mance  indices, male and female reproductive organ pathologies,
offspring viability, growth and maturation, and parental systemic
toxicities. Capturing this reproductive toxicity data in ToxRefDB
supports ongoing retrospective  analyses, test guideline revisions,
and computational toxicology research.
  Key  Words:  pesticides; relational  database;  reproductive
toxicity; toxicity profiling.

  Disclaimer: The United States Environmental Protection Agency through its
Office of Research and Development funded and managed the research
described here. It has been subjected to Agency administrative review and
approved for submission and peer review.
  1 To whom correspondence should be addressed at National Center for
Computational Toxicology, Office of Research and Development, U.S.
Environmental Protection Agency, MD D343-03, 109 TW Alexander Drive,
Research Triangle Park, NC 27711. Fax: (919) 685-3399. E-mail:
martin.matt@epa.gov.

Published by Oxford University Press 2009.

  The U.S. Environmental Protection Agency (EPA) and other
regulatory agencies are investigating novel approaches for
      predicting chemical toxicity, with the goal of rapidly screening
      the thousands of environmental chemicals with limited toxicity
      data (U.S. EPA, 2009). Building predictive models of chemical
      toxicity requires high-quality in vivo toxicity data, in order to
      develop and validate new in vitro and in silico approaches. In
      support of EPA's ToxCast predictive toxicology effort (Dix
      et al., 2007),  we  have  developed  the  Toxicity Reference
      Database (ToxRefDB) for capturing information from in vivo
      toxicity studies. ToxRefDB includes endpoints  from multiple
      study types, including chronic rat and mouse carcinogenicity
      2-year bioassays that have been previously reported  and made
      publicly available (http://www.epa.gov/ncct/toxrefdb/) (Martin
      et al., 2009). ToxRefDB is being used to build  computational
      models linking whole animal  toxicity, and specific tissue and
      cellular phenotypes, to specific chemical-biological interactions
      detected by cellular, genomic and biochemical in vitro assays.
      The in vivo toxicity data captured in ToxRefDB is facilitating
      a  transition to the National  Research Council's vision  for
      "Toxicity Testing in the 21st century: A Vision and a Strategy"
      (Collins et al., 2008; NRC, 2007), by linking toxicity endpoints
      from animal studies to molecular targets and pathways relevant
      to humans.
         The multigeneration study  data  entered into ToxRefDB
      provides anchoring in vivo reproductive toxicity data for  the
      EPA ToxCast research  program  (http://www.epa.gov/ncct/
      toxcast/).  Within the  ToxCast program, bioactivity profiles
      for hundreds of environmental chemicals are  being derived
      from hundreds  of in vitro assays (Dix et al., 2007; Houck and
      Kavlock, 2008). Phase I of ToxCast is focused on chemicals
      with known in vivo toxicity data,  supporting the development
      of in vitro data signatures predictive  of these in vivo outcomes
      (Kavlock et al., 2008). It is worth noting that for environmental
      chemicals, unlike pharmaceuticals, quantitative in vivo toxicity
      data is essentially restricted to animal species. Nearly all of the
      ToxCast Phase I  chemicals   are food-use pesticide  active
      ingredients that have undergone numerous mammalian toxicity
      tests, including guideline  multigeneration studies. This  highly
      standardized dataset provided in ToxRefDB facilitates profiling
      ToxCast Phase I chemical toxicity based on parental, offspring

-------
                                                          MARTIN ET AL.
and reproductive effects. It was hypothesized that, through the
use of ToxRefDB,  key effects from the 329 multigeneration
reproduction  studies  would  characterize  the  reproductive
toxicity potential of the 316 chemicals, and further differentiate
the chemicals and effects with regard to generational and life-stage
sensitivities. Subsequently, the well-characterized reproductive
effects could be used to phenotypically anchor
predictive toxicity models.
   Traditional  toxicity  testing  for  the  risk  assessment  of
environmental compounds or groups of compounds  can cost
millions of dollars and take years of effort. Since  1970, EPA
has accumulated a vast store of high-quality regulatory toxicity
information on hundreds  of compounds,  most of which  has
been  inaccessible to computational analyses.  The curation  and
structuring of this  chemical  toxicity  information  into Tox-
RefDB has created  a valuable resource for both retrospective
and prospective toxicological studies (Martin et al.,  2009). In
addition to  the chronic/cancer rat and  cancer mouse studies
(Martin et al., 2009) and multigeneration studies reported here,
we are also extracting developmental toxicity studies in the rat
and rabbit. The multigeneration reproductive toxicity data
set comprises studies used by EPA in the pesticide registration process
to assess the performance and integrity of the male and female
reproductive systems (U.S. EPA, 1996), including gonadal function,
the estrous cycle, mating behavior, conception, gestation, parturition,
lactation, and weaning, as well as the growth and development of the
offspring. The multigeneration study
also provides information about the effects of the test substance
on neonatal morbidity, mortality, target organs in the  offspring,
and data on prenatal and postnatal developmental toxicity.
   Two  historical  test  guidelines  have been  used for  the
multigeneration studies in  ToxRefDB. Multigeneration studies
according to the  1982  Reproductive  and  Fertility  Effects
guideline (U.S. EPA, 1982)  on over  700 chemicals have been
conducted  and submitted  to EPA.  Multigeneration studies
according to the newer 1998  guideline (U.S. EPA,  1998) on
over   90   chemicals have been  conducted  and  submitted,
including  40  studies extracted  into  ToxRefDB.  Information
on data submissions  to  EPA was drawn from the  Office of
Pesticide Programs  (OPP) Information Network—the OPPIN
database (http://www.epa.gov/pesticides/). The 1998 guideline
was harmonized by EPA's Office of Pollution Prevention &
Toxic Substances (OPPTS; http://www.epa.gov/opptsfrs/home/
guidelin.htm) to meet testing requirements of the EPA's Office
of  Pollution  Prevention  and Toxics  and OPP,   as well as
international  guidelines  published by  the   Organization  for
Economic  Cooperation  and  Development.  Both  of  these
guidelines call for a two-generation  study in which continu-
ously  treated  male and female rats are mated to produce first
generation offspring, and in  turn the  adult offspring are mated
to produce a second generation. Continuing refinement of these
test guidelines has been proposed  (Cooper et al.,  2006),  and
ToxRefDB is being used to test hypotheses concerning the
validity of these  refinements.
                   MATERIALS AND METHODS

     Data  characteristics.  Reviews of  registrant-submitted multigeneration
   reproductive toxicity studies, known as Data Evaluation Records (DER), were
   collected for roughly  300 chemicals  from EPA's OPP. File types of DER
   include TIFF, Microsoft Word, Word Perfect and PDF formats, some of which
   are not directly text-readable. Approximately 500 multigeneration reproductive
   toxicity DER were reviewed, and based on data quality a subset of 329 was
   selected for curation into ToxRefDB. The first portion of the DER outlines the
   test substance, purity, lot/batch numbers, MRID (Master Record Identification),
   study citation, OPPTS  test guideline (U.S. EPA, 1982, 1998) and reviewers of
   the study. The executive summary captures all  of the basic study design
   information, including species  and  strain, doses, number of animals per
   treatment group and any deficiencies in  study protocol.  All dose levels were
   stored in  ToxRefDB  as  "mg/kg/day" and, where possible, recorded or
   calculated from food consumption data as an average over the entire dosing
   period.  The executive summary also describes treatment-related effects
   observed at various dose levels  in the study. The body of the DER provides
   detailed test material and animal information, and full dose response data in text
   and tables  for all measured and observed endpoints. All treatment-related
   effects were captured for each study in ToxRefDB.
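
The dose normalization described above can be illustrated with a minimal sketch. It assumes a dietary study that reports mean food consumption and body weight for equal-length measurement intervals; the function names, the example numbers, and the simple unweighted average are illustrative assumptions, not the procedure actually used to populate ToxRefDB.

```python
def dietary_dose_mg_per_kg_day(ppm, food_g_per_day, body_weight_g):
    """Convert a dietary concentration (ppm = mg test substance per kg diet)
    to an achieved dose in mg per kg body weight per day."""
    food_kg_per_day = food_g_per_day / 1000.0
    body_weight_kg = body_weight_g / 1000.0
    return ppm * food_kg_per_day / body_weight_kg


def mean_dose_over_period(ppm, intervals):
    """Average the achieved dose over equal-length measurement intervals.

    `intervals` is a list of (food_g_per_day, body_weight_g) tuples.
    """
    doses = [dietary_dose_mg_per_kg_day(ppm, food, bw) for food, bw in intervals]
    return sum(doses) / len(doses)


# Hypothetical example: a 100 ppm diet with three measurement intervals.
example_intervals = [(20.0, 200.0), (25.0, 300.0), (24.0, 400.0)]
example_mean_dose = mean_dose_over_period(100.0, example_intervals)
```

Because food intake per unit body weight usually falls as animals grow, the achieved dose in this sketch drifts downward across intervals, which is why a period average is recorded rather than a single-interval value.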
     Multigeneration study DER contain all the information necessary to infer
   lowest effect level  (LEL) values for all treatment-related  effects that were
   statistically or biologically significant. Typically, the DER also designated
   "critical"  effects  for each study, and lowest observed  adverse  effect level
   (LOAEL)  and no observed adverse effect level (NOAEL) for  each study. If
   provided by the DER, ToxRefDB captured these study-level NOAEL, LOAEL,
   and critical effect  data. However, it is important to note that the critical effects
   used  to establish  NOAEL, LOAEL,  and a  reference  dose (RfD) for
   a conventional chemical pesticide  active,  and to make regulatory  risk
   assessment and management decisions,  are based on a toxicological review
   of multiple studies across many  study  types.
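
As a hedged illustration of how an LEL can be inferred from dose-response records, the sketch below takes the lowest dose at which each effect was flagged as treatment related. The record layout, the effect names, and the function name are hypothetical and are not taken from the ToxRefDB schema.

```python
def lowest_effect_levels(observations):
    """Return {effect: LEL}, the lowest dose (mg/kg/day) at which each
    effect was flagged as treatment related.

    `observations` is an iterable of
    (effect, dose_mg_kg_day, treatment_related) tuples.
    Effects that were never flagged are absent from the result.
    """
    lel = {}
    for effect, dose, treatment_related in observations:
        if not treatment_related:
            continue
        if effect not in lel or dose < lel[effect]:
            lel[effect] = dose
    return lel


# Hypothetical records for one study with doses of 5, 25 and 125 mg/kg/day.
records = [
    ("pup body weight decrease", 5.0, False),
    ("pup body weight decrease", 25.0, True),
    ("pup body weight decrease", 125.0, True),
    ("testis atrophy", 125.0, True),
    ("liver weight increase", 5.0, False),
]
study_lel = lowest_effect_levels(records)
```

Note that an LEL in this sense is defined per effect, whereas the study-level LOAEL and NOAEL rest on the designated critical effects only.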
     Treatment-related effects were further identified as  either  a  "Parental,"
   "Offspring," or  "Reproductive" effect. Consistent  with DER, "Parental"
   endpoints were defined as systemic toxicity observed in the male or female
   adult parents, and  exclude  effects  directly related to reproduction  (e.g.,
   reproductive organ toxicity). "Offspring" endpoints were defined as systemic
   toxicity observed in the preweaning and juvenile animals,  and exclude birthing
   indices up to  postnatal day (PND) 4 (e.g., litter size and  live birth index).
   "Reproductive" endpoints were defined as observed effects on the reproductive
   performance or capacity of the  animals and included all reproductive organ
   toxicities, effects on estrous cyclicity,  sperm parameters, fertility, and mating,
   and prenatal and early postnatal viability.
     A small number of ToxCast Phase I chemicals were not pesticide active
   chemicals, such as  some perfluorinated compounds and phthalates. Though
   DER and pesticide registration studies were not available for these chemicals,
   there were often high-quality, standardized reproductive toxicity studies
   available from the National Toxicology Program, peer-reviewed literature, or
   other sources. When data from such studies were available, they were curated into
   ToxRefDB consistent with information taken from DER.
     Data model and quality control. The relational data model for ToxRefDB
   was previously described (Martin et al., 2009) in a diagram showing the data
   model and field-level. A  Data Entry  Tool was  developed for  database
   population, including a controlled vocabulary for reproductive and other test
   data (available for download at http://www.epa.gov/ncct/toxrefdb/). Additional
   data entry and quality control  procedures for ToxRefDB are described in
   Martin et al. (2009), and on the ToxRefDB homepage.
     Full descriptions of the available data and conclusions as to the potential for
   the pesticides to cause harm to humans or the environment, risk mitigation
   measures, and the regulation of pesticides can be found  at U.S. EPA's OPP
   websites: http://www.epa.gov/pesticides/regulating/index.htm;
   http://www.epa.gov/pesticides/reregistration/status.htm;
   http://www.epa.gov/oppsrrd1/registration_review/;
   http://www.epa.gov/oppsrrd1/reregistration/index.htm. The study-level
   critical effects captured in ToxRefDB and taken from individual DER and studies

-------
                                      REPRODUCTIVE TOXICITY PROFILING FROM TOXREFDB
cannot be related directly to regulatory determinations or Reds without additional
information and analysis.

  Data  output and analysis.  The structured toxicity information stored
within ToxRefDB can be extracted in various formats utilizing SQL queries.
For the purpose of providing computable outputs, that is, quantitative outputs
amenable to statistical analysis, a consistent data output was used. The cross-
tabulated data output consisted of rows of chemical information (e.g., Chemical
Abstracts Service Registry Number, chemical name), by columns of toxicity
endpoints with the value entered being the lowest dose at which the endpoint
was observed (i.e.,  LEL) in "mg/kg/day." Even though NOAEL/LOAEL
values for each study's "Parental," "Offspring," or "Reproductive" effect can
be queried  from the database, the current analyses for ToxCast only utilized
LEL. Log-transformed potency values were derived as -log2(LEL). A
constant value of 12 was then added to shift the scale, allowing zero to
represent no observed effect. Therefore, a value of 1 is equivalent to an
effect at 2048 mg/kg/day and a value of 18 is equivalent to 0.015625 mg/kg/day.
The log-transformed values are predominantly used in the current analysis.
However, millimolar concentrations (mmol/kg/day) were also calculated for each
endpoint using the molecular weight of the tested chemical and the LEL in
mg/kg/day. These data tables are available on the ToxRefDB homepage:
http://www.epa.gov/ncct/toxrefdb/. It should be noted, however, that potency does
not  necessarily equate to risk because this analysis does not take  into
consideration levels of exposure, a key element  in the determination of risk.
Moreover, the potency of a compound in this analysis does not represent that
the endpoints identified in the multigeneration  toxicity study are the most
sensitive in the database or for that matter the most appropriate for the exposure
scenarios being evaluated.
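
The potency transformation described above can be written out as a short sketch. The function names and the use of `None` to stand for "no observed effect" are assumptions made for illustration; the constant 12 and the two reference values (2048 and 0.015625 mg/kg/day) come from the text.

```python
import math

OFFSET = 12  # constant from the text; a score of 0 is reserved for "no observed effect"


def potency_score(lel_mg_kg_day):
    """Return -log2(LEL) + 12, or 0.0 when no effect was observed (LEL is None)."""
    if lel_mg_kg_day is None:
        return 0.0
    return -math.log2(lel_mg_kg_day) + OFFSET


def lel_mmol_per_kg_day(lel_mg_kg_day, mol_weight_g_per_mol):
    """Convert an LEL in mg/kg/day to mmol/kg/day (mg divided by g/mol gives mmol)."""
    return lel_mg_kg_day / mol_weight_g_per_mol
```

With this definition, `potency_score(2048)` evaluates to 1 and `potency_score(0.015625)` to 18, matching the values quoted in the text. Each unit on the scale corresponds to a twofold change in LEL, and an LEL of 4096 mg/kg/day or higher would score at or below zero, so the offset only works as a no-effect sentinel for doses below that level.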
  Hierarchical clustering across all the chemicals and effects was carried out
based on log-transformed potency values. Effects were  selected based on an
occurrence in five or more chemicals, which was the level shown to have minimal
predictive capability based on a simulation study performed by Judson et al.
(2008a). The clustering analysis, implemented in R version 2.6.1 (Ihaka and
Gentleman, 1996), used Pearson's dissimilarity as the distance measure for both
chemicals and effects and Ward's method for linkage (Ward, 1963). The
chemicals were divided into six groups based on the percent of explained
variance (Thorndike, 1953). The weight for each effect in deriving the
chemical groupings  was calculated as the ratio of the number of positives
within the chemical grouping over the number of positives out of the chemical
grouping.
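
Two quantities named in this paragraph can be sketched in a few lines: Pearson's dissimilarity between two potency profiles, and the in-group to out-of-group ratio used as the effect weight. This is an illustrative sketch only. It does not perform the Ward linkage step, the function names and data layout are assumptions, and how the original R analysis handled constant (for example, all-zero) profiles or an effect with no positives outside a group is not stated in the text.

```python
import math


def pearson_dissimilarity(x, y):
    """Return 1 - r, where r is the Pearson correlation of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Assumption: a constant profile has no defined correlation.
        raise ValueError("Pearson correlation is undefined for a constant profile")
    return 1.0 - cov / math.sqrt(var_x * var_y)


def effect_weight(positive_by_chemical, group_members):
    """Ratio of positives inside a chemical group to positives outside it.

    `positive_by_chemical` maps chemical name -> bool for one effect.
    Returns math.inf when the effect has no positives outside the group
    (an assumption; the text does not cover this case).
    """
    inside = sum(1 for chem, pos in positive_by_chemical.items()
                 if pos and chem in group_members)
    outside = sum(1 for chem, pos in positive_by_chemical.items()
                  if pos and chem not in group_members)
    if outside == 0:
        return math.inf
    return inside / outside
```

The dissimilarity ranges from 0 (perfectly correlated profiles) to 2 (perfectly anti-correlated), so two chemicals with the same pattern of effects cluster together even when one is uniformly more potent than the other.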
                           RESULTS

Summary Characterization of Multigeneration Study Results
   This  analysis focused  on  reproduction-related  endpoints
culled from  329  multigeneration rat studies on 316 unique
chemicals entered into ToxRefDB. The vast majority of studies
(294 of  329) were performed using a two-generation protocol.
There were seven one-generation studies, of which four were
supplementary studies to longer-term two- or three-generation
studies. Of the 28 three-generation studies, only first and
second generation effects were used in subsequent analyses,
whereas third generation effects were excluded. In total, there
were 11 chemicals with more than  one study in this dataset.
Four chemicals had  an additional study run to satisfy  study
guideline requirements. Two chemicals had an additional study
to test at additional dose levels. Five  chemicals had two studies
performed at similar dose  levels  and the conclusions  between
each pair of studies were similar.
                                  TABLE 1
           Distribution of Chemicals and Effects Across Life-Stage,
           Endpoint Category and Generation for 316 Chemicals in
            ToxRefDB with a Multigeneration Reproductive Study

       Life stage   Endpoint category   Generation
                                        P1                F1            F2
       Adult        Parental^c          275^a (2935)^b    275 (3265)
       Adult        Reproductive^d      100 (376)         129 (648)
       Juvenile     Offspring^e                           255 (2274)    247 (1979)

         ^a Number of chemicals with at least one effect observed at specified life-
       stage, endpoint category and generation.
         ^b Number of effects observed at specified life-stage, endpoint category, and
       generation.
         ^c Parental endpoints include adult body weight, mortality, clinical signs, and
       target-organ weight and pathology effects.
         ^d Reproductive endpoints include reproductive organ weight and pathology
       and reproductive indices (e.g., fertility, mating, live birth index).
         ^e Offspring endpoints include pup weight, offspring survival (e.g., viability
       and lactation index), and juvenile target-organ weight and pathology, and
       pubertal delay (e.g., PPS and VO) effects.
         Across all studies and treatment groups 12,230 treatment-
       related effects were observed, corresponding to 458 different,
       unique types of effects. Each effect was tagged with specific
       endpoint category,  life-stage,  and generational information.
       The distribution of treatment-related effects and positive
       chemicals across life-stage and generation provides insight into
       the sensitivities of specific classes of endpoints (Table 1).
       Parental effects were associated with 275 of the 316 chemicals
       for both the P1 and F1 generations, whereas reproductive effects
       were associated with only 100 and 129 chemicals in the P1 and
       F1 generations, respectively. Besides involving more chemicals, the
       F1 generation showed 73% more adult reproductive effects
       than the P1. A similar number of chemicals and offspring
       effects were observed in the F1 and F2 generations. The relative
       generational sensitivity among reproductive effects  compared
       to offspring effects prompted us to investigate the  patterns  of
       specific   reproductive  and  offspring   toxicities   across  all
       chemicals.

       Patterns of Reproductive Toxicity
         Identification  of chemical  groups with similar reproductive
       toxicity profiles  was achieved by hierarchical clustering of 75
       target-level effects  (Fig. 1). These were defined  as target-level
       effects because  specific descriptive terms were  aggregated  to
       the target organ (e.g., liver) or measured index (e.g., lactation
       index), rather than all possible outcomes for each target
       (hypertrophy, hyperplasia, degeneration, etc.). Six groups of
       chemicals were identified based on the methods described in
       the Methods section. Each chemical grouping is described by
       the effects that most heavily weighted the formation of the
       chemical groupings in Figure 1; this does not mean that every

-------
  [Figure 1 image: heat map of the chemical groupings, with chemicals along one axis; the color scale is the lowest effect level (LEL), from 2048 to 0.015625 mg/kg/day, plotted as -log2(LEL).]
  FIG. 1.  Two-way hierarchical clustering of 75 treatment-related effects from multigeneration reproduction tests on 316 chemicals in ToxRefDB. Six chemical
groups were identified based on their patterns of reproductive toxicity. Each chemical group description is derived from the most heavily weighted endpoints (see
Results and http://www.epa.gov/ncct/toxrefdb/ for details) and does not mean that every chemical in the group causes the endpoint.
chemical in the group causes the endpoint. Group 1 consists of
the 14 chemicals with no  observed toxicities  across  the  75
effects in this analysis. Group 2  contains  115 chemicals for
which general systemic toxicities  are driving the formation of
the group. Interestingly, this chemical grouping is also heavily
weighted  with   endpoints  relating  to  sperm  counts and
morphology, endocrine-related organ pathologies and  weight
changes, and delays in sexual maturation. All five phthalate
compounds are found among the 115 chemicals in this
group. Group 3 contains 63 chemicals with limited toxicity for
which parental and offspring body weight changes are driving
the formation of the chemical group.  Group 4 formation is
heavily  weighted with cholinesterase inhibition effects and is
comprised of 12 organophosphorus compounds. Groups 5 and
6 contain 48 and 64 chemicals, respectively, and the formation
of these groupings was heavily weighted with reproductive
toxicity endpoints, including testicular and epididymal pathologies
in group 5 and decrements in offspring viability and
survival in group 6.
  The complete listing of chemical groupings and endpoint
weights is available for download from the ToxRefDB
homepage (http://www.epa.gov/ncct/toxrefdb/). This analysis
clearly segmented the chemicals into distinct classes based on
their profile of systemic and reproductive toxicities. This
analysis also guides the endpoint selection process by highlighting
groups of related chemicals or endpoints based on the entire
profile of toxicological activity rather than a single  outcome.
Many of the associations between  endpoints or chemicals were
  expected,  but others were not. For instance,  reproductive
  performance, reproductive organ and offspring viability effects
  were segregated slightly from each other and to a greater extent
  from parental systemic  effects and even  delays in sexual
  maturation.

  Comparative Analysis with Chronic and Subchronic Systemic
     Toxicity
     Parental, reproductive, and offspring potencies (i.e., inverse
  log-transformed LEL) from the multigeneration  studies were
  compared to potency values for systemic toxicity from 2-year
  chronic and 90-day subchronic studies in the rat (Fig. 2). For
  this comparison, data were available in  ToxRefDB for 254
  chemicals  tested in both  multigeneration and 2-year  chronic
  studies, and 207 chemicals tested in both multigeneration and
  90-day subchronic studies.  The  potency  values compared
  rarely correspond to the same treatment-related  effect across
  study type.  For the majority  of  chemicals,  potency values
  between the multigeneration,  chronic and subchronic studies
  were comparable,  with a general  linear relationship falling
  within ten-fold of each other. However, for four chemicals
  (bisphenol A, deltamethrin, flucycloxuron, flufenpyr-ethyl) that
  caused parental or reproductive effects in the multigeneration
  study, there was no systemic toxicity observed  in either the
  chronic  or subchronic  studies. For another five chemicals
  (cyprodinil, diethyltoluamide, difenoconazole, ethametsulfuron
  methyl,  thiamethoxam)  potencies  for  the  most  sensitive
  multigeneration endpoints were more than 10-fold greater than

-------
  [Figure 2 image: two scatter-plot panels, CHRONIC SYSTEMIC and SUBCHRONIC SYSTEMIC, with lowest effect level (mg/kg/day) on both axes. Legend: NE = not established; 254 chemicals with multigeneration and 2-year chronic study; 207 chemicals with multigeneration and 90-day subchronic study; reference lines mark the within-2-fold band.]
  FIG. 2.  Parental, reproductive, and offspring LELs (inverse log transformed) from multigeneration rat studies were compared to systemic LEL from chronic/
cancer and subchronic rat studies for 254 and 207 chemicals, respectively. Points within gold lines indicate less than twofold difference between multigeneration
and chronic studies. Points within orange lines indicate less than 10-fold difference between multigeneration and chronic studies. "NE" stands for not established.


for the most sensitive effects in chronic studies. Of these five
chemicals only thiamethoxam was more potent based solely on
reproductive endpoints, that is, testicular atrophy. Decreasing
the threshold from 10-fold to a 2-fold increase in potency
resulted in 37, 7, and 20 chemicals more potent for parental,
reproductive, or offspring endpoints, respectively. Of the seven
chemicals identified as twofold more potent reproductive
toxicants, no reproductive organ toxicity was observed in the
rat chronic/cancer or subchronic studies for these chemicals;
the multigeneration test detected reproductive toxicity that

-------
could have been  missed  in  chronic or subchronic  studies.
Under the conditions of the 2-year chronic studies, effects were
observed at lower doses than in the multigeneration reproductive
study for the vast majority of chemicals. However, even in these
cases,  the  multigeneration test  often  identified  selective
reproductive  toxicants  and endpoints  not detected in  the
chronic study.
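
The fold-difference comparison used in this section can be sketched as follows. The function names, category labels, and the handling of exact boundary values (whether a ratio of exactly 2 or 10 counts) are assumptions; the text specifies only the 2-fold and 10-fold thresholds.

```python
def fold_more_potent(multigen_lel, reference_lel):
    """Return how many fold lower the multigeneration LEL is than the
    reference (chronic or subchronic) LEL, both in mg/kg/day.

    Returns None when either study established no LEL, because no ratio
    can be formed in that case.
    """
    if multigen_lel is None or reference_lel is None:
        return None
    return reference_lel / multigen_lel


def classify_potency(multigen_lel, reference_lel):
    """Bin a chemical by how its multigeneration LEL compares with the reference LEL."""
    fold = fold_more_potent(multigen_lel, reference_lel)
    if fold is None:
        return "not comparable"
    if fold > 10:
        return "multigeneration >10-fold more potent"
    if fold >= 2:
        return "multigeneration 2- to 10-fold more potent"
    if fold > 0.5:
        return "within 2-fold"
    return "reference study more potent"
```

In this sketch, chemicals with effects in the multigeneration study but no systemic toxicity in the chronic or subchronic studies fall into the "not comparable" bin, which corresponds to the chemicals the text reports separately.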

Comparative Analysis of Parental, Reproductive, and
   Offspring Endpoints
   Chemicals with increased potency in the second generation
were identified by comparing P1 and F1, or F1 and F2 LEL
across parental, reproductive and offspring endpoint categories
for 316 chemicals (Fig. 3). Specific second generation effects
(i.e., F1 parental or reproductive, F2 offspring) not observed in
the first generation (i.e., P1 parental or reproductive, F1
offspring), or sensitive effects occurring at a lower LEL in the
second generation are provided for all 316 chemicals on the
ToxRefDB homepage (http://www.epa.gov/ncct/toxrefdb/).
For parental effects, 15 chemicals had specific effects in the
F1 versus P1, and another 48 were more sensitive in the F1
versus P1 based upon at least a twofold difference in LEL.
   For reproductive toxicity endpoints, 52 chemicals had
specific effects in the F1 versus P1, and another 14 were more
sensitive in the F1 versus P1 based upon at least a twofold
difference in LEL. For offspring toxicity endpoints, 14
chemicals had specific effects in the F2 versus F1, and another
28 were more sensitive in the F2 versus F1 based upon at least
a twofold difference in LEL. Across reproductive and offspring
effects, a total of 94 chemicals displayed specificity or
sensitivity in the second generation. However, the F1
reproductive or F2 offspring LEL was the most sensitive LEL
across all endpoint categories for only 16 of these 94 chemicals
(Table 2). Of the 16 second generation sensitive chemicals as
determined by specific LEL, only three of these chemicals had
reproductive or offspring LOAEL based on critical effects that
required mating of the F1 adults or were observed in the F2
offspring. Of these three, only fenarimol effects on F2 litter size
determined the chronic reference dose in the risk assessment.
This analysis in ToxRefDB has identified a subset of reference
chemicals for ToxCast predictive modeling that may be more
specific  or potent  reproductive  toxicants. However,  it  is
important to note that these ToxRefDB values are  LEL for
all treatment-related effects, and are in only a small minority of
cases critical effects being  used for determination of NOAEL/
LOAEL.
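
The selection of the 16 chemicals in Table 2 can be illustrated with a short sketch. It assumes that "most sensitive" means the F1 reproductive or F2 offspring LEL is strictly lower than every other established LEL for that chemical; the dictionary keys and the function name are hypothetical, while the fenarimol and clethodim values are taken from Table 2.

```python
from typing import Dict, Optional

# Keys naming the second-generation categories (hypothetical labels).
SECOND_GEN_KEYS = ("reproductive_F1", "offspring_F2")


def second_generation_is_most_sensitive(lels: Dict[str, Optional[float]]) -> bool:
    """True if the lowest established LEL is F1 reproductive or F2 offspring.

    `lels` maps category/generation labels to LELs in mg/kg/day, with
    None meaning "not established" (NE).
    """
    second = [lels[k] for k in SECOND_GEN_KEYS if lels.get(k) is not None]
    others = [v for k, v in lels.items()
              if k not in SECOND_GEN_KEYS and v is not None]
    if not second:
        return False
    # Assumption: a tie with another category does not count.
    return not others or min(second) < min(others)


fenarimol = {"parental_P1": None, "parental_F1": None,
             "reproductive_P1": None, "reproductive_F1": 1.2,
             "offspring_F1": None, "offspring_F2": None}
clethodim = {"parental_P1": 263, "parental_F1": 263,
             "reproductive_P1": 51, "reproductive_F1": 1,
             "offspring_F1": None, "offspring_F2": None}
# Made-up chemical whose parental LEL is lower than its F2 offspring LEL.
hypothetical = {"parental_P1": 10, "parental_F1": 10,
                "reproductive_P1": None, "reproductive_F1": None,
                "offspring_F1": 25, "offspring_F2": 25}

for name, lels in [("fenarimol", fenarimol), ("clethodim", clethodim),
                   ("hypothetical", hypothetical)]:
    print(name, second_generation_is_most_sensitive(lels))
```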

Selected Multigeneration Study Endpoints for Predictive
  Modeling
   Figure  4  presents   the incidence  and distribution  by
generation of effects on reproductive performance, reproduc-
tive organs, offspring viability, and parental systemic toxicities
selected   as  anchoring  endpoints for  ToxCast predictive
[Figure 3: scatter plots comparing LELs between generations; panel labels include P1 and F1. Axes: Lowest Effect Level (mg/kg/day), log2 scale from 0.03125 to 2048, plus NE (Not Established). Guide lines: Within 2-Fold and Within 10-Fold.]
  FIG. 3.  Comparing LELs across generation and endpoint category. Points
within dark orange  lines indicate less than twofold difference  between
generations. Points within light  orange lines  indicate less  than 10-fold
difference between generations. "NE" stands for not established.

-------
                                   REPRODUCTIVE TOXICITY PROFILING FROM TOXREFDB
                                                                                                                     187
                                                TABLE 2
      Sixteen Chemicals with the Most Sensitive LELs from F1 Reproductive or F2 Offspring Toxicities

                                          LEL (mg/kg/day)
                              Parental      Reproductive  Offspring
Chemical name                 P1     F1     P1     F1     F1     F2     F1 reproductive/F2 offspring sensitive effect
2,4-DB                        112    NE     112    NE     112    25     Kidney dilation (b)
Azoxystrobin                  165    165    NE     NE     165    32.3   Pup and liver weight changes (b)
Bromuconazole                 141    141    141    NE     141    15.5   Liver weight changes (b)
Carbaryl                      92.4   92.4   NE     92.4   92.4   31.3   Offspring viability (a, b)
Chlorethoxyfos                NE     0.78   NE     NE     0.6    0.3    Pup weight decrease (b)
Clethodim                     263    263    51     1      NE     NE     Prostate and seminal vesicle weight
Desmedipham                   20     20     NE     110    20     4      Liver and kidney weight changes (b)
Dicyclohexyl phthalate        402    89.9   NE     17.8   457    457    Prostate weight decrease (a)
Epoxiconazole                 31.9   22.1   22.1   22.1   22.1   0.85   Offspring viability (b)
Fenarimol                     NE     NE     NE     1.2    NE     NE     Litter size decrease (a, b)
Mepiquat chloride             499    575    499    575    575    48.6   Eye opening delay (b)
Propetamphos                  2.8    7.1    5.5    0.3    5.5    5.5    Litter size decrease (b)
Stannane, tributylchloro-     NE     6.25   NE     0.25   1.25   0.25   Pup, testis, and epididymis weight changes (a)
TCMTB                         NE     NE     NE     NE     NE     38.4   Pup weight decrease (a, b)
Thiamethoxam                  61.3   61.3   NE     1.84   158    158    Testicular atrophy (a)
Triclosan                     50     150    NE     NE     150    15     Pup weight decrease (b)

  Note. Underline = parental, reproductive, or offspring LOAEL (study-level LOAEL). NE = not established (no observed effects).
  (a) Study-level critical effect (F1 reproductive or F2 offspring).
  (b) F1 mating required.
modeling. Toxicity profiles from multigeneration studies on
316 chemicals were based on a diverse set of 19 selected
effects or effect aggregations distributed in various
combinations across the P1, F1, and F2 generations. A detailed
table listing all 19 of these endpoints for the 316 chemicals,
including endpoint descriptions and various transformations of
LEL values, is available for download from the ToxRefDB
homepage (http://www.epa.gov/ncct/toxrefdb/).
  These 19 highly prevalent effects included treatment-related
changes to reproductive performance (fertility, mating,
gestational interval, implantations, litter size, and live
birth index), demonstrating effects at different stages of the
reproductive cycle. Besides effects of many chemicals on
offspring viability at PND4 and PND21 (viability and lactation
indices, respectively), pubertal delays were also recorded for
some chemicals. Pubertal delays were not part of the ToxCast
modeling dataset because only a small subset of chemicals and
studies assessed these endpoints. Effects on reproductive
performance and offspring viability were observed in 110
(35%) and 108 (34%) of the 316 tested chemicals, respectively.
Effects on reproductive organs, both organ weight and
pathology, were observed in 98 (31%) of the chemicals, with
roughly 50% of those chemicals causing the effect only in the
second generation (F1 adult). Of the 98 chemicals, 31 caused
both male and female reproductive organ effects, 43 male only,
and 24 female only. Systemic target organ weight and
pathology endpoints were also selected, including the liver,
kidney, and spleen, along with the endocrine-related adrenal,
pituitary, and thyroid glands.
  The fairly restricted set of 19 effects characterized 151 of the
152 chemicals that demonstrated any reproductive toxicity.
Additionally, these 19 effects identified 229 of the 269
chemicals that caused any offspring toxicity. The remaining
40 chemicals predominantly affected pup weight only. This
supports the hypothesis that we can extract a small finite set of
key reproductive effects from this dataset for use in developing
robust predictive signatures in the future stages of ToxCast
research as a prioritization tool spanning reproductive toxicity.
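
The incidence and coverage figures quoted above can be checked with simple arithmetic. The sketch below only re-derives percentages from the counts given in the text; the helper name `percent` and the dictionary labels are hypothetical.

```python
TOTAL_CHEMICALS = 316


def percent(count: int, total: int) -> int:
    """Share of `total` represented by `count`, as a whole-number percent."""
    return round(100 * count / total)


# Counts of chemicals with each class of effect, as reported in the text.
incidence = {
    "reproductive performance": 110,
    "offspring viability": 108,
    "reproductive organs": 98,
}
for label, count in incidence.items():
    print(f"{label}: {count}/{TOTAL_CHEMICALS} = {percent(count, TOTAL_CHEMICALS)}%")

# Reproductive organ effects split by sex: both, male only, female only.
organ_split = {"both": 31, "male only": 43, "female only": 24}
print("organ split total:", sum(organ_split.values()))

# Coverage of the 19 selected effects.
print("reproductive toxicity coverage:", percent(151, 152), "%")
print("offspring toxicity coverage:", percent(229, 269), "%")
print("offspring chemicals not covered:", 269 - 229)
```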
                             DISCUSSION

         ToxRefDB is being developed with several applications in
      mind. One is to provide in vivo  toxicity effects as targets for
      ToxCast predictive models.  In this fashion, ToxCast can  be
      established as a cost-effective rapid approach for screening and
      prioritizing a large number of chemicals for further
      toxicological testing (Dix et al., 2007). Using data from
      high-throughput screening (HTS) bioassays developed in the
      pharmaceutical industry, ToxCast is building computational
      models to forecast the potential toxicity of chemicals. These
      hazard predictions should provide EPA regulatory programs with
      science-based information helpful in prioritizing chemicals for
      more detailed

-------
[Figure 4: horizontal bar chart of the 19 selected endpoints, grouped as Reproductive Performance (Fertility, Mating, Gestational Interval, Implantations, Litter Size, Live Birth Index (PND1)); Reproductive Organ (Testis, Epididymis, Prostate, Ovary, Uterus); Offspring (Viability Index (PND4), Lactation Index (PND21)); and Parental (Adrenal, Pituitary, Thyroid, Kidney, Liver, Spleen). X-axis: No. Chemicals (Total: 316), 20 to 140.]
  FIG. 4.  Incidence and distribution, by generation, of the 19 endpoints selected for predictive modeling, including reproductive, offspring, and systemic
toxicity endpoints from the rat multigeneration reproduction study (see Results and http://www.epa.gov/ncct/toxrefdb/ for details). The light gray bar indicates
chemicals for which the endpoint was observed only in the first generation, either P1 adult or F1 juvenile. The medium gray bar indicates chemicals for which the
endpoint was observed in both first and second generation treatment groups. The dark gray bar indicates chemicals for which the endpoint was observed only in the
second generation, either F1 adult or F2 juvenile.
toxicological  evaluations, and therefore lead to using fewer
animal  tests. Target chemicals for such prioritization include
pesticidal  inerts,  antimicrobials,  and  the  many  industrial
chemicals  with  limited  toxicity information (Judson  et  al.,
2008c). ToxCast is  currently in the proof-of-concept phase,
wherein over 300 chemicals have been assayed in over 500
different HTS bioassays, creating bioactivity profiles being
used  to derive  signatures predicting the  known  toxicity for
these chemicals (Judson  et al., 2009).
  The Phase I chemicals are primarily conventional pesticide
actives  that have been extensively evaluated using traditional
mammalian toxicity testing, and hence have known properties
representative  of a  number  of toxicity  outcomes  (e.g.,
reproductive toxicity). Thus a critical component of ToxCast
is ToxRefDB, which is being populated with data from OPP for
pesticide  active  chemicals  and  being  extracted from  the
evaluations on  these  studies  conducted  by  OPP scientists.
Comparable toxicity  data  from  other  toxicity  sources (e.g.,
National Toxicology Program)  are  also  being captured  in
ToxRefDB. A broader and more diverse set of complementary
data on thousands of chemicals is being captured in EPA's
Aggregated Computational Toxicology Resource
(http://actor.epa.gov/actor; Judson et al., 2008b). Although
pesticide toxicity data currently predominate in ToxRefDB, the
database is being expanded to a broader range of chemicals,
both by category and use.
  The underlying data represented in ToxRefDB have been
evaluated by EPA in prior pesticide registration decisions, and
the presence of effects in high-dose animal studies does not
translate directly into significant human risk stemming from
registered uses of the pesticide. One major issue to note is that
the current analysis of ToxRefDB is not limited to just the
critical effects leading to regulatory determinations of LOAEL
and NOAEL. In addition, it should be noted that the EPA uses
animal toxicology studies, like those entered into ToxRefDB,
as well as other sources of information such as effects on
wildlife populations, mechanisms of action, use patterns,
environmental fate and persistence, food residue levels, and

-------
human exposure potential in its determinations to register
pesticides, and to establish acceptable levels of pesticide
residues for uses in the United States
(http://www.epa.gov/pesticides/).
  The toxicity data in ToxRefDB (www.epa.gov/ncct/toxrefdb)
and the HTS data generated in ToxCast
(www.epa.gov/ncct/toxcast) are being made publicly available
through EPA websites. The first component of ToxRefDB was
recently published (Martin et al., 2009), presenting endpoints to
be used for predictive modeling from two-year rodent bioassays
on 310 chemicals. The analysis and release of developmental
toxicity endpoints on 383 chemicals from ToxRefDB will also
provide key endpoints to be used for predictive modeling
(Knudsen et al., 2009). Multigeneration reproduction study data
for 316 chemicals were entered into ToxRefDB, making the vast
library of legacy data computable for the first time. The pattern
of reproductive toxicity across these chemicals resulted in
groupings of similar chemicals that could be used to match up
with HTS bioactivity profiles. In the meantime, the analysis
corroborated the distinction between parental, offspring, and
reproductive effects in downstream analyses based on the
distribution of endpoints across the chemical groups.
  All 12,230 effects in the multigeneration study dataset were
placed into three major classes of effects: parental, reproductive,
and offspring. The LELs for each class or category of effects
were used to identify sensitive or specific reproductive
toxicants based on comparisons to chronic and subchronic
study data and cross-generational comparisons within the
multigeneration reproductive test. In general, chemical
exposures under conditions of the multigeneration reproduction
study were less sensitive than under the conditions of the
2-year chronic study and comparable to the 90-day subchronic
study. The analysis did, however, identify a subset of 94
chemicals with sensitive or specific reproductive or offspring
toxicities when compared to systemic effects under longer
continuous exposure periods. Additional future analyses
comparing, for instance, maternal and fetal toxicity from
developmental toxicity studies (Knudsen et al., 2009) to parental
and offspring toxicity from reproductive toxicity studies will
provide additional insight into the role of developmental
exposures in the manifestation of specific toxicities.
  Similar insight can be gleaned from comparing endpoints
occurring at a lower dose or only in the second generation, that
is, second generation sensitive or specific effects, respectively.
Effects that occur in the first generation and are not
corroborated in the second generation can be questioned as
to their toxicological relevance. Conversely, effects with
consistent increases in second generation sensitivity or
specificity might reflect the need for reproductive or
developmental exposure to occur. Comparisons across these
broad classes of endpoints homed in on specific effects with
which to characterize the chemical set. The primary set of
effects selected as anchoring endpoints for ToxCast predictive
modeling comprised reproductive indices, offspring viability,
and male and female reproductive organ effects, along with a set
of parental systemic organ toxicities.
  The current study focused on providing endpoints for
predictive modeling as part of the ToxCast research program
(Dix et al., 2007), but also began to address the importance of
specific study design parameters, including differences across
generation, life-stage, and various classes of endpoints. It has
recently been suggested that the reproductive test guidelines for
agrichemicals could be refined to make the second generation
optional based on results seen in the first generation (Cooper
et al., 2006). Consistent with results from Janer et al. (2007),
the current analysis of this ToxRefDB dataset supports the
hypothesis that the second, F2 generation in these 329 studies
would rarely impact either the qualitative or quantitative
evaluations of these studies. Of the sixteen second generation
sensitive chemicals, carbaryl, fenarimol, and TCMTB showed
second generation effects that would have required F1 mating.
However, of these three chemicals, only fenarimol effects on F2
litter size determined the chronic reference dose
(U.S. EPA, 2006, 2007a,b). Additional analysis will be
performed on this dataset in collaboration with OPP and other
international chemical regulatory agencies to expound upon the
role of these and other study design parameters with respect to
chemical regulation and potential guideline study design
changes. For instance, 53 of the 73 chemicals proposed for
screening in the Endocrine Disrupter Screening Program (EDSP;
http://www.epa.gov/endo/pubs/prioritysetting/draftlist.htm) have
multigeneration studies entered into ToxRefDB. Where
available, multigeneration study data for the remaining chemicals
are now being entered into ToxRefDB. A focused analysis of the
EDSP chemical set to assess the ability of the current and
previous guidelines to identify reproductive effects related to
endocrine disruption would be just one example of the utility
of ToxRefDB (Kavlock et al., 2009). The use of ToxRefDB to
address many research and regulatory science questions
regarding in vivo mammalian toxicity not only provides
transparency, but also assists in guiding the next set of questions.
  The diverse utility of ToxRefDB as a reference database for
research applications such as ToxCast demonstrates the power
of curating toxicity information into a relational database. In
the current analysis of the multigeneration reproductive
toxicity test, six chemical sets were derived and subsequently
nineteen specific endpoints were identified to serve as anchoring
endpoints for eventual predictive modeling. These endpoints
are further defined by life-stage or generation, and fully
characterize the reproductive toxicity potential of the 316
chemicals in this study. Capturing these reproductive toxicity
data in ToxRefDB supports ongoing retrospective analyses, test
guideline revisions, and computational toxicology research.

                       SUPPLEMENTARY DATA

  Supplementary data are available online at
http://toxsci.oxfordjournals.org/.

-------
                             FUNDING

   United States Environmental Protection Agency.


                          REFERENCES

Collins, F.  S.,  Gray,  G.  M., and  Bucher, J. R.  (2008).  Transforming
  environmental health protection. Science 319, 906-907.
Cooper, R. L., Lamb, J.  C.,  Barlow,  S. M.,  Bentley, K., Brady, A.  M.,
  Doerrer, N. G., Eisenbrandt, D. L., Fenner-Crisp, P. A., Hines, R. N.,
  Irvine, L. F.,  et al.  (2006). A  tiered approach to life  stages testing for
  agricultural chemical safety  assessment. Crit. Rev. Toxicol. 36, 69-98.
Dix, D. J., Houck, K. A., Martin, M.  T., Richard, A. M., Setzer, R. W., and
  Kavlock, R. J. (2007). The ToxCast program for prioritizing toxicity testing
  of environmental chemicals. Toxicol. Sci. 95, 5-12.
Houck, K. A., and Kavlock, R.  J. (2008). Understanding mechanisms of toxicity:
  Insights from drug discovery research. Toxicol. Appl. Pharmacol. 227, 163-178.
Ihaka,  R., and Gentleman, R.  (1996). R: A  language for data analysis  and
  graphics. J. Comput. Graph. Stat. 5, 299-314.
Janer, G., Hakkert, B. C., Slob, W., Vermeire, T., and Piersma, A. H. (2007). A
  retrospective analysis of the two-generation study: What is the added value
  of the second generation? Reprod. Toxicol.  24, 97-102.
Judson, R. S., Dix, D.  J.,  Houck,  K.  A., Martin, M. T., and Kavlock, R. J.
  (2009). Developing predictive signatures using in vitro data from the EPA
  ToxCast program. Toxicologist 108, 79-80.
Judson, R.  S., Elloumi, F., Setzer, R. W., Li,  Z., and  Shah,  I.  (2008a). A
  comparison of machine learning algorithms for chemical toxicity classifica-
  tion using a simulated multi-scale data model. BMC Bioinformatics 9,  241
  doi:10.1186/1471-2105-9-241.
Judson, R. S., Richard, A. M., Dix, D. J., Houck, K., Elloumi,  F.,  Martin, M. T.,
  Carney, T., Transue, T. R., Spencer, R., and Wolf, M. (2008b). ACToR—
  Aggregated  computational toxicology resource. Toxicol.  Appl.  Pharmacol.
  doi:10.1016/j.taap.2007.12.037.
Judson, R., Richard, A., Dix, D.  J.,  Houck, K., Martin,  M., Kavlock, R.,
  Dellarco, V., Henry, T., Holderman, T., Sayre, P., et al.  (2008c). The toxicity
  data landscape for environmental chemicals. Environ. Health Perspect.
  doi:10.1289/ehp.0800168.
Kavlock, R. J., Dix,  D. J., Houck, K. A., Judson, R. S., Martin, M. T., and
  Richard,  A.  M.  (2008). ToxCast:   Developing  predictive signatures for
  chemical toxicity. Altern. Anim. Test. Exp. 14(Special  issue),  623-627.
   Kavlock, R. J., Dix, D. J., Houck, K. A., Martin, M. T., and Judson, R. (2009).
     Biological profiling of endocrine related effects of chemicals using ToxCast.
     Toxicologist 108, 226.
   Knudsen, T. B., Martin,  M. T., Kavlock,  R. J., Judson, R. S., Dix, D. J., and
     Singh, A. V. (2009).  Profiling the  activity of environmental chemicals in
     prenatal developmental toxicity  studies using the U.S. EPA's ToxRefDB.
     Reprod. Toxicol. doi: 10.1016/j.reprotox.2009.03.016.
   Martin, M. T., Judson, R. S., Reif, D. M., Kavlock, R. J., and Dix, D. J. (2009).
     Profiling chemicals based on chronic  toxicity results  from the  U.S.  EPA
     ToxRef database. Environ. Health Perspect. 117, 392-399.
   National Research Council  (NRC).  (2007). Toxicity Testing in the  21st
     Century: A Vision  and  a  Strategy.  The  National Academies Press,
     Washington, DC.
   Thorndike, R. L. (1953).  Who belongs in the family? Psychometrika 18,
     267-276.
   U.S. Environmental Protection Agency (U.S. EPA).  (1982). OPP Guideline
     83—84: Reproductive  and Fertility Effects. Pesticide Assessment Guide-
     lines, Subdivision F, Hazard Evaluation: Human and Domestic Animals.
     Office of Pesticides and Toxic Substances,  Washington, DC,  EPA-540/
     9-82-025.
   U.S. Environmental Protection  Agency  (U.S.  EPA). (1996).  Reproductive
     toxicity  risk assessment guidelines. Fed. Regist. 61, 56274-56322.
   U.S. Environmental Protection Agency (U.S. EPA).  (1998). Health Effects
     Test  Guidelines OPPTS 870.3800 Reproduction  and Fertility Effects.
     Office of Pesticides and Toxic Substances, Washington, DC,  EPA  712-
     C-98-208.
   U.S. Environmental Protection Agency  (U.S.  EPA).  (2006). Reregistration
     Eligibility Decision  for  2-(Thiocyanomethylthio)-benzothiazole  (TCMTB).
     Office of Pesticides and Toxic Substances, Washington, DC, EPA 739-R-07-003.
   U.S. Environmental Protection Agency (U.S.  EPA).  (2007a).   Fenarimol
     Summary Document Registration Review: Initial Docket. Office of Pesticides
     and Toxic Substances, Washington, DC, EPA-HQ-OPP-2006-0241.
   U.S. Environmental Protection Agency (U.S. EPA).  (2007b). Reregistration
     Eligibility Decision  (RED) for Carbaryl. Office of Pesticides  and Toxic
     Substances, Washington, DC, EPA 738-R-07-018.
   U.S. Environmental  Protection  Agency (U.S.  EPA).  (2009).  The  U.S.
     Environmental  Protection Agency's  Strategic Plan  for  Evaluating the
     Toxicity  of  Chemicals.  Office  of  the Science  Advisor,  Science  Policy
     Council, Washington,  DC, EPA  100/K-09/001.
   Ward, J. H. (1963). Hierarchical grouping to optimize an objective function.
     J. Am. Stat. Assoc. 58, 236-244.

-------
                                                                                      Guest Editorial   The future of toxicity testing
The Future of Toxicity
Testing for Environmental
Contaminants
Toxicity testing and assessment sit on the cusp
of a transformational change brought about by
the rapid emergence of tools and capabilities
in molecular biology and computational and
informational sciences. This transformation
has the potential to dramatically reshape the philosophy and approaches
underlying toxicity testing and the assessment of human health risks
associated with exposure to environmental contaminants.
    Such a transformation is especially significant for agencies that
are responsible for implementing congressionally mandated pro-
grams under which the risks of exposure to  a wide variety of envi-
ronmental pollutants are assessed and regulated. Most often, such
regulatory decisions have relied on toxicity testing data obtained
nearly exclusively from experimental animal models. This approach,
however, presents challenges in accommodating the need for more
efficient and cost-effective means to screen and prioritize chemicals
for testing and addressing increasingly complex issues such as life-
stage susceptibility and genetic variations  in the human population,
the risks of concurrent, cumulative exposure to multiple and diverse
chemicals,  and, fundamental to all, improved understanding of the
mechanism through which toxicity occurs.
    The U.S. Environmental  Protection Agency (EPA) has recognized
the potential application of emerging science to improve toxicity test-
ing and risk assessment (U.S. EPA 2002, 2004), notably by taking the
lead in commissioning the National Research Council (NRC) in 2004
to review existing strategies  (NRC 2006) and develop a long-range
vision for toxicity testing and risk assessment  (NRC 2007). Beyond
EPA, other federal programs have also recognized the need for this
transformative shift, as reflected in the National Toxicology Program's
(NTP) A National Toxicology Program for  the 21st Century: Roadmap
for the Future (NTP 2004) and the Food and Drug Administration's
(FDA) FDA's Critical Path Initiative (FDA 2008).
    To  build on  the  NRC  document, the  U.S.  EPA established
an  internal, cross-agency  workgroup that produced The U.S.
Environmental Protection Agency's Strategic Plan for Evaluating the
Toxicity of Chemicals  (U.S. EPA 2009) to provide a framework for
EPA to comprehensively move forward to  incorporate this new
scientific paradigm into future toxicity testing and risk assessment
practices. The strategy is centered on three interrelated issues: a) the
use of toxicity pathways information in screening and prioritization
of chemicals for further testing; b) the use  of toxicity pathways infor-
mation  in  risk assessment; and c) organizational transition. The last
element explicitly recognizes that regulatory offices within EPA will
need to be  actively involved in overseeing the significant transition to
this new paradigm and the translation of the attendant data for regu-
latory application.
    Research to address the first issue will build  on the efforts of
EPA's ToxCast program in  identifying and developing simple, reli-
able screening models to predict chemical  hazard (U.S. EPA 2008a).
The second effort will seek  to apply the toxicity pathways concept
in a systems biology approach, to better delineate the molecular
and cellular changes that perturb normal homeostatic mechanisms
toward a given toxicity pathway or set of toxicity pathways. This
information should reduce the uncertainty currently associated with
dose-response models by increasing their biological plausibility.
    Recognizing the  necessity and benefits of collaboration to
achieve the NRC's vision, EPA recently signed a Memorandum of
Understanding with the NTP and the National Institutes of Health
Chemical Genomics Center (U.S. EPA 2008b) to advance the high
throughput screening and toxicity pathway profiling in risk assess-
ment. This "Tox21" consortium is now actively coordinating efforts
to identify chemicals, pathways, screening assays, and informatic
approaches to assess the effects of thousands of chemicals (Kavlock
et al. 2009). The U.S. EPA is also working with the European
Commission  and the  Organization for Economic Cooperation and
Development to facilitate global collaborations.
    As recognized by the NRC (2007), the development and imple-
mentation of a transformational paradigm will require a major
commitment to new funding to sustain an iterative and long-term
process that changes institutional toxicity testing and risk assessment
practices. Regulators, stakeholders, and the public must be confident
that the new types of data can be used to effectively assess risk and
ultimately protect public health. As such, education and transpar-
ent communication will be critical. Ultimately, the testing paradigm
must be  evaluated via a comprehensive development and review
process, involving public comment, expert peer review, and harmo-
nization with other agencies and international organizations. EPA's
Strategic Plan for Evaluating the  Toxicity of Chemicals (U.S.  EPA
2009) lays the framework upon which the development, implemen-
tation, acceptance, and application of this transformative paradigm
can be built.
  The views expressed in this letter are those of the individual authors and
do not necessarily reflect the views of the U.S. EPA.

                                            Melissa G. Kramer
                                   Office of the Science Advisor
                          U.S. Environmental Protection Agency
                                              Washington, DC
                                E-mail: Kramer.melissa@epa.gov

                                            Michael Firestone
        Office of Children's Health Protection and Environmental
                                                    Education
                          U.S. Environmental Protection Agency
                                              Washington, DC

                                               Robert Kavlock
                                                 Harold Zenick
                            Office of Research and Development
                          U.S. Environmental Protection Agency
                                    Research Triangle Park, NC

Melissa Kramer has worked at the U.S. EPA on science policy issues since
2002, when she was a AAAS Science and Technology Policy Fellow. She
currently works in the Office of Policy, Economics, and Innovation.
 Environmental Health Perspectives • VOLUME 117 | NUMBER 7 | July 2009

-------
Guest Editorial: The future of toxicity testing
Michael Firestone is Science Director in the U.S. EPA Office of
Children's Health Protection and Environmental Education. His
primary focus is on the development of risk assessment guidance and
policy that explicitly  considers the potential for greater early lifestage
susceptibility, including work on developing EPA's guidance on cancer
risk assessment, childhood age grouping, and probabilistic modeling. In
addition, he has helped develop EPA's Risk Assessment Portal (www.epa.
gov/risk) and was the agency's project officer for the recent NRC report
Toxicity Testing in the 21st Century.

Robert Kavlock is the Director of the U.S. EPA National Center for
Computational Toxicology. He has more than 30 years' experience in
developing screening tools and methodologies for noncancer risk assess-
ments.

Harold Zenick is Director of the National Health and Environmental
Effects Research Laboratory in the Office of Research and Development
in the U.S. EPA. His current interests are in integrating human health
and ecological risk assessment, strengthening the linkages between envi-
ronmental and public health agendas and agencies, and the application
of emerging computational, informational, and molecular sciences in
improving toxicity testing and risk assessment practices.
                              REFERENCES

    FDA. 2008. FDA's Critical Path Initiative. Available: http://www.fda.gov/ScienceResearch/
       SpecialTopics/CriticalPathInitiative/default.htm [accessed 10 June 2009].
    Kavlock RJ, Austin CP, Tice RR. 2009. Toxicity testing in the 21st century: implications for human
       health risk assessment. Risk Anal 29(4):485-487.
    NRC (National Research Council). 2006. Toxicity Testing for Assessment of Environmental
       Agents. Washington, DC:National Academies Press.
    NRC (National Research Council). 2007. Toxicity Testing in the 21st Century: A Vision and a
       Strategy. Washington, DC:National Academies Press.
    NTP. 2004. A National Toxicology Program for the 21st Century: A Roadmap for the Future.
       Research Triangle Park, NC:National Toxicology Program.
    U.S. EPA. 2002. Interim Genomics Policy. Washington, DC:U.S. Environmental Protection
       Agency. Available: http://www.epa.gov/osa/spc/genomics.htm [accessed 1 August 2008].
    U.S. EPA. 2004. Genomics White Paper. EPA 100/B-04/002. Washington, DC:U.S. Environmental
       Protection Agency.
    U.S. EPA. 2008a. ToxCast™ Program: Predicting Hazard, Characterizing Toxicity Pathways,
       and Prioritizing the Toxicity Testing of Environmental Chemicals. Washington, DC:U.S.
       Environmental Protection Agency. Available: http://www.epa.gov/ncct/toxcast/ [accessed
       1 August 2008].
    U.S. EPA. 2008b. Memorandum of Understanding. Washington, DC:U.S. Environmental Protection
       Agency, National Center for Computational Toxicology. Available: http://www.epa.gov/
       comptox/articles/comptox_mou.html [accessed 1 August 2008].
    U.S. EPA. 2009. The U.S. Environmental Protection Agency's Strategic Plan for Evaluating the
       Toxicity of Chemicals. EPA100/K-09/001. Washington, DC:U.S. Environmental Protection
       Agency.
Benzene 2009: Health Effects and Mechanisms of Bone Marrow Toxicity. Implications for t-AML
and the Mode of Action Framework will be held September 7-11, 2009, at Technische Universität
München, Munich, Germany. A satellite meeting to Eurotox 2009, Benzene 2009 will:
• Review the relationship between bone marrow toxicity and leukemia;
• Discuss recent studies on the epidemiology of benzene-induced diseases;
• Explore mechanistic studies of reactive metabolites and reactive oxygen species,
  transgenics and signal transduction, DNA damage and repair;
• Evaluate exposure metrics and biomarkers; and
• Place the evidence in context using a mode-of-action approach to inform risk assessment.
Abstracts are invited for poster and plenary presentations (deadline for submitting: June 30, 2009). Travel
fellowships are available for students or postdoctoral researchers. For further information, please visit
www.tum-benzenesymposium.de.

-------
                                                                                                                      Review
The Toxicity  Data  Landscape for Environmental Chemicals
Richard Judson,1 Ann Richard,1 David J. Dix,1 Keith Houck,1 Matthew Martin,1 Robert Kavlock,1 Vicki Dellarco,2
Tala Henry,3 Todd Holderman,3 Philip Sayre,3 Shirlee Tan,4 Thomas Carpenter,5 and Edwin Smith6
1National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency,
Research Triangle Park, North Carolina, USA; 2Office of Pesticide Programs, Office of Prevention, Pesticides, and Toxic Substances,
U.S. Environmental Protection Agency, Arlington, Virginia, USA; 3Office of Pollution Prevention and Toxics and 4Office of Science
Coordination and Policy, Office of Prevention, Pesticides, and Toxic Substances, U.S. Environmental Protection Agency, Washington,
DC, USA; 5Office of Water, Office of Ground Water and Drinking Water, U.S. Environmental Protection Agency, Washington, DC, USA;
6Great Lakes National Program Office, U.S. Environmental Protection Agency, Chicago, Illinois, USA
 OBJECTIVE: Thousands of chemicals are in common use, but only a portion of them have undergone
 significant toxicologic evaluation, leading to the need to prioritize the remainder for targeted testing.
 To address this issue, the U.S. Environmental Protection Agency (EPA) and other organizations are
 developing chemical screening and prioritization programs. As part of these efforts, it is important
 to catalog, from widely dispersed sources, the toxicology information that is available. The main
 objective of this analysis is to define a list of environmental chemicals that are candidates for the
 U.S. EPA screening and prioritization process, and to catalog the available toxicology information.
 DATA SOURCES: We are developing ACToR (Aggregated Computational Toxicology Resource),
 which combines information for hundreds of thousands of chemicals from > 200 public sources,
 including the U.S. EPA, National Institutes of Health, Food and Drug Administration,
 corresponding agencies in Canada, Europe, and Japan, and academic sources.
 DATA EXTRACTION: ACToR contains chemical structure information; physical-chemical properties;
 in vitro assay data; tabular in vivo data; summary toxicology calls (e.g., a statement that a chemical
 is considered to be a human carcinogen); and links to online toxicology summaries. Here, we use
 data from ACToR to assess the toxicity data landscape for environmental chemicals.
 DATA SYNTHESIS: We show results for a set of 9,912 environmental chemicals being considered for
 analysis as part of the U.S. EPA ToxCast screening and prioritization program. These include high-
 and medium-production-volume chemicals, pesticide active and inert ingredients, and drinking
 water contaminants.
 CONCLUSIONS: Approximately two-thirds of  these chemicals have at least limited toxicity sum-
 maries available. About one-quarter have been assessed in at least one highly curated toxicology
 evaluation database such as the U.S. EPA Toxicology Reference Database, U.S. EPA Integrated
 Risk Information System, and the National Toxicology Program.
 KEY WORDS: ACToR, carcinogenicity, database, developmental, hazard, HPV, MPV, pesticide,
 reproductive, toxicity. Environ Health Perspect  117:685-695 (2009).  doi:10.1289/ehp.0800168
 available via http://dx.doi.org/ [Online 22 December 2008]
The U.S. Environmental Protection Agency
(EPA) has a significant interest in develop-
ing more efficient and informative toxicity
determination approaches in part because
of the large number of chemicals under its
jurisdiction. Ultimately, it would be bene-
ficial to characterize the toxicologic profiles
of all chemicals in use in the United States.
However, the  size of this chemical universe
[in excess of 75,000 chemicals, which is the
estimated number in the Toxic Substances
Control Act (TSCA 1976) inventory (U.S.
EPA 2004b)] makes this goal too difficult
using current approaches to toxicity charac-
terization that rely on extensive animal test-
ing,  cost millions of dollars, and can take
2-3 years per chemical. The International Life
Sciences Institute/Health and Environmental
Sciences Institute  (ILSI/HESI) recently
released several reports describing a more
focused, tier-based approach for toxicity test-
ing of agricultural chemicals, which would
ultimately lead to the use of fewer animals
(Barton et al. 2006; Carmichael et al. 2006).
The National Research  Council (NRC)
recently released a report titled  Toxicity
Testing in the 21st Century: A Vision and a
Strategy that outlines a much more ambi-
tious and long-term vision for  developing
novel in vitro approaches to chemical tox-
icity characterization and prediction (NRC
2007) that would largely eliminate animal
testing. The NRC report addresses several
concerns about the current testing methods,
specifically, the desire a) to reduce the number
of animals used in testing, b) to reduce
the overall cost and time required to charac-
terize each chemical, and c) to increase the
level of mechanistic understanding of chemi-
cal toxicity. The U.S. EPA and the National
Institutes of Health (NIH) are actively pursu-
ing approaches to implement ideas outlined
in the NRC report (Collins et al. 2008).
   Regardless of the level of quality of toxi-
cology data  on environmental chemicals,
many chemicals lack significant amounts
of data. In the United States and  Canada,
an  estimated 30,000 chemicals are in
wide commercial use, based on U.S. EPA
and  Environment Canada data (Muir and
Howard  2006). The European Union's
Registration, Evaluation, and Authorization
of Chemicals (REACH) program has recently
released its first set of registered substances,
which contains > 140,000 entries (REACH
2008). The exact number of chemicals in use
is, in a sense, unknowable because it depends
on where one sets the threshold of use and
because use changes over time. The major
point is that the number is relatively large
and that only a relatively small subset of these
chemicals have been sufficiently well charac-
terized for their potential to cause human or
ecologic toxicity to support regulatory action.
This "data gap" is well documented (Allanou
et al.  1999;  Applegate and  Baer 2006;
Birnbaum et al. 2003; Guth et al. 2005;  NRC
2007; U.S. EPA 1998).
   The high cost and lengthy  times associ-
ated with the use of animal testing to deter-
mine a chemical's potential for toxicity make
this  strategy impractical for evaluating tens
of thousands of chemicals, hence the large
inventories of existing chemicals for which
few or no test data are available. An alterna-
tive  approach is to attempt to assess much
larger numbers of chemicals by employing
more efficient in vitro methods.  One strategy
applies a  broad spectrum of relatively inex-
pensive and rapid high-throughput screening
Address correspondence to R. Judson, U.S.
Environmental Protection Agency,  109 T.W.
Alexander Dr. (B205-01), Research Triangle Park,
NC 27711  USA. Telephone: (919) 541-3085. Fax:
(919) 541-1194. E-mail: judson.richard@epa.gov
  We acknowledge significant contributions from
members of the U.S. EPA Aggregated Computational
Toxicology Resource (ACToR) development team:
T. Cathey,  T. Transue, and R. Spencer of Lockheed
Martin, and F. Elloumi, D. Smith, J. Vail, and  K.
Daniel. We also acknowledge the significant contribu-
tion of M. Wolf (Lockheed Martin) in relation to the
U.S. EPA's  Distributed Structure-Searchable Toxicity
Data Network structure inventory incorporated into
ACToR.
  This article has been reviewed  by the U.S. EPA
and approved for publication. Approval does not
signify that the contents necessarily reflect the views
and policies of the agency, nor does mention of trade
names or commercial products constitute  endorse-
ment or recommendation for use.
  The  authors declare they  have no  competing
financial  interests.
  Received 8 September 2008; accepted 22 December
2008.
Environmental Health Perspectives • VOLUME 117 | NUMBER 5 | May 2009

-------
Judson et al.
(HTS) assays to a large set of chemicals, fol-
lowed by the use of these results to prioritize
a much smaller subset of chemicals for more
detailed analysis. The "prioritization score"
for a chemical would be based on signatures,
or patterns extracted from the HTS data, that
are predictive of particular effects or modes
of chemical toxicity. A comprehensive priori-
tization approach will also require the use of
exposure and pharmacokinetic estimates, in
addition to the intrinsic  hazard information
provided by in vitro assays. Chemicals of known
toxicity make up the  training and validation
sets that are used to  develop and validate these
predictive signatures. HTS assays that yield
data  for the predictive signatures would then
be run on chemicals of unknown toxicity (the
test chemicals), and a prioritization score for
those chemicals would be produced. The U.S.
EPA has made a significant investment in this
approach through the ToxCast research pro-
gram (Dix et al. 2007). ToxCast is currently
screening hundreds,  and eventually thousands,
of environmental chemicals using hundreds of
HTS assays with the goal to develop predictive
toxicity signatures,  and is using these signa-
tures to prioritize chemicals for further test-
ing. In this context, the term "environmental
chemicals" refers primarily to pesticides and
industrial chemicals  that are used or produced
in large enough quantities to pose potential for
human  or ecologic exposure [largely the high-
production-volume  (HPV) and medium-pro-
duction-volume (MPV) chemicals described
below].  However, a  number of environmental
chemicals that are captured in our analysis are
food ingredients or naturally occurring human
metabolites. We included many of the former
because they are classified as inert ingredients
in pesticide products.
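
As a rough illustration of the prioritization idea described above, the following Python sketch ranks chemicals by a weighted sum of HTS assay hits. The assay names, the weights, and the weighted-sum scoring rule are hypothetical placeholders chosen for illustration; they are not the ToxCast method, in which the signature would be derived from training chemicals of known toxicity.

```python
# Hypothetical sketch: rank chemicals by a weighted sum of HTS assay hits.
# Assay names, weights, and the scoring rule are invented for illustration only.

def prioritization_score(hits, signature):
    """Sum the signature weights of the assays in which the chemical was active."""
    return sum(weight for assay, weight in signature.items() if hits.get(assay, False))


def rank_chemicals(hts_results, signature):
    """Return (chemical, score) pairs sorted from highest to lowest score."""
    scored = [(chem, prioritization_score(hits, signature))
              for chem, hits in hts_results.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# A made-up "signature": assays assumed to be predictive of one toxicity end point.
signature = {"assay_A": 0.5, "assay_B": 0.3, "assay_C": 0.2}

# Made-up hit calls (True = active) for three test chemicals.
hts_results = {
    "chem_1": {"assay_A": True, "assay_B": False, "assay_C": True},
    "chem_2": {"assay_A": False, "assay_B": False, "assay_C": False},
    "chem_3": {"assay_A": True, "assay_B": True, "assay_C": True},
}

ranking = rank_chemicals(hts_results, signature)
for chemical, score in ranking:
    print(chemical, round(score, 2))
```

A comprehensive scheme would, as the text notes, also fold in exposure and pharmacokinetic estimates rather than rely on hazard signals alone.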
   In this article we address  two key aspects
of this chemical screening and prioritization
process. The first is the definition of a set of
chemicals  of interest to a screening program,
based on their widespread use or other poten-
tial for significant  human exposure, or the
current availability of toxicity information
that can be used in building  screening mod-
els. Some widely used but as yet uncharacter-
ized  chemicals may not  be good candidates
for screening  because  their physical-chemical
properties make them impractical to test in
in vitro assays (e.g., insoluble or highly vola-
tile compounds), whereas other substances
that we define as environmental chemicals are
regarded to be safe  under intended use situa-
tions and may not require further testing, but
can serve as negative controls. For instance, a
subset  of pesticide  inert  ingredients  are also
on the U.S. Food and Drug Administration
(FDA) Generally Recognized as Safe chemical
list. As  a further example, some "chemicals"
that are listed as pesticide inert ingredients are
common foods, such as milk.
   The second objective is the characterization
of the sources and amount of reliable in vivo
toxicology data that can be used for develop-
ing and validating screening models in pro-
grams such as ToxCast. A significant amount
of high-quality toxicity  data are needed to
train and validate in vitro-based models for
predicting chemical hazard. Equally important
is the presence  of both negative and positive
examples for each toxicity end point to  be
modeled. In addition to  the sets of environ-
mental chemicals described here, pharmaceuti-
cal compounds are another source of detailed
animal and human toxicology data.
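
The requirement for both positive and negative examples per end point can be checked mechanically before any model is trained. The sketch below is a minimal illustration in Python; the record layout (chemical, end point, positive/negative call) and the example values are assumptions made for this example and do not reflect the ACToR or ToxRefDB schema.

```python
from collections import defaultdict


def endpoints_with_both_classes(records):
    """Return the end points that have at least one positive and one negative chemical.

    `records` is an iterable of (chemical, endpoint, is_positive) tuples;
    this layout is a hypothetical one chosen for illustration.
    """
    labels_seen = defaultdict(set)
    for _chemical, endpoint, is_positive in records:
        labels_seen[endpoint].add(bool(is_positive))
    return sorted(ep for ep, labels in labels_seen.items() if labels == {True, False})


# Made-up toxicity calls for four chemicals and two end points.
records = [
    ("chem_1", "carcinogenicity", True),
    ("chem_2", "carcinogenicity", False),
    ("chem_3", "developmental", True),
    ("chem_4", "developmental", True),
]

usable = endpoints_with_both_classes(records)
print(usable)
```

In this toy data set only the first end point could be modeled, because the second has no negative examples.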
   The sets of chemicals on which we have
focused  are the HPV and MPV chemicals
from the TSCA inventory, pesticide and anti-
microbial active and inert ingredients, known
drinking water contaminants, hazardous air
pollutants (HAPs), and certain defined classes of
chemicals of interest, including the U.S. EPA's
Toxics Release Inventory (TRI), Integrated Risk
Information  System (IRIS), and the first set of
chemicals to be tested through the Endocrine
Disrupter Screening Program (EDSP). The
TRI, drinking water contaminant, and EDSP
chemicals are largely included in the TSCA
inventory and pesticide active and inert ingredi-
ent lists. By combining these sources, we define
a set of 9,912 chemicals. Below we describe in
detail the process we used to arrive at this num-
ber. At present, we have  limited the scope of
in vivo toxicology data to that which is relevant
to human health, as opposed to ecotoxicity. An
equivalent analysis for the ecotoxicity data land-
scape will be carried out in the future.
   To support  a data-intensive analysis
of environmental chemicals, we have devel-
oped a system called  ACToR (Aggregated
Computational Toxicology Resource) (Judson
et al. 2008; U.S. EPA 2008a), which is a data-
base holding essentially all publicly available
information on chemical identity, structure,
physical-chemical properties, in vitro assay
results, and in vivo toxicology data. All of the
data described  in this  article have been col-
lected in ACToR.

Target Chemicals for Analysis
The U.S. EPA has authority to review and/or
regulate  a large number  of chemicals under
a variety of statutes, including those govern-
ing the manufacture, import, sale, and use of
pesticides and industrial chemicals. The large
numbers of chemicals on  various U.S. chemi-
cal inventories, and the limited toxicity infor-
mation for many of these, have already been
stated as the driver for the need to set priorities
for additional testing. Because this universe of
chemicals is so large, it is even  necessary to
prioritize what  goes into  a science-based pri-
oritization approach such as ToxCast. In this
article we focus on chemicals that are of interest
because a) they are known to be bioactive
(e.g., pesticide active ingredients), b) they are
manufactured or used in large quantities (HPV
and MPV chemicals), or c) many people may
be exposed to them on a  routine basis (e.g.,
drinking water contaminants).  We include
both largely uncharacterized chemicals and
chemicals for which significant toxicology
information  is already available (e.g., pesti-
cide active ingredients, IRIS chemicals, and
chemicals on the TRI). The well-characterized
chemical groups are important because these
allow us to  develop and validate predictive
models for prioritization of the remaining,
largely uncharacterized chemicals.
   Based on these criteria, we focused on sets
of chemicals that are defined in the remain-
der of this section. Some of these lists  are not
static,  so we have chosen versions  available
as of a specific date. For each of the lists, we
describe the  rules for inclusion  and provide
the total number of chemicals used  for the
current evaluation. "Official" versions of these
lists are updated and posted to  the relevant
U.S. EPA websites only every 2 or more years,
so in  several cases, we have extracted more
current snapshots of the  lists from internal
U.S. EPA databases. Many of the chemicals
we included in this analysis are complex mix-
tures.  Additionally, these lists have significant
overlap; for  instance, some pesticide active
ingredients are also HPV chemicals. Finally, to
be included in the current ACToR inventory,
a chemical must be identified by a Chemical
Abstracts Service Registry Number (CASRN).
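
Because the CASRN serves as the key for every list in this analysis, a basic sanity check on identifiers is useful when merging sources. The following Python sketch applies the standard CASRN check-digit rule (the last digit equals the sum of the preceding digits, each weighted by its position counted from the right, modulo 10). Whether ACToR performs this validation is not stated here; the function name and its use are assumptions for illustration.

```python
import re

# CASRN layout: 2 to 7 digits, 2 digits, and 1 check digit, separated by hyphens.
CASRN_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_casrn(casrn):
    """Return True if the string has the CASRN layout and a correct check digit."""
    match = CASRN_PATTERN.match(casrn.strip())
    if not match:
        return False
    body = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight each digit by its position counted from the right, starting at 1.
    total = sum(position * int(digit)
                for position, digit in enumerate(reversed(body), start=1))
    return total % 10 == check_digit


print(is_valid_casrn("71-43-2"))    # benzene
print(is_valid_casrn("7732-18-5"))  # water
print(is_valid_casrn("71-43-3"))    # wrong check digit
```

A check of this kind catches transcription errors but cannot tell whether a well-formed CASRN has actually been assigned to a substance.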
   Possible later extensions of  this analysis
could consider chemicals with lower produc-
tion volumes or lower exposure potential
than those considered presently.  These would
include the Canadian Domestic Substances
List (DSL),  which includes approximately
30,000 chemicals, and the large  collection of
chemicals to  be analyzed under the REACH
program. REACH is still in  the process
of defining its target list, but an estimated
30,000 chemicals will be  included.  Many of
the Canadian DSL and REACH chemicals
have U.S.  use and/or production levels below
the cutoffs used for the present analysis. Note,
however,  that the Canadian DSL and the
chemicals we considered here significantly
overlap. Additionally, pharmaceutical com-
pounds will be included in the future because
of the corresponding wealth of  both  animal
and human toxicology data.
    The TSCA Inventory and Inventory Update
Reporting (IUR). In 1977,  the U.S. EPA pub-
lished a rule to assemble an  inventory of chemi-
cal substances currently in commerce. This
inventory, commonly referred to as the TSCA
Inventory, is the basis for the  U.S. EPA's
Existing Chemicals Program. Starting in 1986,
the Inventory was periodically updated using
the IUR regulation. The TSCA  Inventory is
composed of approximately 85,000 chemical

-------
substances (U.S. EPA 2004b), including both
substances that are nonconfidential and those
claimed to be confidential business informa-
tion (CBI) under TSCA. Originally,  the IUR
was updated on a 4-year cycle, but starting
with the 2006 IUR, it will be updated on a
5-year cycle. The IUR reporting requirements
depend on the volume of the chemical that is
produced as well as certain exemptions. Hence,
the IUR list  is a subset of the larger TSCA
inventory. Before  2006, the IUR contained
organic chemicals manufactured or distributed
in the United States in amounts > 10,000 lb/
year. The 2006 IUR regulation requires manu-
facturers and importers of certain chemical
substances to  report site and manufacturing
information for chemicals  manufactured or
imported in amounts of > 25,000 lb at a single
site. Additional information on domestic pro-
cessing and use must be reported for chemicals
manufactured in amounts of > 300,000 lb at a
single site. The full inventory, including both
confidential and nonconfidential substances,
is maintained by  U.S. EPA and Chemical
Abstract Service and is not available to the
public. The nonconfidential or "public" inven-
tory is published periodically, usually after each
IUR cycle. We have included the 2002 version
of the public TSCA inventory in our analy-
ses. This list  is available from the U.S. EPA
Substance Registry System (U.S.  EPA 2008q).
This list contains 65,513 chemicals indexed by
CASRN. Note that this number differs from
the 75,000 quoted elsewhere because this is the
publicly released list and excludes chemicals
added under the claim of CBI.
    HPV chemicals. The U.S. HPV chemi-
cals are those manufactured in or imported
into the United States in amounts > 1 million
lb/year. The U.S. EPA HPV list is fluid,
changing to some degree with  each IUR
cycle. Our current list contains 2,539 chemi-
cals (U.S. EPA 1990). We also include two
important subsets of the HPV list.
    U.S.  EPA HPV Challenge. The HPV
Challenge Program chemical list  consists of all
the HPV chemicals reported during the 1990
IUR reporting year. Inorganic chemicals and
polymers, except in special circumstances, were
not included in the HPV Challenge Program.
Our version of the HPV Challenge  list con-
tains 1,973 chemicals (U.S. EPA 1990).
    U.S. EPA HPV information system. These
are chemicals with data submitted under the
HPV Challenge Program for which  "Robust
Summary" data have been entered  into the
U.S. EPA HPV information system (HPVIS;
U.S. EPA 20081). There are 991 chemicals
from HPVIS with information in ACToR.
    MPV chemicals. Another set of industrial
chemicals of interest are the  non-HPV chemi-
cals included in the TSCA IUR list. These are
the chemicals exceeding a reporting threshold
of 10,000 lb/year before 2006, and 25,000 lb in
2006 and beyond, but < 1 million lb/year. The
2002  IUR list contains 5,375 MPV chemi-
cals (U.S.  EPA 2004a, 2004b). The updated,
draft 2006 IUR list  contains approximately
3,668 MPV chemicals that are not CBI; the
2006  IUR public list will be released by the
U.S. EPA in 2009.
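
The HPV and MPV definitions above amount to a simple threshold rule, sketched here in Python. The sketch is a simplification under stated assumptions: it ignores the per-site basis of the 2006 threshold and all IUR exemptions, and it treats a volume of exactly 1 million lb/year as MPV, a boundary case the definitions leave open. The function names are hypothetical.

```python
HPV_THRESHOLD_LB = 1_000_000  # HPV: more than 1 million lb/year


def iur_reporting_threshold(reporting_year):
    """IUR reporting threshold in lb: 10,000 before 2006, 25,000 from 2006 onward."""
    return 10_000 if reporting_year < 2006 else 25_000


def classify_by_volume(pounds_per_year, reporting_year):
    """Classify a chemical as HPV, MPV, or below the IUR reporting threshold.

    Simplified: per-site reporting and exemptions are not modeled, and exactly
    1,000,000 lb/year is assigned to MPV by assumption.
    """
    if pounds_per_year > HPV_THRESHOLD_LB:
        return "HPV"
    if pounds_per_year > iur_reporting_threshold(reporting_year):
        return "MPV"
    return "below IUR threshold"


print(classify_by_volume(2_000_000, 2002))
print(classify_by_volume(20_000, 2002))
print(classify_by_volume(20_000, 2006))
```

The last two calls show how the same production volume can fall in or out of the MPV set depending on the reporting cycle, which is one reason the MPV count differs between the 2002 and draft 2006 lists.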
   Pesticides and  antimicrobials.  This
category covers a wide range of substances.
Chemicals regulated  as part of the U.S. EPA
pesticide program are generally classified as
"active" or "inert."  The active ingredients
are further classified by whether they are tar-
geted at microbes (antimicrobials) or complex
organisms (pesticides). Additionally,  all pesti-
cide compounds (conventional actives, anti-
microbials, and inert ingredients) are classified
by whether or not they have food-use toler-
ances  or tolerance exemptions. Finally, one
can classify these chemicals by whether or not
they are in use in significant quantities.  Here
we rely on the Office of Pesticide Products
Information (OPPIN) system of the U.S.
EPA to extract lists of chemicals. OPPIN is
not publicly accessible. From this, we have
drawn the following subsets:
• Conventional  Pesticide Actives: (EPA
  OPPIN  pesticide active): active pesticide
  ingredients (834 chemicals)
• Antimicrobial Actives (EPA OPPIN  anti-
  microbial active):  active  ingredients used
  against microbes (337 chemicals)
• Pesticide inert  ingredients: an inert ingre-
  dient means any  substance, other  than
  an active ingredient, that is intentionally
  included in a pesticide product. Inert ingre-
  dients have a number of uses, for instance,
  as a  solvent, as  an aid in increasing the pes-
  ticide product's shelf life, or as an agent
  to protect the  pesticide from degradation
  due  to exposure to sunlight. We used two
  sources:  a) U.S. EPA OPPIN inert ingredi-
  ents (the complete OPPIN list containing
  3,532 chemicals); and b) U.S. EPA inert
  nonfood ingredients [a list of inert pesticide
  ingredients classified by hazard potential,
  not  approved for food contact use, avail-
  able from the U.S. EPA's Office of Pesticide
  Programs (OPP) website (3,492 chemicals)
  (U.S. EPA 2008n)]
• Pesticide ingredients with food-use toler-
  ances or tolerance  exemptions (U.S. EPA
  OPPIN food use) (1,320 chemicals)
    U.S. EPA TRI. The Emergency Planning
and Community Right-to-Know Act of 1986
(EPCRA 1986) requires businesses to report the
locations and quantities of chemicals stored on-
site to state and local governments in order to
help communities prepare to respond to chem-
ical spills  and similar emergencies.  EPCRA
requires U.S. EPA and the states to annually
collect data on releases and transfers of certain
toxic chemicals from industrial facilities, and
to make the data available to the public in the
TRI. In 1990 Congress passed the Pollution
Prevention Act of 1990 (Pollution Prevention
Act 1990), which requires that additional data
on waste management and source reduction
activities be reported under the TRI. The U.S.
EPA compiles the TRI data each year and
makes these data available through several data
access tools, including their website (U.S. EPA
2008p). Our analysis includes 636 chemicals
from TRI.
   Drinking water contaminants. The U.S.
EPA develops drinking water standards and
identifies lists of potential drinking water con-
taminants because they are anticipated to occur
in drinking water supplies and may have adverse
health effects. The lists tracked in the present
analysis are the U.S. EPA's Drinking Water
Standards and Health Advisory Chemicals
(DWSHA; 200 chemicals) and the Candidate
Chemical  Lists [CCLs: U.S. EPA CCL1, U.S.
EPA CCL2, and U.S. EPA draft CCL3, which
include 47, 39, and 92 chemicals, respectively
(U.S. EPA 2008e)]. We also included the
Preliminary CCL (PCCL) listing of the 528
chemicals that the U.S. EPA evaluated  during
the development of draft CCL3  (U.S. EPA
2008d). The U.S. EPA PCCL was derived from
a collection of approximately 6,000 chemicals
analyzed by the U.S. EPA's Office of Water,
and the PCCL was selected from these 6,000
chemicals  based on available health effects and
occurrence data (U.S. EPA 2008c).
    U.S. EPA Great Lakes National Program
Office. A set of 429 candidate persistent, bioac-
cumulative toxicants (PBTs) compiled by the
U.S. EPA Great Lakes National Program Office
(GLNPO) are included in the present analysis
(Muir and Howard 2006). These are designated
as U.S. EPA GLNPO PBT chemicals.
    U.S. EPA HAPs. This is a list of chemi-
cals that are under review by the U.S. EPA
specified in the Clean Air Act Amendments of
1990. These chemicals include volatile organic
chemicals, chemicals used as pesticides and
herbicides, inorganic chemicals, and radionu-
clides. Many of these chemicals are used for a
variety of purposes in the United States today.
Other chemicals, although not in use today,
were used extensively in the past and may still
be found  in the environment.  We include a
total of 185 chemicals from this source.
   EDSP chemicals. A variety of chemicals
have been found to disrupt the endocrine
systems of animals in laboratory studies, and
compelling evidence shows that endocrine
systems  of certain fish and wildlife have been
affected by chemical contaminants, resulting
in developmental and reproductive problems.
Based on this and other evidence, Congress
passed the Food Quality Protection Act of
1996, which  requires that the U.S. EPA test
for the potential estrogenic effects in humans.
Subsequently, a U.S. EPA advisory committee
recommended that this be expanded to include

-------
effects occurring via androgen and thyroid
mechanisms and potential for effects on eco-
logic species. We have included the 73 chemi-
cals that were listed to be screened under Tier 1
of the U.S. EPA EDSP (U.S. EPA 2007a).
    ToxCast phase I chemicals. ToxCast is a
U.S. EPA program  designed to apply HTS,
high-content screening and genomics tech-
niques to the screening and prioritization  of
environmental  chemicals (Dix et al. 2007).
Phase I of this program is screening 309
unique chemicals, most of which are pesti-
cide active ingredients. (One of the ToxCast
chemicals has  no  CASRN,  so we do not
include it in the analyses below.) This chemi-
cal listing is available for download from the
ToxCast or U.S. EPA Distributed Structure-
Searchable Toxicity Data Network (DSSTox)
websites (U.S. EPA 2008k, 2008o).
    Toxicology Reference Database. This is
a collection of summary in vivo toxicology
data, currently focused on pesticide active
ingredients. Data on pesticide actives are collected
and summarized from U.S. EPA OPP
data evaluation records (DERs),  which are
summaries  of guideline studies required
before approval of new pesticide active ingre-
dients. The  Toxicology Reference Database
(ToxRefDB) provides the  toxicology data
required to link in vitro assays from ToxCast
with in vivo  toxicity end points (Martin et al.
2008). ToxRefDB will eventually contain
information on most of the pesticide active
chemicals of ToxCast phase I and will later
expand to include toxicity data on additional
pesticide and  nonpesticide chemicals. The
current database contains information on 431
chemicals. In addition to data derived from
pesticide DERs, ToxRefDB  will contain data
from other primary in vivo toxicology sources.
    U.S. EPA Integrated Risk Information
System. The chemicals subject to
evaluation by the U.S. EPA Integrated Risk
Information System (IRIS) program make up
three major lists: the main U.S. EPA IRIS set
(U.S. EPA 2008f), for which evaluations are
currently available (535 chemicals); the U.S.
EPA IRIS nominations (U.S. EPA 2008g;
currently 20 chemicals nominated for inclu-
sion); and the U.S. EPA IRIS queue (U.S.
EPA 2008h), which are chemicals in queue to
have IRIS reports written (68 chemicals).
    Target collection summary. The total
number of chemicals (defined by unique
CASRN) in this set of collections comes to
9,912. Table  1  shows the overlap matrix
between these  target chemical lists.  The sum
of the number of chemicals in the individual
lists is 23,985. This number drops  to 9,912
once we remove overlaps.  For instance, 720
chemicals are on the U.S. EPA HPV and on
the U.S. EPA  OPPIN inert ingredients lists.
From the U.S. EPA CCL3 list, 29 of 92 are
also HPV chemicals. Interestingly, a few
pairs of lists do not overlap at all. For
example, neither the U.S. EPA CCL1 nor the
CCL2 list shares any chemicals with the
U.S. EPA GLNPO PBT list.
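
The overlap counts and the unique total described above can be reproduced from sets of CASRNs. The following Python sketch is illustrative only: the list names and memberships are toy placeholders rather than the actual list contents, and `overlap_matrix` is a hypothetical helper, not part of ACToR.

```python
from itertools import combinations


def overlap_matrix(lists):
    """Count shared CASRNs for every pair of lists.

    `lists` maps a list name to a set of CASRN strings. The diagonal
    entry (name, name) holds the size of that list.
    """
    names = sorted(lists)
    matrix = {(name, name): len(lists[name]) for name in names}
    for a, b in combinations(names, 2):
        shared = len(lists[a] & lists[b])
        matrix[(a, b)] = matrix[(b, a)] = shared  # the matrix is symmetric
    return matrix


# Toy memberships (assumed for illustration, not the real list contents).
lists = {
    "HPV": {"50-00-0", "67-64-1", "71-43-2"},
    "IRIS": {"50-00-0", "71-43-2", "7732-18-5"},
    "PBT": {"1746-01-6"},
}

matrix = overlap_matrix(lists)
sum_of_sizes = sum(len(members) for members in lists.values())
unique_total = len(set().union(*lists.values()))

print(matrix[("HPV", "IRIS")], sum_of_sizes, unique_total)
```

In the article's data the same two quantities are 23,985 (sum of list sizes) and 9,912 (unique CASRNs after overlaps are removed).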

Information Sources
The information that is available on the target
chemicals can  be divided  into several assay
categories. The sources for each of these types
of data are available online at
http://www.epa.gov/ncct/toxcast/.
Table 1. Numbers of chemicals that overlap between the screening target chemical collections.
Column numbers 1-23 refer to the numbered lists in the rows; "Total" is the number of chemicals in each list.

No  List                            Total     1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19    20    21    22    23
 1  EPA CCL1                           47    47    39    13    34    25     6     0    14    11    11     4    31     2     7    14    15     5    14     8     6    27    11    16
 2  EPA CCL2                           39    39    39    13    28    19     5     0    12    10    10     4    25     1     5    13    13     4    13     7     5    22    10    15
 3  EPA draft CCL3                     92    13    13    92    92    19     9     2    28    29    30     8    56     1    10    39    31    11    41    19    10    60    25    31
 4  EPA PCCL                          528    34    28    92   528    62    33    21    77   237   259    91   187     6    27   302   125    52   166   162   135   206    73    93
 5  EPA DWSHA                         200    25    19    19    62   200    29     4    69    61    60    26   176     6    40    77    63    23    69    55    33   130    43    59
 6  EPA EDSP 73                        73     6     5     9    33    29    73     1    12     8    11     6    57     1     3    12    64    15    66    15    12    44    56    66
 7  EPA GLNPO PBT                     429     0     0     2    21     4     1   429     8   109    75    37    22     2     4   194     3     2    12    43    39    20     4     7
 8  EPA HAPs                          185    14    12    28    77    69    12     8   185    92   101    27   144     3    36   122    24    15    43    68    43   173    15    21
 9  EPA HPV                         2,539    11    10    29   237    61     8   109    92 2,539 1,746   701   145    11    34 2,187   102    84   246   720   676   162    13    28
10  EPA HPV Challenge               1,973    11    10    30   259    60    11    75   101 1,746 1,973   703   147    10    37 1,759    77    60   212   612   567   166    11    25
11  EPA HPVIS                         992     4     4     8    91    26     6    37    27   701   703   992    54     6    12   747    37    34    81   268   250    58     8    15
12  EPA IRIS                          535    31    25    56   187   176    57    22   144   145   147    54   535    10    50   183   179    42   187   115    75   290   122   147
13  EPA IRIS nominations               20     2     1     1     6     6     1     2     3    11    10     6    10    20     0    13     2     2     4     6     5     9     1     2
14  EPA IRIS queue                     68     7     5    10    27    40     3     4    36    34    37    12    50     0    68    46     8     8    10    31    18    46     2     4
15  EPA IUR (2002)                  5,375    14    13    39   302    77    12   194   122 2,187 1,759   747   183    13    46 5,375   151   140   378 1,195 1,126   230    23    47
16  EPA OPPIN pesticide active        834    15    13    31   125    63    64     3    24   102    77    37   179     2     8   151   834   217   484   178   169   175   272   363
17  EPA OPPIN antimicrobial active    337     5     4    11    52    23    15     2    15    84    60    34    42     2     8   140   217   337   129   155   151    57    33    63
18  EPA OPPIN food use              1,320    14    13    41   166    69    66    12    43   246   212    81   187     4    10   378   484   129 1,320   744   724   169   239   300
19  EPA OPPIN inerts                3,532     8     7    19   162    55    15    43    68   720   612   268   115     6    31 1,195   178   155   744 3,532 3,183   136    22    35
20  EPA inerts nonfood              3,492     6     5    10   135    33    12    39    43   676   567   250    75     5    18 1,126   169   151   724 3,183 3,492    92    15    26
21  EPA TRI                           636    27    22    60   206   130    44    20   173   162   166    58   290     9    46   230   175    57   169   136    92   636   112   144
22  ToxCast phase I                   308    11    10    25    73    43    56     4    15    13    11     8   122     1     2    23   272    33   239    22    15   112   308   304
23  ToxRefDB                          431    16    15    31    93    59    66     7    21    28    25    15   147     2     4    47   363    63   300    35    26   144   304   431

-------
                                                                         The toxicity data landscape for environmental chemicals
    Chemical structures. We have compiled
structures for most of the defined compounds
(as opposed to mixtures) in the target lists.
For subsets of chemicals, structures have been
hand curated and quality reviewed as part
of the U.S. EPA DSSTox program (Richard
et al. 2008). We took the remaining structures
from a variety of sources, including PubChem
[National Center for  Biotechnology
Information (NCBI) 2008], the National
Cancer Institute's Chemical Structure Lookup
Service (National Cancer Institute 2008),
and the U.S.  EPA Substance Registry System
inventory. In many cases, structures were
derived from Simplified Molecular Input Line
Entry Specification (SMILES) codes (Daylight
Chemical Information Systems, Inc. 2008).
At present, we have chemical structures  for
7,099 of the 9,912 target chemicals. We lack
structure information for many chemicals
because many  substances on these lists  are
mixtures, sometimes relatively simple ones
for which representative structures could be
designated (e.g., "sulfuric acid, mono-C14-18-alkyl
esters, sodium salts"), and sometimes
very complex mixtures (agar, sesame oil).
    Physical-chemical properties. We used
the U.S. EPA's EPISuite (U.S. EPA 2007c) set
of programs to calculate physical-chemical
properties for a subset of chemicals. The
input to EPISuite is a list of SMILES codes.
We used several EPISuite programs, including
KOWWIN [estimates the logarithmic
octanol-water partition coefficient (logP,
also sometimes called log Kow) of organic
compounds (Meylan and Howard 1995)],
MPBPWIN [estimates the boiling point (at
760 mm Hg), melting point, and vapor pressure
of organic compounds (Stein and Brown
1994)], WATERNT (estimates the water solubility
of organic compounds at 25°C; Meylan
and Howard 1995), and WSKOWWIN
(estimates the water solubility of an organic
compound using the compound's log octanol-water
partition coefficient; Meylan et al.
1996). The properties we use are molecular
weight (MW), logP, boiling point, melting
point, vapor pressure, phase at 25°C, and
molar water solubility. EPISuite reports, and
we use, experimental values when available.
    Biochemical (in vitro or cell-based) assay
data.  For a subset of the chemicals of inter-
est, in vitro (biochemical) or cell-based assay
data are currently available. This can include
receptor binding, enzyme inhibition, or
cytotoxicity. The major sources of these data
are PubChem and the National Institute of
Mental Health's Psychoactive Drug Screening
Program Ki Database (Roth and Lopez 2008).
    In vivo toxicology assay data (tabular).  We
derived these data from  guideline (or equiva-
lent) toxicology studies  from which the pri-
mary or secondary data are available. For  our
purposes, the main sources of these primary data
are the National Toxicology Program (NTP),
U.S. EPA OPP (through ToxRefDB; Martin
et al. 2008), the U.S. EPA's HPVIS, and the
FDA. The FDA data we used here came from
the following databases: a) FDA Generally
Recognized as Safe list; b) FDA Cumulative
Estimated Daily  Intake/Acceptable Daily
Intake Database;  c) FDA Everything Added
to Food in the United States database; and
d) the FDA List of "Indirect" Additives Used
in Food Contact Substances. We compiled
our tabular primary data largely through the
ToxRefDB database (Martin et al. 2008) and
the DSSTox programs  (Richard et al. 2006).
HPVIS is a special case because it  includes
both primary and  secondary data, often pro-
vided in summary by sponsors, with data
derived either from the open literature or
from sponsor-derived study reports. The data-
base  captures so-called  "Robust Summaries."
Examples of secondary tabular  in vivo tox-
icity data are the Carcinogenic  Potency
Database (Gold et al. 2001),  U.S. EPA IRIS
reports,  National Library of Medicine (NLM)
TOXNET databases (Hazardous Substances
Data Bank and Chemical Carcinogenesis
Research Information System), and California
EPA. Data from a  number of  these secondary
sources  have been  tabulated and made avail-
able through the DSSTox program  (Richard
et al. 2007). Types of tabular information that
are captured in the DSSTox program include
high-level summary results such as  food-use
tolerances, LOAELs and NOAELs (lowest
and no  observed adverse effect levels), and
reference doses, as well  as highly detailed data
such as the per-animal or group-level  results of
toxicology studies. Cell-based genotoxicity is
currently captured under this category because
it  co-occurs with rodent carcinogenicity data
in current ACToR data  sources.
   In vivo toxicology text reports via URL.
Much of the publicly available in vivo toxi-
cology  data are  in the form  of narrative
reports from which detailed tabular data may
or may not have been extracted. Examples
are the  original NTP,  IRIS,  and Screening
Information Data Sets (SIDS)  reports, the
latter from the Organization for Economic
Cooperation  and Development (OECD)
HPV Programme. We also  included the
International Agency for Research on Cancer
(IARC)  and Agency for Toxic  Substances and
Disease Registry (ATSDR) study reports in
this set.  These reports contain quantitative and
categorical data, but for most of these sources,
the data provided are not easily extractable.
All of the studies  we used here are accessible
via the  Web.  Information can be extracted
from these reports on a case-by-case basis.
   In vivo toxicology summary calls. Several
sources  have made definitive calls concerning
particular modes of toxicity, for instance, label-
ing chemicals as being human carcinogens or
developmental toxicants. These calls are made
by experts using data from the detailed toxicity
reports described previously. Although the calls
are subject to debate by experts, they provide
a useful source of data for training prioritiza-
tion models. This information is typically cate-
gorical. Examples of summary calls are cancer
potential determinations of the California EPA
(2008), the NTP Report on Carcinogens (NTP
2008b), NTP Center for the Evaluation of
Risks to Human Reproduction (NTP 2008a),
and the U.S. EPA OPP cancer classifications
(U.S. EPA 2007b).
   Regulatory listings. By law, the U.S. EPA
and some state agencies maintain a number
of lists of chemicals that are of toxicologic
concern. The presence of a chemical on one of
these lists indicates that toxicity data are avail-
able. For the present analysis, we derived these
lists from the U.S. EPA  Substance Registry
System (U.S. EPA 2008j).
   Phenotypes. The categories above describe
the type of information in the data rather than
the disease or toxicology area it addresses.
Where possible, assays or data sources have also been
labeled by appropriate disease or toxicology
categories, which we call
"phenotypes." The set of phenotypes implemented
in ACToR spans traditional toxicology
study areas. The phenotypes we use
here are general hazard, carcinogenicity, genotoxicity,
developmental toxicity, reproductive
toxicity, and chronic  toxicity. Other toxic-
ity phenotypes are represented in ACToR,
but for small numbers of chemicals. Many
data sources, especially the toxicology sum-
mary  reports, contain information on mul-
tiple types of toxicity or  end points. In this
category, we have included only IRIS, NTP,
ToxRefDB, and U.S. EPA and OECD HPV
SIDS  reports because they can be assumed to
have covered a defined standard set of areas
of toxicity for most chemicals. "Hazard" is
a very broad phenotype category that can
include assays derived from  acute and sub-
chronic rodent studies at one end  or material
safety data sheets at the other. We  further
track information on food safety assessments,
as provided by the FDA  (FDA 2006, 2007,
2008). In addition, the U.S. EPA sets food-
use tolerances (or tolerance exemptions) for
a subset of pesticide ingredients.  There is a
significant overlap between chemicals regu-
lated by the U.S. EPA and those analyzed by
the FDA. It is obviously of great value to have
both positive and negative  toxicity informa-
tion for all of the phenotypes, and both types
were captured where they were available.
   Several reviews of the toxicology data land-
scape have described sources of data that are
included in ACToR. Yang et al. (2006a, 2006b)
have recently published two such  reviews. In
2001  and 2002, several review papers were
published surveying the landscape of toxicity

-------
data available on the Internet (Brinkhuis 2001;
Felsot 2002; Junghans et al. 2002; Patterson
et al. 2002; Polifka and Faustman 2002; Poore
et al. 2001; Richard and Williams 2003;
Russom 2002; Winter 2002; Wolfgang and
Johnson 2002; Young 2002).
    We provide a summary of the sources
of toxicology data we used in  this analysis,
available online at http://www.epa.gov/ncct/toxcast/.
In the simplest case, each toxicology
source is a single assay in the ACToR database.
(There are multiple exceptions;  e.g., DSSTox
and NTP each contribute multiple assays.) For
each assay, we list the short  name, a descrip-
tion, the institutional source, the number of
chemicals covered, the types of information
provided, and a URL. There were 22 sources
from which target screening chemicals were
taken, 47 sources of toxicology data, and
48 lists of chemicals covered by regulations.

Data Collection and Integration: ACToR
All of the data for this analysis are collected
in the  ACToR system (Judson et al. 2008;
U.S. EPA 2008a). The organizing principles
for the design of the chemical/assay system
are largely derived from the PubChem proj-
ect, which captures  chemical structure and
HTS information on millions of chemicals
in its role as the main data repository for the
NIH Molecular Libraries Roadmap (Austin
et al. 2004).  PubChem characterizes data
in terms of "substances" (the actual chemi-
cal  on  which one performs an experiment as
defined by  the  data source), "compounds"
(the idealized structures of chemicals), and
assays (data generated on substances). ACToR
collects these same three main  types of data:
substances,  indexed by substance identifier
(called the  SID); compounds  (i.e., chemi-
cal  structures) indexed by compound iden-
tifier (called the CID); and assays,  indexed
by  assay identifier (called the AID). A sub-
stance  is a single chemical  entity from one
data source and often corresponds to the
physical substance on which some experiment
was performed. A compound is a chemical
entity that corresponds to a unique chemical
structure. Because a substance is defined as
being specific to both data source and experiment,
many substances (SIDs) may map to a
single compound (CID). An assay, indexed
by AID, represents a specific type of test data
associated with one or more substances. In
ACToR, a substance is minimally characterized
by a data-collection-specific SID
and a chemical name. Most often, the substance
will also have synonyms, a CASRN,
and several other parameters. A compound
always has an associated chemical structure
and a data-collection-specific CID, in addition
to optional parameters derived directly
from chemical structures, such as SMILES
(Daylight Chemical Information Systems, Inc.
2008) and International Chemical Identifier
[International Union of Pure and Applied
Chemistry (IUPAC) 2008] linear chemical
structure representations and MW. Note
that because ACToR is in essence a "super-aggregator,"
pulling in large external data collections,
it also stores the source-labeled SIDs
and CIDs from each independent collection
(e.g., PubChem CID, DSSTox CID).
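
The substance/compound/assay model described above can be sketched as three record types. This is a minimal illustration in Python, not the ACToR schema (the article states that ACToR itself uses MySQL, Perl, and Java); all class names, field names, and example values are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Compound:
    """An idealized chemical structure, indexed by CID."""
    cid: int
    structure: str                      # assumed here to be a SMILES string
    molecular_weight: Optional[float] = None


@dataclass
class Substance:
    """A chemical entity as reported by one data source, indexed by SID."""
    sid: int
    name: str
    source: str                         # hypothetical source label
    casrn: Optional[str] = None
    synonyms: List[str] = field(default_factory=list)
    cid: Optional[int] = None           # link to a Compound, when a structure is known


@dataclass
class Assay:
    """A collection of data values on substances, indexed by AID."""
    aid: int
    name: str
    category: str
    phenotypes: List[str] = field(default_factory=list)


# Two source-specific substances that map to one compound.
benzene = Compound(cid=1, structure="c1ccccc1", molecular_weight=78.11)
substances = [
    Substance(sid=101, name="Benzene", source="source_A", casrn="71-43-2", cid=1),
    Substance(sid=202, name="Benzol", source="source_B", casrn="71-43-2", cid=1),
]
assay = Assay(aid=9, name="example_cancer_call", category="summary calls",
              phenotypes=["carcinogenicity"])

sids_for_compound = [s.sid for s in substances if s.cid == benzene.cid]
print(sids_for_compound)
```

The many-to-one link from `Substance.cid` to `Compound.cid` mirrors the statement that many SIDs may map to a single CID.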
    In ACToR, as in DSSTox, data on chemicals
across data collections are aggregated
using the concept of a generic chemical.
Because most environmental chemicals, along
with their related toxicity data, are indexed by
CASRN, which can be thought of as a source-independent
test SID, ACToR aggregates
information based on this identifier. A generic
chemical is defined by a CASRN, a preferred
name (typically a common name rather than
an IUPAC or other systematic name), and
an optional ACToR CID. Some sources (in
particular, the FDA and NTP) have provided
CASRN-like identifiers for some compounds,
and these are used in ACToR in place of the
CASRN. All data on all substances sharing
a particular CASRN are attached to the corresponding
generic chemical. In particular,
a generic chemical will  inherit all names
attached to substances with the corresponding
CASRN as synonyms.
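
A minimal sketch of this aggregation step, in Python, is given below. It is illustrative only: `build_generic_chemicals`, the dictionary keys, and the rule for choosing a preferred name (a supplied mapping, otherwise the first name encountered) are assumptions of this sketch and are not taken from ACToR. Substances without a CASRN are simply skipped here.

```python
from collections import defaultdict


def build_generic_chemicals(substances, preferred_names=None):
    """Group source-specific substances into generic chemicals keyed by CASRN.

    `substances` is an iterable of dicts with keys "sid", "name", "casrn".
    Every name attached to a substance becomes a synonym of the generic
    chemical that shares its CASRN.
    """
    preferred_names = preferred_names or {}
    grouped = defaultdict(list)
    for substance in substances:
        if substance.get("casrn"):          # skip records with no CASRN
            grouped[substance["casrn"]].append(substance)

    generic = {}
    for casrn, members in grouped.items():
        generic[casrn] = {
            "casrn": casrn,
            # Assumed rule: explicit mapping first, else first name seen.
            "preferred_name": preferred_names.get(casrn, members[0]["name"]),
            "synonyms": sorted({m["name"] for m in members}),
            "sids": [m["sid"] for m in members],
        }
    return generic


substances = [
    {"sid": 1, "name": "Formaldehyde", "casrn": "50-00-0"},
    {"sid": 2, "name": "Methanal", "casrn": "50-00-0"},
    {"sid": 3, "name": "Unspecified mixture", "casrn": None},
]
generic = build_generic_chemicals(substances)
print(generic["50-00-0"]["synonyms"])
```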
    In ACToR, an assay is a generic collec-
tion of data values associated with a set of
substances and (potentially) compounds (i.e.,
chemical structures). An assay has  a unique
AID, a name,  an assay category, and, option-
ally, one or more "phenotypes." Table 2 lists
the assay categories (major types of assays).
Assay phenotypes are linked  to high-level
classes of toxicity testing such as carcino-
genicity or reproductive or  developmental
toxicology. This allows quick searching of the
database to find all assays that pertain to that
high-level toxicology concept. The concept of
an assay as  implemented  in ACToR is pur-
posely broad so as to capture any information
potentially relevant to understanding toxicity
and evaluating risk for environmental chemi-
cals. An assay can also have one or more com-
ponents, which are separate data fields that
naturally fall together into an assay (e.g., the
binding constant  to  a receptor at  different
concentrations). Each component is defined
by an assay component identifier, the cor-
responding AID, a name, a description, units
(when applicable), and a data type (float, inte-
ger, categorical, text, Boolean, URL). The
actual data values are called assay results and
are linked to the assay, the assay component,
and the original data-collection-specific substance.
All of the data for an assay can be represented
as a table with one row per chemical
and one column per assay component.
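
The row-per-chemical, column-per-component view can be sketched by pivoting a flat list of assay results. The Python below is an illustration under stated assumptions: the function `assay_table`, the tuple layout, and the component names `ki_nM` and `active` are hypothetical and do not come from ACToR.

```python
def assay_table(results, components):
    """Pivot flat assay results into one row per substance.

    `results` is an iterable of (sid, component_name, value) tuples for a
    single assay; `components` lists that assay's component names. Missing
    values are left as None.
    """
    table = {}
    for sid, component, value in results:
        if component not in components:
            raise KeyError(f"unknown assay component: {component}")
        row = table.setdefault(sid, {name: None for name in components})
        row[component] = value
    return table


# Hypothetical receptor-binding assay with two components.
components = ["ki_nM", "active"]
results = [
    (101, "ki_nM", 12.5),
    (101, "active", True),
    (202, "active", False),
]
table = assay_table(results, components)
print(table[101], table[202])
```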
    To be included in ACToR, a data source
must meet several criteria: a)  data must  be
publicly available; b) information sources
must have a significant overlap with chemicals
of interest;  c) information must be indexed by
chemical, that is, available on a chemical-by-
chemical basis; and d) information must  be
indexed by CASRN (although data are also
included for substances having no assigned
CASRN). We do not require that data be peer
reviewed, although for the analysis we report
Table 2. Categories of assays in ACToR that are described in this analysis.
Assay category                               Description                                                         Examples
Physical-chemical                            Physical and chemical properties (in vitro and/or in silico)        LogP; boiling point
Biochemical                                  Biochemical (non-cell-based) (in vitro and/or in silico)            Enzyme inhibition constants; receptor binding constants
In vivo toxicology (tabular)                 Tabulated results from primary or secondary animal-based            Clinical chemistry; histopathology
                                              studies of chemical effect
In vivo toxicology (study listing primary)   Primary studies are available but have not been tabulated           Clinical chemistry; histopathology; developmental and reproductive assays
In vivo toxicology (summary calls)           Derived summary determinations of risk                              Chemicals determined to pose a defined risk of human cancer
In vivo toxicology (summary report via URL)  Links to text reports on the Web for which specific data values     Reports from U.S. EPA IRIS or NTP
                                              are not directly accessible in tabular form
Regulatory                                   Listings of chemicals that fall under specific environmental laws   TSCA
                                              or government mandates

-------
here, most of the data sources either have been
externally peer reviewed or, when from gov-
ernment agencies, have undergone extensive
internal review. Data entered into  ACToR
undergo a limited quality control  process.
Data are preferably taken from  sources  of
high-quality data, so our quality control is
limited to checking that the data are  correctly
transferred from the  source via a reformatting
and loading process into the ACToR data-
base. No checks are  made on the correctness
of the data from the original source. Each data
set  is manually spot-checked for gross issues
with reformatting. All CASRNs in the data-
base are checked to  be sure that they have a
proper checksum (Chemical Abstracts Service
2008).  (The checksum is the result of a par-
ticular formula performed on all but the final
digit of the CASRN. This result must match
the final digit.) All  data-handling tasks are
documented in standard operating procedures
to ensure consistency.
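
The article does not spell out the checksum formula. The Python sketch below uses the commonly documented CAS check-digit rule (weight the digits before the check digit 1, 2, 3, ... from right to left, sum the products, and take the result modulo 10); the function name and regular expression are assumptions of this sketch rather than ACToR code.

```python
import re

# Two to seven digits, two digits, then one check digit, separated by hyphens.
CASRN_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def casrn_is_valid(casrn: str) -> bool:
    """Return True if the string is well formed and its check digit matches."""
    match = CASRN_PATTERN.match(casrn.strip())
    if match is None:
        return False
    body = match.group(1) + match.group(2)      # all digits except the last
    check_digit = int(match.group(3))
    # The rightmost body digit gets weight 1, the next gets 2, and so on.
    weighted_sum = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(body), start=1)
    )
    return weighted_sum % 10 == check_digit


for candidate in ["7732-18-5", "50-00-0", "7732-18-4", "not-a-casrn"]:
    print(candidate, casrn_is_valid(candidate))
```

For water (7732-18-5) the weighted sum is 8·1 + 1·2 + 2·3 + 3·4 + 7·5 + 7·6 = 105, and 105 mod 10 = 5, which matches the final digit.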
    The ACToR database is implemented using
MySQL. Software to preprocess and load data
is written in Perl, and the Web interfaces are
written in Java. The use of 100% open-source
software allows the entire system to  be easily
distributed to other interested groups.  We used
the ACToR database version 2008Q2d for
all of the analyses in this article. Of the sub-
sets of data  sources in ACToR, only the ones
most relevant to toxicology are included in this
analysis and publication. ACToR is  available
online (http://actor.epa.gov).

Results
In vivo toxicology data. This section describes
the overlap between the target  chemical set
and the set  of toxicity data sources. Table 3
summarizes the overlap matrix. Each cell
gives the number of the 9,912 target chemicals,
and the corresponding percentage, that have
information for a specific category of data (e.g.,
tabular) and a particular phenotype (e.g.,
carcinogenicity). The last column gives the
number and percentage of chemicals for each
phenotype, regardless of the information cate-
gory. Chemicals are only counted once in any
cell, even if they have multiple data points  or
sources of data. Cells that list 0  indicate that
there were no corresponding data from any
source. The available toxicity data almost all
derive from animal studies, because essentially
no experimental human data are available.
However, some of the data are in the form
of human reference doses, or summary calls
of the form "this chemical is considered to
be a human carcinogen." These data points
were, of course, derived by extrapolating from
primary data on animals. Chemical hazard
has been evaluated for 5,810 (58.6%) of these
chemicals. Carcinogenicity potential for 2,579
(26%) of these chemicals has been evaluated
by at least one source. The genotoxic potential
of 2,724 (27.5%) of the chemicals has
been evaluated. A total of 2,862 (28.9%) of
the chemicals have their developmental toxicity
reported, and 1,081 (10.9%) have reproductive
toxicity data reported. Food safety
information (from one of the sources mentioned
above) is available for 2,258 (22.8%)
of the chemicals. Chemicals count in this table
whether they have positive or negative data for
toxicity for a particular phenotype. To date,
we have not systematically tabulated the relative
number of toxic and nontoxic indications
for all chemicals.
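
The counting rule for the coverage table (each chemical counted once per cell, plus an "Any" column per phenotype) can be sketched as follows. This Python example is illustrative: `coverage_cells`, the record layout, and the toy records are assumptions, and only the figures 5,810 and 9,912 in the final line come from the article.

```python
def coverage_cells(records, total_chemicals):
    """Return {(category, phenotype): (count, percent)} of unique chemicals.

    `records` is an iterable of (casrn, category, phenotype) tuples. A
    chemical is counted once per cell regardless of how many data points
    it has. The ("Any", phenotype) cell ignores the information category.
    """
    cells = {}
    for casrn, category, phenotype in records:
        cells.setdefault((category, phenotype), set()).add(casrn)
        cells.setdefault(("Any", phenotype), set()).add(casrn)
    return {
        key: (len(chemicals), round(100.0 * len(chemicals) / total_chemicals, 1))
        for key, chemicals in cells.items()
    }


# Toy records for a target set of four chemicals (assumed data).
records = [
    ("50-00-0", "Tabular", "Carcinogenicity"),
    ("50-00-0", "Tabular", "Carcinogenicity"),        # duplicate data point
    ("50-00-0", "Summary calls", "Carcinogenicity"),
    ("71-43-2", "Summary calls", "Carcinogenicity"),
]
cells = coverage_cells(records, total_chemicals=4)
print(cells[("Any", "Carcinogenicity")])

# The hazard figure reported in the text: 5,810 of 9,912 chemicals.
hazard_percent = round(100.0 * 5810 / 9912, 1)
print(hazard_percent)
```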
    Table 4 provides overlaps of the chemicals
of interest with more general information and
biological assays of potential interest. One or
more in vitro biochemical assays are available
for 781 (7.9%) of the chemicals. Most of these
are in vitro cytotoxicity assays in PubChem,
but they also include receptor binding and enzyme
inhibition data. A small number of the target
chemicals (234, or 2.4%) are naturally occurring
human metabolites, based on data from
the Human Metabolome Database (Wishart
et al. 2007).
    The highest-quality toxicity assessments,
based on guideline studies or on extensive
review of the literature, are U.S. EPA OPP
reviews (which are captured in the ToxRefDB
database), U.S. EPA IRIS assessments, NTP
studies, OECD SIDS guideline studies of HPV
chemicals, studies in the U.S. EPA HPVIS,
and assessments by the ATSDR and IARC.
From the current list, there are 431 (4.3%),
536 (5.4%), 1,168 (11.8%), 343 (3.5%),
992 (10%), 216 (2.2%), and 537 (5.4%)
Table 3. Summary of overlap between the target chemical list and the set of assay components.
Assay                   Tabular        Primary study listing  Summary calls  Summary report via URL  Any
Hazard                  4,454 (44.9)   0                      255 (2.6)      4,767 (48.1)            5,810 (58.6)
Carcinogenicity         1,211 (12.2)   401 (4.0)              726 (7.3)      2,035 (23.3)            2,579 (26)
Genotoxicity            2,496 (25.2)   1,102 (11.1)           32 (0.3)       1,047 (10.6)            2,724 (27.5)
Developmental toxicity  755 (7.6)      37 (0.4)               125 (1.3)      2,324 (23.4)            2,862 (28.9)
Reproductive toxicity   734 (7.4)      0                      31 (0.3)       396 (4)                 1,081 (10.9)
Food safety             1,692 (17.1)   0                      533 (5.4)      0                       2,258 (22.8)
chemicals in these respective sets (Table 4).
Looking across all of these data sources, 2,767
(27.9%) are covered by one or more of these
high-quality toxicology sources. Finally, a total
of 4,641 (46.8%) are currently subject to one
or more U.S. EPA regulations. These regulations
are available online
(http://www.epa.gov/ncct/toxcast/).
    Chemical categories. Both the U.S. HPV
Challenge and the OECD HPV programs
encourage the use of categories because of
the large number of chemicals being assessed.
Using a category approach, chemicals are evaluated
as a group, or category, rather than as individual
chemicals, and not every chemical needs
to be tested for every end point. The category
approach entails grouping chemicals with similar
structures, physical-chemical properties, fate
parameters, and toxicologic properties in order
to extrapolate toxicologic information from
tested chemicals and end points to untested
chemicals and end points. For most categories,
the number of chemicals with toxicology data
that could be used for model building is much
smaller than the total number of chemicals
included within the category.
    ACToR includes listings of chemical categories
taken from the U.S. EPA HPVIS and
from the OECD HPV Programme. From these
lists, a total of 1,274 (12.9%) chemicals are in
at least one category, and there are 256 unique
categories that include at least one of the target
chemicals. However, most of the categories in
HPVIS represent "proposals," which are currently
under review by the U.S. EPA, such that
the final number of categories and chemicals
assigned to them is subject to change. In addition,
the U.S. EPA is currently using chemical
clustering techniques with the goal of creating
chemical categories to facilitate hazard assessment
of MPV chemicals. The outcome of these
efforts will be included in ACToR in the future.
Information will also flow in the opposite direction;
that is, the data and information included
in ACToR will be useful in reviewing and refining
the U.S. EPA's HPV and MPV categories.
    Production volumes. An important component
of any prioritization program will be

                                                        Table 4. Coverage by specific data types and
                                                        sources.
Each cell provides the number and percentage of the 9,912 chemicals that have information for a specific
category of data (e.g., tabular) and a particular phenotype (e.g., carcinogenicity). The last column gives the number and
percentage of chemicals for each phenotype, regardless of the information category. Chemicals are only counted once in
any cell, even if they have multiple data points or sources of data. Cells with 0 indicate that there were no corresponding
data from any source.
Name                                             Total    Percent coverage
Biochemical                                        781         7.9
Human-metabolite                                   234         2.4
ToxRefDB                                           431         4.3
IRIS                                               536         5.4
NTP                                              1,168        11.8
SIDS                                               343         3.5
HPVIS                                              992        10.0
ATSDR                                              216         2.2
IARC                                               537         5.4
ToxRefDB, IRIS, NTP, SIDS, ATSDR, and/or IARC    2,767        27.9
Regulation                                       4,641        46.8
Environmental Health Perspectives • VOLUME 117 | NUMBER 5 | May 2009

-------
Judson et al.
an assessment of potential for exposure.  In
the absence of specific  information and for
screening and prioritization purposes, produc-
tion volumes are often used as a surrogate for
exposure potential. Table 5  lists counts for
each of the production  volume categories. A
total of 5,939 (59.9%) of the  target chemicals
have production volume information in the
2002 IUR.
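
The production volume categories of Table 5 can be illustrated with a small binning helper. This is only a sketch: the function name and the treatment of values that fall exactly on a boundary (upper bounds inclusive, as the ">500K-1M" style labels suggest) are assumptions, not something stated in the IUR or in this article.

```python
import bisect

# Upper bounds (lb/year) of every category except the open-ended last one.
UPPER_BOUNDS = [500_000, 1_000_000, 10_000_000, 50_000_000,
                100_000_000, 500_000_000, 1_000_000_000]
LABELS = ["<10K", "10K-500K", ">500K-1M", ">1M-10M", ">10M-50M",
          ">50M-100M", ">100M-500M", ">500M-1B", ">1B"]


def volume_category(lb_per_year):
    """Map a production volume in lb/year to a Table 5 category label."""
    if lb_per_year < 10_000:
        return LABELS[0]
    # bisect_left returns the index of the first bound >= value, so a value
    # equal to an upper bound stays in the lower category (assumed convention).
    return LABELS[1 + bisect.bisect_left(UPPER_BOUNDS, lb_per_year)]


print(volume_category(750_000))
```

Counting the labels returned for every chemical with IUR data would give a tally shaped like the Count column of Table 5.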
    Properties related to chemical structure.
Physical-chemical properties were calculated
using the EPISuite collection of programs,
which use chemical structure (in the form of a
SMILES string) as input. Of the 7,099 chemi-
cals for which structures  and SMILES data
were available,  EPISuite was able to process
5,857. The chemicals for  which calculations
could not be performed were mainly certain
types of salts, inorganic  compounds, organo-
metallics, or chemicals with nonstandard
SMILES.
    Several parameters will be  useful for deter-
mining whether a compound  can be bioavail-
able or whether it will be amenable to HTS
assays:  MW, logP, solubility,  and vapor pres-
sure. Typical ranges for  properties for chemi-
cals that can be tested using HTS methods are
MW < 500 Da, logP between 0  and 6, and
vapor pressure < 10 mm Hg (not volatile at room
temperature). Filtering  the larger list against
this set of criteria yields a set of 3,060 com-
pounds that are candidates for HTS testing.
One could produce slightly different lists,  of
course, by altering these  threshold values. The
primary requirements for use in an HTS assay
are that chemicals be soluble in dimethyl sul-
foxide  or water, that they be nonvolatile, and
that they be stable in solution.
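
The property filter described above can be sketched in a few lines. The thresholds are the ones quoted in the text; the record type, field names, and example chemicals are hypothetical, and treating the logP limits as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chemical:
    name: str              # hypothetical identifier
    mw: float              # molecular weight, Da
    logp: float            # calculated octanol-water partition coefficient
    vapor_pressure: float  # same units as the threshold quoted in the text


# Thresholds from the text; changing them yields a slightly different list.
MAX_MW = 500.0
LOGP_MIN, LOGP_MAX = 0.0, 6.0
MAX_VAPOR_PRESSURE = 10.0


def is_hts_candidate(c):
    """True if the chemical falls inside the typical HTS property ranges."""
    return (c.mw < MAX_MW
            and LOGP_MIN <= c.logp <= LOGP_MAX
            and c.vapor_pressure < MAX_VAPOR_PRESSURE)


chemicals = [
    Chemical("hypothetical-A", 250.0, 3.1, 0.001),
    Chemical("hypothetical-B", 720.0, 4.0, 0.0001),  # MW too high
    Chemical("hypothetical-C", 90.0, -1.2, 40.0),    # logP too low, volatile
]
candidates = [c.name for c in chemicals if is_hts_candidate(c)]
print(candidates)
```

Solubility and stability in solution are not covered by this sketch; they would need additional fields or experimental confirmation.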
    Figures  1  and 2 show distributions  of
MW and logP for the complete set of chemi-
cals with structures and for four representative
subsets of the larger data collection: HPV
chemicals, pesticide inert ingredients, pesti-
cide active ingredients, and the ToxCast phase
I collection. For MW, the main trend is that
the HPV and pesticide  inert  collections con-
tain significantly larger fractions of low-MW
chemicals (< 200 Da) than do the  pesticide
active  ingredients and  the ToxCast chemi-
cals. Given that most ToxCast phase I chemi-
cals are pesticide  active  ingredients and that
Table 5. Production volumes from the 2002 IUR.
Production volume (lb/year)    Count    Percent coverage
<10K                              11         0.1
10K-500K                       2,827        29.0
>500K-1M                         485         4.9
>1M-10M                        1,381        14.0
>10M-50M                         512         5.0
>50M-100M                        130         1.0
>100M-500M                       246         2.0
>500M-1B                          67         0.7
>1B                              280         3.0
Total                          5,939        60.0
this set was prefiltered for HTS suitability,
it is not surprising that this set has a smaller
fraction of high-MW chemicals (> 500 Da)
than do the other collections. Distributions
of logP are similar for all of the subsets except
for ToxCast, which is more tightly clustered,
with a peak between 0 and 2.
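
Distributions like those in Figures 1 and 2 are normalized so that the fractions for each data set sum to 1. A minimal sketch of that normalization follows; the bin edges and example values are hypothetical and are not the binning used for the published figures.

```python
def fraction_histogram(values, edges):
    """Fraction of values falling in each half-open bin [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


# Hypothetical molecular weights (Da) and bin edges.
mw_values = [120.0, 180.0, 250.0, 320.0, 610.0]
fractions = fraction_histogram(mw_values, [0, 200, 500, 1500])
print(fractions)
```

Because each set is normalized separately, collections of very different sizes (for example, the full list and the ToxCast phase I set) can be compared on the same axis.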

Discussion
In this article we describe and analyze a com-
pilation of chemical structures,  physical-
chemical properties, in vitro biochemical assay
data,  and in vivo toxicology data on a large
collection of chemicals of interest to the U.S.
EPA.  Most of these data are currently pub-
licly available but have not been  organized
previously in a unified manner that allows
for the analysis of large trends and simplified
review based on either chemical or assay axes.
The data we describe here are a subset of those
contained in the ACToR system being devel-
oped at the U.S. EPA to manage large collec-
tions of data on environmental chemicals.
   We have used the ACToR database to
characterize the state of toxicologic knowl-
edge on a subset of environmental chemicals
that are on a variety of lists of interest to the
U.S. EPA. This analysis is used to address the
extent of the perceived data gap on potentially
toxic chemicals. Although the picture is com-
plicated, some summary observations are pos-
sible. About two-thirds of the chemicals have
some  toxicology  information. The unique set
of chemicals in  Table 3  is 6,551 of 9,912
(66%). The alternative view is that many of
these  chemicals  remain largely uncharacter-
ized—a total of 3,361  (34%) chemicals have
no information in any  of the data sources we
used in this analysis. On the other hand, more
than one-quarter (27.9%) have been analyzed
in one or more high-quality and/or systematic
evaluation programs (NTP, IRIS, ToxRefDB,
U.S. EPA HPV, OECD SIDS,  IARC, and/
or ATSDR). Of the individual types of toxic-
ity (or end points) that have been tabulated,
carcinogenicity, genotoxicity, and develop-
mental and reproductive toxicity have been
most widely covered (26%, 27.5%,  28.9%,
and 10.9%, respectively).
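
The summary percentages in this paragraph follow directly from the counts and the 9,912-chemical total, as the short check below illustrates. The dictionary keys are descriptive labels chosen for this sketch.

```python
TOTAL_CHEMICALS = 9912

counts = {
    "some toxicology information": 6551,
    "no information in any source": 3361,
    "high-quality or systematic evaluation": 2767,
}


def percent(n, total=TOTAL_CHEMICALS):
    """Percentage of the target list, rounded to one decimal place."""
    return round(100 * n / total, 1)


for label, n in counts.items():
    print(f"{label}: {n} ({percent(n)}%)")
```

The first two counts partition the list (6,551 + 3,361 = 9,912), which is why their percentages are complementary.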
   One immediate application of this analysis
is to  select compounds for further screening in
programs such as ToxCast. ToxCast phase I
is using a set of compounds (primarily pesti-
cide  active  ingredients)  that are  amenable to
HTS and that have rich toxicologic data. The
outcome of the phase I analyses will be a set of
"signatures" that use  in vitro screening data as
inputs to predict in vivo toxicology phenotypes
with high enough sensitivity and  specificity to
be useful for prioritization  for more detailed
testing. Phase II needs to include compounds
that can be used to independently validate the
phase I signatures. Therefore, the phase II set of
chemicals should contain as many compounds
as possible with high-quality in vivo toxicology
data, have physical-chemical properties
that  make them candidates for HTS, and be
drawn from a more diverse collection than the
phase I chemicals to help define the chemi-
cal domain of applicability of the signatures.
We calculated the intersection of the set of
2,767 chemicals that have  data  from one of
the high-quality and/or systematic toxicol-
ogy data sources (NTP, IRIS, HPVIS, OPP/
ToxRefDB, OECD  SIDS, IARC, ATSDR)
with the set of 3,060 chemicals with reasonable
physicochemical properties. This yields a list of
1,308 candidate chemicals that have both high-
quality toxicity data and physicochemical prop-
erties very well suited  for HTS. After removing
the ToxCast phase I  chemicals, we arrived at
a list of 1,046 chemicals that are candidates
for inclusion in ToxCast phase  II for use in
validating ToxCast phase I findings across a
Abbreviations: B, billion; K, thousand; M, million.
Figure 1. Distribution of MW for representative chemical sets. The sum of fractions for each data set equals 1. [Bar chart; x-axis: MW; y-axis ticks from 0 to 0.45; series: ALL, HPV, Pesticide inerts, Pesticide actives, ToxCast.]

-------
                                                                          The toxicity data landscape for environmental  chemicals
variety of end points. Many of these chemicals
are currently being analyzed in a series of HTS
assays at the NIH Chemical Genomics Center
(NCGC) as part of the Tox21 partnership
between U.S. EPA,  NCGC, and NTP. These
Tox21  chemicals include an even broader
range of physicochemical properties, with a
MW range of 32 to 1,255 and a logP range
of -13.2 to 13.2. An important analysis that
is yet to be carried out is chemical structure
characterization and clustering for the ToxCast
phase I and II lists and the larger target list.
This will be important to help understand our
ability to extrapolate within and across chemical
structural classes.
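
The candidate selection described above (intersect the chemicals with high-quality toxicology data with those suitable for HTS, then remove the phase I chemicals) is a pair of set operations. The sketch below uses hypothetical identifiers; the real lists would be keyed by CASRN or another chemical identifier.

```python
def phase2_candidates(high_quality_tox, hts_suitable, phase1):
    """Chemicals with high-quality data AND HTS-suitable properties, minus phase I."""
    return (set(high_quality_tox) & set(hts_suitable)) - set(phase1)


# Hypothetical identifiers, for illustration only.
high_quality_tox = {"chem-1", "chem-2", "chem-3", "chem-4"}
hts_suitable = {"chem-2", "chem-3", "chem-4", "chem-5"}
phase1 = {"chem-4"}

result = phase2_candidates(high_quality_tox, hts_suitable, phase1)
print(sorted(result))
```

Applied to the sets in the text, the intersection step corresponds to the 1,308 chemicals and the subtraction step to the final 1,046.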
   ACToR is not alone in its goal of aggregat-
ing large sets of chemical structure and assay
data but is distinguished from other efforts
by its focus on toxicology and environmen-
tal chemicals and its goal of facilitating com-
putational analysis. PubChem (NCBI  2008)
is the largest effort currently available, with
information  on more than 10 million unique
chemical compounds. ChemSpider (2008)  is
an even larger chemical aggregation project but
does not house biological data or download-
able data sets. Another important compari-
son is with TOXNET, which is a collection of
multiple data sources covering many aspects of
chemical toxicity. TOXNET has a common
search engine that allows the user to easily find
data from multiple sources. However,  it  is a
closed system that does not allow a user to pull
together data sets that are useful for  compu-
tational purposes.  One unique aspect  of the
ACToR system is that it aggregates the data
from PubChem (focused on chemical structure
and  HTS in vitro assay data) and TOXNET
(NLM 2008) (focused on in vivo toxicology
data) and combines it in a way that it can be
used for computational analysis. eChemPortal
(OECD 2008) is an OECD effort very similar
to ACToR.  It mainly aggregates information
on HPV chemicals and pesticides. eChemPortal
currently contains links to seven large
database systems, some of which contain what
in ACToR are multiple individual databases
(e.g.,  INCHEM contains 11 individual data-
bases; International Programme on Chemical
Safety 2008). Unlike eChemPortal, which pro-
vides  links to Web pages for the component
databases, ACToR extracts tabular data from
a large number of sources and makes it search-
able by name, CASRN, or chemical structure.
A system called Vitic is being developed by
Lhasa Limited in collaboration between the
European Chemicals Agency's International
Uniform  Chemical Information Database
(IUCLID 2008) project and a number of phar-
maceutical companies, with the goal of being
an international toxicology information center
(Judson et al. 2005). In addition, the European
Substances Information System provides links
to a number of databases, including U.S. EPA
HPV, IUCLID, and European Inventory of
Existing Commercial Chemical Substances.
Finally, the Chemical Effects in Biological
Systems project at the National Institute of
Environmental Health Sciences is constructing
a multidomain information repository to hold
the detailed results and summaries of in vivo
and in vitro  toxicology experiments from NTP
studies, with particular emphasis on toxico-
genomics and microarray experiments (Waters
et al. 2008).
    To adequately characterize the toxicology
of all environmental chemicals  of potential
concern, we still face significant challenges.
Screening and prioritization approaches such as
ToxCast can make  significant headway in ana-
lyzing small organic and organometallic com-
pounds, for which most HTS methods have
been developed for use in the pharmaceutical
industry. Because of solubility and volatility
Figure 2. Distribution of calculated logP for representative chemical sets. The sum of fractions for each
data set equals 1.
issues, however, many exceptionally high- and
low-MW environmental compounds or highly
lipophilic compounds may require new screen-
ing methods. Of special interest are nano-
materials, which will require new standards for
description (e.g., size, shape, and composition)
and may require  entirely new approaches to
thinking about cellular and organism-level tox-
icity (Maynard et  al. 2006; Shaw et al. 2008).
One rarely has knowledge of metabolites that
can arise from a parent compound in vivo and
whether any of these metabolites are more or
less toxic than the parent. However, a number
of metabolic pathway databases and/or simula-
tors are currently available or under develop-
ment that could potentially be incorporated
into ACToR in the future. Finally, a large
number of known biological pathways (e.g.,
signaling and metabolism) have the potential
to lead to toxicity when significantly perturbed.
Many toxicity pathways have been implicated
in whole-animal end points, such as  liver can-
cer, and most chemicals can perturb multiple
candidate toxicity pathways. Gaining a predic-
tive and mechanistic understanding of chemi-
cal toxicity will require the ability to predict
which set of toxicity pathways are triggered by
individual chemicals.
   A significant amount of data on chemicals
is not currently accessible for modeling,  either
because it is not publicly available or because
it is  not yet extracted from primary reports
in a useful, tabular format. Several efforts are
under way at the U.S. EPA and other institu-
tions to extract, standardize, compile, and ana-
lyze such high-quality data (U.S. EPA 2008b).
We would welcome collaborations with other
groups  producing such tabular data sets on
these important classes of chemicals.

Conclusions
In this article, we  have described a process for
determining a set of environmental chemi-
cals with the highest need for hazard and risk
evaluation, which is based primarily on objec-
tive, simple measures of data availability.  In
addition, we have collected information  from
a large number of publicly available sources
to determine the  state of our current knowl-
edge of these chemicals.  The list we developed
includes HPV and MPV chemicals, pesticide
and antimicrobial active and inert ingredi-
ents, and potential air and drinking water
pollutants, in addition  to chemicals already
being evaluated by the U.S. EPA IRIS and
ToxCast programs. Although the input lists
are developed from the  perspective of regula-
tory and research needs of the U.S.  EPA, we
believe that our overall  conclusions  will have
wide applicability. This process resulted in
a collection of 9,912  unique chemicals. We
have at least limited hazard information on
approximately two-thirds of these and detailed
toxicology information on approximately

-------
one-quarter. The combination of chemical
structure and in vivo data on  this large range
of environmental chemicals  in ACToR can
facilitate structure-activity relationship  and
other types of trend analyses. These analyses
will have direct relevance to  U.S. EPA pro-
grams such as HPV  Challenge (U.S. EPA
2008m)  and the Chemical Assessment  and
Management Program (U.S. EPA 2008b).
    The principal reason for the lack of more
complete toxicity information  is the extremely
high cost for full evaluation  using standard
guideline animal studies, which is millions of
dollars per chemical. This has prompted the
call for the use of more cost-effective HTS
methods for quickly screening and prioritiz-
ing chemicals  for more detailed testing.  The
analysis presented here is a first step in such
a screening and prioritization process being
carried out at  the U.S. EPA as part of the
ToxCast  program. ToxCast is using hun-
dreds of in  vitro HTS assays  to assess poten-
tial  mechanisms through which  chemicals
could cause toxicity. This hazard prediction is
just one of several axes along  which potential
risk needs to be  evaluated. Chemicals need
to be evaluated for exposure potential,  and
for absorption, distribution, metabolism,
and excretion  (ADME) and pharmacokinet-
ics  properties.  Of special concern would be
compounds that are persistent or bioaccu-
mulative.  Researchers at Health Canada have
demonstrated  a process to  evaluate exposure
for many of these chemicals (Health Canada
2006). Chemical structure analysis  can be
used as part of the prioritization  process,
for instance, in predicting bioaccumulation
potential (Meylan et al.  1999; Weisbrod
et al. 2007) and fractional  absorption (Ekins
et al. 2007a, 2007b). Nonanimal experimen-
tal methods are available to approximate gut
absorption (Sun et al. 2008) and total hepatic
clearance (Naritomi et al. 2003). Reverse-
pharmacokinetic methods  (Brightman et al.
2006) can be used to predict oral doses  that
would be required to trigger  molecular pro-
cesses, for instance, based on half maximal
inhibitory concentrations (IC50) for receptor
binding from in vitro assays. These and
other related approaches are being considered
as part of the  overall ToxCast screening  and
prioritization process.  Of special relevance to
the ToxCast program, we have identified a set
of 1,046 candidate  chemicals that have  reli-
able in vivo toxicology data and have physico-
chemical properties that make them  suitable
for in vitro HTS analysis. These are candidates
for phase II of the ToxCast program, which
will be used to validate in vitro-to-in vivo  tox-
icity predictions, which are one outcome of
phase I of this program.
    Another important input to this process is
high-quality, tabular in vivo toxicity data. This
is required to  anchor our in  vitro-to-in  vivo
prediction models, in both the model build-
ing and model validation  phases. Initially,
we are making use of the results of guideline
toxicology studies  for pesticide active  ingredi-
ents,  which are being collected into the U.S.
EPA ToxRefDB (Martin et al. 2009). We are
expanding this data collation effort in coordi-
nation with the ACToR project. As already
described, ACToR is a  database  consisting
of information on environmental chemicals
from a wide number of sources.  However,
currently much of the high-quality toxicology
data  indexed in ACToR still resides in  text
reports and remains to be manually extracted
into tabular form.
    An important aspect of this program is
openness and transparency.  The ToxCast
program  is making  all of its data publicly
available. It has a large community of collaborators
from government labs, companies, and universities.
Finally, the Chemical Prioritization and Exposure
Communities of Practice are important open
venues for learning about this program and for
providing input (U.S. EPA 2008i). These are bringing
together representatives from U.S. EPA, state,
and other national environmental regulatory
organizations,  academic  labs, stakeholder
companies, and public interest groups, all of
whom are providing important input  as we
collectively work  to address  this  important
problem.  All of these efforts are  consistent
with achieving the goals  and  vision of the
recent NRC report Toxicity Testing in the 21st
Century (NRC 2007).

                  REFERENCES

Allanou R, Hansen B, van der Bilt Y. 1999. Public Availability of
    Data on EU High Production Volume Chemicals. Available:
    http://ecb.jrc.it/documents/Existing-Chemicals/PUBLIC_AVAILABILITY_OF_DATA/
    [accessed 8 August 2008].
Applegate J, Baer K. 2006. Strategies for  Closing the Data Gap.
   Available: http://www.progressivereform.org/articles/
    Closing_Data_Gaps_602.pdf [accessed 8 August 2008].
Austin  CP, Brady LS, Insel TR, Collins FS. 2004. NIH Molecular
    Libraries Initiative. Science 306(5699):1138-1139.
Barton HA, Pastoor TP, Baetcke K, Chambers JE, Diliberto J,
    Doerrer NG, et al. 2006. The acquisition and application
    of absorption, distribution, metabolism, and excretion
    (ADME) data in agricultural chemical safety assessments.
    Crit Rev Toxicol 36(1):9-35.
Birnbaum LS, Staskal DF, Diliberto JJ. 2003. Health effects of
    polybrominated dibenzo-p-dioxins (PBDDs) and dibenzofurans
    (PBDFs). Environ Int 29(6):855-860.
Brightman  FA,  Leahy  DE, Searle GE,  Thomas S.  2006.
   Application of a generic physiologically based  pharmaco-
    kinetic model to the  estimation of xenobiotic levels in rat
    plasma. Drug Metab Dispos 34(1):84-93.
Brinkhuis RP. 2001. Toxicology information from US government
    agencies. Toxicology 157(1-2):25-49.
California EPA. 2008. Chemicals Known to the State of California
   to Cause Cancer. Available: http://www.oehha.ca.gov/
    prop65/prop65_list/Newlist.html [accessed 8 August 2008].
Carmichael NG, Barton HA, Boobis AR, Cooper RL, Dellarco VL,
    Doerrer NG, et al. 2006. Agricultural chemical safety assess-
    ment: a multisector approach to the modernization of human
    safety requirements. Crit Rev Toxicol 36(1):1-7.
Chemical Abstracts Service. 2008. Check Digit Verification of
    CAS Registry Numbers. Available:  http://www.cas.org/
    expertise/cascontent/registry/checkdig.html  [accessed
   8 August 2008].
ChemSpider. 2008. ChemSpider: Building a Structure Centric
    Community for Chemists. Available: http://www.chemspider.
    com/ [accessed 8 August 2008].
Clean Air Act Amendments of 1990. 1990. Public Law 101-549.
Collins FS, Gray GM, Bucher JR. 2008. Toxicology. Transforming
    environmental health protection. Science 319(5865):906-907.
Daylight Chemical  Information Systems, Inc. 2008. SMILES
    (Simplified Molecular  Input Line Entry System). Available:
    http://www.daylight.com/dayhtml/doc/theory/theory.
    smiles.html [accessed 24 October 2008].
Dix DJ, Houck KA, Martin MT, Richard AM, Setzer RW, Kavlock RJ.
    2007. The ToxCast program for prioritizing toxicity testing of
    environmental chemicals. Toxicol Sci 95(1):5-12.
Ekins S, Mestres J, Testa B. 2007a. In  silico pharmacology
    for drug discovery: applications to targets and beyond.
    Br J Pharmacol 152(1):21-37.
Ekins S, Mestres J,  Testa B. 2007b. In silico pharmacology for
    drug discovery: methods for virtual  ligand screening and
    profiling. Br J Pharmacol 152(1):9-20.
EPCRA.  1986. Emergency  Planning and  Community Right-to-
    Know Act of 1986. 42 USC §11001 et seq. Public Law 103-337.
FDA (Food and  Drug  Administration). 2006. Select Committee on
    GRAS Substances (SCOGS) Database Overview. Available:
    http://www.cfsan.fda.gov/~dms/opascogs.html [accessed
    8 August 2008].
FDA (Food  and  Drug Administration). 2007. Cumulative
    Estimated  Daily Intake/Acceptable Daily Intake Database.
    Available: http://www.cfsan.fda.gov/~dms/opa-edi.html
    [accessed 8 August 2008].
FDA (Food and Drug Administration). 2008. EAFUS: A Food
    Additive  Database. Available: http://vm.cfsan.fda.
    gov/~dms/eafus.html [accessed 8 August 2008].
Felsot AS. 2002. Web resources for pesticide toxicology,
    environmental chemistry, and policy: a utilitarian
    perspective. Toxicology 173(1-2):153-166.
Food Quality Protection Act of 1996. 1996. Public Law 104-170.
Gold LS, Manley NB, Slone TH, Ward JM. 2001. Compendium of
    chemical carcinogens by target organ: results of chronic
    bioassays in rats, mice, hamsters, dogs, and monkeys.
    Toxicol Pathol 29(6):639-652.
Guth J, Denison R, Saas J. 2005. Background  Paper for Reform
    No. 5 of the Louisville Charter for Safer Chemicals:
    Require Comprehensive Safety Data for All Chemicals.
    Available: http://www.louisvillecharter.org/downloads/
    CharterBkgrdPaper5.pdf [accessed 8 August 2008].
Health Canada. 2006. Categorization of Substances on the
    Domestic  Substances List. Available: http://www.hc-sc.
    gc.ca/ewh-semt/contaminants/existsub/categor/index-eng.
    php [accessed 1 August 2008].
IPCS (International  Programme on  Chemical Safety).
    2008. INCHEM—Chemical Safety Information  from
    Intergovernmental Organizations. Available: http://www.
    inchem.org/ [accessed 8 August 2008].
IUCLID. 2008. IUCLID —International Uniform Chemical
    Information Database. Available: http://ecbwbiu5.jrc.it/
    [accessed 8 August 2008].
IUPAC (International Union of Pure and Applied Chemistry).
    2008. IUPAC—International Chemical Identifier. Available:
    http://old.iupac.org/projects/2000/2000-025-1-800.html
    [accessed 24 October 2008].
Judson PN, Cooke PA, Doerrer NG, Greene N,  Hanzlik RP, Hardy
    C, et al.  2005. Towards the creation of an  international toxi-
    cology information centre. Toxicology 213(1-2):117-128.
Judson  R, Richard  A, Dix D, Houck K, Elloumi F, Martin M,
    et al. 2008. ACToR—Aggregated Computational Toxicology
    Resource. Toxicol Appl Pharmacol 233:7-13.
Junghans TB, Sevin IF, Ionin B, Seifried H. 2002. Cancer information
    resources: digital and online sources. Toxicology
    173(1-2):13-34.
Martin MT,  Judson RS, Reif DM, Kavlock RJ, Dix DJ. 2009.
    Profiling chemicals based  on chronic toxicity results from
    the  U.S. EPA ToxRef Database.  Environ Health Perspect
    117:392-399.
Maynard AD, Aitken RJ, Butz T, Colvin V,  Donaldson K,
    Oberdorster G, et al. 2006. Safe handling of nanotechnology.
    Nature 444(7117):267-269.
Meylan W, Howard  P, Boethling R. 1996.  Improved method for
    estimating water solubility from octanol/water partition
    coefficient. Environ Toxicol Chem 15:100-106.
Meylan W, Howard  P, Boethling R. 1999.  Improved method for
    estimating bioconcentration/bioaccumulation factor from
    octanol/water partition coefficient. Environ Toxicol Chem
    18(4):664-672.
Meylan WM, Howard PH. 1995. Atom/fragment contribution

-------
    method for estimating octanol-water partition coefficients.
    J Pharm Sci 84(1):83-92.
Muir DC, Howard  PH. 2006. Are there other persistent organic
    pollutants? A challenge for environmental chemists.
    Environ Sci Technol 40(23):7157-7166.
Naritomi Y, Terashita S, Kagayama A, Sugiyama Y. 2003. Utility
    of hepatocytes in predicting drug metabolism: comparison
    of hepatic intrinsic clearance in rats and humans in vivo
    and in vitro. Drug Metab Dispos 31(5):580-588.
National Cancer  Institute. 2008. Chemical Structure Lookup
    Service. Available:  http://cactus.nci.nih.gov/cgi-bin/
    lookup/search [accessed 8 August 2008].
NCBI  (National Center for Biotechnology Information). 2008.
    PubChem. Available: http://pubchem.ncbi.nlm.nih.gov/
    [accessed 8 August 2008].
NLM  (National  Library of Medicine). 2008. TOXNET—
    Toxicology Data Network. Available: http://toxnet.nlm.nih.
    gov/ [accessed 8 August 2008].
NRC (National  Research Council).  2007. Toxicity Testing  in
    the 21st Century: A Vision and  a Strategy. Washington,
    DC:National Academies Press.
NTP (National  Toxicology Program). 2008a. Center for the
    Evaluation of Risks  to Human Reproduction (CERHR).
    Available: http://cerhr.niehs.nih.gov/chemicals/index.html
    [accessed 1 August 2008].
NTP (National Toxicology  Program). 2008b. Report on
    Carcinogens (RoC). Available: http://ntp.niehs.nih.
    gov/index.cfm?objectid=72016262-BDB7-CEBA-
    FA60E922B18C2540 [accessed 8 August 2008].
OECD. 2008. eChemPortal—the Global Portal to Information on
    Chemical Substances. Available: http://webnet3.oecd.org/
    echemportal/ [accessed 8 August 2008].
Patterson  J, Hakkinen PJ, Wullenweber AE. 2002. Human
    health risk assessment: selected Internet and World Wide
    Web resources. Toxicology 173(1-2):123-143.
Polifka JE, Faustman EM. 2002. Developmental toxicity: web
    resources for evaluating risk in humans. Toxicology
    173(1-2):35-65.
Pollution Prevention Act of 1990. 1990. Public Law 106-40.
Poore LM, King G, Stefanik K. 2001. Toxicology information
    resources at the Environmental Protection Agency.
    Toxicology 157(1-2):11-23.
REACH (Registration,  Evaluation, and Authorization  of
    Chemicals). 2008. European Chemical Agency—List  of
    Pre-registered  Substances. Available: http://apps.echa.
    europa.eu/preregistered/pre-registered-sub.aspx
    [accessed 27 October 2008].
Richard A, Judson R, Yang C. 2008.  Toxicity data informatics:
    supporting  a new paradigm for toxicity prediction. Toxicol
    Mech Methods 18:103-108.
Richard A, Williams C. 2003. Public  sources of mutagenicity
    and carcinogenicity  data: use in structure-activity rela-
    tionship models. In: QSARS of Mutagens and  Carcinogens
    (Benigni R, ed). New York:CRC Press, 145-173.
Richard AM, Gold LS, Nicklaus MC.  2006. Chemical structure
    indexing of toxicity data on the  Internet: moving toward a
    flat world. Curr Opin Drug Discov Devel 9(3):314-325.
Richard AM, Wolf MA, Burch J. 2007. DSSTox EPA Integrated
    Risk Information System (IRIS) Toxicity Review Data: SDF
    File and Documentation. Available: www.epa.gov/ncct/
    dsstox/ [accessed 8 August 2008].
Roth B, Lopez E. 2008. PDSP Ki Database. Available: http://pdsp.
    med.unc.edu/kidb.php [accessed 8 August 2008].
Russom CL. 2002. Mining environmental toxicology information:
    Web resources. Toxicology 173(1-2):75-88.
Shaw SY, Westly EC, Pittet MJ, Subramanian A, Schreiber SL,
    Weissleder R. 2008. Perturbational profiling of nanomaterial
    biologic activity. Proc Natl Acad Sci USA 105(21):7387-7392.
Stein S, Brown R. 1994. Estimation of normal boiling points from
    group contributions. J Chem Inf Comput Sci 34:1242-1250.
Sun H, Chow EC, Liu S, Du Y, Pang KS. 2008. The Caco-2 cell
    monolayer: usefulness and limitations. Expert Opin Drug
    Metab Toxicol 4(4):395-411.
TSCA (Toxic Substances Control Act of 1976). 1976. 15 USC
    §2601 et seq. Public Law 94-469.
U.S. EPA (Environmental Protection Agency).  1990.  The HPV
    Voluntary Challenge  Chemical List. Available: http://
    www.epa.gov/hpv/pubs/update/hpvchmlt.htm [accessed
    8 August 2008].
U.S. EPA (Environmental Protection Agency). 1998. Chemical
    Hazard  Data Availability Study. Available: http://www.epa.
    gov/hpv/pubs/general/hazchem.pdf [accessed 8 August
    2008].
U.S. EPA (Environmental Protection Agency). 2004a. Inventory
    Update Reporting. Available: www.epa.gov/opptintr/iur/
    [accessed 8 August 2008].
U.S. EPA (Environmental Protection Agency). 2004b. What Is
    the TSCA Chemical Substance Inventory? Available: www.
    epa.gov/oppt/newchems/pubs/invntory.htm [accessed
    8 August 2008].
U.S.  EPA (Environmental  Protection Agency). 2007a. EPA
    Endocrine Disruptor Screening Program (EDSP). Available:
    http://www.epa.gov/endo/ [accessed 8 August 2008].
U.S. EPA (Environmental Protection Agency). 2007b. EPA OPP
    List of Chemicals Evaluated for Carcinogenic Potential.
    Available: http://www.epa.gov/pesticides/carlist/
    [accessed 8 August 2008].
U.S.  EPA. 2007c. Estimation Programs Interface  Suite
    for Microsoft Windows, v3.20. Washington, DC:U.S.
    Environmental Protection Agency.
U.S.  EPA  (Environmental  Protection Agency). 2008a.
    ACToR  Online (Aggregated Computational Toxicology
    Resource).  Available: http://actor.epa.gov/actor
    [accessed 19 November 2008].
U.S. EPA (Environmental Protection Agency). 2008b. ChAMP—
    EPA OPPT Chemical Assessment and  Management
    Program. Available: http://www.epa.gov/oppt/ar/2007-2008/
    managing/champ.htm [accessed 8 August 2008].
U.S. EPA (Environmental Protection Agency). 2008c. Contaminant
    Candidate List 3 Chemicals:  Identifying  the Universe.
    Available: http://www.epa.gov/ogwdw/ccl/pdfs/report_
    ccl3_chemicals_universe.pdf [accessed 8 August 2008].
U.S.  EPA  (Environmental  Protection Agency). 2008d.
    Contaminant  Candidate List 3 Chemicals: Screening to a
    PCCL. Available: http://www.epa.gov/ogwdw/ccl/pdfs/
    report_ccl3_chemicals_screening.pdf [accessed 8 August
    2008].
U.S. EPA (Environmental Protection Agency). 2008e. Drinking
    Water  Contaminant  Candidate List and Regulatory
    Determinations. Available: http://www.epa.gov/ogwdw/
    ccl/index.html [accessed 8 August 2008].
U.S.  EPA (Environmental Protection Agency). 2008f. EPA
    Integrated Risk Information System. Available: http://cfpub.
    epa.gov/ncea/iris/ [accessed 24 October 2008].
U.S. EPA (Environmental Protection Agency). 2008g.  EPA IRIS
    Nominations. Available: http://www.epa.gov/fedrgstr/
    EPA-RESEARCH/2007/December/Day-21/r24844.htm
    [accessed 24 October 2008].
U.S. EPA (Environmental Protection Agency). 2008h. EPA IRIS
    Queue. Available:  http://www.epa.gov/fedrgstr/EPA-
    RESEARCH/2007/December/Day-21/r24844.htm [accessed
    24 October 2008].
U.S. EPA (Environmental Protection Agency). 2008i. EPA NCCT
    Communities of Practice. Available: http://www.epa.gov/
    ncct/practice_community/ [accessed 27 October 2008].
U.S. EPA (Environmental Protection Agency). 2008j. EPA
    Substance Registry System. Available: http://www.epa.
    gov/srs/ [accessed 8 August 2008].
U.S.  EPA (Environmental Protection Agency). 2008k. EPA
    ToxCast.  Available: http://www.epa.gov/ncct/toxcast/
    [accessed 8 August 2008].
U.S. EPA (Environmental Protection Agency). 2008l. High
    Production Volume Information System (HPVIS). Available:
    http://www.epa.gov/hpvis/index.html [accessed 8 August
    2008].
U.S.  EPA (Environmental  Protection Agency). 2008m. HPV
    Challenge. Available: http://www.epa.gov/HPV/ [accessed
    19 November 2008].
U.S.  EPA (Environmental  Protection Agency). 2008n. Inert
    Ingredients Permitted in Pesticide Product.  Available:
    http://www.epa.gov/opprd001/inerts/lists.html  [accessed
    8 August 2008].
U.S. EPA (Environmental Protection Agency). 2008o. TOXCST:
    Research Chemical Inventory for EPA's ToxCast Program:
    Structure-Index File. Available: http://www.epa.gov/ncct/
    dsstox/sdf_toxcst.html [accessed 8 August 2008].
U.S. EPA (Environmental Protection Agency). 2008p.  Toxics
    Release Inventory Program. Available: http://www.epa.
    gov/tri [accessed 8 August 2008].
U.S. EPA (Environmental  Protection Agency). 2008q. TSCA
    Inventory. Available: http://iaspub.epa.gov/srs/srs_proc_
    qry.navigate?P_REG_AUTH_ID=1&P_DATA_ID=169&P_
    VERSIONS [accessed 8 August 2008].
Waters  M,  Stasiewicz S, Merrick BA, Tomer K, Bushel  P,
    Paules R,  et al. 2008. CEBS—Chemical Effects in Biological
    Systems:  a public  data repository integrating study design
    and toxicity data  with microarray and proteomics data.
    Nucleic Acids Res 36:D892-D900.
Weisbrod AV, Burkhard LP, Arnot J, Mekenyan O, Howard PH,
    Russom C, et al. 2007. Workgroup report: review of fish
    bioaccumulation databases used to identify persistent, bio-
    accumulative, toxic substances. Environ  Health Perspect
    115:255-261.
Winter CK.  2002.  Electronic information  resources for food
    toxicology. Toxicology 173(1-2):89-96.
Wishart DS, Tzur D, Knox C, Eisner R, Guo AC, Young N, et al.
    2007. HMDB: the Human Metabolome Database. Nucleic
    Acids Res 35:D521-D526.
Wolfgang GH, Johnson  DE. 2002. Web resources for drug toxicity.
    Toxicology 173(1-2):67-74.
Yang C, Benz RD, Cheeseman MA. 2006a. Landscape of current
    toxicity databases and  database standards. Curr Opin
    Drug Discov Devel 9(1):124-133.
Yang C,  Richard AM, Cross KP. 2006b. The art of data  mining
    the  minefields of toxicity databases to link chemistry to
    biology. Curr Comput Aided Drug Dis 2(2):135-150.
Young RR. 2002. Genetic toxicology: Web resources. Toxicology
    173(1-2):103-121.
Environmental Health Perspectives • VOLUME 117 | NUMBER 5 | May 2009

-------
TOXICOLOGICAL SCIENCES 109(2), 358-371 (2009)
doi: 10.1093/toxsci/kfp061
Advance Access publication March 30, 2009
  Toward  a Public Toxicogenomics Capability for Supporting Predictive
   Toxicology: Survey  of Current Resources  and  Chemical Indexing of
                           Experiments in GEO and ArrayExpress

                       ClarLynda R. Williams-Devane,* Maritja A. Wolf,† and Ann M. Richard‡,1
 *U.S. EPA/Office of Research and Development (ORD)/National Health & Environmental Effects Research Laboratory (NHEERL), Research Triangle Park, NC
 27519; †Lockheed Martin (Contractor to U.S. EPA), Research Triangle Park, NC 27519; and ‡U.S. EPA/Office of Research and Development (ORD)/National
                            Center for Computational Toxicology (NCCT), Research Triangle Park, NC 27519
                                       Received January 18, 2009; accepted March 23, 2009
  A publicly available toxicogenomics capability for supporting
predictive toxicology and meta-analysis depends on availability of
gene expression data for chemical treatment scenarios, the ability
to locate and aggregate such information by chemical, and broad
data coverage within  chemical,  genomics, and  toxicological
information domains. This capability also depends  on  common
genomics standards, protocol description, and functional linkages
of diverse public Internet data resources. We present a survey of
public genomics resources from these vantage points and conclude
that, despite progress in many areas, the current  state  of  the
majority of public microarray databases is inadequate for support-
ing these objectives, particularly with regard to chemical indexing.
To  begin  to  address  these  inadequacies, we focus chemical
annotation efforts on experimental content contained in the two
primary  public genomic  resources:  ArrayExpress  and Gene
Expression Omnibus. Automated  scripts and extensive manual
review were employed to transform free-text experiment descrip-
tions into a standardized, chemically indexed inventory of experi-
ments in both resources.  These  files, which include  top-level
summary annotations, allow for identification of current chemical-
associated experimental content, as well as chemical-exposure-
related (or "Treatment") content  of greatest potential  value to
toxicogenomics investigation. With these chemical-index  files, it is
possible for the first time to  assess the breadth  and overlap of
chemical study space represented in these databases, and to begin
to assess the sufficiency of data with shared protocols for chemical
similarity  inferences. Chemical indexing  of public genomics
databases is a first  important step toward integrating chemical,
toxicological and genomics data into predictive toxicology.
  Key  Words:  microarray; chemical;  toxicogenomics;  toxicity;
prediction.


  Disclaimer: This  manuscript was approved by the U.S. EPA's National
Center for Computational Toxicology  for publication. However, the contents
do not necessarily reflect the views and policies of the EPA and mention of
trade names or  commercial products does  not constitute endorsement or
recommendation for use. Each of the authors declares no competing interests
pertaining to the present work.
  1 To whom correspondence should be addressed at Mail Drop D343-03, 109
TW Alexander Dr., U.S. Environmental Protection Agency, Research Triangle
Park, NC 27711. Fax: (919) 685-3263. E-mail: richard.ann@epa.gov.

Published by Oxford University Press 2009.
  Conventional  toxicology  investigates  cellular  and animal
responses to chemical treatment through  domain-specific bio-
assay studies (e.g., chronic, developmental), typically mapping
a single chemical  to  a toxicological endpoint.  Microarray
technologies, in contrast, detect  genome-wide perturbations
resulting from a chemical treatment, and  measure response
variables that probe a large number of genes and gene pathways
potentially underlying  multiple  toxicological  endpoints. A
typical toxicogenomics  experiment requires that  linkages be
established between these technologies, focusing on treatment-
related effects of one or a few chemicals and attempting to relate
gene expression  changes to a toxicological endpoint (Gomase
et al., 2008; Hamadeh et al., 2002; Hirabayashi and Inoue, 2002).
In silico toxicogenomic meta-analysis methods combine data
across existing toxicological and gene expression experiments to
generate new, and to confirm existing hypotheses of the effect of
a compound treatment.  Such a capability depends upon the
availability of gene expression data derived  from chemical
treatment scenarios, as  well as anchoring toxicology data to
support predictive inferences.
  The chemical  nature of the problem requires a standardized,
chemical-centric view of data at all levels.  Hence, a  publicly
available  toxicogenomics capability  sufficiently  robust for
mechanistic inferences and building predictive models requires
not only common data standards,  protocols, and the ability to
query and aggregate common data types across resources, but
also broad data coverage within, and linkages across chemical,
genomics  and  toxicological   information   domains.  These
requirements have, to varying degrees, informed development
of the major public microarray databases, and have been the
central design principle of specialized toxicogenomic resources
(Waters et al.,  2008).  In  recent years, there have also been
significant  advances in promoting  toxicology standards and
data models (i.e., controlled vocabulary and hierarchical data
organization),  quantitative high-throughput  screening,  and
chemically indexed bioassay data  that, taken as a whole, have

-------
                                     CHEMICAL INDEXING OF TOXICOGENOMICS RESOURCES
the potential to greatly enhance toxicogenomics capabilities in
the public domain (Dix et al., 2007; Martin et al., 2009;
Richard et al., 2008; Yang et al., 2006a, 2006b).
   In  the genomics field, the two largest public resources for
deposition  of microarray data,  approved by  the Microarray
Gene  Expression  Data  (MGED)  Society (http://www.mged.
org/),   are  the  European  Bioinformatics  Institute's  (EBI)
ArrayExpress   (http://www.ebi.ac.uk/arrayexpress)  and   the
National Center for Biotechnology Information's (NCBI) Gene
Expression  Omnibus  (GEO)  (www.ncbi.nlm.nih.gov/geo).
Publishing requirements for the deposition of raw or processed
microarray  data into these database repositories, coupled  with
MIAME (Minimum Information About a Microarray Experi-
ment) standards for data reporting, are increasing the compara-
bility, utility and breadth of these resources (Ball et al.,  2004).
Enhanced external programmatic  access to the major  public
microarray  data   repositories   also  allows  third  parties  to
automatically extract and reformulate data to enhance informat-
ics and data mining capabilities (Boyle, 2005; Ivliev et al., 2008;
Zhu  et al., 2008). Additional public  efforts are  aimed at
standardizing the description of experimental protocols (Taylor
et al., 2008), as well as improving toxicity data standards in
relation to toxicogenomics experiments (Burgoon, 2007; Fostel,
2008; Fostel et al., 2005, 2007).  Largely  neglected  in the
genomics  field, however,  has  been  the standardization  of
chemical information  associated with  the experimental  data
when   chemical treatment   is  a  primary objective  of  the
experiment. Such   annotation  is  essential for systematically
relating chemical property and effects information, irrespective
of whether the study has an explicit toxicological focus, across
the diverse data domains potentially contributing to toxicoge-
nomics. Furthermore, the ability to query, relate, and aggregate
information by  chemical and across chemical space is essential
to the  goal of chemical screening  and  toxicity assessment
(Dix et al., 2007; Richard et al., 2008; Yang et al., 2008).
   In the remainder of this paper, we broadly survey the current
state  of public  microarray  resources  from the above vantage
points, focusing particularly on  the  two primary resources,
ArrayExpress and  GEO. Although the latter resources are not
explicitly designed to meet the needs of the  toxicogenomics
community, they  currently serve as  the two largest  public
microarray   data  repositories  of potential  toxicogenomic
relevance  and,  as  such,  are potentially valuable sources  of
data for toxicogenomics study. Despite progress in many areas,
we find  the  present state  of public  microarray  repositories
inadequate  for  supporting interoperability and linkages  across
diverse data domains  in support of toxicogenomics. Particu-
larly  noteworthy is  the  lack of minimal chemical annotation
and, as a result, the effective isolation of these resources and
associated data from the growing inventories of chemically
indexed  bioassay   information  of  potential relevance  to
toxicology (Richard et al., 2006, 2008).
   To  begin to address  these  inadequacies, we propose and
implement  a  set of standard genomic  fields for indexing of
experimental study records,  aligned with  current  MIAME
guidelines, that enables cross-referencing and comparison of
total  experimental content  in  GEO and ArrayExpress.  In
addition, we implement a set of established chemical standards
for labeling experiments contained  within ArrayExpress  and
GEO, in collaboration with the U.S. Environmental Protection
Agency's  (EPA)  Distributed  Structure-Searchable  Toxicity
Database (DSSTox) project. We briefly  describe  the process
of  annotation and  creation of  public-distribution  DSSTox
chemical-index files for both GEO Series and ArrayExpress
Repository. These files enable, for the first time, assessment of
the chemical scope, diversity,  and coverage of experimental
content within  GEO  and ArrayExpress  of potential use for
toxicogenomics study.
                         METHODS

  For the purpose of assessing the relevance of public microarray resources to
toxicogenomics  and predictive  toxicology,  we considered  the current
annotation of experimental content pertaining to chemical treatment scenarios,
that is, cases in which study of gene expression changes induced by chemical
treatment constituted the primary goal of the experiment. As a measure of
interoperability between data resources, we examined the standardization of
terminology  and data accessibility, as well as  the formatting of data, paying
particular attention to specification of experimental protocols, such as animal/
tissue/cell treatment, RNA extraction, microarray preparation, data import/
export, and analysis. As a measure of chemical indexing, we examined the
degree of standardization and annotation pertaining to chemical-associated
experiments  across public genomics resources and, particularly,  whether the
chemical information was formally indexed, that is, contained in a separate,
searchable field.
  For the purpose of chemically indexing experimental content  in ArrayEx-
press and GEO, a chemical-exposure (or "Treatment") microarray experiment
is broadly defined by us as a study in which the cells, tissues, or whole
organisms were treated with a defined chemical, chemical mixture, or natural
substance (including biologics and proteins), RNA was extracted, and gene
expression changes resulting from this  treatment  were  investigated  with
microarray technologies. Whether the chemical  to which the  system was
exposed  is  a  known  toxicant,  potential toxicant,  natural substance, or
therapeutic agent need not be distinguished because the measured outcome is
the  same, that is,  treatment-related gene expression changes. However,
experiments  in which  chemical treatment was  secondary  to the primary
purpose of the experiment (e.g., treatment with prophylactic antibiotics for
maintaining tissue culture conditions) or where study of chemical-exposure-
induced effects was not the primary purpose of the experiment (e.g., treatment
with streptozocin  to induce Diabetes Mellitus  for  investigating the effects of
diabetes) required further annotation and review. These cases of chemical-
experiment associations were labeled by us to indicate the role of the chemical
as other than "Treatment."
  For initial inventory purposes, extraction of experimental description fields,
and locating  chemical-associated experiments within ArrayExpress and GEO,
we used available web search options and programmatic access  tools within
each system, as well as extensive manual review (a workflow diagram is
provided in  Supplemental Fig.  1; additional details of the methodology
employed here are publicly available—see Acknowledgments). For the present
study, GEO Series provides the most complete inventory of current
experiments  within GEO and these are also most closely aligned  with
ArrayExpress Repository experiments.  Hence, ArrayExpress Repository and
GEO Series  experiments  were the focus of the  present  chemical-indexing
efforts.

-------
                                               WILLIAMS-DEVANE, WOLF, AND RICHARD
  ArrayExpress.  Due to its large size (over 6300 experiments at the time of
data extraction), limited  and unstructured chemical annotation, and dynamic
content (updated regularly with new experiments), the review and annotation
process for ArrayExpress involved several iterative steps for identification and
characterization of chemical treatment experiments within the  main database
repository. Initially, a bulk download of all data housed in the repository from
the main web site (http://www.ebi.ac.uk/arrayexpress) was undertaken with
a wildcard query  in the accession number  query box (i.e.,  to retrieve all
experiments). The resulting records were individually reviewed and a preliminary
index of chemical information and <Indications of a Chemical
Exposure Record> was constructed. The latter field included any detail deemed
as potentially useful for  discerning whether a record  pertained to a chemical
treatment  experiment,  for example,  designations   in  the  ArrayExpress
 field, such as "dose," "treatment," etc.  This  pre-
liminary chemical index was used to identify chemical-associated and chemical
"treatment" experiments, to infer the minimum information necessary to
identify  such records  from within ArrayExpress,  and to build  and test an
automated indexing capability using custom Perl scripts (http://www.perl.com).
Through an iterative process, scripts were refined to achieve better success at
detecting "true" chemical treatment experiments, verified by manual review
according to our definition above. Perl scripts for keyword text searching and
filtering were combined with manual curation methods to construct ArrayEx-
press Repository  chemical-index files  from website  content downloaded on
September 20, 2008.
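
  The keyword text-searching and filtering step described above can be illustrated with a minimal sketch. The scripts used in this study were written in Perl and their search terms are not listed here; the Python example below is only an illustration, and the keyword list, function names, and record labels are assumptions rather than the actual implementation. Flagged records are candidates only and would still require manual review to confirm a "Treatment" role.

```python
import re

# Hypothetical keyword list (assumption): the actual search terms used in the
# study are not given in the text.
TREATMENT_KEYWORDS = ("dose", "treatment", "treated", "exposure", "exposed", "compound")

# Whole-word, case-insensitive match on any keyword.
_PATTERN = re.compile(r"\b(" + "|".join(TREATMENT_KEYWORDS) + r")\b", re.IGNORECASE)


def matched_keywords(description):
    """Return the sorted, lower-cased set of keywords found in a free-text description."""
    return sorted({m.group(1).lower() for m in _PATTERN.finditer(description)})


def flag_candidates(records):
    """Return accessions whose description mentions at least one keyword.

    `records` maps an accession label to its free-text description.
    The result is a candidate list for manual curation, not a final index.
    """
    return sorted(acc for acc, text in records.items() if _PATTERN.search(text))


if __name__ == "__main__":
    # Made-up records for illustration only.
    example = {
        "EXAMPLE-1": "Rats treated with compound X at three dose levels",
        "EXAMPLE-2": "Comparison of wild-type and mutant yeast strains",
    }
    for accession in flag_candidates(example):
        print(accession, matched_keywords(example[accession]))
```

  A whole-word match is used so that, for instance, "dose" does not fire on unrelated longer tokens; an iterative refinement like the one described above would adjust the keyword list against the manually verified records.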

  Gene  Expression  Omnibus.  GEO  contains user-deposited  dynamic
content,  and limited and unstructured chemical annotation. Hence, a manual
method similar to that employed in the review  of ArrayExpress was initially
required. All data were downloaded from the GEO homepage in the GSE Series
format. Each of the GEO Series was manually reviewed for chemical content
and this information was used to construct an index of the chemical information
and <Indications of a Chemical Exposure Record>. As in ArrayExpress, the
<Indications of a Chemical Exposure Record> field contained details to aid in
discerning whether a  record pertained  to a  chemical treatment  experiment.
From this chemical-experiment index, the first chemical annotation of GEO
was completed. Similar to the annotation of  ArrayExpress,  this manually
curated  chemical  index  was  used to  test  and  refine automated curation
approaches that were applied  to subsequent versions of the GEO  Series
inventory. Several automated methods  were developed using NCBI Entrez
Programming Utilities  (E-Utilities) (http://www.ncbi.nlm.nih.gov/entrez/query/
static/eutils_help.html), an XML version of the  U.S. National  Library of
Medicine's (NLM) chemical Medical Subject Headings (MeSH) library (http://
www.ncbi.nlm.nih.gov/sites/entrez?db=mesh), and  a series of custom  Perl
scripts to parse through a complete XML version of the GEO Series database.
The chemical index of GEO Series was completed using a series of Perl scripts
that call on E-Utilities, combined with manual curation methods, and was based
on content downloaded on September 20, 2008.
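
  As an illustration of how a script might call E-Utilities to locate GEO Series records mentioning a chemical, the sketch below builds an ESearch request URL. It is written in Python rather than Perl, it only constructs the URL (no network request is made), and several details are assumptions: the present-day E-Utilities base address, the use of the "gds" (GEO DataSets) database, the "gse[ETYP]" entry-type filter, and the helper names. It is not the query logic actually used in this study.

```python
from urllib.parse import urlencode

# Assumption: current public E-Utilities base address (the 2008 address cited
# in the text differs).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def series_query(chemical_name):
    """Build a search term for GEO Series records mentioning a chemical name.

    Assumption: 'gse[ETYP]' restricts hits in the GEO DataSets database to
    Series entries.
    """
    return f'"{chemical_name}" AND gse[ETYP]'


def build_esearch_url(term, db="gds", retmax=100, retstart=0):
    """Return an ESearch URL for the given term; the caller decides how to fetch it."""
    params = {"db": db, "term": term, "retmax": retmax, "retstart": retstart}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # "benzene" is an arbitrary example name.
    print(build_esearch_url(series_query("benzene")))
```

  Name-based queries of this kind inherit the problems noted for free-text chemical names (abbreviations, misspellings, synonyms), which is one reason automated retrieval was combined with manual curation.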

  Chemical index.   The main result of the above process was to produce
a static,  preliminary chemical  index for all  chemical-associated  microarray
experiments in ArrayExpress  and GEO, in  which the subset of  chemical
"treatment" experiments were identified. These preliminary index files took the
form of a list of minimal chemical identifiers (most often chemical names only)
directly extracted from the user-deposited information in these  two resources.
These chemical-experiment index files  subsequently underwent  a rigorous
cleanup  and chemical quality  review, using  source (submitter)-provided
chemical information  and contextual text descriptions to definitively identify
the chemical substance and its relationship to the experiment (e.g., treatment,
vehicle, reference). The  generally poor quality and consistency  of chemical
information contained in ArrayExpress  and  GEO   submitter-supplied  de-
scription  fields, the high frequency of abbreviations and spelling errors, and
the lack  of chemical identifiers such as Chemical Abstracts Service Registry
Numbers (CASRN; http://www.cas.org/) or EBI's Chemicals of Biological
Interest (ChEBI; http://www.ebi.ac.uk/chebi) identifiers, all prevented greater
application of automated text-mining and chemical name-to-structure conver-
   sion capabilities. In addition, the need to accurately discern the role of the
   chemical in the experiment (i.e., treatment, etc.) from the free-text description
   prevented use of more efficient automated methods.
     DSSTox Standard Chemical Fields were assigned to the chemical-index files
   according  to   established  procedures   (http://www.epa.gov/ncct/dsstox/
   ChemicalInfQAProcedures.html). These fields allow for standardized repre-
   sentation of both the test substance ("TestSubstance" fields) and the chemical
   structure  ("STRUCTURE" fields)  in  relation to  any  chemical-associated
   experiment record. DSSTox Standard Chemical Fields include chemical name,
   CASRN (if available),  and test  substance description (e.g., single chemical
   compound, macromolecule, mixture or formulation, etc).  Where  the test
   substance is not overly large (>  1800 amu) and can be reasonably represented
   by a single molecular structure,  "STRUCTURE" fields are provided. These
   include a  public standard, "molfile"  of the chemical structure  (a  two-
   dimensional projection of the three-dimensional  structure)  assigned to the
   substance,  several fields automatically derived  from the "molfile" structure
   (i.e.,  molecular weight,  formula, IUPAC name, SMILES, SMILES_Parent,
   InChI, InChIKey), chemical type (i.e., defined organic, inorganic, organome-
   tallic), and a field  indicating the relationship  of the STRUCTURE to the
   TestSubstance (i.e.,  tested chemical, representative isomer in mixture, active
   ingredient in a formulation, etc.) (for more information,  see http://www.epa.
   gov/ncct/dsstox/MoreonStandardChemFields.html). Assessment of chemical
   overlap between GEO and ArrayExpress DSSTox chemical-index files was
   determined on the basis of DSSTox TestSubstance identifiers.
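
     The overlap assessment described above amounts to a set comparison on test-substance identifiers. The following Python sketch is illustrative only: the index layout (experiment accession mapped to a set of identifiers), the function name, and the integer identifiers are assumptions and do not reproduce the DSSTox file format.

```python
def substance_overlap(geo_index, ae_index):
    """Compare two chemical-index mappings by test-substance identifier.

    Each argument maps an experiment accession to the set of test-substance
    identifiers associated with that experiment (hypothetical layout).
    Returns the identifiers shared by both resources and those unique to each.
    """
    geo_ids = set().union(*geo_index.values())
    ae_ids = set().union(*ae_index.values())
    return {
        "shared": geo_ids & ae_ids,
        "geo_only": geo_ids - ae_ids,
        "arrayexpress_only": ae_ids - geo_ids,
    }


if __name__ == "__main__":
    # Made-up accessions and identifiers for illustration only.
    geo = {"GSE_A": {101, 102}, "GSE_B": {102, 103}}
    ae = {"E_A": {103, 104}}
    result = substance_overlap(geo, ae)
    for key in ("shared", "geo_only", "arrayexpress_only"):
        print(key, sorted(result[key]))
```

     Comparing on a substance-level identifier rather than on submitted chemical names avoids counting synonyms or misspellings of the same substance as distinct chemicals.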
                               RESULTS

     Over 42 public Internet resources housing microarray data of
   potential  toxicogenomics  relevance  were  initially  identified
   from various categories  (Microarray World list  of databases,
   http://www.microarrayworld.com/DatabasePage.html).   From
   this  list,  we  identified  eight resources  containing chemical-
   exposure-related content,  and divided these  into two catego-
   ries:  primary  and  secondary  genomics  resources.  Primary
   genomics resources consist  of the three MIAME-supportive,
   MGED-approved gene expression repositories: NCBI's  GEO,
   EBI's ArrayExpress, and the Center for Information Biology
   Gene Expression (CIBEX) database (see Table 1 for listing of
   Sources,  URLs,  and references). Secondary genomics resour-
   ces  consist of five additional  public  genomics  resources of
   potential toxicogenomics relevance that  contain data gathered
   from chemical-exposure  experiments in one or more laborato-
   ries  (see Table 1 for listing of Sources, URLs, and references).
   A selection  of public cheminformatics  resources potentially
   useful  for  supporting  a  public toxicogenomics capability  are
   listed in  Supplemental Table 1. A brief description of survey
   results  is given for each data resource below, followed by
   chemical-indexing   results   for  the   two   major  resources,
   ArrayExpress and GEO.

   ArrayExpress Repository
     ArrayExpress  is the  largest user-depositor data repository
   and  MIAME-supportive public  archive  of  microarray data in
   Europe, consisting of two parts—ArrayExpress Repository and
   the ArrayExpress  Data Warehouse (Table 1). The  ArrayEx-
   press Repository currently exceeds  6900 experiments, and is

-------
                                                        TABLE 1
              Primary and Secondary Genomics Data Resources with Content of Potential Use for Toxicogenomics

Database      Source/URL                                   References                 Public data  Programmatic
                                                                                      deposition   access

Primary genomic resources
ArrayExpress  European Bioinformatics Institute (EBI);     Ball et al., 2004;         Yes          Yes (XML)
              www.ebi.ac.uk/microarray-as/ae/              Brazma et al., 2006;
                                                           Parkinson et al., 2007;
                                                           Rustici et al., 2008
GEO           National Center for Biotechnology            Barrett and Edgar, 2006;   Yes          Yes (E-Utilities)
              Information (NCBI), National Institutes      Barrett et al., 2007;
              of Health; www.ncbi.nlm.nih.gov/geo          Wheeler et al., 2008
CIBEX         DNA Data Bank of Japan (DDBJ), National      Tateno and Ikeo, 2004      Yes          No
              Institute of Genetics;
              http://cibex.nig.ac.jp/

Secondary genomic resources
EDGE          McArdle Laboratory for Cancer Research,      Hayes et al., 2005         No           No
              University of Wisconsin-Madison;
              http://edge.oncology.wisc.edu/edge3.php
CEBS          National Institute of Environmental Health   Fostel et al., 2005;       Yes          No
              Sciences (NIEHS);                            Waters et al., 2003, 2008
              http://cebs.niehs.nih.gov/cebs-browser/
PEPR          Center for Genetic Medicine Research;        Chen et al., 2004          No           No
              http://pepr.cnmcresearch.org/
dbZach        Department of Biochemistry & Molecular       Burgoon et al., 2006       No           No
              Biology, Michigan State University;
              http://dbzach.fst.msu.edu
CTD           Mount Desert Island Biological Laboratory;   Davis et al., 2009         No           No
              http://ctd.mdibl.org

indexed by Experiment, Array Design, and Protocol. Experi-
ments  can be queried by Keyword,  Experimental Accession
Number,  Species,  Experiment Type  and Factors, Author,
Laboratory,  and  Publication  information  (http://www.ebi.
ac.uk/microarray-as/aer/entry).  Repository  data are cataloged,
assessed for completeness, and assigned a MIAME  score that
represents the degree of MIAME compliance. The  ArrayEx-
press Data Warehouse is based on more limited processed data
results from the ArrayExpress Repository, currently contains
740  Expression Profiles (website accessed on November 14,
2008), and allows users to browse curated datasets from both
a gene-  and/or experiment-centric  view. ArrayExpress has
incorporated  significant  experimental   content  from  GEO,
which  can be located within ArrayExpress  by GEO Accession
Identifiers.
  At the  time  of survey, ArrayExpress was not chemically
indexed, nor did it  contain additional  information  about the
chemical tested other than the infrequently provided CASRN
or ChEBI number. Chemical information may be located in the
user-supplied protocols  and free-text experimental description,
or can be  searched with  the advanced query  tools  from
ArrayExpress,  including  a keyword  or  text  search in  the
Description field in "Query for Experiments."  These can also
be combined  with  specifications of   =
"compound treatment" or "dose response," but these latter
annotations are optionally utilized and not consistently applied
by depositors to all chemical treatment experiments in the
database. Chemical information can also be embedded within
the ArrayExpress Sample-Data Relationship File (http://
tab2mage.sourceforge.net/docs/sdrf.html).
                                In  2002, ArrayExpress introduced the Tox-MIAMExpress
                              data entry method,  optionally employed by data submitters to
                              store  toxicogenomics  data  in  an  effective manner  (http://
                              www.ebi.ac.uk/miamexpress/). Tox-MIAMExpress  was later
                              discontinued;  however, the ArrayExpress Accession Number
                              Code, TOXM,  designated  to identify experiments for this
                              purpose,  is still available  for use  when  requested by data
                              submitters. Currently, the optional TOXM label is assigned to
                              fewer than 25 experiments, but in these cases, typically more
                              chemical  identifier  information,  such  as  a CASRN  and/or
                              a ChEBI number,  is provided by  the submitter along with
                              additional  information recommended  by  the  MIAME/Tox
                              initiative (http://www.ebi.ac.uk/tox-miamexpress).
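
Where a submitter does supply a CASRN, the identifier can be screened before it is used for indexing. The following is a minimal sketch, not part of the survey methods, of the standard CAS Registry Number check-digit rule; the function name is illustrative only.

```python
import re

# A CASRN has 2-7 digits, a hyphen, 2 digits, a hyphen, and 1 check digit.
CASRN_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_casrn(text: str) -> bool:
    """Return True if text is a well-formed CASRN with a correct check digit."""
    match = CASRN_PATTERN.match(text.strip())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost digit
    # that precedes the check digit, then take the sum modulo 10.
    weighted_sum = sum(
        int(digit) * weight
        for weight, digit in enumerate(reversed(digits), start=1)
    )
    return weighted_sum % 10 == check_digit


if __name__ == "__main__":
    for candidate in ("50-28-2", "1746-01-6", "50-28-3"):
        print(candidate, is_valid_casrn(candidate))
```

A check of this kind catches transcription errors in a single digit, but it cannot confirm that the number belongs to the chemical named in the record.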

                              Gene Expression Omnibus
                                GEO  is  the  largest  user-depositor  data repository  and
                              MIAME-supportive  public  archive of microarray data in  the
                              U.S.  (Table  1), containing data from  approximately  10,000

-------
362
                                        WILLIAMS-DEVANE, WOLF, AND RICHARD
experiments at the time of this writing. In GEO, raw and/or
processed data can be exported through the ftp website as well
as through  the main GEO Series website.  User information,
however, is  entered using  a free-text format that is  sub-
sequently curated.  GEO allows for a wide range of informed
queries with the Preview/Index window, where users can select
data based on choices for each attribute of the experiment. The
GEO  repository  has  three  key  components:  "Platform,"
"Sample,"  and "Series."  "Platform" provides a description
of the  array used in the experiment,  as well as a data table
defining the array template. The data table contains hybridiza-
tion measurements for each element of the  corresponding
platform. "Sample"  provides a description of the biological
source and the experimental protocols.  "Series" defines a set of
related samples considered to be part of a study and describes
the overall  study  aim and  design.  GEO has a  complex,
hierarchical  structure that  works  with the  NCBI E-Utilities,
allowing one to query by  submitter, organism,  platform,
sample  type,  sample  titles,  and release  date.  Similar  to
ArrayExpress, GEO  hosts  a smaller warehouse-type addition
named "GEO Datasets and  Profiles" containing processed,
curated datasets that can be explored from both a gene- and/or
experiment-centric  view.
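
As an illustration of the programmatic route, the sketch below assembles an NCBI E-Utilities ESearch request against the GEO DataSets database. It only builds the URL and makes no network call. The search term, the use of the Organism and Entry Type field tags, and the function name are assumptions chosen for this example, not the queries used in the survey.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-Utilities ESearch service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_geo_series_query(keyword: str, organism: str, max_records: int = 20) -> str:
    """Build an ESearch URL for GEO Series records matching a keyword and organism."""
    # Field tags in square brackets restrict each part of the term to one index.
    term = f'{keyword} AND "{organism}"[Organism] AND gse[Entry Type]'
    params = {
        "db": "gds",          # GEO DataSets database
        "term": term,
        "retmax": max_records,
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_geo_series_query("estradiol", "Mus musculus", max_records=5))
```

Because the keyword is matched against free-text fields, a query of this form retrieves only those experiments in which the submitter happened to spell the chemical name the same way.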
  Also, similar to ArrayExpress, GEO is not chemically
indexed nor does it consistently contain information about the
chemical tested. Chemical  names  may be  located  in  the
submitter-deposited GEO Data Series fields—Title, Summary,
Citation, or Samples—and are not consistently present in any
single field. Chemical names are provided by the submitter, are
rarely accompanied by CASRN or ChEBI identifiers, and do
not undergo curation or review. Hence, as is the case with
ArrayExpress, there is no easy or reliable way to identify
a chemical-exposure-related experiment and no central listing
of chemical content. In both resources, we find that the
chemical names embedded within user-deposited description
fields are highly variable, prone to errors and misspellings,
and frequently incorporate nonstandard abbreviations.
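
To make the name-variability problem concrete, the following sketch normalizes a submitted name and looks it up in a small synonym table. The table entries, the normalization rules, and the function names are assumptions for illustration; they do not reproduce the curation procedure used to build the DSSTox files.

```python
import re
import unicodedata
from typing import Optional

# Illustrative synonym table (hypothetical entries, not a curated vocabulary).
SYNONYMS = {
    "estradiol": "estradiol",
    "17beta-estradiol": "estradiol",
    "oestradiol": "estradiol",
    "e2": "estradiol",
    "dexamethasone": "dexamethasone",
    "dex": "dexamethasone",
    "tcdd": "2,3,7,8-tetrachlorodibenzo-p-dioxin",
    "2,3,7,8-tcdd": "2,3,7,8-tetrachlorodibenzo-p-dioxin",
}


def normalize_name(raw: str) -> str:
    """Reduce a free-text chemical name to a lowercase, ASCII-friendly key."""
    text = unicodedata.normalize("NFKD", raw)
    text = text.replace("\u03b2", "beta").replace("\u03b1", "alpha")
    text = re.sub(r"\s+", " ", text.lower().strip())
    # Remove stray spaces around hyphens, e.g. "17beta - estradiol".
    return re.sub(r"\s*-\s*", "-", text)


def map_to_preferred(raw: str) -> Optional[str]:
    """Return the preferred name for a submitted name, or None if unknown."""
    return SYNONYMS.get(normalize_name(raw))


if __name__ == "__main__":
    for name in ("17\u03b2-Estradiol", "  E2 ", "estradoil"):
        print(repr(name), "->", map_to_preferred(name))
```

A misspelling such as "estradoil" falls through unmatched, which is why automated matching of this kind still requires manual review.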

Center for Information Biology Gene Expression
  The CIBEX database is a Japanese MIAME-supportive,
MGED-approved user-depositor system for gene expression data
(Table 1) that primarily serves experimenters from Asian
countries. It is included for completeness' sake, but currently
does not contain significant chemical treatment content.
However, the experimental protocol and detail standardization
are noteworthy, with each record accompanied by a document
containing full MIAME details. There is  also a high level of
curation and collaboration between CIBEX administrators and
depositors that allows for missing information to be identified
before publication,  as well as for a high level of standardization
and accuracy.
  At the time of this  writing, CIBEX contains 32 experiments,
only one of which is  a chemical-exposure experiment, with
CBX14 clearly labeled in the  field
  as "compound_treatment_design." Despite the high degree of
  standardization of this resource, however, there is currently no
  formal chemical  annotation field  accompanying  a chemical
  treatment experiment.

  Environment, Drug, Gene Expression Database
     The EDGE database (Table  1) is a closed (i.e.,  not open to
  public user-deposits of data), curated system designed for the
  comparison,  analysis  and  distribution  of  toxicogenomics
  information in a relational format. EDGE is chemical treatment
  centric and chemically indexed, with a toxicological focus. All
  experiments were performed in the Bradfield Laboratory using
  a  standardized protocol  involving  custom cDNA arrays  of
  minimally redundant hepatic clones, chosen through chemical-
  exposure experiments with prototype hepatic toxicants:
  2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), cobalt chloride, and
  phenobarbital. The experimental conditions include 22 chem-
  ical  treatments, 4 control treatments, and  1 environmental
  stressor (fasting)  over 1 mutant (circadian wild-type control).
  All chemical treatments were chosen for the express purpose of
  investigating  transcriptional profiles pertaining to hepatotoxic-
  ity in mice.
     Despite its small size and limited focus, EDGE incorporates
  a  high level of standardization  and comparability across
  species, array, experimental protocol, and  experimental details,
  and demonstrates how a fully relational database built on such
  data can  facilitate  toxicogenomics  investigation. However,
  EDGE is not a user-depositor  system and currently lacks the
  tissue, species, and chemical diversity necessary  for broader
  toxicogenomics exploration.

  Chemical Effects  in Biological Systems
     CEBS is a public user-depositor data repository  with  an
  explicit toxicological and  toxicogenomics focus  (Table  1).
  CEBS can accommodate study  design,  timeline,  clinical
  chemistry,  and histopathology  findings, as well as microarray
  and  proteomics data. Each experiment in CEBS pertains to
  a chemical/environmental exposure or a  genetic  alteration in
  reference  to  clinical or environmental  studies.  CEBS  has
  a  complementary functional component  known   as the Bio-
  medical Investigation Database (BID) (https://dir-apps.niehs.
  nih.gov/arc/), which  is  a relational database used  to load and
  curate study data  prior to exporting to public CEBS. BID also
  aids in the capture and display of novel data, including PCR
  and toxicogenomic-relevant fields, as used in ArrayExpress's
  TOXM designation.
     CEBS is currently indexed by study and subject character-
  istics, such as environmental, chemical, or genetic  stressor and
  stressor protocol,  and includes  observations on rat, mouse, and
  C. elegans. CEBS is one of the few genomics resources profiled
  in this survey, and the only resource with significant
  toxicogenomics-relevant microarray content, that incorporates
  formal chemical name annotation of experiments. At the time

-------
                                  CHEMICAL INDEXING OF TOXICOGENOMICS RESOURCES
                                                                                                                 363
of this  writing,  CEBS  lists  an inventory  of 136  chemical
names, or "chemical stressors," associated with experimental
content, along with a searchable CASRN field containing 121
entries. CEBS plans to incorporate additional chemical stand-
ards, including structure annotation, in collaboration with the
EPA DSSTox project.

Public Expression Profiling Resources
  Similar to EDGE, the PEPR database (Table 1) is a closed,
curated system designed to serve as a public resource of gene
expression profile data generated in the same laboratory, using
the  same chip type for three species, and subject to the same
quality and procedural controls. PEPR is aimed  at providing
a standardized warehouse for the analysis of time-series data.
The high degree of standardization within PEPR grants users
comparability  across arrays without laboratory and array bias,
much like EDGE. PEPR adheres to quality control and
standard operating procedures and is indexed by Principal
Investigator, Tissue type, Experiment, and Organism, but has
very few chemical treatment-related experiments and lacks
relational searching capabilities. However, the time-series
query analysis tool (SGQT) enables the novel generation  of
graphs and spreadsheets showing the action of any transcript of
interest  over  time.  PEPR  also differs from EDGE in  the
extensive  data export options that include raw image files
(.dat), processed image files (.cel) and interpretation files (.txt).
PEPR also has external links to GEO, where PEPR  data are
mirrored through an automated export/import process.
  In PEPR, chemical information is stored in free-text fields
such  as  the  title,  description, and array  titles, similar  to
ArrayExpress  and  GEO. At the time of this writing, PEPR
contains 72 experiments, of which 10 are  determined  to be
chemical/environmental  exposure  experiments. Hence, PEPR
currently covers  very  limited chemical space, but the SGQT
tool for  analysis of time-series microarray data, as well as the
standardized chemical-exposure experiments are of potential
value for toxicogenomics studies.

dbZach
  dbZach, a laboratory tool offered for local installation, is of
interest  as a modular  MIAME-compliant,  toxicogenomic-
supportive relational  database designed  to facilitate  data
integration, analysis, and  sharing  in support of mechanistic
toxicology and  toxicogenomics studies (Table  1).  dbZach
consists of several subsystems  for the standardization  of all
data elements of a  toxicogenomics experiment  as  well  as
traditional  toxicological experiments  and,  additionally, has
built-in functionality for data import and export of both raw
data and processed data. Similar to EDGE, the dbZach project
has created a sophisticated relational  data environment for
integrating and exploring many aspects of  a toxicogenomics
study. However, also similar to EDGE, this  project is very
narrowly focused in chemical space and primarily limited to
estrogen and estrogenic chemicals.
      Comparative Toxicogenomics Database
        The  CTD  is  worthy of  mention for its toxicogenomics
      relevance, but is not a traditional genomics database (Table 1).
      Rather,  it is a  database  of curated relationships between
      chemicals, genes, and  diseases mined  from journal  articles.
      CTD  provides  text-mineable  access  to the toxicogenomic
      literature,  but currently provides direct linkage to only one
      secondary genomics resource, that is, EDGE.
        Also worthy of note, CTD uses the chemical subset of the
      NLM MeSH vocabulary to provide formal chemical annotation
      of its content and to link to various chemically indexed
      toxicology resources (http://ctd.mdibl.org/resources.jsp?type=chem).
      The present CTD inventory of over 4400 chemical
      substances also has recently been deposited into the NCBI
      PubChem resource (http://pubchem.ncbi.nlm.nih.gov/; Supplemental
      Table 1) to offer structure-searchability and broader
      access to chemically indexed resources.
        Table 2 compares the above inventory of genomics resources
      from the standpoint of being chemically indexed (i.e., chemical
      identifiers are  required  and  entered   in  standard  fields),
      MIAME-supportive, and standardized with respect to various
      experimental descriptions. Additional details on the compari-
      son of the primary and secondary genomics resources identified
      in this study with respect to the types of gene expression data
      stored, toxicological focus, formats  of data available  for
      download (raw or processed), ability to query data, ability to
      import or export experimental data,  and programmatic access
      are presented in Supplemental Table 2.
        Web-based queries and programmatic access were  used in
      the present study to  extract current experimental content from
      ArrayExpress Repository and  GEO Series, and  to  identify
      corresponding  experiment annotation fields (resulting from
      adherence to MIAME guidelines in the two systems) that could
      be mapped to common  fields to enable comparisons across the
      two  inventories.  We  implemented  a   set  of  14 Standard
      Genomics Fields in  Table  3  to serve  this purpose and  to
      confer read-across capability between the two inventories. All
      but two of these fields map to existing MIAME-compliant data
      fields,  which  vary  only  slightly  in  name in  GEO  and
      ArrayExpress (see expanded columns in  Supplemental Table
      3) and, thus, are  straightforward to implement. One new field,
      "Experiment_URL," contains  a static URL link to enable
      outside Internet  access directly to the  experiment accession
      summary page in either ArrayExpress or GEO. The last field,
      "Chemical_StudyType," has no corresponding field in either
      ArrayExpress or GEO,  and was introduced  by us  to begin to
      address  the currently missing  chemical annotation layer for
      gene expression experiments in both resources.
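
The "Chemical_StudyType" field can be handled programmatically along the following lines. This is a sketch under stated assumptions: the allowed entries and the "AND" linkage text are taken from Table 3 and its footnote, the spelling "CombinationTreatment" follows the table footnotes, and the function names are hypothetical rather than part of any DSSTox tooling.

```python
from typing import Tuple

# Allowed entries for Chemical_StudyType, as listed in Table 3.
ALLOWED_ROLES = frozenset({
    "Reference",
    "Treatment",
    "Vehicle",
    "CombinationTreatment",
    "Media",
    "Not_Enough_Information",
})

# Roles that qualify a record for the "Treatment" chemical-index files.
TREATMENT_ROLES = frozenset({"Treatment", "CombinationTreatment"})


def parse_study_type(value: str) -> Tuple[str, ...]:
    """Split a value such as 'TreatmentANDReference' into its component roles."""
    roles = tuple(value.split("AND"))
    unknown = [role for role in roles if role not in ALLOWED_ROLES]
    if unknown:
        raise ValueError(f"unrecognized Chemical_StudyType entries: {unknown}")
    return roles


def is_treatment_record(value: str) -> bool:
    """Return True if any component role is a treatment role."""
    return any(role in TREATMENT_ROLES for role in parse_study_type(value))


if __name__ == "__main__":
    for value in ("Treatment", "TreatmentANDReference", "Vehicle"):
        print(value, parse_study_type(value), is_treatment_record(value))
```

Rejecting unrecognized entries, rather than passing them through, keeps the field a controlled vocabulary, which is the property that the free-text repository fields lack.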
                                  TABLE 2
          Standardization and Indexing of Genomics Data Resources

Column headings: Data resource (a); Indexed by chemical; MIAME-supportive;
  Standardized (b): Species, Array information, Experimental protocol,
  Experimental details; Allows relational searching (c)
Rows: ArrayExpress, GEO, CIBEX, EDGE, CEBS, PEPR, dbZach, CTD
[Cell entries are not recoverable from this copy; two cells read NA.]
  (a) Refer to Table 1 for full names, sources, URLs, and references associated with these data resources; feature present (+) or absent (-).
  (b) Standardized entries refer to internal content adhering to controlled vocabularies, and represented in defined and required fields; NA, not applicable.
  (c) Relational searching refers to the ability to construct AND/OR-type queries across the content of defined fields.

  DSSTox chemical-index files for GEO and ArrayExpress
created by the above methods are publicly available for
download from the DSSTox website (http://epa.gov/ncct/dsstox/).
In addition to the main DSSTox chemical-index files
that include one record for each unique chemical (i.e., unique
test substance) in the "Treatment" category in each of the two
repositories, we have published Auxiliary files that include
DSSTox Standard Chemical Fields, Standard Genomics Fields
(14), and additional Source-specific experiment description
fields (33 for ArrayExpress, 4 for GEO) for the full
chemical-associated experiment inventories in the two resources (i.e.,
one record for each chemical-experiment pair). Detailed
descriptions of the content of these files and their incorporation
into the DSSTox Structure-Browser
(http://www.epa.gov/dsstox_structurebrowser/) and PubChem, the results of which
enable structure-based Internet linkages directly to ArrayExpress
and GEO experiment summary pages, are provided
elsewhere (Williams-Devane et al., 2009).
                                                              TABLE 3
      Standard Genomics Fields for Common Indexing of Experiments Contained in ArrayExpress Repository and GEO Series

Field name                        Description
Experiment_Accession              A unique combination of informative prefix and number used to identify each dataset.
Experiment_AlternativeAccession   An alternate accession number (example: GEO files in ArrayExpress have a GSE#### (GEO Series)
                                  secondary accession number for users to find the same data in GEO).
Experiment_IdNumber               A unique identification number for each experiment within each database.
Experiment_Title                  The title of the experiment.
Experiment_Description            A free-text, user-submitted description of the experiment or dataset.
Experiment_URL                    URL links to the Source Experimental Download Page.
Experiment_PubMed_Information     A unique number that links users to each PubMed publication associated with each experiment or dataset.
Experiment_PublicationDate        Date indicating when the dataset was released to the public or published.
Species                           Species as listed by the user.
Number_Samples                    Number of samples used within a microarray experiment or dataset.
Experiment_ArrayAccession         An accession number for each array design or platform.
Experiment_ArrayType              Details about the platform used or details about data other than raw data that users have submitted.
Experiment_ArrayTitle             The user-submitted title of the Array/Platform used in the experiment.
Chemical_StudyType: (a)           A designation of the role of the identified chemical in the given experiment. Allowed entries are
                                  listed as Subsections to this field (e.g., Reference, Treatment, Vehicle, ...).
  Reference                       Chemical used to mimic a biological or environmental situation.
  Treatment                       The primary focus of experiment or study is to understand the transcriptomic effects of the chemical.
  Vehicle                         Chemical used to aid the administration of the treatment to the organism, such as dimethyl sulfoxide.
  CombinationTreatment            Multiple chemicals used together for treatment purposes (see "Treatment" above).
  Media                           Chemical used in maintenance of the tissue culture or sample conditions, such as phosphate buffered saline.
  Not_Enough_Information          Sufficient information is not present in the experimental description to determine the role of the chemical.
  (a) Subsections to the Chemical_StudyType field have allowed entries: Reference, Treatment, etc., with linkage text "AND" used for combinations (e.g.,
TreatmentANDReference).

                                                              TABLE 4
        Classification of Chemically Indexed Genomics Experiments in ArrayExpress and GEO by "Chemical_StudyType"

Chemical_StudyType Classification (e): breakdown of total no. of Chemical-Experiment Records (Unique Chemicals) (f) is given in the columns from Treatment through Other.

Database (a)             Total no. of     Total no. of         Total no. of   Treatment (g)  Reference  Vehicle   Media     Combination    Multiple         Other
                         Experiments (b)  Chemical-Experiment  Unique                                                       Treatment (g)  Classifications
                                          Records (c)          Chemicals (d)
ArrayExpress Repository  6346             2365                 1011           1609 (510)     266 (157)  138 (26)  111 (68)  109 (91)       118 (83)         14 (10)
GEO Series               9957             2381                 1064           1951 (538)     152 (60)   81 (48)   14 (14)   72 (38)        111 (67)         0 (0)
  (a) All numbers relate to database content extracted on September 20, 2008.
  (b) Total number of experiments contained in the public resource (also corresponds to the number of unique Accession IDs).
  (c) Number of Chemical-Experiment pairs extracted from the Total no. of Experiments prior to determination of the Chemical_StudyType Classification, where
some experiments in the Total no. of Experiments map to no chemicals, and some experiments involving multiple chemicals map to more than one
Chemical-Experiment record.
  (d) Total number of unique chemical test substances (i.e., no chemical test substance identity is duplicated) identified in the total group of Chemical-Experiment
records, irrespective of Chemical_StudyType Classification.
  (e) Definitions of Chemical_StudyType Classifications are provided in Table 3.
  (f) Number of Chemical-Experiment Records corresponding to each Chemical_StudyType category (with corresponding number of unique chemicals in
parentheses), where for the purposes of this table one record is assigned to one category and if the chemical is used for different purposes within one experiment
(e.g., TreatmentANDReference), it is assigned to the "Multiple Classifications" category.
  (g) Number of Chemical-Experiment Records (with corresponding number of unique chemicals in parentheses) out of the total group of Chemical-Experiment
Records that are associated with the "Treatment" category according to the criteria for a chemical-exposure scenario set forth in this paper; any record labeled as
"Treatment" or "CombinationTreatment" (alone or in combination with other Chemical_StudyType labels, e.g., TreatmentANDReference), are included in the
final DSSTox chemical-index file.

  Table 4 provides a breakdown of the current chemical-associated
experimental content within ArrayExpress Repository and GEO Series
according to all Chemical_StudyType
categories (all counts correspond to data extracted on September
20,  2008).  Of  the 6346  total  ArrayExpress  experimental
descriptions  downloaded,  more  than  a  third  (2365)  were
identified as chemical-associated experiments by the procedures
outlined in the Methods section, corresponding to 1011 unique
chemical test substances.  Similarly,  of the 9957 GEO  Series
experimental descriptions downloaded, nearly  a quarter (2381)
were identified as chemical-associated experiments, correspond-
ing to a total of 1064 unique chemical test substances (Table 4).
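
The proportions quoted above can be checked with a few lines of arithmetic. The sketch below uses the counts reported in Table 4; the variable and function names are illustrative. Note that the numerator counts chemical-experiment records, and an experiment involving several chemicals contributes more than one record, so the ratio is an approximate rather than exact share of experiments.

```python
# Counts transcribed from Table 4 (content extracted on September 20, 2008).
TABLE_4_COUNTS = {
    "ArrayExpress Repository": {"experiments": 6346, "chemical_records": 2365},
    "GEO Series": {"experiments": 9957, "chemical_records": 2381},
}


def chemical_share(experiments: int, chemical_records: int) -> float:
    """Return chemical-experiment records as a percentage of all experiments."""
    return 100.0 * chemical_records / experiments


if __name__ == "__main__":
    for name, counts in TABLE_4_COUNTS.items():
        print(f"{name}: {chemical_share(**counts):.1f}%")
```

The results, roughly 37% for ArrayExpress and 24% for GEO, agree with "more than a third" and "nearly a quarter."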
   Table 5  provides  a  breakdown of  the  "Treatment"-
associated  experimental  content within  the  ArrayExpress
Repository and GEO Series according to DSSTox  chemical
classification  categories.  Of  the  1835  total  "Treatment"-
associated  experiments in the  ArrayExpress Repository,  1282
experiments (or 70% of the total) are associated with a "defined
organic" chemical test  substance (note that  multiple experi-
ments  can  map to the same chemical).  GEO Series contains
a  similarly  high  percentage  of "Treatment"  experiments
associated  with a defined organic chemical test substance, that
is, 1544/2134, or 72%. The above indicators give a rough sense
of  the  size  of the  inventory  of  microarray  experiments
associated  with defined organics in the public domain.
   Table 5  also provides indications of the size of the chemical
space associated with these "Treatment" experiments. Of the
total number of unique  chemical test substances associated
with the "Treatment" category of experiments in ArrayExpress
Repository, 628/887, or 71%  correspond to defined  organics.
Although these include some drugs, small peptides and
biologics with molecular weights ranging from 600 to 1700 amu, the
majority (> 90%) are small molecular weight (< 500 amu)
organic  chemicals for  which  a  chemical structure  can  be
assigned and  that tend to be of greatest interest for environ-
mental toxicology and structure-activity relationship models and
inferences. A  similar percentage applies  to GEO, that is, 751/
1014, or 74% of unique chemical test substances associated with
"Treatment"  experiments  correspond   to  defined  organics.
Hence, both resources span a relatively large number of unique
defined  organic  chemicals,  which  implies  a broad  chemical
diversity associated with public microarray experiments. Within
ArrayExpress, the chemical that maps to the largest number of
chemical-experiments  is  "estradiol,"  occurring  in  53  experi-
ments, 44 of which are classified as "Treatment" experiments.
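
As a consistency check on Table 5, the sketch below confirms that the four chemical classification categories sum to the reported "Treatment" totals and recomputes the defined-organic percentages cited in the text. The data structure and function names are assumptions made for this illustration.

```python
# (records, unique chemicals) per DSSTox chemical classification, from Table 5.
TABLE_5_COUNTS = {
    "ArrayExpress Repository": {
        "No structure": (373, 179),
        "Defined organic": (1282, 628),
        "Inorganic": (153, 60),
        "Organometallic": (27, 20),
    },
    "GEO Series": {
        "No structure": (346, 173),
        "Defined organic": (1544, 751),
        "Inorganic": (210, 71),
        "Organometallic": (34, 19),
    },
}

# Reported totals: ("Treatment" records, unique chemicals).
TABLE_5_TOTALS = {
    "ArrayExpress Repository": (1835, 887),
    "GEO Series": (2134, 1014),
}


def column_sums(source: str) -> tuple:
    """Sum records and unique chemicals across all classification categories."""
    classes = TABLE_5_COUNTS[source].values()
    return sum(r for r, _ in classes), sum(c for _, c in classes)


def defined_organic_share(source: str) -> tuple:
    """Return defined-organic percentages of (records, unique chemicals)."""
    records, chemicals = TABLE_5_COUNTS[source]["Defined organic"]
    total_records, total_chemicals = TABLE_5_TOTALS[source]
    return (round(100 * records / total_records),
            round(100 * chemicals / total_chemicals))


if __name__ == "__main__":
    for source in TABLE_5_COUNTS:
        print(source, column_sums(source), defined_organic_share(source))
```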

Comparison of GEO and ArrayExpress Experimental and
  Chemical Content
  Application of DSSTox Standard Chemical Fields and the
set of 14 Standard Genomics Fields enables direct comparison
of GEO Series and ArrayExpress Repository experimental and
chemical content. In addition, the  DSSTox Auxiliary files for
ArrayExpress   include a number  of easily extracted  field
characteristics  affiliated  with each  experiment,  including
Array/Platform type, Species, and the MIAME Score  and its
five  subcategories:  Array or  Platform  information,  Factor
information, Raw Data information, Processed  Data informa-
tion,  and  Protocol information.   The latter  annotations  are
particularly valuable  for  assessing  the  sufficiency  of  the
experimental data for reanalysis.
                                                        TABLE 5
    Classification of Chemically Indexed "Treatment" Genomics Experiments in ArrayExpress and GEO by DSSTox Chemical
                                                       Classification

DSSTox Chemical Classification (e): breakdown for "Treatment" Chemical-Experiment Records (Unique Chemicals) (f) is given in the columns from No structure through Organometallic.

Database (a)             Total no. of         Total no. of         Total no. of   No structure (g)  Defined organic  Inorganic  Organometallic
                         Chemical-Experiment  "Treatment"          Unique
                         Records (b)          Chemical-Experiment  Chemicals (d)
                                              Records (c)
ArrayExpress repository  2365                 1835                 887            373 (179)         1282 (628)       153 (60)   27 (20)
GEO series               2381                 2134                 1014           346 (173)         1544 (751)       210 (71)   34 (19)
  (a) All numbers relate to database content extracted on September 20, 2008.
  (b) See Table 4.
  (c) Total number of Chemical-Experiment records assigned to any "Treatment" Chemical_StudyType (e.g., Treatment, CombinationTreatment,
Treatment&Reference, etc.) according to the criteria for a chemical-exposure scenario set forth in this paper.
  (d) Total number of unique chemical test substances (i.e., no chemical test substance identity is duplicated) identified in the total group of "Treatment"
Chemical-Experiment Records.
  (e) Refers to DSSTox Standard Chemical Field Definition and allowed entries for STRUCTURE_ChemicalType
(http://www.epa.gov/ncct/dsstox/CentralFieldDef.html#STRUCTURE_ChemicalType).
  (f) Number of "Treatment" Chemical-Experiment Records corresponding to each Chemical Classification category (with corresponding number of unique
chemical substances in parentheses), where each record maps to a single chemical classification and the list of unique chemicals for this "Treatment" subset of
experiments constitutes the final DSSTox structure-index file.
  (g) Number of "Treatment" Chemical-Experiment Records (with corresponding number of unique chemicals) where the chemical test substance is identified, but not
assigned to a DSSTox chemical structure; for example, this can be an undefined mixture, polymer, or macromolecule.

  The distribution of ArrayExpress "Treatment" chemical-experiments
assigned to these various categories of experimental
description is provided in Table 6. The distribution of MIAME
scores is particularly illuminating. A total MIAME Score of
5 indicates  that  all components  of  the  MIAME  compliance
criteria have been included by the submitter. Only 18% (or 216)
of the chemical treatment  experiments  in  the  ArrayExpress
Repository have  all five components of  information, whereas
50% (or 595) have four components of information. Most
noteworthy for this  subset of "Treatment" experiments, how-
ever,  raw data information is missing for 29% (or 347), pro-
cessed data is missing for 11% (or 131), and protocol is missing
for  21% (or 291)  (Table 6).  Given that  these are essential
experimental components for the reanalysis of gene expression
data,  these numbers limit the  number of chemical  treatment
experiments within  ArrayExpress that are potentially useful for
broader toxicogenomics investigation.
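
The scoring scheme can be expressed compactly as follows. This is a sketch only: the total is the sum of the five 0/1 subcomponent scores as defined in Table 6, whereas the `supports_reanalysis` criterion (raw data, processed data, and protocol all present) is an assumption drawn from the discussion above and not an ArrayExpress rule; the class and attribute names are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class MiameScore:
    """Five binary MIAME subcomponent scores (1 = submitted, 0 = not submitted)."""
    array: int
    factor: int
    raw_data: int
    processed_data: int
    protocol: int

    def __post_init__(self) -> None:
        # Each subcomponent may only take the value 0 or 1.
        for field in fields(self):
            if getattr(self, field.name) not in (0, 1):
                raise ValueError(f"{field.name} must be 0 or 1")

    @property
    def total(self) -> int:
        """Total MIAME score, ranging from 0 to 5."""
        return sum(getattr(self, field.name) for field in fields(self))

    def supports_reanalysis(self) -> bool:
        """Assumed criterion: raw data, processed data, and protocol all present."""
        return bool(self.raw_data and self.processed_data and self.protocol)


if __name__ == "__main__":
    record = MiameScore(array=1, factor=1, raw_data=0, processed_data=1, protocol=1)
    print(record.total, record.supports_reanalysis())
```

An experiment can therefore score 4 of 5 and still be unsuitable for reanalysis if the missing component is the raw data.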
  The ArrayExpress Repository has experienced steep growth in
the past few years, largely as a result  of the integration of GEO
experimental content (approximately 4500  experiments  were
added from January 2007 to January  2008). ArrayExpress files
with E-GEOD-XXXX accession numbers mirror GEO  Series
entries and currently represent more than 50% of the  chemical-
exposure, or  "Treatment"  experiments  in  the  ArrayExpress
Repository (Fig. 1).  Figure 1 also shows that the total number of
chemical-experiment pairs (a pair being a  1:1  mapping of
chemical to  experiment) and  total  number of "Treatment"
chemical-experiment pairs  identified  in  the  current  study  are
comparable between ArrayExpress and GEO, with  greater than
50% overlap of chemical-experiment pairs in all categories.
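
Mirrored GEO entries can be recognized from the accession string alone. The sketch below assumes the accession form E-XXXX-n, with a four-letter code such as GEOD or TOXM, and assumes that the number in an E-GEOD accession equals the GEO Series number; the function names are illustrative.

```python
import re
from typing import Optional

# Assumed ArrayExpress accession form: "E-", a four-letter code, "-", a number.
ACCESSION_PATTERN = re.compile(r"^E-([A-Z]{4})-(\d+)$")


def accession_code(accession: str) -> Optional[str]:
    """Return the four-letter code of an ArrayExpress accession, or None."""
    match = ACCESSION_PATTERN.match(accession.strip())
    return match.group(1) if match else None


def geo_series_id(accession: str) -> Optional[str]:
    """Return the GEO Series ID mirrored by an E-GEOD accession, or None."""
    match = ACCESSION_PATTERN.match(accession.strip())
    if match is None or match.group(1) != "GEOD":
        return None
    return f"GSE{match.group(2)}"


if __name__ == "__main__":
    for accession in ("E-GEOD-1234", "E-TOXM-16", "GSE1234"):
        print(accession, accession_code(accession), geo_series_id(accession))
```

Grouping chemical-experiment pairs by this derived Series ID is one way to estimate the overlap between the two inventories.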
  Unlike ArrayExpress,  GEO  Series  currently provides  no
MIAME scoring of content. However, because a significant
  portion of the "Treatment" experiments represented in GEO
  Series has been incorporated into the ArrayExpress Repository,
  it was possible to create a table summarizing these experimental
  factors for the subset of GEO chemical "Treatment" experi-
  ments contained within ArrayExpress (Table 6). Only 11% (or
  81) of the GEO records in ArrayExpress are assigned a MIAME
  Score of 5; however, 56% (or 415) have a MIAME score of 4. A
  much greater percentage, 45% (or 335) of GEO records in
  ArrayExpress, lack corresponding Raw Data, whereas 100% (or
  745) have Processed Data (most likely a precondition for
  inclusion of GEO experiments in ArrayExpress).
     Figure 2 presents  overlap  of the unique chemical content
  pertaining to  the  "Treatment" chemical-experiment category.
  Assessment of chemical overlap between GEO and ArrayExpress
  DSSTox  files was  determined  on the  basis  of  DSSTox
  "TestSubstance"  identifiers. The  steroids, estradiol and  dexa-
  methasone, are associated with the largest numbers of microarray
  experiments in both  cases, and the  largest  number  of  shared
  experiments  as well.  Other test  substances most commonly
  associated with experiments in either GEO or ArrayExpress
  include Ethanol, 2,3,7,8-TCDD, Retinoic Acid, and Trichostatin,
  each of which is of broad toxicological interest.
  Assessing Toxicological relevance of GEO and ArrayExpress
     Chemical Content
     DSSTox chemical structure annotation enables, for the first
  time, an examination of the chemical diversity and coverage of
  GEO Series and ArrayExpress  Repository experiments. We
                                       Previous
TOC

-------
                                   CHEMICAL INDEXING OF TOXICOGENOMICS RESOURCES
                                                                                                                     367
                         TABLE 6
  Characteristics of the ArrayExpress Repository pertaining to
    "Treatment" Experiments (based on Data Extracted on
                    September 20, 2008)

                                               Number (%) of "treatment" experiments^a
Characteristic             Major characteristic    ArrayExpress    GEO series from
                           value                                   ArrayExpress
Array/platform             Affymetrix              861 (73%)       691 (93%)
                           Agilent                 82 (7%)         54 (7%)
                           Other                   238 (20%)       0 (0%)
Species                    Homo sapiens            377 (32%)       264 (35%)
                           Mus musculus            317 (27%)       220 (30%)
                           Rattus                  173 (15%)       126 (17%)
                           Arabidopsis             159 (13%)       76 (10%)
                           Saccharomyces           55 (5%)         15 (2%)
                             cerevisiae
                           Other                   100 (8%)        44 (6%)
MIAMEScore_Total^b         5                       216 (18%)       81 (11%)
                           4                       595 (50%)       415 (56%)
                           3                       309 (26%)       211 (28%)
                           2                       55 (5%)         38 (5%)
                           1                       6 (1%)          0 (0%)
MIAMEScore_Array^c         0                       78 (7%)         0 (0%)
                           1                       1103 (93%)      745 (100%)
MIAMEScore_Factor^d        0                       551 (47%)       421 (57%)
                           1                       630 (53%)       324 (43%)
MIAMEScore_RawData^e       0                       347 (29%)       335 (45%)
                           1                       834 (71%)       410 (55%)
MIAMEScore_ProcessedData^f 0                       131 (11%)       0 (0%)
                           1                       1050 (89%)      745 (100%)
MIAMEScore_Protocol^g      0                       291 (21%)       195 (26%)
                           1                       890 (79%)       550 (74%)

  ^a Note that the total number of "Treatment" experiments (or studies) will be
less than the total number of "Treatment" chemical-experiment pairs in Table
5 due to inclusion of experiments/studies that have tested multiple chemicals
(and/or used multiple platforms, etc.).
  ^b The Total MIAME score ranges from 0 to 5 and is a sum of the independent
scores of the five subcomponent scores, each of which takes on the value of
either 0 or 1 (absent or present).
  ^c Specific information about the design of the array or the platform used was
submitted (1) or not submitted (0) with the experiment by the submitter.
Included Array information is assigned an Array Accession number (see
Experimental_Accession, Table 3) within ArrayExpress.
  ^d A list of experimental factors was submitted (1) or not submitted (0) with
the experiment by the submitter; factors might include information on the cell
line or particular compounds and dose information used in the experiment.
  ^e Raw data was submitted (1) or not submitted (0) with the experiment by the
submitter.
  ^f Processed data was submitted (1) or not submitted (0) with the experiment
by the submitter.
  ^g Specific information about the experimental protocols used in the
experiment was submitted (1) or not submitted (0) with the experiment by the
submitter. Included Protocol information is assigned a Protocol Accession
number within ArrayExpress.
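
The Total MIAME score defined in the table footnotes is a plain sum of five binary flags. The following is a minimal sketch of that arithmetic. The component names follow the MIAMEScore_* row labels of Table 6, while the function name and the example record are assumptions made for illustration.

```python
# Component names taken from the MIAMEScore_* rows of Table 6.
MIAME_COMPONENTS = ("Array", "Factor", "RawData", "ProcessedData", "Protocol")

def miame_total(subscores):
    """Sum the five binary MIAME subcomponent scores (each 0 or 1)."""
    total = 0
    for name in MIAME_COMPONENTS:
        value = subscores[name]
        if value not in (0, 1):
            raise ValueError(f"{name} must be 0 or 1, got {value!r}")
        total += value
    return total

# Hypothetical experiment: array design, raw data, and processed data
# were submitted; experimental factors and protocols were not.
example = {"Array": 1, "Factor": 0, "RawData": 1, "ProcessedData": 1, "Protocol": 0}
score = miame_total(example)
```

Because every component is weighted equally, two experiments with the same total can differ in which pieces of information are missing, which is why the table also reports each subcomponent separately.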

found significant numbers of experiments in both resources
mapped to families of similar chemicals, as well as to a broad
diversity of chemical structures, spanning a wide range of
toxicologically relevant chemical functional hierarchies and
classes (Supplemental Fig. 2).
  A further metric of toxicological relevance is provided by
the overlap of unique "Treatment" chemical substances in
GEO and ArrayExpress with the current published DSSTox
inventory, which includes more than 10,000 unique chemical
substances and spans a variety of environmentally and
toxicologically relevant chemical inventories and data sets
from various sources, including EPA, the National Toxicology
Program, and the U.S. Food and Drug Administration (Richard
et al., 2008). At the time of this survey, more than 550 unique
chemical substances in the DSSTox GEO and/or ArrayExpress
files (GEOGSE and ARYEXP) corresponding to "Treatment"
experiments are contained within one or more of the 11
previously published DSSTox Data Files
(http://www.epa.gov/ncct/dsstox/DataFiles.html), and there are
a total of 1294 overlapping instances (i.e., some chemicals
occur in multiple DSSTox Data Files). Of these overlapping
instances, three chemical substances (Bisphenol A,
di(2-ethylhexyl) phthalate, and dibutylphthalate) occur in
eight DSSTox Data Files, and a total of 309 chemical
substances occur in two or more DSSTox Data Files. These
numbers indicate that significant numbers of GEO and
ArrayExpress "Treatment" chemical-experiments correspond to
chemicals of potential toxicological concern, for which
additional in vitro or in vivo data may exist.
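
The three tallies reported above (unique overlapping substances, total overlapping instances, and substances occurring in two or more data files) can be derived from one membership count. The following is a minimal sketch, assuming each data file is available as a set of substance names. The file names and memberships are invented and do not reproduce the published DSSTox Data Files.

```python
from collections import Counter

def file_membership_counts(treatment_substances, data_files):
    """Count, per treatment substance, how many data files contain it."""
    counts = Counter()
    for members in data_files.values():
        for substance in treatment_substances & members:
            counts[substance] += 1
    return counts

# Hypothetical inputs: a set of "Treatment" substances and three data files.
treatment = {"bisphenol A", "estradiol", "ethanol", "trichostatin"}
data_files = {
    "FILE_A": {"bisphenol A", "estradiol"},
    "FILE_B": {"bisphenol A", "ethanol", "caffeine"},
    "FILE_C": {"bisphenol A", "estradiol"},
}

counts = file_membership_counts(treatment, data_files)
unique_overlapping = len(counts)              # substances found in at least one file
overlapping_instances = sum(counts.values())  # one instance per (substance, file) hit
in_two_or_more = sorted(s for s, n in counts.items() if n >= 2)
```

The gap between the number of unique substances and the number of instances is a direct measure of how often the same chemical recurs across inventories.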
                           DISCUSSION

   The term "chemogenomics" has been proposed to more
generally encompass the overlap of genomics technologies
with treatment-related chemical effects on biological systems,
including both toxicity-related and therapeutic effects (Fielden
and Kolaja, 2006). Chemogenomics adds a top-most chemical
layer to data organization, with broad chemical coverage of
standardized-protocol experiments a key requirement for
discerning activity patterns that can be confidently extrapolated
across chemical space. This approach and its implementation
are perhaps best exemplified by the Iconix DrugMatrix®
database and applications (Ganter et al., 2005). The Iconix
database consists of data generated for a single species (rat),
treated by more than 600 compounds in seven tissue types,
representing upwards of 3200 different drug-dose-time-tissue
combinations. The database covers five different domains of
data: microarray, clinical chemistry, hematology, organ weight,
and histopathology, and was built using a common microarray
platform and stringent experimental protocols and standards for
data generation and processing.
   Whereas the Iconix database represents an ideal, practically
speaking, it is far removed from the reality of a public
microarray resource, upon which most public toxicogenomics
investigations must rely. In their role as primary repositories of
data associated with the published scientific literature, public
microarray data repositories such as GEO and ArrayExpress

-------
                                           WILLIAMS-DEVANE, WOLF, AND RICHARD
  [Figure 1: bar chart in three panels (Total Experiments; Total Chemical-Experiment Pairs; Total "Treatment" Chemical-Experiment Pairs), each comparing GEO Series, ArrayExpress Repository, and Overlapping Content. Bar labels surviving extraction, in order of appearance: 9957, 6346, 3713; 2381, 2365, 1374; 2134, 1075.]
  FIG. 1.  Comparison of numbers of GEO Series and ArrayExpress Repository experiments, chemical-experiment pairs, and "Treatment" chemical-experiment
pairs, also showing overlapping content between the two systems; refer to totals and legends in Tables 4 and 5 (based on data extracted September 20, 2008).
cannot limit their content to include only experiments adhering
to strict common protocol standards and traditional model
organisms. A public data resource can, however, strive for
completeness and accuracy of experimental annotations and to
provide user-access to raw data for reanalysis. Similarly, the
accurate identification of a chemical in relation to an
experiment, particularly where the primary purpose of the
experiment is to discern effects of the chemical on a biological
system, should be considered as primary experimental
annotation and absolutely essential to experimental
reproducibility. Whereas standardization and chemical indexing of
microarray experiments at the time of data deposition and
publication is the ideal, if minimally sufficient information (i.e.,
a valid chemical name, along with specification of the purpose
  FIG. 2.  Comparison of the total sets of unique chemicals pertaining to Treatment Chemical-Experiment pairs in ArrayExpress Repository and GEO Series
from the DSSTox data files; shown in each section are the chemicals mapping to the largest number of "Treatment" Chemical-Experiments in each case, with the
number of experiments shown in parentheses (GEO/ArrayExpress) (based on data extracted September 20, 2008).

-------
of the chemical in relation to the experiment) is collected at the
time of data deposition in required data fields, formal chemical
indexing with structure annotation and  quality review can  be
performed efficiently with the appropriate chemical expertise in
collaboration  with public efforts such as DSSTox and ChEBI
(Supplemental Table 1).
  As the present survey has shown,  although a number  of
public microarray resources  have  the  potential to support
toxicogenomics investigations, these resources currently rep-
resent a patchwork of disconnected  or  loosely  connected
inventories and  capabilities (Larsson  and  Sandberg, 2006),
having different goals, degrees of standardization, public data
accessibility,  data mining ability, and utility for toxicogenom-
ics  investigation  (Table 2; Supplemental  Table 2).  Primary
genomics resources (Table 1) serve as official MGED-sanctioned
repositories for public gene expression data associated with
the scientific literature (Mattes et al., 2004; Salter, 2005);
GEO and ArrayExpress are currently, by far, the largest and
most important of these. Both are MIAME-supportive databases,
meaning that they accept all information about an experiment
set forth by the MIAME guidelines; however, they do not
actually require this information. In addition, there is
insufficient standardization
currently within GEO or ArrayExpress  pertaining to  protocol
or experimental description  to fully support exploration within
and  across these resources (Table 2).  Secondary  genomics
resources contain  genomic-related  data but  generally have
more limited  content  and  are designed for more specialized
purposes and  applications (Table 1).
  With  its specific  focus  on  toxicogenomics,  attention  to
chemical indexing, and addition of the  BID system, CEBS is
worthy of special mention, having incorporated many elements
of  an ideal  toxicogenomics  resource. To  support robust
relational searching for toxicogenomics, CEBS has  the added
task  of  capturing  and  systematizing user-deposited  data
pertaining to  a  study. CEBS bridges  the gap  between  an
open-access,  user-depositor system and a relational, curated
database by instituting a high degree of standardization and
data controls  that extend beyond  MIAME guidelines (Fostel
et al., 2005). CEBS is striving for much  larger coverage  of
chemical space in relation to chemical treatment  experiments.
In  collaboration  with DSSTox,   and  building on  current
annotation  efforts of GEO and  ArrayExpress, CEBS will
provide structure-searching  capabilities  and chemical  linkages
to external public resources, such as PubChem. In addition,
CEBS will provide direct  access  to GEO and ArrayExpress
"Treatment"  chemical-experiment content, as well  as auto-
mated secondary deposition of CEBS content to GEO.
  A noteworthy deficiency of most secondary genomics
resources and of the two primary genomics resources
(ArrayExpress and GEO), highlighted in the present study, is the
complete lack of incorporation of chemical annotation and
standards that would allow aggregation of data for the same or
similar chemicals, and linkage to growing lists of chemically
indexed resources (Supplemental Table 2). Due to the lack of
chemical-reporting standards, the process for identifying
chemical treatment-related experiments in ArrayExpress
Repository and GEO Series in this study was time-consuming and
difficult to automate (Supplemental Fig. 1 and Supplemental
Example 1). Present efforts serve to highlight deficiencies in
microarray experiment data deposition requirements and
standards with regard to chemistry and chemical treatment-related
experiments that, if better addressed, could greatly
facilitate chemical annotation and data integration efforts in the
future. With formal chemical annotation, it becomes possible to
assess the chemical coverage of public gene expression
databases, to link data for common or similar chemicals across
information domains, including toxicology, as well as to gather
data from comparable experiments, possibly performed in
different labs and species, that can begin to serve as the basis
for meta-analysis or structure-activity hypotheses. Furthermore,
the proposed set of Standard Genomics Fields, most of which
map to existing fields from both GEO and ArrayExpress, serves
to bridge the two resources and facilitate comparisons and
incorporation of their content into other resources in a
standardized way.
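
The bridging role of a shared field set can be sketched as a simple renaming table. Only the name Experimental_Accession appears in this paper (see the Table 6 footnotes). Every other standard field name, every native field name, and both example records below are hypothetical placeholders, not the actual GEO or ArrayExpress schemas.

```python
# Hypothetical mapping: standard field -> native field name in each resource.
FIELD_MAP = {
    "Experimental_Accession": {"GEO": "accession", "ArrayExpress": "accession"},
    "Experiment_Title": {"GEO": "title", "ArrayExpress": "name"},
    "Species": {"GEO": "organism", "ArrayExpress": "species"},
}

def to_standard(record, source):
    """Rename a resource-specific record into the shared field names.

    Fields missing from the source record map to None rather than raising,
    so incomplete submissions remain comparable.
    """
    return {std: record.get(native[source]) for std, native in FIELD_MAP.items()}

# Invented records for illustration only.
geo_record = {"accession": "GSE0000", "title": "Example study", "organism": "Rattus norvegicus"}
ae_record = {"accession": "E-XXXX-0", "name": "Example study", "species": "Rattus norvegicus"}

geo_std = to_standard(geo_record, "GEO")
ae_std = to_standard(ae_record, "ArrayExpress")
```

Once both records share field names, comparisons across resources (for example, grouping experiments by species) reduce to ordinary dictionary lookups.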
                           CONCLUSION

        It is hoped that the current exercise to create, publish, and
      link chemical-index files for GEO  Series and ArrayExpress
      Repository has had  two primary  impacts:  (1) to highlight
      deficiencies in the current chemical annotation and curation
      methods within ArrayExpress and GEO that particularly impact
      toxicogenomics applications of these resources;  and (2)  to
      show the way forward in terms of the potential benefits that can
      be derived by incorporating robust chemical annotation and
      linkages of chemical treatment-related content to these public
      resources.   Recently  improved  coordination  of the  EBI
      ArrayExpress and ChEBI projects, whereby ChEBI provides
      link-outs from chemical structure to particular ArrayExpress
      experiments (currently only provided for a handful of experi-
      ments for which ArrayExpress data submitters provided ChEBI
      identifiers), is  a significant step forward and should immedi-
      ately benefit from incorporation of the DSSTox ArrayExpress
      chemical-experiment index file, as well as the addition of the
      corresponding DSSTox GEO  index file.  However,  as  is
      apparent from past failures, it is not sufficient to recommend
      that users add accurate chemical information at the time of data
      submission unless  more  stringent  efforts  to require this
      information are instituted. In  addition, we strongly recommend
      adoption of the "Chemical_StudyType" categories, or some-
      thing  comparable,  for  each  chemical-associated study  or
      experiment deposited into GEO and ArrayExpress.  Finally,
      recognizing that GEO and  ArrayExpress  are not designed
      primarily as toxicogenomics resources, submitters of explicit
      toxicogenomic study data should be strongly encouraged to

-------
initially deposit studies into CEBS as a way to ensure capture
of sufficient toxicogenomics experimental description, utilizing
the automated deposition  capabilities of CEBS  to secondarily
deposit well annotated, chemically indexed  data into GEO.
  Postscript: All  of the  initial  published DSSTox chemical
files  and results reported  here were based  on  data extracted
from  the  ArrayExpress  Repository   and   GEO  Series  on
September 20, 2008. Subsequent updates of both ArrayExpress
Repository and GEO Series chemical-index files, based on data
extracted  on  January  20,  2009   and  February 2,   2009,
respectively,  have  been published  on the   DSSTox website
and  incorporated  into PubChem as  of March  2009;  these
updated files do not change the overall trends or conclusions of
the present study.
                   SUPPLEMENTARY DATA

   Supplementary data are available online at
http://toxsci.oxfordjournals.org/.


                           FUNDING

   NCSU/EPA Cooperative Training Program in Environmen-
tal Sciences  Research, Training  Agreement (CT833235-01-0)
with North Carolina State University supported C.R.W.
                    ACKNOWLEDGMENTS

   We would like to thank Drs Jennifer Fostel (CEBS), Chihae
Yang (FDA Center for Food Safety and Nutrition), David Dix
(EPA), and William Ward  (EPA) for  helpful comments and
suggestions in review  of  this  manuscript.  This  work  was
carried out by C.R.W. as part of a graduate research project
within  the Bioinformatics Program  at North Carolina  State
University; the thesis is publicly accessible at
http://www.lib.ncsu.edu/theses/available/etd-12112008-214342/.
                         REFERENCES

Ball, C.,  Brazma, A., Causton,  H., Chervitz, S., Edgar,  R., Hingamp, P.,
  Matese, J. C., Icahn, C., Parkinson, H., Quackenbush, J., et al. (2004).
  Microarray Gene Expression Data (MGED) Society. Standards for micro-
  array data: An open letter. Environ. Health Perspect. 112, A666-A667.
Barrett,  T., Troup,  D.  B., Wilhite,  S.  E., Ledoux,  P., Rudnev,  D.,
  Evangelista,  C.,  Kim,  I. F., Soboleva,   A., Tomashevsky,  M., and
  Edgar,  R. (2007). NCBI GEO: Mining tens of millions of  expression
  profiles—Database and tools update. Nucleic Acids Res. 35(Database issue),
  D760-D765.
Barrett, T., and Edgar, R.  (2006). Gene expression omnibus: Microarray data
  storage,  submission, retrieval, and analysis.  Methods  Enzymol.  411,
  352-369.
Boyle, J. (2005). Gene-Expression Omnibus integration and clustering tools in
  SeqExpress. Bioinformatics 21, 2550-2551.
   Brazma, A., Kapushesky, M., Parkinson, H., Sarkans, U., and Shojatalab, M.
     (2006). Data storage and analysis in ArrayExpress. Methods Enzymol. 411,
     370-386.
   Burgoon, L. D. (2007). Clearing the standards landscape: The  semantics of
     terminology and their impact on toxicogenomics. Toxicol. Sci. 99, 403-412.
   Burgoon, L. D., Boutros, P. C., Dere, E., and Zacharewski, T. R. (2006).
     dbZach: A MIAME-compliant toxicogenomic supportive relational database.
     Toxicol. Sci. 90, 558-568.
   Chen, J., Zhao, P., Massaro, D., Clerch, L. B., Almon, R. R., DuBois, D. C.,
     Jusko, W. J., and Hoffman,  E. P. (2004). The PEPR GeneChip data
     warehouse, and implementation of a dynamic time  series query tool (SGQT)
     with graphical interface. Nucleic Acids Res. 32(Database issue), D578-D581.
   Davis, A.  P., Murphy, C. G.,  Saraceni-Richards, C. A., Rosenstein, M. C.,
     Wiegers, T. C., and Mattingly, C. J. (2009). Comparative toxicogenomics
     database: A knowledgebase  and  discovery tool for  chemical-gene-disease
     networks. Nucleic Acids Res. 37(Database issue), D786-D792.
   Dix, D. J.,  Houck, K. A., Martin, M. T., Richard, A. M., Setzer, R.  W., and
     Kavlock, R. J. (2007). The ToxCast program for prioritizing toxicity testing
     of environmental chemicals. Toxicol. Sci. 95, 5-12.
   Fielden, M. R., and Kolaja, K. L.  (2006). The state-of-the-art  in predictive
     toxicogenomics. Curr.  Opin. Drug Discov. Dev. 9, 84-91.
   Fostel, J. M. (2008). Towards standards for data exchange and integration and
     their  impact on  a  public database such as CEBS  (Chemical Effects in
     Biological Systems). Toxicol. Appl. Pharmacol. 233, 54-62.
   Fostel, J. M., Burgoon, L., Zwickl, C., Lord, P., Corton, J. C., Bushel, P. R.,
     Cunningham, M., Fan, L., Edwards, S. W., Hester, S., et al. (2007). Toward
     a checklist for exchange and interpretation of data from a toxicology  study.
     Toxicol. Sci. 99, 26-34.
   Fostel, J., Choi, D., Zwickl, C., Morrison, N., Rashid, A., Hasan, A., Bao, W.,
     Richard,  A.,  Tong, W., Bushel, P., et al.  (2005). Chemical Effects in
     Biological Systems—Data dictionary (CEBS-DD): A compendium  of terms
     for the  capture  and  integration  of  biological study  design  description,
     conventional phenotypes, and 'omics data. Toxicol. Sci. 88, 585-601.
   Ganter, B., Tugendreich, S., Pearson, C. I., Ayanoglu, E., Baumhueter, S.,
     Bostian, K. A., Brady, L., Browne, L. J., Calvin, J. T., Day, G. J., et al.
     (2005). Development of a large-scale chemogenomics database to improve
     drug candidate selection and to understand mechanisms of chemical toxicity
     and action. J. Biotechnol. 119, 219-244.
   Gomase, V. S., Tagore, S., and Kale, K. V. (2008). Microarray:  An approach
     for current drug targets. Curr. Drug Metab. 9, 221-231.
   Hamadeh, H. K., Amin, R. P., Paules, R. S., and Afshari, C. A. (2002). An
     overview of toxicogenomics. Curr. Issues Mol. Biol. 4, 45-56.
   Hayes, K. R., Vollrath, A. L., Zastrow, G. M., McMillan, B. J., Craven, M.,
     Jovanovich, S., Rank, D. R., Penn, S., Walisser, J. A., Reddy, J. K., et al.
     (2005). EDGE: A  centralized resource for  the comparison, analysis, and
     distribution of toxicogenomic information. Mol. Pharmacol. 67, 1360-1368.
   Hirabayashi, Y., and Inoue, T. (2002). Toxicogenomics—A new paradigm of
     toxicology and birth of reverse toxicology. Kokuritsu Iyakuhin Shokuhin
     Eisei Kenkyusho Hokoku 120, 39-52.
   Ivliev, A. E., 't Hoen, P. A., Villerius,  M. P., den Dunnen, J. T., and Brandt,  B. W.
     (2008). Microarray retriever: A web-based tool for  searching and large scale
     retrieval of public microarray data.  Nucleic Acids Res. 36, W327-W331.
   Larsson, O., and Sandberg, R.  (2006).  Lack  of correct  data  format and
     comparability limits future integrative microarray research. Nat. Biotechnol.
     24, 1322-1323.
   Martin, M. T., Judson, R. S., Reif, D. M., and Dix, D. J. (2009). Profiling
     chemicals based on chronic toxicity results  from the U. S. EPA ToxRef
     Database. Environ.  Health Perspect. 117, 392-399.
   Mattes, W.  B., Pettit, S. D., Sansone, S. A., Bushel, P. R., and Waters, M. D.
     (2004).   Database  development  in toxicogenomics: Issues  and efforts.
     Environ.  Health Perspect. 112, 495-505.

-------
Parkinson,  H.,  Kapushesky,  M.,  Shojatalab,  M., Abeygunawardena,  N.,
  Coulson, R., Farne, A., Holloway, E., Kolesnykov, N., Lilja, P., Lukk, M.,
  et al. (2007). ArrayExpress—A public database of microarray experiments
  and  gene  expression profiles.  Nucleic Acids Res. 35(Database  issue),
  D747-D750.
Richard,  A.,  Yang, C., and Judson, R. (2008).  Toxicity data informatics:
  Supporting a new paradigm for toxicity prediction. Toxicol. Mech. Methods
  18, 103-118.
Richard, A. M., Gold, L. S., and Nicklaus, M. C. (2006). Chemical structure
  indexing of toxicity data on the internet: Moving toward a flat world. Curr.
  Opin. Drug Discov. Dev. 9, 314-325.
Rustici, G., Kapushesky, M., Kolesnikov, N., Parkinson, H., Sarkans, U., and
  Brazma,  A.  (2008).  Data  storage and analysis  in  ArrayExpress  and
  expression  profiler. Curr. Protoc. Bioinformatics. Chap. 7, Unit 7.13.
Salter,  A. H.  (2005). Large-scale databases in toxicogenomics. Pharmacoge-
  nomics 6, 749-754.
Tateno, Y., and Ikeo, K. (2004). International public gene expression database
  (CIBEX)   and   data  submission.  Tanpakushitsu   Kakusan  Koso  49,
  2678-2683.
Taylor, C. F.,  Field, D., Sansone, S. A., Aerts, J., Apweiler, R., Ashburner, M.,
  Ball, C. A., Binz, P. A., Bogue, M., Booth, T., et al. (2008). Promoting
  coherent minimum reporting  guidelines for biological and  biomedical
  investigations: The MIBBI project. Nat. Biotechnol. 26, 889-896.
Waters, M., Boorman, G., Bushel, P., Cunningham, M., Irwin, R., Merrick, A.,
  Olden, K.,  Paules, R., Selkirk, J., Stasiewicz,  S., et al. (2003). Systems
  toxicology  and the chemical effects in biological systems (CEBS) knowledge
  base. Environ. Health Perspect. Toxicogenomics 111, 15-28.
       Waters, M., Stasiewicz, S., Merrick, B. A., Tomer, K., Bushel, P., Paules, R.,
         Stegman, N., Nehls, G., Yost, K.  J., Johnson, C.  H.,  et  al. (2008).
         CEBS—Chemical  effects in biological systems: A public data repository
         integrating study design and toxicity  data with microarray and proteomics
         data. Nucleic Acids Res. 36(Database  issue), D892-D900.
       Wheeler, D.  L.,  Barrett, T., Benson,  D. A., Bryant,  S.  H.,  Canese, K.,
         Chetvernin, V., Church, D. M., Dicuccio, M., Edgar, R., Federhen, S., et al.
         (2008). Database  resources of the  National  Center  for  Biotechnology
         Information. Nucleic Acids Res. 36(Database issue), D13-D21.
       Williams-Devane, C. R., Wolf, M. A., and Richard, A. M. (2009). DSSTox
         Chemical-Index files for exposure-related experiments in ArrayExpress and
         Gene Expression Omnibus: Enabling  toxico-chemogenomics data linkages.
         Bioinformatics 25, 692-694.
       Yang, C., Benz, R. D., and Cheeseman, M. A. (2006a). Landscape of current
         toxicity databases and database standards. Curr. Opin. Drug Discov. Dev. 9,
         124-133.
       Yang, C., Hasselgren, C. H., Boyer, S., Arvidson, K., Aveston, S., Diekes, P.,
         Benigni,  R.,  Benz, R.  D., Contrera, J., Kruhlak, N. L.,  et  al. (2008).
         Understanding genetic toxicity through data mining: The process of building
         knowledge by integrating multiple genetic toxicity databases. Toxicol. Mech.
         Methods  18, 277-295.
       Yang, C., Richard, A. M., and Cross, K.  P. (2006b). The art of data mining the
         minefields of toxicity databases to link chemistry to biology. Curr. Comput.
         Aided Drug Design 2, 135-150.
       Zhu, Y., Zhu, Y., and Xu, W. (2008). EzArray: A web-based highly automated
         Affymetrix expression  array data management and analysis system. BMC
         Bioinformatics 9, 46.

-------
Toxicology Mechanisms and Methods, 18:103-118, 2008
Copyright © Informa Healthcare USA, Inc.
ISSN: 1537-6516 print; 1537-6524 online
DOI: 10.1080/15376510701857452
                                                                                              REVIEW
   Toxicity Data Informatics: Supporting a New Paradigm for Toxicity Prediction
 Ann M. Richard
 National Center for Computational
 Toxicology, U.S. Environmental
 Protection Agency, Research
 Triangle Park, NC 27711

 Chihae Yang
 Leadscope, Inc., Columbus,
 OH 43235

 Richard S. Judson
 National Center for Computational
 Toxicology, U.S. Environmental
 Protection Agency, Research Triangle
 Park, NC 27711


Received 9 November 2008;
accepted 21 November 2008.
One of the authors (AR) thanks Maritja
Wolf and Thomas Transue (Lockheed
Martin, Contractors to the U.S. EPA) for
major technical support of the DSSTox
project and structure-browser, and Lois
Gold for continuing to expand the
exceedingly valuable,  publicly available
Carcinogenic Potency  Database. We wish
to further acknowledge the significant
efforts of Matthew Martin and David Dix
in development of the EPA's ToxRef DB
database, and Elizabeth Julien for her
leadership of the ILSI Developmental
Toxicity Database Project. Lastly, we
thank David DeMarini and Matthew
Martin for contributing valuable
suggestions and references in review of
the manuscript.
Disclaimer. This manuscript has been
reviewed by the U.S. EPA's National
Center for Computational Toxicology
and  approved for publication. Approval
does not signify that the contents
necessarily reflect the views and policies
of the agency, nor does mention of trade
names or commercial products constitute
endorsement or recommendation for
use.
Address Correspondence to Ann Richard,
109 TW Alexander Dr., Mail Drop
D343-03, U.S. EPA, RTP, NC 27711. E-mail:
richard.ann@epa.gov
ABSTRACT    Chemical toxicity  data at  all levels  of description, from
treatment-level  dose  response  data to  a  high-level summarized  toxicity
"endpoint," effectively circumscribe, enable, and limit predictive toxicology
approaches and capabilities. Several new and evolving public data initiatives
focused on the world of chemical toxicity information—as  represented here
by  ToxML  (Toxicology XML  standard),  DSSTox (Distributed  Structure-
Searchable Toxicity Database  Network), and ACToR (Aggregated Computa-
tional Toxicology Resource)—are  contributing to the creation of a more unified,
mineable,  and modelable landscape of public toxicity data. These  projects
address different layers in the spectrum of toxicological data representation
and detail and, additionally, span diverse domains of toxicology and chemistry
in relation to industry and environmental regulatory concerns. For each of the
three projects, data standards are the key to enabling "read-across" in relation
to  toxicity data  and  chemical-indexed information. In  turn,  "read-across"
capability  enables  flexible  data  mining,  as  well  as meaningful aggregation
of lower levels of toxicity information  to summarized, modelable endpoints
spanning  sufficient areas of chemical space for building predictive  models.
By  means of shared data  standards and transparent and flexible rules  for data
aggregation, these and related public data initiatives are effectively spanning
the divides among experimental toxicologists, computational modelers, and
the world of chemically indexed, publicly available toxicity information.

KEYWORDS   ACToR; Data Models; DSSTox; Predictive Toxicology; SAR; Structure-Activity
Relationships; Toxicoinformatics; ToxML


                           INTRODUCTION
  Computational  predictive toxicology  methods applied to screening chemicals  for
potential toxicity must be built on the scaffolding of existing data. The goal is to glean
sufficiently predictive patterns  and inferences to be  able to  extrapolate  from data-rich
chemicals to chemicals and classes that are data poor. If a deductive predictive method is
fully computational (or in silico; i.e.,  it requires no  a priori biological measurement data),
then toxicity endpoints are predicted based solely on properties derived from the molecular
structure of a chemical of concern. On the other hand, if it is possible and practical to
generate intermediate biological measurements at low cost relative to a benchmark in vivo
experiment, then these biological measures can be used to augment chemical structure in a
more encompassing predictive computational toxicology approach (Richard 2006). In either
case, a defined toxicological "endpoint" must be chosen for which sufficient existing data are
available for training and testing of candidate prediction models. These legacy data, which
often include supporting layers  of experimental  detail  and
biological description, serve as bookends to the problem of
prediction. On the one hand, models must be anchored to
a toxicological "endpoint" with relevance to product safety
or environmental  regulatory concerns.  On the other hand,
less summarized toxicological data can provide more nuanced
descriptions of activity that can better align with  underlying
biological mechanisms  and productively inform  and guide
model development.
   Issues pertaining to toxicity data availability, quality, com-
parability, and representation profoundly impact our ability to
derive and apply useful prediction models  and have slowed
the development of robust prediction models in many areas of
toxicology. In this paper, we will consider three public initiatives
that are directly  addressing  the  "data" arm  of  predictive
toxicology: toxicity data models as represented by the ToxML
(Toxicology XML  standard) publicly available schema, with
ontology-based fields and controlled vocabulary (ToxML 2007);
the U.S. Environmental Protection Agency's (EPA's) DSSTox
(Distributed Structure-Searchable Toxicity) Database Network
(EPA DSSTox 2007) that publishes summarized structure-
annotated toxicity data files for use in structure-activity rela-
tionship (SAR) modeling, and structure-index files for  public
toxicity data  inventories; and EPA's new ACToR (Aggregated
Computational Toxicology Resource) data integration system
slated for public release in 2008 (Judson et al. submitted)  that
will broadly index and house existing chemical toxicity data as
well as  new data being generated in EPA's ToxCast program
(Dix et al. 2007;  EPA  ToxCast 2007). These  initiatives  are
focusing on four fundamental and interrelated problems: (1)
the electronic capture and standardization of existing or legacy
toxicology experimental data down to the dose-treatment level
(i.e., the domain of experimental toxicologists); (2) meaningful
aggregation  and representation of toxicity data into  forms
that are easily understood and employed by modelers; (3) the
creation of bidirectional linkages between previously isolated or
"siloed" sources of toxicity data and the larger world of data in-
formatics; and (4) the central indexing of chemicals and publicly
available toxicity data, particularly in relation to environmental
regulations and industrial use, across the multitude of potential
sources on the Internet. Additionally, these three initiatives
are attempting to  address the questions: "How are toxicity
data best organized to facilitate exploration and data mining?";
"How do we engage toxicologists (i.e., the domain experts) in
these efforts?"; "How do we meaningfully aggregate data and
interface toxicologists and modelers?"; "Where are  the data?";
"Which chemicals should be of highest regulatory interest?";
and, finally, "Which chemicals have the richest foundation of
data for anchoring new predictive technologies?"
   TOXICITY DATA MODELS: WHAT'S
            IN IT FOR MODELERS?
   In a  recent  review,  Yang  and  coauthors (Yang  et al.
2006a) surveyed the "landscape of current toxicity databases
and  database  standards." These authors noted that what is
generally perceived as a "lack of data" problem encompasses the
fragmented, disparate state of existing data, many of which do
not exist in electronic form or exist in diverse nonexchangeable
data formats. These problems span both public (e.g., literature,
government regulatory agencies) and nonpublic (i.e., corporate

A. M. Richard et al.
  or proprietary) toxicity data sources. The authors argued that the
  path forward must involve the development and adoption of
  shared public standards for toxicity databases. These standards
  should include controlled vocabulary and  hierarchical data
  relationships or ontologies (i.e., using the same  terminology to
  describe the same things, and incorporating the layered relation-
  ships of different terms to one another), should be derived from
  close working knowledge of the toxicity study domain (e.g.,
  carcinogenicity, developmental toxicity, immunotoxicity, neu-
  rotoxicity, etc.), and should be inclusive of chemical structure.
  The authors cited a number of public efforts that are moving in
  this direction. The "Birth Defects System Manager" is a notable
  example of a well-designed computational-bioinformatics
  infrastructure for standardizing available toxicity data, spanning
  the biological interaction spectrum, to enable  exploration of
  these data from a mechanistic and systems biology perspective
  (Singh et al. 2005). The public ToxML initiative  (ToxML 2007),
  on  the  other hand, has as  its clear ultimate objective  the
  enrichment and enlargement of chemical structure-mineable in-
  formation in relation to broadly encompassing and meaningful
  representations of toxicology experiments.
     A second paper (Yang et al. 2006b) elaborated on  the state
  of the existing public  toxicity databases and data  sources,
  particularly in relation  to their suitability and preparedness
  for relational "read-across" data mining  and  structure-based
  modeling. (The term "relational read-across" refers to the ability
  to broadly query a database using controlled vocabulary  and
  standardized  search terms, and to combine search terms in
  such a way  as  to  define  conditional statements  [i.e., using
  AND, OR, NOT].) This paper laid out in greater detail the
  requirements for such a capability, and a general strategy for data
  mining structure-integrated toxicity databases. This was illustrated
  with an example that focused on target organ specificity of
  chronic toxicity results, which involved iterative probing of the
  chemical domain with preindexed toxicity endpoint descriptors,
  and complementary probing  of the  biological domain with
  chemical descriptors. A third paper, published elsewhere in this
  issue (Yang et al. 2007), extends these themes into the realm
  of practical application,  employing the commercially available
  Leadscope SAR-ready  Genetic Toxicity Database (Leadscope,
  Inc. 2007), built using the public ToxML data  model schema,
  to illustrate  how new insights  into the nature of chemical-
  biological  domains for various genetic toxicity endpoints  can
  be extracted from a well-constructed, populated data model.
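
  The relational "read-across" query style described above can be sketched in a few lines. This is only an illustration under stated assumptions: the records, the field names (study_type, species, target_organ), and the helper functions term, AND, OR, NOT, and query are hypothetical and are not taken from ToxML or Leadscope. The point is that, once every record uses the same controlled vocabulary, search terms can be combined into conditional statements without any source-specific handling.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Predicate = Callable[[Record], bool]

# Hypothetical records that all share one controlled vocabulary.
RECORDS: List[Record] = [
    {"chemical": "Chem1", "study_type": "carcinogenicity", "species": "rat", "target_organ": "liver"},
    {"chemical": "Chem2", "study_type": "carcinogenicity", "species": "mouse", "target_organ": "liver"},
    {"chemical": "Chem3", "study_type": "carcinogenicity", "species": "rat", "target_organ": "kidney"},
    {"chemical": "Chem4", "study_type": "acute toxicity", "species": "rat", "target_organ": "liver"},
]


def term(field: str, value: str) -> Predicate:
    """Match records whose field holds exactly the given vocabulary term."""
    return lambda rec: rec.get(field) == value


def AND(*preds: Predicate) -> Predicate:
    return lambda rec: all(p(rec) for p in preds)


def OR(*preds: Predicate) -> Predicate:
    return lambda rec: any(p(rec) for p in preds)


def NOT(pred: Predicate) -> Predicate:
    return lambda rec: not pred(rec)


def query(records: List[Record], pred: Predicate) -> List[str]:
    """Return the chemicals whose records satisfy the combined condition."""
    return [rec["chemical"] for rec in records if pred(rec)]


# Liver carcinogenicity results in any species except mouse.
liver_not_mouse = AND(
    term("study_type", "carcinogenicity"),
    term("target_organ", "liver"),
    NOT(term("species", "mouse")),
)
# Carcinogenicity results in either rodent species.
rodent_carc = AND(
    term("study_type", "carcinogenicity"),
    OR(term("species", "rat"), term("species", "mouse")),
)

print(query(RECORDS, liver_not_mouse))
print(query(RECORDS, rodent_carc))
```

  Because the predicates are ordinary functions, the same conditional statement can be reused against any collection of records that follows the shared vocabulary, which is the practical benefit of the data standards discussed here.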
     ToxML was an early  and influential entrant to  the field of
  toxicity data models, encompassing a wide domain of toxicity
  experiments,  including  genetic  toxicity,  chronic/subchronic,
  and reproductive and  developmental  studies. Under a cooper-
  ative research and development  agreement with the U.S. Food
  and Drug Administration (FDA), Leadscope has built databases
  enriched with pharmaceuticals  from  publicly available data
  within the FDA's Center for Drug Evaluation and Research
  (CDER) and food ingredients from the Center for Food Safety
  and Applied Nutrition (CFSAN).  At  the time  of this writing,
  eight  such databases  are commercially available (Leadscope,
  Inc. 2007). The ToxML  methodology, as implemented within
  Leadscope, also includes  aggregation criteria to construct higher-
  level (i.e.,  more summarized)  endpoints for  data mining  and
  SAR analysis. Figure  1  shows  an aggregated  list  of studies
  within Leadscope  for  the compound  "Atrazine,"  CASRN
  (Chemical Abstracts Service  Registry Number) [1912-24-9],
  spanning several Study Types, Sources, Species, etc.

-------
[Figure 1 image: Leadscope Study Table for CASRN 1912-24-9, with one row per Test (part of a Study). Columns: CASRN, Study Type, Study Source, Species, Strain, S9 Type, Sex, Route of Exposure, All Doses. Rows include bacterial mutagenesis (sources: ntp, dsscpdb), carcinogenicity (dsscpdb), and irritation, acute toxicity, multiple dose, and mutation (rtecs) studies in Salmonella typhimurium, rat, mouse, rabbit, and unspecified mammal.]
FIGURE 1 Sample view of Leadscope Study-level aggregation of data across all component databases for Atrazine, built on the ToxML
data model.



   Several  additional public efforts are under way  to  con-
struct toxicity data models for  different purposes and areas
of toxicology, with the goal of making both the data and
schema publicly available. The International Life Sciences
Institute's (ILSI's) Developmental Toxicity Database Project
(http://rsi.ilsi.org/devtoxsar.htm) evolved from efforts to eval-
uate the state of SAR prediction models  for  developmental
toxicity (Julien et al. 2004). That evaluation  concluded that
SAR models  could be substantially improved by refining the
ways  in  which  toxicity  data  are  used to train the models
(e.g., use data on specific endpoints, which can be combined
into biologically meaningful categories). To begin to address
this  need,  a follow-up  effort  brought together prominent
developmental  toxicologists and modelers  to create a public
data model directed toward capturing data  from the disparate
developmental toxicology literature.
   Within EPA, the ToxRefDB data model (Martin et al. 2007)
has been constructed to  house  reference in vivo toxicology
data across  a variety of toxicity  domains (including acute,
chronic, subchronic, developmental, and reproductive toxicity)
in support of the ToxCast Program (Dix et al. 2007). ToxCast is
an EPA research initiative with the goal of building predictive
toxicity models based on a combination of in vitro and in silico
approaches. The program is generating a  large data set that
will include results from hundreds of in vitro high-throughput
screening (HTS) assays as well as whole-genome transcription
profiles of several hundred chemicals for which detailed in vivo
toxicology data are available (EPA ToxCast 2007). ToxRefDB is
housing these reference in vivo data, which  are being extracted
primarily from EPA Pesticide Data Evaluation Records (DERs)
(EPA Regulating Pesticides 2007)  for hundreds of registered
pesticidal active ingredients (Martin et al.  2007). The DERs
summarize data from standardized guideline studies  that are
          required to be run for all new food-use pesticides. Within the
          ToxCast Program, these data will serve as phenotypic anchors
          for deriving predictive "signatures" and models from chemical
          structure properties and newly generated HTS data (Dix et al.
          2007; EPA ToxCast 2007).
            Both the ILSI Developmental Toxicity Database Project and
          ToxRefDB have incorporated elements  of the public ToxML
          data  model and are striving for broad  chemical coverage in
          their content,  which is a key requirement for structure-based
          data mining and modeling applications. In addition, efforts are
          under way to ensure that the final  data models resulting from
          these various efforts (ToxML, ILSI, ToxRefDB), despite having
          somewhat different target data and objectives, have sufficient
          compatibility for interoperability and data merging, both of
          which are absolutely essential for  meeting  the goal of "read-
          across" in  the larger data world.
            These new toxicity data models provide  the means, and
          the  data  entry tools provide the  mechanism,  for migrating
          previously inaccessible data and new data into a standard-
          ized, relational format.  Since these toxicity  data models are
          designed to capture experimentally relevant details pertaining
          to dose treatment and  effects  at  the study  level, they have
          the potential to engage and be used  by toxicologists,  unlike
          the highly (some would say overly) summarized toxicity data
          representations typically  used  in  SAR modeling (e.g.,  yes
          or no  calls for carcinogenicity, teratogenicity,  genotoxicity,
          etc.).  SAR modelers,  on  the  other  hand,  have difficulty
          processing this level of experimental detail. Instead, they are
          more  interested in  obtaining statistically sufficient chemical
          coverage  in relation to  a defined toxicity endpoint,  which
          is a  requirement that most often mandates high levels of
          toxicity data aggregation and summarization. Hence, modelers
          generally rely upon toxicity domain experts and regulators to
                                           Toxicity Data Informatics

-------
[Figure fragment: CompoundSummaryCall; multiple data sources: BacterialMutagenesis, all strains (with or without S9)]
-------
[Figure 3 diagram: DSSTox Standard Chemical Fields, with the example entries shown beside each field]
  DSSTox_RID
  DSSTox_FileID
  DSSTox_Generic_SID
  TestSubstance_ChemicalName
  TestSubstance_CASRN
  TestSubstance_Description: single chemical compound; macromolecule; mixture or formulation; unspecified or multiple forms
  ChemicalNote: description of mixture; nature and CAS of components; tautomers; stereochemistry
  STRUCTURE
  STRUCTURE_Shown: tested chemical; general form of chemical; active ingredient of formulation; representative isomer in mixture; representative component in mixture; monomer of polymer; simplified to parent
  DSSTox_CID
  STRUCTURE_Formula
  STRUCTURE_MolecularWeight
  STRUCTURE_ChemicalType: defined organic; inorganic; organometallic
  STRUCTURE_TestedForm_DefinedOrganic: parent; salt (Na, Cl, etc.); complex (HCl, H2O, mesylate, etc.)
  STRUCTURE_ChemicalName_IUPAC
  STRUCTURE_SMILES
  STRUCTURE_Parent_SMILES
  STRUCTURE_InChI
FIGURE 3  DSSTox Standard Chemical Fields included in all published DSSTox data files, with the STRUCTURE-related fields
automatically generated from the STRUCTURE field, and the STRUCTURE_Shown field indicating the relationship to TestSubstance
fields.
coverage for models. However, this high-level endpoint summa-
rization is not generally endorsed by toxicologists, particularly
for complex endpoints such as developmental toxicity (Julien
et al. 2004), and can effectively obfuscate meaningful patterns or
associations of chemicals aligned with biological mechanisms.
As toxicity data models for study areas such as these become
increasingly populated and available, modelers and toxicologists
alike will be  presented with a large and varied array of new
possibilities for defining groups or "profiles" of effects that can
constitute intermediate  endpoints and anchors for  structure-
based modeling and exploration (Yang  et  al. 2006b). The
ease and flexibility conferred by toxicity data models in how
aggregated toxicity endpoints are defined, and structure-indexed
data are  extracted, will add an increasingly valuable  degree of
freedom to predictive modeling.

     DSSTox DATABASE NETWORK:
   ENDPOINTS AND DATA LINKAGES
  A primary goal of EPA's DSSTox project (Richard and
Williams 2002a; Richard 2004; EPA DSSTox 2007) is to publish
structure-annotated toxicity data files for use in  structure-
activity modeling.  Whereas toxicity data models  focus on
                    the standardization and "read-across" of experimental toxicity
                    data fields, DSSTox is contributing the standardized chemical
                    description layer to these efforts and providing "read-across"
                    in chemical structure space. A major focus of this effort has
                    been on the quality review of the chemical annotation and
                    representation with respect to  the toxicity information (EPA
                    DSSTox 2007). In order to accommodate and represent the
                    diverse chemical content of public toxicity databases and inven-
                    tories, DSSTox Standard Chemical Fields (Fig. 3) also  make a
                    clear distinction between the assigned chemical structure and its
                    relationship to the actual substance tested, which can be  a single
                    compound, or a  defined mixture, polymer, macromolecule,
                    or active ingredient in a formulation. In addition, the entire
                    DSSTox chemical inventory is now centrally indexed by unique
                    record ID, generic substance ID, and chemical (i.e., structure)
                    ID to better serve data management  needs,  as well as to  be
                    100% compatible with indexing  of  the large  and growing
                    PubChem inventory (NCBI PubChem 2007), which currently
                    exceeds 7 million compounds,  as well as with ACToR, which
                    incorporates the entire PubChem inventory into its even larger
                    aggregated inventory. Currently, there are 11 published DSSTox
                    structure-data (SD) files available for download, spanning nearly
                    7,000 unique substances (EPA DSSTox 2007).
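
  As an illustration of how the data fields in an SD file can be read, the sketch below parses the data items that follow each structure block. It relies only on the general SD file layout (a "> <FieldName>" header line, one or more value lines, a blank line, and a "$$$$" record delimiter). The sample text is hypothetical: the connection tables are omitted and the field values are invented, with field names taken from Figure 3. The function name parse_sd_data_fields is likewise an assumption for this example.

```python
import re
from typing import Dict, List

# Hypothetical two-record SD file; structure blocks are reduced to stubs.
SAMPLE_SD = """\
Chem1
  (connection table omitted)
M  END
> <DSSTox_RID>
1

> <TestSubstance_ChemicalName>
hypothetical substance A

> <STRUCTURE_Shown>
tested chemical

$$$$
Chem2
  (connection table omitted)
M  END
> <DSSTox_RID>
2

> <TestSubstance_ChemicalName>
hypothetical substance B

> <STRUCTURE_Shown>
active ingredient of formulation

$$$$
"""

HEADER = re.compile(r"^>\s*<([^>]+)>")


def parse_sd_data_fields(sd_text: str) -> List[Dict[str, str]]:
    """Return one {field name: value} dictionary per SD file record."""
    records = []
    for block in sd_text.split("$$$$"):
        if not block.strip():
            continue  # skip the empty remainder after the last delimiter
        fields: Dict[str, str] = {}
        lines = block.splitlines()
        i = 0
        while i < len(lines):
            match = HEADER.match(lines[i])
            if match is None:
                i += 1
                continue
            i += 1
            value_lines = []
            # A data item runs until the next blank line.
            while i < len(lines) and lines[i].strip():
                value_lines.append(lines[i])
                i += 1
            fields[match.group(1)] = "\n".join(value_lines)
        records.append(fields)
    return records


for rec in parse_sd_data_fields(SAMPLE_SD):
    print(rec["DSSTox_RID"], rec["STRUCTURE_Shown"])
```

  In practice a cheminformatics toolkit would also read the structure block itself; this sketch covers only the tabular data fields that accompany each structure.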

-------
[Figure 4 schematic: Chemical Structures and a Toxicity Model Schema, fed by Toxicity Data, are linked to DSSTox Summary Toxicity Data Files, drawn as a flat table with a Compound column (Chem1 to Chem8) and endpoint columns Tox1 to Tox4 (example entries: rat, male, +, lung).]
FIGURE 4  Representation of structure-based toxicity data mining as exploring relationships between corresponding layers in the
chemistry and toxicity data model domains, fed by experimental toxicity data, and with DSSTox data files represented as flat tabular
slices of summary toxicity data in corresponding chemical space.
   A number of published DSSTox data files include the sorts
of highly summarized "toxicity endpoints" that have been the
traditional focus of SAR prediction  modeling efforts. Since
its inception, the DSSTox project additionally has  sought to
provide an enhanced level of toxicity endpoint description
and annotation to encourage alternative modeling strategies
and use of the data files in structure-searching and chemical
relational database applications (Richard 2004; Benigni et al.
2007b). Although not approaching the level of experimental
detail captured in the toxicity data models discussed  above,
each summarized toxicity endpoint  can be  considered as a
high-level "slice"  of a hierarchical or layered  toxicity data
model, with less summarized endpoints extracted from lower
levels of the  data model. An example, with  correspondingly
smaller chemical coverage of actives, is a "carcinogenicity call"
defined by tumor incidence in a particular target organ (e.g.,
liver) and rodent species (e.g.,  rat). The analogy of a slice is
apt, since DSSTox structure-data files  (SD files) are "flat" files,
representing summary data in a single-layer, tabular spreadsheet
form (Fig. 4), whereas toxicity data models are hierarchical in or-
ganization, with multiple nested layers (Fig. 2) accommodating
increasingly detailed description (Yang et al. 2006b). In addition
to data aggregation, this data transformation,  or "flattening,"
is  a required step  to  effectively  interface toxicity  data with
SAR modelers (Richard  and Williams 2002b), as well  as with
other types of prediction approaches  and structure-searchable
bioassay data resources,  such as PubChem (NCBI PubChem
2007).
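
   The aggregation and "flattening" step described above can be sketched as follows. This is a minimal illustration, not the procedure used to build any DSSTox file: the nested records, their keys, and the function flatten are hypothetical. A hierarchical record (compound, then studies, then tests) is reduced to one row per compound carrying a single summarized call for a chosen species and target organ, with a blank entry when the compound was not tested in that species.

```python
from typing import Dict, List

# Hypothetical hierarchical data model: compound -> studies -> tests.
NESTED = {
    "Chem1": [
        {"study_type": "carcinogenicity", "tests": [
            {"species": "rat", "sex": "male", "target_sites": ["liver"]},
            {"species": "rat", "sex": "female", "target_sites": []},
        ]},
    ],
    "Chem2": [
        {"study_type": "carcinogenicity", "tests": [
            {"species": "mouse", "sex": "male", "target_sites": ["lung"]},
        ]},
    ],
    "Chem3": [
        {"study_type": "carcinogenicity", "tests": [
            {"species": "rat", "sex": "male", "target_sites": []},
        ]},
    ],
}


def flatten(nested: dict, species: str, organ: str) -> List[Dict[str, str]]:
    """Build one flat row per compound with a species/organ summary call."""
    column = f"{species}_{organ}_call"
    rows = []
    for compound, studies in nested.items():
        tests = [
            test
            for study in studies
            if study["study_type"] == "carcinogenicity"
            for test in study["tests"]
            if test["species"] == species
        ]
        if not tests:
            call = ""   # not tested in this species: leave the cell blank
        elif any(organ in test["target_sites"] for test in tests):
            call = "1"  # tumor reported at this site in at least one test
        else:
            call = "0"  # tested, but no tumor reported at this site
        rows.append({"compound": compound, column: call})
    return rows


for row in flatten(NESTED, "rat", "liver"):
    print(row)
```

   Choosing a different species or organ produces a different "slice" of the same underlying records, which is the sense in which each summarized endpoint is one of many possible views of the data model.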
   Issues pertaining to  varied  representations of summa-
rized  toxicity data  in  relation  to  rodent  carcinogenicity
  will be illustrated by consideration  of the recently  updated
  CPDB  Summary Tables  (CPDB  Summary Tables  2007;
  Gold  et  al.  1997,  1999)  and  the  corresponding  DSSTox
  CPDBAS (Carcinogenic Potency Database—All Species) data
  file, CPDBAS_v5a_1547_25Oct2007 (Gold et al. 2007), which contains
  data for 1,547 chemical substances, or records. In contrast
  to a detailed toxicity data model that captures data at the study
  level (Fig. 2),  each column of the published CPDB Summary
  Tables represents many layers of aggregation and consensus,
  with chronic bioassay study results pooled from the results of
  published literature studies and government research  institute
  publications (NTP and NCI) that meet broadly defined criteria
  for CPDB inclusion (CPDB 2007; Gold et al. 1997). Target site
  results are pooled to the level of species/sex, and TD50 potency
  values are computed for pooled species results. There  remains,
  however, potentially rich descriptive "read-across" information
  in the CPDB Summary Tables with respect to compound,
  genotoxicity,  species, sex, target sites, and consensus potency.
  The DSSTox CPDBAS file is an enhanced version of the CPDB
  Summary Tables that includes DSSTox Standard Chemical
  Fields and 32 DSSTox  Source-specific toxicity content fields
  that  are designed to facilitate  SAR  modeling, data mining,
  structure searching, and relational text searching (Table 1). Half
  of the 32 DSSTox Source-specific toxicity content fields listed
  in Table 1 closely correspond to entries in the published CPDB
  Summary Tables (CPDB Summary Tables 2007), whereas the
  remaining fields represent added  or modified content to the
  DSSTox file.  The latter include molar unit fields for TD50
  potency values, text note fields containing expanded footnotes
  from the original CPDB tables, DSSTox notes pertaining to

-------
TABLE 1  List of DSSTox Source-specific toxicity data fields for Carcinogenic Potency Database—All Species (CPDBAS)

CPDBAS_v5a_1547_25Oct2007 (a)           Values                                  Modifications to CPDB entry
Mutagenicity_SAL_CPDB                   "Positive," "negative"                  CPDB
TD50_Rat_mg                             Numeric                                 CPDB: eliminated footnotes
TD50_Rat_mmol                           Numeric                                 Added to DSSTox file
TD50_Rat_Note                           Defined text                            CPDB: expanded text of footnotes
TargetSites_Rat_Male (c)                Defined text                            CPDB: no abbreviations
TargetSites_Rat_Female                  Defined text                            CPDB: no abbreviations
TargetSites_Rat_BothSexes               Defined text                            CPDB: no abbreviations
TD50_Mouse_mg                           Numeric                                 CPDB: eliminated footnotes
TD50_Mouse_mmol                         Numeric                                 Added to DSSTox file
TD50_Mouse_Note                         Defined text                            CPDB: expanded text of footnotes
TargetSites_Mouse_Male                  Defined text                            CPDB: no abbreviations
TargetSites_Mouse_Female                Defined text                            CPDB: no abbreviations
TargetSites_Mouse_BothSexes             Defined text                            CPDB: no abbreviations
TD50_Hamster_mg                         Numeric                                 CPDB: eliminated footnotes
TD50_Hamster_mmol                       Numeric                                 Added to DSSTox file
TD50_Hamster_Note                       Defined text                            CPDB: expanded text of footnotes
TargetSites_Hamster_Male                Defined text                            CPDB: no abbreviations
TargetSites_Hamster_Female              Defined text                            CPDB: no abbreviations
TargetSites_Hamster_BothSexes           Defined text                            CPDB: no abbreviations
TD50_Dog_mg                             Numeric                                 CPDB: eliminated footnotes
TargetSites_Dog                         Defined text                            CPDB: no abbreviations
TD50_Rhesus_mg                          Numeric                                 CPDB: eliminated footnotes
TargetSites_Rhesus                      Defined text                            CPDB: no abbreviations
TD50_Cynomolgus_mg                      Numeric                                 CPDB: eliminated footnotes
TargetSites_Cynomolgus                  Defined text                            CPDB: no abbreviations
TD50_Dog_Rhesus_Cynomolgus_Note         Defined text                            CPDB: expanded text of footnotes
ActivityCategory_SingleCellCall         "1" or "0"                              Added to DSSTox file
ActivityCategory_MultiCellCall          "1" or "0"                              Added to DSSTox file
ActivityCategory_MultiCellCall_Details  "Multisite active," "multisex active,"  Added to DSSTox file
                                        "multispecies active"
                                        "Multisex inactive,"
                                        "multispecies inactive"
Note_CPDBAS                             Memo: version history notes             Added to DSSTox file
NTP_TechnicalReport                     Text                                    Added to DSSTox file
ChemicalPage_URL                        URL to CPDB chemical data page          Added to DSSTox file

Nonblank entries (b), in source order (row alignment lost in extraction): 860, 586, 565, 1,086, 975, 59, 445, 434, 929, 942, 22, 45, 45, 73, 67, 2, 3, 5, 9, 24, 10, 19, 1,544, 1,151, 1,151, 415, 1,547

  (a) Field definitions can be found at: http://www.epa.gov/dsstox/sdf.cpdbas.html #SDFFields.
  (b) Nonblank entries for activity fields include activity measures, as well as indications of "no positive results" out of a total of 1,547 chemical records in
CPDBAS.
  (c) TargetSites field entries are expanded from the three-letter abbreviations used in the CPDB Summary Tables (e.g., bladder, kidney, lung, liver, etc.).
CPDB version updates and affected records, the NTP Technical
Report number where applicable, and a URL (Uniform Re-
source Locator, i.e., website address) field linking to the CPDB
chemical data page summary for each chemical record, so that
these web pages can be located by chemical structure. The numbers of nonblank
entries in each data field are listed in Table 1 to convey chemical
coverage.
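
The molar unit fields mentioned above can be illustrated with a short conversion. This is a sketch under stated assumptions: TD50 values are taken to be in mg/kg body weight/day, as in the CPDB, and the molar value is assumed to be obtained by dividing by the molecular weight of the assigned structure in g/mol. The function name and the example numbers are hypothetical and are not CPDBAS entries.

```python
def td50_mg_to_mmol(td50_mg_per_kg_day: float, molecular_weight_g_per_mol: float) -> float:
    """Convert a TD50 from mg/kg/day to mmol/kg/day.

    Dividing mg by g/mol gives mmol, so no further scaling factor is needed.
    """
    if molecular_weight_g_per_mol <= 0:
        raise ValueError("molecular weight must be positive")
    return td50_mg_per_kg_day / molecular_weight_g_per_mol


# Hypothetical example: two chemicals with the same mass-based potency.
light = td50_mg_to_mmol(200.0, 100.0)   # molecular weight 100 g/mol
heavy = td50_mg_to_mmol(200.0, 400.0)   # molecular weight 400 g/mol
print(light, heavy)
```

Two chemicals with identical TD50 values in mass units differ fourfold in molar potency here, which is why molar units are the more natural basis for comparing potency across structures.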
                        In  addition to the  above content  fields, a set of three
                     "ActivityCategory"  fields have been added to the DSSTox
                     CPDBAS data file to offer a variety of summarized carcinogenic
                     activity representations for possible use in SAR modeling efforts.
                     The first of these fields, ActivityCategory_SingleCellCall, rep-
                     resents a low-evidence, conservative carcinogenic call and is the
                     activity measure most typically employed in past SAR modeling

-------
TABLE 2  Combinations of "SingleCell" and "MultiCell" activity calls in CPDBAS_v5a, broken down by MultiCellCall_Details, with each
row representing a candidate set of Model compounds

              ActivityCategory_  ActivityCategory_  ActivityCategory_MultiCellCall_Details   Total Incidences  Compound
Call          SingleCellCall     MultiCellCall      "Multisite"  "Multisex"  "Multispecies"  CPDBAS_v5a        Set (c)
Active (1)    1                  X (a)              X            X           X               224 (b)           A1
              1                  1                  ✓            X           X               81                A2
              1                  1                  X            ✓           X               113               A3
              1                  1                  X            X           ✓               8                 A4
              1                  1                  ✓            ✓           ✓               123               A5
              1                  1                  ✓            X           ✓               27                A6
              1                  1                  X            ✓           ✓               37                A7
              1                  1                  ✓            ✓           ✓               193               A8
              1                  1                  Total MultiCellCall Incidences (1/1)     582               A9
Inactive (0)  0                  X                  ✓            X           X               169               I1
              0                  0                  ✓            ✓           X               266               I2
              0                  0                  ✓            X           ✓               15                I3
              0                  0                  ✓            ✓           ✓               288               I4
              0                  0                  Total MultiCellCall Incidences (0/0)     569               I5

  (a) "X" denotes a negative condition (i.e., condition not met for compartment).
  (b) Number represents total incidence compounds satisfying listed conditions (e.g., in Model A1, 224 compounds have a SingleCellCall = 1, but do
NOT also have a MultiCellCall = 1; i.e., only 1 tumor target site in one species/sex cell is listed for all 224 compounds in this set).
  (c) Each row represents a set of conditions for defining "Active" and "Inactive" compound sets for use in modeling, such that a model can be specified
by pairing of rows as in [A1:I1] for only SingleCellCall results, or [A4-A8:I3,I4] for pooled "multispecies" results.
studies  of chemical carcinogenicity  (Benigni  1997;  Benigni
et al. 2007a). This field has a value of "1" if at least one tumor
site (e.g., bladder, liver, lung, etc.) is listed for any experimental
animal  "cell" (e.g., rat  male,  rat female, mouse male,  etc.),
and a value of "0" otherwise. Although attractive to modelers
for its large  coverage of chemical  space, this  SingleCellCall
representation of carcinogenic activity most likely overestimates
carcinogenic risk and obscures a number of other potentially
important carcinogenic activity measures contained within the
data file. With  few exceptions, known human  carcinogens
produce tumors in multiple test species, usually at multiple
target sites (Ashby and Patton 1993; Tennant 1993). To facilitate
consideration of more broadly weighted measures of carcino-
genic activity in SAR modeling and data mining, two summary
activity fields, ActivityCategory_MultiCellCall and Activity-
Category_MultiCellCall_Details, were added to the DSSTox
CPDBAS data file. The ActivityCategory_MultiCellCall field
is assigned a value of "1" if a compound meets one or more
conditions of "multisite," "multisex," or "multispecies" across
the 12 CPDBAS TargetSites fields listed in Table 1, and is
assigned a value of "0" only if no tumor sites are reported for
any test cell AND if more than one species/sex cell experiment
was carried out reporting "no positive results." Hence, this field
conveys weighted evidence in support of either an active or
inactive call (1 or 0), as well as weighted evidence of carcinogenic
activity in relation to potential human risk (multisite, multisex,
multispecies).
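
   The rules for the two activity calls can be written out as a short sketch. Several points are assumptions made only for this illustration and are not definitions from the CPDBAS file: each tested species/sex "cell" is a dictionary key mapped to its list of tumor target sites (an empty list standing for "no positive results"); "multisite" is read as more than one distinct target site across all cells; "multisex" as tumors in more than one sex; and "multispecies" as tumors in more than one species. The function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Cell = Tuple[str, str]          # (species, sex) of one tested cell
Cells = Dict[Cell, List[str]]   # tumor target sites reported for each cell


def single_cell_call(cells: Cells) -> int:
    """1 if at least one tumor site is listed for any cell, otherwise 0."""
    return 1 if any(sites for sites in cells.values()) else 0


def multi_cell_details(cells: Cells) -> List[str]:
    """List the multisite/multisex/multispecies conditions that are met."""
    positive = {cell: sites for cell, sites in cells.items() if sites}
    distinct_sites = {s for sites in positive.values() for s in sites}
    sexes = {sex for (_species, sex) in positive}
    species = {sp for (sp, _sex) in positive}
    details = []
    if len(distinct_sites) > 1:
        details.append("multisite")
    if len(sexes) > 1:
        details.append("multisex")
    if len(species) > 1:
        details.append("multispecies")
    return details


def multi_cell_call(cells: Cells) -> Optional[int]:
    """1 if any condition is met; 0 only for negatives in more than one
    tested cell; None (blank) when neither weighted call applies."""
    if multi_cell_details(cells):
        return 1
    if single_cell_call(cells) == 0 and len(cells) > 1:
        return 0
    return None


one_site = {("rat", "male"): ["liver"], ("rat", "female"): []}
broad = {("rat", "male"): ["liver", "kidney"], ("mouse", "female"): ["lung"]}
clean = {("rat", "male"): [], ("mouse", "male"): []}

for cells in (one_site, broad, clean):
    print(single_cell_call(cells), multi_cell_call(cells), multi_cell_details(cells))
```

   The first example is active by the SingleCellCall but receives no MultiCellCall, the situation described for the compounds that are exclusively SingleCellCall active.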
   Table 2  lists a variety  of conditions for defining active
(9) and inactive  (5)  sets  of compounds with  respect  to
carcinogenicity,  derived from  different combinations of Sin-
gleCellCall and MultiCellCall conditions.  These compound
sets provide various views of the carcinogenic activity landscape
within CPDBAS. A particularly interesting observation is that
Compound Set A9, with presumably the most restrictive set of
                                               conditions ("multisite" AND "multisex" AND "multispecies")
                                               has the largest number of compounds for any MultiCellCall
                                               active compound  set,  which  supports a nondiscriminating
                                               biological generality  to the  carcinogenic response for a sig-
                                               nificant fraction  of compounds. A number  of questions can
                                               be  easily  posed and  explored  within CPDBAS  by virtue of
                                               the structure-activity-annotation and data organization of this
                                               file, such  as: How  do the chemical and biological landscapes
                                               of  these  compound  sets  differ? Do particular  target  sites,
                                               or combinations of target sites, cluster preferentially in any of
                                               these active  sets? Does genotoxic activity cluster differently
                                               in  these active  sets?  Are there  distinguishing characteristics
                                               (biological or chemical) of those compounds that are exclusively
                                               SingleCellCall active (e.g., mouse liver carcinogens)? Which
                                               chemical  sets have greater biological relevance,  or  produce
                                               more internally consistent models? In addition, with increasing
                                               standardization across the toxicity data landscape, including
                                               genotoxicity and other data domains, it becomes possible to bring
                                               added layers of information to bear on these sorts of questions.
                                                  In addition to  elaborating  toxicity data  fields within  in-
                                               dividual DSSTox data  files,  the DSSTox Project has moved
                                               in  the  past  few years  to  publish important toxicity-related
                                               chemical inventories  within the EPA and the NTP. These in-
                                               clude "structure-index" files, containing only DSSTox Standard
                                               Chemical Fields, of  chemical inventories for new predictive
                                               toxicology initiatives, such  as  the  EPA's ToxCast  Program
                                               and the NTP High-Throughput Screening Program (DSSTox
                                               File Names:  TOXCST, NTPHTS) (Houck et al.  2007; Smith
                                               et al. 2007). These  files  are serving to centrally structure-index
                                               bioassay data being generated in the ToxCast and NTP HTS
                                               testing  programs, enabling these programs to interface with
                                               PubChem and ACToR, which will house these bioassay data.
                                               In  addition, "structure-index locator" files are published  for
A. M. Richard et al.
-------
[Figure 5 screenshot: DSSTox Structure-Browser v1.03, Search Details view for the drawn Atrazine structure. Results by type: exact matches, 1; substructures, 2; similarity > 80%, 6. The listing includes DSSTox substance ID 20112 (similarity score 100; single chemical compound; found in CPDBAS, IRISTR, HPVCSI, and TOXCST) and ID 27603 (similarity score 96.3; 1,3,5-Triazine-2,4-diamine, 6-chloro-N-(1,1-dimethylethyl)-N'-ethyl-; CASRN 5915-41-3; single chemical compound; found in HPVCSI).]
FIGURE 5  View of the DSSTox Structure-Browser resulting from a search initiated from the drawn Atrazine structure, showing Search
Details across DSSTox data files.
on-line toxicity data inventories,  including the NTP On-line
Database (NTP On-line Database 2007; Burch et al. 2007) and
the EPA High Production Volume Information System (EPA
HPV 2007; Wolf et al. 2007) (DSSTox File Names: NTPBSI,
HPVISD). These files include URLs to chemical-specific source
data pages to enable  a user to be linked directly to on-line
data web pages through a structure search. DSSTox files for the
on-line CPDB Summary Tables (Gold et al. 2007) and the EPA
IRIS (Integrated Risk Information System) Database (Backus
et al. 2007) (DSSTox  File Names: CPDBAS, IRISTR) include
both a rich complement of data (over 30 toxicity data fields in
both cases) and URLs for chemical-specific data pages linking
to extensive on-line data resources on the EPA IRIS and the
CPDB source websites (EPA IRIS 2007; CPDB 2007).
   With  the  recent launch  of the public DSSTox Structure-
Browser (Transue and Richard 2007), users can now perform
on-line structure searches (exact match/substructure/similarity)
of the entire published DSSTox chemical inventory, with the
option to search or view results by DSSTox  data file. The
DSSTox Structure-Browser is built from publicly available tools
and  open  access source code, and is  designed  to be easy
to understand and  use by regulators and toxicologists, with
hundreds  of hot links provided to file names and download
pages, field definitions, help pages, and URL links to database
documentation and  Source websites.  A specially designed
feature  of this browser  is  that it  can  be  easily called  up
(http://www.epa.gov/dsstox_structurebrowser/)  from a source-
collaborator website (e.g., CPDB, NTP,  IRIS,  HPV-IS), and
                                               can be directed by a simple extension to the URL (e.g.,
                                               http://www.epa.gov/dsstox_structurebrowser/?dbs=NTPBSI)
                                               to confine the structure search to the corresponding published
                                               DSSTox  structure-locator  file  (Transue and  Richard 2007).
                                               This effectively delivers local structure-search capability directly
                                               to  previously isolated on-line toxicity  databases.  The  NTP
                                               On-line Database (NTP On-line Database 2007) has already
                                               taken  advantage of  this  new  capability  and  for the first
                                               time  is  offering structure searching  through  the DSSTox
                                               Structure-Browser directly  from its website.  In  addition, the
                                               EPA's IRIS and HPV-IS websites (EPA IRIS 2007;  EPA  HPV
                                               2007)  are expected to incorporate  this feature in the  near
                                               future.
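The URL extension described above is a single query parameter, so a collaborating website could generate its links with a few lines of code. The sketch below is illustrative only: the helper name `browser_url` is hypothetical, the base address is the one quoted in this article (and may no longer resolve), and NTPBSI is used as the example because it is the DSSTox file name given earlier for the NTP On-line Database.

```python
from typing import Optional
from urllib.parse import urlencode

# Base address as quoted in the article (2007-era; may no longer be live).
BASE_URL = "http://www.epa.gov/dsstox_structurebrowser/"


def browser_url(dbs: Optional[str] = None) -> str:
    """Build a Structure-Browser link, optionally confined to one DSSTox file.

    `dbs` is the DSSTox file name (e.g. "NTPBSI"); with no argument the
    link points at the unrestricted browser.
    """
    if dbs is None:
        return BASE_URL
    return BASE_URL + "?" + urlencode({"dbs": dbs})


print(browser_url("NTPBSI"))
```

Using `urlencode` rather than string concatenation keeps the link valid even if a file name were ever to contain characters that need escaping.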
                                                 Accessing the DSSTox Structure-Browser directly, a user can
                                               perform a name or CASRN text search or, by entering a SMILES
                                               string or drawing a structure, can perform exact/substructure/similarity
                                               searches across the entire published DSSTox chemical inven-
                                               tory,  with search results displayed by generic substance ID.
                                               Sample screen shots of structure-search results for "Atrazine"
                                               are shown in Figures 5 and 6. Figure 5 illustrates the structure
                                               "read-across" capability to  identify this substance and analog
                                               substances in multiple DSSTox data files. The Substance Results
                                               page shown for IRISTR (Fig. 6) provides a URL  link for "EPA
                                               IRIS  Chemical  Substance  Data Page" that  takes the user to
                                               the on-line IRIS Quick View Document for "Atrazine" (EPA
                                               IRIS  2007).  The current  published  DSSTox  chemical  data
                                               inventory will soon be deposited into PubChem (estimated by
                                               March  2007), with data files such as CPDBAS and IRISTR,
                                                                                Toxicity Data Informatics

-------
[Figure 6 screenshot: DSSTox Structure-Browser v1.03, Substance Results view for IRISTR, the EPA Integrated Risk Information System (IRIS) Structure-Index Locator File (544 records; IRISTR_v1a_544_28Jul2007), with links to the IRISTR source website and the IRIS Chemical Substance Data Page. The record shown is Atrazine (DSSTox_RID 23892; DSSTox_Generic_SID 20112; CASRN 1912-24-9; single chemical compound). The oral RfD fields are populated (critical effects: decreased body weight gain; cardiac toxicity; moderate-to-severe dilation right atrium; 0.035 mg/kg-bw/day; NOAEL 3.5 mg/kg-day; confidence High); the remaining assessment fields read "Not assessed under the IRIS program."]
FIGURE 6   View of the DSSTox Structure-Browser resulting from a search initiated from the drawn Atrazine structure, showing
Substance Results within a particular DSSTox data file.
which contain many columns of summary toxicity information,
deposited under several DSSTox-defined PubChem "assay IDs,"
or AIDs, to take full advantage of new and evolving PubChem
assay search and SAR clustering features. Once the files are deposited,
PubChem users will be provided direct access to the DSSTox
data inventory and chemical data page URLs. In addition,
PubChem CIDs (Compound IDs) will map onto DSSTox
CIDs and will enable auto-generation of URLs for PubChem
CID data pages to be  directly incorporated into the DSSTox
Structure-Browser. This will allow users to link from the DSSTox
Substance Results page directly to the corresponding PubChem
Compound (CID) results page by the  click of a button, provid-
ing easy access to  PubChem bioassay results, structure-analog
pages, and PubChem links to data on  external sites,  such as
TOXNET (NLM TOXNET 2007). A similar CID compatibility
will enable direct linkage from DSSTox Substance Results pages
to the EPA ACToR system, discussed in the next section. Hence,
new structure-searching and locating capabilities are effectively
opening a bidirectional information highway  and  new  data
mining opportunities between previously isolated toxicity data

                          inventories and the larger world of structure-indexed bioassay
                          information (Richard et al. 2006).

                          ACToR: WHERE IS THE DATA AND WHAT TYPES OF DATA ARE OUT THERE?
                             The ACToR program  (Aggregated Computational Toxicol-
                          ogy Resource)  is tackling  the  larger objective  of surveying
                          all publicly available chemical toxicity resources of potential
                          interest, and building tools to allow the construction of toxicity
                          data sets  for chemical structure "read-across" and  modeling.
                          An effort of this kind faces three  significant problems. First,
                          it is often  useful to merge different types  of data to build
                          a predictive model (e.g.,  chemical  structure, biochemical and
                          other in vitro data  and in vivo toxicology data);  second, these
                          data are likely to be distributed over a wide range of sources; and,
                          third, the data in each of those  sources is typically organized
                          in a unique scheme. The ACToR program  is taking a broad

-------
TABLE 3  Assay categories being incorporated into the EPA Aggregated Computational Toxicology Resource (ACToR) database to
allow a user to broadly survey data availability for chemicals of interest
No.  Assay category                 Description                              Examples
1    Physicochemical                Physical and chemical properties         Molecular weight, logP, boiling point
                                    (in vitro and/or in silico)
2    Biochemical                    Biochemical (non-cell-based)             Enzyme inhibition or receptor binding
                                    (in vitro and/or in silico)              constants
3    Genomics                       Gene expression values or signatures     Result of in vitro or in vivo microarray
                                                                             analysis
4    Cellular                       Cell-based assay                         Cell culture cytotoxicity
5    Tissue                         Tissue slice assays                      Tissue slice cytotoxicity
6    In vivo toxicology             Tabulated results from primary           Clinical chemistry, histopathology,
       (tabular primary)            animal-based studies of chemical         developmental, and reproductive
                                    effect                                   assays
7    In vivo toxicology             Primary studies are available but        Clinical chemistry, histopathology,
       (study listing primary)      have not been tabulated                  developmental, and reproductive
                                                                             assays
8    In vivo toxicology             Tabulated data from secondary sources    Clinical chemistry, histopathology,
       (tabular secondary)          of in vivo toxicology studies            developmental, and reproductive
                                                                             assays
9    In vivo toxicology             Derived summary determinations of risk   Chemicals determined to pose a defined
       (summary calls)                                                       risk of human cancer
10   In vivo toxicology             Links to text reports on the web for     Reports from EPA Integrated Risk
       (summary report via URL)     which specific data values are not       Information System (IRIS), National
                                    directly accessible in tabular form      Toxicology Program (NTP)

approach to managing these three issues by bringing together
as much publicly available data on chemicals of environmental
interest as possible, into a limited number of databases linked
together by unique chemical identifiers (Judson et al. 2007).
ACToR is  being  built with open source  MySQL relational
database technology (http://www.mysql.com/). The web-based
system is presently  available in  a prototype version on the
EPA intranet, and is scheduled for public release in 2008.
Currently, the largest components of ACToR are the chemical
and assay databases, which contain chemical structure, in vitro
bioactivity data, and summary toxicology data for over 500,000
compounds derived  from more  than 200  sources, including
the EPA; FDA; Centers for Disease  Control and  Prevention
(CDC); National Institutes of Health (NIH); state agencies;
corresponding government agencies in  Canada, Europe, and
Japan;  universities; the World Health Organization  (WHO);
and nongovernmental organizations (NGOs). In  the past year,
the DSSTox internal data structure has been redesigned to align
with both PubChem and ACToR, and all of the structures and
data from the DSSTox  inventory now  are incorporated  into
ACToR.  In addition, ACToR is serving as the  primary  data
management  system for the ToxCast program (Dix  et al. 2007)
and will include the  ToxRefDB  reference data,  as well as all
newly generated HTS data. ACToR aggregates data from a large
number of sources and presents the data in a unified format on a
chemical-by-chemical basis. The system is designed  to facilitate
the construction of modeling data sets, which can consist of
many chemicals and data from many assays.
   Data that are potentially of interest for chemical toxicology
modeling exist in many places and  forms and, as discussed
previously,  can contain  multiple levels of detail. For ACToR
                       construction, it  is accepted that there is no single available
                       schema that is ideal for holding all types of existing chemical-
                       activity data; hence, several subschema are incorporated, often
                       by copying other databases  in their entirety.  For instance,
                       ACToR-affiliated databases include  ToxRefDB, already de-
                       scribed,  and EPA's  High  Production Volume  Information
                       System (EPA HPVIS 2007), which contains detailed data from
                       Organisation for Economic  Cooperation  and  Development
                       (OECD) guideline  SIDS  (Screening  Information  Data Set)
                       in vivo studies on high-production volume chemicals, along
                       with other industry-submitted environmental and toxicological
                       data.
                          The diverse data that exist on environmental chemicals are
                       currently incorporated into ACToR and organized according to
                       the basic categories listed in Table 3. Some of these types of data
                       are directly usable in modeling efforts (Nos.  1-6, 8, 9), whereas
                       others simply provide pointers to more detailed data (often still
                       in text form) (Nos. 7,  10)  that could be  mined manually to
                       build relevant tabular data sets for specific  sets of chemicals.
                       The benefit of having  all of these  data available through the
                       ACToR web interface is that it is relatively  straightforward to
                       build a list of target chemicals and then to systematically extract
                       data sets for further analyses.
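The workflow just described (build a target list, then extract data sets) can be sketched as follows. This is a simplified illustration and does not reflect the actual ACToR schema or query interface: the record layout `(casrn, category_no, assay_name, value)` and the function name `build_data_set` are assumptions. The only elements taken from the text are the Table 3 category numbers and the split between directly usable categories (Nos. 1-6, 8, 9) and pointer-only categories (Nos. 7, 10).

```python
from collections import defaultdict

# Table 3 category numbers, split as described in the text.
DIRECTLY_USABLE = {1, 2, 3, 4, 5, 6, 8, 9}
POINTER_ONLY = {7, 10}


def build_data_set(records, target_casrns):
    """Split records for the target chemicals into model-ready values and pointers.

    records: iterable of (casrn, category_no, assay_name, value) tuples
             (a hypothetical flat export, not the real ACToR layout).
    Returns (table, pointers): table[casrn][assay_name] = value for usable
    categories; pointers[casrn] = list of values (e.g. URLs) that would
    still need manual mining.
    """
    targets = set(target_casrns)
    table = defaultdict(dict)
    pointers = defaultdict(list)
    for casrn, category_no, assay_name, value in records:
        if casrn not in targets:
            continue  # chemical is not on the target list
        if category_no in DIRECTLY_USABLE:
            table[casrn][assay_name] = value
        elif category_no in POINTER_ONLY:
            pointers[casrn].append(value)
    return dict(table), dict(pointers)
```

Keeping the pointer-only records in a separate structure makes it easy to see which target chemicals have data that still exist only as text reports.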
                          ACToR is  specifically  designed to aggregate all available
                       information for each of a large number of chemicals. Currently,
                       CASRNs are employed as a unique identifier to link data from
                       multiple sources. Using CASRNs for this purpose has several
                       known drawbacks: they are not always available or unique for
                       a given substance (e.g., CASRNs can be retired and replaced),
                       they  do not typically  distinguish to the  level of compound
                       purity grade (e.g., analytical vs. technical  grade), and  they

-------
are tied to a nonpublic registry system (Chemical Abstracts
Service, http://www.cas.org/). Nonetheless, these are the most
consistent and widely used  chemical identifiers for indexing
public and environmental regulatory data resources, and they
index  many  existing public databases  for  which chemical
structures are currently unavailable. Hence, for the larger "read-
across" objectives of ACToR, CASRNs are sufficiently general
and widely available to serve as the initial basis for compound
indexing and aggregation.
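When CASRNs serve as the linking key across many sources, a cheap first defense against transcription errors is the check digit built into the CASRN format itself. The sketch below applies the standard CAS check-digit rule; it is offered as a general illustration and is not described in this article as part of ACToR. The function name `is_valid_casrn` is hypothetical. Note that a valid check digit does not address the drawbacks listed above, such as retired or replaced registry numbers.

```python
import re

# CASRN layout: 2-7 digits, hyphen, 2 digits, hyphen, 1 check digit.
_CASRN_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_casrn(casrn: str) -> bool:
    """Return True if the string has CASRN format and a correct check digit.

    The check digit is the weighted sum of the other digits (rightmost
    digit weighted 1, next weighted 2, and so on) taken modulo 10.
    """
    match = _CASRN_PATTERN.match(casrn.strip())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    weighted_sum = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(digits), start=1)
    )
    return weighted_sum % 10 == check_digit


print(is_valid_casrn("1912-24-9"))  # Atrazine, as cited in this article
```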
   Data are being systematically imported into ACToR from
a large  number  of public sources,  which are  referred to as
"data collections." A data collection will usually include  a set
of substances (unique chemicals) and may have corresponding
compounds (chemical  structures) and  one  or  more  assays.
The  largest source of data  currently  in ACToR in terms
of substances and  assay  data  points  is  PubChem  (NCBI
PubChem 2007), which  is  itself a  compilation of multiple
data  sources (57  of which currently  have data included
in ACToR).  Most  assay  data  in PubChem originate from
HTS  assays run by the Molecular  Libraries Screening Cen-
ters Network (MLSCN) (Austin et  al. 2004) on compounds
from  the  Molecular Libraries   Small  Molecule  Repository
(http://mlsmr.glpg.com/MLSMR_HomePage/).  However, the
vast  majority of chemicals in  PubChem  have  no  assay
data  and  come  from  collections  of molecular  structures
from  chemical  manufacturer catalogs (e.g.,  SIGMA-Aldrich,
http://www.sigmaaldrich.com/)  or virtual screening libraries
(e.g., ZINC, http://zinc.docking.org/). The balance of the data
collections within ACToR pertain more specifically to environ-
mental chemicals and were extracted from a wide range of public
governmental and nongovernmental resources listed previously.
   To  be  included  in ACToR, a data  collection must meet
several criteria.  First, it has  to be publicly available with no
restrictions on redistribution. An important goal of the ACToR
project is to create a widely  usable,  freely distributable, open
source system. Any conclusions  drawn from these data  should
be subject to independent confirmation, which is made possible
by this open source data model. Second, if a source consists of
a web-accessible database, an index of the chemicals  in the
database is required in order to link that web resource back into
ACToR.
   The ACToR system is currently being employed to support
the ToxCast environmental screening and prioritization project,
seeking to identify among large  lists  of chemicals of potential
environmental interest those  for which some reference toxicity
data exist and  those that are  good candidates for toxicity
screening.  The goal of the ToxCast program is to screen all
widely used chemicals using in  vitro and in silico techniques
and  to prioritize for detailed  testing those chemicals  with
"signatures" indicating possible toxicity. The derivation of these
predictive  signatures is the object of phase I of the ToxCast
project, which is  currently under way (Dix et  al. 2007;  EPA
ToxCast 2007).
   It is estimated that there are about 30,000 unique chemicals
in widespread use in many countries, and for a high percentage
of these there  is little  available toxicology information. A
first  challenge is  to  identify these chemicals and  important
subsets that should be subject to a screening effort. ACToR has
incorporated  ~500,000 unique  chemicals from many sources
(where  a unique chemical  corresponds to  one CASRN  in
ACToR). The initial step in this data  construction was  to
select several data collections that comprise  the  most  widely
  used chemicals in the United States and build an overlap set.
  These comprise High Production Volume chemicals
  (HPVs) (EPA HPV 2007), pesticide and antimicrobial active
  and inert ingredients (EPA Pesticides  2007), the EPA Toxic
  Release Inventory (EPA TSCA 2007), the EPA drinking water
  contaminant lists (EPA Drinking Water 2007), chemicals that
  are part of the EPA's endocrine disrupter screening  program
  (EPA  EDSP  2007),  and  the  so-called medium-production
  volume chemicals listed as part of the  EPA Inventory Update
  Rule list  (EPA IUR 2007), which are chemicals manufactured
  in excess of 10,000 lb/year. In total, there are 11,139 unique
  chemicals in this aggregated collection.
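Merging several inventories into one list of unique chemicals amounts to a set union keyed on CASRN, with a record of which lists each chemical came from. The following sketch assumes each data collection is available as a plain list of CASRN strings; the function name `build_overlap_set` and the collection labels used in any call are placeholders, not ACToR identifiers.

```python
def build_overlap_set(collections):
    """Merge inventories into one mapping of CASRN -> source collection names.

    collections: dict mapping a collection label to an iterable of CASRNs.
    The number of keys in the result is the count of unique chemicals;
    the size of each value shows how many inventories list that chemical.
    """
    membership = {}
    for name, casrns in collections.items():
        for casrn in casrns:
            membership.setdefault(casrn, set()).add(name)
    return membership
```

Chemicals that appear in many inventories at once can then be picked out by filtering on the size of each membership set.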
    One can easily browse individual chemicals within ACToR
  to determine what  data have been  derived  from  the data
  collections. For comparison to earlier  example results  shown
  for Leadscope and DSSTox (Figs. 1, 5, and 6), the  chemical
  "Atrazine," CASRN  [1912-24-9] is  chosen to illustrate the
  types of information indexed and easily retrieved in  ACToR.
  ACToR  screenshots for a  portion of the Atrazine-retrieved
  results are presented in Figure  7. Information for Atrazine is
  found in 54 different data collections. It has 31 physicochemical
  parameters estimated using the EPA EPI Suite package (EPA
  EPI Suite 2007) and over 1,000 HTS assay values (these include
  multiple  dose measures in the  same assay) for a wide variety
  of biochemical and  cell-based assays contained in PubChem.
  There is tabulated in vivo toxicology data from IRIS, CPDB,
  World Health Organization (WHO) Classification of Pesticides
  (WHO IPCS 2007), the EPA Drinking Water Standards and
  Health Advisories (EPA Water  Quality 2007), and  the EPA
  Risk-based  Concentrations database (EPA RBC  2007).  The
  Pesticide  Action Network (PAN 2007), Scorecard  (Scorecard
  2007), and the EPA Office of Pesticide Programs (EPA Pesticides
  2007) each provides summary toxicity determinations,  and PAN
  and Scorecard indicated that Atrazine is a suspected carcinogen,
  whereas the EPA  has determined that it  is not likely to be
  a human carcinogen. However, all three sources indicate that
  the chemical is a reproductive toxin, and the EPA further lists
  the mode of action  as endocrine disruption. The chemical is
  manufactured or used in amounts between 10 M lb/year and
  50 M lb/year in the United States, is on the EPA Superfund
  Amendments and  Reauthorization Act 110 Superfund  Site
  Priority Contaminant List (EPA SARA  2007), and is subject to
  the National Primary Drinking Water Regulations (EPA Water
  Quality 2007) and the  New Jersey Right-to-Know Hazardous
  Substances Regulation  (NJ Hazardous Substances  2007).  A
  Material  Data Safety Sheet is available from the CDC (CDC
  ISCS  2007). Using  the MESH (Medical Subject Headings)
  link to PubMed (PubMed MESH 2007),  one can find 1969
  references on Atrazine, including 41 reviews. Using the link to
  TOXNET DART  (Developmental and Reproductive Toxicol-
  ogy Database; NLM TOXNET 2007),  one finds 63 references
  relevant to reproductive toxicology. Genetic toxicology and
  immunology studies are available from the NTP (NTP On-line
  Database 2007).
    A first approximation of the total world of data  available
  for understanding toxicity can be gleaned from some summary
  statistics  within ACToR for the set of 11,139 environmental
  chemicals selected for screening and prioritization. Chemical
  structures are readily available for 7,990 (72%) of the target list.
  This relatively low rate is a consequence of the fact that many
  of these  high-use  chemicals are complex mixtures, including
  petroleum streams and substances such  as plant oils, tars, ashes,

-------
[Figure 7 screenshots: ACToR Chemical Summary page for Atrazine (CASRN 1912-24-9; formula C8H14ClN5; SMILES ClC1=NC(=NC(=N1)NC(C)C)NCC). Panels shown: Substances (SCID, name, data collection, source SID; entries include the DSSTox files CPDBAS, HPVCSI, IRISTR, NCTRER, NTPBSI, NTPHTS, and ToxCast_320, as well as ATSDR ToxFAQ, EDC73, EPA DWC, EXTOXNET, INCHEM IARC, and ITER TERA); Assay Data by Assay Category; Assay Data by Phenotype / Study Type; Deep Data Tables (TOXNET Toxicology, EPA HPVIS, ToxRefDB).]
FIGURE 7  Sample views of the EPA Aggregated Computational Toxicology Resource (ACToR) results for Atrazine showing general
substance characteristics, incidence across ACToR data collections, and types of data available.
and plant products. About half of these chemicals have some
publicly available toxicology data within the sets of information
currently compiled. Primary in vivo toxicology data (taken
into ACToR from the original testing source) are available for
1,447 chemicals (13%), and secondary in vivo toxicology data
(taken into ACToR from a secondary source that compiles and
summarizes data from primary sources) are available for a total of
1,405 chemicals (13%). A total of 5,205 chemicals (47%) have
one or more summary in vivo toxicity calls or determinations,
which are derived by experts who have curated data from the
primary scientific literature. Finally, 5,244 chemicals (47%) have
one or more summary text reports on chemical toxicity available
on the Internet. However, many of these, especially from
the European Substances Information System Low Production
Volume list (ESIS HPV-LPV 2007), simply state that no hazard
or toxicology information is available for that chemical. These
are conservative numbers, as there are still large collections
of data yet to be compiled and loaded into ACToR. The
bottom line, however, is that there is relatively little detailed
in vivo toxicology information available for the majority of
these environmental chemicals. The toxicology "data gap" for
commonly used chemicals is well known (Applegate and Baer
2006).
   ACToR is a rapidly evolving system. Future developments
will include incorporation of additional data collections,
extraction of tabular data from on-line text documents linked
to chemicals, addition of more curated chemical structures,
and the construction of a more flexible query and data export
interface. ACToR will be used for constructing training and
validation data sets for the ToxCast chemical screening and
prioritization effort, and for building computational models
linking chemical structure with in vitro and in vivo assays. It
is anticipated that this large structure-searchable database will
also be a valuable resource for reviewers within the EPA and
other regulatory agencies who are examining new chemicals
submitted for marketing approval. In relation to the toxicity
data models and the DSSTox project discussed previously,
ACToR can be considered a super aggregator and data mining
facilitator. Since it is a fully relational database system, ACToR
is capable of incorporating databases such as ToxRefDB in
their entirety, retaining the full internal data model structure.
Efforts to encourage toxicity data model standards, such as
ToxML, will equally enhance "read-across" data capability both
within the aggregated ACToR system and across federated,
or independently maintained, databases on the Internet. The
strengths of the DSSTox project, in terms of quality
structure-toxicity annotation of a growing data inventory, will also
be fully incorporated into ACToR. In return, the ACToR system will
add full relational searchability across all DSSTox data fields
and files, wherever fields are standardized and "read-across" is
possible.
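
The aggregation idea described above can be illustrated with a minimal
sketch. It assumes a hypothetical two-table layout in which every record
from every source collection is keyed to a shared substance identifier, so
that one query can "read across" collections for a single chemical. The
table names, column names, and example rows are illustrative assumptions
and are not ACToR's actual schema or data; only the collection labels
(CPDBAS, IRISTR, NTPBSI) are taken from the text.

```python
import sqlite3

# Hypothetical rows: (collection, category, value). The values are
# placeholders, not real assay results.
DEMO_ROWS = [
    ("CPDBAS", "Carcinogenicity", "placeholder-1"),
    ("IRISTR", "ChronicTox", "placeholder-2"),
    ("NTPBSI", "Carcinogenicity", "placeholder-3"),
]


def build_demo_db():
    """Create an in-memory database using an assumed two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE substance (
            substance_id INTEGER PRIMARY KEY,
            casrn        TEXT UNIQUE,
            name         TEXT
        );
        CREATE TABLE assay_result (
            result_id    INTEGER PRIMARY KEY,
            substance_id INTEGER REFERENCES substance(substance_id),
            collection   TEXT,
            category     TEXT,
            value        TEXT
        );
        """
    )
    # One substance record shared by all source collections.
    conn.execute(
        "INSERT INTO substance (substance_id, casrn, name) VALUES (?, ?, ?)",
        (1, "1912-24-9", "Atrazine"),
    )
    conn.executemany(
        "INSERT INTO assay_result (substance_id, collection, category, value) "
        "VALUES (1, ?, ?, ?)",
        DEMO_ROWS,
    )
    conn.commit()
    return conn


def read_across(conn, casrn):
    """Return {category: [collections]} for one substance, across collections."""
    rows = conn.execute(
        """
        SELECT r.category, r.collection
        FROM assay_result AS r
        JOIN substance AS s ON s.substance_id = r.substance_id
        WHERE s.casrn = ?
        ORDER BY r.category, r.collection
        """,
        (casrn,),
    ).fetchall()
    summary = {}
    for category, collection in rows:
        summary.setdefault(category, []).append(collection)
    return summary


if __name__ == "__main__":
    demo = build_demo_db()
    print(read_across(demo, "1912-24-9"))
```

Keying every collection to one substance table is the design choice that
makes the cross-collection query a single join; a federated arrangement
would instead need the same identifier standardized in each independent
database.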


                  CONCLUSIONS

   A  confluence of new public  data initiatives  focused on
the world of chemical toxicity information, as  represented
here by ToxML, DSSTox,  and ACToR, is  creating  a more
unified, linked landscape of public toxicity data and propelling
the field  of toxicity data informatics forward.  The essential
elements  for advancing such capabilities are  shared standards
for chemical and toxicity data representation, involvement and
engagement  of toxicity domain experts,  transparent rules for
data aggregation, and chemical structure and text "read-across"
in relation to all levels of toxicity data and chemical-indexed
information.
   A prominent concern of toxicologists encountering these
tools for the first time is how data quality and the sufficiency
of an experimental toxicology study are judged in the process
of data capture. A second concern is the role of toxicology
domain experts in defining appropriate means for aggregating
data into meaningful "endpoints" for modeling. Any predictive
toxicology approach, including SAR modeling, must rely heavily
on the peer review process and author conclusions to
self-regulate the quality of published
toxicology studies and reports. Having said that, however,
when a data model  succeeds  in faithfully capturing  the key
elements  of a toxicological  experiment,  sufficient to  allow a
toxicology domain expert to judge the overall merits of the
study and its conclusions, it has delivered a useful and unbiased
representation of that experiment. Furthermore,  a data model
that lends itself to flexible means for aggregating toxicity data
and defining summarized endpoints for modeling can easily
support alternative  viewpoints and changing attitudes  from
within the toxicology community as to what endpoints are more
meaningful and relevant  to  hazard identification  and  human
risk assessment. Hence, the engagement of toxicology experts
in this process is essential to this enterprise of building useful
predictive models.

     There are three general types of currently available databases
  and technology in the hazard identification and risk assessment
  fields: (1) databases storing the results of toxicity experiments,
  (2) databases  for use by regulatory agencies, and (3) aggregated
  databases to support SARs and predictive modeling. A vision
  of the future is  that these three levels of databases will be
  stored  in the same database with  a well-designed database
  schema and will communicate seamlessly.  Initiatives such as
  ACToR, DSSTox,  and ToxML  are significant  steps  in  this
  direction. Entry of new toxicity data in the public domain occurs
  largely from  government and  academic research institutions
  and government regulatory agencies, through past, present and
  future data submissions and public  disclosure laws. With the
  broad adoption  of  shared  data  standards  and  data  models,
  and quality review of chemical information, data submissions
  from the private sector to a regulatory agency can  take place
  electronically and seamlessly between respective databases. All
  stakeholders in this process stand to benefit from the growth
  of quality curated and standardized public toxicity  databases,
  and the field  of predictive toxicology will be one of the largest
  beneficiaries.
                      REFERENCES

  Ames, B.  N. 1971. The detection of chemical mutagens with  enteric
       bacteria.  In Chemical Mutagens: Principles and Methods for Their
       Detection, Vol. 1, ed. A. Hollaender,  Plenum  Press, New York-
       London 267-282
  Applegate, J. S., and Baer, K. 2006. Strategies for closing the chemical
       data gap. Center for Progressive Reform (White Paper) 602:1-19.
  Ashby, J., and Paton, D. 1993. The influence of chemical structure on the
       extent and sites of carcinogenesis for 522 rodent carcinogens and
       55 different human carcinogen exposures. Mutat. Res. 286:3-74.
  Auletta, A. E., Brown, M., Wassom, J. S., and Cimino, M. C. 1991. Current
       status of the Gene-Tox Program. Environ. Health Perspect. 96:33-36.
  Austin, C. P., Brady, L. S., Insel, T. R., and Collins, F. S. 2004. NIH Molecular
       Libraries Initiative. Science 306:1138-1139.
  Backus, G. S., Wolf, M. A., Burch, J., and Richard, A. M. 2007. DSSTox
       EPA Integrated Risk Information System (IRIS) Toxicity Review Data:
       SDF File and Documentation: IRISTR_v1a_544_28Jul2007.
       www.epa.gov/ncct/dsstox/sdf_iristr.html. Accessed January 24, 2008.
  Benigni, R. 1997. The first US  National Toxicology Program exercise on
       the prediction of  rodent  carcinogenicity: definitive results.  Mutat.
       Res.  387:35-45.
  Benigni, R., Bossa, C., Richard, A. M., and Yang, C. 2007b. A novel
       approach: Chemical relational databases, and the role of the
       ISSCAN database on assessing chemical carcinogenicity. Annali
       dell'Istituto Superiore di Sanità. In press.
  Benigni, R., Netzeva, T. I., Benfenati, E., Bossa, C., Franke, R., Helma, C.,
       Hulzebos, E., Marchant, C., Richard, A., Woo, Y. T., and Yang, C.
       2007a. The expanding role of predictive toxicology: an update on
       the (Q)SAR models for mutagens and carcinogens. J. Environ. Sci.
       Health C 25:53-97.
  Burch, J.,  Eastin, W.  C., Bowden,  B.,  Wolf, M.  A., and Richard,
       A.   M.  2007.  DSSTox   National  Toxicology   Program  Bioas-
       say  On-line  Database  Structure-Index  Locator File:  SDF File
       and  Documentation, NTPBSI_v2a_2293_24Aug2007.www.epa.gov/
       ncct/dsstox/sdf_ntpbsi.html. Accessed January 24, 2008.
  CDC ICSC. 2007.  Centers  for  Disease  Control  and  Prevention,
       National  Institute for Occupational  Safety and  Health,  Inter-
       national  Chemical Safety  Cards, http://www.cdc.gov/niosh/ipcs/
       nicstart.html. Accessed January 24, 2008.
  CPDB. 2007. Berkeley  Carcinogenic Potency Database. http://potency.
       berkeley.edu/. Accessed January 24, 2008.
CPDB Summary Tables. 2007. Summary Table of Chemicals in the Carcino-
     genic Potency Database: Results for Positivity, Potency (TD50), and
     Target Sites,  http://potency.berkeley.edu/chemicalsummary.html.
     Accessed January 24, 2008.
Contrera, J. F., Matthews, E. J., Kruhlak, N. L., and Benz, R. D. 2005.
     In silico screening of chemicals for bacterial mutagenicity using
     electrotopological E-state indices and MDL QSAR software. Regul.
     Toxicol. Pharmacol. 43:313-323.
Dix, D. J., Houck, K. A., Martin, M. T., Richard, A. M., Setzer, W., and
     Kavlock, R. J. 2007. The ToxCast program for prioritizing toxicity
     testing of environmental chemicals. Toxicol. Sci. 95:5-12.
EPA Drinking Water. 2007. U.S. Environmental Protection Agency's Office
     of Ground Water and Drinking Water—Contaminants. http://www.
     epa.gov/ogwdw/contaminants/index.html. Accessed January 24,
     2008.
EPA  DSSTox.  2007.  U.S.  Environmental  Protection   Agency's  Dis-
     tributed Structure-Searchable Toxicity (DSSTox) Database Network.
     http://www.epa.gov/ncct/dsstox/index.html. Accessed January 24,
     2008.
EPA  EDSP.  2007.   U.S.  Environmental   Protection   Agency's  En-
     docrine   Disruption   Screening  Program,  http://www.epa.gov/
     scipoly/oscpendo/. Accessed January 24, 2008.
EPA EPI Suite. 2007. U.S. Environmental Protection Agency's Estimation
     Programs Interface Suite(tm) for Microsoft® Windows, XR http://
     www.epa.gov/opptintr/exposure/pubs/episuite.htm. Accessed Jan-
     uary 24, 2008.
EPA  HPV.  2007.  U.S.  Environmental  Protection  Agency's High  Pro-
     duction   Volume  Challenge   Program  and  Information  Sys-
     tem.  http://www.epa.gov/hpv/  and   http://www.epa.gov/hpvis/.
     Accessed January 24, 2008.
EPA IRIS. 2007. U.S. Environmental Protection Agency's Integrated Risk
     Information  System  Database, http://www.epa.gov/iris/. Accessed
     January 24, 2008.
EPA IUR. 2007. U.S. Environmental Protection Agency's Inventory Update
     Rule list, http://www.epa.gov/oppt/iur/. Accessed January 24, 2008.
EPA Pesticides. 2007. U.S. Environmental Protection Agency's Office
     of Pesticide  Programs,  http://www.epa.gov/pesticides/. Accessed
     January 24, 2008.
EPA  RBC.   2007.  U.S.   Environmental  Protection   Agency's  Hu-
     man  Health Risk  Assessment,  Risk-based  Concentration Ta-
     ble.   http://www.epa.gov/reg3hwmd/risk/human/index.htm.  Ac-
     cessed January 24, 2008.
EPA  Regulating  Pesticides.   2007.   U.S.   Environmental  Protection
     Agency's  Office of  Pesticide  Programs, Regulating  Pesticides.
     http://www.epa.gov/pesticides/regulating/. Accessed January 24,
     2008.
EPA  SARA.  2007.  U.S.  Environmental   Protection   Agency's  Su-
     perfund  Amendments   and   Reauthorization  Act,   110  Su-
     perfund  Site  Priority  Contaminant  List,  http://www.epa.gov/
     lawregs/laws/cercla/html. Accessed January 24, 2008.
EPA ToxCast. 2007. U.S. Environmental Protection Agency's National
     Center for Computational Toxicology ToxCast™ Program.
     http://www.epa.gov/ncct/toxcast/. Accessed January 24, 2008.
EPA  TSCA.  2007.   U.S.   Environmental  Protection  Agency's  Toxic
     Substances  Control  Act  Release Inventory,  http://www.epa.gov/
     opptintr/newchems/pubs/invntory.htm. Accessed January 24, 2008.
EPA Water Quality. 2007. U.S.  Environmental Protection Agency's Water
     Quality Criteria, Drinking  Water Standards and Health Advisories.
     http://www.epa.gov/waterscience/criteria/drinking/.  Accessed Jan-
     uary 24, 2008.
ESIS  HPV-LPV.  2007. European Substances Information System  Low
     Production  Volume  Chemicals,   http://ecb.jrc.it/esis/index.php?
     PGM=hpv. Accessed January 24, 2008.
Gold,  L. S., Manley,  N.  B.,  Slone, T.  H., and  Rohrbach, L. 1999.
     Supplement  to the Carcinogenic Potency Database (CPDB): Results
     of animal bioassays published in the general literature in 1993 to
     1994  and by the  National Toxicology Program in 1995 to 1996.
     Environ. Health Perspect. 107(Suppl. 4):527-600.
           Gold, L. S., Slone, T. H., Ames, B. N., Manley,  N. B.,  Garfinkel, G. B.,
                Rohrbach, L. 1997. Chapter 1:  carcinogenic potency database. In
                Handbook of Carcinogenic Potency and Genotoxicity Databases,
                eds.  Gold  L.  S., and  Zeiger  E., CRC  Press,  Boca  Raton,  FL,
                1-605.
           Gold, L. S., Slone, T. H., Williams, C. R., Burch, J. M., Stewart, T.
                 W., Swank, A. E., Beidler, J., and Richard, A. M. 2007. DSSTox
                 Carcinogenic Potency Database Summary Tables—All Species,
                 SDF Files and Documentation, CPDBAS_v5a_1527_25Oct2007.
                 http://www.epa.gov/ncct/dsstox/sdf_cpdbas.html. Accessed
                 January 24, 2008.
           Gombar, V.  K., Enslein,  K.,  and  Blake,  B.  W.  1995.  Assessment
                of developmental toxicity potential of chemicals by  quantitative
                structure-toxicity relationship  models.  Chemosphere  31:2499-
                2510.
           Houck, K., Dix, D., Judson, R., Martin, M., Wolf, M., Kavlock,
                 R., and Richard, A. M. 2007. DSSTox EPA ToxCast High
                 Throughput Screening Testing Chemicals Structure-Index File:
                 SDF File and Documentation: TOXCST_v2a_320_25Sep2007.
                 http://www.epa.gov/ncct/dsstox/sdf_toxcst.html. Accessed January
                 24, 2008.
           Judson, R., Richard, A., Dix, D., Elloumi, F., Martin, M., Cathey, T.,
                 Transue, T., Spencer, R., and Wolf, M. 2007. ACToR—Aggregated
                 Computational Toxicology Resource. Toxicol. Appl. Pharmacol.
                 Submitted.
           Julien, E.,  Willhite, C. C.,  Richard, A. M.,  and  DeSesso, J. M.  2004.
                Challenges in  constructing  statistically-based  SAR  models  for
                developmental toxicity. Birth Defects Res. Part A 70:902-911.
           Kazius, J.,  McGuire, R., and Bursi, R.  2005. Derivation and validation of
                toxicophores for mutagenicity prediction. J. Med. Chem. 48:312-
                320.
           Leadscope, Inc.  2007.  FDA  databases,   http://www.leadscope.com/
                fda.databases/. Accessed  January 24, 2008.
           Martin, M. T., Houck, K. A., McLaurin, K., Richard, A. M., and
                 Dix, D. J. 2007. Linking regulatory toxicological information on
                 environmental chemicals with high-throughput screening (HTS) and
                 genomic data. The Toxicologist CD J. Soc. Toxicol. 96:219-220.
           NCBI PubChem.  2007. National Institutes of  Health, National Library
                of Medicine, National Center for Biotechnology  Information, Pub-
                Chem Project, http://pubchem.ncbi.nlm.nih.gov/. Accessed January
                24, 2008.
           NJ   Hazardous  Substances.    2007.   New   Jersey  Right-to-Know
                Hazardous  Substances   Regulation,   http://web.doh.state.nj.us/
                rtkhsfs/indexfs.aspx. Accessed January 24,  2008.
           NLM  TOXNET.  2007.   National  Institutes  of  Health,  National  Li-
                brary of Medicine (NLM), Toxicology  Data Network (TOXNET).
                http://toxnet.nlm.nih.gov.  Accessed January 24, 2008.
           NTP On-line Database. 2007. National Institute of Health & Environmental
                 Sciences, National Toxicology Program (NTP) On-line Database.
                 http://ntp.niehs.nih.gov/ntpweb/index.cfm?objectid=72016020-
                 BDB7-CEBA-F3E5A7965617C1C1. Accessed January 24, 2008.
           PAN. 2007. Pesticide Action  Network,  http://www.pesticideinfo.org/
                lndex.org. Accessed January 24, 2008.
           PubMed MESH. 2007. National Institutes of Health, National Library of
                Medicine, National  Center for Biotechnology Information, Medical
                Subject  Headings  for  indexing  articles  for MEDLINE/PubMed.
                http://www.ncbi.nlm.nih.gov/sites/entrez?db=mesh. Accessed Jan-
                uary 24, 2008.
           Richard, A. M. 2004. DSSTox Website launch: Improving public access
                 to databases for building structure-toxicity prediction models.
                 Preclinica 2:103-108.
           Richard, A. M. 2006. Future of predictive toxicology: an expanded view of
                "chemical toxicity"—future of toxicology perspective.  Chem.  Res.
                Toxicol. 19:1257-1262.
           Richard, A. M.,  Gold,  L.  S.,  and Nicklaus,  M. C.  2006. Chemical
                structure indexing  of toxicity data  on the internet:  Moving  to-
                wards a flat world. Curr.  Opin.  Drug Discovery Develop. 9:314-
                325.
Richard,  A.  M., and Williams,  C.  R.  2002a.  Distributed  structure-
     searchable toxicity (DSSTox) database network: a proposal. Mutat.
     Res. 499:27-52.
Richard, A. M., and Williams, C. R. 2002b. Public sources of mutagenicity
     and carcinogenicity  data:  Use in structure-activity relationship
     models. In QSARs of Mutagens and Carcinogens, eds. Benigni R.,
     CRC Press, New York,  pp. 145-173.
Scorecard. 2007. Scorecard Pollution Information  Site, Health Effects.
     http://www.scorecard.org/health-effects/.  Accessed  January  24,
     2008.
Singh, A. V., Knudsen, K. B., and Knudsen, T. B. 2005. Computational
     systems analysis of  developmental toxicity:  design,  development
     and implementation of a Birth  Defects Systems Manager (BDSM).
     Reproductive Toxicol. 19:421-439.
Smith, C., Collins, B., Tice, R., Wolf, M. A., and Richard,
     A. M. 2007. DSSTox National Toxicology Program High
     Throughput Screening Structure-Index File: SDF File and
     Documentation: NTPHTS_v2a_1408_25Jul2007. http://www.epa.gov/
     ncct/dsstox/sdf_ntphts.html. Accessed January 24, 2008.
Swartz, C. D., Parks, N., Umbach, D. M., Ward, W. O., Schaaper, R. M.,
     and DeMarini, D. M. 2007. Enhanced mutagenesis of Salmonella
     tester strains due to deletion of genes other than uvrB. Environ.
     Mol. Mutagen. 48:694-705.
Tennant, R. W. 1993. Stratification of rodent carcinogenicity  bioassay
     results to reflect  relative human  hazard. Mutat. Res.  286:111-
     118.
ToxML. 2007. A publicly available Toxicity XML standard and controlled vo-
     cabulary for representing toxicity data.  Database schema available
       at Leadscope Internet site, http://www.leadscope.com/toxml.php.
       Accessed January 24, 2008.
  Transue, T., and Richard, A. M. 2007. U.S. Environmental Protection
       Agency DSSTox Structure-Browser v1.03. http://www.epa.gov/
       dsstox_structurebrowser/. Supporting documentation available
       on the Internet at: http://www.epa.gov/ncct/dsstox/
       StructureBrowserInfo.html. Accessed January 24, 2008.
  WHO IPCS. 2007. World Health Organization's International Programme
       on Chemical Safety, Classification of Pesticides by Hazard, http://
       www.who.int/ipcs/publications/pesticides_hazard/en/. Accessed
       January 24, 2008.
  Wolf, M. A., Martin, M., and Richard, A. M. 2007. DSSTox EPA
       High Production Volume Information System Data—Structure-
       Index Locator File: SDF File and Documentation, Launch
       File version: HPVISD_v1a_1006_25Oct2007. www.epa.gov/ncct/
       dsstox/sdf_hpvisd.html. Accessed January 24, 2008.
  Yang, C., Arvidson, K., Aveston, S., Benigni, R., Benz, R. D., Contrera, J.,
       Dierkes, A. N., Xing, Hasselgren-Arnby, C., Jaworska, J., Matthews,
       E., Kruhlak, N., Kemper, R., Rathman, J. F., Richard, A. M.
       2007. Understanding genetic toxicity from data mining: Process
       of building knowledge by integrating multiple genetic toxicity
       databases. Toxicol. Mech. Meth. Present volume.
  Yang, C., Benz, R. D., and Cheeseman, M. A. 2006a. Landscape of
       current toxicity databases and database standards. Curr. Opin. Drug
       Discovery Develop. 9:124-133.
  Yang, C., Richard, A. M., and Cross, K. P. 2006b. The art of data mining
       the minefields of toxicity databases to link chemistry to biology.
       Curr. Comput. Aided Drug Design 2(2):135-150.

-------
Risk Analysis, Vol. 29, No. 4, 2009
DOI: 10.1111/j.1539-6924.2008.01168.x
0272-4332/09/0100-0485$22.00/1 © 2008 Society for Risk Analysis

Commentary

Toxicity Testing in the 21st Century: Implications for Human
Health Risk Assessment

Robert J. Kavlock,1* Christopher P. Austin,2 and Raymond R. Tice3

1 National Center for Computational Toxicology, Office of Research
  and Development, U.S. Environmental Protection Agency,
  Research Triangle Park, NC, USA.
2 NIH Chemical Genomics Center, National Human Genome
  Research Institute, National Institutes of Health, MSC 3370,
  Bethesda, MD, USA.
3 Biomolecular Screening Branch, National Toxicology Program,
  National Institute of Environmental Health Sciences, Research
  Triangle Park, NC, USA.
* Address correspondence to Robert J. Kavlock, Director, National
  Center for Computational Toxicology, Office of Research
  and Development, U.S. Environmental Protection Agency,
  Research Triangle Park, NC 27711, USA; tel: 919-541-2326;
  fax: 919-541-1194; Kavlock.Robert@epamail.epa.gov.

The risk analysis perspective by Daniel Krewski and colleagues
lays out the long-term vision and strategic plan developed by a
National Research Council committee(1) sponsored by the U.S.
Environmental Protection Agency (EPA), with support from the
U.S. National Toxicology Program (NTP), to "advance the practices
of toxicity testing and human health assessment of environmental
agents." Components of the vision include chemical
characterization; the use of human-cell-based, high-throughput
assays that cover the diversity of toxicity pathways; targeted
testing using animals to fill in data gaps; dose-response and
extrapolation modeling; and the generation and use of
population-based and human exposure data for interpreting the
results of toxicity tests. The strategic plan recognizes that
meeting this vision will require a major research effort conducted
over a period of a decade or more to identify all of the important
toxicity pathways, and that a clear distinction must be made
between pathway perturbations that are truly adverse (i.e., would
likely lead to adverse health outcomes in humans) and those that
are not. Krewski et al. note that achieving this vision in a
reasonable timeframe (i.e., decades) would require the involvement
of an interdisciplinary research institute that would be
coordinated and funded primarily by the U.S. federal government
and that would foster appropriate intramural and extramural
research. It is expected that this approach would greatly increase
the number of compounds that can be tested, while providing data
more directly relevant for conducting human health risk assessment.
The NTP through its Roadmap,4 the National Institutes of Health
(NIH) Chemical Genomics Center (NCGC) through the Molecular
Libraries Initiative,5 and the EPA through its ToxCast™ program6
and its draft Strategic Plan for the Future of Toxicity Testing
have individually recognized the need to bring innovation into the
assessment of the toxicological activity of chemicals, and each
has made progress in doing so. However, the grand challenge put
forth by Krewski et al. requires an effort unparalleled in the
field of toxicology and risk assessment.
    In recognition of the importance of the NRC report and to
accelerate progress in this area, two NIH institutes and EPA have
entered into a formal collaboration known as Tox21 to identify
mechanisms of chemically induced biological activity, prioritize
chemicals for more extensive toxicological evaluation, and develop
more predictive models of in vivo biological response.(2)
Consistent with the vision outlined by Krewski et al., success in
achieving these goals is expected to result in methods for
toxicity testing that are more scientific and cost effective as
well as models for risk assessment that are more mechanistically
based. As a consequence, a reduction or replacement of animals in
regulatory testing is anticipated to occur in parallel with an
increased ability to evaluate the large numbers of chemicals that
currently lack adequate toxicological evaluation. Ultimately,
Tox21 is expected to deliver biological activity profiles that are
predictive of in vivo toxicities for the thousands of understudied
substances of concern to regulatory authorities in the United
States, as well as in many other countries.

4 Available at: http://ntp.niehs.nih.gov/go/vision.
5 Available at: http://www.ncgc.nih.gov/.
6 Available at: epa.gov/ncct/toxcast.

    The Tox21 collaboration is being coordinated through a
five-year Memorandum of Understanding (MoU),7 which leverages the
strengths of each organization. The MoU builds on the experimental
toxicology expertise at the NTP, headquartered at the NIH National
Institute of Environmental Health Sciences (NIEHS); the
high-throughput screening (HTS) technology of the NIH Chemical
Genomics Center (NCGC), managed by the National Human Genome
Research Institute (NHGRI); and the computational toxicology
capabilities of the EPA's National Center for Computational
Toxicology (NCCT). Each party brings complementary expertise to
bear on the application of novel methodologies to evaluate large
numbers of chemicals for their potential to interact with the
myriad of biological processes relevant to toxicity. A central
aspect of Tox21 is the unique capability of the NCGC's high-speed,
automated screening robots to simultaneously test thousands of
potentially toxic compounds in biochemical and cell-based HTS
assays, and an ability to target this resource toward
environmental health issues. As mentioned by Krewski et al., EPA's
ToxCast™ Program is an integral and critical component for
achieving the Tox21 goals laid out in the MoU.

7 Available at: http://ntp.niehs.nih.gov/go/28213.

    To support the goals of Tox21, four focus groups (Chemical
Selection, Biological Pathways/Assays, Informatics, and Targeted
Testing) have been established; these focus groups represent the
different components of the NRC vision described by Krewski et al.
The Chemical Selection group is coordinating the selection of
chemicals for the Tox21 compound library to test at the NCGC. A
chemical library of nearly 2,400 chemicals selected by NTP and the
EPA is already under study at the NCGC, and results from several
dozen HTS assays are already available. In the near term, this
library will be expanded to approximately 8,400 compounds, with an
additional ~1,400 compounds selected by the NTP, ~1,400 compounds
selected by the EPA, and ~2,800 clinically approved drugs selected
by the NCGC. Compound selection is currently based largely on the
compound having a defined chemical structure and known purity; on
the extent of its solubility and stability in dimethyl sulfoxide
(DMSO), the preferred solvent for HTS assays conducted at the
NCGC; and on the compound having low volatility. Implementing
quality control procedures for ensuring identity, purity, and
stability of all compounds in the library is an important
responsibility of this group. A subset of the Tox21 chemical
library will be included in Phase II of the ToxCast program, which
will examine a broader suite of assays in order to evaluate the
predictive power of bioactivity signatures derived in Phase I.
Phase II of ToxCast will be launched by the summer of 2009.
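
The selection criteria named in this paragraph can be read as a simple
screening filter. The sketch below is illustrative only: the `Candidate`
fields and the example compound names are hypothetical, and each criterion
is reduced to a yes/no flag because the commentary does not state
numerical cutoffs for purity, solubility, stability, or volatility.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one compound nominated to the library."""
    name: str
    defined_structure: bool
    known_purity: bool
    dmso_soluble: bool
    dmso_stable: bool
    low_volatility: bool


def failed_criteria(candidate):
    """Return the labels of the criteria this candidate does not meet."""
    checks = {
        "defined chemical structure": candidate.defined_structure,
        "known purity": candidate.known_purity,
        "soluble in DMSO": candidate.dmso_soluble,
        "stable in DMSO": candidate.dmso_stable,
        "low volatility": candidate.low_volatility,
    }
    return [label for label, passed in checks.items() if not passed]


def select(candidates):
    """Keep only candidates that meet every criterion."""
    return [c for c in candidates if not failed_criteria(c)]


if __name__ == "__main__":
    # Placeholder compounds, not real library members.
    nominated = [
        Candidate("compound-A", True, True, True, True, True),
        Candidate("compound-B", True, True, False, True, True),
    ]
    for c in nominated:
        print(c.name, failed_criteria(c) or "accepted")
```

Reporting which criterion failed, rather than only accepting or rejecting,
is a choice made here so that a rejected compound can be revisited if, for
example, a different solvent becomes available.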
    The Biological Pathways and Assays group is identifying
critical cellular toxicity pathways for interrogation using
biochemical- and cell-based high-throughput screens and
prioritizing HTS assays for use at the NCGC. Assays already
performed at the NCGC include those to assess (1) cytotoxicity and
activation of caspases in a number of human and rodent cell types,
(2) up-regulation of p53, (3) agonist/antagonist activity for a
number of nuclear receptors, and (4) differential cytotoxicity in
several cell lines associated with an inability to repair various
classes of DNA damage. Other assays under consideration include
those for a variety of physiologically important molecular
pathways (e.g., cellular stress responses) as well as methods for
integrating human and rodent hepatic metabolic activation into
reporter gene assays. Based on the results obtained, this group
will construct test batteries useful for identifying hazard for
humans and for prioritizing chemicals for further, more in-depth
evaluation.
    The Informatics group is developing databases to store all
Tox21-related data and evaluating the results obtained from
testing conducted at the NCGC and via ToxCast™ for predictive
toxicity patterns. To encourage independent evaluations and/or
analyses of the Tox21 test results, all data as well as the
comparative animal and human data, where available, will be made
publicly accessible via various databases, including EPA's
Aggregated Computational Toxicology Resource (ACToR), NIEHS'
Chemical Effects in Biological Systems (CEBS), and the National
Center for Biotechnology Information's PubChem.
         As  HTS data on compounds with inadequate
     testing for toxicity becomes  available via Tox21,
     there will be a  need to test selected compounds in
     more comprehensive assays. The Targeted Testing
     group is developing strategies and capabilities for this
     purpose using assays that involve higher order testing

-------
systems (e.g., roundworms (Caenorhabditis elegans),
zebrafish embryos, rodents).
    In addition to the testing activities, the MoU pro-
motes coordination  and sponsorship of workshops,
symposia, and seminars to educate the various stake-
holder groups, including regulatory scientists and the
public, with regard to  Tox21-related activities. Per-
sons interested in following the progress of Tox21
are invited to join the EPA's Chemical Prioritization
Community of Practice,8 which meets monthly via
teleconference.
    Given the scope of the challenge presented by
Krewski et al., success will require a long-term con-
certed effort by a  large number of investigators,
working in a coordinated manner. The Tox21 consor-
tium welcomes participation in our effort by individ-
ual scientists and by organizations. The implications
for success of this effort are considerable. If success-
ful, we will be able  to  address regulatory demands
such as those placed by the Food Quality Protection
Act for the endocrine screening program9  and the
new European Community Regulation on chemicals
and their safe use, known as REACH (Registration,
Evaluation, Authorization and Restriction of Chem-
ical Substances),10 identify key modes of action on a
scale  not imaginable even a few years ago, direct a
much more efficient and effective use of animals in
toxicity testing, identify potentially susceptible sub-
populations based on the presence  of polymorphisms

8 Available   at:  http://www.epa.gov/ncct/practice_community/
 category_priority.html.
9 Available at: http://www.epa.gov/opp00001/regulating/laws/fqpa/.
10 Available at: http://ec.europa.eu/environment/chemicals/reach/
  reach_intro.htm.
in toxicity pathways, screen the effects of mixtures,
and study emerging issues like the safety of nano-
materials. The acquisition of data from broad-scale
HTS programs also creates demands to integrate this
knowledge and understand the implications for sys-
tems biology,  and to have risk assessors trained and
conversant in the new technologies and their utilities.
While the ultimate goal of eliminating the use of an-
imals in toxicology testing might seem unattainable,
it is only by carefully evaluating the relevance and
reliability of strategies based on in vitro  test meth-
ods that the utility and limitations of such  an ap-
proach can  be determined  and decisions made on
how best to conduct toxicology testing in the future.
To do otherwise will result in increasing demands
being placed  on  systems never designed to  handle
the large numbers of chemicals in need  of evalua-
tion, and continued reliance on test methods based
on empirical observation rather than on mechanistic
understanding.


DISCLAIMER

    The research described in  this report has been
funded by one or more  of the participating federal
agencies. The  report does not necessarily reflect the
views of the respective organizations.


REFERENCES

1. National Research Council. Toxicity Testing in the 21st Cen-
  tury: A Vision and A Strategy. Washington, DC: National
  Academy Press, 2007.
2. Collins FS, Gray GM, Bucher JR. Transforming environmental
  health protection. Science, 2008: 319:906-907.

-------
                                             Toxicology and Applied Pharmacology 238 (2009) 80-89
                                                Contents lists available at ScienceDirect
                                   Toxicology and Applied Pharmacology
                                      journal homepage:  www.elsevier.com/locate/ytaap
Toxicogenomic  effects common to triazole antifungals  and conserved between  rats
and humans

Amber K. Goetz a,b, David J. Dix a,*
a National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, USA
b Department of Environmental and Molecular Toxicology, North Carolina State University, Raleigh, North Carolina 27695, USA
ARTICLE   INFO

Article history:
Received 16 January 2009
Revised 13 April 2009
Accepted 22 April 2009
Available online 3 May 2009

Keywords:
Myclobutanil
Propiconazole
Triadimefon
Toxicogenomics
CAR
PXR
                                          ABSTRACT
The triazole antifungals myclobutanil, propiconazole and triadimefon cause varying degrees of hepatic
toxicity and disrupt steroid hormone homeostasis in rodent in vivo models. To identify biological pathways
consistently modulated across multiple timepoints and various study designs, gene expression profiling was
conducted on rat livers from three separate studies with triazole treatment groups ranging from 6 h after a
single oral gavage exposure, to prenatal to adult exposures via  feed. To explore conservation of responses
across species, gene expression data from the rat liver studies were compared to in vitro data from rat and human
primary hepatocytes exposed to the triazoles. Toxicogenomic data on triazoles from 33 different treatment
groups and 135 samples  (microarrays)  identified  thousands of probe sets  and dozens  of pathways
differentially expressed across time, dose, and species — many of these were common to all three triazoles, or
conserved between rodents and humans. Common and conserved pathways included androgen and estrogen
metabolism, xenobiotic  metabolism  signaling through CAR and PXR,  and CYP  mediated metabolism.
Differentially expressed genes included the Phase I xenobiotic, fatty acid, sterol and steroid metabolism genes
Cyp2b2 and CYP2B6, Cyp3a1 and CYP3A4, and Cyp4a22 and CYP4A11; Phase II conjugation enzyme genes
Ugt1a1 and UGT1A1; and Phase III ABC transporter genes Abcb1 and ABCB1. Gene expression changes caused
by all three triazoles in liver and hepatocytes were  concentrated in biological  pathways regulating  lipid,
sterol and steroid homeostasis, identifying a potential common mode of action conserved between rodents
and humans. Modulation of hepatic sterol and steroid metabolism is a plausible mode of action for changes in
serum testosterone and  adverse reproductive outcomes observed in rat studies, and may be relevant to
human risk assessment.
                                                                     Published by Elsevier Inc.
Introduction

   Myclobutanil, propiconazole and triadimefon are agrichemical
fungicides with a 1,2,4-N-substituted triazole moiety that binds the
heme portion of fungal cyp51, inhibiting fungal lanosterol-14α-demethylase
activity, blocking ergosterol biosynthesis and thus
controlling several species and strains of fungi (Ghannoum and Rice,
1999; Vanden Bossche et al., 1990). All three of the triazoles  in this
study are used in the control of brown patch, dollar spot, rust, and
several other fungal  and plant diseases to  protect turf, fruit, and
vegetable and seed commodities.
   Triazole fungicides exhibit a range of toxicological properties  in
mammalian species. Effects reported from triazole exposures in rodents
 * Disclaimer: The United States Environmental Protection Agency through its Office
of Research and Development funded and managed the research described here. It has
been subjected to Agency administrative review and approved for publication.
  * Corresponding author. National Center for Computational Toxicology, Mail Drop
D343-03, U.S. Environmental Protection Agency, Research Triangle Park, NC 27711, USA.
Fax: +1 919 5411194.
   E-mail address: dix.david@epa.gov (D.J. Dix).

0041-008X/$ - see front matter. Published by Elsevier Inc.
doi:10.1016/j.taap.2009.04.016
                           include male infertility for myclobutanil and triadimefon in rats, liver
                           tumorigenicity for propiconazole and triadimefon in mice, thyroid
                           tumors for triadimefon in rats, and some measure of reproductive and/
                           or hepatic toxicity for all three of these triazoles (Goetz et al., 2007; U.S.
                           EPA, 1995, 2005a, 2005b, 2006). Studies examining developmental and
                           reproductive effects of myclobutanil, propiconazole, and triadimefon
                           exposures beginning in gestation and continuing to adulthood in rats
                           have demonstrated that all  three triazoles caused increased serum
                           testosterone levels (Goetz et al., 2007). Several studies in mice and rats
                           on the hepatic and thyroid toxicity of triazole fungicides have identified
                           modes of action, in some cases conserved across rodent species, and
                           possibly relevant to the assessment of risk to human health (Allen et al.,
                           2006; Wolf et al., 2006).
                             Genomic data from these and other studies have linked triazole
                           specific toxicological endpoints to demonstrate the  ability of high-
                           content biology to  delineate potential pathways of toxicity (Goetz
                           et al., 2006; Hester et al., 2006; Hester and Nesnow, 2008; Martin
                           et al., 2007; Tully et al., 2006; Ward et al., 2006). To date, in vivo gene
                           expression profiles have demonstrated that triazoles appear to modu-
                           late CAR and PXR, and subsequently perturb hepatic lipid, sterol, and
                           steroid and xenobiotic metabolism pathways. The  concordance of

-------
                                    A.K. Goetz, DJ. Dix / Toxicology and Applied Pharmacology 238 (2009) 80-89
these in vivo observations and gene expression findings demonstrated
the ability  of genomics to identify potential modes of action and
toxicity pathways.
   In the present study, genomic data from a series of in vivo and in
vitro studies on three triazoles (myclobutanil, propiconazole and tria-
dimefon) were analyzed to test the hypothesis that there are con-
served hepatic biological  pathways active in rat liver  and human
hepatocytes that are commonly perturbed  by  this chemical class.
While changes in individual genes did vary between the in vivo and in
vitro studies,  comparison  of  homologous genes across  species and
mapping these to biological pathways facilitated meaningful evalua-
tions of functional significance. These differentially expressed genes
and  affected pathways were  then  related to the toxicological end-
points from these same in vivo and in vitro studies, creating a frame-
work for understanding the toxicity pathways common  to triazoles
and conserved in rats and  humans.

Materials and methods

Prenatal to  adult rat study.   Details  of the animal husbandry and
study design for the prenatal  to adult rat study have been previously
published (Goetz et al., 2007). For this and all other studies, animal
care, handling, and treatment at EPA or by its contractors was conducted
in American Association for Accreditation of Laboratory Animal
Care International accredited facilities, and all procedures were
approved by an Institutional Animal Care and Use Committee. For the
prenatal to adult rat study conducted at EPA, pregnant Wistar Han rats
were exposed to triazoles in  feed from gestation day 6 (GD6) until
weaning of the pups. Offspring were housed with their respective
dams until  weaning on postnatal day  23 (PND23). Males were then
removed from the dams and  exposed until PND92. Feed was mixed
with triazoles by Bayer CropScience (Kansas City, MO)  as  part of a
Materials Cooperative Research and  Development  Agreement
between the U.S. EPA and the U.S. Triazole Task Force. Control animals
were fed 5002 Certified Rodent Diet with acetone vehicle added. The
six treatment groups carried forward to genomic analysis were myclo-
butanil (MYC) at 500 or 2000 ppm (equivalent to 32.9 and 133.9 mg/
kg/day); propiconazole (PPZ) at 500 or 2500 ppm (equivalent to 31.9
and  169.7 mg/kg/day); or triadimefon (TDF) at 500 or 1800 ppm
(equivalent to 33.1 and 139.1  mg/kg/day). One male from each litter
was euthanized by asphyxiation with carbon dioxide and necropsied at
PND92 and tissues collected for transcriptional profiling.

Repeated dose oral gavage  studies.   In collaboration with EPA, Gene
Logic Laboratories  Inc.  (Gaithersburg, MD) conducted  one  of the
toxicogenomic studies presented herein. Male Sprague-Dawley rats
10 weeks of age were administered vehicle control (15% Alkamuls EL-
620  in deionized  water), myclobutanil (300 mg/kg/day) or tria-
dimefon (175 mg/kg/day) by oral gavage.  Control and treated groups
were necropsied and tissues were collected at 6  h and 24 h after the
first dose, and 24 h after the 14th dose (336 h); a total of six treatment
groups. Histopathology and gene expression profiling in the liver, and
serum hormone levels, were conducted on samples from this study.
   A second  oral  gavage  study was  conducted for EPA by Iconix
Pharmaceuticals Inc. (Mountain View, CA) under contract EP-D-05-
006. This study has been previously described in  Martin et al. (2007).
However, the genomic data presented in Martin et al. (2007) were
derived from the CodeLink array platform. The genomic  data in the
present study were derived from Affymetrix microarrays, using RNA
from the same rat liver samples assessed in Martin et al. (2007). Male
10 week old Sprague-Dawley rats were obtained from Charles River
Laboratories (Hollister, CA) and acclimated for 5-9 days prior  to
dosing. Animals  received vehicle  control (15% Alkamuls  EL-620),
myclobutanil (300 mg/kg/day), propiconazole (300 mg/kg/day),  or
triadimefon (175 mg/kg/day) every 24 h by oral gavage at a volume of
10 ml/kg for 72 h total; a total of three treatment groups. Animals were
euthanized by asphyxiation with carbon dioxide and necropsied with-
out fasting 24 h after final dose to assess select endpoints including
serum testosterone  levels and transcriptional profiling  analysis  of
the liver.

Liver RNA isolation.   Total  RNA was extracted  from individual rat
liver samples using TRI Reagent (Molecular Research Center  Inc.,
Cincinnati, OH) according to the manufacturer's protocol. For quality
control, RNA A260/A280 ratios were assessed via NanoDrop Fluorospectrometer
(NanoDrop Technologies, Inc., Wilmington, DE). RNA samples with
absorbance ratios in the range 1.8-2.1 were treated with DNase, purified
by Total RNA Cleanup (Qiagen RNeasy), and then checked for
RNA quality using a model 2100 Bioanalyzer (Agilent Technologies
Inc., Palo Alto, CA). Samples with a ratio of 28S:18S rRNA > 1.6 were
accepted for subsequent use in DNA microarrays. RNA was stored at
-80 °C until labeling for microarray hybridization or PCR analysis. For
samples from the Gene Logic study, RNA extraction was done by Gene
Logic according to that company's standard operating procedures
(Gene Logic, Gaithersburg, MD).
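
As a rough illustration of the acceptance criteria described above, the following Python sketch filters samples by absorbance ratio (1.8-2.1) and 28S:18S rRNA ratio (> 1.6). The sample identifiers, values, and function names are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class RnaSample:
    sample_id: str        # hypothetical identifier
    a260_a280: float      # A260/A280 absorbance ratio
    rrna_28s_18s: float   # 28S:18S rRNA ratio


def passes_rna_qc(sample, ratio_range=(1.8, 2.1), min_rrna_ratio=1.6):
    """Return True if the sample meets both thresholds stated in the text."""
    low, high = ratio_range
    purity_ok = low <= sample.a260_a280 <= high
    integrity_ok = sample.rrna_28s_18s > min_rrna_ratio
    return purity_ok and integrity_ok


# Invented example values, for illustration only.
samples = [
    RnaSample("liver_A", 1.95, 1.8),
    RnaSample("liver_B", 1.70, 1.9),   # fails the absorbance range
    RnaSample("liver_C", 2.00, 1.4),   # fails the rRNA ratio
]
accepted = [s.sample_id for s in samples if passes_rna_qc(s)]
print(accepted)
```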

Rat and human  primary hepatocytes.   Rat and  human hepatocytes
were cultured, and  RNA from these cells was prepared  under EPA
contract  4D-6182-NTSA with CellzDirect Inc. (Durham, NC). Rat
primary hepatocytes were derived from PND60 male Sprague-Dawley
rats. For each treatment group, two 60 mm plates at a density  of
3.5 x 10^6 hepatocytes/plate were generated, allowing for duplicate
measures of each treatment. Hepatocytes were cultured for 24-36 h
prior to exposure to test chemicals. Primary cultures of human
hepatocytes were prepared from human liver tissue from three anony-
mous donors. Tissue samples were derived from normal remnants of
resected liver tissue, which had been removed due to the presence of
metastatic tumors or from non-transplantable donor organs. Primary
hepatocytes were exposed to either vehicle control (DMSO 0.1%);
positive controls: phenobarbital (rat, 200 µM; human, 1000 µM),
pregnenolone 16-alpha carbonitrile (PCN, 30 µM in rat), rifampicin
(Rif, 30 µM in human); or myclobutanil, propiconazole, or triadimefon
(10, 30, or 100 µM); a total of nine triazole treatment groups from each
species. The doses used were estimated from an initial dose range-finding
study to represent a maximum tolerated dose without
significant cell lysis. Rat and human hepatocytes were isolated  by a
collagenase perfusion method  described by LeCluyse et al.  (1996,
2000). Final cell viability prior to  plating  was  determined by the
Trypan Blue exclusion  test  and was  >75% in all cases. Following
isolation, hepatocytes were resuspended in DMEM containing 5% fetal
calf serum, insulin and dexamethasone (1 µM) and added to 60 mm
NUNC Permanox® dishes (~4 x 10^6/dish) coated with a simple
collagen, type I, substratum and allowed to attach for 3-6 h at 37 °C
in a humidified chamber with 95%/5% air/CO2. After attachment,
dishes were swirled and medium containing debris and unattached
cells was aspirated. Fresh ice-cold serum-free culture medium
containing 50 nM dexamethasone, 6.25 µg/ml insulin, 6.25 µg/ml
transferrin, 6.25 ng/ml selenium (ITS+) and 0.25 mg/ml Matrigel®
was added to culture vessels, which were immediately returned to the
humidified chamber. Medium was changed on a daily basis. Cultures of
hepatocytes  were maintained  for 36-48  h  prior to initiating expe-
riments with chemicals, vehicle, and positive controls. Hepatocytes
were dosed for 3 consecutive days, with media refreshed every 24 h. Twenty-four
hours following the final treatment period, media was removed
from hepatocytes for cytotoxicity  analysis through determination  of
lactate dehydrogenase leakage from the cells (see below). Cells were
washed with Hanks' Balanced Salt Solution and lysed with RLT Buffer
(Qiagen) and frozen for subsequent analysis.

Hepatocyte cytotoxicity.    Cytotoxicity  was assessed  by  measuring
levels  of lactate dehydrogenase  (LDH)  using  the  CytoTox-ONE™
Homogenous Membrane  Integrity Assay (Promega  Corp., San  Luis

-------
Obispo, CA). LDH released into the culture medium was measured
with a 10 min coupled enzymatic assay that results in the conversion
of resazurin into resorufin. One hundred microliters of supernatant
was mixed with 100 µl of CytoTox-ONE reagent (lactate, NAD+, and
resazurin as substrates in the presence of diaphorase). The mixture
was incubated at room temperature for 10 min, followed by the
addition of 50 µl/well of stop solution. Plates were read using the
FLUOstar  Optima  (BMG LABTECH, Offenburg,  Germany)  and
fluorescence  was  measured  with an  excitation  wavelength  of
560 nm and an emission wavelength of 590 nm. Generation of the
fluorescent resorufin product is proportional to the  amount of LDH.
The average fluorescence values from sample, maximum LDH release
(total lysis) and culture  medium blank were used  to calculate the
percent cytotoxicity for a given treatment group.
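
The text does not give the exact formula used to combine the sample, maximum LDH release, and blank readings. A minimal sketch, assuming the common blank-corrected ratio (sample minus blank, divided by maximum release minus blank), might look like the following; the function name and the fluorescence values are invented for illustration.

```python
from statistics import mean


def percent_cytotoxicity(sample_rfu, max_release_rfu, blank_rfu):
    """Blank-corrected LDH release as a percentage of total lysis.

    Assumed formula: 100 * (sample - blank) / (max release - blank).
    """
    window = max_release_rfu - blank_rfu
    if window <= 0:
        raise ValueError("maximum release must exceed the blank")
    return 100.0 * (sample_rfu - blank_rfu) / window


# Hypothetical replicate readings (relative fluorescence units).
sample = mean([1400.0, 1500.0, 1600.0])    # treated wells
max_release = mean([20000.0, 22000.0])     # total-lysis wells
blank = mean([1000.0, 1000.0])             # culture medium only

pct = percent_cytotoxicity(sample, max_release, blank)
print(f"{pct:.1f}% cytotoxicity")
```

With these invented readings the average sample signal is 500 RFU above the blank and the assay window is 20,000 RFU, so the result is 2.5%.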

Hepatocyte  RNA isolation.   RNA was isolated  from lysed  hepato-
cytes by initial homogenization with a Turrax Ultra T8 homogenizer.
Samples were individually loaded onto Qiagen AllPrep columns for
isolation of both genomic DNA and RNA from each treatment group.
This method included two steps to remove genomic DNA contami-
nation. The first step was passage of lysates through a gDNA affinity
column that binds strongly to gDNA while allowing RNA to pass. The
second step involved on-column DNase digestion of gDNA,
followed by washing steps to remove the
DNase from the columns. RNA was eluted in RNase-free water,
analyzed by NanoDrop spectrophotometry, and frozen at -70 °C. RNA
was confirmed free of gDNA by lack of PCR amplicon. RNA was highly
pure (A260/A280 > 1.7) and not degraded (28S/18S rRNA ratio > 1.5).

Microarray hybridization and scanning.   The entire  microarray pro-
cessing, except for the Gene Logic liver samples, was conducted for
EPA by Expression Analysis Inc. (Durham, NC) under contract 68-D-
04-002. Five micrograms of purified total  RNA from 3-6 biological
replicates per  treatment  group  was hybridized to Affymetrix Rat
Genome 230 2.0 or  Human Genome U133 Plus 2.0 GeneChip® arrays
according to the Affymetrix GeneChip Expression Analysis Technical
Manual (www.affymetrix.com). In brief, after purified RNA passed
quantity and quality assessment (A260/A280 ratio with range 1.8 to 2.1
acceptable), double stranded cDNA was synthesized  from RNA using
reverse transcriptase and an oligo-dT primer. The cDNA served as the
template  in an in vitro transcription (IVT) reaction that produced
amplified amounts  of biotin-labeled antisense mRNA. Biotinylated
RNA (labeled cRNA) served as the microarray target. The cRNA was
fragmented using heat and magnesium (Mg2+), reducing the cRNA to
25-200  base  fragments to facilitate  efficient and reproducible
hybridization. cRNA was combined with the hybridization  cocktail,
containing  3 nM B2 control oligo, 10 mg/ml Herring Sperm DNA,
50 mg/ml BSA, 100% DMSO, and 2x hybridization buffer (NaCl 5 M,
MES hydrate Sigma Ultra, MES Sodium Salt, EDTA Disodium Salt 0.5 M,
10% Tween-20). GeneChips were hybridized at 45 °C for 16 h. After
hybridization  the chip was washed and  stained with  fluorescent
streptavidin-phycoerythrin,  binding to  biotin for detection. Signal
amplification using anti-Streptavidin antibody and biotinylated goat
IgG antibody was used to bind biotin and provide an amplified fluor
that emits light when the chip is scanned with the GeneChip® Scanner
3000. Gene Logic also  used the Affymetrix Rat Genome  230 2.0
GeneChip for gene expression profiling, according to that company's
standard operating  procedures (Gene Logic, Gaithersburg, MD) and
the standard Affymetrix protocols.

Microarray data analysis.   To  minimize  non-biological factors  such
as the total amount of target hybridized to each array,  the signal values
for each hybridization were multiplied by a  scaling factor to achieve a
mean intensity equal to 500. The converted .cel files were loaded into
the JMP Genomics program (v6.0.02; SAS Inc., Cary, NC), log2 transformed,
normalized using interquartile normalization, and analyzed
for significant changes in transcript levels through row-by-row
modeling using one-way analysis of variance (ANOVA). For initial
exploratory analysis, principal component analysis (PCA) was applied
using JMP Genomics. Comparisons were made between controls and
each treatment group with statistical cut-offs applied at a p-value
adjusted  false discovery rate  (FDR) of 5-10%, and an absolute
difference of |1.2| or greater. Probe sets representing transcribed loci,
unknown genes, and image clones were removed from the final list of
each analysis. The Affymetrix .cel files can be accessed through the
Gene Expression Omnibus (GEO; www.ncbi.nlm.nih.gov/geo); series
accession numbers GSE10408, GSE10409, GSE10410 and GSE10411.
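
The scaling and filtering steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a plain arithmetic mean is used for scaling (the Affymetrix software may use a trimmed mean), the |1.2| cut-off is read as a 1.2-fold change in either direction, and all function names and signal values are hypothetical. The ANOVA and interquartile normalization performed in JMP Genomics are not reproduced here.

```python
import math
from statistics import mean

TARGET_MEAN = 500.0  # target mean intensity stated in the text


def scale_to_target(signals, target=TARGET_MEAN):
    """Multiply all signals on one array by a single scaling factor."""
    factor = target / mean(signals)
    return [s * factor for s in signals], factor


def log2_transform(signals):
    return [math.log2(s) for s in signals]


def passes_fold_change(log2_treated, log2_control, min_fold=1.2):
    """True if the mean difference is at least min_fold up or down."""
    diff = mean(log2_treated) - mean(log2_control)
    return abs(diff) >= math.log2(min_fold)


# Invented raw signals for one array.
scaled, factor = scale_to_target([100.0, 300.0, 200.0])

# Invented scaled signals for one probe set, control vs. treated.
control = log2_transform([500.0, 500.0])
treated_up = log2_transform([650.0, 650.0])    # 1.3-fold increase
treated_flat = log2_transform([550.0, 550.0])  # 1.1-fold increase
```

The scaling factor computed here is the same quantity that a quality control step can inspect: an array needing a very large factor to reach the target mean had a weak overall signal.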

Pathway analysis.   Ingenuity Pathways Analysis (IPA;  Ingenuity®
Systems, www.ingenuity.com) was used  for pathway level analyses.
Canonical pathway analysis  identified the pathways from the IPA
library that were most significant to the dataset. For each dataset, all
the probe sets on the microarrays were uploaded into  Ingenuity™
(rat: 31,099;  human:  54,000 probe  sets). Focus genes from each
dataset were defined as those with a fold change of |1.2| or greater and
the adjusted p-value for an FDR of 5 or 10%. Focus genes were overlaid
onto a molecular network defined  within the  Ingenuity Pathway
Knowledge  Base (IPKB). Genes that met the fold change and p-value
cut-off, and associated with  a canonical  pathway in the IPKB were
considered in the analysis. The significance of the association between
the dataset and the canonical pathway was measured using the ratio
of the genes from the dataset that mapped to the pathways divided by
the total  number  of genes that mapped to the  canonical pathway.
Significance was calculated using the right-tailed Fisher's Exact Test by
comparing the number of focus genes that participated in a  given
pathway, relative to the total number of occurrences of these genes in
all pathway annotations  in the  IPKB. Using this methodology, over-
represented pathways containing more focus genes than expected by
chance were identified.
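
To make the over-representation test concrete, here is a small sketch of a right-tailed Fisher's Exact Test written with the Python standard library, alongside the pathway ratio described above. It is not the IPA implementation; the function names and gene counts are assumptions chosen for illustration.

```python
from math import comb


def pathway_ratio(focus_in_pathway, pathway_size):
    """Fraction of the pathway's genes that are focus genes."""
    return focus_in_pathway / pathway_size


def right_tailed_fisher(focus_in_pathway, focus_total,
                        pathway_size, reference_total):
    """P(X >= focus_in_pathway) under the hypergeometric distribution.

    reference_total: genes in the reference set
    pathway_size:    reference genes annotated to the pathway
    focus_total:     focus genes drawn from the reference set
    """
    outside = reference_total - pathway_size
    upper = min(pathway_size, focus_total)
    tail = sum(
        comb(pathway_size, i) * comb(outside, focus_total - i)
        for i in range(focus_in_pathway, upper + 1)
    )
    return tail / comb(reference_total, focus_total)


# Hypothetical counts: 20 reference genes, 5 in the pathway,
# 4 focus genes, 2 of which fall in the pathway.
p_value = right_tailed_fisher(2, 4, 5, 20)
ratio = pathway_ratio(2, 5)
print(f"ratio = {ratio:.2f}, p = {p_value:.4f}")
```

A small p-value indicates that the pathway contains more focus genes than would be expected by chance; the ratio gives a complementary sense of how much of the pathway is affected.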

Quantitative PCR.   To further compare in vivo and in vitro liver models, a
subset of homologous genes were identified from the microarray datasets,
and these were selected for confirmatory testing. TaqMan®-based quantitative
reverse transcription polymerase chain reaction (qRT-PCR) was
used to determine the relative levels of Abcb1, Cyp1a1, Cyp2b1/2, Cyp3a1,
Cyp3a2, Cyp4a1 and Ugt1a1 mRNAs in rat liver and primary hepatocyte
samples; and ABCB1, CYP1A1, CYP1A2, CYP2B6, CYP3A4, CYP4A11,
UGT1A1 and SLCO1B1 mRNAs in human hepatocyte samples. Primer/
probe sets specific for each enzyme were utilized from Applied Biosystems
(Foster City, CA) for Abcb1 (Rn00561753_m1), Cyp1a1 (Custom assay, Log
# 1045-28, Lot#001), Cyp2b1/2 (Custom assay, Lot#001), Cyp3a1
(Rn01640761_g1), Cyp3a2 (Rn00756461_m1), Cyp4a1
(Rn00598510_m1), Ugt1a1 (Rn00754947_m1), ABCB1
(Hs00184500_m1), CYP1A1 (Hs00153120_m1), CYP1A2
(Hs01070369_m1), CYP2B6 (Hs00167937_g1), CYP3A4
(Hs00430021_m1), CYP4A11 (Hs00167961_m1), UGT1A1
(Hs02511055_s1) and SLCO1B1 (Hs00272374_m1). The exception to
this was for Cyp2b1 and Cyp2b2, for which the primer/probe set could
detect either gene transcript; these results are therefore hereafter
referred to as Cyp2b1/2. A two-step RT-PCR process was performed by
initial reverse transcription of approximately 200 ng of total RNA in a
60 µl reaction using the High Capacity cDNA Archive Kit (Applied Biosystems,
Foster City, CA), followed by quantitative PCR amplification with
isoform-specific primer/probe sets on 2 µl of each reverse transcribed
cDNA. Reactions were characterized by the PCR cycle threshold (CT)
automatically determined by the PE Applied Biosystems ABI 7900HT
Sequencer software. CT values were within the linear phase (log scale) of
exponential growth for all targets. CT values were determined for target
genes and an endogenous control gene (β-actin); each sample was
normalized to both β-actin control and vehicle control. A difference of
one CT was considered equivalent to a two-fold difference in gene
expression (exponential relationship, i.e. RQ = 2^(-ΔΔCT)). Sample means

-------
for each replicate were determined along with the standard error of the
mean, if appropriate, and percent of adjusted positive control. Relative
fold changes in mRNA content were analyzed using the Kruskal-Wallis
nonparametric ANOVA with Dunn's multiple comparisons post test;
measures with p < 0.05 were considered significant.
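
The double normalization described above (first to the β-actin control, then to the vehicle control) corresponds to the standard comparative CT calculation, in which one cycle equals a two-fold difference. A minimal sketch follows; the function name and the CT values are hypothetical and are not data from the study.

```python
def relative_quantity(ct_target_treated, ct_reference_treated,
                      ct_target_vehicle, ct_reference_vehicle):
    """Comparative CT method: RQ = 2 ** (-ddCT)."""
    # Step 1: normalize each sample to its endogenous control gene.
    dct_treated = ct_target_treated - ct_reference_treated
    dct_vehicle = ct_target_vehicle - ct_reference_vehicle
    # Step 2: normalize the treated sample to the vehicle control.
    ddct = dct_treated - dct_vehicle
    return 2.0 ** (-ddct)


# Invented CT values: the target crosses threshold 3 cycles earlier
# in the treated sample, with the reference gene unchanged.
rq = relative_quantity(
    ct_target_treated=22.0, ct_reference_treated=18.0,
    ct_target_vehicle=25.0, ct_reference_vehicle=18.0,
)
print(rq)
```

Here the treated sample has a ΔCT of 4 and the vehicle a ΔCT of 7, so ΔΔCT is -3 and the relative quantity is 2^3 = 8-fold.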

Results

Hepatocyte cytotoxicity

   Overall, there  was only  significant  cell lysis  for one triazole
treatment group, and for the majority of rat and human samples the
percent cytotoxicity was  1% or less (Fig. 1). There was an increase in
cytotoxicity for one rat sample following 100 µM propiconazole
treatment, but not the other two rat samples in that treatment group.
An increase in cytotoxicity was observed for the human 100 µM
propiconazole treatment group, to approximately 27%.

Serum testosterone levels in vivo

   In the Gene Logic repeated dose  oral gavage study, myclobutanil
significantly increased  serum testosterone levels 24 h after a single
exposure of rats by oral gavage (Fig. 2). Elevated testosterone levels for
triadimefon or for myclobutanil at other timepoints did not achieve
statistical significance.

GeneChip quality control analysis

   Affymetrix microarrays with a scaling factor greater  than 15.0,
indicative of poor data  quality, were  removed from the  analysis
in order to reduce technical variation due to assay noise.  Within
the prenatal to adult exposure  study, three GeneChips from the
liver dataset were removed prior  to  normalization and statistical
analysis based on scaling factor. After this removal, 3-7 GeneChips
(i.e.  biological replicates)  were  still  available from  each  treat-
ment group. In the  four additional datasets analyzed,  no  micro-
arrays  had a scaling  factor of  15.0  or greater, and all  arrays  were
analyzed.

Probe set level analysis

   Statistical significance was determined by a combination of a p-
value threshold adjusted for a false discovery rate (FDR) of 5-10%, and
an absolute difference of |1.2| or greater. The p-value for an FDR of 5%






Fig. 1. (A) Rat hepatocyte cytotoxicity: LDH leakage as a function of treatment. (B) Human hepatocyte cytotoxicity: LDH leakage as a function of treatment. Legend entries: Rat 304, Rat 293, Rat 256. RFU: relative fluorescence units. *p<0.01.

-------
Fig. 2. Serum testosterone levels from the Gene Logic repeated dose study using adult male Sprague-Dawley rats. Triazoles were administered by oral gavage; dose is in mg/kg/day. Plot legend: Myclobutanil 225 mg/kg/day, Triadimefon 175 mg/kg/day. X-axis: Exposure Period (6 hr, 24 hr, 336 hr). *p<0.05
for the Gene Logic 6 h rat liver samples was p<0.00029, for 24 h
p<0.00252, and for 336  h p< 0.00162; for the Iconix 72 h rat liver
samples with an FDR of 5%: p< 0.00036; and for the prenatal to adult
PND92 rat liver samples with an  FDR of 10%: p< 0.00072. For  the
primary hepatocyte  samples  the p-value for an FDR of 5% was
p< 0.00004 for rat,  and p< 0.00025 for human. Probe sets that
interrogated unknown genes or transcribed loci were removed from
the probe set list. The number of probe sets meeting p-value and fold
change criteria across the 33 treatment groups ranged from zero to
1439, and are presented in Supplemental Data Table 1.
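
The dataset-specific p-value cut-offs reported above arise because an FDR target translates into a different raw p-value threshold for each set of tests. The text does not name the FDR procedure used in JMP Genomics; the sketch below assumes the widely used Benjamini-Hochberg step-up method, and the p-values are invented for illustration.

```python
def bh_p_value_cutoff(p_values, fdr=0.05):
    """Largest p-value declared significant under Benjamini-Hochberg.

    Returns 0.0 when no test passes. Every p-value at or below the
    returned cutoff is considered significant at the given FDR.
    """
    m = len(p_values)
    cutoff = 0.0
    for rank, p in enumerate(sorted(p_values), start=1):
        # Step-up rule: keep the largest p with p <= fdr * rank / m.
        if p <= fdr * rank / m:
            cutoff = p
    return cutoff


# Ten hypothetical probe-set p-values.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042,
          0.060, 0.074, 0.205, 0.212, 0.216]
cutoff = bh_p_value_cutoff(p_vals, fdr=0.05)
significant = [p for p in p_vals if p <= cutoff]
print(cutoff, significant)
```

Only the two smallest p-values pass in this toy example, which shows how a 5% FDR can correspond to a raw cut-off well below 0.05.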
Gene level analysis

   Gene level analysis  focused on differentially expressed genes
identified by microarray, whether or not they mapped to significantly
affected biological pathways. The goal was to identify genes common
to triazole toxicity, consistent across multiple studies and timepoints,
and conserved between rat and human. Table 1 presents rat genes
from the microarray analysis common to two  or more triazoles, and
consistent across multiple studies  and timepoints. Only statistically
significant gene expression changes are presented in Table 1 (p-value
Table 1
Common and consistent triazole gene expression changes in rat liver and hepatocytes detected by microarray.

Accession    Gene      Myclobutanil  Myclobutanil
number       symbol    6 h           24 h
AY082609     Abcb1     1.5           1.7
AF286167     Abcb1     1.5           1.6
AF257746     Abcb1     1.7           1.6
AF072816     Abcc3                   1.4
NM_053754    Abcg5                   1.4
NM_024484    Alas1     1.9           1.9
NM_022407    Aldh1a1                 2.7
M23995       Aldh1a4                 3.7
NM_133586    Ces2      1.3           2.4
L46791       Ces3
X00469       Cyp1a1
AI454613     Cyp2b2    5.4           6.0
NM_013105    Cyp3a1                  1.6
AI639276     Cyp3a3    1.3
U46118       Cyp3a13
D38381       Cyp3a18                 1.3
NM_016999    Cyp4a10   -1.4
M33936       Cyp4a14
AA893326     Cyp4a14
NM_020540    Gstm4                   1.4
NM_031576    Por       1.6           1.7
NM_019283    Slc3a2                  1.2
NM_131906    Slco1a4   1.4           1.8
U95011       Slco1a4   1.5           2.1
NM_017136    Sqle                    1.4
M13506       Udpgtr2                 2.1
J02612       Ugt1a1                  1.4
AF461738     Ugt1a1                  1.5
NM_022228    Ugt2a1                  1.3
AA945082     Yc2                     1.7
AF228917     Zdhhc2                  1.3

Myclobutanil
72 h 336 h P92
mid



1.6 1.8


1.9 2.9
5.2
2.2 2.0


4.2 6.2 2.7
1.8






1.6


1.6
1.6
2.0
3.2
1.4
1.5

3.2 2.6
1.5 1.2
Propiconazole
P92 RPH 72 h P92 P92
high 100 mid high
1.5


1.4


2.0 2.2 2.9
4.7
1.7 2.8 2.3
-2.8
3.0
3.3 3.2 2.8 4.7





-2.2
-1.8
1.4
2.1


2.0 2.1

2.0 2.4
1.8

1.4 1.4
2.5 1.9
1.3 1.3
Triadimefon
RPH RPH 6 h
30 100
1.2
1.3
1.4


1.7



-2.0 -3.2

5.2




-3.6 -1.4
-3.8 -1.6
-3.0

1.6

1.2
1.3







24 h
1.6
1.7
1.7

1.6
2.2
2.3
4.2
2.1


5.9
1.6
1.4
2.3
1.4

1.5

1.3
1.9

1.8
2.0
1.8
1.8
1.5
1.5
1.4


72 h 336 h
1.4
1.4
1.5
2.2
1.8

1.9 3.4
3.5
2.4 2.8


2.8 5.7
1.7

1.6




2.0
2.4
1.3
1.7
1.9

2.8
1.6
1.6
1.3
2.6 2.9
1.2 1.3
P92 P92
mid high

1.4
1.4
1.8

2.1
4.8

3.4

5.6
2.8 7.6







1.4
2.2
1.4
2.2
2.0 2.0
1.3 1.7
3.0
1.8


3.1
1.4
RPH RPH RPH
10 30 100









-2.0 -2.8





















Note. Average fold change is derived from all biological replicates per treatment group. All results are statistically significant, common to two or more triazoles, and consistent across
multiple timepoints or studies. RPH (rat primary hepatocytes).
-------
threshold adjusted for an FDR of 5-10%, and an absolute difference of
|1.2| or greater). The common and consistent genes from Table 1 are
readily identifiable as components of a xenobiotic metabolism
pathway modulated by the triazoles, and also potentially significant to
triazole metabolism. These rat genes encode the transporters, and
Phase I and II metabolic enzymes, for hepatocyte uptake,
oxidation-reduction, conjugation and excretion into the canalicular
space of the liver (Fig. 3A). Other genes identified in Table 1 include
additional Phase I enzymes, mostly P450s (Cyp1a1, Cyp2b2, Cyp3a13,
Cyp3a18, Cyp4a10, Cyp4a14) but also P450 oxidoreductase (Por), aldehyde
dehydrogenase (Aldh1a4) and carboxylesterase (Ces2); and the Phase
II enzyme Yc2. The increased expression of Ces2 seen in triazole-exposed
livers was not observed in rat primary hepatocytes; in fact,
another carboxylesterase isoform (Ces3) was significantly decreased
in rat hepatocytes by all three triazoles (Table 1; Supplemental Data
Table 7). Extended lists of all the gene expression changes common to
two or more triazoles for each individual rat liver study or timepoint
are presented in Supplemental Data Tables 2-6, and similar results
for individual genes from rat primary hepatocytes are in Supplemental
Data Table 7. There were many differences between the rat in vivo and
in vitro model systems at the individual gene level. One of many
examples was the decrease in transcript levels for the glucocorticoid
receptor (Nr3c1) caused by myclobutanil and triadimefon after
24 hour exposure in vivo, but not in vitro.
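
The selection rule described above (a p-value below the FDR-adjusted threshold, an absolute fold change of at least 1.2, and a change shared by two or more triazoles) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the data layout, the function names `significant_genes` and `common_genes`, and every p-value in the example are hypothetical; only the fold changes for Cyp2b2 and Ces2 are taken from the 24 h columns of Table 1.

```python
from typing import Dict, Set, Tuple

# Per triazole: gene symbol -> (signed fold change, p-value).
Results = Dict[str, Dict[str, Tuple[float, float]]]


def significant_genes(genes: Dict[str, Tuple[float, float]],
                      p_cutoff: float,
                      min_abs_fold: float = 1.2) -> Set[str]:
    """Genes passing both the p-value and the absolute fold change criteria."""
    return {gene for gene, (fold, p) in genes.items()
            if p < p_cutoff and abs(fold) >= min_abs_fold}


def common_genes(results: Results,
                 p_cutoff: float,
                 min_triazoles: int = 2,
                 min_abs_fold: float = 1.2) -> Set[str]:
    """Genes significant for at least `min_triazoles` of the chemicals."""
    counts: Dict[str, int] = {}
    for genes in results.values():
        for gene in significant_genes(genes, p_cutoff, min_abs_fold):
            counts[gene] = counts.get(gene, 0) + 1
    return {gene for gene, n in counts.items() if n >= min_triazoles}


# Illustrative input; the p-values are invented for the example.
example = {
    "myclobutanil": {"Cyp2b2": (6.0, 1e-5), "Ces2": (2.4, 1e-4), "Nr3c1": (-1.1, 1e-4)},
    "triadimefon": {"Cyp2b2": (5.9, 2e-5), "Ces2": (2.1, 1e-2)},
}
shared = common_genes(example, p_cutoff=0.00252)
```

In this toy input only Cyp2b2 passes for both chemicals: Ces2 fails the p-value cutoff for triadimefon, and Nr3c1 fails the fold change criterion.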
   Based on the microarray data alone, changes in gene expression
common to multiple triazoles and consistent across rat studies were
not conserved in human hepatocytes. To identify conserved
triazole effects on the expression of human gene homologs for the rat
Phase I and Phase II metabolic genes presented in Fig. 3A, results from
qPCR have to be considered. There were, however, numerous genes
commonly affected by multiple triazoles in either the rat or human
primary hepatocytes individually, and these are presented in
Supplemental Data Tables 7 and 8, respectively.
   Only propiconazole and triadimefon had significant effects on gene
expression in human hepatocytes, and commonalities between the
two triazoles were limited; myclobutanil had no overall significant
effect on gene expression. Genes involved in endogenous and
xenobiotic metabolism (alcohol dehydrogenases ADH1A and ADH1B), and
the regulation of cell cycle and cell fate (IGF binding protein subunit
IGFALS, MAS-related G protein receptor MRGPRF) were all
downregulated following exposure to the higher concentrations of 30 or
100 µM of propiconazole or triadimefon.
[Fig. 3 diagram. Panel A (RAT): Triazole to Triazole-OH to Triazole-CONJ, with Slc3a2, Slco1a4, Cyp3a3, Aldh1a1, Gstm4, Udpgtr2, Ugt1a1, Ugt2a1 and Abcg5 labeled. Panel B (HUMAN): Triazole to Triazole-OH to Triazole-CONJ, with SLCO1B, CYP1A2, CYP3A4, UGT1A1 and ABCB1 labeled.]

Fig. 3. Proposed CAR/PXR regulated xenobiotic metabolism pathway modulated by triazole antifungals. Indicated genes encode transporters and Phase I and II metabolic enzymes
for hepatocyte uptake of triazoles, and subsequent oxidation-reduction, conjugation and excretion into the canalicular space of the liver. (A) Rat solute carrier family (Slco1a4, Slc3a2)
uptake; oxidation-reduction by P450s (Cyp1a1, Cyp3a1, Cyp3a3) or aldehyde dehydrogenase (Aldh1a1); glutathione S-transferase (Gstm4) or UDP-glucuronosyltransferase
(Udpgtr2, Ugt1a1, Ugt2a1) conjugation; ATP-binding cassette transporter (Abcb1, Abcc3, Abcg5) excretion. (B) Human solute carrier family transporter (SLCO1B) uptake;
oxidation-reduction by P450s (CYP1A1, CYP1A2, CYP3A4); UDP-glucuronosyltransferase (UGT1A1) conjugation; ATP-binding cassette transporter (ABCB1) excretion. Genes regulated by CAR,
PXR, or both are indicated by symbols in the figure.

-------
Table 2
Comparison of microarray and quantitative PCR assessment of triazole induced changes in gene expression.

Rat liver (dose in mg/kg/day)
                        Abcb1            Cyp1a1           Cyp2b1/2         Cyp3a1           Cyp3a2 b  Cyp4a1           Ugt1a1
Treatment        Dose   Array   qPCR     Array   qPCR     Array a qPCR     Array   qPCR     qPCR      Array c qPCR     Array   qPCR
72 h rat liver
  Myclobutanil   300    1.44    4.53     1.71    15.07    4.17    170.21   1.10    9.88     -1.19     -1.16   -3.41    1.22    1.30
  Propiconazole  300    1.16    1.25     1.96    34.93    3.16    37.53    1.07    7.72     -1.10     -1.35   -3.68    1.06    -1.54
  Triadimefon    175    1.21    -1.08    1.35    5.64     2.80    26.63    1.16    4.16     -1.47     -1.33   -4.59    1.16    -1.20
GD6-PND92 rat liver
  Myclobutanil   134    1.24    -2.37    1.76    5.82     3.29    64.57    -1.12   2.05     3.22      -1.73   -1.91    1.50    1.44
  Propiconazole  170    1.48    1.64     3.02    21.31    4.67    132.12   1.08    2.04     4.01      -1.59   -1.53    1.82    1.32
  Triadimefon    139    1.41    -4.99    5.60    79.99    7.63    63.13    1.98    18.20    1.57      -1.30   -2.70    1.78    7.51

Rat primary hepatocytes (concentration in µM)
                        Abcb1            Cyp1a1           Cyp2b1/2         Cyp3a1           Cyp3a2 b  Cyp4a1            Ugt1a1
Treatment        Conc.  Array   qPCR     Array   qPCR     Array a qPCR     Array   qPCR     qPCR      Array c qPCR      Array   qPCR
  PB             200    1.05    1.37     1.15    1.28     2.39    23.42    1.80    4.30     3.51      -1.56   -4.65     1.17    -1.85
  PCN            30     1.03    1.50     1.09    -2.29    2.03    -2.09    1.61    9.68     3.05      -1.36   -27.63    1.14    -3.54
  Myclobutanil   30     1.08    1.52     1.08    1.18     3.22    14.66    1.80    14.34    5.98      -1.66   -6.80     1.08    -1.65
  Myclobutanil   100    1.23    1.30     1.20    1.50     1.54    2.83     2.03    20.83    8.00      -2.65   -32.38    1.16    -1.82
  Propiconazole  30     1.08    1.29     1.34    -1.11    3.08    10.93    2.93    22.93    7.69      -1.81   -14.03    1.65    -1.80
  Propiconazole  100    1.45    1.46     1.09    1.48     -1.33   -1.93    1.63    9.29     4.56      -3.60   -287.36   -1.06   -3.59
  Triadimefon    30     1.18    1.81     1.06    1.10     1.59    3.94     1.84    20.32    4.65      -1.75   -11.24    1.06    -2.84
  Triadimefon    100    1.25    2.83     1.10    1.33     -1.02   1.66     2.02    27.61    6.36      -2.24   -53.65    1.16    -2.51

Human primary hepatocytes (concentration in µM)
                        ABCB1            CYP1A1           CYP1A2           CYP2B6           CYP3A4           CYP4A11          UGT1A1           SLCO1B1
Treatment        Conc.  Array   qPCR     Array   qPCR     Array   qPCR     Array   qPCR     Array   qPCR     Array   qPCR     Array   qPCR     Array d qPCR
  PB             1000   1.97    4.97     1.92    6.13     1.06    1.82     2.60    26.36    2.50    16.05    -1.90   -2.86    1.33    7.93     1.01    2.32
  RIF            30     1.13    2.82     2.70    11.52    1.16    2.12     2.11    27.12    2.46    35.73    -1.70   -2.85    1.35    9.61     1.02    2.09
  Myclobutanil   30     1.19    2.87     1.30    3.50     1.26    4.20     1.87    25.22    1.60    8.36     -1.20   1.11     1.04    8.94     -1.02   2.70
  Myclobutanil   100    1.54    2.59     1.94    12.63    1.56    4.43     1.75    13.31    1.62    6.68     -1.65   -2.66    1.02    5.69     -1.02   1.37
  Propiconazole  30     1.56    4.28     2.27    18.68    1.52    5.14     1.83    14.08    1.64    6.58     -1.72   -1.58    1.05    9.74     1.01    2.24
  Propiconazole  100    1.49    1.72     4.96    64.28    1.83    3.77     2.36    8.33     -1.55   1.19     -1.35   -2.34    1.08    2.36     -1.01   -1.24
  Triadimefon    30     1.68    4.18     1.30    3.93     -1.35   1.48     1.52    9.67     1.84    10.46    -1.84   -2.07    -1.03   8.12     -1.01   1.50
  Triadimefon    100    2.13    2.58     1.57    5.83     1.03    2.44     1.59    10.54    1.99    7.66     -2.23   -4.49    1.27    5.79     1.02    1.18

Note. Significant fold change relative to control in bold. Reference chemicals: PB (phenobarbital) for CAR; PCN (pregnane 16α carbonitrile) and RIF (rifampicin) for PXR.
  a Probe set representing Cyp2b2 on Affymetrix microarray.
  b No representative probe set on Affymetrix microarray.
  c Probe set representing Cyp4a10 on Affymetrix microarray.
  d Probe set representing SLCO1A2 on Affymetrix microarray.

-------
Table 3
Triazole induced changes in gene expression mapped to biological pathways from both rat in vivo, and rat and human in vitro studies.
Pathways
Androgen-estrogen metabolism
Arachidonic acid metabolism
Arginine-proline metabolism
Ascorbate-aldarate metabolism
Bile acid biosynthesis
Butanoate metabolism
Fatty acid metabolism
Galactose metabolism
Glutamate metabolism
Glutathione metabolism
Glycerolipid metabolism
Glycine-serine-threonine metabolism
Glycolysis-gluconeogenesis
Histidine metabolism
Linoleic acid metabolism
Lysine degradation
Metabolism xenobiotics CYP450s
Nitrogen metabolism
Pentose-glucuronate metabolism
Propanoate metabolism
Pyruvate metabolism
Retinol metabolism
Starch-sucrose metabolism
Sterol biosynthesis
Tryptophan metabolism
Xenobiotic metabolism signaling
Liver Liver
6 hour 225 24 hour
mg/kg 225 mg/kg
MYC TDF MYC
X
XXX

X
X
X X
XXX
X
X X
X X
X

X
X
X
XXX
X

X
X
X
X
X
X
X X
XXX
TDF
X
X


X
X
X
X

X
X
X
X

X
X
X
X

X
X

X
X
X
X
Liver
72 hour
300 mg/kg/ day
MYC

X

X
X
X
X

X
X
X

X
X
X
X
X




X

X
X
X
PPZ TDF


X
X
X X
X X
X X



X

X
X

X X
X


X
X



X X

Liver Liver GD6-PND92 Liver GD6-PND92
336 hour mid-dose high-dose
225 mg/kg/
day
MYC
X
X

X
X

X

X
X
X

X
X
X

X

X

X
X
X

X
X
TDF MYC
X
X X




X X
X

X
X
X
X
X
X X

X X
X


X

X X

X X
X X
PPZ TDF MYC
X
XXX

X


XXX



X
X


X

XXX
X
X


X
X

XXX
X
PPZ

X
X
X


X



X



X

X



X

X

X

TDF
X
X



X
X


X


X
X
X

X


X
X


X

X
RPH
30 µM
PPZ TDF


X
X
X
X
X
X


X

X X


X




X

X

X
X
RPH
100 µM
MYC PPZ
X
X X


X
X X
X X
X




X

X
X X
X



X X

X X
X
X

HPH HPH
30 µM 100 µM
TDF PPZ TDF PPZ TDF
X



X X X X
X
X
X XXX


X X X X

X X X X

X

X X X X





X X X X
X

X
Note. Only pathways consistently affected across three or more exposure periods are indicated. Complete lists of all pathways for each triazole are in Supplemental Data Tables 10A-C.
-------
   Genes identified by microarray that were differentially expressed
in response to only individual triazoles, but were consistent across
multiple timepoints or  studies, are tabulated in Supplemental Data
Tables 9A-C.

Quantitative PCR

   To compare in vivo and in vitro liver effects, and to confirm and
expand on the microarray results, seven rat and eight human
homologous genes were analyzed by qPCR (Table 2). PCR analysis
was restricted to the 72 h timepoint of the shorter term repeated dose
gavage studies since all three triazoles had this timepoint in common.
In general, qPCR indicated more robust results than the microarrays;
in several cases the qPCR results demonstrated greater magnitude of
change relative to control, or gained statistical significance where
array results were equivocal. This was the case for rat Abcb1, Cyp1a1,
Cyp2b1/2, Cyp3a1, and Cyp4a1, and for human ABCB1, CYP1A1, CYP1A2,
CYP2B6, CYP3A4 and UGT1A1; the combination of microarray and
qPCR data confirmed and magnified the effects of triazoles on these
rat and human genes. It is worth noting that for a number of genes and
in vitro chemical treatment groups, there appeared to be a greater or
more statistically significant effect at 30 µM than at the very high top
concentration of 100 µM.
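
Tables 1 and 2 report decreases as negative fold changes (for example -1.08). The text does not state how these signed values were computed; a common convention, assumed here, keeps a treated/control ratio of 1 or more as is and maps a ratio below 1 to its negative reciprocal. The function name `signed_fold_change` is hypothetical.

```python
def signed_fold_change(ratio: float) -> float:
    """Convert a treated/control expression ratio to a signed fold change.

    Assumed convention: ratio >= 1 is returned unchanged (increase);
    ratio < 1 becomes -1/ratio (decrease), so 0.5 maps to -2.0.
    """
    if ratio <= 0:
        raise ValueError("expression ratio must be positive")
    return ratio if ratio >= 1.0 else -1.0 / ratio


def passes_fold_filter(ratio: float, min_abs_fold: float = 1.2) -> bool:
    """Apply an absolute fold change criterion to either direction of change."""
    return abs(signed_fold_change(ratio)) >= min_abs_fold
```

Under this convention no value falls strictly between -1 and 1, which matches the tables, and the magnitudes of increases and decreases are directly comparable.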
   With the addition of qPCR data, it was clear that the Cyp1a1
induction common to triazoles in rat liver was not consistent in rat
hepatocytes. However, qPCR did confirm that induction of human
CYP1A1 and CYP1A2 was conserved in human hepatocytes. The
induction of rat Cyp2b1/2 and Cyp3a1, and human CYP2B6 and
CYP3A4, indicated by microarrays as common to the triazoles, was
now clearly a consistent, conserved and robust response in both
species. Consistency across rat in vivo and in vitro results was
especially strong for the inductions of Cyp2b1/2 and Cyp3a1, and the
suppression of Cyp4a1 in both model systems. The common
induction of Ugt1a1 by all three triazoles in the rat liver following
gestational to adult exposure was detected by a mix of qPCR and
array results, but little to no effect was observed after 72 h exposure
in vivo, or in rat hepatocytes. However, the induction of UGT1A1 in
human hepatocytes by at least two, if not all three, of the triazoles
was fairly robust. Response of the rat and human hepatocytes to the
reference chemicals phenobarbital (PB), pregnane 16α carbonitrile
(PCN), and rifampicin (RIF) in regard to expression of ABCB1,
CYP1A1, Cyp2b1/2, CYP2B6, Cyp3a1, CYP3A4, and UGT1A1 was
consistent with CAR and PXR activation in the two species. A similar
pattern of expression was observed for all three triazoles, in both
species.
   The overall response across rat and human hepatic
systems supported a common, consistent and conserved response
involving a xenobiotic metabolism pathway not only modulated by
the triazoles, but also metabolizing the triazoles in both rat liver and
hepatocytes (Fig. 3A), and human hepatocytes (Fig. 3B). Human genes
encoding the transporters SLCO1B and ABCB1, the Phase I enzymes
CYP1A1, CYP1A2 and CYP3A4, and the Phase II enzyme UGT1A1 all fit
into this pathway. In contrast to the rat models, there was no
suppression of CYP4A11 in human hepatocytes, indicating some
possible differences in how human PPARα and fatty acid metabolism
respond to triazoles.

Pathway level analysis

   A pathway based approach comparing gene expression across
triazoles, time, dose, in vivo and in vitro models, and species placed
effects on individual gene expression into meaningful biological and
toxicological context. Pathways were considered commonly affected
when all three triazoles caused significant effects at a given timepoint.
Consistent pathways were those affected across multiple timepoints,
or between rat liver and rat hepatocytes for a triazole. Conserved
pathways were those affected in both rat and human hepatocytes, by
the same or all of the triazoles. Table 3 shows the 26 biological
pathways significantly affected by triazoles consistently in three or
more exposure periods. Several consistently perturbed metabolic
pathways were common to all three triazoles and also conserved
between rat and human models. These included androgen-estrogen
metabolism, bile acid biosynthesis, P450 xenobiotic metabolism, and
xenobiotic metabolism signaling. Three energy pathways affected in a
consistent, common and conserved manner were glycerolipid
metabolism, glycolysis-gluconeogenesis, and starch-sucrose metabolism.
Two pathways consistent and common in both rat models, but not
conserved in humans, were arachidonic acid metabolism and fatty
acid metabolism. Nine of the 33 rat and human treatment groups had
no affected pathways, including all six of the lowest concentration
(10 µM) treatments in vitro (rat and human). Lists of all the affected
pathways from the 33 different treatment groups analyzed are
presented in Supplemental Data Tables 10A-C.
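
The three definitions above (common, consistent, conserved) can be written as set operations over the list of significantly affected pathways. The sketch below is one possible reading, not the authors' implementation: the `Hit` record, the model labels, and the example data are hypothetical, and "multiple timepoints" is interpreted here as two or more rat liver exposure periods for the same triazole.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple, Set


class Hit(NamedTuple):
    """One significantly affected pathway in one treatment group."""
    pathway: str
    triazole: str  # "MYC", "PPZ" or "TDF"
    model: str     # "rat_liver", "rat_hepatocyte" or "human_hepatocyte"
    group: str     # timepoint or concentration label


TRIAZOLES = frozenset({"MYC", "PPZ", "TDF"})


def common_pathways(hits: Iterable[Hit]) -> Set[str]:
    """Affected by all three triazoles within the same model and group."""
    seen = defaultdict(set)
    for h in hits:
        seen[(h.pathway, h.model, h.group)].add(h.triazole)
    return {key[0] for key, tz in seen.items() if tz >= TRIAZOLES}


def consistent_pathways(hits: Iterable[Hit]) -> Set[str]:
    """For one triazole: two or more liver timepoints, or liver plus rat hepatocytes."""
    liver_groups = defaultdict(set)
    rat_models = defaultdict(set)
    for h in hits:
        key = (h.pathway, h.triazole)
        if h.model == "rat_liver":
            liver_groups[key].add(h.group)
        if h.model in ("rat_liver", "rat_hepatocyte"):
            rat_models[key].add(h.model)
    multi_time = {p for (p, _), g in liver_groups.items() if len(g) >= 2}
    both_models = {p for (p, _), m in rat_models.items() if len(m) == 2}
    return multi_time | both_models


def conserved_pathways(hits: Iterable[Hit]) -> Set[str]:
    """For one triazole: affected in both rat and human hepatocytes."""
    models = defaultdict(set)
    for h in hits:
        models[(h.pathway, h.triazole)].add(h.model)
    needed = {"rat_hepatocyte", "human_hepatocyte"}
    return {p for (p, _), m in models.items() if needed <= m}


# Invented example records, for illustration only (not study data).
example_hits = [
    Hit("Bile acid biosynthesis", "MYC", "rat_liver", "72h"),
    Hit("Bile acid biosynthesis", "PPZ", "rat_liver", "72h"),
    Hit("Bile acid biosynthesis", "TDF", "rat_liver", "72h"),
    Hit("Bile acid biosynthesis", "TDF", "rat_liver", "24h"),
    Hit("Bile acid biosynthesis", "TDF", "rat_hepatocyte", "100uM"),
    Hit("Bile acid biosynthesis", "TDF", "human_hepatocyte", "100uM"),
    Hit("Fatty acid metabolism", "MYC", "rat_liver", "72h"),
]
```

A pathway that appears in all three result sets corresponds to the "common, consistent and conserved" category used in the text; a stricter threshold, such as the three or more exposure periods used for Table 3, would only change the `len(g) >= 2` test.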

Discussion

   The goal of this study was to compare rat in vivo and in vitro
models with human in vitro models, and to identify common, consistent,
and conserved gene expression changes that better characterized the
modes of action for triazole toxicities. The potential of short term in
vivo or in vitro assays for predicting longer-term effects was also
explored in this comparative toxicogenomic analysis. Measuring
changes in gene expression across different exposure periods and
doses, triazole chemicals, and biological models revealed differential
gene expression patterns common to the triazole antifungals and
conserved across the species. These differentially expressed genes
mapped to the fatty acid catabolism, lipid homeostasis, and steroid
and xenobiotic metabolism pathways, representing common biological
processes affected by triazole treatments in rat liver, and rat and
human primary hepatocytes.
   The consistent modulation of target pathways over several in vivo
studies and timepoints indicated that triazoles were affecting fatty
acid catabolism, lipid homeostasis and xenobiotic metabolism in the
rat liver, similar to what has been reported previously in Tully et al.
(2006) and Martin et al. (2007). Several CAR and PXR regulated genes
were differentially expressed by all three triazoles. Short-term
exposure studies showed an increase in Cyp2B and Cyp3A genes,
supporting the hypothesis that all three triazoles caused an early
adaptive response through the CAR and PXR receptors. However,
continuing increased expression of Cyp2B and 3A after longer-term
exposures indicated a maladaptive, toxic response to triazoles; a
viable mechanism of action for the hepatomegaly and
hepatocyte hypertrophy observed with all three triazoles. Increased expression of
specific rat genes in the steroid metabolism pathway following
triazole exposures included Cyp2b2, Cyp3a13, and Cyp3a18. Cyp2b2
and Cyp3a18 are involved in the 16β- and 6α-hydroxylation of
testosterone, while Cyp3a13 is involved in the metabolism of
progesterone. These changes in gene expression are likely in response to,
and closely related to, the effects on steroid homeostasis previously
reported in these studies (Goetz et al., 2007; Martin et al., 2007).
Similar increases in CYP2B6 and CYP3A4 were observed in human
hepatocytes exposed to triazoles.
   A few differences stood out between the rat in vivo and in vitro
model systems. The transcript levels for the glucocorticoid receptor
(Nr3c1) were decreased by myclobutanil and triadimefon in vivo, but
not in vitro. Nr3c1, in addition to CAR, is induced by androgen and
xenobiotics and upregulates expression of Cyp2b genes (Honkakoski
and Negishi, 2000). Down regulation of Nr3c1 by the two triazoles that
are clearly reproductive toxicants may be significant to understanding
more fully the hormone-dependent effects from these exposures.
   A select number of SLC and ABC transporter genes were
consistently modulated by triazole exposure. Triazole exposure increased
transcript levels of Slco1a4 and Abcb1 in vivo. Slco1a4 is an organic
anion transporter, transporting negatively charged substrates including
bile acids and estrogen conjugates, whereas Abcb1 is a multidrug
transporter P-glycoprotein which is active during liver regeneration
and hepatocarcinogenesis (Meng et al., 2002; Ortiz et al., 2004).
Additional transporters involved in steroid metabolism, cholesterol
absorption, amino acid and bile acid transport were also upregulated
by triazoles. The transcript levels of Slco1a4, Abcb1, and Abcc3,
mediated by CAR and PXR (Guo et al., 2002; Jigorel et al., 2005, 2006;
Staudinger et al., 2003), were upregulated in the rat liver and
hepatocytes. A similar increase in expression of ABCB1 was observed
in human hepatocytes exposed to triazoles. It is likely that this
induction of transporter genes increased the potential for
uptake of triazoles into hepatocytes, and subsequently for
excretion of hydrophilic metabolites of triazoles. The Phase III
transporter gene expression profiles suggested increased fatty acid,
bile acid, and triazole metabolite transport in the rat liver and primary
hepatocytes.
   This study demonstrated the value of gene expression profiling in
confirming a mode of action and underlying mechanisms of triazole
induced toxicity. Pathway-based and gene-level analyses were able to
define specific patterns of gene expression, delineating common
mechanisms in triazole toxicity. Comparison between liver and rat
primary hepatocytes showed a targeted effect on genes involved in early
fatty acid catabolism, and to a lesser degree on Phase III transporters and
sterol metabolism. Several consistently perturbed metabolic pathways
were common to all three triazoles and conserved between rat and
human models. These conserved pathways included androgen-estrogen
metabolism, bile acid biosynthesis, P450 xenobiotic metabolism, and
xenobiotic metabolism signaling. The subsets of genes involved in fatty
acid catabolism/lipid homeostasis, Phase III transporters, and metabolism
of testosterone are robust candidates for biomarkers defining
potential mechanisms of action for disruption of testosterone
homeostasis. Furthermore, these gene expression changes indicate shifts from
lipogenesis to fatty acid oxidation for increased energy production, and
increased expression of Phase III transporters for the uptake and
excretion of triazoles in the hepatocytes; both are mechanisms of triazole
toxicity that were common across the triazoles. Overall, the results
from these genomic analyses provide new leads for understanding the
modes of action responsible for triazole toxicities. These analyses
revealed functional categories of chemical response genes that indicate
mechanisms and provide direction for further research on triazole
mechanisms of action.

Acknowledgments

   The authors thank Drs Wenjun Bao and Russ Wolfinger (SAS  Inc.,
Cary, NC) for expert advice on data analysis;  and Drs Hongzu  Ren
(EPA)   and  Stephen  Ferguson (CellzDirect  Inc., Durham, NC)  for
excellent technical support. We also thank Dr Douglas  Wolf (EPA/
ORD) for technical review of this manuscript; and Ms. Jennifer Hill
for excellent  management of the  EPA  contracts  with  Expression
Analysis   Inc.  (Durham,  NC), and CellzDirect.  Microarrays  and
reagents  for a portion  of this study were provided by Affymetrix
Inc. (Santa Clara, CA) as part of a Materials  Cooperative Research and
Development Agreement with  EPA. AKG  was supported by EPA/
North  Carolina State  University  Cooperative Training  Agreement
#CT826512010.

Appendix A. Supplementary data

   Supplementary data associated with this article can be found, in
the online version, at doi:10.1016/j.taap.2009.04.016.
References

Allen, J.W., Wolf, D.C., George, M.H., Hester, S.D., Sun, G., Thai, S.-F., Delker, D.A., Moore,
    T., Jones, C., Nelson, G., Roop, B.C., Leavitt, S., Winkfield, E., Ward, W.O., Nesnow, S.,
    2006. Toxicity profiles in mice treated with hepatotumorigenic and
    non-hepatotumorigenic triazole conazole fungicides: propiconazole, triadimefon, and
    myclobutanil. Toxicol. Pathol. 34, 853-862.
Ghannoum, M.A., Rice, L.B., 1999. Antifungal agents: mode of action, mechanisms of
    resistance, and correlation of these mechanisms with bacterial resistance. Clin.
    Microbiol. Rev. 12, 501-517.
Goetz, A.K., Bao, W., Ren, H., Schmid, J.E., Tully, D.B., Wood, C.R., Rockett, J.C., Narotsky,
    M.G., Sun, G., Lambert, G.R., Thai, S.-F., Wolf, D.C., Nesnow, S., Dix, D.J., 2006. Gene
    expression profiling in the liver of CD-1 mice to characterize the hepatotoxicity of
    triazole fungicides. Toxicol. Appl. Pharmacol. 215, 274-284.
Goetz, A.K., Ren, H., Schmid, J.E., Blystone, C.R., Thillainadarajah, I., Best, D.S., Nichols,
    H.P., Strader, L.F., Wolf, D.C., Narotsky, M.G., Rockett, J.C., Dix, D.J., 2007. Disruption
    of testosterone homeostasis as a mode of action for the reproductive toxicity of
    triazole fungicides in the male rat. Toxicol. Sci. 95, 227-239.
Guo, G.L., Staudinger, J., Ogura, K., Klaassen, C.D., 2002. Induction of rat organic anion
    transporting polypeptide 2 by pregnenolone-16α-carbonitrile is via interaction
    with pregnane x receptor. Mol. Pharmacol. 61, 832-839.
Hester, S.D., Wolf, D.C., Nesnow, S., Thai, S.-F., 2006. Transcriptional profiles in liver from
    rats treated with tumorigenic and non-tumorigenic triazole conazole fungicides:
    propiconazole, triadimefon, and myclobutanil. Toxicol. Pathol. 34, 879-894.
Hester, S.D., Nesnow, S., 2008. Transcriptional responses in thyroid tissues from rats
    treated with a tumorigenic and a non-tumorigenic triazole conazole fungicide.
    Toxicol. Appl. Pharmacol. 227, 357-369.
Honkakoski, P., Negishi, M., 2000. Review article: regulation of cytochrome P450 (CYP)
    genes by nuclear receptors. Biochem. J. 347, 321-337.
Jigorel, E., LeVee, M., Boursier-Neyret, C., Bertrand, M., Fardel, O., 2005. Functional
    expression of sinusoidal drug transporters in primary human and rat hepatocytes.
    Drug Metab. Dispos. 33, 1418-1422.
Jigorel, E., LeVee, M., Boursier-Neyret, C., Parmentier, Y., Fardel, O., 2006. Differential
    regulation of sinusoidal and canalicular hepatic drug transporter expression by
    xenobiotics activating drug-sensing receptors in primary human hepatocytes. Drug
    Metab. Dispos. 34, 1756-1763.
LeCluyse, E.L., Bullock, P.L., Parkinson, A., Hochman, J.H., 1996. Cultured rat hepatocytes.
    Pharm. Biotechnol. 8, 121-159.
LeCluyse, E., Madan, A., Hamilton, G., Carroll, K., Dean, R., Parkinson, A., 2000.
    Expression and regulation of cytochrome P450 enzymes in primary cultures of
    human hepatocytes. J. Biochem. Mol. Toxicol. 14, 177-188.
Martin, M.T., Brennan, R., Hu, W., Ayanoglu, E., Lau, C., Ren, H., Wood, C.R., Corton, J.C.,
    Kavlock, R.J., Dix, D.J., 2007. Toxicogenomic study of triazole fungicides and
    perfluoroalkyl acids in rat livers predicts toxicity and categorizes chemicals based
    on mechanisms of toxicity. Toxicol. Sci. 97, 595-613.
Meng, L.J., Wang, P., Wolkoff, A.W., Kim, R.B., Tirona, R.G., Hofmann, A.F., Pang, K.S.,
    2002. Transport of the sulfated, amidated bile acid, sulfolithocholyltaurine, into rat
    hepatocytes is mediated by oatp1 and oatp2. Hepatology 35, 1031-1040.
Ortiz, D.F., Moseley, J., Calderon, G., Swift, A.L., Li, S., Arias, I.M., 2004. Identification of
    HAX-1 as a protein that binds bile salt export protein and regulates its abundance in
    the apical membrane of Madin-Darby canine kidney cells. J. Biol. Chem. 279,
    32761-32770.
Staudinger, J.L., Madan, A., Carol, K.M., Parkinson, A., 2003. Regulation of drug
    transporter gene expression by nuclear receptors. Drug Metab. Dispos. 31, 523-527.
Tully, D.B., Bao, W., Goetz, A.K., Blystone, C.R., Ren, H., Schmid, J.E., Strader, L.F., Wood,
    C.R., Best, D.S., Narotsky, M.G., Wolf, D.C., Rockett, J.C., Dix, D.J., 2006. Gene
    expression profiling in liver and testis of rats to characterize the toxicity of triazole
    fungicides. Toxicol. Appl. Pharmacol. 215, 260-273.
U.S. Environmental Protection Agency, 1995. Myclobutanil; pesticide tolerances. Office
    of Prevention, Pesticides and Toxic Substances, Washington DC. Fed. Regist. 60,
    40500-40503.
U.S. Environmental Protection Agency, 2005a. Myclobutanil; pesticide tolerances for
    emergency exemptions. Office of Prevention, Pesticides and Toxic Substances,
    Washington DC. Fed. Regist. 70, 49499-49507.
U.S. Environmental Protection Agency, 2005b. Propiconazole; pesticide tolerances for
    emergency exemptions. Office of Prevention, Pesticides and Toxic Substances,
    Washington DC. Fed. Regist. 70, 43284-43292.
U.S. Environmental Protection Agency, 2006. Triadimefon. Preliminary Human Health
    Risk Assessment Revised. Office of Prevention, Pesticides and Toxic Substances,
    Washington DC.
Vanden Bossche, H., Marichal, P., Gorrens, J., Coene, M.-C., 1990. Biochemical basis for
    the activity and selectivity of oral antifungal drugs. Br. J. Clin. Pract. Suppl. 71,
    41-46.
Ward, W.O., Delker, D.A., Hester, S.D., Thai, S.-F., Wolf, D.C., Allen, J.W., Nesnow, S., 2006.
    Transcriptional profiles in liver from mice treated with hepatotumorigenic and
    nonhepatotumorigenic triazole conazole fungicides: propiconazole, triadimefon,
    and myclobutanil. Toxicol. Pathol. 34, 863-878.
Wolf, D.C., Allen, J.W., George, M.H., Hester, S.D., Sun, G., Moore, T., Thai, S.-F., Delker, D.,
    Winkfield, E., Leavitt, S., Nelson, G., Roop, B.C., Jones, C., Thibodeaux, J., Nesnow, S.,
    2006. Toxicity profiles in rats treated with tumorigenic and nontumorigenic
    triazole conazole fungicides: propiconazole, triadimefon, and myclobutanil.
    Toxicol. Pathol. 34, 895-902.

-------
Publisher Informa Healthcare
                           Toxicology Mechanisms and Methods
                           Publication details, including instructions for authors and subscription information:
                           http://www.informaworld.com/smpp/title~content=t713396575


                           Understanding Genetic Toxicity Through Data Mining: The Process of Building
                           Knowledge by Integrating Multiple Genetic Toxicity Databases
                           C. Yang a; C. H. Hasselgren b; S. Boyer b; K. Arvidson c; S. Aveston d; P. Dierkes d; R. Benigni e; R. D. Benz f;
                           J. Contrera f; N. L. Kruhlak f; E. J. Matthews f; X. Han g; J. Jaworska h; R. A. Kemper i; J. F. Rathman j; A. M.
                           Richard k
                           a Leadscope, Inc., Columbus, OH
                           b Computational Toxicology, Safety Assessment, AstraZeneca R&D, Molndal, Sweden
                           c U.S. Food and Drug Administration, Center for Food Safety and Applied Nutrition, Office
                             of Food Additive Safety, College Park, MD
                           d Unilever, Safety and Environmental Assurance Centre, Bedford, Bedfordshire, England
                           e Environment and Health Department, Istituto Superiore di Sanita', Rome, Italy
                           f U.S. Food and Drug Administration, Center for Drug Evaluation and Research, Office of Pharmaceutical Science,
                             Informatics and Computational Safety Analysis Staff, 10903 New Hampshire Avenue, Silver Spring, MD
                           g DuPont Haskell Global Centers for Health & Environmental Sciences, Newark, DE
                           h Procter & Gamble, Central Product Safety, Strombeek, Bever, Belgium
                           i Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT
                           j Department of Chemical and Biomolecular Engineering, The Ohio State University, Columbus, OH
                           k National Center for Computational Toxicology, U.S. Environmental Protection Agency, Research Triangle
                             Park, NC

                           Online Publication Date: 01 February 2008
To cite this Article: Yang, C., Hasselgren, C. H., Boyer, S., Arvidson, K., Aveston, S., Dierkes, P., Benigni, R., Benz, R. D., Contrera, J.,
Kruhlak, N. L., Matthews, E. J., Han, X., Jaworska, J., Kemper, R. A., Rathman, J. F. and Richard, A. M. (2008) 'Understanding
Genetic Toxicity Through Data Mining: The Process of Building Knowledge by Integrating Multiple Genetic Toxicity
Databases', Toxicology Mechanisms and Methods, 18:2, 277-295
To link to this Article: DOI: 10.1080/15376510701857502
URL: http://dx.doi.org/10.1080/15376510701857502

-------
Toxicology Mechanisms and Methods, 18:277-295, 2008
ISSN: 1537-6516 print; 1537-6524 online
DOI: 10.1080/15376510701857502
Understanding Genetic Toxicity Through Data Mining:
The Process of Building Knowledge by Integrating
Multiple Genetic Toxicity Databases
 C. Yang
 Leadscope, Inc., 1393 Dublin Road,
 Columbus, OH, 43215
 C. H. Hasselgren and S. Boyer
 Computational Toxicology, Safety
 Assessment, AstraZeneca R&D, 431 83
 Molndal, Sweden

 K. Arvidson
 U.S. Food and Drug Administration, Center
 for Food Safety and Applied Nutrition, Office
 of Food Additive Safety, 5100 Paint Branch
 Parkway, College Park, MD, 20740

 S. Aveston and P. Dierkes
 Unilever, Safety and Environmental
 Assurance Centre, Colworth House,
 Sharnbrook, Bedford, Bedfordshire, MK44
 1LQ, England

 R. Benigni
 Istituto Superiore di Sanita', Environment
 and Health Department, Viale Regina Elena
 299, 00161 Rome, Italy

 R. D. Benz, J. Contrera, N. L. Kruhlak,
 and E. J. Matthews
 U.S. Food and Drug Administration, Center
 for Drug Evaluation and Research, Office of
 Pharmaceutical Science, Informatics and
 Computational Safety Analysis Staff 10903
 New Hampshire Avenue, Silver Spring, MD
 20993-0002

 X. Han
 DuPont Haskell Global Centers for Health &
 Environmental Sciences, Newark, DE 19714

 J. Jaworska
 Procter & Gamble, Central Product Safety,
 100 Temselaan, 1853 Strombeek-Bever,
 Belgium

 R. A. Kemper
 Boehringer Ingelheim Pharmaceuticals Inc.,
 175 Briar Ridge Rd, Ridgefield, CT,
 06877-0368
 J. F. Rathman
 Department of Chemical and Biomolecular
 Engineering, The Ohio State University, 140
 W. 19th Ave., Columbus, OH 43210

 A. M. Richard
 National Center for Computational
 Toxicology, U.S. Environmental Protection
 Agency, Research Triangle Park, NC 27711

Received 20 November 2007;
accepted 5 December 2007.
This article is not subject to United States
Copyright laws.
Address correspondence to C. Yang, 1393
Dublin Road, Columbus, OH 43215,
Leadscope, Inc. E-mail:
cyang@leadscope.com
ABSTRACT   Genetic toxicity data from various sources were integrated into
a rigorously designed database using the ToxML schema. The public database
sources include  the  U.S.  Food and Drug  Administration (FDA) submission
data from approved new drug applications, food contact notifications, generally
recognized as safe food ingredients, and chemicals from the NTP and CCRIS
databases.  The  data from  public  sources were then  combined with data
from private industry according to ToxML criteria. The resulting "integrated"
database, enriched  in pharmaceuticals, was used  for  data  mining analysis.
Structural  features  describing  the  database were  used to  differentiate the
chemical spaces of drugs/candidates, food ingredients, and industrial chemicals.
In general,  structures for drugs/candidates and food ingredients are associated
with lower frequencies of mutagenicity and clastogenicity, whereas industrial
chemicals as a group contain a much higher proportion of positives. Structural
features were selected to  analyze endpoint outcomes of the genetic toxicity
studies. Although most of the well-known genotoxic carcinogenic alerts were
identified,  some discrepancies from the classic Ashby-Tennant alerts were
observed. Using these influential features  as  the independent variables, the
results of four types of genotoxicity studies  were  correlated.  High  Pearson
correlations were found between the results of Salmonella mutagenicity and
mouse lymphoma assay testing as  well as those from  in vitro  chromosome
aberration  studies. This paper demonstrates the  usefulness of representing a
chemical by its  structural features and the use  of these  features to profile
a battery of tests rather  than relying  on  a  single toxicity test of a given
chemical. This  paper  presents data mining/profiling methods applied in a
weight-of-evidence approach to assess  potential  for genetic toxicity,  and to
guide the development of intelligent testing strategies.

Keywords   Genetic  Toxicity; Databases; ToxML; SAR; QSAR; Structural Alerts
                           INTRODUCTION

   Computational toxicology is the application of computational methods to iden-
tify,  characterize,  and assess the  hazards and risks that chemicals pose to human
health and the environment (Environmental Protection Agency  [EPA] NCCT 2007).
One  aspect of computational toxicology involves implementing reliable predictive
methods similar to  those traditionally relied upon in structure-activity  relationships
(SARs)  employed  for  small molecules.  Current  research
trends include computational  screening of chemicals  and
extension to molecular-level understanding via bioassays  and
genomics/proteomics methodologies (Richard 2006). The  Na-
tional Toxicology Program's (NTP) high-throughput screening
project is a good example of this trend (NTP NICEATM 2007).
Profiling compounds based on both chemical and biological
domains as a first step in understanding complex toxicity can
be applied using predictive toxicology to  screen and prioritize
chemicals in both the development and safety assessment stages.
   Whether a  method involves profiling  biological endpoints
or building quantitative structure-activity  relationship (QSAR)
models, these computational approaches require  sufficiently
large  databases with qualified  data. Issues related  to data
availability,  size,  and quality are  obstacles for both  private
and public sector entities. To develop biologically meaningful
methods that accurately predict toxicity requires good coverage
of the relevant chemical space; however, this may not be feasible
due to the labor-intensive  nature of experimental  toxicology
research. Data  sources in private sector entities are often limited
in scope by their targets of interest. In addition, most of the
information accumulated during the lengthy period of product
discovery is typically not kept in an easily accessible format and
is not well designed for data exchange/sharing and integration with
other resources. For this paper, therefore, an integrated database
was constructed in which several private sector databases augment
public and commercial sources. Genetic toxicity is the focus and
serves as an example of the benefits that can be derived from
such integration.
   Genetic toxicity is one  area in which chemical reactivity
mechanisms and modes of action (MOA) are at least somewhat
understood. The  results of a battery of genetic toxicity tests
remain of interest to many industries and regulatory agencies
(Contrera et al. 2005; Matthews  et al. 2006a, 2006b; Benigni
et al. 2007).  Structural alerts (SAs)  developed from  genetic
toxicity studies provide  a  basis  for screening for genotoxic
carcinogens and have been discussed extensively in the literature
(Ashby  1985;  Ashby and Tennant  1991;  Kazius et al. 2005;
Benigni 2004; Kirkland et al. 2005).  A comprehensive review
based on a workshop held in support of the European Union's
REACH legislation  has also been published (Benigni et al.
2007). Recently,  many  review articles and communications
have  been reported in  response  to REACH  as well as  the
Seventh  Amendment for  Cosmetics (Kirkland et al. 2007;
Jacobson-Kram and Contrera 2007; Tweats et al. 2007; O'Brien
et al.  2006; Cimino 2006; Yang et al. 2006). In this paper, the
benefit of constructing a toxicity database with a rigorous set
of data  models for  data mining is  demonstrated. Further, the
database was profiled for structures and various biological effects
in an  attempt to understand the relationships between chemical
and biological domains for various genetic toxicity endpoints.
Genetic  toxicity  profiles  in this  paper  are  based  on  the
four common study types:  bacterial mutagenesis, mammalian
mutagenesis, in vitro chromosome aberration (in vitro CA),
and in vivo micronucleus (MN). An improved understanding
of the chemical-biological domains opens  doors for identifying
structural features relevant for each  toxicity endpoint. The
ability to  determine structural  feature-level correlations to
biology allows us to derive MOA knowledge from the data.
This process is demonstrated in this paper using the database
integrated from  sources including the U.S. Food and Drug
Administration (FDA), the NTP, and private industries. The
ultimate goal of the methodology is to implement a systematic
weight-of-evidence (WOE) approach based on clearly defined
and transparent rules to assess toxicity and to guide the
development of intelligent testing strategies.

  TABLE 1   Composition of the 2006 integrated database based on data source^1

  Database source          CDER   CFSAN-Indirect   CFSAN-PAFA   NTP + CCRIS + Other^2   Private
  CDER                      231                0            0                      17         0
  CFSAN-Indirect                             131            0                      14         1
  CFSAN-PAFA                                              356                     184         3
  NTP + CCRIS + Other^2                                                         1,930        43
  Private                                                                                   919

    1 Only the structurable compound records are included in the count statistics.
    2 Tokyo-Eiken 2007.


         MATERIALS AND METHODS

                Database  Coverage

  Data sources and ToxML database
    The integrated database used in this investigation consists
  of publicly available data assembled  as  the  Leadscope SAR-
  ready database (2006 version) and a collection of information
  from  private industries. Table 1 summarizes the sources of
  the  data  in  this integrated database. The  FDA databases in
  Table 1 were  developed using a ToxML  database  standard
  through a cooperative research and  development agreement
  (CRADA) between Leadscope and the U.S. FDA.1 ToxML is
  an XML standard developed to represent toxicity experiments
  with  an  ontology to  provide an endpoint-specific  set  of
  controlled vocabulary. ToxML has been described  in other
  articles, including another paper in this issue (Richard et al.
  2008). The term "SAR-ready" refers to a database in which the
  ToxML experimental  data,  possibly compiled from multiple
  source databases into an integrated database, has undergone
  further processing and aggregation to produce summary study-
  type endpoint assessments (i.e., assigned "calls" for individual
  chemicals). The data content included approved NDAs (new
  drug  applications),  FCNs  (food contact  notifications)  for
  packaging, sanitizers, etc., and the FDA's PAFA (priority-based
  assessment of food  additives)  database.  The PAFA  database
  includes toxicity data on a variety of food ingredients, including
  direct food additives  (e.g., aspartame, sucralose), food contact
  substances/indirect food  additives (e.g., packaging materials,
  sanitizers, plastics additives),  generally recognized  as  safe
  (GRAS) ingredients (e.g., sodium benzoate, phosphoric acid),
  and flavors. The chemicals in the PAFA database, however, are
  not all FDA-regulated food ingredients (FDA CFSAN EAFUS
  2007). Approximately 70% of the compounds in the PAFA
  database are flavoring agents. Other public sources include
  the NTP (NTP 2007), the Chemical Carcinogenesis Research
  Information System (CCRIS 2007), Tokyo-Eiken (Tokyo-Eiken
  2007), and several primary publications (Crebelli et al. 1999).
  These public databases are enriched with chemicals termed in
  this paper as "industrial" covering a wide variety of compounds,
  including agricultural chemicals and consumer products. In this
  paper, chemicals that are not classified as drugs/candidates or
  food ingredients are, in general, grouped as "industrial." The
  Carcinogenic Potency Database (CPDB), a popular source for
  Salmonella mutagenicity data, was not included in the SAR-ready
  or integrated databases because no study protocols or references
  are available from the source. Approximately 73% of the CPDB
  compounds with Salmonella data, however, are found in the
  SAR-ready database with study details, references, and calls.

  1 The data compiled for the FDA/Leadscope CRADA are available through
  the U.S. Freedom of Information Act (FOIA); no proprietary data were used
  to build the FDA CRADA databases.

  TABLE 2   Study type and genetic toxicity recorded in the SAR-ready database

  Study types                        Compound level calls
  Bacterial mutagenesis              Bacterial mutation, Salmonella mutation, strain-level mutations
                                     for TA100, TA98, TA1535, and TA1537
  Mammalian mutagenesis              Mammalian mutation, MLA, CHL V79, CHO HPRT mutations
  In vitro chromosome aberrations    In vitro chromosome aberration, CHO chromosome aberration,
                                     CHL (V79) chromosome aberration, human blood (leukocytes,
                                     peripheral blood, etc.) chromosome aberration
  In vivo chromosome aberrations     Insufficient data for analysis
  In vivo micronucleus               In vivo MN, mouse MN (peripheral blood, bone marrow),
                                     rat MN (peripheral blood, bone marrow)
  In vitro micronucleus              Insufficient data for analysis

   A total of four  private  industry  databases  constructed
according to the same ToxML standard were used  as examples
to demonstrate  the  benefits  of integration. These  private
sources were categorized as drugs/candidates, food  additives,
and industrial chemicals. The therapeutic indications associated
with the  drugs/candidates  included inflammation, immunol-
ogy, cardiovascular, gastrointestinal,  respiratory, antipsychotics,
analgesics, anticancer, anti-infection, and antiviral categories.
The actual chemical structures  of the proprietary  compounds
were not required and not  used for  any of the analyses in this
paper; the sharing of knowledge derived from these  structures
was accomplished  solely  through  general structural  feature
statistics. Definition and methods of obtaining structure fea-
tures and statistics are described in this  Materials and Methods
section.
                             ToxML evaluation of genetic toxicity
                             studies
                               The ToxML standard has been  used  to transform public
                             databases as well as to construct databases from FDA and private
                             sector data. In addition, the process of building an  SAR-ready
                             genetic toxicity database was developed by defining a set of
                             fields  to meet  the inclusion criteria to be used in endpoint
                             assessment. The criteria were defined by  consultation with
                             scientists from the FDA and industry. Table 2 lists the genetic
                             toxicity study types in the SAR-ready database; it  contains a
                             total of 2,428 compounds  in six genetic  toxicity study types.
                             Table 3 lists specific fields and subfields that were deemed
                             critical for study validation. These  guidelines  were also used
                             to resolve call conflicts that arose when combining studies from
                             multiple sources: calls were assigned for the studies  only if the
                             information defined in Table 3 was available. A compound
                             was not used if there were no acceptable studies meeting these
                             criteria reported in the  database. The ToxML  approach used
                             here for genetic toxicity endpoint assessment applies transparent
                             and systematic rules across the diverse databases to aggregate
                             data from individual experiments across test systems to define
                             a "call" for each chemical compound.  A single  study was
                             designated "positive" if any of the test strains, cell lines, or
                             target cells resulted in a positive response. A single compound
                             was designated positive if it was reported positive in  more than
                             50% of the studies. In contrast, a compound was considered to
                             be negative if reported negative in more than two-thirds (67%)
                             of the studies. The outcome for a compound reported positive
                             in 33% to 50% of the  studies is assigned  as  "intermediate."
In the binary classification analysis presented in this paper,
compounds in the intermediate group were included with
the positives. If a compound was reported only once in the
database, but met the requirements shown in Table 3, then
the compound "call," or endpoint assessment, was given as
reported in the source database. Calls were made for the test
endpoints that are more commonly accepted by regulatory
agencies (Table 2). In vivo chromosome aberration and in vitro
MN study results were not used because of insufficient data
to apply systematic rules for construction of endpoint calls.
Merging this SAR-ready database with the private industry data
using the same assessment criteria resulted in the integrated
genetic toxicity database of 3,220 chemicals with four genetic
toxicity study types, which were then used for the data mining
and analysis.

TABLE 3   ToxML fields essential for determining compound-level calls in the genetic toxicity database

Top levels of ToxML   Fields                     Subfields
Compound level        ID, Chemical name, InChI   ID (Registry numbers...), chemical name (IUPAC name,
                                                 Orange Book active ingredient...)
Study level           Background                 Study type, study source, study start date, reference
Test level            System                     Species, strain, metabolic activation, sex, cell, cell line,
                                                 target cell, route of exposure
                      Control                    Negative control, positive control
                      Dosage regimen             Concentration/dose, solvent/vehicle, frequency, treatment time
                      Conditions                 Solvent vehicle, scoring technique, stain
                      Results                    Test call, cytotoxicity, precipitation
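
The call-assignment rules described above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not the ToxML or Leadscope implementation; the function names and the string labels are assumptions made for the example.

```python
# Hypothetical helpers illustrating the compound-level call rules in the text.

def study_call(test_results):
    """A study is positive if any test strain, cell line, or target cell is positive."""
    return "positive" if "positive" in test_results else "negative"


def compound_call(study_calls):
    """Aggregate study-level calls ("positive"/"negative") into one compound-level call.

    positive:      positive in more than 50% of the studies
    negative:      negative in more than two-thirds of the studies
    intermediate:  everything in between (positive in 33% to 50% of the studies)
    """
    n = len(study_calls)
    if n == 0:
        return None  # no acceptable study: the compound is not used
    pos = sum(1 for call in study_calls if call == "positive")
    if 2 * pos > n:             # integer form of pos / n > 1/2
        return "positive"
    if 3 * (n - pos) > 2 * n:   # integer form of negatives / n > 2/3
        return "negative"
    return "intermediate"


def binary_call(call):
    """For binary classification, intermediates are grouped with the positives."""
    return "positive" if call in ("positive", "intermediate") else "negative"


# One positive study out of three falls in the intermediate band.
calls = [study_call(["negative", "positive"]), "negative", "negative"]
print(compound_call(calls), "->", binary_call(compound_call(calls)))
```

Integer comparisons are used instead of floating-point fractions so that boundary cases such as exactly one-third or exactly one-half positive are classified without rounding surprises.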


              Structural Analysis

Structural feature-based representation
   The Leadscope  genetic toxicity  databases are  represented
structurally by chemical classes. Structural features defined  in
the Leadscope chemical hierarchy (Roberts  et al. 2000) were
used to characterize and analyze compound classes  of the
various genetic toxicity databases. Leadscope features  include
classes of benzenes, functional  groups,  heterocycles, fused
rings, and pharmacophores. Leadscope software fragments the
chemical structure  of a given compound into corresponding
structural features (i.e., specific functional groups,  substitution
patterns, etc.) for automatic grouping. The features describing
the chemical structures in the database were exported as a table
of chemical fingerprints from Leadscope v2.4 for each of the
data sources in the integrated database.  Examples are shown
in Table 4. The fingerprint table is a matrix of [S (compound
structures) x F (chemical features)], where each row corresponds
to one  particular  structure (compound) and  each  column
corresponds to a chemical feature; a value of 1 indicates that
the compound contains that particular feature, whereas a value
of 0 indicates that it does not.
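
As a rough illustration of this layout, the sketch below builds a small S x F fingerprint table in Python. The compound names are placeholders and the feature assignments are invented for the example; the actual table was exported from Leadscope v2.4, whose export format is not reproduced here.

```python
# Minimal sketch of a binary fingerprint table (rows = compounds, columns = features).
# Compound names and feature assignments below are hypothetical.

def fingerprint_matrix(compound_features):
    """Return (compounds, features, matrix) from a dict of compound -> set of features."""
    compounds = list(compound_features)
    features = sorted({f for feats in compound_features.values() for f in feats})
    matrix = [
        [1 if f in compound_features[c] else 0 for f in features]
        for c in compounds
    ]
    return compounds, features, matrix


example = {
    "compound_A": {"nitro, aryl-", "benzene, 1-methyl-"},
    "compound_B": {"epoxide"},
    "compound_C": {"nitro, aryl-", "epoxide"},
}
compounds, features, sf = fingerprint_matrix(example)
for name, row in zip(compounds, sf):
    print(name, row)
```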

Principal component analysis
   Principal component analysis (PCA) is a multivariate data
analysis technique with numerous diverse and important
applications. One important feature of PCA is that it is ideal
for handling data sets in which the number of observations,
S (compound structures), is less than the number of descriptor
variables, F (chemical features), a situation for which
conventional modeling approaches are inadequate. In the computation
of principal components (PCs), the original data can be thought
of as S independent observations of F random variables. The
maximum possible number of principal components equals
the rank (r) of the data matrix, and therefore r ≤ min(S,
F). Generally, a large proportion of the overall variability is
accounted for by a small number, p, of the PCs, where p << r.
Thus, the original S x F matrix, [SF], can be replaced by a new
matrix of S observations x p latent variables with very little loss
of information, where p << F. PCA as such is an unsupervised
technique (i.e., its  aim  is simply to project  the original high
dimensional set into a  lower dimensional  one  spanned  by
orthogonal components).  Nevertheless, it  can  be used for
pattern recognition purposes  as well. Pattern recognition  on
the principal component scores is performed as with any other
type of descriptors; however,  pattern recognition  on PCs has
  several noteworthy  advantages.  In many instances  the  first
  component, even though it is the one explaining the highest
  proportion of variance, only reflects the scaling properties of
  the studied system, and is a pure "size" indicator, whereas the
  next  components describe the "shape," or quality properties
  of the system (Darroch and Mosimann 1985). In addition, the
  principal components are orthogonal to each other and, hence,
  are uncorrelated and do not carry any redundant information;
  this allows for the generation of well-conditioned models by
  avoiding mutual correlation of regressors that creates ambiguity
  in the model selection procedure. Since the PC scores are linear
  combinations of all of the original variables, each of the original
  variables contributes to each PC in  the presence of all other
  variables, making this a truly multivariate approach (Jolliffe
  2002; Benigni and Giuliani 1994).
     Structural domains of the databases can thus be visualized
  by plotting various PCs. Although the full p-dimensional PC
  space cannot be visualized if p > 3, projection plots in 2
  or 3 dimensions are often very useful. Leadscope structural
  features were used as independent variables; structural features
  appearing  in 20  or  more  compounds were included in  this
  unsupervised analysis. As noted, it is generally possible to select
  a small number, p, of PCs that  explain  most of the variation
  in the data set. In order to discard  PCs not relevant for the
  problem under study, the eigenvector coefficients of the p PCs
  are analyzed. Removed from consideration were PCs associated
  with  eigenvectors in which all coefficients are  the same  sign
  (either all  positive or all negative). As stated above, the  PCs
  with  all coefficients of the same sign are "size" components, and
  describe the number of "active" Leadscope features, whereas the
  PCs with mixed signs describe the different types of chemicals in
  the databases: the latter is the use of PCs that we were interested
  in. The principal component analyses in Figures 2 and 4 were
  performed using MATLAB 7.4.
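
The selection procedure described above (keep frequent features, compute PCs, discard same-sign "size" components) might look roughly like the following NumPy sketch. It is not the authors' MATLAB code. The function name, the tolerance, and the choice to column-centre the fingerprint matrix before the decomposition are assumptions, since the text does not state how the data were pre-processed.

```python
import numpy as np


def shape_components(fingerprints, min_count=20, tol=1e-10):
    """PCA on a binary S x F fingerprint matrix, keeping only mixed-sign ("shape") PCs.

    Returns (scores, loadings, explained_variance_ratio) for the retained PCs.
    """
    X = np.asarray(fingerprints, dtype=float)
    X = X[:, X.sum(axis=0) >= min_count]      # features present in >= min_count compounds
    if X.shape[1] == 0:
        raise ValueError("no feature meets the frequency threshold")
    Xc = X - X.mean(axis=0)                   # assumed pre-processing: column centring
    U, sing, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = sing ** 2 / np.sum(sing ** 2)
    informative = sing > tol * sing.max()     # ignore numerically null directions
    # A "size" component has loadings of a single sign; keep PCs with both signs.
    mixed = np.array([(v > tol).any() and (v < -tol).any() for v in Vt])
    keep = informative & mixed
    return U[:, keep] * sing[keep], Vt[keep], explained[keep]


# Toy data: 6 compounds x 3 features (threshold lowered for the tiny example).
toy = [[1, 1, 1], [1, 1, 1], [0, 0, 0], [0, 0, 0], [1, 0, 0], [0, 1, 0]]
scores, loadings, explained = shape_components(toy, min_count=2)
print(scores.shape, np.round(explained, 3))
```

In the toy data all three features are positively correlated, so the leading component has same-sign loadings and is dropped as a "size" component, leaving two mixed-sign components for plotting.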


  Statistics for characterizing the
  chemical-biological domain
  Z-Score Statistic. The compounds in the database were
  grouped by pre-defined substructure searching. The resulting
  chemical classes can be correlated with the biological responses,
  after which the mean and the standard deviation of each
  compound class are calculated. The z-score statistic provides a
  metric to estimate whether the compound class is more highly
  associated with the biological response than the mean of the
  whole database. The z-score compares the mean activity of a
  subset to the expected value according to:

      z = (\bar{x}_1 - \bar{x}_0) \sqrt{ \frac{n_1 n_0}{s_0^2 (n_0 - n_1)} }        (1)

  where \bar{x}_1 and \bar{x}_0 are the mean responses of the subset and
  full set, respectively, n_1 and n_0 are the corresponding set sizes, and s_0^2
  is the sample variance of the full set. Equation 1 represents a
  z-score adjusted for set sizes  to correct for the fact that small
  sets with extreme values  would have too large an  impact if
  conventional z-scores were used. For example, the adjusted z-
  score will balance the following two situations: (1) a particular
  class has only a few compounds but all have much higher values
  than the average of the whole  data set; (2) a particular class has a
  large number of compounds but also a broad distribution with
its own average being greater than that of the whole data set.
In this case, the adjusted z-score will be greater for the second
case although the class mean is smaller than in the first case. In
this paper, all the z-scores were calculated using equation 1. The
features and statistics (means, z-scores, and frequencies) of each
of the database sources were exported from Leadscope v2.4.
The database sources include the SAR-ready genetic toxicity
database and the private collections. The exported data were
combined and the statistics (means, z-scores, and frequencies)
were recalculated for the entire integrated database outside
of Leadscope software. Examples of the resulting integrated
database are listed in Table 4.

TABLE 4   A selection of influential differentiating features of chemicals in the integrated database across the genetic toxicity endpoints^1

Feature class              Chemical feature
Alcohol                    alkenyl, cyc-
                           alkyl
                           aryl
Aldehyde                   alkyl
                           aryl
Amine                      amine(NH2), alkyl-
                           amine(NH2), aryl-
                           sec-amine(NH), aryl-
                           tert-amine, aryl-
Azo
Bases, nucleosides
Benzene                    benzene, 1-alkylamino-
                           benzene, 1-carbonylamino-
                           benzene, 1-heteroamino-
                           benzene, 1-methyl-
                           benzene, 1,2,3,4-fused
Epoxide
Ether                      ether, alkenyl, cyc-
Furan
Halide                     halide, alkenyl-
                           halide, alkyl-
                           halide, p-alkyl-
                           halide, s-alkyl-
                           halide, aryl-
Hydrazine                  hydrazine, alkyl, acyc-
                           hydrazine, aryl-
Ketone                     ketone, alkenyl, cyc-
                           ketone, aryl-
Iminomethyl, alkenyl, cyc-
Nitro                      nitro
                           nitro, aryl-
Nitroso and Nitrosamine
Phosphorous groups
Heterocycle                benzimidazole
                           benzopyran
                           imidazole
                           pyrazine

[Per-endpoint statistics columns are not legible in this copy.]
  1 Cells with z-scores >2 are in bold fonts to indicate a strong association with positives. The full feature table of the
    integrated database will be available online: http://www.leadscope.com, http://www.epa.gov/ncct/dsstox and http://ambit.acad.bg
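
The size-adjusted z-score of equation 1 can be illustrated with a short Python sketch. The function name and the toy data are assumptions for this example; responses are coded 1 for a positive call and 0 for a negative call, and the sample variance of the full set is taken with the usual n - 1 denominator.

```python
import math
from statistics import mean, variance


def adjusted_z(subset, full_set):
    """Size-adjusted z-score comparing a class mean with the full-set mean."""
    n1, n0 = len(subset), len(full_set)
    if not 0 < n1 < n0:
        raise ValueError("subset must be non-empty and smaller than the full set")
    s0_sq = variance(full_set)  # sample variance of the full set
    if s0_sq == 0:
        raise ValueError("full set has zero variance")
    return (mean(subset) - mean(full_set)) * math.sqrt(n1 * n0 / (s0_sq * (n0 - n1)))


# Illustrative full set: 100 compounds, 20 of them positive.
full = [1] * 20 + [0] * 80
small_class = [1, 1, 1]               # 3 compounds, all positive
large_class = [1] * 15 + [0] * 15     # 30 compounds, half positive

z_small = adjusted_z(small_class, full)
z_large = adjusted_z(large_class, full)
print(round(z_small, 2), round(z_large, 2))
```

With these toy numbers the larger class receives the higher z-score even though its mean response is lower, which is the balancing behaviour the two situations in the text describe.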

Multivariate  Correlation. Finding  structural features
that correlate with  toxicity can be mathematically represented
by:
                    [SF]^{T} [ST] = [FT]        (2)
where [SF] represents a fingerprint table. [ST] is the response
matrix, with each row corresponding to a structure and each
column a  toxicity  endpoint of  interest.  The  [FT]  matrix
obtained by the above operation is  thus an array in which
each row  corresponds  to  a particular  chemical  feature and
each column a toxicity endpoint. Thus,  [FT]  provides a direct
connection between  structural features  and endpoints, which
subsequently is used to discover structural alerts  through the
analysis techniques presented in this paper.
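
A small worked example may make equation 2 concrete. The sketch below multiplies the transpose of a fingerprint table by a response matrix in plain Python. The matrices are invented, and the response matrix is assumed to be binary (1 = positive call), in which case each entry of [FT] counts the compounds that carry a feature and are positive for an endpoint.

```python
def feature_toxicity(sf, st):
    """Compute [FT] = [SF]^T [ST] for list-of-lists matrices with matching row counts."""
    if len(sf) != len(st):
        raise ValueError("[SF] and [ST] must have one row per structure")
    n_features, n_endpoints = len(sf[0]), len(st[0])
    return [
        [sum(sf[s][f] * st[s][t] for s in range(len(sf))) for t in range(n_endpoints)]
        for f in range(n_features)
    ]


# Hypothetical data: 3 structures, 2 features, 2 endpoints.
sf = [[1, 0],
      [1, 1],
      [0, 1]]
st = [[1, 0],
      [1, 1],
      [0, 0]]
ft = feature_toxicity(sf, st)
print(ft)  # rows = features, columns = endpoints
```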
   For profiling chemicals with structural features and toxicity
endpoints, relevant structural features must be selected. These
structural features are correlated with toxicity endpoints, and
selected features become  the independent  variables  in the
subsequent multivariate analysis. The feature selection criteria
used to differentiate toxicity endpoints are summarized below.
For a given endpoint, features that give the  highest absolute
z-scores are selected. Across the endpoints, features resulting in
the largest variance across the different endpoints are selected,
which is equivalent to finding the largest difference between
minimum and maximum  values.  It is  also desirable to find
features with  the highest  or  lowest  mean values across the
endpoints with  small variance. In  addition, the  frequency
of a feature (i.e., the number of compounds that a feature
describes in the  data set) is also considered. The correlation
is quantitatively expressed by the Pearson correlation:

                    r_{ij} = \frac{s_{ij}}{\sqrt{s_i^2 \, s_j^2}}        (3)

where s_i^2 and s_j^2 are the variances of variables i and j, and s_{ij} is their
covariance. A pair-wise Pearson correlation coefficient for assays
i and j is calculated using z-scores of selected features against
the particular test endpoints using features as observations.
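
The pair-wise calculation can be sketched as follows, treating each selected feature as one observation and each assay as one variable. The function names are assumptions and the z-score values are made up purely to exercise the code; they are not results from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance divided by the root of the product of variances."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long sequences with at least two values")
    mx, my = sum(x) / n, sum(y) / n
    s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    s_xx = sum((a - mx) ** 2 for a in x) / (n - 1)
    s_yy = sum((b - my) ** 2 for b in y) / (n - 1)
    return s_xy / math.sqrt(s_xx * s_yy)


def assay_correlations(z_by_assay):
    """Pair-wise correlations between assays; each list holds z-scores for the same features."""
    assays = list(z_by_assay)
    return {
        (a, b): pearson(z_by_assay[a], z_by_assay[b])
        for i, a in enumerate(assays)
        for b in assays[i + 1:]
    }


# Made-up z-scores for four features, for illustration only.
z_scores = {
    "assay_1": [4.0, 1.0, -2.0, 3.0],
    "assay_2": [3.5, 0.5, -1.0, 2.5],
    "assay_3": [-1.0, 2.0, 0.5, 0.0],
}
for pair, r in assay_correlations(z_scores).items():
    print(pair, round(r, 2))
```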

        RESULTS AND  DISCUSSION

     Chemical Domain  of Integrated
                     Databases

Structures in the database
   Approximately  16%  of  the  structures  in the  SAR-ready
database are drugs, 16% are food ingredients (direct and
indirect), and 68% are industrial chemicals. When the four
private sources are combined with the SAR-ready database
to create the integrated genetic toxicity database, the proportion
of drugs/candidates increases to 30% and that of general
industrial chemicals decreases to 48%. Table 1 shows the
compound counts by  data sources included in the integrated
database. There  are very few overlaps  between the various
sources with the exception of the PAFA data  from CFSAN.
About 52% of the PAFA food ingredients were also found in
NTP/CCRIS, as  shown in Table 1. The chemicals present in
the intersection of PAFA and NTP/CCRIS are mainly flavoring
(78%) and coloring agents (15%).
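
The overlap figure quoted above can be checked directly against the counts in Table 1. The sketch below assumes that the diagonal entries of Table 1 are the total compound counts per source and the off-diagonal entries are compounds shared by two sources; that reading is an interpretation, although it is consistent with the 52% stated in the text.

```python
# Counts transcribed from Table 1; None marks the empty lower triangle.
SOURCES = ["CDER", "CFSAN-Indirect", "CFSAN-PAFA", "NTP+CCRIS+Other", "Private"]
TABLE_1 = [
    [231,  0,    0,    17,   0],
    [None, 131,  0,    14,   1],
    [None, None, 356,  184,  3],
    [None, None, None, 1930, 43],
    [None, None, None, None, 919],
]


def shared(a, b):
    """Number of compounds common to sources a and b (a == b gives the source total)."""
    i, j = sorted((SOURCES.index(a), SOURCES.index(b)))
    return TABLE_1[i][j]


def overlap_fraction(source, other):
    """Fraction of compounds in `source` that also appear in `other`."""
    return shared(source, other) / shared(source, source)


pafa_in_ntp = overlap_fraction("CFSAN-PAFA", "NTP+CCRIS+Other")
print(f"{pafa_in_ntp:.1%}")
```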
   The chemical  structures  in the databases can be further
characterized by structural classes and substructural fragments.
The NTP  and CCRIS  databases  contain  mostly industrial
chemicals with increased presence of pyran(H), benzopyran,
and alkyl and aryl halides features. In contrast, drug compounds
from the  FDA  CDER enrich the heterocyclic classes but do not
generally contain reactive groups associated with mutagenicity
such as  unsaturated aldehydes or N-nitroso features.  Food
ingredients do  not contain carbamates or common heterocycles
such  as  1,4-dihydro  pyridines,  thiophenes, and  thiazoles.
Figure 1 depicts the  Leadscope compound class  distribution of
the integrated  database.  Structural features differentiating the
three broad substance use types, without regard  to  endpoint
classification, are selected as examples.
        Structural domain comparisons
          Two  of the  most  important reasons for building the
        integrated database from multiple sources were (1) to diversify
        and augment the chemical spaces of the public databases, which
        are heavily biased toward industrial chemicals; and (2) to address
        the  biological responses of the various chemical domains and
        determine if SAR patterns relating the chemical and biological
        domains can be discovered. In addition, the integrated database
        permits an  analysis of the biological activity associated with
        specific chemical structural features when these features are
        present in different chemical environments. It  is the context
        of where and how the features  are  positioned and  with the
        combinations of other features in the whole molecule that is
        critical. The makeup of a whole molecule by various structural
        features is manifested in the chemical substance use types, for
        example, drugs vs. surfactants. For this reason, adding biological
        and chemical structure information of drugs/candidates and
        food ingredients to the  set of general industrial chemicals
        was extremely important. It should be stressed  again that the
        chemical  structure information  was in the form  of features
        statistics as shown in Table 4.
          Approximately 50% of the SAR-ready database structures
        were randomly selected and their Leadscope structural features
        exported  as chemical fingerprints from  Leadscope  v.2.4. This
        random sampling preserved  the ratio between the  chemical
        substance use types (i.e., drugs, food ingredients) and general
        industrial chemicals. The  structures were then  clustered using
        principal component analysis (PCA) using the structural feature
        fingerprints as descriptors. In this analysis, only those features
        present in 20 or more  compounds  were included. As noted
        previously,  a  large proportion of the  variation in  the data
        is typically captured  by a relatively small number  of latent
        variables (principal components). If  one chooses p PCs, then
-------
FIGURE 1  Structural classes differentiating the broad substance use categories within the integrated database (3,220 compounds).
[Panels: Drugs/candidates, Food ingredients, Industrial chemicals. Features shown: Aldehyde; Carbamate; Epoxide; Halide; alkyl halide; aryl halide; Heterocycle; Imidazole; Naphthalene, 1-subst; Nitro; Oxolane(H); Pyran(H); Piperidine; Pyridine; Quinoline; Thiazole. x-axis: % Frequency of features, with ticks at 0.1, 1, 10, and 100.]
FIGURE 2  Chemical domain of the databases represented in the latent variable space defined by the three most differentiating principal
components. Orange denotes drugs/candidates, green denotes industrial chemicals, and blue denotes food ingredients. A: SAR-ready databases
only. B: Integrated databases. [Labeled clusters: benzene polyhalides; azo dyes and benzene sulfonates.]

-------
similar compounds  will tend to cluster together in the p-
dimensional space defined by the PCs.
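
The clustering step described above can be sketched roughly as follows, assuming binary feature fingerprints (one row per structure, one column per feature) and an SVD-based PCA in NumPy. The function name, the synthetic data, and the use of NumPy are assumptions for illustration; the original analysis used fingerprints exported from Leadscope.

```python
import numpy as np

def pca_scores(fingerprints, min_count=20, n_components=3):
    """Project binary feature fingerprints onto their leading principal components."""
    X = np.asarray(fingerprints, dtype=float)
    # Keep only features present in at least `min_count` compounds.
    keep = X.sum(axis=0) >= min_count
    Xk = X[:, keep]
    # Center each retained feature before the decomposition.
    Xc = Xk - Xk.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    # Fraction of total variance captured by each component.
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components], keep

# Synthetic fingerprints: 200 structures, 30 features (not real data).
rng = np.random.default_rng(0)
demo = (rng.random((200, 30)) < 0.2).astype(int)
demo[:, 0] = 0
demo[:5, 0] = 1  # a rare feature, present in only 5 structures
scores, explained, keep = pca_scores(demo)
```

Plotting any three columns of `scores` against each other, colored by substance use type, would give a view comparable in spirit to Figure 2, although which components discriminate best depends on the data.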
   The scores of the three most strongly discriminating principal
components (PC2, PC3, and PC4) are plotted in Figure 2A
for the SAR-ready database. The PC plots effectively differenti-
ate the various database sources; most of the defined organics in
the group of industrial compounds from NTP/CCRIS clustered
together except for two local islands. A portion of the industrial
chemicals, namely the  benzene polyhalide  group, clustered
separately with most of the loading on PC3.  The other island
included benzene sulfonates and azo  benzenesulfonyl dyes.
Some of the  food ingredients such  as flavoring and coloring
agents were  clustered closely with  the industrial  chemicals.
Many of the marketed drug compounds with heterocycles were
mostly differentiated as outliers in the SAR-ready database.
   When compounds from the four proprietary collections
were added  to  the  SAR-ready database  and PCA repeated,
the structures from chemical and consumer product industries
overlap significantly with the chemical space of NTP/CCRIS
and food ingredients (Fig. 2B). However, the region of chemical
space populated by drug candidates expanded considerably and
is clearly separated from the structures in NTP/CCRIS. The drug
space can be further differentiated by the therapeutic indications
using the  same structural features. This  domain  analysis
validates a notion of drug-like compounds and biology-driven
chemical analogs. Structural  domains give  further  insights
beyond those obtained  from analysis of the  experimental or
calculated physical properties of various structures.


 Biological Domain of  the  Integrated
                    Database

Endpoints of toxicity database
   Six genetic toxicity study types were included in the SAR-
ready database (Table 2) and four types within the integrated
database were analyzed in depth. Bacterial mutagenesis data
                                             were grouped by four individual Salmonella strains. Mammalian
                                             mutagenesis data were evaluated from mouse lymphoma (MLA)
                                             studies. In vitro chromosome aberration data were grouped by
                                             Chinese hamster lung (CHL) or Chinese hamster ovary (CHO)
                                             cell tests. In vivo micronucleus data include both mouse and rat
                                             studies. These further aggregations of data represent "endpoint
                                             assessment" to define activity categories for use in SAR analysis.
                                                The databases  are also profiled for their distributions of
                                             genetic toxicity study calls (i.e., positive and negative  com-
                                             pounds). These  study call counts reflect the result of the
                                             aforementioned endpoint assessment rule, instead of that given
                                             in the original databases. For example, eugenol (CAS 97-53-0,
                                             2-Methoxy-4-(2-propenyl)phenol) was  considered positive in
                                             the CHO chromosome aberration test with and without S9 in
                                             the study reports from the PAFA database. However, the NTP
                                             study report considers this chemical to be a negative without S9,
                                             and weakly positive with S9 under similar test conditions. In the
                                             SAR-ready and  integrated databases, the compound was thus
                                             counted as positive for CHO  chromosome  aberrations,  both
                                             with and without metabolic activation. However, in all cases,
                                             the original  test and study calls are preserved in the  database
                                             along with the treatment-level results.
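
The effect of the endpoint assessment rule on the eugenol example can be sketched as below. This is an illustration under stated assumptions, not the ToxML implementation: it assumes a "positive wins" aggregation, treats a weak positive as a positive call, and uses hypothetical record fields (`source`, `s9`, `call`).

```python
# Assumption for this sketch: weak positives count as positive calls.
POSITIVE_CALLS = {"positive", "weak positive"}

def aggregate_call(study_calls):
    """Collapse study-level calls for one compound and endpoint into one call.

    Returns "positive" if any study call is positive, None when there are
    no calls at all, and "negative" otherwise.
    """
    calls = [c.strip().lower() for c in study_calls]
    if not calls:
        return None
    return "positive" if any(c in POSITIVE_CALLS for c in calls) else "negative"

# Eugenol, CHO chromosome aberration, as described in the text.
# The source records stay untouched; only a derived call is computed.
eugenol_studies = [
    {"source": "PAFA", "s9": False, "call": "positive"},
    {"source": "NTP", "s9": False, "call": "negative"},
    {"source": "PAFA", "s9": True, "call": "positive"},
    {"source": "NTP", "s9": True, "call": "weak positive"},
]

def calls_for(studies, s9):
    """Select the study calls recorded with or without metabolic activation."""
    return [rec["call"] for rec in studies if rec["s9"] == s9]

without_s9 = aggregate_call(calls_for(eugenol_studies, False))
with_s9 = aggregate_call(calls_for(eugenol_studies, True))
```

Keeping the study-level records separate from the derived call mirrors the point made above that the original test and study calls are preserved in the database.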
                                              Distributions of endpoint outcomes
                                                The distributions shown in  Figures 3 to 6 represent the
                                              compound statistics when the ToxML endpoint criteria were
                                              applied to determine an aggregated call for each compound
                                              for a  given  endpoint.  In general, both  FDA submissions
                                              and private collections are quite low in positive compounds.
                                              The number of positive compounds increased  significantly by
                                              adding data from the  public sources. Taking the Salmonella
                                              endpoint as an example, drugs from the FDA CDER database
                                              contain fewer than 9%  positives, whereas the food ingredients
                                              and the industrial chemicals from CCRIS/NTP give 14% and
                                              39%, respectively. Thus, in the integrated database, the overall
                                              frequency of positives is 33%.
FIGURE 3  Distribution of negative/positive outcomes for Salmonella reverse mutation.
[Chart; y-axis ticks from 0 to 3000; legend: positive, negative.]


-------
FIGURE 4  Distribution of negative/positive outcomes for in
vitro mammalian mutation.
[Chart; y-axis ticks from 200 to 1000; legend: positive, negative.]

   The results in  Figures 3 to  6  also demonstrate that many
more positive studies have historically been reported in  the
in vitro chromosome aberrations  and mammalian mutagenesis
than in bacterial mutagenesis or the in vivo MN test. In the MLA
test (see footnote 2), the proportion of positives is high for the industrial
chemicals from CCRIS and NTP (71%); the same source also
gives 52% positives for in vitro chromosome aberration studies.
Drugs/candidates that have been selected for development seem
to result in much lower positives in the MLA tests (14% positive)
as well as in the in vitro chromosome aberrations (26%) tests
than industrial chemicals. One of the likely explanations is that
drug candidates that were found to be positive in these tests were
dropped  from development and their data were not reported.
The positive frequency trend of in vivo MN follows the pattern
of Salmonella mutagenesis: 7% for CDER drugs, 33% for  the
food ingredients from CFSAN, 43% for the chemicals in CCRIS
and NTP, and 33% for the integrated database. Of the CFSAN
compounds, more than 90% of the positive compounds in
Figures 3  to 6 are from CFSAN PAFA data; 70% to 80% of
these PAFA positive compounds are flavoring agents.
   Exploring correlations  between  observations across  the
genetic toxicity experimental results using a read-across strategy
requires data for many compounds and many different toxico-
logical endpoints. However, at a compound level, there were
only 65 total compounds out of 3,220 in common that had all
possible test results across the Salmonella, MLA, CHL, or CHO
in vitro chromosome aberration, and rodent MN endpoints. A
partial explanation for the low number of compounds with
a complete  test profile  is that  the  integrated  database does
not capture all available genetic toxicity data; our stringent
inclusion criteria forced us to exclude many data points. Based
on this sparse data resource, it is impossible to obtain a
meaningful correlation across the genetic toxicity endpoints at
the compound level. However, a strategy for resolving issues
related to data scarcity in the chemical-biological domain is
presented in the next section. By expanding the chemical space
from compounds to structural features, biologically meaningful
and statistically valid correlations can be made.

2 The MLA data included in the combined database were not evaluated
using the new global equivalency factor recommended by the International
Workshop for Genotoxicity Testing. Re-evaluation of the MLA data using
this factor would undoubtedly decrease the percentage of positive MLA
findings in this data set (Moore et al. 2006).


     Chemical and  Biological Activity
                 Domain  Analysis

    In this  section,  the  biological activity domain  of  the
 integrated database is analyzed in greater detail at the compound
 level across the various endpoints, similarities and dissimilarities
 of findings between the tests  are compared, and  then  the
 analysis is  expanded to the feature level.  In compound-level
 analysis, a  specific endpoint outcome  of a single compound
 is used as an observation. In feature-level analysis, the  average
 outcome for the particular endpoint is calculated so that the
 observation is linked to  specific  chemical structural features
 that may be present in many compounds. Finally, structure
 domain profiling across  the four genetic toxicity endpoints
 is explored  at  the  feature  level.  Correlations  of selected
 structural  features with various endpoints are used  to link
 the observations in the two  domains, chemical structures and
 biology.

 Reverse bacterial mutagenesis profile
    Since the Salmonella test  is one of the most common and
 important genetic toxicity tests, detailed discussion at the strain
 level is  presented. Common questions asked when  testing is
 performed  using different  strains  include whether  there  are
 correlations between test results of different Salmonella strains
 and whether there  is subsequent structural  differentiation.
 Table 5 shows contingency tables for  the compound counts
 for the responses across four Salmonella strains. The numbers
 in parentheses represent the expected values of the appropriate
 null distribution; that is, these counts are what one would expect
 if the two strains being compared were completely independent
 (uncorrelated). Large differences between actual and expected
 counts suggest a significant association between two  strains.
 For example, only 55  compounds were found to be  TA100
 negative and TA1535 positive; under the null hypothesis, 235
 compounds were expected in this category. This table indicates
 that the concordance  (overall  agreement on  both  positives
 and negatives) of any  two  strains ranges from 85% to 90%.
 However, when examined separately, sensitivity (agreement on
 positives) and specificity (agreement on  negatives) values reveal
 some interesting patterns.  For positive TA1535 or TA1537,
 there is a 0.85 probability that TA100 is positive. If TA1537
 is positive, then the probability that TA98 is also positive is
 ~0.85, whereas the proportion of positive TA1535 compounds
 that are also TA98 positive is only 0.55. This may be understood
 from the standpoint of mutation mechanism differences (i.e.,
 base substitution vs. frameshift  mutation). It is interesting
 to note that the reverse observation is not always  true.  For
 example, only 50% of TA100-positive compounds are TA1535
 positive; similarly, only 50% of TA98-positive compounds
 are positive in TA1537. This means that TA100 and TA98
 strains are much more sensitive and/or less differentiating than


-------
FIGURE 5  Distribution of negative/positive outcomes for in vitro chromosome aberration.
[Chart; y-axis ticks up to 1000.]
their  counterparts,  TA1535 and TA1537, respectively. This
observation can be explained by the fact that the pKM101
plasmid was introduced into the former two strains specifically
to increase sensitivity (McCann et al. 1975).
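
The expected counts and the agreement measures discussed above can be reproduced with a short calculation. The sketch below uses the observed TA100 vs. TA1535 counts from Table 5; the function names are illustrative, and the conditional probabilities are plain ratios of the observed cells.

```python
def expected_counts(table):
    """Expected cell counts under independence for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    # Expected count = row total * column total / grand total.
    return [[rows[i] * cols[j] / n for j in range(2)] for i in range(2)]

def concordance(table):
    """Fraction of compounds on which the two strains agree (both + or both -)."""
    (a, b), (c, d) = table
    return (a + d) / (a + b + c + d)

# Rows: TA100 +, TA100 -; columns: TA1535 +, TA1535 - (observed counts, Table 5).
ta100_vs_ta1535 = [[248, 250], [55, 1680]]
exp = expected_counts(ta100_vs_ta1535)
agree = concordance(ta100_vs_ta1535)

# Conditional proportions read directly from the observed cells.
p_ta100_pos_given_ta1535_pos = 248 / (248 + 55)
p_ta1535_pos_given_ta100_pos = 248 / (248 + 250)
```

Rounded to whole compounds, `exp` gives 68, 430, 235, and 1500, which are the values shown in parentheses in Table 5, and `agree` is about 0.86, within the 85% to 90% concordance range quoted above.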
   If  TA98 is positive, then the probability  of  TA100 also
being positive is ~0.85; however, a positive TA100 outcome
corresponds to a positive TA98 outcome for only 68% of the
compounds. This suggests that the TA98 test may be a more
conservative test than TA100. This analysis may be interpreted
as an  indication of more frequent base substitution mutations
than frameshift mutations for the chemicals in this database.
Examining the negatives, if either TA100 or TA98 is negative,
then  probabilities that the other three strains will  also be
negative are >0.90 in both cases. These observations may justify
the use of TA98  and  TA100  strains  to make a  conservative
decision  to avoid false negatives. Based  on  historical data,
if the compound is negative in  both TA98 and  TA100,  the
compound has a >0.9 likelihood of being negative in all other
strains. A similar recommendation based on analysis of test data
across these Salmonella  strains for more limited chemical space
coverage was previously published by NTP (Mortelmans and
Zeiger 2000).
   This investigation also examined  the correlations between
the biological activity  response  and chemical  structural fea-
ture domains. Within  the chemical  domain, a compound is
described by  specific  chemical  features,  which  in  turn  are
  used for data mining of SARs. For simple visualization, the
  Salmonella  data  were interpreted  according  to the ToxML
  criteria for endpoint assessment (i.e.,  if any one of the four
  test strains is positive, then the Salmonella assay is considered
  positive). In Figure 7A, Salmonella outcome was then projected
  onto  the  chemical domain  of the integrated  database for
  three of the different substance use types identified by PCA
  (Fig. 2B).  Not surprisingly, most of the hot spots are found
  within the industrial chemical  clusters located  in the center.
  Several  local islands within  this dense space  are  identified
  with high  frequency of Salmonella  mutagenicity,  including
  nitroso  groups,  aromatic polyhalides,  aromatic amines,  and
  epoxides.  In addition, a cluster along the diagonal of the
  PC2 and PC3 axes with positive PC2 scores clearly separates
  a group of drugs/candidates with positive Salmonella mutagenic
  potential (Fig. 7A). It's worth noting that these drugs are mostly
  antineoplastics and antivirals, which often have exemptions for
  positive genetic toxicity studies. About 30% of the  cluster of
  azo dyes and sulfonyl benzenes was associated with Salmonella
  mutagenicity. This analysis  gives a rationale for pursuing local
  group-based QSARs for Salmonella mutagenicity. Not only can
  structural domains be differentiated, but the biological profiles
  can also be differentiated using structural features.
     SAR analyses of Salmonella mutagenicity data of chemical
  structural features have been widely published (Benigni 2004;
  Vogel et al. 1998). Well-known chemical classes identified in
TABLE 5  Contingency table for Salmonella mutagenicity in the integrated database

           TA1535 +   TA1535 -      TA1537 +   TA1537 -      TA98 +      TA98 -
TA100 +    248 (68)   250 (430)     147 (36)   219 (330)     488 (161)   230 (557)
TA100 -    55 (235)   1680 (1500)   28 (139)   1399 (1288)   124 (451)   1885 (1588)
TA1535 +                            100 (21)   122 (201)     153 (44)    130 (239)
TA1535 -                            72 (151)   1531 (1452)   196 (305)   1743 (1634)
TA1537 +                                                     144 (27)    29 (146)
TA1537 -                                                     135 (252)   1493 (1376)

  Values in parentheses are the expected compound counts based on the null hypothesis.

-------
FIGURE  6  Distribution of negative/positive outcomes for in vivo micronucleus.
[Chart; y-axis ticks from 100 to 500; legend: positive, negative.]
those studies, capable of direct alkylation or cross-linking of
DNA molecules, are also found in the present databases. Table 4
summarizes the comparisons across the four Salmonella strains
for prominent structural features  in the database. The major
structural features associated with Salmonella mutagenicity are p-
alkyl halides (including nitrogen mustard), nitroso compounds,
epoxides, aziridines,  and O-alkyl  sulfites. Structural features
differentiating individual Salmonella strains can also be found.
For example,  epoxides, nitroso compounds, hydrazines, and
alkyl halides seem to preferentially induce point mutations of
TA100 and TA1535.  Of the two, TA100 seems to be affected
by much wider types of compound classes and, therefore, it is
much less discriminating.
   TA98 and  TA1537  strains reflect frameshift  mutations
and are associated with  intercalating  agents  or large reactive
groups.  Polycyclic  aromatic hydrocarbons (PAHs) provide a
good  example;  both TA1537 and  TA98  give positive re-
sponses to the PAH chemical class. Many of these PAHs are
also associated with TA100-positive responses, but not with
TA1535-positive responses. Features  such as azo, 2-hydroxy
naphthalene,  secondary  aryl  amine(NH), and 1-alkylamino
benzene are positively associated with TA98 and TA1537. The
presence of furan,  quinoline, and 2-oxynaphthalene features in
a compound are associated with TA98 mutagenesis. However,
most of these heterocycles are nonmutagenic to Salmonella,
with the exceptions of quinoline,  pyrazole,  and a  small
number of well-known  compounds  containing aziridine  or
benzopyrans.
   There are  a number of structural features that correlate
positively with all Salmonella  strains,  especially with TA100,
TA1537, and TA98. These include  primary aryl amine, 2-
amino  naphthalene, nitrosamine,  quinone, p-alkyl halide, 4-
oxo benzopyran, quinoline, and 1,3-benzothiazole. Aromatic
amines  and  halides  are worth mentioning in more  detail.
Aromatic amines belong to  a class that requires transformations
to metabolites (e.g., by N-hydroxylation). As shown in Table 4,
                   primary aromatic amines affect all strains, whereas secondary
                   amines are associated with only frameshift mutations. Tertiary
                   aromatic amines are much less likely to induce positive effects.
                   This observation supports the notion that more complex modes
                   of action are involved in the aromatic amines. In the case of alkyl
                   halides, both primary and secondary alkyl halides are highly
                   positively associated with the three strains. The formation of
                   small DNA adducts via carbonium ion formation may explain
                   this observation. Aryl halides, on the other hand, do not easily
                   form carbonium ions and, hence, are not found to be highly
                   associated with positive results in TA1535  and TA98. Aryl nitro
                   groups, especially 1-nitro benzenes, are much more potent than
                   alkyl nitro groups across all four strains, which can be explained
                   by the SNAr reaction through the resonance effects of nitro groups
                   of the aromatic rings.
                      A combined analysis of the biological activity profiles in
                   Figures  3 to  6 and chemical structural  features  in  Table  4
                   allows us  to  next  consider  whether there are  preferential
                   nonconcordant  features  across the  Salmonella strains.  For
                   example,  are there structural features that are  consistently
                   positive for TA100 and  negative for TA1535? This answer
                   can  be found by comparing the two  groups of  compounds
                   for  TA100+/TA1535-  vs. TA100-/TA1535-. Aryl nitro  and
                   1-alkyl-4-amino(NH2)-benzene features belong to this group.
                   Features contributing preferentially to TA98 over TA1537 are
                   primary aromatic amines, alkenyl and aryl halides, and aryl nitro
                   groups. On the other hand, phenols, 1-R-3-hydroxybenzenes
                   or 1-carbonyl-2-hydroxybenzenes, are associated positively with
                   TA1537 mutation but negatively with TA98. The  compounds
                   resulting in preferential  mutation of TA1535 over TA100
                   contain alkyl oximes and, in general, have more heterocycles
                   such as 1,3-diazine(H).
                      One systematic way of identifying features that are preferen-
                   tial in this way is to employ multivariate analysis of selected
                   features.  In this  process,  instead of finding commonalities
                   of individual compounds within  a cell in the contingency

-------
FIGURE 7   Mapping of genetic toxicity outcomes onto structural domains of the integrated database represented in the latent variable
space defined by the three most differentiating principal components. Red and green symbols denote positive and negative outcomes,
respectively. Gray points mark regions in the chemical space in which there are no toxicity data. (a): Salmonella reverse mutation, (b): In
vitro mammalian mutation, (c): In vitro chromosome aberration, (d): In vivo micronucleus.
table, selected structural features are directly correlated with the
biological profiles. Mathematically, two matrices are prepared:
ST [structure x strain calls] and SF [structure x features]. Matrix
multiplication (eq 2) yields matrix FT [features  x strain calls].
Analysis of the resulting matrix of structural features  against
the mutagenicity across  various  strains allows  more in-depth
questions on the structure-toxicity relationships to be addressed.
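
A minimal sketch of this matrix step is given below, assuming binary matrices and assuming that the product is taken as the transpose of SF times ST, so that rows of the result are features and columns are strain calls. The toy matrices and the division by feature frequency (to obtain a mean outcome per feature) are illustrative assumptions; the z-score transformation described in the Materials and Methods section is not reproduced here.

```python
def feature_by_strain(sf, st):
    """Compute FT = SF^T x ST: positive-compound counts per feature and strain."""
    n_struct = len(sf)
    if n_struct != len(st):
        raise ValueError("SF and ST must have one row per structure")
    n_feat, n_strain = len(sf[0]), len(st[0])
    ft = [[0] * n_strain for _ in range(n_feat)]
    for s in range(n_struct):
        for f in range(n_feat):
            if sf[s][f]:  # structure s contains feature f
                for t in range(n_strain):
                    ft[f][t] += st[s][t]
    return ft

# Toy data: 4 structures, 2 features, 2 strains (1 = present / positive).
SF = [[1, 0], [1, 1], [0, 1], [1, 0]]
ST = [[1, 0], [1, 1], [0, 0], [0, 0]]
FT = feature_by_strain(SF, ST)

# Mean outcome per feature = positives / number of structures with the feature.
feature_counts = [sum(row[f] for row in SF) for f in range(2)]
mean_outcome = [[FT[f][t] / feature_counts[f] for t in range(2)] for f in range(2)]
```

In this toy case the first feature occurs in three structures, two of which are positive in the first strain, so its mean outcome for that strain is 2/3. A real analysis would also need to handle structures with no test result for a given strain, which this sketch ignores.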
   From the Leadscope structural hierarchy, 83  chemical fea-
tures were selected (as described in the Materials  and Methods
section) as descriptor variables to correlate with the four strain
outcomes. A partial but representative list of statistical results is
given in Table 4 by way of example. Figure 8 provides a multiple
scatterplot  of the outcomes of the  four strains  based on the
83 feature observations, each point representing a feature. The

-------
FIGURE 8  Scatter plots of feature z-scores computed from the
outcomes of each Salmonella strain and compared pair-wise with
the other three strains. This analysis used all data in the integrated
database.
[Axes: TA100, TA1535, TA1537, TA98. Labeled points: halide, alkyl-; halide, p-alkyl-. Pearson correlations listed in the plot: TA100-TA98: 0.93; TA100-TA1535: 0.73; TA100-TA1537: 0.80; TA98-TA1537: 0.84; TA98-TA1535: 0.60.]
values plotted here are the z-scores of the mutagenic outcome
between 0 and 1. The outcomes of TA100 and TA98 test strains
are 93% correlated, whereas the correlations for TA100/TA1535
and TA98/TA1537 are 73% and 84%, respectively. However, it
is important to emphasize that the purpose of this  scatterplot
matrix is  not to  simply look for high  Pearson correlation
coefficients; rather,  it is to easily identify concordant  and
nonconcordant features.
   Between TA100 and TA1535, 4-oxo benzopyran is one of
the nonconcordant feature groups for positive TA100/negative
TA1535. Between TA98 and TA1537, 1,3,5-triazine(H), furan,
and s-alkyl halide features are associated with negative TA1537
regardless of TA98. Features such as 1-amino (primary) naph-
thalene, 1-oxy naphthalene, and O-alkyl sulfonates are TA1537
positive regardless of the outcomes of other strains.  Between
                 TA100 and TA98, aziridine, nitrosamine, nitroso, and epoxide
                 features are nonconcordant. Nonconcordant features can enrich
                 the structural rules  that differentiate between strains.  This
                 knowledge can also be used to establish testing strategies. For
                 example, if a compound contains features that the combination
                 of TA100 and TA98 strains can detect for mutagenic potential,
                 then testing  the  strain  combination should be sufficient.  If
                 a compound contains  features such  as  4-oxo benzopyran,
                 1,3,5-triazine(H), furan, oximes, hydrazine, or s-alkyl halide,
                 then it will  be  desirable  to test  TA1535/TA1537  as  well.
                 These observations are consistent with previous findings in the
                 literature (Prival and Zeiger 1998).
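
The testing strategy suggested above can be written as a simple lookup. The sketch below is only an illustration of that reasoning, not a validated testing protocol: the feature labels are hypothetical strings, and matching them against a real compound would require a substructure search that is not shown.

```python
# Features the text flags as warranting TA1535/TA1537 in addition to TA100/TA98.
# The label strings are hypothetical and chosen for this sketch only.
EXTRA_STRAIN_FEATURES = {
    "4-oxo benzopyran",
    "1,3,5-triazine(H)",
    "furan",
    "oxime",
    "hydrazine",
    "s-alkyl halide",
}

def suggest_strains(compound_features):
    """Return the Salmonella strains suggested for a compound's feature set."""
    strains = ["TA100", "TA98"]
    # Add the less sensitive but more discriminating strains when flagged.
    if EXTRA_STRAIN_FEATURES & set(compound_features):
        strains += ["TA1535", "TA1537"]
    return strains

flagged = suggest_strains(["furan", "aryl nitro"])
default = suggest_strains(["primary aryl amine"])
```

Here `flagged` includes all four strains because furan is on the list, whereas `default` stays with TA100 and TA98.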
In  vitro mammalian mutagenesis
   Although the mammalian mutagenesis data in the integrated
database include V79 and CHO HPRT as well as MLA assays,
due to the limited number of available studies for the former
two, only the MLA studies are discussed in detail. Sixty percent
of  the  639  compounds  are found  to  be mutagenic when
mouse lymphoma cells are used (Fig. 4). Figure 7B is a result
of projecting the MLA outcomes  onto the chemical domain
depicted in  Figure 2B. The positive  cluster of MLA is more
densely populated in the industrial chemical space, but smaller
than the region  occupied by the Salmonella-positive cluster. In
addition, the positive frequency of MLA varies depending on
the substance use type. Of the 639 compounds with MLA data,
there are 163  drugs/candidates and 61 food ingredients; the
percent positives is  34% for the  drugs/candidates,  49%  for
food ingredients, and 72%  for industrial chemicals,  resulting
in 60% total positive. In general, the positive frequency in the
integrated database increases in the order of drugs/candidates,
food ingredients, and industrial chemicals as shown in Table 6.
This observation raises an important point: when dealing with
correlations  across genetic toxicity endpoints, it is important
to profile the relationships by both  compound  classes  and
substance use  types.  For example,  aryl alcohol  groups in
drugs/candidates seem to give higher positive rates in MLA tests
than in food ingredients and industrial chemicals. Quinolines
have a stronger influence on the industrial chemicals than on
the drugs/candidates  and food ingredients. Table 7 illustrates
TABLE 6  Reported genetic toxicity outcome by substance use types in the integrated database

                                                        % Positives
Substance use type         Salmonella   MLA    In vitro CA   In vitro CA (CHL)   In vitro CA (CHO)   In vivo MN   Total
Drugs/candidates (1)          19.1      30.4      38.6             52.2                46.2              25.4        882
Food ingredients (2)          17.0      49.2      29.5             21.6                40.4              37.3        391
Industrial chemicals (3)      41.6      71.6      50.4             76.4                47.3              50.0      1,945
Total positive (%)            33.1      56.2      44.1             46.1                46.6              33.3      3,218

  (1) The drugs/candidates in the integrated database include drugs from the FDA CRADA CDER 2006 genetic toxicity database and drugs/candidates
from private sources. Drugs from FDA CDER do not include all historical marketed drugs information.
  (2) The food ingredients data in the integrated database combined the FDA CRADA CFSAN 2006 database with a small amount of additional data
from private sources. An examination of the CFSAN data indicates that compounds that tested positive in one or more of the genetic toxicity assays are
disproportionately associated with chemicals used as flavors. Risk assessors account for genotoxic test results in conjunction with other information as
part of their overall chemical risk assessment of these substances.
  (3) The industrial chemicals in the integrated database include a wide variety of substance use types (agricultural chemicals, surface active agents,
solvents, organic catalysts, etc.), which are not further differentiated in this paper. The industrial chemicals come mostly from
public databases, not from the regulatory agencies. Only a small number of chemicals were added from the private sources.
               Data Mining Integrated Multiple Genetic Toxicity Databases

-------
TABLE 7  Interdependency of some features and substance use types across the genetic toxicity endpoints

                                                                    Average z-score
Chemical features       Substance   Total        Salmonella   MLA      In vitro CA   In vivo MN
                        use type    frequency
Aromatic p-amines       Drug           58           2.22       1.55       0.67          1.11
                        Food            3          -0.64       none      [-,,3]        -0.77
                        Industry      353          11.14       2.74       4.39         -0.96
Alcohol, aryl           Drug           55           0.99       3.47       0.33          0.80
                        Food           54           1.37       0.80       1.65          0.89
                        Industry      237           1.86       1.34      -0.20         -2.53
Halides, alkyl          Drug           52          -0.53       1.56       0.84          0.64
                        Food            6         [f3.11]      1.45      -0.47          none
                        Industry      157           5.98       0.79       2.07         -1.47
Ketone, alkenyl, cyc-   Drug           30          -0.11       1.22       1.43          2.17
                        Food           19           2.87       none       1.00          1.56
                        Industry       39          -1.46      -0.11       1.25          1.75
Imidazole               Drug           45          -1.32       1.76       1.85         -0.77
                        Food            8          -1.12       1.45       2.02          1.85
                        Industry       25           2.29      -0.19       1.12         -0.58
Quinoline               Drug           32          -0.35       0.44       1.79         -0.23
                        Food            5           0.43      -0.98       none         -0.77
                        Industry       23           4.72       1.42       2.64         -1.00

   Table cells are shaded gray if absolute z-scores > 1. Cells with z-scores > 2 are in bold fonts to indicate a strong association with positives.
   Entries in square brackets are illegible in the source.
the effects of these two factors, structural features and substance
use types, on the genetic toxicity outcomes.
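The overall MLA positive rate quoted in the mammalian mutagenesis discussion is a size-weighted average of the three subgroup rates. The short check below is a sketch using the rounded figures from the text (639 compounds, 163 drugs/candidates at 34%, 61 food ingredients at 49%, industrial chemicals at 72%); the industrial count is not stated directly and is inferred here by subtraction. Table 6 lists slightly different percentages, so this reproduces only the in-text arithmetic.

```python
# Subgroup sizes and MLA positive rates as quoted in the text.
total = 639
groups = {
    "drugs/candidates": (163, 0.34),
    "food ingredients": (61, 0.49),
}

# Assumption: every remaining compound is an industrial chemical.
n_industrial = total - sum(n for n, _ in groups.values())
groups["industrial chemicals"] = (n_industrial, 0.72)

# Overall rate = sum(n_i * p_i) / sum(n_i)
overall = sum(n * p for n, p in groups.values()) / total
print(f"industrial n = {n_industrial}, overall positive rate = {overall:.1%}")
```

Because industrial chemicals make up about two-thirds of the MLA data, their high positive rate dominates the overall figure, which is the point the text makes about profiling by substance use type.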
   When comparing both mammalian and bacterial mutations,
the MLA gives  more positive  findings than  the Salmonella
mutagenicity assay. Endpoints  for the Salmonella assay and
MLA are  concordant (either both positive  or  both  negative)
for 30% of the compounds in the database, whereas only ~5%
(32 compounds) are Salmonella positive and MLA  negative.
A  feature that is Salmonella positive is very likely to also  be
MLA positive. There are features preferentially associated with
positive MLA  results regardless of Salmonella outcomes. They
include s-alkyl halide, aryl aldehyde, acyclic alkyl hydrazine,
alkylamino  benzene, thioxomethyl,  1,3-benzodioxole,  and
furan. The azo group was associated with positivity in Salmonella
but not in MLA, whereas epoxides are positive in both tests.

In vitro chromosome aberration
   Two study types accepted by regulatory agencies to estimate
the clastogenic potential  of a  compound are the in vitro
chromosome aberration and in vivo MN tests. In this integrated
database, historical results from the CHL and CHO cell lines,
as  well as human blood cells, were available for comparisons.
Of the 793  compounds, 228 have CHL data and 638 have
CHO; however, only 73 compounds have been tested in both
CHL and CHO cells; hence, the comparison of the two at the
compound level is not reliable.
   The in vitro chromosome aberration endpoint has many
more positive outcomes  compared to the in vivo MN assay, as
illustrated in Figures  5, 6, and 7c-d. Of the 986 compounds
with  reported values at this endpoint, 44%  were  positive.
Further breaking down to the cell lines showed that 46% of

C  Yang et al.
  the compounds  were positive in CHL  (total 228) or CHO
  (total 638) cell lines. Also, many more positives are found in
  the industrial chemicals group than  in  the drugs/candidates
  group,  and a much higher positive frequency was found for
  industrial chemicals when tested with CHL cell lines (76%)
  than with CHO (47%)  or human blood cell lines. A lower
  positive frequency was found for the drugs/candidates group
  when human peripheral blood cell lines were used (27%). In
  contrast, the positive frequency  of the  CHO cell lines  was
  consistently 40% to 50% across the substance use types. These
  results are summarized in Table 6.
     It is worth noting that there were no drugs/candidates from
  private sources added for in vitro chromosome aberrations to
  the integrated database. In general, in this database a much
  higher frequency of positives is found in the in vitro chromosome
  aberration studies than in the in vivo micronucleus studies.
     As shown in Table 4, a portion of the same features triggering
  positives for Salmonella and MLA  results are also associated
  with positive outcomes for in vitro chromosome aberrations.
  Important features include aromatic amines, alkyl halides, nitro
  groups, nitroso groups, epoxides,  and furans. Primary aromatic
  amines  and  alkyl  halides preferentially are associated with
  positives in  industrial chemicals (Table  7). Features such as
  alkenyl  ketones, imidazoles, purines, nucleoside  bases,   and
  quinolines are also associated with positive results in  the in
  vitro chromosome aberration test.

  In vivo micronucleus
     Most of the in vivo MN studies in the integrated database
  were performed  using mouse  models.  For this analysis, rat
  and mouse data with bone marrow and peripheral blood as

-------
target cells were combined  under a general  classification as
"rodent" to give a more statistically sound base. For the rodent
MN, a  smaller number of positive studies (34%) was  found
compared to the in vitro chromosome aberrations. Of the total
381 compounds, 211 were drugs/candidates and 26% of these
were positive. Food ingredients and industrial chemicals make
up  18% and 24%  of the MN data, respectively; 37%  of the
food ingredients and 50% of the industrial chemicals were
positive.
   When correlating structural features to a biological outcome,
many of  the prominent features  that were  associated with
other genetic toxicity endpoints, including in vitro chromo-
some aberrations, are not highly correlated with MN results.
The features  associated positively with  MN  are nucleoside
bases, cyclic alkenyl alcohols, cyclic alkenyl  ethers,  alkenyl
ketones, nitrosamines, nitroso groups, 1,3-dioxanes, 2,4-dioxo
pyrimidines, and cyclic alkenyl iminomethyl groups. A set
of features that correlates positively only with the two
clastogenicity tests, and not with the other genetic toxicity
endpoints, includes alkenyl ketones (Michael acceptors), etoposides,
purines, oxolanes, and alkyl chlorides. Within the integrated database,
there  are  several compound  clusters that were found to be
positive in either the in vitro  chromosome aberration or
in vivo  MN assay. They  show similar structural features to
etoposides, ellipticines, or flavonoids, containing one or more
combinations of anthraquinones, purines, oxolanes, imidazoles,
indoles, and benzopyran  groups.  These structures are well-
known topoisomerase I and II inhibitors, which may be related
to the topoisomerase-induced clastogenicity (Snyder and Gillies
2002; Lynch et al. 2003). The detailed feature analysis in Tables 5
and 7 may give insights to improve the understanding of the
clastogenicity knowledge base.


Profile  across genetic toxicity endpoints
   The  final  process  of learning  from  these data involves
profiling the  structural domains with  genetic toxicity based
on  the  four individual  study endpoints  and their  structural
relationships. The same process of expanding the compound-
level observations  to multidimensional profiles  of structural
features  is  adopted. Table 8 provides two-way contingency tables
across the Salmonella, MLA, in vitro  chromosome aberration,
and in vivo MN assays. Concordances (overall agreement)  for
these two-way comparisons of the endpoints range from 60% to
75%. However, the sensitivity (agreement on positives) of the
contingency tables ranges widely, from 36% to 83%, whereas the
specificity (agreement on negatives) is approximately 55%  for
all pairs. Salmonella and MLA outcomes  are 83% concordant
          for positives, but only 57% concordant  for negatives  based
          on a total of 612 compounds. Although only based on  180
          compounds, findings on in vitro chromosome aberrations  and
          in vivo MN results are interesting. Only about 37% of the
          compounds were positive in both in vitro and in vivo testing,
          which is one source of the concern for the validity of using
          in vitro chromosome aberration for clastogenicity potential
          assessment. However, if a compound is negative in the in vitro
          test, then there is a 94% chance that it is also negative in in
          vivo testing. Considering that in vitro chromosome aberrations
          showed  only ~55%  specificity (agreement on  negatives) to
          Salmonella and MLA assays, the high specificity to the MN
          assay is worth noting.
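The summary statistics in this paragraph can be recomputed from any 2x2 block of Table 8. The sketch below assumes the definitions implied by the text: concordance is overall agreement, "sensitivity" is the fraction of assay-A positives that are also positive in assay B, and "specificity" is the corresponding fraction for negatives. Expected counts use the usual independence formula (row total times column total, divided by the grand total). The function name and argument names are illustrative only.

```python
def two_way_summary(pp, pn, np_, nn):
    """Summarize a 2x2 table of compound counts for assays A and B.

    pp:  positive in A and B      pn: positive in A, negative in B
    np_: negative in A, positive in B      nn: negative in A and B
    """
    total = pp + pn + np_ + nn
    row_pos, row_neg = pp + pn, np_ + nn      # A+ and A- totals
    col_pos, col_neg = pp + np_, pn + nn      # B+ and B- totals
    return {
        "n": total,
        "concordance": (pp + nn) / total,     # overall agreement
        "sensitivity": pp / row_pos,          # agreement on positives
        "specificity": nn / row_neg,          # agreement on negatives
        # Expected counts under independence of the two assays.
        "expected": [
            [row_pos * col_pos / total, row_pos * col_neg / total],
            [row_neg * col_pos / total, row_neg * col_neg / total],
        ],
    }

# Salmonella (A) versus MLA (B), observed counts as printed in Table 8.
s = two_way_summary(pp=161, pn=34, np_=185, nn=232)
print(s["n"], round(s["concordance"], 3),
      round(s["sensitivity"], 3), round(s["specificity"], 3))
```

Values computed this way can differ by a point or so from the rounded percentages quoted in the text.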
            In general, Salmonella and MLA results exhibit an increas-
          ing proportion of positives  from  drugs/candidates, to food
          ingredients, to industrial chemicals as listed  in Table 6. In
          contrast, the chromosome aberration test using CHL cells  had
          higher positive rates for drugs/candidates than food ingredients.
          Industrial  chemicals  had the  highest positive  frequency in
          all tests.  Sixty percent of the MN results were  based on
          compounds in the drugs/candidates group, while only 19% of
          the in vitro chromosome aberration data were from this same
          group. Industrial chemicals make up ~60% of the chromosome
          aberration and 24% of the MN studies.  There  may be  two
          reasons for  the  higher positive frequency for the in vitro
          chromosome aberration results in the integrated database. One
          is that the biology of the in vivo MN is  more differentiating
          than that of in vitro chromosome aberration. The other is an
          artifact of the integrated database, where there were more
          data from industrial chemicals for the in vitro  chromosome
          aberration assay and more  drugs/candidates for the in vivo
          MN assay.  Overall, the lower frequency of positives in the
          drug/candidates and food ingredients groups compared to the
          industrial chemical group is  a  reflection of the  greater safety
          margins applied to compounds intended for consumption by
          humans.
            As  explained  in previous discussions  under specific end-
          points, Table  7  summarizes chemical features having large
          variations across substance use types (drugs/candidates, food in-
          gredients, industrial chemicals) and genetic toxicities. Aromatic
          primary amines correlate strongly with high positive z-scores
          for industrial chemicals and drugs/candidates for Salmonella,
          in vitro chromosome aberration, and MLA tests, whereas the
          effects on the food ingredients were somewhat negative. For in
          vivo MN, drugs/candidates were still positively correlated, but
          the industrial chemicals were not. For the cyclic alkenyl ketone
          feature, z-scores are very similar and positive for all substance
          use types for both in vitro chromosome aberrations and in vivo
TABLE 8  Contingency table for genetic toxicity outcomes in the integrated database

                 MLA +       MLA -       In Vitro CA +   In Vitro CA -   In Vivo MN +   In Vivo MN -
Salmonella +     161 (110)   34 (85)     219 (146)       98 (171)        18 (10)        31 (39)
Salmonella -     185 (236)   232 (319)   191 (264)       381 (342)       32 (40)        171 (48)
MLA +                                    156 (124)       92 (124)        156 (124)      92 (124)
MLA -                                    34 (66)         97 (75)         34 (66)        97 (75)
In Vitro CA +                                                            30 (16)        50 (64)

  Values in parentheses are the expected compound counts based on the null hypothesis.

-------
[Figure 9: pair-wise scatter plots of feature z-scores; axes are Salmonella, MLA, ivtCA, and MN. Labeled points include "nitro, heteroamino benzene", "ketone, alkenyl", and "ketone, cyclic alkenyl".]
Pearson correlations:
  Salmonella - MLA:     0.72
  Salmonella - ivtCA:   0.68
  MLA - ivtCA:          0.67
  MLA - MN:             0.42
  ivtCA - MN:           0.33
FIGURE 9  Scatter plots of feature z-scores computed from the
outcomes of each genetic toxicity test and compared pair-wise
with the other three tests. Test abbreviations: Salmonella (all
four strains), MLA (mouse lymphoma mutation), ivtCA (in vitro
chromosome aberration), MN (in vivo micronucleus).
MN. On the other hand, the same feature shows very different
patterns  for other endpoints  (e.g., Salmonella mutagenicity).
Such observations explain why it is difficult to draw meaningful
conclusions based solely on  compound classes  and a single
endpoint. A better approach is to describe compounds at the
feature level  for diverse substance use types  profiled across
multiple  endpoints.
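The z-scores discussed here are defined in the Materials and Methods section, which is not reproduced in this part of the paper. As a rough illustration only, the sketch below assumes one common form: the positive rate among compounds carrying a feature is compared with a baseline positive rate using the normal approximation to a binomial proportion. The function name, its arguments, and the example counts are hypothetical, and the authors' actual statistic may differ.

```python
from math import sqrt

def feature_z(n_with_feature, n_positive_with_feature, baseline_rate):
    """z-score of a feature's positive rate against a baseline rate.

    Assumed form (normal approximation to a binomial proportion):
        z = (k/n - p0) / sqrt(p0 * (1 - p0) / n)
    """
    if n_with_feature <= 0:
        raise ValueError("feature must occur in at least one compound")
    if not 0.0 < baseline_rate < 1.0:
        raise ValueError("baseline_rate must lie strictly between 0 and 1")
    observed = n_positive_with_feature / n_with_feature
    standard_error = sqrt(baseline_rate * (1.0 - baseline_rate) / n_with_feature)
    return (observed - baseline_rate) / standard_error

# Hypothetical counts: 100 compounds carry the feature, 60 of them test
# positive, and half of all compounds in the data set are positive.
z = feature_z(100, 60, 0.5)
print(f"z = {z:.2f}")
```

Under this form, the same feature can score very differently across substance use types simply because both the baseline rate and the number of compounds carrying the feature change from one group to the next.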
   For building structural trends,  the  four study types were
compared using the multivariate  analysis method  described
in the Materials and Methods section.  A total of 203 features
were extracted from the Leadscope chemical hierarchy using the
systematic feature selection method. Table 4 lists examples  of
these features and Figure 9 displays the pair-wise correlations
between  the  genetic toxicity assays for the selected features
as observations (points on  the  plots).  As  expected  from
the contingency  table  analysis, Salmonella reverse  mutation
correlates quite well with in vitro chromosome aberration as well
as with MLA. As mentioned before, the in vitro  chromosome
aberration tests were correlated more closely with Salmonella
and MLA mutations than with clastogenicity of the in vivo MN
studies.
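The pair-wise comparison described above treats each selected feature as one observation and correlates its z-scores between two assays. A minimal sketch of that calculation follows; the z-score vectors are hypothetical placeholders, not values from the integrated database, and the plain Pearson coefficient is used because that is the statistic reported with Figure 9.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)

# Hypothetical per-feature z-scores; each position is one structural feature.
z_by_assay = {
    "Salmonella": [2.0, -0.5, 4.0, 1.0, -1.0],
    "MLA":        [1.5,  0.0, 3.0, 1.2, -0.8],
    "ivtCA":      [0.7,  0.3, 4.4, 1.4,  1.8],
    "MN":         [1.1,  0.8, -1.0, 2.2, -0.8],
}

# One coefficient per pair of assays, as in a scatter-plot matrix.
for a, b in combinations(z_by_assay, 2):
    print(f"{a} - {b}: {pearson(z_by_assay[a], z_by_assay[b]):.2f}")
```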
   The correlating features for each pair are identified in Figure 9
and Table 4. The well-known genotoxic  compound classes such
as nitroso/nitrosamines, nitro  groups, phosphorus-containing
groups, and quinones  are associated positively with all  four
study types in genetic toxicity endpoints. Aldehydes  are highly
associated only with the MLA. Aromatic amines significantly
affect all of  the  in  vitro genetic  toxicity  tests, but do not
have a large impact  on the in vivo MN test.  The  azo group
is positively associated with Salmonella mutagenicity, but not
with other endpoints.  PAHs with the 1,2,3,4-fused benzene
feature show high positive correlation  with mutagenicity but
not with  clastogenicity endpoints.  Alkyl halides and epoxides
have positive effects in  all of the genetic toxicity tests except in
the MN  test. Nucleoside bases do  not  impact Salmonella, but

  influence other genetic toxicity endpoints. On average, phenols
  and thioxomethyl groups contribute positively only to MLA.
  Compounds with imidazole groups and quinoline features are
  preferentially contributing  to positive in vitro chromosome
  aberrations.


        Validation  of Structural Rules

     The ultimate goal of data mining is to transform all of
  the observations discussed above into knowledge from which
  structural  feature rules  can then be  built.  The contingency
  table and multivariate analyses provide systematic methods to
  explore the data. The objective of this paper is to describe the
  analysis process; the next step, building structural alerts from
  the extracted knowledge, will be demonstrated in a later paper.
  However, the structural rules available in the current literature
  can be applied to  the  integrated database  to evaluate their
  validity. As an example, Table  9 shows how Ashby-Tennant
  alerts (Ashby 1985) are correlated with the data in the integrated
  database across the  four endpoints  using the z-score  statistic
  as  a quantitative metric. From Table 9, roughly two-thirds of
  the rules are  appropriate to Salmonella mutagenicity.  Within
  the integrated database, these published rules (Kazius et al.
  2005) show correct classification rates of 60% to 70% for the
  Salmonella mutagenicity, in vitro chromosome aberration, and
  MLA assays, but only 39% for in vivo MN results. Most of
  the published  rules  were not statistically  significant based on
  this integrated database. Again, this  is not surprising since
  60% of the compounds with MN results in this database are
  from  the  drugs/candidates group,  a  group whose  chemical
  space the Ashby-Tennant rules  or other published rules  may
  not fully represent.  In this study, due to  the addition  of a
  private collection with more  MN data, new features  such as
  cyclic alkenyl  ketones,  cyclic alkenyl ether, and nucleoside
  bases  are found. The cyclic alkenyl ketone ether group was
  preferential to in vivo MN, whereas  alkenyl ketone (general
  Michael acceptor) and nucleoside base features represent both
  in  vitro chromosome aberration and  in vivo MN outcomes.
  Not many  compounds in the drugs/candidates group in the
  integrated database contained Michael acceptor-related features.
  These features are not yet considered rules since the statistical
  base of the data in the integrated database was not large  enough
  and the chemical structures were too diverse. It is important
  to  point out again that a substructural fragment by itself is not
  necessarily sufficient to define an alert, but that combinations
  of such features in  a substance are necessary to  explain the
  significance and mode of action of each.


        Weight-of-Evidence Approach
                      by Profiling

     Genetic toxicity  studies are  often viewed as  a battery of
  surrogate screening  tests for  carcinogenicity. In some cases,
  single-endpoint approaches have been shown to work fairly
  well; for example, Salmonella mutagenesis is known to be a good
  predictor of genotoxic carcinogenicity (Benigni  et al. 2000).
  However, many previous publications have convincingly argued
  that  using  individual genetic toxicity endpoints  to estimate
  carcinogenicity will generally not give a complete understanding
  and will often give unsatisfactory predictions; these researchers
  conclude that what is needed are methods of using information

-------
TABLE 9  Some structural rules across the genetic toxicity endpoints

Chemical features                          Total frequency
Ashby-Tennant alerts                             930
  N-chloramine                                     5
  N-methylol                                       3
  Alkyl aldehyde                                  24
  Alkyl hydrazine (R2NNH2)                         2
  Alkyl hydrazine (R2NNR2)                         1
  Alkyl hydrazine (RNHNH2)                         4
  Alkyl hydrazine (RNHNHR)                         4
  Alkyl phosphonate                                8
  Aromatic N-oxide                                 5
  Aromatic amine (NH2)                           414
  Aromatic amine, N-hydroxy                        2
  Aromatic amine, dialkyl                         76
  Aromatic azo                                    79
  Aromatic nitro                                 199
  Aryl methyl halide                               8
  Aziridine                                        6
  Epoxide                                         59
  Monohaloalkene                                  88
  Nitrogen mustard                                 8
  Nitrosamine, dialkyl                            24
  Primary alkyl halide                            88
  Propiolactone                                    4
  Urethane derivatives                            51
Features for clastogenicity from the
integrated database
  Cyclic alkenyl ketone ether                     30
  Alkenyl ketone                                  34
  Nucleoside bases                                28

[z-score columns (Salmonella, MLA, In vitro CA, In vivo MN): per-feature values are not recoverable from the source. Surviving values, unassigned: 0.80, -0.60, -1.0, 0.50, 0.10, 0.83; 2.7, 2.4, 2.9, 2.8.]
across multiple endpoints to give  more reliable predictions.
Models based on a WOE approach provide exactly this type of
advantage (Matthews et al. 2006b). WOE methods are diverse,
varying  from heuristic rules  to  Bayesian  optimization.  The
multivariate analysis is one way to quantify WOE. In this paper,
we are taking a step toward multiple domain correlations using
structural features and the endpoint outcomes represented by
the features.
   The notion that MLA and in vitro chromosome aberration
experiments give too many false  positives originates from the
correlations of these endpoints individually to carcinogenicity
(Matthews et al. 2006a). From the perspective of feature profiling
based on  the  battery of genetic toxicity  tests, it would be
better to view MLA and  chromosome aberration outcomes
not as  individual predictors,  but rather as components  of a
multivariate approach in which the aggregate profile over all
endpoints provides a biological fingerprint  for any compound
of interest. As long as the assays are biologically meaningful
and are not used  in isolation to  correlate a particular genetic
toxicity endpoint to carcinogenicity, the high percentage  of
negatives in the Salmonella data tests and high percentage  of
                               positives in the MLA data tests are in fact giving insights about
                               the biological activities of the compound classes and chemical
                               structural features. If a certain combination of positive features
                               shows up more frequently in the industrial chemicals, then it is
                               reasonable to expect to find more positives when the compound
                               makeup of the data source is mostly industrial chemicals. It may
                               be possible to use all of the in vitro genetic toxicity endpoints
                               as assays (in a preliminary screen) and to include the aggregate
                               profile of these results to correlate to in vivo toxicity endpoints
                               or to carcinogenicity. The  multivariate method described in
                               this paper can be applied to assess the WOE quantitatively. The
                               selected features for these multivariate correlations in fact are
                               candidates for the predictors in a QSAR model.
                                  Lastly, one of the most important potential applications of
                               this profiling method is to guide development of intelligent
                               testing strategies. These  methods can help researchers decide
                               what chemicals should be  tested and what assays  should be
                               employed in order to obtain reliable toxicity data as efficiently
                               as possible  and to  avoid  unnecessary animal testing.  Data
                               mining to establish profiles  of chemicals both in chemical and
                               biological domains will  assist such  decisions. For example, if

-------
a drug/candidate contains a primary aromatic amine  and if
this  feature  is a major reason  for concern, then a positive
observation in Salmonella mutagenicity may signal a concern
for other genetic toxicity endpoints because positive z-scores
are observed for all tests for this particular feature-substance
type combination (Table 7) and also because of the relatively
strong correlation of Salmonella results with the other three tests
(Table 8  and Figure 9),  and testing of in vivo MN may be
considered. A negative observation in the Salmonella assay for
this group indicates that both chromosome aberration and MN
testing may not be necessary because of the strong correlation
of Salmonella mutagenicity and in vitro chromosome aberration
for this primary aromatic amine group (Table 7) and because of
the observation that in vivo MN is most likely negative when in
vitro chromosome aberration is negative (Table 8). On the other
hand,  if an  industrial  chemical contains  a primary aromatic
amine or alkyl halide group, the chemical is likely to be positive
in three of the in vitro tests if these features are responsible for
the observed toxicity. Industrial chemicals  with these features
alone do not usually trigger the in vivo MN outcome, as  shown
in Table 7.
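The reasoning in this example can be expressed as a lookup over the Table 7 profile for a given feature and substance use type. The sketch below transcribes the aromatic primary amine z-scores for drugs and industrial chemicals and lists the endpoints with a positive association. The labels follow the shading convention in the Table 7 footnote (z > 1 shaded, z > 2 bold). The function names and the zero threshold are illustrative assumptions, not rules proposed in the paper, and the industrial Salmonella value is read here as 11.14 from a poorly printed cell.

```python
# Order of endpoints in each profile tuple.
ENDPOINTS = ("Salmonella", "MLA", "in vitro CA", "in vivo MN")

# Aromatic primary amine z-scores transcribed from Table 7.
AROMATIC_AMINE_Z = {
    "drug": (2.22, 1.55, 0.67, 1.11),
    "industry": (11.14, 2.74, 4.39, -0.96),  # 11.14: reading of an unclear cell
}

def label(z):
    """Strength of positive association, after the Table 7 footnote."""
    if z > 2:
        return "strong"
    if z > 1:
        return "moderate"
    return "weak or none"

def endpoints_of_concern(use_type, threshold=0.0):
    """Endpoints whose z-score exceeds the (assumed) threshold."""
    profile = AROMATIC_AMINE_Z[use_type]
    return [name for name, z in zip(ENDPOINTS, profile) if z > threshold]

for use_type, profile in AROMATIC_AMINE_Z.items():
    flagged = endpoints_of_concern(use_type)
    labels = {name: label(z) for name, z in zip(ENDPOINTS, profile)}
    print(use_type, flagged, labels)
```

For drugs/candidates all four endpoints are flagged, while for industrial chemicals only the three in vitro endpoints are, which matches the contrast drawn in the text.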
   Although genetic toxicity tests,  in  general, are  usually per-
formed simultaneously as a full battery for the sake of efficiency,
the principle of intelligent testing strategies demonstrated here
can be applied to other types of biological safety tests. Overall,
intelligent testing strategies can be developed by understanding
the interaction between structural features and the molecular
makeup  defined in  the substance use types and  their  effects
across the various genetic toxicity and clinical endpoints. Thus,
this  paper presents  data  mining and profiling methods for  a
WOE  approach  to assess toxicity and for guiding the general
development of intelligent testing strategies.
            ACKNOWLEDGMENTS

   The authors dedicate this paper to the memory of Dr. Gary
Hollingshaus  at DuPont whose vision of applying predictive
data  mining  within the  industrial workflow  instigated  the
formation of the Leadscope In Silico Toxicology (LIST) focus
group. The idea of ToxML and the construction of aggregating
databases were born from many discussions with Drs. Gary
Hollingshaus, Philip Lee, and Dan Kleier in Haskell Laboratory.
The  authors  also  thank Leadscope  focus group  members
Mitchell  Cheeseman,  Yan  Gu, Dale Johnson, Julie  Mayer,
Donna Morrall, Richard  Mueller  (deceased),  Chad Nelson,
Grace Patlewicz,  Gregory  Pearl, Rene  Sotomayor, Michelle
Twaroski,  Anita White,  and  Alan Wilson. Funding  for  the
ToxML database  and  LIST  focus group  was  provided  in
part by the U.S.  NIST Advanced Technology  Program  (70
NANB4H3003).
   Disclaimer.  This paper does not reflect the policies  of
the U.S. FDA CDER, U.S. FDA CFSAN, or U.S. EPA.  The
content of this paper is only a reflection of scientific research
conducted in one of many collaborations with the agencies.
Proprietary structures, toxicity data, and other information from
private sources  were not disclosed  outside of Leadscope Inc.
The information presented in this paper is derived solely from
analysis of structural feature statistics. The status of the testing of
these chemicals during lead discovery and product development
has never been disclosed.
                      REFERENCES

  Ashby, J. 1985. Fundamental structural alerts to potential carcinogenicity
       or non-carcinogenicity. Environ. Mutagen. 7:919-921.
  Ashby, J.,  and Tennant, R. W. 1991. Definitive relationships  among
       chemical  structure,  carcinogenicity  and mutagenicity for 301
       chemicals tested by the U.S. NTP. Mutat.  Res. 257:229-306.
  Benigni, R. 2004. Computational prediction of drug toxicity: The case
       of mutagenicity and carcinogenicity. Drug Discov. Today Technol.
       1(4):457-463.
  Benigni,  R., and  Giuliani, A. 1994. Quantitative modeling and biology:
       the multivariate approach. Am. J. Physiol. 266:R1697-R1704.
  Benigni,  R., Giuliani, A., Franke, R., and Gruska, A.  2000. Quantitative
       structure-activity relationships of mutagenic and carcinogenic aro-
       matic amines. Chem. Revs. 100:3697-3714.
  Benigni, R., Netzeva, E., Bossa, C., Franke, R., Helma, C., Hulzebos, E.,
       Merchant, C., Richard, A. M., Woo, Y. T., and Yang, C. 2007. The
       expanding role of predictive toxicology: an update on the (Q)SAR
       models for mutagens and carcinogens. J. Environ. Sci. Health C
       Environ. Carcinog. Ecotoxicol. Rev. 25:1-43.
  Chemical  Carcinogenesis   Research   Information   System.   2007.
       Available  on  the   Internet  at  http://toxnet.nlm.nih.gov/cgi-
       bin/sis/htmlgen?CCRIS. Accessed November 28, 2007.
  Cimino,  M. C. 2006.  Comparative  overview of current international
       strategies and guidelines for genetic toxicology testing for regu-
       latory purposes. Environ. Molec. Mutagen. 47(5):362-390.
  Contrera, J. F., Matthews, E. J., Kruhlak, N. L., and Benz, R. D. 2005.
       In silico screening of chemicals for bacterial mutagenicity using
       electrotopological E-state indices and MDL QSAR software. Regul.
       Toxicol. Pharmacol. 43:313-323.
  Crebelli, R., Carere, A., Leopardi, P., Conti, L., Fassio, F., Raiteri, F., Barone,
       D., Ciliutti, P., Cinelli, S., and Vericat, J. A. 1999. Evaluation of  10
       aliphatic halogenated  hydrocarbons  in the mouse  bone marrow
       micronucleus test. Mutagenesis 14(2):207-215.
  Darroch, J., and Mosimann, J. E. 1985. Canonical and principal compo-
       nents of shape. Biometrika 72:241-252.
  Environmental Protection Agency National Center for Computational
       Toxicology. 2007. http://www.epa.gov/comptox/index.html. Accessed
       November 2, 2007.
  Food and Drug Administration CFSAN EAFUS. 2007. U.S. FDA Center
       for Food Safety and Applied  Nutrition, Everything Added to Food
       in the United States,  http://www.cfsan.fda.gov/~dms/eafus.html.
       Accessed November 28, 2007.
  Jacobson-Kram, D., and Contrera, J. F. 2007. Genetic toxicity assessment:
       employing the best science for human safety evaluation part I: early
       screening for potential human mutagens. Toxicol. Sci. 96(1):16-20.
  Jolliffe, I. T. 2002. Principal Component Analysis, 2nd ed. Springer Series
       in Statistics, Springer-Verlag, New York.
  Kazius, J., McGuire, R., and Bursi, R.  2005. Derivation and validation of
       toxicophores for mutagenicity prediction. J. Med. Chem. 48:312-
       320.
  Kirkland, D. J., Aardema,  M.,  Banduhn, N., Carmichael,  P.,  Fautz,  R.,
       Meunier, J. R., and Pfuhler, S. 2007. In vitro approaches to develop
       weight  of evidence (WoE) and  mode of  action (MoA) discussions
       with positive in vitro genotoxicity results. Mutagenesis 22(3):161-175.
  Kirkland, D., Aardema, M., Henderson, L., and Müller, L. 2005. Evaluation
       of the ability of a battery of three in vitro genotoxicity tests to
       discriminate rodent carcinogens and non-carcinogens. Mutat. Res.
       584(1-2):1-256.
  Lynch, A., Harvey, J., Aylott, M., Nicholas, E., Burman, M., Siddiqui, A.,
       Walker, S., and Rees, R. 2003. Investigations into the concept of
       inhibitor-induced clastogenicity. Mutagenesis 18(4):345-353.
  Matthews, E. J., Kruhlak, N. L., Cimino, M. C., Benz, R. D., and Contrera,
       J. F.  2006a. An analysis of genetic toxicity,  reproductive and
       developmental toxicity, and carcinogenicity data: I. Identification of
       carcinogens using surrogate endpoints. Regul. Toxicol. Pharmacol.
       44:83-96.
C. Yang et al.

-------
Matthews, E. J., Kruhlak, N. L., Cimino, M. C., Benz, R. D., and Contrera,
     J.  F. 2006b.  An  analysis  of  genetic toxicity,  reproductive  and
     developmental toxicity,  and carcinogenicity data: II.  Identification
     of genotoxicants, reprotoxicants, and carcinogens  using in silico
     methods. Regul. Toxicol. Pharmacol. 44:97-110.
McCann, J., Spingarn, N. E., Kobori, J., and Ames, B. N. 1975. Detection
     of carcinogens as mutagens: bacterial tester strains with R factor
     plasmids. Proc. Nat. Acad. Sci. USA 72(3):979-983.
Moore, M. M.,  Honma,  M.,  Clements, J., Bolcsfoldi,  G., Burlinson, B.,
     Cifone, M., Clarke, J., Delongchamp, R., Durward, R., Fellows, M.,
     Gollapudi, B., Hou, S., Jenkinson, P., Lloyd, M., Majeska, J., Myhr,
     B.,  O'Donovan, M., Omori, T.,  Riach, C., San, R., Stankowski,  L.
     F. Jr., Thakur, A., Van Goethem, F., Wakuri, S.,  and Yoshimura, I.
     2006. Mouse lymphoma thymidine kinase  gene mutation assay:
     follow-up meeting  of the International Workshop on Genotoxicity
     Tests-Aberdeen, Scotland, 2003-assay acceptance criteria, positive
     controls, and data evaluation. Environ. Mol. Mutagen. 47:1-5.
Mortelmans, K., and Zeiger, E. 2000. The Ames Salmonella  microsome
     mutagenicity assay. Mutat.  Res. 455:29-60.
National  Toxicology  Program  (NTP)  On-line Database. 2007. National
     Institute of Health & Environmental Sciences NTP On-line Database.
     http://ntp-apps.niehs.nih.gov/ntp_tox/index.cfm.  Accessed Novem-
     ber 2, 2007.
National  Toxicology Program NICEATM. 2007. The  NTP Interagency
     Center for the Evaluation of  Alternative Toxicological Methods
     (NICEATM);  Interagency Coordinating  Committee  on the  Val-
     idation of Alternative  Methods (ICCVAM).  http://iccvam.niehs.
     nih.gov/methods/methods.htm. Accessed November 2,  2007.
O'Brien, J., Renwick, A. G., Constable, A., Dybing, E., Muller, D. J. G.,
     Schlatter, J., Slob, W., Tueting, W., van  Benthem, J., Williams, G.
     M., and Wolfreys, A. 2006. Approaches to the risk assessment of
     genotoxic carcinogens  in food:  A critical appraisal. Food Chem.
     Toxicol. 44(10):1613-1635.
           Prival, M. J., and Zeiger, E.  1998. Chemicals mutagenic in Salmonella
                typhimurium strain TA1535 but not in TA100. Mutat. Res. Genetic
                Toxicol. Environ. Mutagen. 412(3):251-260.
           Richard, A. M. 2006. Future of toxicology—predictive toxicology: an ex-
                panded view of chemical toxicity. Chem. Res. Toxicol. 19(10):1257-
                1262.
           Richard, A., Yang, C., and Judson, R. 2007.  Toxicity  data  informatics:
                supporting a new paradigm for toxicity prediction. Toxicol. Mech.
                Methods. Present volume.
           Roberts,  G.,  Myatt,  G.  J., Johnson,  W.  P.,  Cross,  K.  P.,  and
                Blower,  P.  E.  2000.  LeadScope:  software for exploring  large
                sets  of  screening data. J.  Chem.  Inf.  Comput. Sci. 40:1302-
                1314.
           Snyder, R. D., and Gillies, P. J. 2002. Evaluation of the clastogenic,
                DNA intercalative, and topoisomerase II-interactive properties of
                bioflavonoids in Chinese hamster V79 cells. Environ. Molec.
                Mutagen. 40:266-276.
           Tokyo-Eiken.    2007.    Tokyo   Metropolitan   Institute   of   Public
                Health,  Mutagenicity  of  food  additives.  http://www.tokyo-
                eiken.go.jp/henigen/index.htm. Accessed November 2, 2007.
           Tweats,  D. J.,  Scott,  A.  D.,  Westmoreland,   C.,  and Carmichael,
                P.  2007. Determination of  genetic toxicity  and  potential car-
                cinogenicity  in vitro—challenges post  the Seventh Amendment
                to the European Cosmetics Directive. Mutagenesis 22(1):5-13.
           Vogel, E. W., Barbin, A., Nivard, A. J. M., Stack, H. F., Waters, M. D.,
                Paul, H. M., and  Lohman, P. H. M. 1998. Heritable and cancer
                risks  of exposures to anticancer drugs: inter-species comparisons
                of covalent  deoxyribonucleic acid-binding agents. Mutat.  Res.
                400:509-540.
           Yang, C., Benz, D. R., and Cheeseman, M. A. 2006. Landscape of current
                toxicity databases and database standards. Curr. Opin. Drug Discov.
                Develop. 9(1): 124-133.
         Data Mining Integrated Multiple Genetic Toxicity Databases

-------
                                       Available online at www.sciencedirect.com
                                                    ScienceDirect
ELSEVIER                       Toxicology and Applied Pharmacology 227 (2008) 163-178
                                                                                                       www.elsevier.com/locate/ytaap

                                           Contemporary Issues in Toxicology

                   Understanding mechanisms of toxicity:  Insights from
                                            drug  discovery  research

                                        Keith A.  Houck *, Robert J.  Kavlock
            National Center for Computational Toxicology, Office of Research and Development, United States Environmental Protection Agency,
                                               Research Triangle Park, NC 27711, USA
                                Received 4 April 2007;  revised 28 September 2007; accepted 11 October 2007
                                                  Available online 4 November 2007
Abstract

   Toxicology continues to rely heavily on use of animal testing for prediction of potential for toxicity in humans. Where mechanisms of toxicity have
been elucidated, for example endocrine disruption by xenoestrogens binding to the estrogen receptor, in vitro assays have been developed as surrogate
assays for toxicity prediction. This mechanistic information can be combined with other data such as exposure levels to inform a risk assessment for the
chemical. However, there remains a paucity of such mechanistic assays due at least in part to lack of methods to determine specific mechanisms of toxicity
for many toxicants. A means to address this deficiency lies in utilization of a vast repertoire of tools developed by the drug discovery industry for
interrogating the bioactivity of chemicals. This review describes the application of high-throughput screening assays as experimental tools for profiling
chemicals for potential for toxicity and understanding underlying mechanisms. The accessibility of broad panels of assays covering an array of protein
families permits evaluation of chemicals for their ability to directly modulate many potential targets of toxicity. In addition, advances in cell-based
screening have yielded tools capable of reporting the effects of chemicals on numerous critical cell signaling pathways and cell health parameters. Novel,
more complex cellular systems  are  being used to model mammalian tissues and the consequences of compound treatment. Finally, high-throughput
technology is being applied to model organism screens to understand mechanisms of toxicity. However, a number of formidable challenges to these
methods remain to be overcome before they are widely applicable.  Integration of successful approaches will contribute towards building a systems
approach to toxicology that will provide mechanistic understanding of the effects of chemicals on biological systems and aid in rational risk assessments.
Published by  Elsevier Inc.

Keywords: High-throughput screening; High-content screening; In vitro toxicology; Mechanism of toxicity; Systems  toxicology; Predictive toxicology
Contents

   Introduction
   High-throughput screening
      HTS for modulators of drug-metabolizing enzymes
      High-throughput genotoxicity assays
      Ion channel targets for toxicity screening
      Receptor targets for toxicity screening
      Broad pharmacological profiling
      Complex cellular toxicity assays
      Model organism toxicity assays
   Data analysis
   Experimental considerations
   Challenges of HTS application to toxicology
 * Corresponding author. NCCT/ORD (D343-03), US EPA, 109 TW Alexander Dr, Research Triangle Park, NC 27711, USA. Fax: +1 919 685 3371.
   E-mail address: houck.keith@epa.gov (K.A. Houck).

0041-008X/$ - see front matter. Published by Elsevier Inc.
doi:10.1016/j.taap.2007.10.022

-------
   Conflict of interest statement
   References
Introduction

   Toxicology  has traditionally  focused on  the effects  of
exogenous chemicals on living organisms through intensive
studies done one chemical at a time.  Such approaches have
served to illuminate the modes of action of many classes  of
chemicals and provided detailed mechanistic understanding  of
the molecular targets of toxicity for some. However, the costs of
this approach have been high and mechanistic understanding
remains  limited.  Toxicology studies rely heavily on use  of
vertebrate animals, an expensive undertaking in both time and
money with debatable predictive power for human safety. For
example, carcinogenicity studies are conducted using a 40-year-
old model requiring 400 or more study animals and two years of
exposure at a cost of millions of dollars (Bucher, 2002). Despite
the investment of resources, debate continues on the utility  of
these  data in predicting  carcinogenicity potential in humans
(Ennever and Lave, 2003). The problem lies chiefly with  an
inability  to discern mechanisms of toxicity for the majority  of
toxicants tested using the "black box" whole  animal assays;
hence, cross-species extrapolation and low-dose, real-life expo-
sure effects become very difficult to appropriately assess. With
increasing public concern over the minimal toxicity information
available for thousands of large volume-production chemicals
produced and used in  commerce, the  inadequacy of existing
methods  presents a sizeable quandary  (De Rosa et al., 2003;
National  Research Council, 2006). REACH legislation in the
European Union  covers approximately 30,000 chemicals and
would require millions of animals  and  billions  of Euros  to
conduct  safety assessment on all of  these using traditional
methods  (Van  der  Jagt  et al., 2004).  Such  an enormously
expensive undertaking would likely provide  useful data  in
defining  the toxicity (or lack thereof) for  many chemicals;
however, without mechanistic understanding, debate over risk of
a subset of these chemicals would most assuredly ensue. What is
needed are cost-effective  screening assays that  would not only
identify chemicals of safety concern, but provide quantitative
and mechanistic information to inform rational risk assessment.
   In  the pharmaceutical and biotechnology industries over the
past 15 years, enormous resources have been invested in devel-
oping efficient means to screen compounds against large numbers
of potential therapeutic targets. The dramatic  advances made in
high-throughput screening (HTS) technologies now permit ready
profiling  of biological activity of large chemical libraries using
multiwell plates  and automated  liquid handling equipment.
Although developed to support  drug discovery, toxicologists
have begun applying these batch-testing methodologies to large
numbers  of chemicals using in vitro bioassays and model organ-
ism screens to characterize potential for toxicity and understand
mechanisms of action. This review will look at applications  of
HTS  techniques to toxicology and  their potential impact on
shifting testing paradigms for evaluating chemicals for risk.
  High-throughput screening

     HTS techniques  are used primarily in the pharmaceutical
  industry in support of lead generation projects whose goal is to
  efficiently sort through enormous numbers of compounds  for
  leads, the starting chemical structures for the drug development
  process. Large libraries of organic small molecules or natural
  products are batch-tested against biological targets in industry-
  standard  96- and 384-well plates, occasionally using even
  higher density 1536- or 3456-well plates.  Relying heavily on
  automation and robotics, throughput ranges from thousands to
  1,000,000 samples tested  per day depending on the specific
  assay format (Table 1). Rate-limiting steps often lie in the type
  of signal detection required for the assay: simple fluorescence
  signals lie at the high end of the throughput range and cellular
  imaging assays at the lower end. Costs range from well under
  $1 per well for reagents and consumables to $10-75 per well for
  work performed at contract research laboratories.
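
  The throughput ranges in Table 1 can be expressed as a small lookup. The
  following is a minimal sketch, not part of the original review: the function
  name is hypothetical, and because the published ranges share their boundary
  values (500, 10,000 and 100,000), assigning each boundary to the lower mode
  is an assumption made here for illustration.

```python
def screening_mode(samples_per_day: int) -> str:
    """Return the Table 1 abbreviation for a daily sample throughput.

    Boundary values (500, 10,000, 100,000) appear in two adjacent ranges
    in Table 1; this sketch assumes they belong to the lower mode.
    """
    if samples_per_day < 1:
        raise ValueError("throughput must be at least 1 sample per day")
    if samples_per_day <= 500:
        return "LTS"    # low-throughput, e.g. animal models
    if samples_per_day <= 10_000:
        return "MTS"    # medium-throughput, e.g. cellular imaging assays
    if samples_per_day <= 100_000:
        return "HTS"    # high-throughput, e.g. enzymatic inhibition assays
    return "uHTS"       # ultra-high-throughput, e.g. cell reporter assays


if __name__ == "__main__":
    for rate in (100, 5_000, 50_000, 1_000_000):
        print(rate, screening_mode(rate))
```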
     Although the number of chemicals requiring toxicity
  screening is not in the range of the numbers of chemicals
  typically screened for lead generation for drug  discovery,  the
  opportunity  to broadly profile compounds for toxicity with a
  variety of assays efficiently and  at low cost makes HTS an
  attractive approach.  This  strategy, i.e. few chemicals  tested
  against a large number of assays, is the converse of the drug
  discovery paradigm  where many compounds are tested against
  one biological target, but makes  use of the same efficiency
  infrastructure used for HTS assays. Initial work on adopting
  HTS techniques for toxicity testing occurred in the pharmaceutical
  industry and centered largely on the cytochrome P450
  monooxygenase (CYP) drug-metabolizing enzymes, since
  interference  with these enzymes was a commonly encountered
  problem during development of new drug candidates (Crespi
  and Stresser, 2000).  This area continues to receive much focus
  as new technologies are brought  to bear on inadequacies of
  existing approaches. Beyond drug metabolizing enzymes, in
  vitro assays for specific targets associated with toxicity, e.g.  ion
  channel assays, have been developed and are now routinely run.
  In addition,  pharmacology profiling panels targeting represen-
  tative members of many protein families are used to understand
  Table 1
  Definitions of screening modes

  Screening mode          Abbreviation   Samples tested per day   Example
  Low-throughput          LTS            1-500                    Animal models
  Medium-throughput       MTS            500-10,000               Fluorescent cellular microscopic imaging assay
  High-throughput         HTS            10,000-100,000           Fluorescent enzymatic inhibition assay
  Ultra-high-throughput   uHTS           >100,000                 Beta-lactamase cell reporter assay

-------
unexpected "off-target" activities of compounds  in develop-
ment. Use of in vitro cellular assays for toxicity testing has been
rapidly growing using a number of different methods including
automated, fluorescence  imaging  technology to  profile cell
health parameters; genotoxicity screening in mammalian cells;
and  cell systems biology approaches. Finally, adaptation  of
several model organism assays to HTS-compatible formats has
also begun to impact toxicity testing paradigms. The application
of these approaches  to  toxicity  prediction and  mechanistic
understanding will be discussed in this review.
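
The "few chemicals against many assays" strategy described above implies a simple scaling of wells, plates, and reagent cost. The sketch below is an illustration only: the function name, the default of eight concentrations per chemical, and the example inputs are assumptions, control wells are ignored, and plates are counted as if wells from different assays could be pooled, which understates the real plate count.

```python
import math


def profiling_campaign(n_chemicals, n_assays, n_concentrations=8,
                       replicates=1, wells_per_plate=384, cost_per_well=1.0):
    """Estimate wells, plates and reagent cost for a profiling campaign.

    Simplifications (assumptions of this sketch): no control wells, and
    plates are counted from the pooled well total rather than per assay.
    """
    wells = n_chemicals * n_assays * n_concentrations * replicates
    plates = math.ceil(wells / wells_per_plate)
    return {"wells": wells, "plates": plates, "cost": wells * cost_per_well}


if __name__ == "__main__":
    # Hypothetical campaign: 300 chemicals, 100 assays, 8 concentrations each.
    print(profiling_campaign(300, 100))
```

Changing `cost_per_well` between the reagent-only figure and the contract-laboratory range quoted earlier shows how strongly the total depends on where the work is performed.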

HTS for modulators of drug-metabolizing enzymes

   CYPs are the most important class of enzymes metabolizing
exogenous chemicals  as well as many endogenous substrates.
Metabolism of exogenous chemicals or interference with the
activities of these enzymes can result in toxicities manifested in
a number of ways. First, the chemical itself can be directly acted
on by the enzymes, primarily in the liver  and intestines,
resulting in production of a toxic metabolite or, conversely, the
detoxification of a  toxic chemical  (Shimada,  2006). For
pharmaceuticals, this is of obvious importance in determining
whether the active agent survives first-pass metabolism in the
liver to reach its target tissue in sufficient dose. For environmental
chemicals, understanding whether CYP metabolism
activates chemicals to toxic metabolites that would be missed in
in vitro systems lacking CYP activity or, equally important,
whether CYP metabolism results in efficient inactivation of
toxic chemicals is a critical requirement for rational
risk assessment.
   Secondly,  inhibition or induction of CYPs by drugs  or
xenobiotics can disrupt the normal kinetics  of other drugs and
chemicals resulting  in  increased  or decreased  levels and
potential toxicities. Such interactions, called drug-drug inter-
actions  in the pharmaceutical field,  have  many  clinically
important examples. In one common case,  exogenous ligands
activate nuclear  receptors, in particular pregnane X receptor
(PXR), leading to induction of mRNA for a variety of CYPs,
subsequent  increased metabolic activity  of the expressed
enzymes and, hence,  reduced bioavailability  of therapeutic
drugs such as oral contraceptives (Handschin and Meyer, 2003).
Such activities can also seriously affect the clinical development
of novel therapeutic drugs by confounding efficacy trials. As a
result, the  pharmaceutical  industry has  developed HTS
approaches to permit efficient screening of chemicals for their
activities in modulating the drug metabolizing enzymes. These
assays, described below, allow the quantitative measurement of
effect of the chemicals directly on their molecular targets, e.g.
specific CYPs or nuclear receptors. This information can then be used in
conjunction with human exposure predictions or measurements
to estimate the potential for these interactions to be of
significant toxicity concern.
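
One simple way to combine a measured in vitro potency with an exposure estimate is a ratio of the two. The sketch below is an assumption-laden illustration, not a method described in this review: the function name is hypothetical, both values are assumed to be in the same units (micromolar here), and no particular ratio is proposed as a threshold of concern.

```python
def potency_to_exposure_ratio(potency_um: float, exposure_um: float) -> float:
    """Ratio of in vitro potency (e.g. an IC50 or EC50) to exposure level.

    Both arguments must be in the same concentration units. A larger
    ratio means the exposure is further below the active concentration.
    """
    if potency_um <= 0 or exposure_um <= 0:
        raise ValueError("potency and exposure must be positive")
    return potency_um / exposure_um


if __name__ == "__main__":
    # Hypothetical values: IC50 of 10 uM against an exposure of 0.5 uM.
    print(potency_to_exposure_ratio(10.0, 0.5))
```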
   The effects of CYP enzymatic activity on exogenous chemicals
have been determined in a number of ways. Standard,
low-throughput methods allow
exposure of chemicals to a source of CYP  activity in  96-well
plates followed by extraction of the chemical and determination
    of amount of material remaining using liquid chromatography/
    mass spectrometric methods  (LC/MS/MS) (Lau et al., 2002;
    Shibata et al., 2002). Since the vast majority of cell lines have
    insignificant levels of most of the CYP  enzymes, sources of
    metabolizing capacity are recombinant CYP enzymes, hepato-
    cyte subcellular fractions such as S9 or microsomes, or primary
    hepatocytes themselves, freshly isolated or cryopreserved. Use
    of hepatocytes also provides  the inclusion of Phase II conju-
    gation systems to more closely represent the in vivo environ-
    ment. These "loss-of-parent" assays are useful in understanding
    pharmacokinetic parameters in the drug development process
    but are of less utility in toxicology, chiefly because of
    the lack of a priori knowledge of what the
    metabolite(s) will be and whether they are  inherently more or
    less toxic  than  the parent.  Determination of these  effects
    generally has been beyond the screening level of information
    provided by such assays. Interest in coupling such biotransfor-
    mation systems to toxicity testing has led  to several systems of
    potential value. In one approach, cytotoxicity in cell lines stably
    expressing individual, recombinant CYP enzymes can be com-
    pared to the parental cell line lacking the CYP (Yoshitomi et al.,
    2001;  Bernauer  et al., 2003). While  such approaches can  be
    high-throughput and useful  for mechanistic understanding
    for specific chemicals, concern over lack of a  complete spec-
    trum of metabolic  responses,  including enzyme induction, has
    prevented the widespread  adoption of these  lines for high-
    throughput toxicity screening.
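
    The "loss-of-parent" measurement described above is commonly summarized
    as a half-life. The following is a minimal sketch under stated
    assumptions: disappearance of the parent chemical is taken to be first
    order, the rate constant is estimated by an ordinary least-squares fit
    of ln(percent remaining) against time, and the function name and example
    time points are hypothetical.

```python
import math


def first_order_half_life(times_min, pct_remaining):
    """Fit ln(% parent remaining) vs. time and return (k, half_life).

    Assumes first-order loss: remaining(t) = 100 * exp(-k * t).
    k is in 1/min and half_life in minutes.
    """
    ys = [math.log(p) for p in pct_remaining]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, ys))
    k = -sxy / sxx  # negative slope of the log-linear fit
    if k <= 0:
        raise ValueError("no measurable loss of parent chemical")
    return k, math.log(2) / k


if __name__ == "__main__":
    times = [0, 15, 30, 60]
    # Synthetic data generated with a 30-minute half-life.
    remaining = [100 * 0.5 ** (t / 30) for t in times]
    k, t_half = first_order_half_life(times, remaining)
    print(f"k = {k:.4f} per min, half-life = {t_half:.1f} min")
```

    As the text notes, such a value describes how fast the parent disappears
    but says nothing about which metabolites form or how toxic they are.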
       Alternatively, co-culture systems consisting of metabolically
    competent cells, i.e. primary hepatocytes, coupled with one or
    more additional target cell types provide a means to incorporate
    biotransformation into toxicity screening. Although primary
    hepatocytes are notoriously variable from batch to batch, improved
    methods for use of cryopreserved human hepatocytes have demonstrated
    greater consistency and preservation of metabolic function, including
    enzyme induction (Roymans et al., 2005; Kafert-Kasting et al.,
    2006). Innovative platforms to carry out toxicity screening with
    primary hepatocytes have  been described although these new
    microfluidic-based technologies have yet to be widely accepted
    (Viravaidya et al., 2004; Lee et al., 2006a). In another variation
    on this theme, the  MetaChip  system couples CYP catalysis of
    chemicals using immobilized CYPs with cytotoxicity endpoints
    determined by cell viability staining (Lee  et al., 2005). Finally,
    the IdMOC system permits co-culture of five cell types along
    with primary hepatocytes in an interconnected system allowing
    exposure of metabolites of test compound to a variety of cells
    representing different tissues  (Li et al., 2004). Although these
    methods have yet  to make a  significant impact in toxicology,
    they represent  progress in dealing with a critical problem
    otherwise requiring  whole animal testing. These or similar
    methods will likely receive growing attention in the future.
       HTS methods to assess the potential for compounds to elicit
    pharmacokinetic drug interactions via inhibition of
    CYP activities are well established in the drug development
    industry. Most of the major species of CYP enzymes are readily
    available from  commercial sources as recombinant proteins
    produced in Escherichia coli  or baculovirus-insect cell expres-
    sion systems. Semi-automated  assays validated under Good

-------
Laboratory Practices (GLP) standards that measure inhibition of
specific substrate conversion using LC/MS/MS  for detection
have been described (Walsky and Obach,  2004). Assays with
much higher throughput also exist and are  reliant on quenched
fluorescent substrates that fluoresce following CYP-mediated
metabolism (Crespi et al., 1997). Because this is a homogeneous
assay with  a  fluorescent readout, the  assay  can  be readily
miniaturized using very low reaction  volumes suitable  for
screening in 1536-well plates (Trubetskoy et al., 2005). Phase II
conjugative enzymes have been less well  developed for HTS
approaches. Methods possibly useful in HTS format for UDP-
glycosyl transferase  enzymes and phenol  sulfotransferase
(SULT1A1) have been published but do not appear to be in
routine use (Trubetskoy and Shaw, 1999; Frame et al., 2000).
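
Readouts from a fluorogenic CYP inhibition assay of the kind described above are typically reduced to percent inhibition and an IC50. The sketch below is illustrative only: the function names, the control-well layout, and the use of log-linear interpolation between the two bracketing concentrations (rather than a full curve fit) are assumptions, not details taken from the cited studies.

```python
import math


def percent_inhibition(signal, uninhibited, background):
    """Percent inhibition from raw fluorescence.

    uninhibited: mean signal of enzyme + substrate without test chemical.
    background:  mean signal without enzyme activity.
    """
    return 100.0 * (1.0 - (signal - background) / (uninhibited - background))


def ic50_by_interpolation(concs, inhibitions):
    """Estimate IC50 by interpolating on log10(concentration).

    concs must be positive and ascending; returns None when 50%
    inhibition is not bracketed by the tested concentrations.
    """
    points = list(zip(concs, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


if __name__ == "__main__":
    # Hypothetical plate-reader values in relative fluorescence units.
    print(percent_inhibition(signal=550, uninhibited=1000, background=100))
    print(ic50_by_interpolation([0.1, 1.0, 10.0], [10.0, 40.0, 60.0]))
```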
   Induction of drug metabolizing enzyme activity,  another
potential mode of drug interaction toxicity, has traditionally
been determined through measurement of enzymatic activity in
primary hepatocyte cultures following drug exposure, a rela-
tively low-throughput method (Luo et al., 2004; Lin, 2006).
CYP3A4 monooxygenase activity has been the focus of much
attention for drug interactions as it is responsible for metabolism
of more than 50% of pharmaceuticals (Guengerich,  1999).
Recent work that identified the  nuclear receptor PXR as a key
regulator of CYP3A monooxygenase expression provided the
insight to develop HTS assays for inducers of this important
drug interaction pathway (Lehmann et al., 1998; Moore and
Kliewer, 2000). HTS luciferase reporter  gene assays, typically
performed  in stably or transiently transfected liver cell lines
such as H4IIE or HuH7, are now routinely  run  to identify
activators of PXR. Strong correlations between inducers of the
PXR  reporter gene activity  and  ability  to  induce CYP3A
activity have been demonstrated. Lower  throughput enzymatic
activity assays in  hepatocytes  are reserved for confirmation
assays (Luo et al., 2002; Cui et al., 2005). Reporter gene assays
compatible with HTS for other nuclear receptors controlling
drug metabolism enzyme induction, e.g. constitutive androstane
receptor, liver X receptor, vitamin D receptor, and farnesoid X
receptor, have also been described (Faucette et al., 2006; Houck
et al., 2004; Bettoun et al., 2003; Lee et al., 2006b). The more
distantly related aryl hydrocarbon receptor  (AhR), activated by
ligands  including  dioxin, regulates  expression  of the drug-
metabolizing enzymes of the CYP1A  family. Reporter gene
assays have been established in liver cell lines, e.g. HepG2, and
can be used in HTS format (Yueh et al., 2005). Good correlation
between reporter gene activity and  induction of mRNA  for
CYP1A1 as well as additional genes regulated by AhR was
shown.

High-throughput genotoxicity assays

   A battery of in vitro assays including  the  Ames bacterial
reverse  mutation test, the mouse lymphoma tk gene mutation
assay, and the micronucleus clastogenicity assay is traditionally
used to measure the potential for genotoxic activity of chemicals.
These are not high-throughput  methods, are relatively costly,
and are not ideal with regard to  either  sensitivity (Ames) or
specificity  (lymphoma tk and micronucleus) (Kirkland et  al.,
  2005). In addition,  for drug development,  a positive finding
  often results in termination of work on a chemical series, perhaps
  prematurely, since additional medicinal chemistry modifications
  to the chemical series may eliminate a positive test finding. Tests
  are generally reserved for relatively late-stage drug development
  candidates. However, recent advances provide high-throughput
  adaptations  and  alternatives  to these  genotoxicity assays.
  Although not yet validated to the extent they can be used for
  regulatory purposes, they expand the possibilities for providing
  useful prioritization information. For example,  in  drug dis-
  covery, the HTS methods permit an earlier screening of a much
  larger pool of potential drug candidates, thus drawing attention
  to possible liabilities earlier in the development process. For
  environmental health concerns, a wider net can be cast to more
  efficiently prioritize chemicals for more detailed evaluation of
  carcinogenicity concerns, in particular  if sensitivity and
  specificity of the test or combinations of tests are superior.
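
  Sensitivity, specificity, and concordance, the measures used above to
  compare genotoxicity tests, follow directly from a two-by-two comparison
  against a reference classification. The sketch below uses the standard
  definitions; the function name and the example counts are hypothetical and
  are not data from any cited study.

```python
def assay_performance(true_pos, false_neg, true_neg, false_pos):
    """Return sensitivity, specificity and concordance for a binary test.

    sensitivity: fraction of reference positives the test calls positive.
    specificity: fraction of reference negatives the test calls negative.
    concordance: fraction of all chemicals classified correctly.
    """
    positives = true_pos + false_neg
    negatives = true_neg + false_pos
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one reference positive and negative")
    return {
        "sensitivity": true_pos / positives,
        "specificity": true_neg / negatives,
        "concordance": (true_pos + true_neg) / (positives + negatives),
    }


if __name__ == "__main__":
    # Hypothetical counts: 50 reference carcinogens, 50 non-carcinogens.
    print(assay_performance(true_pos=40, false_neg=10, true_neg=30, false_pos=20))
```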
     The Ames reverse mutagenesis test requires relatively large
  quantities of test chemical, approximately 250 mg, and is not
  easily automated (Miller et al., 2005). An adaptation, known as the
  Ames  II assay, uses bacteria  grown in liquid suspension in
  microtiter plates rather than on agar plates (Gee et al., 1994). This
  method was shown to have a very  good concordance with the
  standard Ames testing procedure (Fluckiger-Isler et al., 2004). It
  reduces the  amount of  chemical required  for testing and  is
  compatible with limited automation providing a means of signifi-
  cantly increasing the throughput of this important genotoxicity
  test.
     A mammalian reporter cell line  for DNA damage  has been
  developed from the human, p53-proficient, lymphoblastoid cell
  line TK6 (Hastwell et al.,  2006). The resulting cell line, GenM-
  T01, contains a stably integrated, enhanced green fluorescent
  protein (EGFP) reporter gene under the control of the upstream
  promoter region of the human GADD45a (growth arrest and
  DNA damage) gene. GADD45a is induced in a p53-dependent
  manner in response to a wide variety of DNA damaging agents.
  Initial validation efforts have shown very high sensitivity and
  specificity for distinguishing genotoxic carcinogens from non-
  genotoxic carcinogens  and non-genotoxic non-carcinogens
  (Hastwell et al., 2006). However, only chemicals not requiring
  metabolic activation were  used  for the validation. A single
  chemical, cyclophosphamide,  was used to demonstrate that
  addition of an S9 mix could be used to metabolically activate a
  pro-genotoxin; however, the method used was not conducive to
  HTS  format. Further  validation of this reporter cell line  is
  warranted given the potential to broadly and accurately screen
  for genotoxic chemicals. Unfortunately, in the absence of a method to
  provide biotransformation capacity, this assay cannot provide a
  complete picture of genotoxicity potential, an issue of relevance
  in general for in vitro toxicity assays.
     Finally, automated cellular imaging technology has
  recently been applied to the micronucleus clastogenicity assay
  resulting in much higher  throughput  and  lower compound
  requirement than the assay relying on traditional microscopy. In
  this assay, cells are treated with test  chemicals  in  microtiter
  plates, stained,  read on automated, fluorescence microscope
  imaging systems, and scored using an image analysis algorithm.

-------
Compound requirement is reduced from tens of milligrams to
1-3 mg and throughput is greatly increased (Lang et al., 2006).

Ion channel targets for toxicity screening

   The hERG potassium ion channel is essential  for normal
electrical activity in the heart by mediating ventricle repolariza-
tion (Sanguinetti and Tristani-Firouzi, 2006). Linkage of hERG to
pathology was demonstrated through mutations in the receptor
that caused  cardiac arrhythmias. This promiscuous receptor is
blocked  by a relatively  wide  variety  of  pharmaceutical
compounds  resulting  in  prolongation of  the  QT  interval of
cardiac rhythm. Examples include Seldane,  Hismanal, Propulsid
and Raxar, all withdrawn from the market at least in part as a result
of these findings (Fermini and  Fossa, 2003). Therefore,  major
effort was put  into developing HTS methods to screen drug
candidates early in the discovery process to identify poten-
tial problems.  Several types of assays  in HTS format  were
developed: a receptor-binding assay, an ion efflux assay, and a
fluorescence assay based on membrane potential  dyes.  The
receptor-binding assay,  while  sensitive and reliable, detects
chemicals that bind to the receptor but may or may not induce
QT prolongation (Finlayson et al., 2001). Thus compounds active
in the receptor-binding assay must be followed up with assays
such as voltage-clamping electrophysiology measurements. The
rubidium efflux assay can provide modest throughput and a
functional endpoint, but is not as readily automated as the binding
assay due to the requirement for a flame atomic absorption
spectrometer as a reader (Cheng et al., 2002). Alternatively, the
radioactive rubidium isotope 86Rb and scintillation counting can
be used, but generation of 86Rb waste in an HTS format is not
desirable (Weir and Weston, 1986). This method may also result
in right-shifted concentration-response curves indicating reduced
potencies relative to patch-clamp-derived values (Murphy et al.,
2005). Use of fluorescent, voltage-sensitive dyes in combination
with a fluorometric imaging reader is an additional HTS method
used for hERG, although potency is again reported as
significantly right-shifted relative to patch-clamp values (Murphy
et al., 2005). Development of patch-clamp instruments capable of
parallel measurements at moderate throughput offered an
alternative to the binding and rubidium efflux assays. However,
published results suggest the instrument performs no better than
the rubidium assay with regard to sensitivity (Sorota et al., 2005).
Pharmaceutical compounds in development are now routinely
screened against hERG, but it does not appear that environmental
chemicals have been evaluated for activity against this channel.
Interpretation of the  significance of activity of environmental
chemicals against  hERG would need to consider the relevant
exposure scenarios for active chemicals together with the potency
of the chemicals for channel inhibition.
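
The right shift in concentration-response curves described above can be made concrete with a small Python sketch. It assumes a simple Hill-type inhibition model with unit slope; the function names and all IC50 values are hypothetical and are not taken from the cited studies.

```python
def fraction_inhibited(conc_um, ic50_um, hill=1.0):
    """Fraction of channel activity blocked under a Hill-type model (assumed)."""
    if conc_um <= 0:
        return 0.0
    return conc_um ** hill / (conc_um ** hill + ic50_um ** hill)


def fold_shift(ic50_assay_um, ic50_patch_clamp_um):
    """Fold right shift of an HTS-assay IC50 relative to the patch-clamp IC50."""
    return ic50_assay_um / ic50_patch_clamp_um


# Hypothetical potencies for one compound in two assay formats (micromolar)
patch_clamp_ic50 = 0.1
dye_assay_ic50 = 1.0

print("fold shift:", fold_shift(dye_assay_ic50, patch_clamp_ic50))
for conc in (0.01, 0.1, 1.0, 10.0):
    # The same concentration appears less active in the right-shifted assay
    print(conc,
          round(fraction_inhibited(conc, patch_clamp_ic50), 3),
          round(fraction_inhibited(conc, dye_assay_ic50), 3))
```

A right-shifted assay reports less inhibition at any given concentration, so a compound may be missed at a fixed screening concentration even though it is active by patch clamp.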

Receptor targets for toxicity screening

   Beyond the specific molecular targets described above, the
drug discovery process has generated HTS assays against a large
number of candidate  therapeutic targets. Some of  these also
represent likely potential toxicity screening targets. For exam-
    ple, there has been much interest in the potential for environ-
    mental chemicals to act as endocrine disrupters in humans and
    wildlife  (Kavlock  et  al.,  1996; Waring and Harris,  2005;
    Hutchinson et al., 2000). One mechanism of action for endocrine
    disrupters is to act directly as ligands to steroid hormone nuclear
    receptors, in particular estrogen, androgen and thyroid hormone
    receptors. High-throughput, gene reporter assays using lucifer-
    ase or beta-lactamase have been developed that can easily be
    used to screen for both agonist and antagonist activity for these
    as well as other nuclear receptors (Giddings et al., 1997; Chin
    et al., 2003). In addition, functional biochemical assays exist that
    are capable of measuring receptor  activation through ligand-
    induced recruitment of coactivator proteins. A number of assay
    formats  based  on protein:protein interactions  have been
    developed using the interacting domains of the nuclear receptor
    and  the  coactivator protein.  Ligand-induced interactions are
    measured by fluorescence polarization with  a fluorescently
    labeled  small peptide (Ozers et al., 2005), fluorescence reso-
    nance energy transfer using fluorescently labeled  receptor and
    coactivator (Liu et al., 2003), or Amplified Luminescence
    Proximity Homogeneous Assay (AlphaScreen), a bead-based
    approach with very high sensitivity (Rouleau et al., 2003). These
    assays, too, can be run in agonist or antagonist mode. Each of
    these assay formats is subject to a variety of types of interference,
    e.g. fluorescence quenching, resulting in potential false positive
    and false negative results. However, combining two formats, e.g.
    a cell-based gene reporter assay with a biochemical coactivator
    recruitment assay, would greatly reduce the number of inaccurate
    findings. Traditional receptor binding assays using radiolabeled
    ligands are also  available as means to confirm results from the
    other assays. Indeed, these binding assays can themselves be
    configured to be conducted in high-throughput format using
    scintillation  proximity assay  beads (Nichols  et al.,  1998).
    Application of traditional binding study analysis methods, e.g.
    Lineweaver-Burk  plots, gives insight into mechanism of
    binding that permits appropriate assessment of the results of in
    vitro  screens (Laws   et al.,  2006).  Potency, efficacy  and
    mechanism of binding provide data that can be used to determine
    the effects  of the chemicals under given  human exposure
    scenarios, thus aiding in understanding the likelihood of hazard.
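
As an illustration of the double-reciprocal (Lineweaver-Burk-style) analysis mentioned above, the following is a minimal Python sketch. It assumes simple one-site saturation binding, B = Bmax*L/(Kd + L), so that 1/B is linear in 1/L. The function names and the noise-free data are hypothetical and are not drawn from the cited studies.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def double_reciprocal_fit(ligand_conc, bound):
    """Estimate (Kd, Bmax) from 1/B = (Kd/Bmax) * (1/L) + 1/Bmax."""
    slope, intercept = fit_line([1.0 / c for c in ligand_conc],
                                [1.0 / b for b in bound])
    bmax = 1.0 / intercept
    kd = slope * bmax
    return kd, bmax


# Hypothetical, noise-free data generated with Kd = 2.0 and Bmax = 100.0
conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [100.0 * c / (2.0 + c) for c in conc]
kd, bmax = double_reciprocal_fit(conc, bound)
print(round(kd, 3), round(bmax, 3))
```

With real measurements the reciprocal transform amplifies error at low ligand concentrations, so this form is better suited to visualizing the mechanism of binding than to final parameter estimation.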
       Recent work has linked cardiac valvulopathy with therapeutic
    drugs having activity at the 5HT-2b G-protein-coupled receptor
    (GPCR), including norfenfluramine and ergot-based Parkinson's
    disease drugs (Fitzgerald et al., 2000; Schade et al., 2007). HTS binding
    assays for 5HT-2b are readily available and the receptor has been
    reported  to be one of the more promiscuous ones screened in
    pharmacology safety profiling panels (Whitebread  et al., 2005).
    Given the growing evidence of the serious side effects associated
    with 5HT-2b ligands, screening against this receptor is likely to
    become a routine part of safety assessment for new drug develop-
    ment. Screening of environmental chemicals for activity  at this
    receptor does not appear to have been reported.

    Broad pharmacological profiling

       Other molecular targets  are less well characterized with
    respect  to demonstrated direct associations  with  clinical or
environmental toxicities. However, understanding of the
importance of many of these targets in normal physiology as
well as a variety of pathologies suggests that chemicals with
adequate potency against a target may represent a potential for
toxicity given sufficient  exposure.  As a  result, a standard
method of assessing the possible liabilities of chemicals during
their  development  as new drugs  is  to broadly screen the
candidates against a panel of molecular targets in HTS assays
(Whitebread  et al., 2005). Such an approach has also  been
suggested for evaluating environmental chemicals for potential
hazard (Dix et al., 2007). These panels vary in size as well as
in the degree of known associations with clinical side effects and
toxicities. A  number  of research services companies provide
access to hundreds of these assays. GPCRs, kinases, proteases,
nuclear receptors, phosphatases, phosphodiesterases and other
protein families are included in these panels. Unfortunately,
details on use of the  data  derived from screening with these
panels  have  not been made  readily  available since most
screening is done in conjunction with proprietary drug discov-
ery research in the private sector. However, several publications
demonstrated use of a panel of 92 ligand-binding  assays for
identifying relationships between molecular structure and broad
biological activity profiles (Fliri et al., 2005a) and relationship
to drug adverse side  effects  (Fliri et al.,  2005b). Further
investigation  of this type  of approach  is warranted given the
modest success  despite using  relatively crude  datasets,  e.g.
screening at  a single concentration, the limited number and
breadth of assays employed, and toxicity endpoints derived only
from  clinical side-effect  data  on commercial  drug  labels.
Several commercial databases have been developed containing
HTS data to  support development of computational models to
predict toxicity  including  Bioprint  (Mao  et  al., 2006)  and
RSMDB (http://www.novascreen.com/rsmdb.asp). Such mod-
els are  empirically  derived based on profiles  of known toxic
chemicals. Unfortunately, their general utility is difficult to
judge as they are not in the public domain. A growing
public database of informative HTS data is PubChem
(http://pubchem.ncbi.nlm.nih.gov/), the repository for screening
results from the National Institutes of Health's Molecular
Libraries Roadmap (Austin et al., 2004). Screening conducted
at 10 Molecular Libraries Screening Centers includes  a wide
range of targets and assay types that  eventually will include
many of interest to the toxicology community.  Currently,  a
collection of ~100,000 small molecules constitutes the screening
library and includes both compounds with known biological
activities as well as novel structures.

Complex cellular toxicity assays

   The ability to link activity against specific molecular targets
with toxicity  endpoints has significant limitations as evidenced
by the relatively few examples of clearly defined biochemical
assays for safety assessment. Future work likely will lead to
increased mechanistic understanding of various  toxicities and
may identify  additional targets  for which surrogate assays can
be used. However, these biochemical systems inherently lack the
complexity that may be required for mechanisms of toxicity to
  be manifested.  Such complexity  does exist in whole animal
  assays although  important species  differences often pose
  problems in extrapolation to human safety assessment. With
  the great interest in reduction of the number of animals used for
  toxicity testing due to costs and ethical concerns, much attention
  has been focused on use of cell-based assays for prediction of
  toxicity. Use of cellular models provides a much higher level of
  complexity than  simple biochemical  assays.  The emergent
  properties of cell system models may be more suitable for
  detecting toxicity not necessarily linked to single  molecular
  targets. In addition, these properties may serve to put activity
  against molecular targets in the context of normal in  vivo
  biology where robust homeostatic  networks provide a buffering
  capacity against chemical stresses to specific molecular entities.
  Finally, the relatively easy accessibility of cells and cell lines
  from a number of different species including humans facilitates
  comparison of chemical activities across species. Research of
  this nature may greatly help in understanding the reliability of
  extrapolation of results from test animal to human.
     Early work on use of cell-based assays for toxicity prediction
  used primarily  cytotoxicity endpoints such as ATP content,
  membrane leakage, or cell number to  screen for activity against
  a variety of cell lines. Such data are useful for identifying acutely
  toxic compounds, as demonstrated by the Multicenter Evaluation
  of In Vitro Cytotoxicity program carried out by the
  Scandinavian Society for Cell Toxicology (Walum et al., 2005).
  Cell lines, in particular human ones, showed good predictive
  ability for 50 acute human toxicants. Linear regression of IC50
  values for the human cell line against LD50 values in man
  obtained from clinical and forensic medicine handbooks gave a
  coefficient of determination of r2 = 0.69, vs 0.65 and 0.61 in
  mouse and rat in vivo studies, respectively. This was increased
  to r2 = 0.85 when toxicokinetic information on blood-brain
  barrier penetration was added for 32 of the chemicals. Such an
  approach, however, has significant drawbacks including lack of
  mechanistic  information and inability to  predict non-acutely
  toxic compounds. However, recent efforts in several areas have
  provided new opportunities to utilize  cellular models to predict
  toxicity and provide more insight into mechanisms of action.
  Through evaluation  of a wide variety of cell signaling and
  stress-response  pathways, activities  of chronic  toxicants are
  more likely to be detected since these complex networks include
  large  numbers  of potential toxicity  targets.  The impact  of
  automated, epifluorescence  microscopy  on development  of
  toxicity assays in support of this will be discussed below. In
  addition,  technological advances  permitting  novel high-
  throughput approaches will also be reviewed.
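
The coefficients of determination quoted above for in vitro versus in vivo potency come from simple linear regression. A minimal Python sketch of that calculation follows; it assumes the regression is performed on log10-transformed values (an assumption made here for illustration), and the paired values are invented rather than taken from the cited program.

```python
import math


def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x.

    For a one-predictor least-squares fit this equals the squared Pearson
    correlation between x and y.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


# Hypothetical paired potencies for five chemicals (arbitrary units)
ic50_in_vitro = [0.02, 0.5, 3.0, 40.0, 250.0]
ld50_in_vivo = [0.05, 0.4, 6.0, 30.0, 600.0]

log_ic50 = [math.log10(v) for v in ic50_in_vitro]
log_ld50 = [math.log10(v) for v in ld50_in_vivo]
print(round(r_squared(log_ic50, log_ld50), 3))
```

A high r2 on such data shows only that the rank and spacing of potencies agree; it carries no mechanistic information, which is the limitation noted in the text.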
     Microscopy in cell biology has long been a descriptive
  technology, limiting its utility in quantitative studies,
  particularly in large-scale ones. However, development of automated,
  epifluorescence imaging platforms and robust image analysis
  algorithms, together  termed high-content imaging (HCI),  has
  provided the means to do high-throughput, quantitative analysis
  of cellular phenotypic assays (Rausch, 2006). These assays
  gather data at the single cell level and report  information on
  many parameters, an approach often referred to as high-content
  screening (Giuliano et al., 1997). Among the many parameters
that can be quantified are cell signaling pathways, protein ex-
pression levels,  cell cycle  status,  receptor internalization,
cytoskeletal integrity, energy metabolism status, nuclear mor-
phology, post-translational protein modifications, cell move-
ment, and cell differentiation. These parameters are measured
using specific fluorescent stains,  fluorescently labeled anti-
bodies, and/or expression of GFP-tagged proteins. The imaging
platforms allow multiplexing of fluorescent endpoints with up to
four colors capable of being detected in a single well (Bertelsen,
2006). Although typically done as fixed endpoint assays, kinetic
analysis is also possible using instruments containing integrated
environmental chambers.  Application of this technology to
toxicology has paralleled  its development for drug discovery
uses in recent years.
   A commonly employed application of HCI is to determine
the effect of chemicals on a set of parameters measuring cell
homeostasis. A  variety of endpoints can be measured  and
multiplexing permits efficient  screening of chemicals  and
increases  sensitivity towards detection of toxic  compounds
(O'Brien et al., 2006). By combining measurement of cell
number and nuclear morphology (Hoechst 33342 dye), intra-
cellular calcium content (fluo-4 dye), mitochondrial membrane
potential (TMRM dye) and membrane permeability (TOTO-3
dye), sensitivity in detecting 243 hepatotoxic chemicals out of a
collection of 611 compounds was 93%, a significant improvement
over the <25% obtained using conventional cytotoxicity
endpoints. In this study, human HepG2 hepatoblastoma cells
were used with exposure to compounds optimized to three days.
Of the endpoints measured, cell proliferation was  the most
sensitive endpoint,  perhaps reflecting the complexity of the
physiological processes regulating it, followed  by mitochon-
drial  membrane  potential and  nuclear  morphology,  with
membrane permeability and calcium levels being least sensitive
(O'Brien et al., 2006). The complementary nature of these multiple
endpoints reflects different mechanisms of toxicity for the
chemicals tested and resulted in the high sensitivity  obtained.
This study did not directly address mechanistic interpretation of
the results obtained; rather it presented an in  vitro  surrogate
system for detecting compounds likely to cause hepatotoxicity
in vivo.
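
The gain in sensitivity from multiplexing can be sketched in a few lines of Python. The sketch assumes a simple decision rule in which a compound is called positive if any one endpoint is active; that rule, the endpoint names and the five example compounds are hypothetical and are not the scoring scheme of the cited study.

```python
def call_positive(endpoint_hits):
    """Call a compound positive if any measured endpoint is active."""
    return any(endpoint_hits.values())


def sensitivity_specificity(calls, is_toxic):
    """Return (sensitivity, specificity) for boolean calls against known labels."""
    tp = sum(1 for c, t in zip(calls, is_toxic) if c and t)
    fn = sum(1 for c, t in zip(calls, is_toxic) if not c and t)
    tn = sum(1 for c, t in zip(calls, is_toxic) if not c and not t)
    fp = sum(1 for c, t in zip(calls, is_toxic) if c and not t)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical results: (endpoint activity, known hepatotoxicant?)
compounds = [
    ({"proliferation": True, "mito_potential": False, "permeability": False}, True),
    ({"proliferation": False, "mito_potential": True, "permeability": False}, True),
    ({"proliferation": False, "mito_potential": False, "permeability": False}, True),
    ({"proliferation": False, "mito_potential": False, "permeability": False}, False),
    ({"proliferation": False, "mito_potential": False, "permeability": True}, False),
]

truth = [toxic for _, toxic in compounds]
multiplex_calls = [call_positive(hits) for hits, _ in compounds]
single_calls = [hits["permeability"] for hits, _ in compounds]

print("multiplexed:", sensitivity_specificity(multiplex_calls, truth))
print("permeability only:", sensitivity_specificity(single_calls, truth))
```

An "any endpoint" rule can only raise sensitivity relative to a single endpoint, but each added endpoint is also a new source of false positives, so specificity should be tracked alongside it.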
   Additional HCI endpoints would likely provide insight into
specific mechanisms of toxicity. For example, reactive oxygen
species generation can be determined with the use of the dye
dichlorofluorescein (Phillips et al., 2005). Nuclear morphology
is easily measured with Hoechst dye. Interference with the cell
cycle  can be measured by monitoring the expression of cell
cycle-specific proteins,  e.g.  cyclins  (Gasparri  et al.,  2006).
Mitotic index assays can be performed by labeling phosphory-
lated histone proteins (Gasparri et al.,  2004). Apoptosis can be
detected  with multiplexed endpoints including nuclear con-
densation and caspase activation (Fennell et al., 2006). Out-
growth  of neurites  from  neuronal precursor cells (Richards
et al., 2006) or cell lines can be monitored and used to determine
potential  neurotoxicity demonstrated  by selective  activity
against neurites relative to other cytotoxic effects. As previously
mentioned in the genotoxicity assay section,  micronucleus
detection as a measure of clastogenicity is feasible in a relatively
    high-throughput format (Lang et al., 2006). With all of these
    phenotypic assays, multiple endpoints can be combined in a
    single  assay provided they are within the spectral limitations
    of the  fluorescent  dyes used  and  instrument capabilities.
    Since chemicals with similar mechanisms of toxicity would
    be expected to have similar cellular phenotypic effects, mea-
    surement of HCI  endpoints  can be used as a classification
    system. Quantification and statistical clustering of a large number
    of endpoints resulted in successful classification of 45 of 51
    chemicals as to mechanism of action for a diverse set of
    chemicals representing at least 12 different mechanism classes,
    e.g. actin poisons and topoisomerase II inhibitors (Adams et al.,
    2006).
       Combining  the ability to  quantify  phenotypic  changes
    relating to toxicity with the emerging opportunity to interrogate
    primary human cells may yield assays of high value as in vitro
    surrogate toxicity screens. A number of features of HCI make
    this an ideal technology for use  with primary cells. Relatively
    few  cells are required  as typically only  100-500  cells are
    counted  per well, accommodating the  often  very limited
    availability of primary cells as well as batch-to-batch variability
    issues  (Borchert et al., 2006;  Mayer et al.,  2006).  Many
    endpoints measured by HCI rely on antibody  staining or
    fluorescent dyes of endogenous proteins, thus eliminating the
    need  for protein  over-expression or  other  cell  engineering
    tactics. In recent years,  a large variety  of primary human (and
    other species) cells have been made available commercially.
    Advances in use of pluripotent stem cells differentiated  into
    desirable target cell types  such as  cardiomyocytes present
    opportunities for providing ready supplies of appropriate cell
    types of interest for testing (Sartipy et al., 2006).  In addition,
    more complex cell culture systems may be possible containing
    two  or more  different  cell types to better represent in vivo
    physiology. Morphological or specific antigenic staining can be
    used to distinguish between the populations. Such a system, for
    example, might represent the interaction of liver Kupffer cells
    and hepatocytes, perhaps critical for understanding hepatocar-
    cinogenesis  and providing  a means  to  measure effects of
    chemicals on cell-cell interactions not  seen  in single  cell
    cultures  (Roberts et al.,  2007).
       Another strategy that has generated mechanistic understanding,
    using an HTS panel of cell proliferation assays, has
    been employed for more than a decade at the National Cancer
    Institute  (NCI). Profiling of tens of thousands of chemicals for
    growth inhibitory activity across the National Cancer Institute's
    ~80 tumor cell line screen created an information-rich database
    useful for understanding mechanism  of action (Shoemaker
    et al.,  2002). Chemicals were clustered based on  structural
    information and biological data using a self-organizing map
    (SOM) statistical approach, resulting in groups of compounds
    with similar cytotoxicity profiles. As an example of the use of
    this technique for understanding mechanism, a cluster containing
    a number of known inhibitors of mitochondrial complex I of
    the electron transport chain was studied (Glover et al., 2007).
    Ten chemicals with unknown mechanism of action were selected
    from the cluster, and five of these were subsequently shown to be
    potent inhibitors of complex I activity.
   Berg et al. (2006) followed a similar strategy through use of
an HTS panel of four human cell-based assays consisting of
primary endothelial  and peripheral blood mononuclear cells,
alone or in co-culture,  and measured multiple inflammation-
related endpoints, primarily by ELISA. In testing of a set of 44
compounds consisting of a wide variety  of pharmacological
tools  and  drugs, they  successfully grouped compounds by
mechanism of action, e.g. modulators of NF-κB signaling or the
phosphatidylinositol 3-kinase/Akt signal transduction pathway,
through hierarchical clustering and function similarity mapping
of the activity profiles in the assays. This included successful
classification of compounds with  mechanisms unrelated to
inflammatory system modulation. They attributed their success
to use of primary cells that maintain appropriate regulatory
processes and signaling pathway interactions.  This  approach
seems well-suited  towards mechanistic  understanding  of
toxicity. Development of additional complex cellular models
and profiling of larger libraries of pharmacological tools would
provide a resource for understanding and predicting potential
toxicities for new chemicals by highlighting  specific mechan-
isms affected by the  tested compounds.
   Similarly, MacDonald et al.  (2006) profiled the activity of
107 drugs and pharmacological tools in 49 different human cell-
based assays at several time-points generating a matrix of 127
different measurements  for  each  compound. The  assays
consisted of cellular  reporter assays using protein complemen-
tation technology for diverse protein interactions within  cells
including those involved in cell-cycle, apoptosis, mitogenesis,
proteolysis, GPCR  signaling,  cytoskeletal  function, DNA
damage, nuclear receptor signaling, stress and inflammation.
Data  were analyzed by  clustering  and  revealed known
mechanisms of action as well as novel activities for some of
the drugs. A subset of the assays was shown to be predictive for
antiproliferative activity.
   An additional approach for cell-based toxicity screening for
characterizing mechanism of toxicity used dynamic monitoring of
cytotoxicity by microelectronic sensors in 96- or 384-well arrays
(Kirstein et al., 2006). The assay determines changes in cell status
including viability, morphology  and adherence by measuring
electrical impedance in real-time.  This approach has  several
attractive features including a label-free detection system and the
ability to use any attached  cell  type. Testing of 11  reference
toxicants on the mouse fibroblast cell line BALB/c 3T3 yielded a
correlation coefficient of r2 = 0.98 comparing in vitro IC50 with
mouse in vivo LD50 values (Xing et al., 2006). Prediction of acute,
in vivo toxicity by this method is likely limited to classes of
chemicals with direct effects on target cells. Beyond prediction
of acutely toxic effects of chemicals, the specific
kinetic patterns of response to different toxicants over a 72 h
period provide insight into possible mechanisms of action by
comparison to patterns induced by compounds of known
mechanism of action (Xing et al., 2005; Solly et al., 2004).

Model organism toxicity assays

   While complex cell culture systems of primary  cells  can
provide unique insights into in vivo  pathophysiology, they will
  never completely model the higher level interactions present in
  an intact organism. For that reason,  efforts have been made to
  adapt HTS technologies to assays using intact model organisms
  that may yield important mechanistic
  information. Model organisms used include the yeast
  Saccharomyces cerevisiae (Fairn et al., 2007); an invertebrate,
  the nematode Caenorhabditis elegans (Anderson et al., 2004);
  and a vertebrate, the zebrafish Danio rerio (Zhang et al.,
  2003).
     Ease of growth, a completely sequenced genome and
  availability of a wide range of genetic mutants have resulted in
  extensive use of S. cerevisiae for understanding physiological
  and pathophysiological  processes in higher level organisms.
  Several recent approaches demonstrated the utility of yeast in
  examining mechanisms of toxicity. Chemogenomic profiling in
  S.  cerevisiae,  a  method that determines chemical  sensitivity
  using a whole genome screen, was used to identify the ability of
  an apoptosis-inducing chemical, the  isoprenoid farnesol, to kill
  cells  through  generation of  reactive  oxygen  species by the
  electron transport chain. The Pkc1 signaling pathway was shown
  to mediate farnesol-induced cell death through modulation of the
  generation  of reactive oxygen species (Fairn  et al.,  2007).
  Parsons et al. (2006) used a complete library of yeast haploid
  deletion mutants (~5000 strains) to test for hypersensitivity to
  82 chemicals and natural product extracts which included many
  drugs and  other pharmacological agents. Screening of 5000
  strains was  done efficiently through parallel treatment using
  large  pools  of deletion  strains with identification of specific
  strains determined by  molecular barcodes, unique  20-mer
  oligonucleotide tags associated with the specific gene disruption
  and quantitated by hybridization to oligonucleotide arrays
  complementary to the barcode sequences (Giaever et al., 2002).
  Using both hierarchical clustering and a factorgram method that
  allows chemicals to be included in multiple sets, chemicals with
  similar modes-of-action  were grouped  and correlated  with
  effects on specific cellular pathways and protein targets. Crude
  natural product extracts were also demonstrated to display the
  activity profile of their constituent active component suggesting
  utility in screening mixtures and poorly characterized samples.
  The same method was applied to examining the activity of 12
  chemicals with DNA-damaging activity (Lee et al., 2005). Both
  known and novel functional interactions were discovered by this
  screening method and  distinctions between types of DNA
  damage and response shown.
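
The pooled, barcode-based screen described above reduces to comparing each strain's barcode abundance with and without chemical treatment. The following Python sketch shows one simple way to score that comparison, as a log2 ratio of normalized abundances. The pseudocount, the cutoff, the strain names and the signal values are all hypothetical, and this is not the scoring method of the cited studies.

```python
import math


def log2_depletion(control, treated, pseudocount=1.0):
    """Per-strain log2(control fraction / treated fraction) of barcode signal.

    Positive scores mean the strain was depleted from the treated pool,
    i.e. the deletion made the cells more sensitive to the chemical.
    """
    control_total = sum(control.values())
    treated_total = sum(treated.values())
    scores = {}
    for strain, c_signal in control.items():
        t_signal = treated.get(strain, 0.0)
        c_frac = (c_signal + pseudocount) / control_total
        t_frac = (t_signal + pseudocount) / treated_total
        scores[strain] = math.log2(c_frac / t_frac)
    return scores


def hypersensitive_strains(scores, cutoff=1.0):
    """Strains whose depletion score meets an (arbitrary) cutoff."""
    return sorted(s for s, v in scores.items() if v >= cutoff)


# Hypothetical barcode signals for three deletion strains in one pool
control_pool = {"strain_A": 1000.0, "strain_B": 1000.0, "strain_C": 1000.0}
treated_pool = {"strain_A": 1000.0, "strain_B": 100.0, "strain_C": 1000.0}

scores = log2_depletion(control_pool, treated_pool)
print(hypersensitive_strains(scores))
```

Repeating this for each chemical gives a strain-by-chemical score matrix, which is the kind of input that clustering methods then group by mode of action.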
     C.  elegans is a well-studied organism easily grown in the lab
  on agar or in liquid medium and fed with bacteria. All 959
  somatic cells have been characterized with respect to lineage
  and, importantly for screening, are  visible  by  microscopy.
  Extensive genetic manipulation of the worm has made a large
  number of mutant strains available (C. elegans WWW
  Server: http://elegans.swmed.edu/), and C. elegans is conducive
  to RNAi studies simply by soaking the worm in dsRNA or
  feeding with bacteria containing specific plasmids (Kamath and
  Ahringer, 2003). Thus hypersensitivity and hyposensitivity to
  specific toxicants can be linked to over- or under-expression of
  genes providing  a link to toxic mechanism (Lindblom et al.,
  2001;  Liao and Yu, 2005).  For  example,  arsenic-induced
oxidative stress is more toxic to γ-glutamylcysteine synthetase
(GCS-1) mutant C. elegans than to wild-type demonstrating the
role of glutathione  in protecting  from  arsenate-induced
oxidative stress. Quantitative handling of the worms as well
as  detection of  assay endpoints  are  key factors  limiting
screening to modest throughput. Development of the COPAS
Biosorter, a flow-cytometer modified for larger objects, allowed
sorting of up to 100 worms per second as well as fluorescent
analysis of signals along the worm's axis providing a means to
automate the assays (Pulak, 2006).  Adaptation of HCI instru-
mentation to measuring endpoints  of C. elegans  assays is a
likely future development in this field.
   D. rerio, the zebrafish, as well as other fish species such as
the fathead minnow, have been used for environmental toxicant
testing for many years in the role of a sentinel species (Mizell
and Romig, 1997). Although these tests were informative as to the
presence of toxicants, little mechanistic information resulted. More recently,
the zebrafish has been used as a model organism for a wide
variety of research including drug discovery  and toxicology. In
particular, the transparent embryo of the zebrafish allows
changes to organ morphology to be detected with a dissecting
microscope  with  more detailed  information  obtained by
visualization of serial  sections stained with specific antibodies
or by in situ hybridization. Using this  approach, Haendel et al.
(2004)  demonstrated  malformations,  in  particular a twisted
notochord, induced by the dithiocarbamate pesticide metam
sodium and correlated this with altered expression of collagen
2a1 in the notochord sheath. Further work that allowed
identification of a target in a pathway regulating collagen 2a1
synthesis could then provide an opportunity  to compare across
species the target's existence and function as well as the potency
of the chemical for the target relative to real-world exposure
levels.  Parng  et al.   (2007) applied immunostaining and
morphometric analysis to  studying  neurotoxicity in  zebrafish
and distinguished effects of a variety of human neurotoxicants
on different neuron populations. Relative to studies in larger
vertebrates, these studies are inexpensive and capable of testing
moderate numbers of chemicals, although not in the traditional HTS
range; they remain laborious and are generally not applied to
large chemical libraries. In order to improve throughput of
scoring phenotypic endpoints in toxicity  assays, transgenic
embryos expressing GFP in specific locations utilizing tissue-
specific promoters  have been generated. Transgenic fish are
relatively easily made by injection of DNA vectors at the one- or
two-cell stage. This  permits screening  for effects on the labeled
tissue using automated fluorescent microscopy of fish embryos
in microtiter plates  (Burns et al., 2005). Effects  on the devel-
oping heart can be determined through use of a transgenic
zebrafish expressing GFP in the myocardium and measuring
heart rate by an HCI system. In addition to genetic manipulation
of the zebrafish, a large collection of mutant strains of the fish
has been assembled (http://zfin.org/zf_info/dbase/db.html) and,
through  comparison  to  effects induced  by toxicants,  may
provide clues to mechanisms of toxicity. Zebrafish can also be
injected with morpholino oligonucleotides permitting a reverse
genetics approach potentially useful in elucidating mechanisms
of action (Nasevicius  and Ekker, 2001).  As with C. elegans,
    high-throughput approaches have just begun to be adapted for
    screening  of zebrafish.  The  ability to  interrogate intact,
    vertebrate organisms, particularly during developmental stages,
    gives access to a wide variety of toxicities perhaps detectable
    only in whole animals. Elucidating the mechanisms involved,
    evaluating that mechanism across species, and testing across
    species against  any molecular targets identified will greatly
    enhance risk assessment.

    Data analysis

       As discussed above, many methods exist for efficiently and
    relatively inexpensively  generating  data  by high-throughput
    means on large collections of chemicals. Such data in isolation
    have limited  utility.  Only through integration  and analysis of
    these data is meaningful understanding of mechanisms of
    toxicity likely to result. Towards this goal, two general analysis
    methods have been applied. The first, classification based on
    similarity of bioactivity  profiles to those of chemicals of
    known toxic mechanism, relies on a relatively large number of
    screening assays to ensure comprehensive  coverage of possible
    mechanisms.  The idea  is based  on  the assumption  that
    mechanistically  related toxic  chemicals  will  display similar
    patterns of biological activity recognized through characteristic
    signatures in in vitro tests. On the other hand, the converse of this
    idea, i.e. that compounds with similar in vitro signatures will have
    similar in vivo toxicities, does not necessarily follow, because
    higher-level biological interactions are absent from these in vitro
    assays. Exhaustive screening of all protein targets is not
    practical. However, because polypharmacology, i.e. the ability
    of chemicals to bind to more than one protein target, is common
    and occurs most often between members of the same protein
    family,  sufficient representation of protein families in the
    screening panel may provide an adequate  sampling of specific
    targets or their related family members to develop signatures
    (Paolini et al., 2006).  Chemicals with known mechanism of
    toxicity that are used as reference standards will likely play a key
    role in interpreting these data. Profiles of biological activity for
    these reference chemicals anchored to in vivo toxicity endpoints
    can serve as classifiers to suggest potential hazard for chemicals
    of unknown toxicity. Discovery  of novel patterns  of activity
    could suggest new mechanisms  of action outside  the list of
    reference compounds and trigger new research hypotheses.
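The profile-matching idea described above can be illustrated with a minimal sketch. It assumes each chemical is summarized as a vector of assay responses and that cosine similarity is an acceptable distance measure; the reference profiles, mechanism labels, and the 0.8 cutoff are all hypothetical and are not taken from the studies cited here. As noted above, a match only suggests a hypothesis, since similar in vitro signatures do not guarantee similar in vivo toxicity.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length activity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-inactive profile matches nothing
    return dot / (norm_a * norm_b)


def suggest_mechanism(query, references, min_similarity=0.8):
    """Return (mechanism, similarity) for the closest reference profile,
    or (None, similarity) when no reference is similar enough."""
    best_name, best_score = None, 0.0
    for name, profile in references.items():
        score = cosine_similarity(query, profile)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < min_similarity:
        return None, best_score
    return best_name, best_score


# Hypothetical reference chemicals: fraction of maximal response in five assays.
REFERENCES = {
    "mitochondrial uncoupler": [0.9, 0.1, 0.0, 0.8, 0.0],
    "estrogen receptor agonist": [0.0, 0.9, 0.7, 0.0, 0.1],
}

unknown = [0.8, 0.0, 0.1, 0.9, 0.0]  # hypothetical chemical of unknown toxicity
mechanism, score = suggest_mechanism(unknown, REFERENCES)
print(mechanism, round(score, 2))
```

A profile that falls below the cutoff for every reference returns no label, which corresponds to the case above where a novel pattern of activity may point to a mechanism outside the reference set.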
       Alternatively,  a quantitative determination of the effect of
    individual chemicals on molecular targets and cellular signaling
    pathways, coupled with information on phenotypic endpoints,
    can be applied in a cell systems toxicology approach. These
    hypothesis-driven datasets would  be  integrated  based on
    knowledge of cellular  and higher  level systems responses
    using computational and informatics tools from systems biology
    research (Fig.  1).  Such  work  would help  define  toxicity
    pathways and could lead to development  of cellular response,
    pathway-based  assays  that  would serve as  key targeted-
    information screening assays of the type described by the
    National Research Council (NRC, 2007). These pathway assays
    would report perturbation of components of the pathways by
    tested chemicals suggesting potential for hazard and increasing
K.A. Houck, R.J. Kavlock / Toxicology and Applied Pharmacology 277 (2008) 163-178
[Figure 1 panels. Cell responses: proliferation, apoptosis, mitochondrial function, ERK pathway, Wnt pathway, NFkB translocation, cytokine 1 release, cytokine 2 release, Ca2+ release, SMAD translocation, FKHLR nuclear translocation, etc. Biochemical targets: kinases 1-3, GPCRs 1-3, ion channels 1-2, PDE1, PDE2, nuclear receptors 1-2, etc. Both panels feed into: Mechanism.]
Fig. 1. Cell systems toxicology. Chemicals are screened through panels of biochemical targets and cellular phenotypic screens. Data are integrated using systems
biology tools to identify likely mechanisms of action. (Illustration used with permission from BioCarta.)
priority for  selective in vivo  toxicity testing. Quantitative
modeling of affected systems, assuming availability of sufficient
information, would then provide key, mechanistically relevant
input into rational risk assessment for the toxicant. Such
modeling would take into account the pathway being perturbed
not in isolation, but as  part of a dynamic regulatory system
composed of multiple interacting pathways. This systems ap-
proach may reveal unexpected environmental effects on human
disease as well as the presence of compensatory responses that
maintain cellular homeostasis despite environmental stress.

Experimental  considerations

   As with all  high-throughput assays, there  are a number of
important  caveats  that  must  be  considered in  designing
screening strategies  and  interpreting results. Data quality is of
utmost importance. The "garbage in, garbage out" lesson holds
very true  for HTS data and has serious implications for  at-
tempting to screen for potential for hazard or illuminate mech-
anistic understanding of results. For this reason, all HTS testing
needs to be conducted under stringent, well-validated condi-
tions that include determination of Z'-scores, signal-to-back-
ground, reproducibility, plate positional effects and appropriate,
biologically relevant controls  (Zhang et  al.,  1999). The  Z'
statistic is  widely used  in  HTS assay validation  and relates
variability around the minimum and maximum signals of the
assay to the assay window size (Mean Maximum Signal - Mean
Minimum Signal) (Fig. 2). Values of > 0.5 indicate suitability of
the assay for HTS applications. Assays with lower values can
                                    still be used but with likely significantly higher false positive
                                    and negative results (Zhang et al., 1999). Signal-to-background
                                    should generally be as large as possible using biologically
                                    relevant  controls  to  determine  the window. Day-to-day
                                    reproducibility is best determined by measuring IC50 or EC50
                                    values from independent runs and calculating variability. Plate
                                    positional effects are best detected with analysis of variance of
                                    rows and columns combined with good data visualization tools.
                                    In addition, data quality of HTS assays is subject to a variety of
                                    artifactual interferences potentially resulting in false positives


                                    [Fig. 2 plot: signal (relative units) versus well number (0-90) for the
                                    Min and Max control wells.
                                    Z' = 1 - [(3*STDmax + 3*STDmin)/(AVGmax - AVGmin)] = 0.76]

                                    Fig. 2. Example of calculation of Z'-statistic for validation of an HTS screening
                                    assay. Data are derived from two 96-well plates; one with solvent control (Min)
                                    and one with positive control (Max). The Z' of 0.76 indicates an assay highly
                                    suitable for HTS. No plate positional effects are observed.
Table 2
Common modes of interference with HTS assays

Assay type | Interference | Cause | Accommodation
Fluorescence, luminescence | Inner filter effect (quench of signal) | Compound or substrate absorption of excitation or emission light from tracer | Read of compound alone; use of red-shifted dyes; secondary assay in alternative format; reduce screening concentration
Fluorescence | Autofluorescence | Compound or other assay component emission of light in range of assay signal | Read of compound alone; use of red-shifted dyes; secondary assay in alternative format
Fluorescence, luminescence, colorimetric | Light scattering (quench of signal) | Compound or other assay component insolubility | Reduce screening concentration; increase organic solvent concentration in assay buffer
AlphaScreen | Signal quench | Singlet oxygen quench | Test in control AlphaScreen assay not requiring target; secondary assay in alternative format
Enzymatic inhibition | Non-specific loss of signal | Aggregation of enzyme by compound | Inclusion of detergent in assay buffer; reduce screening concentration
Luminescent reporter | Non-specific loss of signal | Direct inhibition of luciferase enzymatic activity | Run control assay for inhibition with added recombinant luciferase; secondary assay in alternative format
Fluorescent imaging | Loss of signal | Photobleaching | Use more stable dye
Cell-based assay | General cytotoxicity | Varied | Run control assay for cell viability, e.g. ATP content assay; reduce screening concentration

and false negatives. Table 2 lists some of the most common of
these problems. Strategies to deal with specific issues must be
incorporated into screening programs. Beyond assay validation
and interference concerns, there are several other issues leading
to false positives and false negatives which must be  kept in
perspective (Table  3). As previously mentioned, pro-toxicants,
i.e. those requiring biotransformation to a toxic metabolite, will
require  a  metabolic  activation  system  to  detect  activity.
Technological advances are urgently needed in this  area. Poor
aqueous solubility, particularly for many environmental chemi-
cals which are often  very lipophilic, is also of great concern
since screening assays are  usually conducted in an aqueous
buffer.
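The Z' statistic and the signal-to-background ratio discussed under Experimental considerations can be computed directly from control wells. The sketch below follows the formula shown with Fig. 2, Z' = 1 - (3*SDmax + 3*SDmin)/|AVGmax - AVGmin|, and uses sample standard deviations; the well readings are invented for illustration and are not the data behind Fig. 2.

```python
from statistics import mean, stdev


def z_prime(max_signals, min_signals):
    """Z' = 1 - (3*SD_max + 3*SD_min) / |mean_max - mean_min|."""
    window = abs(mean(max_signals) - mean(min_signals))
    if window == 0:
        raise ValueError("assay window is zero; Z' is undefined")
    spread = 3 * stdev(max_signals) + 3 * stdev(min_signals)
    return 1 - spread / window


def signal_to_background(max_signals, min_signals):
    """Ratio of the mean positive-control signal to the mean solvent-control signal."""
    return mean(max_signals) / mean(min_signals)


# Hypothetical readings from positive-control (Max) and solvent-control (Min) wells.
max_wells = [2500, 2600, 2400, 2550, 2450]
min_wells = [500, 520, 480, 510, 490]

z = z_prime(max_wells, min_wells)
sb = signal_to_background(max_wells, min_wells)
print(f"Z' = {z:.2f}, S/B = {sb:.1f}")
print("suitable for HTS" if z > 0.5 else "expect more false positives and negatives")
```

The 0.5 cutoff in the last line is the suitability threshold quoted in the text; in practice each control group would come from a full plate rather than five wells.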
   The potential problems described above can be of very high
concern  in the  field  of toxicology  where  false negative
screening results may cause significant toxicities and mechan-
istic understanding to be overlooked. Development of assays
with both high sensitivity and high specificity is the obvious
goal for such  research, although  this is usually difficult to
achieve. In drug discovery, false negative rates are treated as
less critical since the goal is to identify a viable starting point for
drug development, not find every compound with activity at the
target. False positive rates in drug discovery present a problem
                                            only to the extent that actives become too numerous to
                                            follow up on thoroughly, resulting in missed opportunities
                                            for finding the rare, true positives.  For toxicity screening, false
                                            negative rates need to be minimized, often at the cost of lowered
                                            specificity. This can be accomplished with highly robust assays,
                                            testing samples in replicate, appropriate chemical handling, and
                                            testing at multiple concentrations. Within reasonable limits, the
                                            lowered specificity usually accompanying increased sensitivity
                                            can be handled by two methods:  1)  confirmation assays that
                                            eliminate  false positives resulting  from statistical variation or
                                            procedural errors, and 2) testing against the same target in an
                                            alternative format to eliminate artifacts due to interference with
                                            the assay technology. As further insurance against false
                                            negatives, redundancy can be built into an overall screening
                                            strategy.  For example,  screening against  a  wide  panel  of
                                            molecular targets in combination with cellular pathway screens
                                            that encompass many  of these same targets in a cellular context
                                            gives  multiple,  independent opportunities  to detect  activity.
                                            Current limitations of HTS approaches that require technical
                                            solutions include the lack of biotransformation capacity to
                                            metabolize a chemical to a toxic form or, conversely, to
                                            inactivate it, and difficulty in handling volatile or aqueously
                                            insoluble chemicals.
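The trade-off between sensitivity and specificity when testing in replicate can be made concrete with a small worked calculation. It assumes, as a simplification, that replicates err independently and that a chemical is called active if any replicate is active; the per-test error rates below are hypothetical.

```python
def any_hit_error_rates(false_negative_rate, false_positive_rate, replicates):
    """Error rates when a chemical is called active if ANY replicate is active.

    Assumes replicates err independently, which real assays only approximate.
    """
    # A true active is missed only if every replicate misses it.
    fn = false_negative_rate ** replicates
    # A true inactive is flagged if at least one replicate is falsely active.
    fp = 1 - (1 - false_positive_rate) ** replicates
    return fn, fp


for n in (1, 2, 3):
    fn, fp = any_hit_error_rates(0.10, 0.02, n)
    print(f"{n} replicate(s): false negative {fn:.4f}, false positive {fp:.4f}")
```

Under these assumed rates, three replicates reduce the false negative rate from 10% to 0.1% while raising the false positive rate from 2% to about 5.9%. The additional false positives are the ones that the confirmation and alternative-format assays described above are meant to remove.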
Table 3
Critical issues with HTS assays for toxicity screening

Issue | Cause | Accommodation
False negative | Lack of biotransformation to active metabolite | Include biotransformation system in assay
False negative | Poor aqueous solubility of chemical | Determine solubility with nephelometer; increase organic solvent in assay (if tolerated); reduce screening concentration
False negative | Poor solubility of chemical stock in standard solvent | Change solvent and ensure compatibility with assay
False negative | Testing concentration too low | Increase testing concentration to limits of solubility
False negative | Lack of appropriate assay | Expand potential target coverage with broader panels or complex cellular assays
False negative | Statistical issue | Use appropriately validated assays; replicate testing; use concentration-response testing format
False positive | Testing concentration too high | Use concentration relevant to exposure scenarios; interpret quantitative result in the context of relevant exposure scenarios
False positive | Lack of appropriate biological complexity for adaptation to toxicant stress | Develop/use more complex cellular assays
False positive | Lack of biotransformation to inactive metabolite | Include biotransformation system in assay

   With a myriad of assay possibilities, how can toxicity
screening be efficiently utilized to recognize toxicants and shed
light on mechanism of action? One possibility would be as
illustrated in Fig.  3. Initial screening would occur  with a
comprehensive panel of cellular assays covering cell response/
signaling pathways and cell health parameters. Testing concentrations
and interpretation of concentration-response results
would ideally be informed by realistic exposure scenarios. Use
of negative control compounds such as Generally Recognized As
Safe (GRAS) substances would help provide confidence in
positive findings. Model organism assays that capture complex
biology not detectable in simpler formats may be
included here as well. These pathway assays would likely have
varying levels of linkage to known toxicity outcomes based on
[Figure 3 flow scheme. Chemicals are sorted by aqueous solubility versus aqueous insolubility and by volatile versus non-volatile, with special handling required for the insoluble and volatile branches. Testing at exposure-relevant concentrations covers cellular response pathways, cell health panels, and model organisms, giving an active or inactive call. Inactive: predict biotransformation (no metabolites: low priority; metabolites: consider testing metabolites). Active: protein superfamily panel biochemical screens (kinases, phosphatases, GPCRs, transcription factors, NRs, proteases, CYPs, DNA modifying enzymes, other enzymes). Integrate results; mechanism hypothesis. Exposure determination, PB/PK modeling, and selective animal testing lead to risk assessment.]

                                         Fig. 3. Chemical toxicity testing flow scheme.
testing  of reference  toxicants.  Experience  and  knowledge
gained over time would demonstrate which  of the pathways
can be confidently called toxicity pathways. Chemicals with
activity in one or more of these assays would then pass to a large
screening panel of in vitro, biochemical targets. Activity pat-
terns in these assays would lead to hypotheses about mechanism
of action. The  strength of the evidence linking the  in  vitro
results with in vivo toxicity would drive decisions about the
necessity  of additional  selective biochemical, cellular and/or
animal testing. Animal testing may also be required to under-
stand pharmacokinetic parameters that may influence toxicity.
Together,  these data would provide important qualitative and
quantitative information to be  used for a  risk  assessment
analysis of the chemical. The capability to carry out most of this
effort currently exists although means to deal  with biotransfor-
mation, highly volatile chemicals, and poorly aqueously soluble
chemicals are still needed.
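The testing flow described above and in Fig. 3 can be read as a simple decision procedure. The sketch below is one possible reading, with the branch order inferred from the figure labels and the accompanying text; the `Chemical` record, its field names, and the step wording are hypothetical and are not part of any published workflow.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chemical:
    """Hypothetical summary of what is known about one test chemical."""
    name: str
    water_soluble: bool
    volatile: bool
    active_in_cell_assays: bool
    predicted_metabolites: List[str] = field(default_factory=list)


def triage(chem: Chemical) -> List[str]:
    """List the testing steps suggested by the flow scheme, in order."""
    steps = []
    # Volatile or aqueously insoluble chemicals need special handling first.
    if chem.volatile or not chem.water_soluble:
        steps.append("special handling required")
    steps.append("screen in cellular pathway, cell health and model organism assays")
    if chem.active_in_cell_assays:
        # Actives pass to the broad biochemical target panel.
        steps.append("screen against protein superfamily biochemical panels")
        steps.append("integrate results into a mechanism hypothesis")
    elif chem.predicted_metabolites:
        # Inactive parent, but biotransformation may yield active products.
        steps.append("consider testing metabolites")
    else:
        steps.append("low priority")
    return steps


example = Chemical("chemical X", water_soluble=False, volatile=False,
                   active_in_cell_assays=True)
for step in triage(example):
    print(step)
```

The later stages of the scheme (exposure determination, PB/PK modeling, selective animal testing, and risk assessment) depend on expert judgment about the strength of the evidence and are deliberately left out of the sketch.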

Challenges of HTS application to toxicology

   As recently noted by the National Research Council (2007),
toxicology is reaching a pivotal moment. The emergence of a
plethora of advances in molecular biology,  cell culture tech-
niques, computer science and bioinformatics is providing  an
opportunity to acquire information on the cellular and molecular
pathways  affected by chemicals in ways  unimagined only a
decade ago. Large numbers of biological targets as identified in
this review can  be probed for interactions with  hundreds to
thousands of chemicals in short periods  of time. The NRC
envisions  a future  of toxicology that includes multi-faceted
exploration  of  chemical characterization  (e.g. physical and
chemical properties, environmental fate and transport, metabo-
lism,  and interaction  with  cellular components),  toxicity
pathways (defined as 'cellular response pathways that, when
sufficiently perturbed in an intact animal, are expected to result
in adverse health effects') and targeted in vivo testing (to further
explore and quantify information obtained by exploring toxicity
pathways).  Combined, the  NRC  foresees  these elements
providing the basis for a more informed assessment of human
health risks based on a deeper understanding of the mode of
action by which toxic effects are induced, including  the key
molecular and other biological targets in the pathways.
   Many of the new tools now available to toxicology origi-
nated in the pharmaceutical industry as they strove to streamline
the process of drug discovery and development. While overall
assessments of the  success of that industry are  cloaked  in
confidential business information, this review has identified a
number of efforts  that have at least provided preliminary
glimpses of their potential to detect toxicity pathways.  Never-
theless, there are numerous  differences  and challenges  in
translating the experience of drug development to environmental
toxicity. Notable among these are: (1) drugs are developed as
biologically  active  compounds, and hence may be more
amenable to such an approach than environmental chemicals,
many of which have no intended biological activity;  (2) the
chemical space covered by drug development is considerably
narrower than that of environmental chemicals, which do not
    have to display particular ADME characteristics that make drugs
    bioavailable and efficacious; (3) the  metabolism of drugs is
    generally well characterized, as is the activity of any metabolites;
    (4) selection of chemical libraries used for drug discovery
    includes consideration of solubility, something not necessarily of
    importance for environmental chemicals; (5) while there may
    be only 400-500 druggable targets, the number of potential
    biological off-targets for drugs and environmental chemicals is
    likely to be quite large, necessitating a very broad search for
    toxicity pathways; (6) for assessment of environmental
    chemicals, confidence in the value of a negative result in any assay
    will  have to be  higher than for drugs,  which will  certainly
    undergo  additional scrutiny as preclinical studies continue on
    lead compounds; (7) the probability that chemicals will
    interact with more than a single toxicity pathway, with exposure
    intensity and duration being important determinants of
    which one(s) are key to inducing adverse health effects; (8) the
    likelihood that the outcomes of perturbing toxicity pathways will
    be cell-type dependent, reflecting the variability in expression of
    coactivators, corepressors or other mitigating factors; and (9) the
    fact that at least some forms of toxicity depend on
    higher-order interactions of cells in tissues or organs that may
    not be apparent from a reductionist evaluation of isolated cells
    and/or pathways. Even this limited list of obstacles exposes the
    daunting challenge in creating a new paradigm for toxicological
    evaluation, and it also points to the need for a strategic approach
    to design a research program that can begin to chip away at the
    obstacles.
       The application of HTS in toxicology could be twofold, one
    essentially bottom up, the second top down. In the bottom up
    approach, a single or small number of chemicals could be
    analyzed against a vast array of targets to isolate the key toxicity
    pathways. In the top  down  approach, a relatively large number
    of chemicals could be assayed against a small number of key
    targets (this is essentially the approach being used for detection
    of endocrine disrupting  chemicals that act via interaction with
    estrogen, androgen or thyroid  hormone function). A  middle
    ground would try to  maximize both the numbers of chemicals
    assayed  and the breadth of the assays so as to achieve  a
    biologically based prioritization process. This process could
    then be applied to chemicals that have large potential for human
    exposure but have avoided extensive traditional
    toxicological evaluations due to uses not covered by
    data-intensive regulatory statutes such as the Federal Insecticide,
    Fungicide, and Rodenticide Act. It is this latter application that
    should serve as a fertile test of the utility of HTS technologies,
    and one that is being explored by the EPA (Dix et al., 2007).
    Under this ToxCast  program, several  hundred chemicals with
    well characterized toxicological  effects are  being evaluated
    against more than 400 biological targets. The biological targets
    include  activities against a variety  of enzymes,  kinases,
    phosphatases, ion channels, nuclear receptor  binding and
    response modulation, exocrine  function of stimulated cells,
    transcript profiles of exposed primary cell types, and responses
    of model organisms. The chemicals currently being looked at
    are primarily pesticidal  agents that have subchronic, chronic,
    developmental and reproductive assays available. Since they are
designed to have  some biological activity, the  probability of
detecting  interactions  with  some  cellular  pathways  should
be relatively high, and the resulting  bioactivity profiles across
all targets should  provide a rich dataset to evaluate whether
the particular profiles can be associated with certain  specific
health outcomes. The current status of this program is available
at http://epa.gov/ncct/toxcast.

Conflict of interest statement

   The authors report no conflicts of interest with respect to this
manuscript.
   This work was reviewed by EPA and approved  for publi-
cation but does not necessarily reflect  official Agency policy.
Mention  of trade  names  or  commercial products  does  not
constitute  endorsement or recommendation by EPA for use.


References

Adams, C.L., Kutsyy, V., Coleman, D.A., Cong, G., Crompton, A.M., Elias,
   K.A., Oestreicher,  D.R., Trautman, J.K., Vaisberg, E.,  2006. Compound
   classification using image-based cellular  phenotypes. Methods  Enzymol.
   414, 440-468.
Anderson, G.L., Cole, R.D., Williams, P.L., 2004. Assessing behavioral toxicity
   with Caenorhabditis elegans. Environ. Toxicol. Chem. 23, 1235-1240.
Austin, C.P., Brady, L.S., Insel, T.R., Collins, F.S., 2004. NIH molecular
   libraries initiative. Science 306, 1138-1139.
Berg,  E.L., Kunkel, E.J., Hytopoulos, E., Plavec, I., 2006. Characterization of
   compound  mechanisms  and secondary activities by BioMAP analysis.
   J.  Pharmacol. Toxicol. Methods 53, 67-74.
Bernauer, U., Glatt, H., Heinrich-Hirsh, B., Liu, Y., Muckel, E., Vieth, B.,
   Gundert-Remy, U., 2003. Heterologous expression of mouse cytochrome
   P450 2e1 in V79 cells: construction and characterization of the cell line and
   comparison with V79 cell lines stably expressing rat P450 2E1 and human
   P450 2E1. ATLA 31, 21-30.
Bertelsen,  M.,  2006. Multiplex analysis of inflammatory signaling pathways
   using a high-content imaging system. Methods Enzymol. 414, 348-363.
Bettoun, D.J., Burris, T.P., Houck, K.A., Buck II, D.W., Stayrook, K.R., Khalifa,
   B., Lu, J., Chin, W.W., Nagpal, S., 2003. Retinoid X receptor is a nonsilent
   major contributor to vitamin D receptor-mediated transcriptional activation.
   Mol. Endocrinol. 17, 2320-2328.
Borchert, K.M., Sells Galvin, R.J., Hale, L.V., Trask, O.J., Nickischer, D.R.,
   Houck, K.A., 2006. Screening for activators  of the wingless type/Frizzled
   pathway  by automated fluorescent microscopy. Methods Enzymol.  414,
   140-150.
Bucher, J.R., 2002. The National Toxicology Program rodent bioassay: designs,
   interpretations, and scientific contributions.  Ann. N.Y. Acad.  Sci.  982,
   198-207.
Burns, C.G., Milan, D.J., Grande, E.J., Rottbauer, W., MacRae, C.A., Fishman,
   M.C.,  2005.  High-throughput  assay for  small molecules  that modulate
   zebrafish embryonic heart rate. Nat. Chem. Biol. 1, 263-264.
Cheng, C.S., Alderman, D., Kwash, J., Dessaint, J., Patel, R., Lescoe, M.K., Kinrade,
   M.B., Yu, W., 2002. A high-throughput HERG potassium channel function
   assay: an old assay with a new look. Drug Dev.  Ind. Pharm. 28, 177-191.
Chin,  J.,  Adams, A.D.,  Bouffard, A., Green, A.,  Lacson, R.G., Smith, T.,
   Fischer, P.A., Menke, J.G., Sparrow, C.P., Mitnaul, L.J., 2003. Miniaturization
   of cell-based beta-lactamase-dependent FRET assays to ultra-high
   throughput formats to identify agonists of human liver X receptors. Assay
   Drug Dev. Technol. 1, 777-787.
Crespi, C.L., Stresser, D.M., 2000. Fluorometric screening for metabolism-
   based drug-drug interactions. J. Pharmacol. Toxicol. Methods 44, 325-331.
Crespi, C.L., Miller, V.P., Penman, B.W., 1997. Microtiter plate assays for
   inhibition of human, drug-metabolizing cytochromes P450. Anal. Biochem.
   248, 188-190.
  Cui, X., Thomas, A., Han, Y., Palamanda, J., Montgomery, D., White, R.E.,
      Morrison, R.A., Cheng, K.C.,  2005.  Quantitative  PCR assay  for
      cytochromes  P450 2B and 3A induction in rat precision-cut liver slices:
      correlation study with induction in vivo. J. Pharmacol. Toxicol. Methods 53,
      215-218.
  De Rosa, C.T., Nickle,  R., Faroon, O., Jones, D.E., 2003. The impact of
      toxicology on public health policy and service:  an update. Toxicol. Ind.
      Health 19, 115-124.
  Dix, D.J., Houck, K.A., Martin, M.T., Richard, A.M., Setzer, R.W., Kavlock, R.J.,
      2007. The ToxCast Program for prioritizing toxicity testing of environmental
      chemicals. Toxicol. Sci. 95, 5-12.
  Ennever, F.K., Lave, L.B., 2003.  Implications of the lack of accuracy of the
      lifetime  rodent  bioassay for  predicting human  carcinogenicity.  Regul.
      Toxicol.  Pharmacol. 38, 52-57 (Aug).
  Fairn, G.D.,  Macdonald, K, McMaster, C.R., 2007. A chemogenomic screen in
      Saccharomyces cerevisiae uncovers a primary role for the mitochondria in
      farnesol toxicity and its regulation by the Pkcl pathway. J. Biol. Chem. 282,
      4868-4874.
  Faucette, S.R., Sueyoshi, T, Smith, C.M., Negishi, M., Lecluyse, E.L., Wang,
      H., 2006. Differential regulation of hepatic CYP2B6 and CYP3A4 genes by
      constitutive androstane receptor but not pregnane X receptor. J. Pharmacol.
      Exp. Ther. 317, 1200-1209.
  Fennell, M.,  Chan, H., Wood, A., 2006. Multiparameter measurement of caspase
      3 activation and apoptotic cell death in NT2 neuronal precursor cells using
      high-content analysis. J. Biomol. Screen. 11, 296-302.
  Fermini,  B., Fossa, A.A., 2003.  The  impact of drug-induced  QT interval
      prolongation on drug discovery and development. Nat. Rev. Drug Discov. 2,
      439-447.
  Finlayson, K, Turnbull, L., January, C.T., Sharkey, J., Kelly, J.S., 2001. [3H]
      dofetilide binding to HERG  transfected membranes:  a  potential high
      throughput preclinical screen. Eur. J. Pharmacol. 430,  147-148.
  Fitzgerald,  L.W, Burn, T.C., Brown,  B.S.,  Patterson,  J.P.,  Corjay, M.H.,
      Valentine, P.A., Sun, J.H., Link, J.R., Abbaszade, I., Hollis, J.M., Largent,
      B.L., Hartig, P.R., Hollis, G.F., Meunier, P.C., Robichaud, A.J., Robertson,
      D.W., 2000. Possible role of valvular serotonin 5-HT(2B) receptors in the
      cardiopathy associated with fenfluramine. Mol. Pharmacol. 57, 75-81.
  Fliri, A.F.,  Loging, W.T., Thadeio, P.P., Volkmann,  R.A., 2005a. Biological
      spectra analysis: linking biological activity profiles to molecular structure.
      Proc. Natl. Acad. Sci. U. S. A. 102, 261-266.
  Fliri, A.F., Loging, W.T., Thadeio, P.P., Volkmann, R.A.,  2005b. Analysis of
      drug-induced effect patterns to link structure and side effects of medicines.
      Nat. Chem. Biol. 1, 389-397.
  Fluckiger-Isler, S., Baumeister, M., Braun, K., Gervais, V, Hasler-Nguyen, N.,
      Reimann, R., Van Gompel, J.,  Wunderlich, H.G., Engelhardt, G., 2004.
      Assessment of the performance of the Ames II assay: a collaborative study
      with 19 coded compounds. Mutat. Res. 558, 181-197.
  Frame, L.T., Ozawa, S., Nowell,  S.A., Chou, H.C., DeLongchamp, R.R.,
      Doerge, D.R., Lang, N.P., Kadlubar, F.F., 2000. A simple colorimetric assay
      for phenotyping the major human thermostable  phenol sulfotransferase
      (SULT1A1) using platelet cytosols. Drug Metab. Dispos.  28, 1063-1068.
  Gasparri, F., Mariani, M., Sola, F., Galvani, A., 2004. Quantification of the
      proliferation index of human dermal fibroblast cultures with the Array Scan
      high-content screening reader.  J. Biomol. Screen. 9, 232-2343.
  Gasparri, F., Cappella, P., Galvani, A., 2006. Multiparametric cell cycle analysis
      by automated microscopy. J. Biomol. Screen. 11,  586-598.
  Gee, P.,  Maron, D.M., Ames, B.N., 1994. Detection and classification of
      mutagens: a set of base-specific Salmonella tester strains.  Proc. Natl. Acad.
      Sci. U. S. A.  91, 11606-11610.
  Giaever, G.,  Chu, A.M., Ni, L., Connelly, C., Riles, L., Veronneau, S., Dow, S.,
      Lucau-Danila, A., Anderson, K., Andre, B., Arkin, A.P., Astromoff, A., El-
      Bakkoury,  M., Bangham, R., Benito, R., Brachat,  S.,  Campanaro,  S.,
      Curtiss, M., Davis, K., Deutschbauer, A., Entian, K.D., Flaherty, P., Foury,
      F., Garfmkel, D.J., Gerstein, M., Gotte, D., Guldener, U., Hegemann, J.H.,
      Hempel, S., Herman, Z., Jaramillo, D.F., Kelly, D.E., Kelly, S.L., Kotter, P.,
      LaBonte, D., Lamb, D.C., Lan, N., Liang, H., Liao, H.,  Liu, L., Luo, C.,
      Lussier,  M., Mao, R., Menard, P., Ooi, S.L., Revuelta, J.L., Roberts, C.J.,
      Rose, M., Ross-Macdonald, P., Scherens, B., Schimmack, G., Shafer, B.,
      Shoemaker, D.D.,  Sookhai-Mahadeo, S.,  Storms, R.K., Strathern, J.N.,
                                               Previous
TOC

-------
                                 K.A. Houck, R.J. Kavlock / Toxicology and Applied Pharmacology 277 (2008) 163-178
                                                                                                                                                111
   Valle, G., Voet, M., Volckaert, G., Wang, C.Y., Ward, T.R., WiLhelmy, J.,
   Winzeler, E.A., Yang, Y., Yen, G., Youngman, E., Yu, K., Bussey, H., Boeke,
   J.D.,  Snyder, M.,  Philippsen,  P., Davis, R.W.,  Johnston,  M.,  2002.
   Functional profiling of the Sacchammyces cerevisiae genome. Nature 418,
   387-391.
Giddings, S.J., Clarke,  S.E., Gibson, G.G.,  1997. CYP4A1 gene transfection
   studies and the peroxisome proliferator-activated receptor: development of a
   high-throughput assay to detect  peroxisome proliferators.  Eur.  J. Drug
   Metab. Pharmacokinet. 22, 315-319.
Giuliano, K.A., DeBiasio, R.L., Dunlay, R.T., Gough, A., Volosky,  J.M., Zock,
   J.,  Pavlakis,  G.N.,  Taylor,  D.L.,  1997. High-content screening: a new
   approach to easing key bottlenecks in the drug discovery process. J. Biomol.
   Screen. 2, 249-259.
Glover, C.J., Rabow, A.A., Isgor, Y.G., Shoemaker, R.H., Covell, D.G., 2007.
   Data mining of NCI's anticancer screening database reveals mitochondrial
   complex I inhibitors cytotoxic to leukemia cell lines. Biochem.  Pharmacol.
   73,331-340.
Guengerich,  P.P.,  1999. Cytochrome P-450  3A4: regulation and role in drug
   metabolism. Annu. Rev. Pharmacol. Toxicol. 39, 1-17.
Handschin, C., Meyer, U.A., 2003. Induction of drug metabolism: the role of
   nuclear receptors. Pharmacol. Rev. 55, 649-673.
Haendel, M.A., Tilton,  R, Bailey, G.S., Tanguay, R.L., 2004. Developmental
   toxicity  of the  dithiocarbamate pesticide  sodium metam  in zebrafish.
   Toxicol.  Sci. 81, 390-400.
Hastwell, P.W, Chai, L.L., Roberts, K.J., Webster, T.W, Harvey, J.S.,  Rees, R.W,
   Walmsley, R.M.,  2006.  High-specificity  and high-sensitivity genotoxicity
   assessment in a human cell line: validation of the GreenScreen HC GADD45a-
   GFP genotoxicity assay. Mutat. Res. 607,  160-175.
Houck, K.A., Borchert, K.M.,  Hepler, C.D.,  Thomas, J.S.,  Bramlett, K.S.,
   Michael, L.F., Burris, T.P., 2004. T0901317 is a dual LXR/FXR agonist. Mol.
   Genet. Metab. 83, 184-187.
Hutchinson, T.H., Brown, R., Brugger, K.E., Campbell, P.M., Holt, M., Lange, R.,
   McCahon, P., Tattersfield, L J., van Egmond, R., 2000. Ecological risk assess-
   ment of endocrine disrupters. Environ. Health Perspect. 108, 1007-1014.
Kafert-Kasting,  S., Alexandrova, K., Barthold, M., Laube, B., Friedrich, G.,
   Arseniev, L., Hengstler,  J.G., 2006. Enzyme induction in cryopreserved
   human hepatocyte cultures. Toxicology 220, 117-125.
Kamath, R.S., Ahringer, J., 2003. Genome-wide RNAi screening  in  Caenor-
   habditis  elegans. Methods 30, 313-321.
Kavlock, R.J., Daston, G.P., DeRosa, C., Fenner-Crisp, P., Gray, L.E., Kaattari,
   S.,  Lucier, G., Luster, M., Mac, M.J., Maczka, C., Miller, R., Moore, J.,
   Rolland,  R.,  Scott, G.,  Sheehan,  D.M., Sinks,  T, Tilson, H.A.,  1996.
   Research needs for the risk assessment of health and environmental effects
   of  endocrine disrupters:  a report of the U.S. EPA-sponsored workshop.
   Environ. Health Perspect. 104 (Suppl4), 715-740.
Kirkland, D., Aardema, M., Henderson, L., Muller, L., 2005. Evaluation of the
   ability of a battery of three in vitro genotoxicity tests to discriminate rodent
   carcinogens and  non-carcinogens  I. Sensitivity, specificity and relative
   predictivity. Mutat. Res. 584, 1-256.
Kirstein, S.L., Atienza, J.M., Xi, B., Zhu, J., Yu, N., Wang, X., Xu, X., Abassi,
   Y.A., 2006. Live cell quality control and utility of real-time cell electronic
   sensing for assay development. Assay Drug Dev.  Technol. 4, 545-553.
Lang, P., Yeow, K., Nichols, A., Scheer, A., 2006. Cellular imaging in drug
   discovery. Nat. Rev. Drug Discov. 5, 343-356.
Lau, YY, Sapidou, E., Cui, X., White, R.E., Cheng, K.C., 2002. Development of
   a novel in vitro model to predict hepatic clearance using fresh, cryopreserved,
   and sandwich-cultured hepatocytes. Drug Metab. Dispos. 30, 1446-1454.
Laws,  S.C.,  Yavanhxay, S., Cooper, R.L., Eldridge, J.C., 2006. Nature of the
   binding interaction for 50 structurally diverse chemicals with rat estrogen
   receptors. Toxicol. Sci. 94, 46-56.
Lee, M.Y, Park, C.B., Dordick, IS., Clark, D.S., 2005. Metabolizing enzyme
   toxicology assay chip (MetaChip) for high-throughput microscale toxicity
   analyses. Proc. Natl. Acad. Sci. U.  S. A.  102, 983-987.
Lee, P.J.,  Hung, P.J.,  Rao,  V.M.,  Lee,  L.P.,  2006a. Nanoliter scale
   microbioreactor  array for quantitative cell biology. Biotechnol.  Bioeng.
   94, 5-14.
Lee, F.Y, Lee, H., Hubbert,  M.L., Edwards, P.A., Zhang, Y, 2006b. FXR, a
   multipurpose nuclear receptor. Trends Biochem. Sci. 31, 572-580.
     Lehmann, J.M., McKee, D.D., Watson,  M.A., Willson,  T.M.,  Moore, J.T.,
         Kliewer, S.A., 1998. The human orphan nuclear receptor PXR is activated
         by compounds that regulate CYP3A4  gene expression and cause drug
         interactions. J. Clin. Invest. 102, 1016-1023.
     Li, A.P., Bode, C.,  Sakai, Y,  2004. A novel in vitro system, the integrated
         discrete multiple organ cell culture (IdMOC) system, for the evaluation of
         human drug toxicity: comparative cytotoxicity of tamoxifen towards normal
         human cells from five major organs  and MCF-7 adenocarcinoma breast
         cancer cells. Chem. Biol. Interact. 150, 129-136.
     Liao, V.H., Yu, C.W, 2005. Caenorhabditis elegans gcs-1 confers resistance to
         arsenic-induced oxidative stress. BioMetals 18, 519-528.
     Lin, J.H., 2006. CYP induction-mediated drug interactions: in vitro assessment
         and clinical implications. Pharm. Res.  23, 1089-1116.
     Lindblom, T, Pierce, G., Sluder, A., 200 I.AC, elegans orphan nuclear receptor
         contributes to xenobiotic resistance. Curr. Biol. 11, 864-868.
     Liu, J., Knappenberger, K.S., Kack,  H., Andersson, G., Nilsson, E., Dartsch, C.,
         Scott, C.W., 2003.  A homogeneous in vitro functional assay for estrogen
         receptors: coactivator recruitment. Mol. Endocrinol. 17, 346-355.
     Luo, G., Cunningham,  M., Kim, S., Burn, T, Lin, J., Sinz, M., Hamilton, G.,
         Rizzo, C., Jolley,  S., Gilbert,  D.,  Downey, A.,  Mudra, D.,  Graham, R.,
         Carroll, K., Xie, J., Madan, A., Parkinson,  A.,  Christ, D., Selling, B.,
         LeCluyse, E., Gan, L.S., 2002.  CYP3A4 induction by drugs: correlation
         between a pregnane X receptor reporter gene assay and CYP3A4 expression
         in human hepatocytes. Drug Metab. Dispos. 30, 795-804.
     Luo, G., Guenthner, T, Gan, L.S., Humphreys, W.G., 2004. CYP3A4 induction
         by xenobiotics: biochemistry, experimental methods and impact  on drug
         discovery and development. Curr. Drug Metab. 5, 483-505.
     MacDonald, M.L., Lamerdin, I, Owens, S., Keon, B.H., Bilter, O.K., Shang, Z.,
         Huang, Z., Yu, H.,  Dias, J., Minami, T., Michnick, S.W, Westwick, J.K.,
         2006. Identifying  off-target effects and hidden phenotypes of drugs in
         human cells. Nat. Chem. Biol. 2, 329-337.
     Mao, B., Gozalbes, R., Barbosa, F., Migeon, J., Merrick, S., Kamm, K., Wong, E.,
         Costales, C., Shi, W, Wu, C., Froloff, N., 2006. QSAR modeling of in vitro
         inhibition of cytochrome P450 3A4. J. Chem. Inf. Model 46, 2125-2134.
     Mayer, T., Jagla, B., Wyler, M.R., Kelly, P.O., Aulner, N., Beard, M., Barger, G.,
         Tobben,  U., Smith, D.H., Branden, L., Rothman, J.E., 2006. Cell-based
         assays  using primary endothelial  cells to study  multiple  steps in
         inflammation. Methods Enzymol. 414, 266-283.
     Miller, J.E., Vlasakova, K., Glaab,  WE.,  Skopek, T.R., 2005. A low  volume,
         high-throughput forward mutation assay in Salmonella typhimurium based
         on fiuorouracil resistance. Mutat. Res. 578, 210-224.
     Mizell, M., Romig, E.S., 1997. The aquatic vertebrate embryo as a sentinel for
         toxins: zebrafish embryo dechorionation and perivitelline space microinjec-
         tion. Int. J. Dev. Biol. 41, 411-423.
     Moore,  J.T., Kliewer, S.A., 2000. Use of the nuclear receptor PXR to predict
         drug interactions. Toxicology 153,  1-10.
     Murphy, S.M., Palmer, M., Poole, M.F., Padegimas, L., Hunady, K., Danzig, J.,
         Gill, S., Gill, R., Ting, A.,  Sherf, B., Brunden, K., Stricker-Krongrad, A.,
         2005. Evaluation of functional and binding assays in cells expressing either
         recombinant or endogenous hERG channel. J. Pharmacol. Toxicol. Methods
         54, 42-55.
     Nasevicius, A., Ekker, S.C., 2001. The zebrafish as a novel system for functional
         genomics and therapeutic development applications. Curr. Opin. Mol. Ther.
         3, 224-228.
     National Research Council, Committee on Toxicity Testing and Assessment of
         Environmental Agents, 2006. Toxicity Testing for Assessment of Environ-
         mental Agents: Interim Report.  The National Academies Press.
     National Research Council, 2007. Toxicity Testing in the Twenty-first Century:
         A Vision  and Strategy. Prepublication edition available at: http://dels.nas.
         edu/dels/reportDetail.php?Flink_id=4286&session_id=.
     Nichols, J.S., Parks, D.J., Consler, T.G., Blanchard, S.G., 1998. Development of
         a scintillation proximity assay for peroxisome proliferator-activated receptor
         gamma ligand binding domain.  Anal. Biochem. 257, 112-119.
     O'brien, P.J.,  Irwin, W, Diaz, D., Howard-Cofield, E., Krejsa, C.M., Slaughter,
         M.R., Gao, B., Kaludercic, N., Angeline, A., Bernardi,  P., Brain,  P.,
         Hougham, C., 2006. High concordance of drug-induced human hepatotoxi-
         city with  in vitro cytotoxicity measured  in a novel cell-based model using
         high content screening. Arch. Toxicol. 80 (9), 580-604 (Sep).
                                              Previous
TOC

-------
178
                                 K.A. Houck, R.J. Kavlock / Toxicology and Applied Pharmacology 277 (2008) 163-178
Ozers, M.S., Ervin,  K.M., Steffen, C.L., Fronczak, J.A.,  Lebakken, C.S.,
   Carnahan,  K.A.,  Lowery, R.G., Burke, T.J., 2005. Analysis of ligand-
   dependent  recruitment of coactivator peptides to estrogen receptor using
   fluorescence polarization. Mol. Endocrinol. 19, 25-34.
Paolini, G.V., Shapland, R.H., van Hoorn, W.P., Mason, J.S., Hopkins, A.L.,
   2006.  Global mapping of pharmacological  space. Nat.  Biotechnol.  24,
   805-815.
Parng, C., Roy,  N.M., Ton, C., Lin,  Y., McGrath, P., 2007. Neurotoxicity
   assessment using  zebrafish. J. Pharmacol. Toxicol. Methods 55, 103-112.
Parsons, A.B., Lopez, A., Givoni, I.E., Williams, D.E., Gray, C.A., Porter, J.,
   Chua, G., Sopko, R., Brost, R.L., Ho, C.H., Wang, J., Ketela, T., Brenner, C.,
   Brill, J.A.,  Fernandez, G.E., Lorenz, T.C., Payne, G.S., Ishihara, S., Ohya,
   Y., Andrews, B.,  Hughes,  T.R., Frey, B J., Graham, T.R., Andersen, R.J.,
   Boone, C., 2006.  Exploring the mode-of-action of bioactive compounds by
   chemical-genetic profiling in yeast. Cell 126, 611-625.
Pulak, R.,  2006. Techniques for analysis, sorting, and dispensing of C. elegans
   on the COPAS flow-sorting system. Methods Mol. Biol. 351, 275-286.
Phillips, G.W., Irwin,  W.A., Howard-Cofield, E.J., Randle, L.E., Abraham, V.C.,
   Haskins, J.R., O'Brien, P.J., 2005. Incorporation of an  oxidative  stress
   biomarker into high content screening  (HCS) for human toxicity potential.
   llth Annual Meeting of the Society for Biomol.  Screen. (SBS), P05020.
Rausch, O., 2006. High content cellular screening. Curr. Opin. Chem. Biol. 10,
   316-320.
Richards,  G.R., Smith, A.J., Parry, F.,  Platts, A., Chan, O.K., Leveridge,  M.,
   Kerby,  J.E.,  Simpson,  P.B., 2006. A morphology- and kinetics-based
   cascade for human neural cell  high content screening.  Assay Drug Dev.
   Technol. 4, 143-152.
Roberts, R.A.,  Ganey, P.E., Ju, C., Kamendulis, L.M., Rusyn, I, Klaunig, J.E.,
   2007. Role of the kupffer cell in mediating hepatic toxicity and carcinogenesis.
   Toxicol. Sci. 96,2-15.
Rouleau,  N.,  Turcotte, S., Mondou,  M.H.,  Roby, P.,  Bosse, R.,  2003.
   Development of a versatile platform for nuclear receptor screening using
   AlphaScreen. J. Biomol. Screen. 8, 191-197.
Roymans,  D., Annaert, P., Van Houdt, J., Weygers, A., Noukens, J., Sensenhauser,
   C., Silva, J., Van Looveren, C., Hendrickx, J., Mannens, G., Meuldermans, W,
   2005.  Expression and induction potential  of cytochromes P450 in human
   cryopreserved hepatocytes. Drug Metab. Dispos. 33, 1004-1016.
Sanguinetti, M.C., Tristani-Firouzi, M., 2006. hERG potassium channels  and
   cardiac arrhythmia. Nature 440, 463-469.
Sartipy, P., Bjorquist, P., Strehl, R., Hyllner, J., 2006. Pluripotent human stem
   cells as novel tools in drug discovery and toxicity testing. IDrugs 9,702-705.
Schade, R., Andersohn, F., Suissa, S., Haverkamp, W, Garbe, E., 2007. Dopamine
   agonists and the risk of cardiac-valve regurgitation. N. Engl. J. Med. 356,
   29-38.
Shibata, Y, Takahashi, H., Chiba, M.,  Ishii, Y, 2002. Prediction of hepatic
   clearance  and availability  by cryopreserved  human  hepatocytes:  an
   application of serum incubation method. Drug Metab. Dispos. 30, 892-896.
Shimada, T., 2006. Xenobiotic-metabolizing enzymes involved in activation and
   detoxification of carcinogenic  polycyclic aromatic hydrocarbons.  Drug
   Metab. Pharmacokinet. 21, 257-276.
Shoemaker, R.H., Scudiero, D.A., Melillo, G.,  Currens, M.J., Monks, A.P.,
   Rabow, A.A., Covell,  D.G.,  Sausville, E.A., 2002. Application of high-
   throughput, molecular-targeted screening to anticancer drug discovery. Curr.
   Top. Med. Chem. 2, 229-246.
  Solly, K., Wang, X., Xu, X., Strulovici, B., Zheng, W, 2004. Application of real-
      time cell electronic sensing (RT-CES) technology to  cell-based assays.
      Assay Drug Dev. Technol. 2, 363-372.
  Sorota, S.,  Zhang,  X.S., Margulis,  M., Tucker, K.,  Priestley, T.,  2005.
      Characterization of a hERG screen using the Ion Works HT: comparison to
      a hERG rubidium efflux screen. Assay Drug. Dev. Technol. 3, 47-57.
  Trubetskoy, O.V, Shaw, P.M., 1999. A fluorescent assay amenable to measuring
      production of D-glucuronides produced from recombinant UDP-glycosyl
      transferase enzymes. Drug Metab. Dispos. 27, 555-557.
  Trubetskoy, O.V, Gibson, J.R., Marks, B.D., 2005. Highly miniaturized formats
      for in vitro drug metabolism assays using vivid fluorescent substrates and
      recombinant human cytochrome  P450 enzymes.  J. Biomol. Screen.  10,
      56-66.
  Van der Jagt, K., Munn, S., Torslov,  J., de Bruijn,  J.,  2004. Alternative
      approaches can reduce the use of test animals  under REACH. Report to
      the  Directorate  General  JRC,  Institute  for  Health  and  Consumer
      Protection,  European Commission,  (http://ihcp.jrc.ec.europa.eu/docs/ecb/
      Reducing_the_use_of_test_animals_under_REACH_IHCP_report.pdf).
  Viravaidya,  K., Sin, A., Shuler, M.L., 2004. Development of a microscale cell
      culture analog to probe naphthalene toxicity. Biotechnol. Prog. 20,316-323.
  Walsky, R.L., Obach, R.S., 2004. Validated assays for human cytochrome P450
      activities. Drug Metab. Dispos. 32, 647-660.
  Walum, E., Hedander, J., Garberg, P., 2005.  Research perspectives for pre-
      screening alternatives to animal  experimentation On the  relevance of
      cytotoxicity  measurements,  barrier  passage  determinations and  high
      throughput screening in  vitro to  select potentially hazardous compounds
      in  large sets of chemicals.  Toxicol.  Appl. Pharmacol.  207 (2  Suppl),
      393-397.
  Waring, R.H., Harris, R.M.,  2005. Endocrine  disrupters: a human risk? Mol.
      Cell. Endocrinol. 244, 2-9.
  Weir, S.W,  Weston, A.H., 1986. The effects of BRL 34915 and nicorandil on
      electrical and mechanical activity  and on 86Rb efflux in rat blood vessels.
      Br. J. Pharmacol. 88,  121-128.
  Whitebread, S., Hamon, J., Bojanic, D., Urban,  L., 2005. Keynote review: in
      vitro safety pharmacology profiling: an essential tool for successful drug
      development. Drug Discov. Today 10, 1421-1433.
  Xing, J.Z., Zhu, L., Jackson, J.A., Gabos, S., Sun, X.J., Wang, X.B., Xu,  X.,
      2005. Dynamic monitoring  of cytotoxicity on microelectronic  sensors.
      Chem. Res. Toxicol. 18, 154-161.
  Xing, J.Z., Zhu, L., Gabos, S., Xie, L., 2006. Microelectronic cell sensor assay
      for  detection of cytotoxicity and prediction of acute toxicity.  Toxicol.
      In Vitro  20, 995-1004.
  Yoshitomi, S., Ikemoto, K., Takahashi, J., Miki, H., Namba, M., Asahi, S., 2001.
      Establishment of the transformants expressing human cytochrome  P450
      subtypes in  HepG2, and their  applications  on  drug metabolism and
      toxicology. Toxicol. In Vitro 15, 245-256.
  Yueh,  M.F., Kawahara,  M., Raucy, J.,  2005. Cell-based  high-throughput
      bioassays to  assess induction and inhibition of CYP1A enzymes. Toxicol.
      In Vitro  19, 275-287.
  Zhang, J.H., Chung, T.D., Oldenburg, K.R., 1999. A simple statistical parameter
      for use in evaluation  and validation of high throughput screening assays.
      J. Biomol. Screen. 4, 67-73.
  Zhang,  C.,  Willett, C., Fremgen, T.,  2003. Zebrafish:  an  animal model  for
      toxicological studies. Curr. Protoc. Toxicol., Suppl. 17, 1-18.
                                                Previous
TOC

-------
Research

Use of Cell Viability Assay Data Improves the Prediction  Accuracy
of Conventional Quantitative Structure-Activity  Relationship Models
of Animal Carcinogenicity
Hao Zhu,1,2 Ivan Rusyn,1,3 Ann Richard,4 and Alexander Tropsha1,2
1Carolina Environmental Bioinformatics Research Center, 2Laboratory for Molecular Modeling, Division of Medicinal Chemistry and
Natural Products, School of Pharmacy, and 3Department of Environmental Sciences and Engineering, School of Public Health,
University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA; 4National Center for Computational Toxicology, Office of
Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, USA
 BACKGROUND: To develop efficient approaches for rapid evaluation of chemical toxicity and human
 health risk of environmental compounds, the National Toxicology Program (NTP) in collaboration
 with the National Center for Chemical Genomics has initiated a project on high-throughput
 screening (HTS) of environmental chemicals. The first HTS results for a set of 1,408 compounds
 tested for their effects on cell viability in six different cell lines have recently become
 available via PubChem.
 OBJECTIVES: We have explored these data in terms of their utility for predicting adverse health
 effects of the environmental agents.
 METHODS AND RESULTS: Initially, the classification k nearest neighbor (kNN) quantitative
 structure-activity relationship (QSAR) modeling method was applied to the HTS data only, for a
 curated data set of 384 compounds. The resulting models had prediction accuracies for training,
 test (containing 275 compounds together), and external validation (109 compounds) sets as high as
 89%, 71%, and 74%, respectively. We then asked if HTS results could be of value in predicting
 rodent carcinogenicity. We identified 383 compounds for which data were available from both the
 Berkeley Carcinogenic Potency Database and NTP-HTS studies. We found that compounds
 classified by HTS as "actives" in at least one cell line were likely to be rodent carcinogens
 (sensitivity 77%); however, HTS "inactives" were far less informative (specificity 46%). Using
 chemical descriptors only, kNN QSAR modeling resulted in 62.3% prediction accuracy for rodent
 carcinogenicity applied to this data set. Importantly, the prediction accuracy of the model was
 significantly improved (72.7%) when chemical descriptors were augmented by HTS data, which were
 regarded as biological descriptors.
 CONCLUSIONS: Our studies suggest that combining NTP-HTS profiles with conventional chemical
 descriptors could considerably improve the predictive power of computational approaches in
 toxicology.
 KEYWORDS: carcinogenesis, computational toxicology, high-throughput screening, QSAR. Environ
 Health Perspect 116:506-513 (2008). doi:10.1289/ehp.10573 available via http://dx.doi.org/
 [Online 4 January 2008]
The traditional approaches for in vivo animal
chemical safety testing are costly, time consum-
ing, and have a low throughput (Bucher and
Portier 2004). To improve the efficiency of
assessing potential human health hazards of
environmental  chemicals,  the National
Toxicology Program (NTP) at the National
Institute of Environmental Health Sciences
(NIEHS) recently initiated the High Through-
put Screening (HTS)  project (NTP 2007;
Inglese et al. 2006; Xia et al. 2007). The NTP-
HTS effort aims to develop high-throughput
biological assays  that aid in predicting a
chemical's potential for in vivo toxicity in a
manner that is both informative of mecha-
nisms and pathways and relevant to human
health risk assessment. These  assays  are
expected to help in prioritizing compounds
for targeted animal testing. Recently, a set of
1,408 chemical agents, many with known
in vivo toxicity profiles, was screened in six
human cell lines for cytotoxicity and other phe-
notypic end points. The  HTS results, including
complete dose-response  data for all tested com-
pounds, were made publicly available through
PubChem [National Center for Biotechnology
Information (NCBI) 2007]. These data can be
explored in terms of assessing the relevance of
HTS screening to predictive toxicology.
   Accurate prediction of the adverse effects of
chemical substances on living systems, identifi-
cation of possible toxic alerts, and compound
prioritization for animal testing are the primary
goals of computational toxicology. Rapid
expansion of experimental data sets that com-
bine data on  chemical structure and various
toxicity end points for numerous environmen-
tal agents {e.g., NTP [NTP 2007]; Berkeley
Carcinogenic Potency Database [CPDB 2007];
and Distributed Structure-Searchable Toxicity
database  [DSSTox; U.S. Environmental
Protection Agency (U.S. EPA) 2007]} provides
novel opportunities to explore the relationships
between chemical structure and toxicity using
cheminformatics approaches. Application of
advanced cheminformatics tools, such as quantitative
structure-activity relationship (QSAR)
methods, to the analysis of these data may pro-
vide means for accurate prediction of chemical
toxicity of untested compounds, allowing for
prioritization of compounds for subsequent
animal testing.
    QSAR modeling aims to establish rigorous
correlations between the chemical descriptors
of a set of compounds and their experimentally
studied biological activities. Many different
QSAR approaches have been developed over
nearly 50 years of research (Beresford et al.
2004; Dearden 2003; Johnson et al. 2004;
Schultz et al. 2003a). Recent trends in the field
have focused on model validation as  the key
part of model development to ensure signifi-
cant external predictive power of QSAR mod-
els. Traditional QSAR models are developed
based on chemical descriptors alone (Klopman
et al. 2004; Richard 2006). In some cases, additional
physicochemical properties, such as the octanol-water
partition coefficient (logP) (Klopman et al.
2003), water solubility (Stoner et al. 2004), and
melting point (Mayer and Reichenberg 2006)
were used successfully to augment computed
chemical descriptors and improve the predic-
tive power of QSAR models. These studies sug-
gest that using hybrid descriptor sets in QSAR
modeling could prove beneficial.
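
The idea of a hybrid descriptor set can be illustrated with a minimal Python sketch. It assumes that each descriptor column is range-scaled to [0, 1] before the chemical descriptors and the extra properties are concatenated; the scaling choice, the function names, and the numeric values are illustrative assumptions and are not taken from the studies cited above.

```python
def range_scale(rows):
    """Scale every column of a row-major matrix to the [0, 1] interval."""
    n_cols = len(rows[0])
    lows = [min(r[j] for r in rows) for j in range(n_cols)]
    highs = [max(r[j] for r in rows) for j in range(n_cols)]
    scaled = []
    for r in rows:
        scaled.append([
            # A constant column carries no information; map it to 0.0.
            (r[j] - lows[j]) / (highs[j] - lows[j]) if highs[j] > lows[j] else 0.0
            for j in range(n_cols)
        ])
    return scaled


def hybrid_descriptors(chemical, extra):
    """Concatenate scaled chemical descriptors with scaled extra properties."""
    if len(chemical) != len(extra):
        raise ValueError("both matrices need one row per compound")
    return [c + e for c, e in zip(range_scale(chemical), range_scale(extra))]


# Hypothetical values: two computed descriptors plus logP for three compounds.
chemical = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0]]
logp = [[-1.0], [0.0], [3.0]]
hybrid = hybrid_descriptors(chemical, logp)
```

Scaling before concatenation keeps a property with a wide numeric range from dominating any distance computed on the combined vector.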
    The availability of HTS data on large sets
of chemical agents offers an attractive avenue
for exploring its utility in hybrid descriptor-
based QSAR modeling. In this respect, the
NTP-HTS data represent attractive and poten-
tially mechanistically relevant in vitro "biologi-
cal" descriptors for modeling the adverse health
effects in vivo. Our study tested a hypothesis
that improved QSAR predictions can be

Address correspondence to A. Tropsha, Campus Box
7360, 327 Beard Hall, University of North
Carolina, Chapel Hill,  NC 27599-7360 USA.
Telephone: (919) 966-2955. Fax: (919) 966-0204.
E-mail: alex_tropsha@unc.edu
  Supplemental Material is available online at http://
www.ehponline.org/members/2008/10573/suppl.pdf
  We thank R. Tice (National Institute of Environ-
mental Health Sciences) for valuable comments.
  This work was supported, in part, by grants from
the National Institutes of Health (GM076059 and
ES005948) and the U.S. EPA (RD832720).
  This manuscript was approved for publication by the
U.S. EPA National Center  for Computational
Toxicology. However, the content does not necessarily
reflect the views and policies of the U.S. EPA and men-
tion of trade names or commercial products does not
constitute endorsement or recommendation for use.
  The authors declare they have no competing
financial interests.
  Received 19 June 2007;  accepted 3 January 2008.
506
                          VOLUME 116 | NUMBER 4 | April 2008 • Environmental Health Perspectives

-------
                                                                       Biological descriptors in QSAR modeling of carcinogenicity
developed using a combination of chemical and
biological descriptors of environmental chemi-
cals. To this end, we have developed QSAR
models based on NTP-HTS data using the k
nearest neighbor (kNN) approach. Initially, we
modeled the NTP-HTS results separately to
explore the inherent relationship between
chemical structure and its effect on cell viability.
Next, we evaluated if a correlation exists
between the NTP-HTS  assay results and their
in vivo rodent carcinogenic potency, as reported
in the CPDB. Subsequently, the HTS results
were used as biological descriptors that were
combined with chemical  descriptors  to develop
kNN QSAR models for predicting rodent carcinogenicity
summary calls of the compounds.
Finally, we attempted to examine the relative
significance of the HTS "descriptors" in the
resulting models and their interplay with chem-
ical descriptors. Our studies demonstrate that
adding NTP-HTS data to chemical descriptors
employed in  conventional  QSAR modeling
affords improved models that may advance the
use of computational approaches in toxicology.
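
The workflow outlined above can be sketched as a plain k nearest neighbor classifier scored by leave-one-out cross-validation, run once on chemical descriptors alone and once on chemical plus biological descriptors. This is a minimal illustration only: the Euclidean distance, the majority vote, the leave-one-out scoring, and every value in the toy data are assumptions chosen for the example, and the resulting accuracies are contrived and do not reproduce the results reported in this article.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training compounds closest to the query."""
    ranked = sorted(range(len(train_x)),
                    key=lambda i: euclidean(train_x[i], query))
    votes = Counter(train_y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(x, y, k=3):
    """Predict each compound from all the others; return fraction correct."""
    hits = 0
    for i in range(len(x)):
        rest_x = x[:i] + x[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_x, rest_y, x[i], k) == y[i]
    return hits / len(x)


# Toy data (hypothetical): one chemical descriptor and one HTS activity flag.
chem = [[0.0], [0.1], [0.2], [0.3], [0.4], [0.5]]
hts = [[1.0], [0.0], [1.0], [0.0], [1.0], [0.0]]
carcinogen = [1, 0, 1, 0, 1, 0]

# Hybrid vectors: chemical descriptor followed by the biological descriptor.
hybrid = [c + b for c, b in zip(chem, hts)]

acc_chem = leave_one_out_accuracy(chem, carcinogen, k=1)
acc_hybrid = leave_one_out_accuracy(hybrid, carcinogen, k=1)
```

In this contrived set the chemical descriptor alone places every compound next to a neighbor of the opposite class, whereas the added HTS column pulls compounds of the same class together.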
   Our current studies were limited to exploring
the value of cell viability assays in predicting
rodent carcinogenicity as one example of an
in vivo toxicity end point. This limitation was
because in  vivo rodent carcinogenicity is the
only end point reported in the CPDB for a sig-
nificant fraction of compounds also tested for
their effect on cell viability. Certainly,  as addi-
tional chemicals with known in vivo responses
are tested in cell-based assays, we will continue
to explore similar approaches in correlating the
in vitro and in vivo data.
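
One simple way to quantify how well an in vitro call tracks an in vivo outcome is to compute sensitivity and specificity. The sketch below assumes that a compound counts as an HTS "active" if it is active in at least one cell line and compares that call with a binary carcinogenicity label; the function names and the five toy compounds are hypothetical and are not data from this study.

```python
def active_in_any(cell_line_calls):
    """True if the compound is called active in at least one cell line."""
    return any(cell_line_calls)


def sensitivity_specificity(predicted, observed):
    """Return (sensitivity, specificity) for paired boolean calls."""
    tp = sum(p and o for p, o in zip(predicted, observed))
    tn = sum((not p) and (not o) for p, o in zip(predicted, observed))
    fp = sum(p and (not o) for p, o in zip(predicted, observed))
    fn = sum((not p) and o for p, o in zip(predicted, observed))
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical activity calls in six cell lines for five compounds.
hts_calls = [
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
# Hypothetical in vivo labels: True means rodent carcinogen.
carcinogen = [True, True, True, False, False]

predicted = [active_in_any(calls) for calls in hts_calls]
sens, spec = sensitivity_specificity(predicted, carcinogen)
```

Sensitivity is the fraction of carcinogens flagged as active, and specificity is the fraction of noncarcinogens flagged as inactive.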

Methods
Data sources. NTP-HTS  data set. The
NTP-HTS assay results were obtained from
PubChem (NCBI 2007), and chemical  struc-
tures associated with these  results  were pro-
vided by the DSSTox (U.S. EPA 2007)
database. The complete data set included
1,408 compounds that were tested in six cell
lines at the National Institutes of Health
(NIH) Chemical Genomics Center (NCGC)
(Inglese et al. 2006; Xia et al. 2007). The cell
lines used for screening of the effect of chemi-
cal agents on cell viability included BJ  [human
foreskin fibroblast;  PubChem bioassay identi-
fier (AID)  no. 421], HEK293 (transformed
human embryonic kidney cell line; AID no.
427), HepG2 (human  hepatoma; AID no.
433), Jurkat (clone E6-1, human acute T-cell
leukemia; AID no. 426), MRC-5 (human
lung fibroblast; AID no. 434), and  SK-N-SH
(human neuroblastoma;  AID  no. 435).
Details on the assays and the testing protocols
can be found in PubChem. For the purposes
of this work,  the data set was curated as fol-
lows. First,  we removed  duplicate data entries
for 55 chemical records with identical chemi-
cal structures (i.e., keeping one of the two
identical records) and 14 records for which
molecular structure could not be obtained.
Second, inorganic and organometallic com-
pounds as well as compound  mixtures were
excluded since these do not have conventional
chemical descriptors used in QSAR studies.
The curated subset of the original NTP-HTS
data set used in this work included 1,289
unique organic compounds [Supplemental
Material, Table 1 (online at
http://www.ehponline.org/members/2008/10573/suppl.pdf)].
The "activity" classification for each
compound, for each HTS assay, was assigned
by NCGC as reported in PubChem. HTS
studies included the 55 duplicate compounds.
The analysis of assay results for  these duplicate
compounds demonstrated that the HTS data
were highly reproducible (Xia et al. 2007).
    The CPDB database. We obtained the
rodent carcinogenicity data from the CPDB
(CPDB 2007; Gold et al. 1991). The CPDB
provides a systematic and unifying source of
the outcomes from in vivo  animal chemical
carcinogenicity studies. The most recent release
of the  CPDB includes experimental  data on
testing of 1,481 diverse chemicals in one or
both sexes of rats and mice, reporting out-
comes  on 35  possible target organ/tissue sites.
A chemical structure-annotated version of the
CPDB summary tables consolidating all
species was  published on the U.S. EPA
DSSTox website (U.S. EPA  2007) with addi-
tional  summary activity categorizations  and
was used for the present study. For modeling
purposes, chemical agents in the CPDB were
categorized as follows: active  (multisite, multi-
sex, or multispecies carcinogens), marginally
active (single-site carcinogens), inactive (non-
carcinogenic in more than two test cells and no
active results), or no conclusion (insufficient
results). Of the 1,466 compounds classified as
"active" or "inactive" in the CPDB, 314 were
represented in the NTP-HTS data set (178
active and 136 inactive) and used in this study.
A complete list of these agents is provided in
the Supplemental Material, Table 2 (online at
http://www.ehponline.org/members/2008/10573/suppl.pdf).
    MolConnZ chemical descriptors.  The
MolConnZ  software (eduSoft LC, Ashland,
VA, USA) affords computation  of a wide range
of topologic  indices of molecular structure.
These  indices include but are  not limited to
the following descriptors: simple and valence
path, cluster, path/cluster and chain molecular
connectivity  indices, kappa molecular shape
indices, topologic and electrotopologic state
indices, differential connectivity indices, graph
radius and diameter, Wiener  and Platt indices,
Shannon and Bonchev-Trinajstic information
indices, counts of different vertices, counts of
paths,  and edges between different kinds of
vertices (Hall et al. 1991; Kier 1986, 1987;
Kier and Hall 1991). Overall, MolConnZ
produces over 400 different descriptors. Those
with zero value or zero variance were removed.
The remaining descriptors were range scaled,
as the absolute scales for MolConnZ  descrip-
tors can differ by  orders of magnitude.
Accordingly, our use of range scaling  avoided
giving descriptors with significantly higher
ranges a disproportional weight on distance
calculations in multidimensional MolConnZ
descriptor space.
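
As an illustration only, the descriptor filtering and range scaling described above might be sketched in Python as follows. The function name range_scale and the example values are hypothetical and are not part of MolConnZ; the sketch assumes that "range scaled" means mapping each descriptor onto [0, 1] using its minimum and maximum over the data set.

```python
def range_scale(rows):
    """Drop zero-variance columns, then map each kept column onto [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    # A column whose minimum equals its maximum carries no information
    # (this covers all-zero columns as well as any other constant column).
    kept = [j for j in range(len(columns)) if highs[j] > lows[j]]
    scaled = [
        [(row[j] - lows[j]) / (highs[j] - lows[j]) for j in kept]
        for row in rows
    ]
    return kept, scaled


# Hypothetical values: three compounds, three descriptors on very different scales.
descriptors = [
    [1.0, 0.0, 250.0],
    [2.0, 0.0, 750.0],
    [4.0, 0.0, 500.0],
]
kept, scaled = range_scale(descriptors)
```

After scaling, every retained descriptor spans the same interval, so no single descriptor dominates a Euclidean distance merely because of its units.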
    QSAR  modeling. Selection of test and
training sets. The curated NTP-HTS data set
(consisting of 1,289 unique organic com-
pounds) was subdivided into multiple training/
test set pairs using the sphere exclusion pro-
gram developed in our laboratory (Golbraikh
et al. 2003).  The number of compounds
included in the test set was gradually increased
to obtain the largest possible test set for which
accurate predictions  could be obtained from
models developed for the corresponding small-
est possible training set.
    The procedure implemented in the present
study begins with the calculation of the distance
matrix D between points that represent compounds
in the descriptor space. Let $D_{\min}$ and
$D_{\max}$ be the minimum and maximum elements
of D, respectively. N probe sphere radii, R, are
defined by the following formulas:
$R_{\min} = R_1 = D_{\min}$, $R_{\max} = R_N = D_{\max}/4$, and
$R_i = R_1 + (i-1)(R_N - R_1)/(N-1)$, where i = 2, ..., N-1.
Each probe sphere radius corresponds to one
division in the training and the test set. A
sphere-exclusion algorithm used in the present
study consisted of the following steps: (i) ran-
domly select a compound; (ii)  include  it in the
training set;  (iii) construct a probe sphere
around this compound; (iv) select compounds
from this sphere  and include them alternately
into the test and the training sets; (v) exclude all
compounds from within this sphere from fur-
ther consideration; and (vi) if no more com-
pounds are left, stop. Otherwise let m be the
number of probe spheres constructed and n be
the number of remaining compounds. Let $d_{ij}$
(i = 1, ..., m; j = 1, ..., n) be the distances between the
remaining compounds and the probe sphere
centers. Select a compound corresponding to
the lowest $d_{ij}$ value and go to step (ii). This
algorithm guarantees  that at least in the entire
descriptor space (i) representative points of the
test set are close to representative points of the
training set (test set compounds are within the
applicability domain defined by the  training
set); (ii)  most of the representative points of the
training set  are close to representative points of
the test set; and (iii) the training set represents
the entire modeling set (i.e., there is no subset
in the modeling set that is not represented by a
similar compound  in  the  training set)
(Golbraikh et al. 2003). Consequently, the
sphere exclusion algorithm could maximize the
diversity of the training/test sets in the descrip-
tor space used for modeling. Because of the
Environmental Health Perspectives  •  VOLUME 116 I NUMBER 4 I April 2008
-------
Zhu et al.
stochastic nature of the algorithm, the composi-
tion of training and test sets is different for dif-
ferent original data set divisions. For example,
we tested the results of more than 40 data set
divisions generated by the sphere exclusion and
found that any two training sets had no more
than 85%  identical compounds.
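
The radius formula and steps (i) through (vi) above can be sketched in Python as shown below. This is a minimal reading of the procedure, not the program of Golbraikh et al. (2003): the function names, the toy coordinates, the choice to process sphere members nearest-first, and the choice to start the alternation with the test set are all assumptions made for illustration.

```python
import math
import random


def probe_radii(d_min, d_max, n):
    """N radii from R_1 = D_min to R_N = D_max / 4, evenly spaced (n >= 2)."""
    r_first, r_last = d_min, d_max / 4.0
    return [r_first + i * (r_last - r_first) / (n - 1) for i in range(n)]


def sphere_exclusion(points, radius, seed=0):
    """Split point indices into training and test sets for one probe radius."""
    rng = random.Random(seed)
    remaining = set(range(len(points)))
    train, test, centers = [], [], []
    center = rng.choice(sorted(remaining))            # step (i)
    while True:
        train.append(center)                          # step (ii)
        centers.append(center)
        remaining.discard(center)
        # steps (iii)-(iv): members of the probe sphere, nearest first,
        # assigned alternately to the test and training sets (assumed order)
        inside = sorted(
            (p for p in remaining
             if math.dist(points[p], points[center]) <= radius),
            key=lambda p: math.dist(points[p], points[center]),
        )
        for rank, p in enumerate(inside):
            (test if rank % 2 == 0 else train).append(p)
        remaining.difference_update(inside)           # step (v)
        if not remaining:                             # step (vi)
            return sorted(train), sorted(test)
        # otherwise: next centre is the remaining compound with the lowest
        # distance d_ij to any existing sphere centre
        center = min(
            remaining,
            key=lambda p: min(math.dist(points[p], points[c]) for c in centers),
        )


# Hypothetical 2-D descriptor space with two tight clusters and one outlier.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0)]
pair_d = [math.dist(a, b) for i, a in enumerate(points) for b in points[i + 1:]]
radii = probe_radii(min(pair_d), max(pair_d), n=4)
train, test = sphere_exclusion(points, radius=radii[1], seed=1)
```

Each radius in radii yields one training/test division; because the first center is chosen at random, repeating the call with different seeds gives different divisions of the same data set.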
   The statistical significance  of models was
characterized with the standard leave-one-out
cross-validated R² (q²) for the training sets and
the conventional R² for the test sets. Models
were considered acceptable if both q² and R²
were larger than the arbitrary cutoff values
(0.65 was used as a cutoff in this study).
Models that did not  meet these cutoff criteria
were discarded. Additional details  of this
approach  are described elsewhere  (Golbraikh
et al. 2003; Golbraikh and Tropsha 2002b).
   kNN QSAR method. The kNN QSAR
method employs the kNN pattern recognition
principle and a variable selection procedure.
Initially, a subset of nvar (number of selected
variables) descriptors is selected  randomly. The
model developed with this set of descriptors  is
validated  by leave-one-out cross-validation,
where each compound is eliminated from the
training set, and its biological  activity is pre-
dicted as the average  activity of k most similar
molecules  (k = 1 to 5). The weighted molecu-
lar similarity was characterized by the modi-
fied Euclidean distance between compounds
in the nvar subspace  of the multidimensional
descriptor space. Generally, the Euclidean dis-
tances in the descriptor space between a com-
pound and each of its k nearest neighbors are
not the same. Thus, the activity of each of the
k neighbors of a compound was given a weight
that was higher for close neighbors and lower
for distant neighbors as follows (Equations  1
and 2):

    $w_i = \frac{\exp(-d_i)}{\sum_{j=1}^{k} \exp(-d_j)}$        [1]

    $\hat{y} = \sum_{i=1}^{k} w_i y_i$        [2]

where $d_i$ is the Euclidean distance between
the compound and the ith of its k nearest neighbors; $w_i$
is the weight for every individual nearest
neighbor; $y_i$ is the actual activity value for
nearest neighbor i; and $\hat{y}$ is the predicted
activity value. A method of simulated annealing
with the Metropolis-like acceptance criteria
is used to optimize the variable selection.
    In summary, the kNN QSAR algorithm
generates both an optimum k value and an
optimal nvar subset of descriptors that afford a
QSAR model with the highest training set
model accuracy as estimated by the q² value.
Further details of the kNN method implementation,
including the description of the simu-
lated  annealing procedure used for stochastic
sampling of the descriptor space, are given  in
our previous publications  (Ng et al. 2004;
Shen et al. 2003; Zheng and Tropsha 2000).
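
The distance-weighted prediction of Equations 1 and 2 can be illustrated with a short Python sketch. The function name knn_predict and the one-dimensional toy data are hypothetical; the sketch covers only the prediction step for a fixed descriptor subset and omits the simulated annealing variable selection.

```python
import math


def knn_predict(query, train_x, train_y, k):
    """Predict activity as the exp(-d)-weighted mean of the k nearest neighbors."""
    # Pair each training compound's distance to the query with its activity,
    # then keep the k closest.
    nearest = sorted(
        (math.dist(query, x), y) for x, y in zip(train_x, train_y)
    )[:k]
    weights = [math.exp(-d) for d, _ in nearest]       # numerators of Equation 1
    total = sum(weights)                               # denominator of Equation 1
    return sum(w / total * y for w, (_, y) in zip(weights, nearest))  # Equation 2


# Hypothetical one-descriptor training set with binary activities.
train_x = [(0.0,), (1.0,), (3.0,)]
train_y = [1.0, 0.0, 1.0]
prediction_k1 = knn_predict((0.0,), train_x, train_y, k=1)
prediction_k2 = knn_predict((0.0,), train_x, train_y, k=2)
```

With k = 2 the neighbor at distance 0 receives weight 1/(1 + e^-1) and the neighbor at distance 1 receives the remainder, so the closer compound dominates the prediction.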
    Applicability domain of kNN QSAR
models. Formally, a QSAR model can predict
the target property for any compound for
which chemical descriptors can be calculated.
However, because all the models are developed
in kNN QSAR modeling by interpolating
activities of the nearest neighbor compounds
only in the relevant training sets, a special
applicability domain (i.e., similarity threshold)
should be introduced to avoid making predic-
tions  for compounds that differ substantially
from  the training set  molecules. This proce-
dure resembles that for identifying chemical
outliers prior to the onset of modeling.
    To measure similarity, each compound is
represented by a point in the M-dimensional
descriptor space (where M is the total number
of descriptors in the descriptor pharmacophore)
with the coordinates $X_{i1}, X_{i2}, \ldots, X_{iM}$,
where the $X_{ik}$ are the values of individual descriptors.
The molecular similarity between any two
molecules is characterized by the Euclidean distance
between their representative points. The
Euclidean distance $d_{ij}$ between two points i
and j (which correspond to compounds i and
j) in M-dimensional space can be calculated as
follows (Equation 3):

    $d_{ij} = \sqrt{\sum_{k=1}^{M} (X_{ik} - X_{jk})^2}$        [3]

Table 1. Summary of the biological activity of chemical agents screened in NTP-HTS assays.
Classification    BJ      HEK293   HepG2    Jurkat   MRC-5    SK-N-SH   All tests
Actives           42      63       41       121      37       74        140
Inconclusives     44      79       47       89       44       54        90
Inactives         1,203   1,147    1,201    1,079    1,208    1,161     1,059

Table 2. Rodent carcinogenicity classification (CPDB database) for 314 NTP-HTS compounds.
                  Rats                Mice
Classification    Male     Female     Male     Female
Active            121      111        123      134
Inactive          150      154        153      140
Total             271      265        276      274

    Compounds with the smallest distance
between one another are considered to have
the highest similarity. The similarities of compounds
in our training set are compiled to
produce an applicability domain threshold,
$D_T$, calculated as follows (Equation 4):

    $D_T = \bar{y} + Z\sigma$        [4]

Here, $\bar{y}$ is the mean Euclidean distance to the
nearest neighbor of each compound within
the modeling set, $\sigma$ is the standard deviation
of these Euclidean distances, and Z is an arbitrary
parameter to control the significance
level. On the basis of previous studies (Shen
et al. 2002), we set the default value of this
parameter to 0.5, which formally places the
boundary for which compounds will be predicted
at one-half of the SD (assuming a
Boltzmann distance distribution between
kNN compounds in the training set). Thus, if
the distance of the external compound from
at least one of its k nearest neighbors in the
training set exceeds this threshold, the prediction
is considered unreliable.
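
A minimal Python sketch of the applicability domain test (Equations 3 and 4) is given below. The function names and the four-point training set are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
import math
import statistics


def ad_threshold(train_x, z=0.5):
    """D_T = mean nearest-neighbor distance + Z * standard deviation (Equation 4)."""
    nearest = [
        min(math.dist(a, b) for j, b in enumerate(train_x) if j != i)
        for i, a in enumerate(train_x)
    ]
    # Assumption: population standard deviation of the nearest-neighbor distances.
    return statistics.mean(nearest) + z * statistics.pstdev(nearest)


def in_domain(query, train_x, k, threshold):
    """Reliable only if none of the k nearest training distances exceeds D_T."""
    k_nearest = sorted(math.dist(query, x) for x in train_x)[:k]
    return all(d <= threshold for d in k_nearest)


# Hypothetical training set: the corners of a unit square.
train_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
threshold = ad_threshold(train_x, z=0.5)
inside = in_domain((0.5, 0.5), train_x, k=1, threshold=threshold)
outside = in_domain((3.0, 3.0), train_x, k=1, threshold=threshold)
```

A larger Z widens the domain and raises coverage at the likely cost of accuracy; Z = 0.5 is the default stated in the text.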
    Robustness of QSAR models. y-Randomi-
zation (randomization of response)  is a widely
used approach to establish the model robust-
ness. It consists of rebuilding the models using
randomized activities of the  modeling set and
subsequent assessment of the model statistics. It
is expected that models obtained for the model-
ing set with randomized activities should have
significantly lower predictivity for the external
validation set than the models built using the
modeling set with real activities, or that the total
number of acceptable models based on the randomized
modeling set satisfying the same cutoff
criterion (q² and R² > 0.65) should be much smaller than
that based on the real modeling set. If this condition
is not satisfied, real models built for this
modeling set are not reliable and should be dis-
carded. This test was applied to all data divi-
sions considered in this study.
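
The logic of the y-randomization test can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used in the study: a leave-one-out 1-nearest-neighbor classifier stands in for the full kNN QSAR workflow, the data are synthetic, and the function names are hypothetical. Only the 0.65 cutoff is taken from the text.

```python
import math
import random


def loo_accuracy(xs, ys):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier."""
    hits = 0
    for i, x in enumerate(xs):
        nearest = min(
            (j for j in range(len(xs)) if j != i),
            key=lambda j: math.dist(x, xs[j]),
        )
        hits += ys[nearest] == ys[i]
    return hits / len(xs)


def y_randomization(xs, ys, rounds=100, cutoff=0.65, seed=0):
    """Return the real accuracy and how many label-shuffled models pass the cutoff."""
    rng = random.Random(seed)
    passed = 0
    for _ in range(rounds):
        shuffled = list(ys)
        rng.shuffle(shuffled)          # break the structure-activity link
        passed += loo_accuracy(xs, shuffled) > cutoff
    return loo_accuracy(xs, ys), passed


# Synthetic data: two well-separated clusters, one per activity class.
xs = [(float(i),) for i in range(10)] + [(100.0 + i,) for i in range(10)]
ys = [0] * 10 + [1] * 10
real_accuracy, passed_random = y_randomization(xs, ys)
```

If shuffled labels produced acceptable models about as often as the real labels, the apparent structure-activity relationship would be indistinguishable from chance.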

Results
Table 1 provides  a summary of the classifica-
tion of the chemical agents used for these stud-
ies with respect to their  "biological activity"
(i.e., the effect on cell viability)  in each of the
six cell lines used for screening. In the entire
NTP-HTS data set of 1,289 unique compounds,
140 were defined as "active" and 90 as
"inconclusive" based on one or more active or
inconclusive calls recorded in PubChem across
the six cell lines, respectively. The majority of
compounds (1,059) were recorded in
PubChem as "inactive" in all experiments.
Overall, the NTP-HTS data set contains
314 compounds  that can be mapped  to the
CPDB database and classified as carcinogenic
according to  DSSTox "multisite, multisex, or
multispecies" summary designations (Table 2).
    QSAR modeling of NTP-HTS data using
chemical descriptors. QSAR modeling of
the NTP-HTS data was desired to establish
-------
                                                                        Biological  descriptors in QSAR modeling of carcinogenicity
predictive models of HTS assays that can be
used to impute such data for future compound
libraries that may be tested. In addition, our use
of the y-randomization test as part of modeling
procedures could be viewed as an independent
statistical test of the "nonrandomness" of the
HTS data. The curated NTP-HTS data set has
a biased distribution of active and inactive com-
pounds (16% actives and inconclusives vs. 84%
inactives). This  is characteristic of most of the
available biological data sets (such as those
deposited in PubChem), which are dominated
by inactive compounds. To address this bias,
we used a (dis)similarity search to exclude a
considerable fraction of inactive compounds
from the data set to balance the  active/inactive
ratio for modeling purposes. To this end, we
calculated the Molecular ACCess System
(MACCS)  structural keys  (Renner  and
Schneider 2006) for all 1,289 compounds in
the  data set,  using  the MOE  software
(Chemical  Computing Group, Montreal,
Canada). All the active compounds were used
as a  probe subset,  and the Tanimoto coeffi-
cients (Schultz et al. 2003b; Willett  and
Winterman  1986) between each inactive com-
pound and the probe subset were calculated
based on their MACCS keys.  The inactive
compound was  selected into the modeling set
only if it had a  relatively high Tanimoto simi-
larity (> 0.7) with one or more active/inconclu-
sive compounds. Using this approach, 244 of
the original 1,079 inactive compounds were
selected because of their relatively high similar-
ity to the active compounds. Thus, the final
data set for the  classification QSAR modeling
included a total of 384 compounds (140 actives
and  244 inactives). The rationale for  this
approach to selecting (a subset of) inactive com-
pounds for the classification modeling is that it
is more challenging to establish robust models
when the two classes of active and inactive
compounds include relatively similar molecules.
It is quite obvious that if the two classes of
compounds  (i.e., active or inactive) are chemi-
cally dissimilar as judged by a simple similarity
metric such  as Tanimoto coefficients, then no
additional statistical modeling using sophisti-
cated data mining techniques is  necessary. We
did not include any compounds with incon-
clusive results in our modeling studies.
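
The similarity-based selection of inactive compounds described above can be sketched in Python. The Tanimoto coefficient on binary keys is the standard ratio of shared on-bits to the union of on-bits; the function names, compound labels, and bit positions below are hypothetical, and real MACCS keys would come from a cheminformatics package rather than hand-written sets.

```python
def tanimoto(keys_a, keys_b):
    """Tanimoto coefficient between two sets of on-bit positions."""
    a, b = set(keys_a), set(keys_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def select_inactives(inactives, probes, cutoff=0.7):
    """Keep inactives whose similarity to at least one probe exceeds the cutoff."""
    return [
        name
        for name, keys in inactives.items()
        if any(tanimoto(keys, probe) > cutoff for probe in probes.values())
    ]


# Hypothetical structural keys, written as sets of on-bit positions.
probes = {"active_1": {1, 2, 3, 4, 5}}
inactives = {
    "inactive_1": {1, 2, 3, 4, 5, 6},   # 5 shared bits out of 6 in the union
    "inactive_2": {1, 2, 9},            # 2 shared bits out of 6 in the union
}
selected = select_inactives(inactives, probes, cutoff=0.7)
```

Only inactive_1 clears the 0.7 cutoff here, which mirrors the intent of retaining inactive compounds that are structurally close to the actives.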
   Because it is critical to demonstrate  that
QSAR models have high prediction accuracy
for external validation data sets (Golbraikh and
Tropsha 2002a; Zhang et al. 2006),  109 com-
pounds (37 actives  and 72 inactives) were  ran-
domly selected  for external model validation.
The remaining 275 compounds (103 actives
and 172 inactives) were  used for modeling, and
multiple training and test sets were generated.
The variable selection kNN QSAR models were
developed for each training set, and the predic-
tive power of each  model was assessed against
the corresponding test set. The acceptability
cutoff values of the leave-one-out cross-valida-
tion accuracy and the prediction accuracy for
the test set were set to 0.65 (Kovatcheva et al.
2004). Because the data set was unbalanced,
we used the average of sensitivity and speci-
ficity to represent the overall predictive power
of a model in this study. Therefore, the overall
predictive  accuracy of each model was defined
as the average of the  correctly predicted active
ratio (sensitivity) and the correctly predicted
inactive ratio  (specificity) (de Lima et al.
2006). The total number of models that satis-
fied the accuracy threshold criteria was 599,
and the statistical characteristics of the 15 most
significant kNN models are shown in Table 3.
    Our previous studies have demonstrated
that the highest external prediction accuracy of
QSAR models is achieved using the consensus
approach,  that is, by averaging the predictions
from individual models (Tropsha et al. 2003).
The consensus prediction results for 109 com-
pounds in the external validation set are
provided in Table 4.  The sensitivity and speci-
ficity of the consensus prediction were 56.8%
and 90.2%, respectively. Thus, the overall pre-
dictive power was 73.5%,  that is, similar to
that for the training/test sets (Table 3).
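
The overall predictive power, defined above as the average of sensitivity and specificity, can be checked against the counts in Table 4 with a few lines of Python. The function name balanced_accuracy is hypothetical; the four counts are the consensus-prediction entries of Table 4 before the applicability domain is applied.

```python
def balanced_accuracy(true_pos, false_neg, true_neg, false_pos):
    """Return sensitivity, specificity, and their average."""
    sensitivity = true_pos / (true_pos + false_neg)    # correctly predicted actives
    specificity = true_neg / (true_neg + false_pos)    # correctly predicted inactives
    return sensitivity, specificity, (sensitivity + specificity) / 2.0


# Table 4, consensus prediction: 21 of 37 actives and 65 of 72 inactives correct.
sensitivity, specificity, overall = balanced_accuracy(
    true_pos=21, false_neg=16, true_neg=65, false_pos=7
)
```

Averaging the two rates, rather than using the raw fraction of correct calls, keeps the larger inactive class from dominating the score.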
    To ensure high external validation accuracy
of the training set models, we  also  considered
their applicability domains. This  restriction
decreases the number of compounds consid-
ered for the prediction but  increases the relia-
bility so that higher  accuracy is typically
expected. Indeed, after removing compounds
outside the applicability domain of our train-
ing set models, the coverage of the external set
was reduced to 88%.  However, the accuracy of
prediction for actives and inactives improved
to 65.4% and 92.9%,  respectively (i.e., total
accuracy increased to ~80%).
    It is interesting to see whether the kNN
HTS models could make reliable predictions of
the remaining 835 inactive compounds, which
were excluded because they were relatively dis-
similar to the compounds used in the model-
ing procedure. The consensus prediction gave
64.1% predictive accuracy for all 835 compounds.
After excluding 138 compounds outside
the applicability domain, the coverage was
reduced to 83.5%, but the predictive accuracy
increased to 80.1%.
    The y-randomization test was performed as
well. For the modeling set with real HTS
results, there were 599 models that satisfied the
criterion of q²/R² > 0.65 (Table 3), whereas for
the data set with randomized HTS results, only
5 models that had q²/R² > 0.65 were generated.
These results indicate that our  models are sta-
tistically robust.
    The utility of the NTP-HTS data for
QSAR modeling of rodent carcinogenicity. A
total of 314 NTP-HTS compounds are repre-
sented in the CPDB. A summary of HTS
activity and rodent  carcinogenicity of these
agents is shown in  Table 5. Seventy-seven
percent of the compounds classified  by
NTP-HTS as "active" are also categorized as
rodent carcinogens. In contrast, only
46% of NTP-HTS "inactive" agents are classified
by the CPDB as noncarcinogenic in
rodents. At the same  time, the large fraction of
compounds  found  inactive in HTS assays
effectively renders the current assays insuffi-
cient in terms of predicting the in vivo toxicity.
    To further examine whether in vitro
NTP-HTS  data could  improve the pre-
diction  accuracy for in vivo rodent carcino-
genicity  testing, we applied the hybrid
descriptor-based QSAR modeling that uti-
lized both biological  (NTP-HTS output)  and
chemical [MolConnZ (eduSoft LC)] descrip-
tors. First, all 314 compounds were  randomly
divided  into two sets. The modeling set com-
prised 264 compounds, whereas 50  randomly
selected compounds were designated as the
external validation set. After calculating
Table 3. Statistical information of the 15 most statistically significant kNN QSAR models based on the
275-compound modeling set.
Model ID   N-training   Pred.-training   N-test   Pred.-test   NNN
1          141          0.90             119      0.73         1
2          140          0.91             121      0.71         1
3          141          0.92             120      0.69         1
4          140          0.88             123      0.73         1
5          141          0.88             120      0.73         1
6          190          0.90             85       0.71         1
7          228          0.86             47       0.74         1
8          140          0.92             121      0.67         1
9          140          0.89             116      0.70         4
10         149          0.88             122      0.70         1
11         140          0.87             124      0.72         1
12         190          0.85             85       0.73         1
13         140          0.88             125      0.70         1
14         149          0.87             118      0.71         1
15         141          0.92             123      0.66         1
Average    154          0.89             111      0.71         1
Abbreviations: N-training, number of compounds in the training set; Pred.-training, the overall predictivity of the training
set; N-test, number of compounds in the test set; Pred.-test, the overall predictivity of the test set; NNN, number of the
nearest neighbors used for prediction.
-------
chemical descriptors using the MolConnZ
software, we combined the NTP-HTS data
(a total of seven binary biological  descriptors
including the active/inactive call for each cell
line separately and one for the entire experi-
ment, i.e., a compound was considered  active
if it was active in at least one cell line) with
the MolConnZ chemical descriptors to  create
a hybrid chemico-biological descriptor set.
Although we appreciate that the six cell lines
originate from different organs, it  is notewor-
thy that great similarity was observed in cyto-
toxicity profiles across the entire panel of cell
lines  (R. Tice,  personal communication).
Furthermore,  the number of active  com-
pounds for each individual cell line is rela-
tively small, thus we combined the data. After
using the sphere exclusion method  to generate
training/test set pairs from the same modeling
set of compounds, two types of kNN QSAR
models were developed. One was  built  using
only the MolConnZ chemical  descriptor set
(340 variables), and the other was  built using
the combined chemico-biological descriptor
set (347 variables).
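
The construction of the hybrid descriptor set (340 chemical descriptors plus seven binary HTS descriptors) can be sketched in Python as follows. The function names and the placeholder chemical values are hypothetical; the cell line order follows the listing in Methods, and the seventh descriptor encodes "active in at least one cell line" as described above.

```python
CELL_LINES = ["BJ", "HEK293", "HepG2", "Jurkat", "MRC-5", "SK-N-SH"]


def hts_descriptors(calls):
    """Six per-cell-line binary calls plus one 'active in any cell line' flag."""
    per_line = [1 if calls.get(line, False) else 0 for line in CELL_LINES]
    return per_line + [1 if any(per_line) else 0]


def hybrid_vector(chemical, calls):
    """Append the seven HTS descriptors to the chemical descriptor vector."""
    return list(chemical) + hts_descriptors(calls)


# Placeholder chemical descriptors (340 scaled values) for one hypothetical compound
# that was active only in the Jurkat assay.
chemical = [0.0] * 340
vector = hybrid_vector(chemical, {"Jurkat": True})
```

Because the chemical descriptors are range scaled to a common interval, the binary HTS descriptors enter distance calculations on a comparable footing.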
   kNN QSAR models were selected based
on the q²/R² cutoff of 0.65/0.65 in this modeling
development process. One hundred three
kNN models developed using chemical
descriptors alone passed these criteria,
whereas this number nearly doubled to 198
when a combined chemico-biological descrip-
tor set was used. Although data from each of
the six  cell lines or their combination were
given equal weight in defining  the global
NTP-HTS activity of each compound, the
prognostic value of each cell line varied with
regard  to its usefulness for predicting the
rodent carcinogenicity of a chemical. Figure 1
shows the frequency of use of each biological
descriptor in the 198 successful kNN QSAR
models. The predictive power of the QSAR
models  was verified using the external valida-
tion set of 50 compounds not used in training
set modeling (Table  6). QSAR modeling
using MolConnZ descriptors only  [referred to
as kNN-MolConnZ (kNN-MZ) models]
achieved 69.2% sensitivity and 55.5% specificity
(Table 7). In contrast, 78.6% sensitivity
and 66.7% specificity were achieved when the
combined chemico-biological descriptor set
(referred to as kNN-MZHTS models) was
used for modeling. The overall prediction
accuracy rate increased significantly from
62.3% to 72.7% and the coverage of the
external set increased from 88% to 92%, that
is, more external compounds were found
within (numerically) the same applicability
domain when using the hybrid descriptor set.
   The y-randomization test was also performed
for the carcinogenicity modeling using
MZ descriptors only and using the MZ and
HTS descriptors. Using randomized carcinogenicity
results, no models could be found to
satisfy the criterion of q²/R² > 0.65, indicating
that our models were statistically robust.

Discussion
This study evaluated the potential of HTS cell
assays as novel biological predictors of adverse
health effects caused by chemicals in vivo in animal
studies. To this end, we have evaluated the
HTS data for hundreds of chemical agents
tested in six cell lines and focused on compounds
that were also studied for their carcinogenicity
in chronic two-year cancer bioassays by
the NTP. Although HTS results provided complete
dose-response data, we used only binary
activity summary data (i.e., actives or inactives)
because of the binary nature of the CPDB data
(i.e., carcinogenic or not carcinogenic). Our initial
analysis has established a strong correlation
between the chemical structures of the compounds
and their effects in cell-based assays.
However, we have demonstrated, not surprisingly,
that the results of testing compounds in
cell viability assays do not serve as unequivocal
predictors of their carcinogenicity in vivo.
Specifically, the data indicated a fairly strong
predictivity of cell growth inhibition toward
animal carcinogenicity (i.e., a positive cell
viability assay response has a strong probability
Table 4. Consensus prediction for 109 compounds in the external validation set.
                                Consensus prediction            After applicability domain applied
Model characteristics           Exp. actives   Exp. inactives   Exp. actives   Exp. inactives
Pred. actives (n)               21             7                17             5
Pred. inactives (n)             16             65               9              65
Sensitivity (%)                 56.8                            65.4
Specificity (%)                 90.2                            92.9
Overall predictive power (%)a   73.5                            79.2
Abbreviations: Exp., experimental; Pred., predicted.
aThe overall predictive power is the average value of sensitivity (predictive rate of actives) and specificity (predictive rate
of inactives).

Table 5. The relationship between HTS activity and rodent carcinogenicity of 314 compounds.
Content of CPDB       HTS actives   HTS inconclusives   HTS inactives
CPDB actives (n)      30            12                  136
CPDB inactives (n)    9             13                  114
Correlation (%)       77                                46

Figure 1. Seven HTS descriptors with their frequency
of use in the 198 kNN QSAR models.

of predicting carcinogenicity in vivo) but low, if
any, predictivity of the in vivo carcinogenicity
on the compound effects in cell viability assays
(i.e., there are many carcinogens that do not
elicit responses in the cell viability assays). Thus,
to maximize the utility of in vitro assay results
for predicting the in vivo data, we considered
building QSAR models of the in vivo chemical
carcinogenicity using HTS results as additional
biological descriptors of underlying chemical
structures.
   There are several major potential applications
of biological descriptors in QSAR modeling
that may advance the science and practice
of computational toxicology. In our computational
experiments, the binary contributions of
all six HTS cell line test results were treated
equally a priori. The variable selection kNN
QSAR approach yielded 198 externally predictive
models. Because of the nature of the
method, these models differ in the choice of
descriptors resulting from the variable selection
procedure for the final model. Thus, the models
could be analyzed for the frequency of
occurrence of different descriptors that could
reveal chemical determinants of a compound's
carcinogenicity, as well as possible utility of the
individual HTS assays. Figure 1 shows the frequency
of occurrence of seven HTS descriptors
in the 198 kNN QSAR models described
above. The analysis of this distribution, especially
in the context of chemical structure of
tested compounds, may provide clues concerning
the usefulness of different cell lines for
screening purposes.
   For example, the HTS-Jurkat and HTS-HepG2
biological descriptors were found in the
majority of the successful models. Jurkat and
HepG2 are human tumor cell lines derived
from a T-cell leukemia and hepatocellular carcinoma,
respectively. Jurkat cells grow in suspension
with a relatively fast doubling time of about
22 hr. In contrast, HepG2 cells grow as attached
cultures with a doubling time of about 37 hr.
Both cell lines retain some metabolic capacity
toward xenobiotics and are used frequently for
in vitro testing (Mersch-Sundermann et al.
2004; Nagai et al. 2002). Compared with
HTS-Jurkat and HTS-HepG2 cells, the
HTS-HEK293 descriptor (a human embryonic
-------
kidney cell line) was found in much smaller
numbers of successful models, and all but
two compounds active in this cell line were
also found  to be active in other cell  lines.
Therefore, assay results for the tested com-
pounds in HEK293 cells may be redundant
with respect to rodent carcinogenicity model-
ing conducted here.
   Interestingly, the predictions for 8 of the 50
compounds in the external test set were different
using the kNN-MZ versus kNN-MZHTS
models. The apparent reason for this disparity
(in the context of the kNN QSAR approach used
in this study) is the change of nearest
neighbors in the training set of these 8 compounds
using the MolConnZ (eduSoft LC)
descriptors only versus using the hybrid
chemical-HTS descriptors. For example, the compound
2,4-dichlorophenol (CAS no. 120-83-2)
has 1,2-benzenediol (CAS no. 120-80-9),
1,4-benzenediol (CAS no. 123-31-9), and
4-chlorobenzene-1,2-diamine (CAS no.
95-83-0) as its nearest neighbors in the training
set as defined by kNN-MZ modeling
(Table 6). After including HTS descriptors, its
nearest neighbors in the training set change to
2-chloro-p-phenylenediamine (CAS no.
61702-44-1), 1-amino-4-methoxybenzene
(CAS no. 20265-97-8), and p-nitroaniline
(CAS no. 100-01-6) instead. Thus, the addition
of HTS descriptors affects the similarity
relationships between compounds based purely
on their chemical descriptors. As shown in this
study, the addition of HTS descriptors, on
average, improves the prediction  accuracy of
in vivo carcinogenicity.
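The change in nearest neighbors described above can be illustrated with a toy sketch. All compound labels and descriptor values below are hypothetical, and plain Euclidean distance is used as an assumed similarity measure; this is not the MolConnZ data or the exact kNN procedure of the study.

```python
import math


def nearest_neighbors(query, training, k=2):
    """Return the names of the k training compounds closest to `query`
    by Euclidean distance in descriptor space."""
    ranked = sorted(training, key=lambda name: math.dist(query, training[name]))
    return ranked[:k]


# Hypothetical chemical descriptors (two values per compound).
chem = {
    "A": [0.10, 0.20],
    "B": [0.15, 0.25],
    "C": [0.25, 0.05],
    "D": [0.60, 0.70],
    "E": [0.65, 0.60],
}
# Hypothetical HTS descriptors (one value per compound).
hts = {"A": [0.0], "B": [0.0], "C": [0.9], "D": [0.9], "E": [0.9]}

query_chem = [0.30, 0.30]
query_hts = [1.0]

# Neighbors using chemical descriptors only.
chem_only = nearest_neighbors(query_chem, chem)

# Neighbors using hybrid descriptors (chemical values followed by HTS values).
hybrid_training = {name: chem[name] + hts[name] for name in chem}
hybrid = nearest_neighbors(query_chem + query_hts, hybrid_training)

print("chemical only:", chem_only)
print("hybrid:       ", hybrid)
```

In this toy case the query's neighbors are chemically similar compounds when only chemical descriptors are used, but shift toward compounds with a similar HTS response once the HTS value is appended, which is the kind of neighbor change the text describes.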
   We further analyzed the interplay between
the significance of the bioassay and that of spe-
cific chemical descriptors  in the context of
in vivo carcinogenicity by comparing the
occurrence of top  chemical descriptors in
QSAR models with and without HTS descrip-
tors. Table 8 shows chemical descriptors that
occur most frequently in successful (i.e., exter-
nally predictive) QSAR models using chemical
descriptors  only. This table also reports the
change in occurrence of these descriptors after
HTS  descriptors are included. Because the
number of successful kNN QSAR models
increased significantly from 103 to 198 after
HTS descriptors were used,  we also include in
Table 8 the ratio of occurrence to the total
number of models, which may better indicate
the significance of the descriptors.
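As a worked illustration of this normalization, the sketch below divides a descriptor's frequency by the number of successful models (103 without and 198 with HTS descriptors, as stated above). The three frequency pairs are taken from Table 8; the function and variable names are illustrative only.

```python
def occurrence_ratio(frequency, n_models):
    """Percentage of successful models in which a descriptor occurs."""
    return 100.0 * frequency / n_models


N_MODELS_MZ = 103      # successful models, chemical descriptors only
N_MODELS_MZHTS = 198   # successful models, chemical + HTS descriptors

# (Freq_MZ, Freq_MZHTS) pairs as reported in Table 8.
frequencies = {
    "descriptor 1": (38, 73),
    "descriptor 4": (25, 42),
    "descriptor 5": (24, 41),
}

for name, (freq_mz, freq_mzhts) in frequencies.items():
    ratio_mz = occurrence_ratio(freq_mz, N_MODELS_MZ)
    ratio_mzhts = occurrence_ratio(freq_mzhts, N_MODELS_MZHTS)
    print(f"{name}: {ratio_mz:.1f}% -> {ratio_mzhts:.1f}%")
```

Descriptor 1 nearly doubles in raw frequency (38 to 73), yet its ratio is essentially unchanged, which is why the ratio rather than the raw count is the more informative quantity here.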
   The descriptors for each final kNN QSAR
model are chosen as a result of the stochastic
variable selection procedure that maximizes the
correlation between descriptors and carcino-
genicity. We reasoned that the analysis of
occurrence  of various chemical descriptors
before and after inclusion of HTS descriptors
in modeling may be interpreted  in terms of
their relative information content with respect
to the in vivo toxicity. Thus, those chemical
descriptors that have a similar ratio of occur-
rence in models with or without HTS descrip-
tors  (exemplified by descriptors 1, 2, and 7)
contribute to successful models independently
of the biological descriptors. For compounds
whose predicted activity is primarily determined
by the presence of these particular chemical
descriptors and unaffected by the addition of
Table 6. Consensus prediction of 50 compounds in the external validation set using the kNN QSAR models
based on two different descriptor sets.

CAS no.     Name                                   CPDB actives   MZ MZHTS
79005       1,1,2-Trichloroethane                  +              + +
106934      1,2-Dibromoethane                      +              + +
90120       1-Methylnaphthalene                                   -
86577       1-Nitronaphthalene                                    + +
634935      2,4,6-Trichloroaniline                 +              + +
120832      2,4-Dichlorophenol                                    +
99558       5-Nitro-o-toluidine                    +              + +
67630       Isopropanol                                           +
96695       4,4'-Thiobis(6-tert-butyl-m-cresol)                   + +
619170      4-Nitroanthranilic acid                               - -
298817      8-Methoxypsoralen                      +              + +
75058       Acetonitrile                                          + +
50782       Acetylsalicylic acid                                  -
50760       Actinomycin D                          +              I +
86500       Azinphosmethyl                                        -
92875       Benzidine                              +              + +
57578       Propiolactone                          +              + +
80057       Bisphenol A                                           +
75274       Bromodichloromethane                   +              + +
115286      Chlorendic acid                        +              I I
91645       Coumarin                               +              + +
4342034     Dacarbazine                            +              - -
103231      Di(2-ethylhexyl)adipate                +              + +
333415      Diazinon                                              - -
62737       Dichlorvos                             +              + +
828002      Dimethoxane                            +              -
98011       Furfural                               +              + +
87683       Hexachloro-1,3-butadiene               +              +
67721       Hexachloroethane                       +              + +
122667      Hydrazobenzene                         +              -
58935       Hydrochlorothiazide                                   I I
121755      Malathion                                             - -
298000      Methyl parathion                                      -
150685      Monuron                                +              - -
1212299     N,N'-Dicyclohexylthiourea                             I I
759739      N-Ethyl-N-nitrosourea                  +              + +
98953       Nitrobenzene                           +              + +
67209       Nitrofurantoin                         +              I +
59870       Nitrofurazone                          +              I I
55185       N-Nitrosodiethylamine                  +              + +
636215      o-Toluidine hydrochloride              +              -
106478      p-Chloroaniline                                       -
122601      Phenyl glycidyl ether                  +              + +
103855      Phenylthiourea                                        +
1918021     Picloram                                              - -
57681       Sulfamethazine                         +              -
79196       Thiosemicarbazide                                     + +
108054      Vinyl acetate                          +              + +
1330207     Xylenes (mixed)                                       + +
17924924    Zearalenone                            +              +

Abbreviations: +, carcinogenic; -, noncarcinogenic; I, inconclusive because out of the applicability domain; MZ, models
based on MolConnZ descriptors only; MZHTS, models based on the combination of MolConnZ and HTS descriptors.

Table 7. Summary of the statistical parameters of the prediction results of 50 external compounds.

                            Chemical descriptors only        Combined descriptors
Model characteristics       Exp. actives   Exp. inactives    Exp. actives   Exp. inactives
Pred. actives               18             8                 22             6
Pred. inactives             8              10                6              12
Sensitivity (%)             69.2                             78.6
Specificity (%)             55.5                             66.7
Overall predictivity (%)    62.3                             72.7
Coverage (%)                88                               92

Abbreviations: Exp., experimental; Pred., predicted.
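The summary statistics in Table 7 can be approximately recovered from the four counts in each confusion matrix. The sketch below is illustrative: the function name is hypothetical, and treating "overall predictivity" as the mean of sensitivity and specificity is an assumption that happens to agree with the tabulated values to within rounding; the source does not state the formula here.

```python
def summarize(tp, fp, fn, tn, n_total):
    """Summary statistics (in percent) from a 2x2 confusion matrix.

    tp: predicted active, experimentally active
    fp: predicted active, experimentally inactive
    fn: predicted inactive, experimentally active
    tn: predicted inactive, experimentally inactive
    n_total: all compounds, including those outside the applicability domain
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": 100.0 * sensitivity,
        "specificity": 100.0 * specificity,
        # Assumption: overall predictivity = mean of sensitivity and specificity.
        "mean_sens_spec": 50.0 * (sensitivity + specificity),
        # Coverage = fraction of compounds that received a prediction.
        "coverage": 100.0 * (tp + fp + fn + tn) / n_total,
    }


chem_only = summarize(tp=18, fp=8, fn=8, tn=10, n_total=50)
combined = summarize(tp=22, fp=6, fn=6, tn=12, n_total=50)

for label, stats in (("chemical only", chem_only), ("combined", combined)):
    print(label, {key: round(value, 1) for key, value in stats.items()})
```

Small differences in the last digit relative to Table 7 are expected, since the published values may have been truncated or computed from already-rounded intermediate results.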

-------
Zhu et al.
Table 8. Summary of the top 10 atom and bond type MolConnZ chemical descriptors used in successful
kNN QSAR models before and after using HTS descriptors.

No.  Descr_Name  Illustration                                        Freq_MZ  Ratio_MZ  Freq_MZHTS  Ratio_MZHTS
1    Snitroso    Sum of E-states of nitroso group                    38       36.9%     73          36.9%
2    nnitroso    Number of nitroso group                             34       33.0%                 34.8%
3    nHBint3     Number of hydrogen bond acceptor and donor pairs    27       26.2%                 15.7%
                 separated by 3 skeletal bonds
4    naasN       Number of aromatic nitrogen with substitute         25       24.3%     42          21.2%
5    SHBint3     Sum of E-state of strength for potential hydrogen   24       23.3%     41          20.7%
                 bonds if separated by 3 skeletal bonds
6    nHssNH      Number of amine groups                              24       23.3%     23
7    SdsN        Sum of E-states for nitrogens with one single       24       23.3%                 24.2%
                 bond and one double bond
8    SdsssP      Sum of E-states for phosphorus atoms with three     19       18.4%                 10.6%
                 single bonds and one double bond
9    SsBr        Sum of E-states for bromines                        19       18.4%     45          22.7%
10   SHssNH      Sum of H E-states for hydrogens in amine groups     18       17.5%     25          12.6%

Abbreviations: Descr_Name, name of descriptor; Freq_MZ, frequency of occurrence in successful kNN models
using only MolConnZ descriptors; Ratio_MZ, ratio of occurrence in successful QSAR models using only MolConnZ
descriptors; Freq_MZHTS, frequency of occurrence in successful kNN models using MolConnZ and HTS descriptors;
Ratio_MZHTS, ratio of occurrence in successful QSAR models using MolConnZ and HTS descriptors.
HTS descriptors, this implies that the HTS
adds no new information to the prediction of
in vivo carcinogenicity. Conversely, if the fre-
quency of a chemical descriptor decreases sig-
nificantly after the  HTS descriptors are
included, it is less important than, and likely
redundant with, the biological descriptors.  In
these cases, the biological descriptor is clearly
adding new, biologically significant information
that is not as effectively captured by the chemi-
cal descriptor.
    Interestingly, descriptors 1 and 2
(N-nitroso compounds) were selected as the
most important chemical descriptors in our
models, and their importance is relatively unaffected
by inclusion of HTS descriptors. The
large majority of N-nitroso compounds have
been found to produce genotoxic effects and to
cause tumor development in laboratory ani-
mals, as they are metabolized to reactive elec-
trophilic species causing damage to  various
cellular constituents such as DNA,  constituting
a key event  in the carcinogenic  mechanism
(Brambilla and Martelli 2007). Because such
metabolic transformations do not generally
occur in cellular systems, the significance of all
but one of the HTS assays  (i.e.,  HepG2)  in
predicting events  relevant to the carcinogenic-
ity for these  compounds is likely  to be  mini-
mal. To the contrary, the NTP-HTS data
show that cells are highly sensitive to the effects
of amine-type compounds (descriptors 6 and
10) and biological descriptors are better predic-
tors of the  carcinogenic potential of these
agents than structure alone. Among all 30 car-
cinogens that are also  active in HTS tests,  15
are  amines. A similar observation can be made
for organic compounds containing phosphorus
(descriptor 8). Most of the remaining chemical
descriptors,  which approximately delineate
neighborhoods of chemical space,  have similar
distribution among the models with or with-
out the HTS descriptors. Hence, HTS  descrip-
tors offer no additional value as predictors of
carcinogenicity for these chemical subsets. As
more HTS data are generated, the above analy-
sis suggests a strategy that can be used to eluci-
date possible mechanistic relevance of HTS
assays to  carcinogenicity prediction within
areas of chemical space approximately defined
by chemical descriptors.

Conclusions
We have examined the utility of in vitro
NTP-HTS data for predicting in vivo adverse
health effects (i.e., carcinogenicity) of environmental
agents. Our analysis suggests that
NTP-HTS results have limited predictive
power by themselves for rodent carcinogenicity.
This result is not surprising,  given the relatively
low frequency of positives across the HTS
assays (16%) and that cell  viability (i.e., cell
death) may  not be directly related to the car-
cinogenic potential of a compound. However,

-------
our data suggest that using the NTP-HTS
results as biological fingerprint descriptors of
generalized xenobiotic-induced pathophysiol-
ogical processes helps  improve the overall
QSAR-based prediction accuracy of rodent car-
cinogenicity compared with those  based on
chemical descriptors alone. While the mecha-
nistic relevance of the  HTS assays in predicting
rodent carcinogenicity is  unclear at present, the
empirical evidence of the significance of the
biological descriptors for the computational
modeling purposes is compelling  and should
motivate continued investigation. Furthermore,
as additional sets of compounds with known
in vivo toxicity responses are investigated in
cell-based viability assays, we shall continue to
develop models similar to those reported in this
article for additional  toxicity end points.  The
present analysis suggests that as more mechanis-
tically relevant HTS data are generated and a
greater number  of compounds  are  screened,
computational toxicology tools  could be used
to select the most relevant HTS assays (cell lines
and/or measurements) and prioritize  chemical
agents for screening. With sufficient improve-
ments in resulting model predictive perfor-
mance, in vitro HTS  bioassays,  coupled with
traditional chemical structure-based descriptors,
may be ultimately helpful in prioritizing or even
partially replacing in vivo toxicity testing.
 CORRECTION

 The following corrections have been made
 from the original manuscript published
 online. In the Abstract under "Methods and
 Results," the  phrase "curated data set of
 557 compounds"  has  been  changed to
 "curated data set of 384 compounds." The
 sentence "The resulting models had  predic-
 tion accuracies for training, test (containing
 400 compounds together), and  external vali-
 dation  (157 compounds) sets as high as
 79%, 79%, and  84%, respectively" has been
 changed to "The resulting models had pre-
 diction accuracies for training, test (contain-
 ing 275 compounds together),  and external
 validation (109 compounds) sets as high as
 89%, 71%, and  74%, respectively."
                   REFERENCES

Beresford AP, Segall M, Tarbit MH. 2004. In silico prediction of
    ADME properties: are we making progress? Curr Opin
    Drug Discov Devel 7:36-42.
Brambilla G, Martelli A. 2007. Genotoxic and carcinogenic risk
    to humans of drug-nitrite interaction products. Mutat Res
    635:17-52.
Bucher JR, Portier C. 2004. Human carcinogenic risk evalua-
    tion. Part V: The National Toxicology Program vision for
    assessing the human carcinogenic hazard of chemicals.
    Toxicol Sci 82:363-366.
CPDB (The Carcinogenic Potency Database).  2007. The
    Carcinogenic Potency Project. Available: http://potency.
    berkeley.edu/cpdb.html [accessed 14 December 2007].
Dearden JC. 2003. In silico prediction of drug toxicity. J Comput
    Aided Mol Des 17:119-127.
de Lima P, Golbraikh A, Oloff S, Xiao Y, Tropsha A. 2006.
    Combinatorial QSAR modeling of P-glycoprotein substrates.
    J Chem Inf Model 46:1245-1254.
Golbraikh A, Shen M, Xiao Z, Xiao YD, Lee KH, Tropsha A. 2003.
    Rational selection of training and test sets for the develop-
    ment of validated QSAR models. J Comput Aided Mol Des
    17:241-253.
Golbraikh A, Tropsha A. 2002a. Beware of q2! J Mol Graph
    Model 20:269-276.
Golbraikh A, Tropsha A.  2002b. Predictive QSAR modeling
    based on diversity sampling of experimental datasets for
    the training and test  set selection. J Comput Aided Mol
    Des 16:357-369.
Gold LS, Slone TH, Manley NB, Garfinkel GB, Hudes ES,
    Rohrbach L, et al. 1991. The Carcinogenic Potency
    Database: analyses of 4000 chronic animal cancer experi-
    ments published in the general literature and by the U.S.
    National Cancer Institute/National Toxicology Program.
    Environ Health Perspect 96:11-15.
Hall LH, Mohney B,  Kier LB. 1991. The electrotopological
    state: an atom index for QSAR. Quant Struct Act Relat
    10:43-51.
Inglese J, Auld DS, Jadhav A, Johnson RL, Simeonov A, Yasgar
    A, et al. 2006. Quantitative high-throughput screening: a
    titration-based approach that efficiently identifies biologi-
    cal activities in large chemical libraries. Proc Natl Acad
    Sci USA 103:11473-11478.
Johnson DE, Smith DA, Park BK. 2004. Linking toxicity and
    chemistry: think globally, but act locally? Curr Opin Drug
    Discov Devel 7:33-35.
Kier LB. 1986. Indexes of molecular shape from chemical
    graphs. Acta Pharmaceutica Jugoslavia 36:171-188.
Kier LB. 1987. Inclusion of symmetry as a shape attribute in
    kappa-index analysis. Quant Struct Act Relat 6:8-12.
Kier LB, Hall LH. 1991. A differential molecular connectivity
    Index. Quant Struct Act Relat 10:134-140.
Klopman G, Zhu H, Ecker G, Chiba P. 2003. MCASE study of the
    multidrug resistance reversal activity of propafenone
    analogs. J Comput Aided Mol Des 17:291-297.
Klopman G, Zhu H, Fuller MA, Saiakhov RD. 2004. Searching for
    an enhanced predictive tool for mutagenicity. SAR QSAR
    Environ Res 15:251-263.
Kovatcheva A, Golbraikh A, Oloff S, Xiao YD, Zheng W,
    Wolschann P, et al. 2004. Combinatorial QSAR of ambergris
    fragrance compounds. J Chem Inf Comput Sci 44:582-595.
Mayer P, Reichenberg F. 2006. Can highly hydrophobic organic
    substances cause aquatic  baseline toxicity and can they
    contribute to mixture toxicity? Environ Toxicol Chem
    25:2639-2644.
Mersch-Sundermann V, Knasmuller S, Wu XJ, Darroudi F,
    Kassie F. 2004. Use of a human-derived liver cell line for
    the detection of cytoprotective, antigenotoxic and
    cogenotoxic agents. Toxicology 198:329-340.
Nagai F, Hiyoshi Y, Sugimachi K, Tamura HO. 2002. Cytochrome
    P450 (CYP) expression in human myeloblastic and lym-
    phoid cell lines. Biol Pharm Bull 25:383-385.
NCBI (National Center for Biotechnology Information). 2007.
    The PubChem Project. Available: http://pubchem.
    ncbi.nlm.nih.gov [accessed 3 September 2007].
Ng C, Xiao Y, Putnam W, Lum B, Tropsha A. 2004. Quantitative
    structure-pharmacokinetic parameters relationships
    (QSPKR) analysis of antimicrobial agents in humans using
    simulated annealing k-nearest-neighbor and partial least-
    square analysis methods. J Pharm Sci 93:2535-2544.
NTP (National Toxicology Program). 2007. NTP High Throughput
    Screening Initiative. Available: http://ntp.niehs.nih.gov/go/
    28213 [accessed 14 December 2007].
Renner S, Schneider G. 2006. Scaffold-hopping potential of lig-
    and-based similarity concepts. ChemMedChem  1:181-185.
Richard AM. 2006. Future of toxicology—predictive toxicology:
    an expanded  view  of "chemical toxicity." Chem Res Toxicol
    19:1257-1262.
Schultz TW, Cronin MTD, Netzeva TI. 2003a. The present status
    of QSAR in toxicology. J Mol Struct Theochem 622:23-38.
Schultz TW, Netzeva TI, Cronin MT. 2003b. Selection of data
    sets for QSARs: analyses of Tetrahymena toxicity from
    aromatic compounds. SAR QSAR Environ Res 14:59-81.
Shen M, LeTiran A, Xiao Y, Golbraikh A, Kohn H, Tropsha A. 2002.
    Quantitative structure-activity  relationship analysis of func-
    tionalized amino acid anticonvulsant agents using k nearest
    neighbor and simulated annealing PLS methods. J Med
    Chem 45:2811-2823.
Shen M, Xiao Y,  Golbraikh A, Gombar VK, Tropsha A. 2003.
    Development and  validation of k-nearest-neighbor QSPR
    models of metabolic stability of drug candidates. J Med
    Chem 46:3013-3020.
Stoner CL, Gifford E, Stankovic C, Lepsy CS, Brodfuehrer J,
    Prasad JVNV, et  al. 2004. Implementation of an  ADME
    enabling selection and visualization tool for drug discov-
    ery. J Pharm  Sci 93:1131-1141.
Tropsha A, Gramatica  P, Gombar  VK. 2003. The importance of
    being earnest: validation is the absolute essential for suc-
    cessful  application and interpretation of QSPR models.
    QSAR Combi  Sci 22:69-77.
U.S. EPA (U.S. Environmental Protection Agency). 2007.
    Distributed Structure-Searchable Toxicity (DSSTox) Public
    Database Network. Available: http://www.epa.gov/ncct/
    dsstox/index.html [accessed 3 September 2007].
Willett P, Winterman V. 1986. A comparison of some measures
    for the determination of intermolecular structural similarity
    measures of intermolecular structural similarity. Quant
    Struct Act Relat 5:18-25.
Xia M, Huang R, Witt KL, Southall N, Fostel J, Cho MH, et al.
    2007. Compound cytotoxicity profiling using quantitative
    high-throughput  screening. Environ Health Perspect
    doi:10.1289/ehp.10727 [Online 22 November 2007].
Zhang S, Golbraikh A, Oloff S, Kohn H, Tropsha A. 2006. A novel
    automated lazy learning QSAR (ALL-QSAR) approach:
    method  development, applications, and virtual screening
    of chemical databases using validated ALL-QSAR models.
    J Chem Inf Model 46:1984-1995.
Zheng W, Tropsha A. 2000. Novel variable selection quantitative
    structure-property relationship approach based on
    the k-nearest-neighbor principle. J Chem Inf Comput Sci
    40:185-194.

-------
                                                                     Genetic Epidemiology 32: 767-778 (2008)
A Comparison of Analytical Methods for Genetic Association Studies
         Alison A. Motsinger-Reif,1 David M. Reif,2 Theresa J. Fanelli3 and Marylyn D. Ritchie3*

            1Bioinformatics Research Center, Department of Statistics, North Carolina State University, Raleigh, North Carolina
        2National Center for Computational Toxicology, US Environmental Protection Agency, Research Triangle Park, North Carolina
     3Center for Human Genetics Research and Department of Molecular Physiology and Biophysics, Vanderbilt University Medical School,
                                                  Nashville, Tennessee


The explosion of genetic information over the last decade presents an analytical challenge for genetic association studies. As
the number of genetic variables examined per individual increases, both variable selection and statistical modeling tasks
must be performed during analysis. While these tasks could be performed separately, coupling them is necessary to select
meaningful variables that effectively model  the data.  This challenge is heightened due to the complex nature of the
phenotypes under study and the complex underlying genetic etiologies. To address  this problem, a number of novel
methods have been developed. In the current study, we compare the performance of six analytical approaches to detect both
main effects and gene-gene interactions in a range of genetic models. Multifactor dimensionality reduction, grammatical
evolution neural networks, random forests, focused interaction testing framework, step-wise logistic regression, and explicit
logistic regression  were compared. As one might  expect, the relative success of each method is context dependent.  This
study demonstrates the strengths and weaknesses of each method and illustrates the importance of continued methods
development. Genet. Epidemiol. 32:767-778, 2008.    © 2008 Wiley-Liss, Inc.

Key words:  genetic association study; epistasis; multifactor dimensionality reduction; grammatical evolution  neural
networks; focused interaction testing framework


Contract grant sponsor: National Institutes of Health; Contract grant numbers: HL65962; GM62758; AG20135.
*Correspondence  to: Marylyn D Ritchie, Center for Human Genetics Research, Department of Molecular Physiology and Biophysics, 519
Light Hall, Vanderbilt University Medical School, Nashville, TN 37232-0700.  E-mail: ritchie@chgr.mc.vanderbilt.edu
Received 4 January 2008; Accepted 3 April 2008
Published online 16 June 2008 in Wiley InterScience (www.interscience.wiley.com).
DOI: 10.1002/gepi.20345
              INTRODUCTION

  The identification and characterization of genetic
variants that predict common, complex disease is an
important priority in the field of genetic epidemiol-
ogy. It is hypothesized that such diseases are the
result of the complex interplay between a myriad of
genetic and  environmental factors  [Moore,  2003;
Moore and Williams, 2005; Ritchie, 2005; Templeton,
2000; Thornton-Wells et al., 2004]. Additionally, as
genotyping technology has advanced, the volume of
genetic information  available  for  analysis  has ex-
ploded.  This explosion  of information, combined
with  the complex genetic architecture, has resulted
in  a  difficult  analytical  challenge  [Moore  and
Ritchie, 2004; Ritchie et al., 2005]. The dimensionality
involved in the evaluation of combinations of many
such  variables quickly diminishes the usefulness of
traditional, parametric statistical methods. This problem
is known as the curse of dimensionality [Bellman, 1961]: as
the number of genetic or environmental factors increases,
the number of possible interactions increases exponentially,
and many contingency table cells are left with very few,
if any, data points.
This results in a crucial need for analytical methods
that can simultaneously perform  variable selection
tasks along with statistical modeling in such high-
dimensional data.
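The growth described above can be made concrete with a short calculation. The sketch assumes three genotypes per locus (as for a biallelic SNP); the panel size of 100 loci and the sample size of 1,000 are hypothetical numbers chosen only for illustration.

```python
from math import comb


def genotype_cells(k, genotypes_per_locus=3):
    """Number of cells in the contingency table for a k-locus combination."""
    return genotypes_per_locus ** k


def locus_combinations(n_loci, k):
    """Number of distinct k-locus combinations among n_loci loci."""
    return comb(n_loci, k)


N_LOCI = 100      # hypothetical number of genotyped loci
N_SAMPLES = 1000  # hypothetical number of individuals

for k in range(1, 6):
    cells = genotype_cells(k)
    combos = locus_combinations(N_LOCI, k)
    print(f"k={k}: {combos} combinations, {cells} cells each, "
          f"about {N_SAMPLES / cells:.1f} individuals per cell on average")
```

With these illustrative numbers, a five-locus table already has 243 cells, so the average cell holds only a handful of individuals and many cells will be empty.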
  To address this challenge, a number of analytical
and computational methods have been developed to
detect and model genetic associations [Hahn et al.,
2003; Kooperberg et al., 2001; Moore, 2003; Nelson
et al., 2001; Ritchie et al., 2001,2003a; Tahri-Daizadeh
et al., 2003; Zhu and Hastie, 2004]. Since it is unlikely
that any one analytical method will be ideal in all
situations, it is important to empirically evaluate the
strengths and weaknesses of a variety of computa-
tional approaches. In the current study, we evaluate
the performance  of  multifactor dimensionality re-
duction  (MDR) [Ritchie et al.,  2001], grammatical
evolution neural networks (GENN) [Motsinger et al.,
2006a; Motsinger-Reif et al., 2008a], focused interac-
tion testing framework (FITF) [Millstein et al., 2006],
random  forests (RF) [Breiman, 2001], and logistic
regression (LR) [Hosmer and Lemeshow,  2000] on
main  effect,  two-locus, and   three-locus  genetic
models.  We  approach  this comparison  from an
end-user's perspective,  using commonly  available




-------
                     Motsinger-Reif et al.
software packages and author recommended con-
figuration parameters.
  As expected, the relative performance of each
method is context dependent, as each method has
its individual strengths and weaknesses. This study
highlights the utility of each method in its
respective context and stresses the importance of
continued methods development. No single method
will be optimal for all data  scenarios, and this study
illustrates this fact. The goal of this study is to aid a
common  user  in deciding  which computational
method is most appropriate for their particular data.

                  METHODS

MULTIFACTOR DIMENSIONALITY
REDUCTION (MDR)
  MDR was designed  to detect gene-gene interac-
tions  in the presence or absence  of main effects in
case-control studies in human genetics [Hahn et al.,
2003;  Ritchie et  al., 2001]. MDR has been shown to
have  high power  to detect interactions in a wide
range of  simulated  data  [Motsinger  and Ritchie,
2006b; Ritchie et al., 2003a; Velez et al., 2007] and has
identified interactions in common complex diseases
such  as multiple sclerosis  [Brassat et  al., 2006;
Motsinger et al.,  2007],  coronary artery disease
[Agirbasli  et  al.,  2006], and diabetic nephropathy
[Hsieh et al., 2006].
  MDR has previously  been  described  in  detail
[Hahn et al., 2003; Motsinger  and Ritchie,  2006a;
Ritchie  et  al., 2001]. Figure 1 illustrates the MDR
algorithm. The  data set is divided into  multiple
                                 partitions for cross-validation before analysis begins.
                                 Cross-validation [Hastie et al., 2001] is an important
                                 part of the algorithm in order to find a model that
                                 not only fits  the given data but can also predict on
                                 unseen data.  In this study, five-fold cross-validation
                                 is used, so that 4/5 of the data comprise the training
                                 set and the remaining 1/5 of the data comprise the
                                 testing set [Motsinger and Ritchie,  2006b].
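A minimal sketch of such a five-fold partition is shown below. It uses only the Python standard library; the function name, the fixed random seed, and the sample size are illustrative assumptions and do not reflect the MDR software's internals.

```python
import random


def five_fold_splits(n_samples, n_folds=5, seed=0):
    """Yield (training_indices, testing_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Deal the shuffled indices into n_folds roughly equal partitions.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, testing in enumerate(folds):
        # Training set = every partition except the held-out one.
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, testing


splits = list(five_fold_splits(n_samples=100))
for fold_number, (training, testing) in enumerate(splits, start=1):
    print(f"fold {fold_number}: {len(training)} training, {len(testing)} testing")
```

Each individual appears in exactly one testing set, so every data point is used once for evaluation and four times for training.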
                                   In step one, an exhaustive list of n combinations of
                                 genetic loci to evaluate from the list of all variables is
                                 created. Next, each of the n combinations is arranged
                                 in contingency tables in k-dimensional space with all
                                 possible  combinations  as individual  cells in the
                                 table.  Then,  the number of  cases  and  controls for
                                 each locus combination is counted.  In step three, the
                                 ratio  of cases  to controls  within  each  cell  is
                                 calculated.  Each  genotype  combination  is  then
                                 labeled as "high risk" or "low risk" of the phenotype
                                 of interest  based  on  comparison of the ratio  to  a
                                 threshold. The threshold used is dependent on the
                                 ratio of cases and controls within the data set. If the
                                 ratio within a multifactor combination is above that
                                 seen in the data, it is labeled as "high risk" and if it
                                 is  below, it  is  labeled as  "low  risk." This  step
                                 compresses  multidimensional genotype data  into
                                 one dimension with two classes.
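The labeling step described above can be sketched as follows. This is an illustrative simplification, not the MDR software itself: the function name and the toy data are hypothetical, and assigning cells whose ratio exactly equals the threshold to "low risk" is an assumption, since the text specifies only the above and below cases.

```python
from collections import Counter


def mdr_label_cells(genotypes, status, loci):
    """Label each multilocus genotype cell as 'high' or 'low' risk.

    genotypes: list of tuples of genotype codes, one tuple per individual
    status:    list of 1 (case) or 0 (control), one entry per individual
    loci:      indices of the loci that form the combination under evaluation
    """
    cases, controls = Counter(), Counter()
    for row, affected in zip(genotypes, status):
        cell = tuple(row[i] for i in loci)
        if affected == 1:
            cases[cell] += 1
        else:
            controls[cell] += 1

    n_cases = sum(status)
    n_controls = len(status) - n_cases

    labels = {}
    for cell in set(cases) | set(controls):
        # cases/controls in the cell > n_cases/n_controls in the data set,
        # cross-multiplied so that cells without controls need no division.
        above = cases[cell] * n_controls > controls[cell] * n_cases
        labels[cell] = "high" if above else "low"  # ties fall to "low" (assumption)
    return labels


# Hypothetical two-locus data: four cases and four controls.
toy_genotypes = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 0), (1, 0), (1, 1), (1, 1)]
toy_status = [1, 1, 0, 0, 0, 1, 1, 0]

cell_labels = mdr_label_cells(toy_genotypes, toy_status, loci=(0, 1))
print(cell_labels)
```

The returned dictionary is the one-dimensional, two-class representation of the multilocus genotype table: each individual can then be classified simply by looking up the label of its cell.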
                                    The  high-risk/low-risk profile  for  each of the
                                 multifactorial combinations represents  the MDR
                                 model for a  particular combination of multi-locus
                                 genotypes.  Balanced accuracy, the  arithmetic mean
                                 of sensitivity and  specificity, is calculated for each
                                 model.  Balanced  accuracy is  used as the fitness
                                 metric in the  MDR algorithm to solve the challenges
Fig. 1. Multifactor dimensionality reduction. The steps correspond to those described in the Methods section.


-------
                                Analytical Methods for Genetic Epidemiology
presented by an imbalanced  number of cases  and
controls in a  data set. In combination with adjusting
the threshold used  in assigning high- or low-risk
status, it  has been  shown  that balanced accuracy
makes MDR robust  to class imbalance [Velez et al.,
2007].  In the  case  of  balanced  data,  balanced
accuracy is mathematically equivalent to classification
accuracy. The best k-locus model is selected and
evaluated against the testing group, and its testing
(prediction) accuracy is calculated. This is repeated
for each cross-validation interval (i.e. training set
and testing set) and the average training accuracy
and testing accuracy are calculated. Among all of the
k-locus models created, the single model with the
highest cross-validation consistency is chosen as the
best k-locus model. This process is completed for
each number of loci, k = 1 to N, that is computationally
feasible. An optimal k-locus model is chosen
for each level of k considered, so a one-locus model,
two-locus model, three-locus model, etc. collectively
comprise a set.
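
As a minimal sketch of the fitness metric described above, the following Python functions compute balanced accuracy and plain classification accuracy from confusion-matrix counts. The function and argument names are illustrative assumptions and are not taken from the MDR software.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Arithmetic mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)  # fraction of cases labeled high-risk
    specificity = tn / (tn + fp)  # fraction of controls labeled low-risk
    return (sensitivity + specificity) / 2


def classification_accuracy(tp, fn, tn, fp):
    """Fraction of all individuals that are classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


# Balanced data (100 cases, 100 controls): the two metrics agree.
print(balanced_accuracy(80, 20, 70, 30), classification_accuracy(80, 20, 70, 30))

# Imbalanced data (20 cases, 180 controls): labeling everyone low-risk
# gives a plain accuracy of 0.9 but a balanced accuracy of only 0.5.
print(balanced_accuracy(0, 20, 180, 0), classification_accuracy(0, 20, 180, 0))
```

The second call illustrates why balanced accuracy is preferred when cases and controls are unequal in number: a trivial classifier is not rewarded for favoring the larger class.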
   Once this set of models is computed, a final model
is  chosen. The  final model  is selected based on
maximization of both testing accuracy  and cross-
validation consistency. Testing accuracy is how well
the model predicts risk/disease status in indepen-
dent testing sets generated through cross-validation
and is calculated as described above. Cross-valida-
tion consistency is the number of times a model  is
identified across the cross-validation sets. For five-
fold cross-validation, the consistency can range from
one to  five.  A higher  value  of  cross-validation
consistency   represents  stronger  support  for  the
model. When testing accuracy and cross-validation
                              consistency indicate different models, the rule of parsimony is
                              used to choose the simpler model.
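
The final-model heuristic can be sketched as follows. This is one plausible reading of the rule stated above, not the MDR implementation: the dictionary keys and the tie handling are assumptions made for illustration.

```python
def select_final_model(models):
    """Pick a final model from the set of best k-locus models.

    Each model is a dict with the (assumed) keys 'loci', 'testing_accuracy',
    and 'cv_consistency'.
    """
    best_by_accuracy = max(models, key=lambda m: m["testing_accuracy"])
    best_by_consistency = max(models, key=lambda m: m["cv_consistency"])
    if best_by_accuracy is best_by_consistency:
        return best_by_accuracy
    # The two criteria disagree: apply parsimony and keep the simpler model.
    return min((best_by_accuracy, best_by_consistency), key=lambda m: len(m["loci"]))


# Hypothetical results for one-, two-, and three-locus models.
models = [
    {"loci": (4,), "testing_accuracy": 0.61, "cv_consistency": 5},
    {"loci": (1, 4), "testing_accuracy": 0.72, "cv_consistency": 4},
    {"loci": (1, 4, 7), "testing_accuracy": 0.70, "cv_consistency": 2},
]
final = select_final_model(models)
print(final["loci"])
```

Here the two criteria point to different models, so the one with fewer loci is returned.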

                              GRAMMATICAL EVOLUTION NEURAL
                              NETWORKS  (GENN)
                                GENN is a  machine-learning approach  designed
                              to detect gene-gene  interactions in the presence or
                              absence of marginal main effects  [Motsinger et al.,
                              2006a; Motsinger-Reif et  al.,  2008a].  GENN was
                              designed to perform variable selection without the
                              computational burden of exhaustively searching all
                              possible variable combinations. GENN has success-
                              fully identified interactions in a wide range of data
                              simulations [Motsinger et al., 2006a,b,c; Motsinger-
                              Reif et al., 2008a], and a real-data application in HIV
                              immunogenomics [Motsinger-Reif et al., 2008a].
                                Methodology and  software have been previously
                              described for  GENN [Motsinger et al., 2006a; Mot-
                              singer-Reif et al., 2008a]. GENN utilizes grammatical
                              evolution (GE) [O'Neill and Ryan, 2001] to optimize
                              the inputs, architecture,  and  weights of an NN.
                              Details of  GE can be  found in O'Neill and Ryan
                              [2003]. Briefly, a binary string genome is mapped
                              into a functional NN according to the rules specified
                              in a pre-defined Backus-Naur Form grammar. The
                              grammar used in GENN is available in Motsinger-
                              Reif et al.  [2008a] or from the authors on request.
                              Evolutionary  operators operate at the level of the
                              binary string  genome, automatically evolving the
                              optimal NN for a given data set.
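
To illustrate the genome-to-program mapping used in grammatical evolution, the sketch below reads a binary string as integer codons and uses each codon, modulo the number of available productions, to choose how to expand the leftmost non-terminal. The toy grammar is hypothetical and far simpler than the GENN grammar; the 8-bit codon size and wrapping count of 2 are used only as illustrative defaults.

```python
# Toy Backus-Naur Form grammar (hypothetical; not the GENN grammar).
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"]],
    "<op>": [["+"], ["*"]],
    "<var>": [["x1"], ["x2"], ["x3"]],
}


def codons_from_bits(bits, codon_size=8):
    """Split a binary string into integer codons."""
    stop = len(bits) - codon_size + 1
    return [int(bits[i:i + codon_size], 2) for i in range(0, stop, codon_size)]


def map_genome(codons, start="<expr>", max_wraps=2):
    """Expand the leftmost non-terminal until only terminals remain.

    Returns None if the codons (reused up to max_wraps times) run out.
    """
    symbols, output, used = [start], [], 0
    limit = len(codons) * (max_wraps + 1)
    while symbols:
        symbol = symbols.pop(0)
        if symbol not in GRAMMAR:
            output.append(symbol)  # terminal symbol
            continue
        choices = GRAMMAR[symbol]
        if used >= limit:
            return None  # mapping failed: genome exhausted
        # A codon selects a production by taking the remainder.
        production = choices[codons[used % len(codons)] % len(choices)]
        used += 1
        symbols = list(production) + symbols
    return " ".join(output)


print(map_genome([0, 1, 0, 1, 1, 2]))
```

Because evolutionary operators act only on the binary string, any change to a codon can alter which production is chosen and therefore the structure of the resulting program.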
                                The  GENN  algorithm is depicted in Figure 2. In
                              step one the user initializes a set of parameters in the
                              configuration  file. These parameters  specify details
                              of the  evolutionary process implemented in GENN.
                              These  items include: crossover rate,  mutation rate,
Fig. 2. Grammatical evolution neural networks. The steps correspond to those described in the Methods section.
[Figure not reproduced. The schematic shows the numbered GENN steps, including tournament selection and tables of classification error and prediction error for GENN models.]

-------
Motsinger-Reif et al.
population size, maximum number of generations,
type of selection, and type of crossover. Second, the
data are divided into 10 equal parts for 10-fold cross-
validation. Cross-validation  is implemented to de-
velop a model that not only fits the data at hand but
that can also generalize to future, unseen data. In 10-
fold cross-validation, 9/10 of the  data are used for
training, to develop an NN model, and the other
1/10 of the data are used to evaluate the predictive
ability of the  model.  This is repeated  for  each
possible 9/10:1/10 split of the data. Third, an initial
population of random solutions is generated using
sensible  initialization  [O'Neill and  Ryan,  2003].
Sensible initialization guarantees functional NNs in
the initial population. Details of this process can be
found in Ritchie et al. [2003b]. Fourth, each newly
generated NN  is evaluated on  the data  in the
training  set and its  fitness recorded. The  fitness
function used is balanced error (or 1 − balanced
accuracy), where balanced accuracy is (sensitivity +
specificity)/2. Higher accuracy represents higher
reproductive fitness  in  the GENN population. This
fitness function  was implemented to ensure GENN
is robust to class imbalance in a data set [Hardison
et al., 2008]. Fifth, a  user-specified selection techni-
que selects  the  best solutions for crossover and
reproduction. A proportion of the best solutions will
also be directly copied (reproduced)  into the new
generation. Another  proportion of solutions will be
used  for crossover with  other best solutions. The
cycle begins with this newly created  generation,
which is equal in size to the original population. This
cycle continues  until either a classification error of
zero is found or a user-specified limit on the number
of generations is reached. After each generation, an
optimal solution is identified. At the end of GENN
evolution, the overall best solution is selected as the
optimal NN. Sixth, this best GENN  model is tested
on  the testing portion  of the data to estimate the
predictive balanced  error of the model.  Steps two
through six are  performed  10 times with the same
parameter settings, each time using a different 9/10
of the data for  training and 1/10 of the data for
testing. At the end of a GENN analysis,  10 models
are generated—one  best model  from each  cross-
validation interval. A final model is chosen based on
maximization of the cross-validation consistency of
variables/loci across the 10 models. A higher value
of cross-validation consistency represents stronger
support for the model.
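
The cross-validation bookkeeping described above can be sketched as follows. The helper names are assumptions, and two simplifications are made: individuals are assigned to folds in index order (a real analysis would shuffle first), and consistency is counted for whole sets of loci rather than for individual variables.

```python
from collections import Counter


def kfold_splits(n_individuals, k=10):
    """Yield (train, test) index lists; each fold serves as the test set once."""
    indices = list(range(n_individuals))
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def most_consistent_model(best_models):
    """Return the loci set selected most often across folds and its count."""
    counts = Counter(frozenset(model) for model in best_models)
    loci, consistency = counts.most_common(1)[0]
    return sorted(loci), consistency


splits = list(kfold_splits(1000, k=10))

# Hypothetical best model (set of loci) from each of 10 cross-validation intervals.
best_models = [(3, 7)] * 6 + [(3,)] * 3 + [(7, 3)]
print(most_consistent_model(best_models))
```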

RANDOM FORESTS (RF)
  RF  is a machine-learning technique that builds a
forest of classification trees wherein each component
tree is grown from a bootstrap sample of the data,
and the variable at each tree node is selected from a
random subset of all variables in the data [Breiman,
2001].  The final classification of an  individual is
           determined by voting over all trees in the forest.
           There are several advantages of the RF method that
           make it a  promising technique for genetic associa-
           tion studies. First,  it can handle a large number of
           input variables. Second,  it estimates  the  relative
           importance of variables in determining classifica-
           tion, thus  providing a metric for  feature selection.
           Third, RF produces a highly accurate classifier with
           an internal  unbiased  estimate  of generalizability
           during the forest  building process. Fourth, RF is
           fairly robust in  the presence  of etiological hetero-
           geneity and relatively high amounts of missing data
           [Lunetta et  al.,  2004].  Finally,  and  of increasing
           importance  as  the number of  input  variables
           increases, learning is  fast and computation time is
           modest  even  for  very  large  data sets [Robnik-
           Sikonja, 2004]. RF has successfully identified disease
           susceptibility variants  in a  variety  of real-data
           applications  in  genetic epidemiology such as drug
           response [Sabbagh and Darlu, 2006] and dermato-
           myositis [Mamyrova et al., 2006].
              Each tree in the forest  is constructed as  follows
           from data  having N individuals and M  explanatory
           variables:
           (1) Choose a training sample by selecting N indivi-
               duals, with replacement, from the entire data set.
           (2) At  each  node  in  the tree, randomly select m
               variables from the entire set of M variables in the
               data. The absolute magnitude of m is a function
               of the  number of  variables in  the data  set and
               remains constant throughout the forest building
               process.
           (3) Choose the best split  at the current node from
               among the subset of m variables selected above.
           (4) Iterate  the second and third steps until the tree is
               fully grown (no pruning).
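
Steps (1) and (2) can be sketched in a few lines. The helper names are hypothetical, and the default m = floor(sqrt(M)) is an assumption (a common choice for classification forests); the text above states only that m depends on the number of variables and stays fixed.

```python
import math
import random


def bootstrap_sample(n_individuals, rng):
    """Step 1: draw N individuals with replacement; return in-bag and out-of-bag indices."""
    in_bag = [rng.randrange(n_individuals) for _ in range(n_individuals)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_individuals) if i not in drawn]
    return in_bag, out_of_bag


def candidate_variables(n_variables, rng, m=None):
    """Step 2: pick m of the M variables at random as split candidates for a node."""
    if m is None:
        m = max(1, int(math.sqrt(n_variables)))  # assumed default
    return rng.sample(range(n_variables), m)


rng = random.Random(0)
in_bag, out_of_bag = bootstrap_sample(1000, rng)
split_candidates = candidate_variables(100, rng)
print(len(in_bag), len(out_of_bag), len(split_candidates))
```

The individuals never drawn in step 1 form the out-of-bag set for that tree.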

              Repetition of this algorithm yields a forest of trees,
           each of which has been trained on bootstrap samples
           of individuals (see Fig. 3). Thus,  for a given tree,
           certain individuals  will have  been left out during
           training. Prediction error and variable importance are
           estimated from these "out-of-bag" individuals.
              The out-of-bag (unseen) individuals  are used to
           estimate the importance of particular variables by
           randomly permuting the values of that variable and
           testing whether these permutations adversely affect
           the predictive  ability  of trees  in classifying out-of-
           bag samples. If randomly permuting  values of a
           particular  variable  does not  affect the predictive
           ability of trees on out-of-bag samples, that variable is
           assigned  a  low importance score. If randomly
           permuting  the  values  of a  particular  variable
           drastically impairs the ability of trees  to correctly
           predict the class of out-of-bag samples, then  the
           importance score of that variable  will be high. By
           running out-of-bag samples  down  entire  trees
           during the permutation procedure, interactions are
           taken into account  when  calculating  importance




-------
Fig. 3. Random forests. The steps correspond to those described in the Methods section.
[Figure not reproduced. The schematic shows a random subset of m variables being drawn from the full set of M variables.]
scores, since class is assigned in the context of other
variable nodes in the tree.
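
A minimal sketch of the permutation idea for a single tree follows. The `toy_tree` function is a hypothetical stand-in for a fitted tree, and the data are made up; a real forest averages this quantity over all trees, each evaluated on its own out-of-bag individuals.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted class matches the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(predict, rows, labels, column, rng):
    """Drop in out-of-bag accuracy after shuffling one variable's values."""
    baseline = accuracy(predict, rows, labels)
    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
    return baseline - accuracy(predict, permuted, labels)


def toy_tree(row):
    """Hypothetical tree that splits only on variable 0."""
    return 1 if row[0] >= 1 else 0


rng = random.Random(1)
oob_rows = [[rng.randint(0, 2), rng.randint(0, 2)] for _ in range(200)]
oob_labels = [toy_tree(r) for r in oob_rows]

print(permutation_importance(toy_tree, oob_rows, oob_labels, 0, rng))
print(permutation_importance(toy_tree, oob_rows, oob_labels, 1, rng))
```

Variable 1 is never used by the toy tree, so shuffling it cannot change any prediction and its importance is exactly zero; shuffling variable 0 can only lower accuracy.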
  The  recursive  partitioning trees comprising an
RF provide an explicit representation of variable
interaction  [Breiman et al.,  1984;  Province  et al.,
2001]. Thus, these models may uncover interactions
among  factors that  do not exhibit  strong marginal
effects,  without  demanding a pre-specified  model
[McKinney  et al., 2006]. Additionally, tree methods
are suited to  dealing with certain  types of genetic
heterogeneity, since  splits near the root node define
separate model subsets in the data. RFs capitalize on
the  solid  benefits  of  decision  trees  and  have
demonstrated  excellent  predictive   performance
when the forest  is diverse (i.e. trees are not  highly
correlated   with  each   other)  and composed  of
individually strong  classifier trees [Breiman, 2001;
Bureau  et al.,  2005].

FOCUSED INTERACTION TESTING FRAMEWORK (FITF)
  The   FITF  was  recently  developed  to  detect
epistatic interactions that predict disease risk. De-
tails of the FITF algorithm  and  software can be
found in Millstein et al. [2006]. FITF is a modification
of the interaction testing framework (ITF) method
that pre-screens all possible gene sets to focus on
those that are potentially the most informative,
reducing the multiple testing problem by reducing the
number of statistical tests performed. FITF has been
shown  to  outperform  MDR   when   interactions
involved  additive,  recessive, or  dominant   genes
[Millstein  et   al.,  2006].  Additionally,  FITF has
successfully identified  a multi-locus model that
predicts childhood asthma [Millstein et al., 2006].
           The ITF strategy performs a series of LR analyses
         in incremental  stages,  where  the  highest-order
         interaction parameter  considered increases at each
         subsequent stage. In stage one, the main effect of
         each genetic variant is considered; in stage two, all
         pair-wise combinations are tested; in stage three, all
         three-way interactions are tested; and so on. In order to
         avoid re-testing the same effects, if a  variant  or
         multi-locus combination is declared significant in an
         earlier stage,  those variants are not re-tested  in
         subsequent stages. The overall type I error is
         controlled by dividing the overall desired α level
         by the number of stages and allocating this adjusted
         α level to each stage. Within each stage, the
         significance threshold is adjusted by controlling the
         false discovery rate [Benjamini and Hochberg, 1995].
         This approach tests for interactions even in the
         absence of main effects.
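
The two levels of error control described above can be sketched as follows: an even split of the overall α across stages, and a Benjamini-Hochberg step-up procedure within a stage. This is an illustrative sketch with assumed function names, not the ITF or FITF source code.

```python
def stage_alpha(overall_alpha, n_stages):
    """Allocate the overall alpha level evenly across stages."""
    return overall_alpha / n_stages


def benjamini_hochberg(p_values, q):
    """Return indices of the tests rejected at false discovery rate q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        # Step-up rule: keep the largest rank whose p-value is under its threshold.
        if p_values[i] <= rank / m * q:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


alpha_per_stage = stage_alpha(0.05, 3)
print(round(alpha_per_stage, 6))

# Hypothetical p-values from one stage, tested at q = 0.05 for illustration.
print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.9], q=0.05))
```

With three stages and an overall α of 0.05, each stage receives approximately 0.016667.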
           The FITF algorithm modifies the ITF approach  to
         reduce the overall number of variants tested with an
         initial filter process. A χ² goodness-of-fit statistic
         that compares the observed with the expected
         Bayesian distribution of multi-locus genotype combinations
         in a combined case-control population is
         used in a prescreening initial stage. This statistic,
         referred to as the chi-square subset (CSS), has the
         form:

         $$\mathrm{CSS} = \sum_{i=1}^{r} \frac{[n_i - E(n_i)]^2}{E(n_i)}$$

         where $n_i$ is the observed number of subjects
         (regardless of case/control status) in the ith genotype
         group and r is the total number of genotype
         groups. The expected $n_i$, denoted $E(n_i)$, is estimated
         based on the sample marginal genotype frequencies
         of each gene.
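
As a rough illustration for two loci, the sketch below computes a chi-square goodness-of-fit statistic of this kind. It assumes the standard form, the sum over genotype groups of (observed − expected)² / expected, and it assumes that expected counts are the product of the marginal genotype frequencies at each locus; the function name is hypothetical and this is not the FITF code.

```python
from collections import Counter
from itertools import product


def chi_square_subset(genotypes_a, genotypes_b):
    """Goodness-of-fit of observed two-locus genotype counts to marginal expectations."""
    n = len(genotypes_a)
    observed = Counter(zip(genotypes_a, genotypes_b))
    freq_a = {g: c / n for g, c in Counter(genotypes_a).items()}
    freq_b = {g: c / n for g, c in Counter(genotypes_b).items()}
    css = 0.0
    for ga, gb in product(freq_a, freq_b):
        expected = n * freq_a[ga] * freq_b[gb]
        # Counter returns 0 for genotype groups that were never observed.
        css += (observed[(ga, gb)] - expected) ** 2 / expected
    return css


# Cases and controls pooled; genotypes coded by an arbitrary label per locus.
print(chi_square_subset([0, 0, 1, 1], [0, 1, 0, 1]))  # loci independent
print(chi_square_subset([0, 0, 1, 1], [0, 0, 1, 1]))  # loci perfectly associated
```

A larger value indicates that the joint genotype distribution departs more strongly from what the single-locus frequencies predict.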

         LOGISTIC REGRESSION (LR)
           LR  is a derivative of linear regression  that fits a
         function to continuous or discrete independent vari-
         ables  based on  a dichotomous dependent variable
         [Hosmer and Lemeshow, 2000]. LR uses a transforma-
         tion of the logistic distribution to develop a function
         based  on  the  independent variables. The  logistic
         distribution provides  a  flexible  platform with a
         clinically  meaningful interpretation. The distribution
         is transformed by the logit function,  allowing tradi-
         tional regression techniques to be applied. Using this
         formulation, the  predicted values from regression are
         dichotomous with binomially distributed errors. The
         regression function is fit iteratively by maximum
         likelihood under the binomial distribution (commonly
         via iteratively reweighted least squares); the quantity
         maximized is the logarithm of the likelihood, called
         the log-likelihood. As in linear regression, partial
         derivatives of the objective function are evaluated
         to find the optimum.
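
To make the fitting procedure concrete, here is a minimal single-predictor sketch that maximizes the binomial log-likelihood by plain gradient ascent. Standard software typically uses Newton-type iterations instead; the step size, iteration count, and toy data are assumptions chosen only for illustration.

```python
import math


def logistic(z):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(b0, b1, xs, ys):
    """Binomial log-likelihood of intercept b0 and slope b1."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = logistic(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total


def fit_logistic(xs, ys, step=0.05, iterations=500):
    """Gradient ascent using the partial derivatives of the log-likelihood."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        residuals = [y - logistic(b0 + b1 * x) for x, y in zip(xs, ys)]
        b0 += step * sum(residuals)                              # d logL / d b0
        b1 += step * sum(r * x for r, x in zip(residuals, xs))   # d logL / d b1
    return b0, b1


# Toy data: genotype coded as minor-allele count, status 1 = case.
xs = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
ys = [0, 0, 1, 0, 1, 0, 1, 1, 0, 1]
b0, b1 = fit_logistic(xs, ys)
print(log_likelihood(0.0, 0.0, xs, ys), log_likelihood(b0, b1, xs, ys))
```

The fitted coefficients give a higher log-likelihood than the starting values of zero, which is the sense in which the fit "minimizes error" for a dichotomous outcome.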

-------
  One of the most common procedures for variable
selection in an LR analysis is step-wise logistic
regression (stepLR) [Hosmer and Lemeshow,
2000]. In the step-wise procedure, each variable is
tested for independent effects, and those  variables
with significant effects are included in the model. In
a second  step, interaction terms of those  variables
with  significant  main effects  are included, and
significant effects are included in the model.
  LR is a de facto standard for traditional association
studies. Using independent variables to predict a
dichotomous dependent variable, LR by definition
lacks the  ability to  characterize purely interactive
effects. Only variables that  contain an independent
main effect will be included in the final model.  To
properly  evaluate  non-linear  purely  interactive
effects, combinations of variables  must be encoded
as a single variable for inclusion in the analysis. Such
an encoding scheme can be computationally expen-
sive, depending  on the number of variables used.
  In the current study, we use stepLR to assess its
performance in both variable selection and
statistical modeling tasks. Additionally, since LR
modeling is the standard in genetic epidemiology,
we  use explicit LR (eLR) as a  "positive control" to
assess the strength of  the genetic  signal in  the
simulated data. In eLR, the  known simulated  effect
is explicitly modeled. So for example, for a two-locus
interactive model, an  LR equation would be built
with parameters for the independent effects of both
loci and the interaction term.
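
For concreteness, a two-locus eLR model of this kind could be written as below. The symbols are assumptions introduced here: $x_1$ and $x_2$ denote the genotype codings at the two loci (for example, counts of the minor allele) and $p$ is the probability of being a case. The coding actually used in the study is not stated in this section.

```latex
\operatorname{logit}(p) = \ln\frac{p}{1-p}
  = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2
```

Here $\beta_1$ and $\beta_2$ are the independent (main) effects and $\beta_3$ is the interaction term.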


DATA SIMULATION
  Simulated data sets that exhibit both main effects
and gene-gene interactions were generated in order
to compare the performance of the above methods in
a variety of situations. Multiple disease models were
generated with varying allele frequencies, heritability,
odds ratios, and number of functional
polymorphisms.
  We simulated  case-control data  with both main
effects and purely epistatic models. We generated
data sets  demonstrating main effects to test  the
performance  of all the methods studied to perform
simpler variable selection tasks. We also simulated
models that exhibit epistatic effects in the absence of
main effects. Epistasis occurs when the combined
effect of two or more genes on a phenotype cannot
be predicted from their individual genotypes.
Epistasis  is  increasingly accepted  as  a  common
feature  of the  genetic architecture  of  common,
complex  disease  [Moore,  2003]  and  presents  a
difficult analytical challenge. For this study, epistatic
disease models with no main effects were simulated
to challenge  the  analytical methods.  While   an
effectively infinite range of genetic  models  could
be simulated, simple main effect and purely epistatic
           models were chosen to represent extremes from an
           analytical point of view.
              Two different minor allele frequency scenarios
           were chosen: 0.2 and 0.4. Two different values of
           heritability were used in our simulations: 1 and 5%.
           Roughly, heritability in the broad sense describes the
           proportion of  the  total phenotype/disease that  is
           due to genetic effects. More specifically, the exact
           heritability calculations  used  can  be  found in
           Culverhouse et  al.  [2002]. A range of odds  ratio
           values was selected for each heritability value. For
           the lower heritability models, odds ratios of 1.2, 1.4,
           1.6,  1.8, and  2.0 were simulated.  For the higher
           heritability models, odds ratios of 2.0, 2.5, 3.0, 3.5,
           and 4.0 were simulated. This range of heritability
           and odds ratio values was chosen to represent a
           "worst-case scenario" of common diseases to test the
           lower limits  of the  methods.  By  testing  these
           extremely  low-signal  models, the study is able to
           distinguish differences in  the  lower performance
           limits of the  analytical methods. It is  assumed  that
           any method that  can find  such minimal effects
           should have greater posterior probabilities to find
           more  substantial effects.  Additionally,  null  data
           (with no disease model) were simulated to estimate
           the type I error rates of each method.
             A range of functional interacting loci  was  simu-
           lated using penetrance functions. Penetrance func-
           tions  define  the probability of disease  given  a
           particular  genotype  combination  to  model  the
           relationship between genetic  variations and disease
           risk. The range of  functional loci selected included
           one-, two-, and three-locus interaction models. All
           penetrance functions used in the current study are
           available from the authors on request.
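
The sketch below shows how a penetrance function can drive a case-control simulation. Everything specific in it is an assumption for illustration: the penetrance table is a textbook-style purely epistatic example for a minor allele frequency of 0.5 (not one of the study's models), the two loci are taken to be independent and in Hardy-Weinberg proportions, and the function names are hypothetical.

```python
import random


def hwe_frequencies(maf):
    """Genotype frequencies for 0, 1, 2 minor alleles under Hardy-Weinberg."""
    return [(1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2]


# Hypothetical table: PENETRANCE[g1][g2] = P(disease | genotypes g1, g2).
PENETRANCE = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]


def marginal_penetrance(table, maf):
    """Disease probability for each genotype at locus 1, averaged over locus 2."""
    freqs = hwe_frequencies(maf)
    return [sum(f * table[g1][g2] for g2, f in enumerate(freqs)) for g1 in range(3)]


def simulate(table, maf, n_cases, n_controls, rng):
    """Draw individuals until the requested numbers of cases and controls are filled."""
    freqs = hwe_frequencies(maf)
    cases, controls = [], []
    while len(cases) < n_cases or len(controls) < n_controls:
        g1, g2 = rng.choices(range(3), weights=freqs, k=2)
        affected = rng.random() < table[g1][g2]
        target, quota = (cases, n_cases) if affected else (controls, n_controls)
        if len(target) < quota:
            target.append((g1, g2))
    return cases, controls


print(marginal_penetrance(PENETRANCE, 0.5))
cases, controls = simulate(PENETRANCE, 0.5, 20, 20, random.Random(0))
print(len(cases), len(controls))
```

For this table the marginal penetrance is the same for every single-locus genotype, so neither locus shows a main effect even though the pair determines risk.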
             Data sets were simulated using software described
           by Moore et al. [2004]. All possible combinations of
           allele frequency, heritability/odds ratio,  and inter-
           acting loci were modeled, resulting in 30 models.
           One  hundred  data sets  were  generated for  each
           model, resulting in 3,000  data sets. Each data set
           included 1,000 total individuals—500 cases and 500
           controls. While the number of functional loci present
           varied between models, the total number of single
           nucleotide polymorphisms (SNPs) was constant in
           each data set (100 SNPs per individual).

           DATA ANALYSIS
             All replicates of all disease  models were analyzed
           by each of the following methods: MDR, GENN, RF,
           FITF, stepLR, and  eLR. The number of times each
           method  identified  the correct  simulated model
           (including all correct  loci  and only correct loci) as
           the final or best model across the 100 replicates  was
           used to estimate the posterior probability of identi-
           fying the simulated model. We have chosen to use
           posterior probability, rather than empirical power to
           evaluate performance in these simulations, because




-------
each of the methods  used for this comparison has
different prior probabilities for detecting interaction
effects. For example, the prior probability for stepLR
to detect a purely epistatic two-locus model is
extremely small (close to zero) because there are no
main effects to condition on.  However, the prior
probability  for MDR  is quite  high because  MDR
performs  an exhaustive  search.  For  all methods
except  RF,  the  final model  selected  must  be
statistically significant to contribute to the posterior
probability estimate. For MDR and GENN, permuta-
tion testing was  used  to empirically  estimate
statistical   significance,  with a   P-value  of  0.05
considered significant. For the eLR, stepLR, and
FITF analyses, a P-value of 0.05  was used  in the
analysis to determine significance. Because  RF  is
most commonly used as  a filter, and not for strict
association testing,  no estimate of statistical signifi-
cance was  used in  the  RF  posterior  probability
estimations.
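
The performance measure described above amounts to a simple count over replicates. The sketch below is an illustration with an assumed data layout (one pair of selected loci and P-value per replicate) and hypothetical results; for RF, the significance requirement is switched off.

```python
def posterior_probability(results, true_loci, alpha=0.05, require_significance=True):
    """Percent of replicates whose final model is exactly the simulated model."""
    target = frozenset(true_loci)
    hits = 0
    for selected_loci, p_value in results:
        exact = frozenset(selected_loci) == target  # all correct loci, only correct loci
        significant = (not require_significance) or p_value < alpha
        if exact and significant:
            hits += 1
    return 100.0 * hits / len(results)


# Hypothetical outcome of 100 replicates of a two-locus model at loci 3 and 7.
results = (
    [((3, 7), 0.01)] * 60      # correct and significant
    + [((3,), 0.01)] * 25      # incomplete model
    + [((3, 7), 0.20)] * 10    # correct but not significant
    + [((3, 7, 9), 0.001)] * 5 # correct loci plus a noise locus
)
print(posterior_probability(results, (3, 7)))
print(posterior_probability(results, (3, 7), require_significance=False))
```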
  While we understand  this performance assess-
ment is strict, we feel it is the fairest  comparison
across many methods. Also, since the  end-goal of
many analyses is to find the best model that predicts
disease  status, we feel this is a practical assessment.
Null data  were used to estimate the type I error rate
of each method. Alpha was set  to  0.05 for each
method, and the type I error rate  was estimated as
the number of times a significant  result was found
across  100 replicates. Again, because  RF is most
commonly used as a filter, no false-positive rate was
estimated.
  All analyses were   performed  with  a common
user's  perspective  in mind.  Configuration  para-
meters for each software program were set according
to the author default settings or recommendations.
  MDR  analysis was  performed  using a  Linux
version of the MDR software (compiled and benchmarked
on a PC with a 600 MHz Pentium III
running Red Hat 2.2.5-15, written in C and compiled
with the GNU C compiler). MDR software is
currently distributed in a Java version with a
graphical user interface or in a C library. The most
current  open-source  versions   are  available at
www.epistasis.org/mdr.html. MDR has also been
added to Weka-CG, which is available from the same
Web site.  MDR evaluated all possible  single-locus
through four-locus models, and a single final model
was chosen using the heuristic described above. No
level of  interaction was pre-selected for final model
selection.  Final models containing too few  or too
many loci based on  the simulated model did not
contribute to the probability estimate.  This strict
method of final model selection should be consid-
ered in  interpreting the results presented below.
  GENN  analysis was performed using  a  Linux
software package that is available from the authors
on request. The genetic algorithm (GA) used to evolve the binary string
that is transcribed into an NN has the following
parameters in the current implementation: crossover
rate = 0.9,  mutation = 0.01,  population = 200  per
deme,  demes = 10, max  generations = 200,  codon
size = 8, GE wrapping count = 2, min  chromosome
size  (in terms  of codons) = 50, max  chromosome
size = 1,000,  selection = tournament,  and sensible
initialization depth = 10. The island model of paral-
lelization  is  used, where the  best  individual  is
passed to  each  of the other processes after every 25
generations [Cantu-Paz, 2000], to prevent stalling in
local minima. The genome was derived from GAlib
(version 2.4.5), and a typical GA one-point crossover
of linear  chromosomes is used  [Motsinger  et al.,
2006b]. For the  results presented below, final  model
selection was performed based on cross-validation
consistency  as  described  above  and  only models
with the correct number of loci, and only correct loci
included  in the model counted  toward the prob-
ability estimate. Incomplete models (models that did
not include all the simulated disease loci) or models
containing the simulated disease model plus additional
noise loci were not included in the probability
estimate. Again, this strict performance assessment
should  be kept  in  mind when interpreting the
results.
  RF  analysis  was   performed  using  the  freely
available  R  package  randomForest   [Ihaka  and
Gentleman,   1996;  R Development  Core  Team,
2006]. This package is based on the original Fortran
code available  in Breiman and Cutler  [2004]. For
each of the data sets, forests comprising 10,000 trees
were  grown. Variable importance was calculated
using the  out-of-bag permutation test. The relative
importance (rank) of variables was determined from
the mean decrease in Gini index using the out-of-bag
permutation testing procedure [Breiman et al., 1984].
Again, a very strict performance assessment is used
in the results presented below. The simulated locus/
loci must  be the top ranking locus/loci in the RF
analysis to contribute to the probability estimate. RF
is often used as a filter  approach, where a pre-
specified  number of the top loci will  be used in a
second stage of association analysis. In this study, RF
is being used for association testing and not as a
filter. This use should be kept in mind in interpreting
the results.
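
The strict RF criterion can be expressed as a short check: the simulated loci must be exactly the top-ranked variables. The function name and the importance scores below are hypothetical; ties in importance would need an explicit rule that this sketch does not address.

```python
def simulated_loci_rank_first(importance, true_loci):
    """True if the simulated loci are exactly the top-ranked variables."""
    k = len(true_loci)
    ranked = sorted(importance, key=importance.get, reverse=True)
    return set(ranked[:k]) == set(true_loci)


# Hypothetical importance scores keyed by SNP index.
importance = {0: 0.4, 1: 12.5, 2: 0.9, 3: 8.1, 4: 1.3}
print(simulated_loci_rank_first(importance, (1, 3)))
print(simulated_loci_rank_first(importance, (1, 4)))
```

Under a filter-style use of RF, by contrast, a simulated locus would count as retained whenever it fell anywhere within a pre-specified number of top loci.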
  FITF software is freely available at
http://hydra.usc.edu/fitf and was originally written for
Windows. Because of the  computational burden of
running large-scale simulation studies  in Windows,
source code was requested and  kindly shared by Dr.
W. James Gauderman. The configuration parameters
used in the current study are as recommended in the
software instructions, and include CSS cutoff 2 = 3,
CSS cutoff 3 = 6, α1 = 0.016667, α2 = 0.016667,
α3 = 0.016667.
  Both stepLR  and eLR analyses were performed
using SAS v9.1 commercial software.  For the  eLR
analysis, because the known simulated model  was




-------
explicitly modeled, the probability results presented
are estimated as the number of times the eLR model
was statistically significant at P<0.05 across the 100
replicates.

                  RESULTS

  The  type  I  error results  demonstrate that each
method had nominal type I error rates. The type I
error for each method was  as  follows: MDR—3%,
FITF—3%, GENN—3%, eLR-^%, and sLR—2%.
  Table I  summarizes  the  posterior  probability
results for all six methods for the lower heritability
models (1%). As these results show, these simulations
test the lower limits of all six
methods, as the genetic effects in these data are
extremely small. The low posterior probability of
even eLR for  the two- and three-locus interaction
models demonstrates the difficulty of statistically
modeling such small  effects. The other methods
must  also  perform  variable selection along  with
TABLE I. Posterior probability results from 1% heritability models
Values are the posterior probability to detect the simulated model (%).

Number of
disease loci   OR    MAF   GENN   MDR   FITF    RF   stepLR   eLR
1              1.2   0.2      6     1      0     4       42    48
1              1.2   0.4      6     4      0     7       52    52
1              1.4   0.2     36    30      0    37       68    84
1              1.4   0.4     40    29      0    34       69    76
1              1.6   0.2     69    67     21    73       95   100
1              1.6   0.4     84    74      4    86       93    96
1              1.8   0.2     80    85     56    88       88   100
1              1.8   0.4     90    86     34    96       97   100
1              2.0   0.2     99    94     91    98       85   100
1              2.0   0.4     99    93     49    99       88   100
2              1.2   0.2      0     0      0     0        0    24
2              1.2   0.4      0     0      0     0        0    44
2              1.4   0.2      0     2      0     0        0    10
2              1.4   0.4      4     3      0     0        0    33
2              1.6   0.2     10     8     12     0        0    96
2              1.6   0.4      4     8      1     0        0    98
2              1.8   0.2      5    33     35     0        0   100
2              1.8   0.4     24    20      1     1        0    81
2              2.0   0.2     30    46     37     0        0    98
2              2.0   0.4     14    22      0     0        0    96
3              1.2   0.2      0     0      0     0        0     2
3              1.2   0.4      0     0      0     0        0     6
3              1.4   0.2      0     0      0     0        0     8
3              1.4   0.4      0     0      0     0        0     6
3              1.6   0.2      0     0      0     0        0     0
3              1.6   0.4      0     0      0     0        0     3
3              1.8   0.2      0     4      2     0        0    15
3              1.8   0.4      0     2      0     0        0    17
3              2.0   0.2      0     0      0     0        0     0
3              2.0   0.4      0     0      0     0        0     3

OR, odds ratio; MAF, minor allele frequency.
           statistical  modeling,  so  lower  performance  as
           compared to eLR is not surprising. For the lowest
           effect  single-locus models  (OR  1.2-1.6),  stepLR
           outperforms  the  other  four  computational  ap-
           proaches and has comparable performance to eLR.
           For  the higher  effect  single-locus models (OR
           1.8-2.0), all  the  methods perform very well.  In
           particular, GENN and RF have the highest perfor-
           mance. The performance of  those  two methods is
           comparable to that of eLR.
             For the lowest effect size two-locus models, all five
           computational  approaches  have  extremely low
           posterior probabilities. For the higher  effect size
           models, GENN, MDR, and FITF show an improve-
           ment as compared  to stepLR and RF. The low
           performance  of all methods is not surprising given
           the very small effect sizes and  the added challenge
           of identifying interacting  variables as  opposed to
           main effects.  The posterior probabilities to detect the
           three-locus models with such small effect sizes are
           near or at zero for all of the methods. Even the
           performance  of eLR is extremely low.
             Table II gives the results of the analyses  of the
           higher  heritability genetic models. For  the single-
           locus models,  all methods have  reasonably high
           performance. Notably, stepLR has the lowest poster-
           ior probability  of the five computational methods.
           RF has the  highest  probability  (100%)  of  the
           computational  methods  (tied with  eLR) to  detect
           all single-locus models. GENN has the next highest
           performance, and the performance of MDR is
           comparable. FITF has a much lower probability than
           GENN, MDR, or  RF for  the very weakest models
           (OR 1.2).
             For the two-locus models, the posterior probability
           of correctly detecting the interaction is minimal for
           both stepLR and RF (approaching or at zero). This is
           not surprising, since both are dependent on at least a
           small marginal  main effect  to detect interactions,
           and these models are purely epistatic. GENN, MDR,
           and FITF all  have a reasonably high probability of
           detecting the two-locus interactions. MDR has the
           highest performance across most models, and FITF
           has the lowest of these three methods. For the three-
           locus models, the only method with a  reasonable
           posterior  probability to  detect the interaction  is
           MDR.  Notably, the performance of MDR is much
           higher than even that of eLR.
TABLE II. Posterior probability results from 5% heritability models:
posterior probability to detect simulated model (%)

Number of
disease
loci  OR   MAF    GENN   MDR  FITF    RF  stepLR   eLR
1     2    0.2      95    94    84   100      88   100
1     2    0.4      99    93    48   100      88   100
1     2.5  0.2      99    98    98   100      93   100
1     2.5  0.4      99    98    94   100      82   100
1     3    0.2      99    96    98   100      95   100
1     3    0.4     [?]   [?]   [?]   100     [?]   100
1     3.5  0.2      99    93    97   100      92   100
1     3.5  0.4     100    99    98   100      91   100
1     4    0.2     100    95    97   100      92   100
1     4    0.4     100    98    99   100      87   100
2     2    0.2      10    60     0    52       0    99
2     2    0.4      62    82    10     2       0    99
2     2.5  0.2      71    93    68    26       0   100
2     2.5  0.4      82    93    72     1       0   100
2     3    0.2      92    95    82     6       0   100
2     3    0.4      95   100    96     4       0   100
2     3.5  0.2      95    93    90     9       0   100
2     3.5  0.4      97    96    85     1       0   100
2     4    0.2      95    98    96     5       0   100
2     4    0.4      73    83    48     1       0   100
3     2    0.2       0    14     7     0       0    18
3     2    0.4       0    15     1     0       0    21
3     2.5  0.2       2    71    38     0       0    19
3     2.5  0.4       0    45    22     0       0    21
3     3    0.2       6    51     1     0       0    16
3     3    0.4       1    80     0     0       0     3
3     3.5  0.2       0    62     1     0       0    30
3     3.5  0.4       0    95     0     0       0    13
3     4    0.2       0    14     0     0       0    17
3     4    0.4       0    93     3     0       0     2

  In summary, as expected, the relative performance of each method is
context dependent. Figures 4-6 graphically summarize these results.
These figures present box plots of the distributions of posterior
probability results (presented in Tables I and II) over a range of MAF
and OR. Overall, MDR had consistently high performance in main effect,
two-locus, and three-locus models. GENN had high performance in main
effect and two-locus models, but had poor posterior probabilities to
detect three-locus interactions. FITF had relatively high performance
in detecting main effects and two-locus interactions, but consistently
had lower performance than MDR and GENN. RF had extremely high
posterior probabilities to detect main effects, but was limited in the
case of purely epistatic models. StepLR had high performance in
detecting main effects, but was also unable to detect interactions in
the absence of main effects.

Fig. 4. Posterior probabilities to detect single-locus models. [Box
plots by method (GENN, MDR, FITF, RF, stepLR, eLR); panels: 1%
heritability and 5% heritability.]

Fig. 5. Posterior probabilities to detect two-locus epistatic models.
[Box plots by method (GENN, MDR, FITF, RF, stepLR, eLR); panels: 1%
heritability and 5% heritability.]

Fig. 6. Posterior probabilities to detect three-locus epistatic models.
[Box plots by method (GENN, MDR, FITF, RF, stepLR, eLR); panels: 1%
heritability and 5% heritability.]

                         DISCUSSION

           The results of this study demonstrate the utility of
         the methods investigated and emphasize the im-
         portance of continued methods  development. For
         the smallest effect sizes simulated, none of the
         methods had adequate posterior probabilities to detect
         interactive effects. Also, these simulations did not
         attempt to model more complex scenarios, such as genetic
         heterogeneity or phenocopy. The performance of
         each  of  these  methods should be  evaluated  in
         increasingly complex models.
           The success of MDR in all the models simulated is
         not surprising, given its many previous successes in
         both real and simulated data [Agirbasli et al., 2006;
         Cho et al., 2004; Hsieh et al., 2006; Motsinger and
Ritchie,  2006b;  Ritchie  et  al., 2001, 2003a]. The
success of GENN  for  main  effect and  two-locus
models  was  also  expected,  given  its  previous
successes in real and  simulated  data  [Motsinger
et al., 2006a,b,c; Motsinger-Reif et al., 2008a], but its
poor performance with the  three-locus models was
surprising. This  highlights a  disadvantage  of  an
evolutionary computation  approach  in  exploring
purely epistatic models—it  is much less likely that
three loci will be  stochastically assembled into a
model to evaluate than two loci. The  exhaustive
search approach  of MDR avoids  this risk, but is
highly computationally intensive. In relatively small
data sets like those simulated  here, the computa-
tional burden is not a limitation, but as genotyping
technology pushes the field toward genome-wide
association as the norm, this approach will become
infeasible without  decreasing the  number of vari-
ables using a filtering technique. The evolutionary
computation approach will  be much more realistic
for   such  data   sets;  however, detecting  purely
epistatic models  will  be a daunting task.  It was
encouraging that the GENN results in the single- and
two-locus models were comparable to those of MDR, given
the computational advantage of GENN. Additionally, there
is evidence that the non-exhaustive search strategy
of GENN may have  other  advantages over  MDR,
especially in the case of genetic heterogeneity [Mot-
singer-Reif et al.,  2008a].
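
The computational burden of an exhaustive search can be made concrete by counting the candidate models. The sketch below is illustrative only: the panel sizes are hypothetical and are not the variable counts used in this study, and the helper name is an assumption.

```python
from math import comb


def n_candidate_models(n_variables: int, order: int) -> int:
    """Number of distinct variable combinations of a given interaction order."""
    return comb(n_variables, order)


# Hypothetical panel sizes, from a candidate-gene scale to a genome-wide scale.
for n_variables in (100, 10_000, 500_000):
    pairs = n_candidate_models(n_variables, 2)
    triples = n_candidate_models(n_variables, 3)
    print(f"{n_variables:>7} variables: {pairs:.3e} pairs, {triples:.3e} triples")
```

The count of three-way combinations grows roughly with the cube of the number of variables, which is why exhaustive evaluation becomes impractical at genome-wide scale and why a stochastic search is less likely to assemble any particular three-locus model.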
  Both  GENN and MDR outperformed  the FITF
approach  in  epistatic  models. The increase  in
probabilities for these two methods may be due to
the many corrections for multiple testing used in
FITF. Because GENN and MDR both  utilize permu-
tation distributions for significance testing, correc-
tion for multiple testing is  unnecessary. While the
filter stage of FITF  does reduce the number of tests
performed with the ITF strategy, there are still a very
large number of tests that  are corrected  for. This
correction  may limit the performance of FITF in
comparison to GENN and MDR.
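
The reason a permutation distribution can stand in for an explicit multiple-testing correction is sketched below. This is a generic, simplified illustration and not the GENN or MDR implementation: the test statistic, the toy genotypes, and all function names are assumptions. The key point is that the entire search is repeated on each permuted data set, so the null distribution is that of the best model found.

```python
import random
from typing import Callable, Sequence


def permutation_p_value(
    best_statistic: Callable[[Sequence[int]], float],
    labels: Sequence[int],
    n_permutations: int = 1000,
    seed: int = 0,
) -> float:
    """Empirical p-value for the best statistic found by a search.

    best_statistic must rerun the whole search on the labels it is given,
    so each permuted value already reflects the number of models examined.
    """
    rng = random.Random(seed)
    observed = best_statistic(labels)
    shuffled = list(labels)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any genotype-phenotype association
        if best_statistic(shuffled) >= observed:
            at_least_as_extreme += 1
    # The add-one form keeps the estimate strictly positive.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


def mean_difference(genotypes: Sequence[int], labels: Sequence[int]) -> float:
    """Absolute case-control difference in mean genotype (toy statistic)."""
    cases = [g for g, y in zip(genotypes, labels) if y == 1]
    controls = [g for g, y in zip(genotypes, labels) if y == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


# Hypothetical toy data: 10 cases, 10 controls, two SNPs coded 0/1/2.
labels = [1] * 10 + [0] * 10
snps = [
    [2] * 10 + [0] * 10,
    [0, 1, 2, 1, 0, 1, 2, 1, 0, 1, 1, 0, 1, 2, 1, 0, 1, 2, 1, 0],
]


def best_over_snps(y: Sequence[int]) -> float:
    """The 'search': take the largest statistic over all SNPs."""
    return max(mean_difference(s, y) for s in snps)


p_value = permutation_p_value(best_over_snps, labels)
print(p_value)
```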
  Both  RF and  stepLR had very high  posterior
probabilities  to detect main effects, but were both
unable to detect purely epistatic models. There is a
solid theoretical explanation for these results, since
both  require  marginal  main effects  to perform
variable selection tasks. Future extensions/modifi-
cations  of these  approaches  should consider this
limitation and modify the variable  selection process
to capture pure interactions. Some groups have in
fact begun to make  modifications  in  this  way
[Bureau et al., 2005].
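
Why a method that screens on marginal effects cannot find a purely epistatic model can be checked numerically. The penetrance table below is a hypothetical textbook-style example, not one of the models simulated in this study, and it assumes two unlinked loci in Hardy-Weinberg equilibrium with a minor allele frequency of 0.5.

```python
# Hypothetical two-locus penetrance table: probability of disease given the
# number of minor alleles at locus A (rows) and at locus B (columns).
PENETRANCE = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]


def genotype_frequencies(maf: float) -> list:
    """Hardy-Weinberg frequencies for 0, 1, and 2 copies of the minor allele."""
    q = maf
    p = 1.0 - q
    return [p * p, 2.0 * p * q, q * q]


def marginal_penetrances(penetrance: list, maf_other: float) -> list:
    """Penetrance of each genotype at locus A, averaged over locus B."""
    freq_b = genotype_frequencies(maf_other)
    return [sum(row[b] * freq_b[b] for b in range(3)) for row in penetrance]


marginals = marginal_penetrances(PENETRANCE, maf_other=0.5)
print(marginals)
```

Under these assumptions every genotype at locus A carries the same marginal disease risk, so a single-locus test sees no effect even though the joint table is strongly structured. Because the table is symmetric, the same holds for locus B.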
  The results of the current study were presented as
posterior probabilities to detect the simulated model,
as an empirical estimate of the performance of each
method. While these results allow  a comparison of
the posterior probabilities between methods, they do
not reflect the differences in  the prior probabilities of
each method. The prior probabilities of each method
           differ based on the model building process of each.
           For example, in the case of stepwise LR, the prior
           probability of selecting any multi-locus interaction
           model under the null is very close to zero. This is
           due  to  the hierarchical  model building  process,
           where only variables  with significant main effects
           are even considered in the evaluation of potential
           interactions.  MDR,  on  the other   hand,  has  a
           theoretically higher prior  probability of  detecting
           multi-locus interactions because the exhaustive
           search process ensures that higher-order models
           will be evaluated.
             Bayes factor comparisons [Goodman, 1999] are a
           commonly used tool to combine prior and posterior
           information in a ratio that provides evidence in favor
           of one  model  specification versus  another.  Such
           comparisons  would be useful  in interpreting the
           results  of the current study given  the  important
           differences in the prior probabilities of each method,
           but unfortunately a formal Bayes Factor comparison
           is not possible in  the current  study. Calculations
           result in effectively undefined  Bayes Factors  (data
           not  shown)  due  to  the  extremely small  prior
           probabilities under the null of each of the methods.
           The generally very low false-positive rates (discussed
           previously) of each  of the methods reflect the very
           low prior probability of  detecting specific genetic
           models in null data.
             While formalization of these differences is not
           possible, it is important to remember these differ-
           ences when choosing an  analysis strategy.  If it is
           hypothesized that the underlying genetic  etiology
           includes purely epistatic models, stepLR would be a
           theoretically inappropriate choice. MDR or GENN
           may be better selections. If it is hypothesized that the
           underlying genetic  etiology includes interactions
           with  strong main effects,  a number of analytical
           methods may be appropriate. If it is hypothesized
           that main effects result in the phenotype of interest,
           the list of appropriate methods  extends further.
             While the list of computational methods compared
           in the current study is  not exhaustive, this study
           compares  and  contrasts  several  important  novel
           analytical approaches in  genetic  epidemiology.  By
           thoroughly understanding the strengths and  weak-
           nesses  of each method,  researchers can  choose
           appropriate analytical tools for  their particular data
           and questions.
                    ACKNOWLEDGMENTS

             We wish to thank Dr. W. James Gauderman for providing
           code for the FITF algorithm. These results do not
           reflect the official policy of the US Environmental
           Protection Agency.
                   REFERENCES
Agirbasli  D,  Agirbasli M, Williams SM,  Phillips  III JA. 2006.
   Interaction among 5,10 methylenetetrahydrofolate reductase,
   plasminogen activator inhibitor and endothelial nitric oxide
   synthase gene polymorphisms  predicts the severity of cor-
   onary  artery disease  in Turkish patients. Coron Artery Dis
   17:413-417.
Bellman R. 1961. Adaptive Control Processes. Princeton: Princeton
   University Press.
Benjamini Y,  Hochberg Y. 1995.  Controlling the false discovery
   rate: a practical and powerful approach to multiple testing. J R
   Stat Soc Ser B (Methodological) 57:289-300.
Brassat D, Motsinger AA, Caillier SJ, Erlich HA, Walker K, Steiner
   LL, Cree  BA, Barcellos LF, Pericak-Vance MA, Schmidt  S,
   Gregory S, Hauser SL, Haines JL, Oksenberg JR, Ritchie MD.
   2006. Multifactor dimensionality reduction reveals gene-gene
   interactions associated with multiple sclerosis susceptibility in
   African Americans. Genes Immun 7:310-315.
Breiman L. 2001. Random forests. Mach Learn 45:5-32.
Breiman L, Cutler A. 2004. Random forests.
   http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm
Breiman L, Friedman JH, Olshen RA, Stone CJ. 1984. Classification
   and Regression Trees.  New York: Chapman & Hall.
Bureau A, Dupuis J, Falls K,  Lunetta KL, Hayward B, Keith TP,
   Van EP. 2005. Identifying SNPs predictive of phenotype using
   random forests. Genet Epidemiol 28:171-182.
Cantu-Paz  E.  2000.  Efficient and  Accurate Parallel  Genetic
   Algorithms. Boston: Kluwer Academic Publishers.
Cho YM, Ritchie MD, Moore  JH, Park JY, Lee KU, Shin HD, Lee
   HK,  Park  KS.  2004. Multifactor-dimensionality  reduction
   shows  a two-locus interaction associated with Type 2 diabetes
   mellitus. Diabetologia 47:549-554.
Culverhouse R, Suarez BK, Lin J, Reich T. 2002. A perspective on
   epistasis: limits of models displaying no main effect.  Am J
   Hum Genet 70:461-471.
Goodman S.  1999. Toward evidence-based medical statistics. 2:
   The Bayes factor. Ann Intern Med 130:1005-1013.
Hahn LW, Ritchie MD, Moore JH.  2003. Multifactor dimension-
   ality reduction software for detecting  gene-gene and  gene-
   environment interactions. Bioinformatics 19:376-382.
Hardison  NE, Fanelli TJ, Reif DM, Ritchie MD, Motsinger-Reif
   AA. 2008.  Balanced accuracy makes grammatical evolution
   neural  networks robust to  class imbalance. Proceedings of the
   Genetic and Evolutionary Algorithm  Conference.  2008,  in
   press.
Hastie T,  Tibshirani R,  Friedman  JH. 2001. The  Elements Of
   Statistical Learning. Springer Series in Statistics. Basel: Springer.
Hosmer DW, Lemeshow S. 2000. Applied Logistic Regression.
   New York: John Wiley & Sons Inc.
Hsieh CH, Liang KH, Hung YJ, Huang LC, Pei D, Liao YT, Kuo
   SW, Bey MS, Chen JL, Chen EY. 2006. Analysis of epistasis for
   diabetic nephropathy  among type 2 diabetic patients. Hum
   Mol Genet 15:2701-2708.
Ihaka R, Gentleman R. 1996. R: a language for data analysis and
   graphics. J Comput Graphical Stat 5:299-314.
Kooperberg C,  Ruczinski I, LeBlanc ML, Hsu L. 2001. Sequence
   analysis using logic regression. Genet Epidemiol 21:S626-S631.
Lunetta KL, Hayward LB, Segal J, Van EP.  2004. Screening large-
   scale association  study data: exploiting  interactions  using
   random forests. BMC  Genet 5:32.
Mamyrova G, O'hanlon TP, Monroe JB, Carrick DM, Malley JD,
   Adams S, Reed AM, Shamim EA, James-Newton L, Miller FW,
   Rider LG.  2006. Immunogenetic  risk and protective factors for
   juvenile  dermatomyositis in  Caucasians.  Arthritis  Rheum
   54:3979-3987.
McKinney BA, Reif DM, Ritchie MD, Moore JH. 2006. Machine
   learning  for detecting gene-gene interactions: a review. Appl
   Bioinformatics 5:77-88.
Millstein J, Conti DV, Gilliland FD, Gauderman WJ. 2006. A testing
   framework for identifying susceptibility genes in the presence
   of epistasis. Am J Hum Genet 78:15-27.
Moore JH. 2003. The ubiquitous nature of epistasis in determining
   susceptibility to common human diseases. Hum Hered 56:73-82.
Moore JH, Ritchie MD. 2004. STUDENTJAMA.  The challenges of
   whole-genome approaches to  common diseases. J Am Med
   Assoc 291:1642-1643.
Moore JH, Williams SM. 2005. Traversing the conceptual divide
   between  biological  and statistical epistasis:  systems  biology
   and a more modern synthesis. Bioessays 27:637-646.
Moore J, Hahn L, Ritchie M, Thornton T, White BC. 2004. Routine
   discovery of high-order epistasis models for computational
   studies in human genetics. Appl Soft Comput 4:79-86.
Motsinger AA, Ritchie MD.  2006a.  Multifactor dimensionality
   reduction: an  analysis strategy for modeling and detecting
   gene-gene interactions in  human genetics and pharmacoge-
   nomics studies. Hum Genomics 2:318-328.
Motsinger AA, Ritchie MD. 2006b. The effect of reduction in cross-
   validation intervals on the performance of multifactor dimen-
   sionality  reduction. Genet Epidemiol 30:546-555.
Motsinger AA,  Dudek  SM, Hahn LW, Ritchie  MD.  2006a.
   Comparison of Neural Network Optimization  Approaches
   for Studies of  Human Genetics. Lecture Notes in Computer
   Science, vol. 3907. Berlin: Springer, p 103-114.
Motsinger AA, Hahn LW, Dudek SM, Ryckman KK,  Ritchie MD.
   2006b. Alternative cross-over strategies and  selection techni-
   ques for  grammatical evolution optimized  neural  networks.
   Proceedings of  the Genetic  and  Evolutionary  Algorithm
   Conference, vol. 1, p. 947-949.
Motsinger AA, Reif DM, Dudek SM, Ritchie MD. 2006c. Under-
   standing  the evolutionary process of  grammatical evolution
   neural networks for feature selection in genetic epidemiology.
   IEEE Trans Comput Intell Bioinformatics Comput Biol 1-8.
Motsinger AA, Brassat D, Caillier SJ, Erlich HA, Walker K, Steiner
   LL, Barcellos LF, Pericak-Vance MA, Schmidt S, Gregory S,
   Hauser  SL, Haines JL,  Oksenberg JR, Ritchie MD.  2007.
   Complex gene-gene interactions in multiple  sclerosis: a multi-
   factorial  approach  reveals associations with inflammatory
   genes. Neurogenetics 8:11-20.
Motsinger-Reif AA, Dudek SM, Hahn LW, Ritchie  MD. 2008a.
   Comparison of approaches for machine-learning optimization
   of neural networks  for detecting gene-gene interactions in
   genetic epidemiology. Genet  Epidemiol Feb  8 [Epub ahead of
   print].
Motsinger-Reif AA, Fanelli  TJ, Davis  AC, Ritchie  MD.  2008b.
   Grammatical evolution neural  networks is robust to common
   types of noise common to genetic epidemiology studies. BMC
   Res Notes, in press.
Nelson MR,  Kardia SL, Ferrell RE, Sing CF. 2001. A combinatorial
   partitioning method to identify multilocus genotypic partitions
   that   predict  quantitative  trait  variation. Genome   Res
   11:458-470.
O'Neill M, Ryan C. 2001. Grammatical evolution. IEEE Trans Evol
   Comput 5:349-357.
O'Neill M, Ryan C. 2003. Grammatical Evolution: Evolutionary
   Automatic Programming in  an Arbitrary Language.  Boston:
   Kluwer Academic Publishers.



Province MA, Shannon WD, Rao DC. 2001. Classification methods
   for confronting heterogeneity. Adv Genet 42:273-286.
R Development Core Team. 2006. R: a language and environment
   for statistical computing. Vienna, Austria.
   http://www.R-project.org
Ritchie MD. 2005.  Bioinformatics approaches for detecting gene-
   gene and gene-environment interactions in studies of human
   disease. Neurosurg Focus 19:E2.
Ritchie MD, Hahn LW, Roodi N, Bailey LR, Dupont WD, Parl FF,
   Moore JH. 2001. Multifactor-dimensionality reduction reveals
   high-order interactions among estrogen-metabolism genes in
   sporadic breast cancer. Am J Hum Genet 69:138-147.
Ritchie  MD,  Hahn  LW,  Moore JH.  2003a. Power  of multi-
   factor dimensionality reduction for detecting gene-gene inter-
   actions  in the  presence of genotyping error, missing data,
   phenocopy,  and  genetic  heterogeneity.  Genet  Epidemiol
   24:150-157.
Ritchie MD, White BC,  Parker JS, Hahn LW, Moore JH. 2003b.
   Optimization of neural network architecture using genetic
   programming improves detection and modeling of gene-gene
   interactions in studies of human diseases. BMC Bioinformatics
   4:28.
              Ritchie  MD,  Carillo  MW,  Wilke  RA.  2005.  Computational
                 approaches  for  pharmacogenomics. Pac Symp  Biocomput
                 245-247.
              Robnik-Sikonja M.  2004.  Improving  random forests.  Machine
                 learning: Ecml 2004. Proceedings 3201:359-370.
              Sabbagh A, Darlu P. 2006. Data-mining methods as useful tools for
                 predicting individual drug response: application to CYP2D6
                 data. Hum Hered 62:119-134.
              Tahri-Daizadeh N, Tregouet DA, Nicaud V, Manuel N, Cambien F,
                 Tiret L. 2003. Automated  detection of informative combined
                 effects in genetic association studies of complex traits. Genome
                 Res 13:1952-1960.
              Templeton A. 2000. Epistasis and Complex Traits. Oxford: Oxford
                 University Press, p 41-57.
              Thornton-Wells TA, Moore JH, Haines JL. 2004. Genetics, statistics
                 and human disease: analytical retooling for complexity. Trends
                 Genet 20:640-647.
              Velez D, White BC,  Motsinger AA, Bush WS, Ritchie MD, Moore
                 JH. 2007. A balanced accuracy function for epistasis modeling
                 in imbalanced  datasets  using  multifactor  dimensionality
                 reduction. Genetic Epidemiology 31:306-315.
              Zhu J,  Hastie  T. 2004. Classification of  gene  microarrays by
                 penalized logistic regression. Biostatistics 5:427-443.
-------
© 2008 Wiley-Liss, Inc.
         Birth Defects Research (Part B) 83:522-529 (2008)
                                            Review Article


A Lifestage Approach to Assessing Children's  Exposure


                  Elaine A. Cohen Hubal,1* Jacqueline Moya,2 and Sherry G. Selevan3
    1U.S. Environmental Protection Agency, Office of Research and Development, National Center for Computational Toxicology,
                                           Research Triangle Park, NC
    2U.S. Environmental Protection Agency, Office of Research and Development, National Center for Environmental Assessment,
                                                Washington, DC
    3Consultant, Silver Spring MD, formerly at the U.S. Environmental Protection Agency, Office of Research and Development,
                            National Center for Environmental Assessment, Washington, DC
Understanding and characterizing risks to children has been the focus of considerable research efforts at the U.S.
Environmental Protection  Agency  (EPA). Potential health  risks  resulting from environmental  exposures  before
conception and during pre- and postnatal development  are often difficult to  recognize and assess because of  a
potential time lag between the relevant periods of exposure during development and associated outcomes that may be
expressed at later lifestages. Recognizing this challenge, a lifestage approach for assessing exposure and risk is presented
in the recent EPA report titled A Framework for Assessing Health Risks of Environmental Exposures to Children (U.S. EPA,
2006). This EPA report emphasizes the need to account for the potential exposures to environmental agents during all
stages of development, and consideration  of the relevant adverse health outcomes that may occur as a result of such
exposures. It identifies lifestage-specific issues associated with exposure characterization for regulatory risk assessment,
summarizes the lifestage-specific approach to exposure characterization presented in  the Framework, and discusses
emerging research needs for exposure characterization  in the larger public-health context. This lifestage approach for
characterizing children's exposures to environmental contaminants ensures a more complete evaluation of the potential
for vulnerability and exposure of sensitive populations throughout the life cycle. Birth  Defects Res (Part B) 83:522-529,
2008.    © 2008 Wiley-Liss, Inc.
                INTRODUCTION

  Potential  health risks  resulting from  environmental
exposures  before  conception  and  during  pre- and
postnatal development are often difficult to recognize
and assess  due to a potential time lag between the
relevant timing of exposure and of outcomes that may be
expressed at any lifestage including those far removed
from that of exposure.  Lifestages are  defined here as
temporal stages (or intervals) of life that have distinct
anatomical,  physiological, behavioral and/or functional
characteristics that contribute to potential differences in
environmental exposures (U.S.  EPA, 2006). The consid-
eration  of lifestage-specific periods of unique suscept-
ibility in relation to childhood activities, behaviors, and
intakes was  recognized in the International Life Sciences
Institute (ILSI)  2001  workshop  (Olin  and  Sonawane,
2002) and has been the focus of significant efforts in the
EPA Office of Research and Development.
  The need for a lifestage approach to exposure and risk
assessment  is  highlighted in  the recent  EPA report
entitled  A  Framework  for  Assessing  Health  Risks  of
Environmental Exposures to Children (hereinafter referred
to as the "Framework")(U.S. EPA,  2006). This report
followed a series of policy statements  (U.S. EPA, 1995;
Executive Order, 1997) and regulatory statutes such as
the Food Quality Protection Act and the Safe Drinking
Water Act (U.S. 104th Congress, 1996a and 1996b)
explicitly requiring consideration of risks to children in
risk assessments conducted by the Agency. The Frame-
work emphasizes the need  to  account  for  potential
exposures to environmental agents during all  stages of
development, and consideration of the relevant adverse
health outcomes  that  may occur as a result of  such
exposures (Brown et al., this issue). A report developed
in parallel by the World Health  Organization,  Principles
for  Evaluating Health Risks in  Children Associated  with
Exposure to Chemicals (WHO,  2006), also focuses  on the
potential vulnerability of children to chemical exposures
and the potential for increased risk of adverse effects
from early-life exposure.
  Further, as understanding of how to assess and mitigate health risks
resulting from exposures to individual environmental pollutants
improves, environmental health scientists are turning their attention
toward characterizing relationships between multiple environmental
factors and complex diseases such as asthma and obesity (Schwartz,
2006; U.S. EPA, 2003a). Because preconception, prenatal, infant, and
childhood exposures may impact both early lifestage health outcomes and
development of disease later in life, a lifestage approach is required
to identify and assess these exposures. Carefully designed exposure
studies and multifactorial exposure analyses of one form or another are
required in order to conduct national-scale regulatory-based risk
assessments, conduct community-based risk screening and remediation,
support epidemiology studies investigating gene-environment
interactions, and characterize exposure and risk for public health
tracking.

*Correspondence to: Elaine Cohen Hubal, U.S. Environmental Protection
Agency, National Center for Computational Toxicology, Mail Drop B205-01,
Research Triangle Park, NC 27711.
E-mail: hubal.elaine@epa.gov
Received 8 July 2008; Accepted 15 September 2008
Published online in Wiley InterScience (www.interscience.wiley.com)
DOI: 10.1002/bdrb.20173

  In this paper, lifestage-specific issues associated with
exposure characterization for regulatory risk assessment
are identified; the lifestage-specific approach to exposure
characterization that  is  presented in the Framework is
summarized; and emerging research needs for exposure
characterization in the larger public-health context are
discussed.


  ROLE OF  LIFESTAGE-SPECIFIC EXPOSURE
                   ASSESSMENT

  Exposure characterization is the risk analysis step in
which human  interaction with an environmental agent of
concern is evaluated (WHO, 2004). Exposure (sometimes
referred to as  potential dose) is defined as the contact of
an individual or population with  an agent of concern
(WHO,  2004). Exposure assessment is  defined as the
process of estimating the  magnitude,  frequency, and
duration of an exposure, along with characteristics of the
exposed  individual or population. Ideally, this process
will  provide  descriptions  of  the  sources,  pathways,
routes, and uncertainties associated with exposures of
interest  (WHO,  2004).  Sometimes  exposure  can  be
measured directly, but more often exposure must  be
estimated. Armstrong et al. (2000) note that the EPA
Guidelines for Exposure Assessment (U.S. EPA, 1992)
provide the approaches needed to assess exposures,
including those of children, but that, because of the
potentially unique exposure patterns that may result
during development, EPA guidance needs to be
supplemented by further commentary on the application
of exposure assessment approaches to children. In
addition, the Framework
(U.S. EPA, 2006) notes that children may be more or less
vulnerable than  adults, but without data on exposure
and response and without systematic evaluation of these
data, determining which lifestage may be most vulner-
able is challenging. The Framework provides information
on application of traditional exposure characterization
approaches  over the  course of development as  well as
references to current lifestage-specific data for exposure
factors and exposure  scenarios.
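
As an illustration of how magnitude, frequency, and duration combine with lifestage-specific exposure factors, the sketch below uses a commonly applied screening-level form of the average daily dose. The equation form, the function name, and every numerical value are assumptions chosen for illustration; they are not taken from the Framework or from this paper.

```python
import math


def average_daily_dose(
    concentration_mg_per_l: float,
    intake_rate_l_per_day: float,
    exposure_frequency_days_per_year: float,
    exposure_duration_years: float,
    body_weight_kg: float,
    averaging_time_days: float,
) -> float:
    """Average daily dose in mg per kg body weight per day."""
    total_intake_mg = (
        concentration_mg_per_l
        * intake_rate_l_per_day
        * exposure_frequency_days_per_year
        * exposure_duration_years
    )
    return total_intake_mg / (body_weight_kg * averaging_time_days)


# Hypothetical drinking-water scenario for a young child: 1 mg/L in water,
# 1 L/day intake, 350 days/year for 6 years, 15 kg body weight, and an
# averaging time equal to the exposure duration.
child_dose = average_daily_dose(1.0, 1.0, 350.0, 6.0, 15.0, 6.0 * 365.0)

# Same hypothetical water concentration for an adult: 2 L/day and 70 kg.
adult_dose = average_daily_dose(1.0, 2.0, 350.0, 6.0, 70.0, 6.0 * 365.0)

print(child_dose, adult_dose)
```

With these illustrative inputs the child's dose per unit body weight exceeds the adult's even though the adult drinks more water, which is the kind of lifestage difference that exposure factors are meant to capture.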
  An important objective of the Framework is to provide
a holistic approach for assessing lifestage-specific risks.
As such, significant emphasis is placed on the need  for
iteration  between exposure, dose-response, and hazard
analysis steps  of the risk assessment process, as depicted
in Figure  1.  The amount of agent that  enters  an
individual after crossing  an  exposure surface  (e.g.,
surface of respiratory tract, gastrointestinal  tract, skin,
placental barrier) is referred to as the dose (WHO, 2004).
Clearly, not all exposures will result in a significant dose
(e.g., contaminated hands may be washed before dermal
absorption or oral transfer can occur). Yet, it is the dose of
the toxic moiety  at the target tissue that will ultimately
cause health effects. The approach outlined in the
Framework encourages evaluation of the potential for
adverse health outcomes during all developmental lifestages,
based on knowledge of exposure, critical windows
of development for organ systems, MOAs,
anatomy, physiology, and behavior that can affect
external exposure and internal dose metrics. The primary
purpose of a lifestage-specific exposure characterization
is a detailed description of the potential for exposure
during preconception or developmental lifestages that is
relevant for assessing potential health risks at these and
subsequent lifestages.

[Figure 1: flow diagram. Box labels, top to bottom: Lifestage-Specific Problem Formulation; Lifestage-Specific Analysis; Lifestage-Specific Exposure Characterization (Evaluation of Available Exposure Data; Lifestage-Specific Exposure Analysis; Variability, Sensitivity, and Uncertainty; Iteration with Hazard and Dose-Response Characterization; Lifestage-Specific Exposure Characterization); Lifestage-Specific Risk Characterization; Risk Communication/Management.]

   Fig. 1.  Flow diagram for lifestage-specific exposure characterization.
   Using the lifestage-specific exposure information
   identified in problem formulation, exposure is estimated using a
   tiered approach. The lifestage-specific exposure is characterized
   by discussing the variability and uncertainty in the results. Key
   sources of variability and uncertainty can be assessed using
   sensitivity analysis. Iteration with hazard characterization and
   dose-response characterization (illustrated by dashed arrows)
   occurs throughout this process to ensure that critical windows of
   exposure are considered.

         EXPOSURE CHARACTERIZATION IN
                      FRAMEWORK
                   Overview of Process
      Exposure  characterization  (Fig.  1)  begins  in  the
   problem formulation phase with identification of poten-
   tial  sources, pathways, and scenarios.  Pathways and
   media that may be relevant for the various lifestages are
   depicted in Figure 2. For example, depending on the life
   stage, exposure media can include amniotic fluid, human
   milk, air, water, soil, and food. Problem formulation sets
   the stage to guide collection of available exposure data
and other required information for exposure
characterization. Potentially significant exposure scenarios
are identified and evaluated to conduct a lifestage-specific
exposure analysis. Variability, sensitivity, and
uncertainty analyses are conducted to determine the impact
of the available exposure data on the resulting analysis.
The results of the exposure characterization are iterated
with the hazard and dose-response analyses (Makris
et al., 2008). The purpose of this interaction is to identify
any critical developmental windows of susceptibility
(i.e., development periods with potential for increased
likelihood of an adverse effect) that were not considered
in the initial exposure characterization, or to identify
important exposure periods (i.e., periods of potentially
higher exposure) that were not considered in the hazard/
dose-response assessment. Finally, the exposure characterization
is summarized in a narrative that includes
discussion of the confidence in the analysis results based
on available data. This information feeds into the
comprehensive lifestage-specific risk characterization
(see Brown et al., 2008).

Birth Defects Research (Part B) 83:522-529, 2008
COHEN HUBAL ET AL.

[Figure 2: three panels, one per lifestage. Diagram labels that survive extraction include Air, Water, Diet, Other, and Mother. Panel text:]

Prenatal: All exposures to the fetus occur transplacentally or via physical
factors. The mother's exposure to environmental media can be a significant
source of exposure to environmental media for the fetus.

Infant/Young Child: Exposures for the infant and young child can occur through
all environmental media. When breastfed, the mother's exposure to environmental
media can be an additional source of exposure to the infant.

Older Child/Adolescent: Exposures to the child and adolescent can occur through
all environmental media. The mother's exposure is no longer a factor for the child.

Fig. 2. Exposure routes during developmental lifestages. These three illustrations show the different routes of exposure by
lifestage for children. The solid lines in the figure represent relevant exposure, whereas dotted lines represent exposures that are not
relevant to the specific lifestage. During gestation, the majority of exposures (except for physical factors) occur transplacentally through
exposure to the mother. After birth, exposures may occur directly to the child, with an additional route from the mother for those
agents that may be present in human milk.

                                Conceptual Model

                  Problem formulation  results  in development of  a
               conceptual model.  Here, the assessor identifies  poten-
               tially  relevant lifestage-specific  exposure scenarios by
               considering the  "windows"  of early lifestage exposure
               that could lead to toxicological outcomes in children or at
               later  lifestages.  Several exposure considerations  for
               development of  the conceptual model are discussed in
               the Framework (U.S. EPA, 2006).  These include perform-
               ing a  preliminary examination of the  data to determine
               the  lifestages likely to be exposed given the chemical
               properties, fate and transport, uses of the environmental
               agent(s),  and the defined scope  of the assessment.  For
               example,  lipophilic  chemicals  such  as dioxins  and
               polychlorinated  biphenyls  (PCBs)  can  accumulate  in
               human milk (Solomon and Weiss, 2002); lead, which is
               strongly adsorbed to soil, can be ingested by children
               during  hand-to-mouth   behavior  (Landrigan  et  al.,
               2002).  The  conceptual  model  involves  a  qualitative
                      LIFESTAGE APPROACH TO ASSESSING CHILDRENS EXPOSURE
characterization of the potential sources, pathways  of
exposures  (including   exposure  media  and  routes),
exposure  scenarios (lifestages, time  frames, locations,
and activities), and pattern of exposures (magnitude and
duration) to parents or children, as appropriate.
  An important issue to consider during development of
the conceptual model is the potential for lifestage-specific
vulnerabilities resulting from intrinsic susceptibility  as
well as differences in exposure patterns. This includes a
qualitative understanding of lifestage-specific activity
patterns to identify potentially highly exposed lifestages
(Tulve et al., 2002; Cohen Hubal et al., 2000). For example,
a small pilot study of children in a daycare setting found
differences in potential dermal exposure between infants
(6-12 months) and preschool-age children (2-3 years)
(Cohen Hubal et al., 2006). Pesticide loadings on bodysuits and
levels measured using handwipes were higher for infants
than for the preschool age children. In addition, exposure
assessment case  studies presented by Firestone et  al.
(2007) demonstrate the need  to consider lifestage-specific
differences when  estimating exposures and identifying
data gaps. For example,  in  the  case study evaluating
dietary intake of pesticides,  significant statistical  differ-
ences were found, based on  50th percentiles  for children
aged 1 to < 3 years compared to each of children 1 to < 2
years and children 2 to <3 years. Although the magnitude
of these differences for these age bins is small (<0.02 µg/kg/day),
these may be significant when aggregate
exposure over all pathways is considered and when these
exposures  are  considered  in the context  of  intrinsic
susceptibility at a particular developmental  window.  In
addition, Firestone et al. (2007) discuss how limited data
for particular age groups may impact exposure estimates.
As such, consideration  of potential differences prior  to
beginning the assessment will facilitate interpretation and
transparent communication of results  as well  as design of
studies required to develop additional exposure data.
  The Framework calls for application of EPA's Guidance
on Selecting  Age  Groups for Monitoring  and  Assessing
Childhood   Exposures   to  Environmental  Contaminants
(U.S. EPA, 2005)  as a starting point for identifying and
selecting   potentially   important    age  bins   for
exposure analysis (Firestone et al., 2007). This guidance
includes   expert   analysis   of    existing   generic
exposure  data  and provides a detailed  discussion  of
how  these age groups were developed and  how  to
implement them in an assessment. In brief, the recom-
mended age groups are based  on the current under-
standing of differences in behavior and physiology that
may impact exposures to children. (Windows of intrinsic
susceptibility were also considered in a general  sense.
For more detailed discussion of lifestage-specific sus-
ceptibility,  see  Makris et  al., 2008.)  The standard age
groups presented in the EPA Guidance include: birth to
<1 month; 1 to <3 months; 3 to <6 months; 6 to  <12
months; 1 to  < 2 years; 2 to < 3 years; 3 to < 6 years; 6 to
<11 years; 11  to  <16 years;  and  16  to  <21  years
(Firestone  et al.,  2007).  Importantly, the  Framework
extends the  scope of required exposure age bins  to
facilitate characterization of exposure during preconcep-
tion  and  prenatal lifestages. The EPA's Child-Specific
Exposure Factors Handbook is another tool referenced  in
the Framework that can be used to identify  age-specific
behaviors that may result  in higher exposure  levels
(U.S. EPA, 2002).
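
The standard age groups listed above lend themselves to a simple lookup when exposure data have to be assigned to bins. The following is a minimal sketch, not part of the EPA Guidance or the Framework: the function name `age_bin`, the use of months as the unit, and the decision to reject ages of 21 years or more are assumptions made for illustration. Preconception and prenatal lifestages, which the Framework adds, are not covered here.

```python
from bisect import bisect_right

# Lower bound (in months) and label for each standard age group quoted in
# the text. The month conversion is an illustrative assumption.
AGE_BINS = [
    (0, "birth to <1 month"),
    (1, "1 to <3 months"),
    (3, "3 to <6 months"),
    (6, "6 to <12 months"),
    (12, "1 to <2 years"),
    (24, "2 to <3 years"),
    (36, "3 to <6 years"),
    (72, "6 to <11 years"),
    (132, "11 to <16 years"),
    (192, "16 to <21 years"),
]
UPPER_LIMIT_MONTHS = 252  # 21 years; ages at or above this are out of range


def age_bin(age_months: float) -> str:
    """Return the label of the age group that contains age_months."""
    if not 0 <= age_months < UPPER_LIMIT_MONTHS:
        raise ValueError("age must be from birth up to (not including) 21 years")
    lower_bounds = [lower for lower, _ in AGE_BINS]
    # bisect_right finds the first lower bound greater than the age;
    # the bin just before it is the one containing the age.
    index = bisect_right(lower_bounds, age_months) - 1
    return AGE_BINS[index][1]


print(age_bin(8))   # 6 to <12 months
print(age_bin(30))  # 2 to <3 years
```

Keeping the bins in one table makes it straightforward to aggregate or split groups when the available data do not support the full resolution.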

  Traditionally, the  conceptual  model  has considered
human exposure  in  the context of the source-to-effects
paradigm (U.S. EPA, 2003a); however,  the Framework
advocates  for  a  person-oriented approach.  In  this
approach, the individual or population is  the center of
the exposure model  and the assessor searches from the
point of contact to identify a source(s) (Price et al., 2003).
In  this way, potentially  important time periods  of
exposure, exposure  pathways,  and vulnerable indivi-
duals  or  populations  can be  identified. A person-
oriented exposure approach is necessary to address the
potential  time lag  between  the  relevant periods  of
exposure during development and associated  outcomes
that may be expressed at later lifestages.


Exposure Measurement  and Estimation Approach
  Four major types of information  are used to character-
ize exposure:  questionnaire-based  metrics,  surrogate
exposure  metrics, personal exposure  measures,  and
biomonitoring  data.  Questionnaire-based   metrics  are
often the basis for exposure classification in epidemiolo-
gical studies. These data include information on activity
patterns, diet, and product use,  and they are useful for
indicating specific behaviors or consumption patterns
that may  put  the  individual in  contact  with  the
environmental chemical. Questionnaires used  to collect
children's exposure data are usually administered by a
trained  technician   using   a  proxy  (e.g., parent  or
caregiver) or filled out by the caregiver  (e.g., in diaries).
Observational studies  using techniques  such  as video-
taping have also proven useful for collecting activity data
(Zartarian et al.,  1998). Surrogate  exposure metrics  are
typically ambient measurements of air, water, or food
(i.e., market basket)  collected at community or regional
locations.  Locations where  these data are collected may
vary depending on the life stage.  For example, infants
and toddlers are more likely to be in contact with rugs,
floors, lawns, and chemicals that tend  to  sink to low
levels (WHO, 2006).  The environment of adolescents is
more varied  and  may  include the home, school, social,
and occupational settings (WHO,  2006). These data are
often useful for  characterizing background  exposure
levels. Personal  exposure  data  are collected at  the
individual level.  Personal  air monitors and  duplicate
diet samples are examples of this type of direct or point-
of-contact information. In  the past, personal  exposure
monitoring has been difficult to implement with young
children due to the size of monitors and the level of input
required from participants. Emerging technologies in
small-scale sensors and wireless transmission hold
promise for collecting needed personal exposure
data across all developmental lifestages. Finally, biomonitoring
data include measurement of chemical, metabolite,
or molecular markers in biological fluids/tissues
(e.g., blood,  urine, human  milk,  cord  blood, amniotic
fluid). Exposure biomonitoring data are used extensively
in epidemiology and in public health surveillance and
have the advantage of measuring total exposure from all
routes and  sources. In general,  observational studies
designed specifically to characterize important sources
and pathways of exposure will collect several, or even all
four, types of exposure information.
  In the Framework,  three approaches for calculating
exposures are discussed: point-of-contact, scenario
evaluation, and dose reconstruction (U.S. EPA, 1992). To
conduct the lifestage-specific exposure characterization,
a  calculation  approach  is  selected  on  the  basis of
available data and the risk assessment  questions  that
were defined during the problem formulation phase.
  The  point-of-contact, or  direct, approach requires
measurements of chemical concentrations at the point
where  exposure occurs (at the interface between the
person and the environment) and records on length of
contact with each chemical. This approach, often used in
occupational settings,  does not require additional in-
formation on an individual's characteristics or behaviors
because exposure is measured directly.
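
As a rough illustration of the point-of-contact calculation, measured concentrations can be combined with the recorded contact times to give a time-integrated exposure and a time-weighted average concentration. This is a sketch under stated assumptions: the record layout, function names, and numerical values are hypothetical and are not taken from the Framework.

```python
from typing import Iterable, Tuple

# Each record pairs a concentration measured at the point of contact with
# the length of contact in hours. Units are illustrative.
ContactRecord = Tuple[float, float]


def integrated_exposure(records: Iterable[ContactRecord]) -> float:
    """Time-integrated exposure: sum of concentration x contact duration."""
    return sum(conc * hours for conc, hours in records)


def time_weighted_average(records: Iterable[ContactRecord]) -> float:
    """Average concentration weighted by the time spent at each level."""
    records = list(records)
    total_hours = sum(hours for _, hours in records)
    if total_hours <= 0:
        raise ValueError("total contact time must be positive")
    return integrated_exposure(records) / total_hours


# Hypothetical personal-monitor readings: (concentration, hours of contact).
readings = [(2.0, 3.0), (0.5, 5.0)]
print(integrated_exposure(readings))    # 8.5
print(time_weighted_average(readings))  # 1.0625
```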
  Using the dose-reconstruction approach, estimates of
exposure are developed from population-level biomoni-
toring data. Urine,  blood, nail, saliva, hair, and feces are
commonly used to  measure biomarkers of exposure
(WHO,  2006).  Maternal biomarkers  include  amniotic
fluid and  human milk (WHO, 2006). Other media used
to assess prenatal  and postnatal exposure include first
teeth, meconium, and cord blood  (WHO, 2006). At the
present time,  most biomarker data  are  difficult to
interpret,  either because the presence of a biomarker
may not be unique (e.g., many stressors may result in a
change  in  the  same biomarker)  or there may  not be
adequate  exposure pathway information  to  link the
biomarker to the exposure. As such, careful study design
is required to collect biomonitoring information for use
in characterizing exposure for risk assessment (Albertini
et al., 2006). Currently, this assessment approach is most
successful  for  characterizing  exposure to  persistent
compounds. Using biomonitoring data to characterize
intermittent exposures to  rapidly  metabolized  and
cleared compounds requires a more complicated study
design incorporating repeated measures and information
on potential sources. Additional lifestage-specific meth-
odological considerations arise and need to be addressed
in design of biomonitoring studies (Barr et al., 2005).
  The scenario evaluation approach, sometimes referred
to as the  indirect  approach, utilizes  data on  chemical
concentration,  frequency and duration of exposure, as
well as information on the  exposed lifestage. In  this
approach, models are required to link the different data
to estimate individual or population-level exposures or
distributions of exposures. Currently, modeled estimates
(i.e., using the scenario evaluation approach) are often
used to conduct risk  assessments necessary  to make
regulatory  decisions. As such,  the scenario evaluation
approach is the focus of the Framework and is discussed
in more detail below.
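
A scenario evaluation typically reduces to an equation that links concentration, contact rate, frequency, duration, and characteristics of the exposed lifestage. The sketch below uses a commonly cited form of the average daily dose equation, ADD = (C x IR x EF x ED) / (BW x AT); this form is assumed here for illustration and is not quoted from the Framework. The class name, field names, and input values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    concentration: float       # agent concentration in the medium (e.g., mg/kg)
    intake_rate: float         # amount of medium contacted per day (e.g., kg/day)
    exposure_frequency: float  # days of exposure per year
    exposure_duration: float   # years of exposure
    body_weight: float         # kg, specific to the lifestage
    averaging_time: float      # days over which the dose is averaged


def average_daily_dose(s: Scenario) -> float:
    """Average daily dose in (concentration units x intake units) per kg per day."""
    if s.body_weight <= 0 or s.averaging_time <= 0:
        raise ValueError("body weight and averaging time must be positive")
    total_intake = (s.concentration * s.intake_rate
                    * s.exposure_frequency * s.exposure_duration)
    return total_intake / (s.body_weight * s.averaging_time)


# Hypothetical toddler scenario; every value is a placeholder.
toddler = Scenario(concentration=10.0, intake_rate=2.0,
                   exposure_frequency=365.0, exposure_duration=1.0,
                   body_weight=20.0, averaging_time=365.0)
print(average_daily_dose(toddler))  # 1.0
```

Because body weight and intake rate change quickly during development, the same scenario evaluated with lifestage-specific inputs can give quite different doses.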
         Exposure Data and Information
  The  objectives and scope  of the risk assessment,
defined in  the problem formulation  phase, provide the
focus for identifying all of the relevant human exposure
data and other necessary information. To focus on risk
from exposure to children, data are required on sources
and  exposure  media concentrations that  have been
identified in the locations where children spend time.
These may  change by developmental  stage. For example,
sources may be identified in: (1) residence and workplace
for pregnant and lactating women; (2) residence, daycare
and  outdoor play  areas for infants and toddlers; (3)
residence, school, and locations of after-school activities
               for school-age children; and (4) residence, school, and
               locations of after-school activities and workplace for
               adolescents.
                 For a given source, exposure  media (e.g., water, soil/
               dust/sediments, food, and objects/surfaces) and expo-
               sure routes (i.e., inhalation, ingestion, dermal absorption,
               and indirect ingestion)  define the pathway of exposure
               (Cohen Hubal et al., 2000b).
                 Exposure media and routes may change with lifestage.
               Figure 2 highlights the  stages of development and their
               relevant exposure routes. For example, the fetus will be
               exposed to cord blood and amniotic fluid, the infant to
               breast milk, the teething child  to surfaces of toys and
               other  objects (both intended and unintended) through
               mouthing, and the school-age child to contaminants on
               classroom floors.
                 For any given pathway, a set of associated exposure
               scenarios describes how an exposure takes place, and is
               used to estimate distribution of exposure. Available data
               resources are outlined in the Framework. Though many
               of the large exposure studies have focused on the adult
               lifestage, these data are significant and are supplemented
               by smaller and  more  recent studies  specific to  early
               lifestages  (Fenske  et al.,  2005;  Gilliland  et  al.,  2005;
               Morgan et al., 2006).


                     Analysis Level or Tiered Assessment
                 The  Framework  proposes  a  tiered approach  for
               exposure characterization with  extensive  feedback at
               each tier  to the hazard identification/dose-response
               analyses. The tiered approach is useful for the efficient
               allocation of resources. Typically, an exposure character-
               ization will begin with a screening-level assessment and
               then, if there appears to be significant exposure or an
               unacceptable level of uncertainty, a second, more refined
               level of analysis will be conducted. The major difference
               among the levels of assessment reflects  the different
               assumptions that are used (i.e., conservative assumptions
               versus more refined assumptions). An important point
               raised in the Framework is that probabilistic techniques
               may be required at either level of analysis depending on
               the types of scenarios that are being evaluated. This is a
               departure   from tiered  approaches  that  have  been
               proposed previously (e.g., Reiss et al., 2003). Because a
               lifestage assessment requires  screening and potential
               evaluation of the full range of age bins as well as multiple
               sources and pathways,  a probabilistic approach  may be
               the most efficient  and most conservative, even at the
               screening tier.
                 Screening Assessment.  The purpose of a screen-
               ing  tier as described in the Framework is to identify
               potentially  important pathways, scenarios, and  vulner-
               able lifestages as well as to rule out insignificant ones.
               Bounding  values for exposure factors  and conservative
               simplifying assumptions are used at this level of analysis.
               As a result, the output  may  have  a high  level  of
               uncertainty. Typically, screening assessments are used
               when a  quick  exposure  estimate  is  needed  and  to
               prioritize additional work. Thus, screening assessments
               make use of readily available measurement data, models,
               and  conservative  assumptions  to  fill in data  gaps.
               Historically, deterministic calculations were used in most
               screening-level  exposure analyses. However, exposure
               assessments have  become increasingly complex,  and
probabilistic techniques may be useful when, for exam-
ple, exposure parameters have large variability or when
multiple sources exist (U.S. EPA, 2001).
  Should  a potential risk be identified based  on the
bounding assumptions used in this level of analysis, a
more refined  tier of analysis would be necessary to
address uncertainties in  this  screening-level exposure
assessment.
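
The screening logic described above can be sketched as a comparison of bounding estimates against a level of concern, with any age bin that is not ruled out carried forward to the refined tier. The comparison rule, the function name, and all values are assumptions for illustration; the Framework does not prescribe this specific test.

```python
from typing import Dict, List


def bins_needing_refinement(bounding_estimates: Dict[str, float],
                            level_of_concern: float) -> List[str]:
    """Return age bins whose bounding estimate meets or exceeds the level of concern."""
    return [age_bin for age_bin, estimate in bounding_estimates.items()
            if estimate >= level_of_concern]


# Hypothetical bounding estimates (same units as the level of concern).
estimates = {
    "6 to <12 months": 0.8,
    "1 to <2 years": 1.5,
    "3 to <6 years": 0.2,
}
print(bins_needing_refinement(estimates, level_of_concern=1.0))
# ['1 to <2 years']
```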
  Refined Assessment.  This  tier  provides more
detail for  potentially relevant scenarios and potentially
sensitive age groups that may have been identified in the
screening  assessment. The goal is often to  estimate the
distribution of exposure for the relevant lifestages. Based
on results of the sensitivity analysis conducted  for the
screening-level assessment,  significant exposure factors
and  important assumptions are revisited to develop
more realistic estimates of exposure.
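
One simple way to identify which exposure factors to revisit is a one-at-a-time sensitivity analysis, in which each input is perturbed by a fixed fraction and the relative change in the estimate is recorded. The sketch below assumes a simple multiplicative dose model and a 10% perturbation; both are illustrative choices, and the factor names and values are hypothetical.

```python
from typing import Callable, Dict


def dose(factors: Dict[str, float]) -> float:
    """Illustrative dose model: concentration x intake x frequency / body weight."""
    return (factors["conc"] * factors["intake"] * factors["freq"]
            / factors["body_weight"])


def one_at_a_time(model: Callable[[Dict[str, float]], float],
                  base: Dict[str, float],
                  bump: float = 0.10) -> Dict[str, float]:
    """Relative change in model output when each factor is increased by `bump`."""
    base_output = model(base)
    changes = {}
    for name, value in base.items():
        perturbed = dict(base, **{name: value * (1 + bump)})
        changes[name] = (model(perturbed) - base_output) / base_output
    return changes


# Hypothetical baseline factors.
baseline = {"conc": 5.0, "intake": 0.2, "freq": 0.9, "body_weight": 12.0}
for name, change in one_at_a_time(dose, baseline).items():
    print(f"{name}: {change:+.3f}")
```

In this simple model every numerator factor changes the estimate by the same fraction, so ranking factors in practice also depends on how wide each factor's plausible range is for the lifestage in question.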
  This more advanced analysis may include the applica-
tion of sophisticated modeling tools to develop exposure
estimates  for use in  regulatory decisions. Few of these
models are designed currently  to specifically  address
lifestage exposures. As a result, data on the age bins used
in the models and outputs produced by the models may
not address the  specific  age  groups  of interest for a
complete lifestage-specific assessment (U.S. EPA, 2005;
Firestone et al., 2007). The Framework provides sugges-
tions  for  approaches to address data limitations  and
associated uncertainties remaining following this  refined
tier of analysis.
  Supplemental Data Collection.   Based on results
of the refined assessment and the associated sensitivity
and  uncertainty analyses, specific data  needs  may be
identified. The Framework suggests that if the objectives
of the risk assessment indicate that  any  specific un-
certainties  in  the  exposure  characterization should be
addressed, then collection of new data to address these
may be needed.


Variability and Uncertainty in Lifestage Exposure
                    Assessment
  Variability  is defined as the heterogeneity of values
over time, space, or different members of a population.
Variability implies real  differences among  members of
that population (WHO, 2004). The Framework recognizes
the  significant inter-individual variability in early  life-
stages due to rapid physiologic, anatomic, and behavior-
al changes. Even within a relatively narrow age group,
variability may be significant. This variability affects the
determination of upper percentiles of exposure  and its
associated  risk.  That is, given a high-quality/high-
quantity set of data for each age group,  there may still
be significant variability for  a particular exposure factor,
set of factors, or exposure pathway. The  better the data
and the characterization of this variability, the better the
basis  for  final selection of age groups for  a  specific
assessment.
  Uncertainty can be defined as imperfect knowledge
concerning the present  or future state of an organism,
system, or (sub)population under consideration  (WHO,
2004). Uncertainties include  considerations related to (1)
missing, incomplete and/or incorrect knowledge; and (2)
ignorance  and/or lack of awareness  (WHO, 2004).
Currently, many assessments of early life exposures will
have  significant  uncertainty due  to  limited exposure
factor data  for the relevant lifestages. Significant un-
certainty may also result from a lack of information and
understanding  of relationships between early life ex-
posures and potential effects later in life.
  Probabilistic approaches can be  used to identify and
quantitatively characterize  the important  factors  and
associated uncertainty and variability to better assess
early lifestage exposures.  As mentioned previously, the
Framework  recognizes that although probabilistic  tech-
niques  have traditionally been applied  in advanced
levels of exposure assessment, these approaches may be
required at screening levels of analysis for complex
exposures and for consideration of multiple lifestages.
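
A probabilistic estimate of the kind described above can be sketched with a small Monte Carlo simulation: exposure factors are drawn from distributions, a dose is computed for each draw, and percentiles of the resulting distribution are reported. The distributions, parameter values, and function names below are hypothetical placeholders, and the sketch treats only variability; separating uncertainty from variability would need an additional outer loop or a comparable technique.

```python
import random
import statistics
from typing import List


def simulate_doses(n: int, seed: int = 0) -> List[float]:
    """Draw n simulated doses for one age bin from placeholder distributions."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    doses = []
    for _ in range(n):
        conc = rng.lognormvariate(0.0, 0.5)   # concentration in the medium
        intake = rng.uniform(0.05, 0.2)       # amount of medium contacted per day
        body_weight = rng.uniform(8.0, 14.0)  # kg
        doses.append(conc * intake / body_weight)
    return doses


def percentile(values: List[float], p: int) -> float:
    """Return the p-th percentile (1 to 99) of values."""
    if not 1 <= p <= 99:
        raise ValueError("p must be between 1 and 99")
    return statistics.quantiles(values, n=100)[p - 1]


doses = simulate_doses(10_000)
print("median dose:", percentile(doses, 50))
print("95th percentile dose:", percentile(doses, 95))
```

Repeating the simulation for each age bin gives upper-percentile estimates that can be compared across lifestages, which is where the variability discussed above becomes visible.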


                  CONCLUSIONS
         Contributions  of the Framework
  The lifestage approach presented in the Framework for
characterizing children's  exposures to  environmental
contaminants ensures a more  complete evaluation of
the potential for vulnerability and  exposure  of sensitive
populations  during  development. Several  important
issues  highlighted in  the  Framework  will  serve to
improve exposure assessments  for all lifestages.  The
need to coordinate and iterate between exposure  char-
acterization,  hazard identification, and dose-response
assessment is emphasized throughout the  Framework.
Increased coordination between the three analysis  steps
of a risk assessment will ensure that critical windows of
exposure and hazard  are considered.  In addition, the
Framework  promotes incorporating lifestage considera-
tions  into  Agency risk assessments  despite  database
limitations and scientific uncertainties.  Recognizing that
there will always be information and knowledge gaps,
the Framework advocates for transparency in identifying
these  gaps while  providing suggested approaches for
moving forward to characterize exposures with available
data. Finally, the Framework emphasizes the importance
of characterizing variability and identifying significant
uncertainties as an integral part of conducting lifestage-
specific exposure assessment. Availability of the Frame-
work  is sure  to  stimulate  research that will facilitate
better understanding of variability in exposures based on
lifestage, and  that  will  address  some of the  most
important uncertainties.

                  Research Needs
  As risk assessments are conducted to address complex
questions related to potential outcomes across  lifestages
from early life exposures, tools and  approaches  from
other fields will have to be applied. New methodologies
may have  to be  developed to incorporate  into  risk
assessment  the  knowledge about physiological   and
behavioral  differences across  life stages.  This  may
require  identifying analogies in other  disciplines  and
then conducting  research  to  adapt   and   incorporate
relevant expertise,  tools,  and advances in these other
fields.  Relevant fields may include statistics, systems
analysis, ecological risk  assessment,   social  sciences,
genomics, and systems biology. Moving exposure analy-
sis  forward for lifestage-specific risk  assessment high-
lights the importance of developing predictive exposure
metrics,  models, and lifestage-specific  exposure  factors
data. An increased emphasis on characterizing the  links
between exposure, dose, and health outcome is essential
to  ensure  that  critical  windows of  susceptibility  are
addressed. The Framework also highlights the need for
identifying  and  characterizing  vulnerable individuals
and populations. Finally, a broad definition of environ-
ment, environmental  stressors,  and  exposure  may  be
needed  to address these  emerging issues in  lifestage-
specific risk assessment.
  The Framework document and this manuscript focus
on  lifestage-specific  issues  associated with  exposure
characterization for regulatory  risk  assessment. How-
ever, as regulators are called on to address the implica-
tions of environmental policy decisions on public health,
data developed in large-scale epidemiology studies such
as the National Children's Study (Needham et al., 2005;
Ozkaynak et al., 2005) will be important for  informing
these analyses. Improved approaches and guidance for
designing and implementing  a  high-quality  exposure
component   will be  critical  for  success of  studies
investigating  environmental contributions  to complex
disease  and gene-environment interactions.
  A major challenge in assessing early life exposures is
the  limited  availability  of  efficient and  affordable
methods  for  comprehensively  monitoring  exposures
and internal dose. Improved exposure monitoring tools
are critical for identifying vulnerable populations, study-
ing gene-environment interactions, and testing impacts
of  intervention  and   regulation.  Emerging  tools  in
molecular  biology  provide  the potential  to  develop
cellular  and  molecular indicators of exposure that can
be  used to assess  the vulnerability  of humans across
lifestages to environmental stressors. Development of
molecular indicators of exposure combined with devel-
opment of nanotechnology-based sensors will  present
the  opportunity  for  the  simultaneous,  near  real-time
measurement of biologically relevant exposures to multi-
ple real-world  stressors. Limited  research is underway in
this area  (Weis  et  al.,  2005).  However,  a  significant
research effort is required to apply these technologies to
provide data required to characterize lifestage-specific
exposures for improved regulatory risk assessment and
public health policy.


              ACKNOWLEDGMENTS

The authors wish to acknowledge the contributions of
the following current and former EPA scientists, authors
of the EPA's Framework document: Rebecca C.  Brown,
Stanley  Barone Jr.,  Hisham El-Masri, Susan Y.  Euling,
Carole Kimmel, Susan L. Makris, Babasaheb Sonawane,
Tracey Thomas, and Chad M. Thompson.
  Disclaimer: This work was conducted by the U.S. EPA.
It has  been  subjected  to the  Agency's  administrative
review and has been approved for publication.


                     REFERENCES
Albertini R, Bird M, Doerrer N, Needham L, Robison S, Sheldon L, Zenick
    H. 2006. The  use of biomonitoring data in exposure and human
    health risk assessments. Environ Health Perspect 114(11):1755-1762.
Armstrong TW, Hushka LJ, Tell JG, Zaleski RT.  2000. A tiered approach
    for assessing children's exposure. Environ Health Perspect 108:
    469-474.
Barr DB, Wang RY, Needham LL. 2005. Biologic monitoring of exposure
    to environmental chemicals throughout the life stages: requirements
    and issues  for consideration  for the national children's study.
    Environ Health Perspect 113:1083-1091.
                 Brown RC,  Barone Jr. S,  Kimmel CA.  2008. Children's  health risk
                    assessment: incorporating a lifestage  approach into the risk assess-
                    ment process. Birth Defects Res (Part  B) 83:511-521.
                 Cohen Hubal EA, Egeghy P, Leovic K, Akland G. 2006. Measuring
                    potential dermal transfer of a pesticide to children in a daycare
                    center. Environ Health Perspect 114(2):264-269.
                 Cohen Hubal EA, Sheldon LS, Burke JM, McCurdy TR, Berry MR, Rigas
                    ML, Zartarian VG, Freeman NCG. 2000a. Exposure assessment for
                    children: a review of the factors influencing exposure of children,
                    and the data available to characterize and assess that exposure.
                    Environ Health Perspect 108(6):475-486.
                 Cohen Hubal EA, Sheldon LS, Zufall MJ, Burke JM, Thomas KW. 2000b.
                    The  challenge of  assessing children's  residential exposure to
                    pesticides. Journal of Exposure Analysis and Environmental Epide-
                    miology 10:638-649.
                 Executive Order 13045: Protection of children from environmental health
                    risks and safety risks.  2000. Federal  Register  April 21,  1997.
                    62(78):19883-19888. Accessed online September 25, 2007, at http://
                    www.epa.gov/fedrgstr/eo/eo13045.htm
                 Fenske RA, Bradman A, Whyatt RM, Wolff MS, Barr DB.  2005.  Mini-
                    monograph:  lessons learned for the  assessment  of children's
                    pesticide exposure: critical sampling and analytical issues for future
                    studies. Environ Health Perspect 113:1455-1462.
                 Firestone M, Moya J, Cohen-Hubal E, Zartarian V, Xue J. 2007. Identifying
                    age groups for exposure assessments and monitoring. Risk Analysis
                    27(3):701-714.
                 Gilliland F,  Avol E,  Kinney P,  Jerrett  M, Dvonch T,  Lurmann F,
                    Buckley T, Breysse P, Keeler G, de Villiers T, McConnell R. 2005.
                    Mini-monograph: air pollution exposure assessment  for epidemio-
                    logic studies of pregnant women and  children: lessons learned from
                    the Centers  for Children's Environmental  Health and  Disease
                    Prevention Research. Environ Health Perspect 113:1447-1454.
                 Landrigan PJ, Schechter CB, Lipton JM, Fahs MC, Schwartz J.  2002.
                    Environmental pollutants and disease in American  children: esti-
                    mates of morbidity, mortality, and costs for lead poisoning, asthma,
                    cancer, and  developmental  disabilities. Environ  Health Perspect
                    110(7):721-728.
                 Makris SL, Thompson CM, Euling SY, Selevan SG, Barone Jr. S, Sonawane
                    BR. 2008. A lifestage-specific approach to hazard and dose-response
                    characterization for children's health  risk assessment. Birth Defects
                    Res (Part B) 83:530-546.
                 Morgan MK, Sheldon LS, Croghan CW, Chuang JC, Lordo  RA, Wilson
                    NK, Lyu C, Brinkman M, Morse N, Chou YL, Hamilton C, Finegold
                    JK, Hand K, Gordon SM. 2006. A pilot study of children's total
                    exposure to persistent  pesticides and other  persistent  organic
                    pollutants (CTEPP).  Volume I: Final  Report. Contract Number 68-
                    D-99-011, U.S. Environmental Protection Agency, Office of Research
                    and  Development, Research Triangle Park, NC.  Accessed online
                    November   8,  2007,   at  http://www.epa.gov/heasd/ctepp/
                    ctepp_report.pdf
                 Needham LL, Ozkaynak H, Whyatt RM,  Barr DB, Wang  RY, Naeher L,
                    Akland G, Bahadori T, Bradman A, Fortmann R, Liu L-J, Morandi M,
                    O'Rourke MK, Thomas K, Quackenboss J, Ryan  PB, Zartarian V.
                    2005.  Exposure  assessment  in  the national children's  study:
                    introduction. Environ Health Perspect 113:1076-1082.
                 Olin SS,  Sonawane BR.  2003. Workshop  to develop a framework for
                    assessing risks to children from exposure to environmental agents.
                    Environ Health Perspect 111(12):1524-1526. Accessed online Novem-
                    ber  11,  2007, at http://ehp.niehs.nih.gov/members/2003/6183/
                    6183.html
                 Ozkaynak H, Whyatt RM, Needham LL, Akland G, Quackenboss J. 2005.
                    Exposure assessment implications for the design and implementa-
                    tion  of  the National Children's  Study. Environ  Health Perspect
                    113:1108-1115.
                 Price PS, Chaisson CF,  Koontz M, Wilkes C, Ryan B,  Macintosh D,
                    Georgopoulos P. 2003. Construction  of a comprehensive chemical
                    exposure framework using person oriented modeling.  Developed for
                    The Exposure Technical Implementation Panel, American Chemistry
                    Council, Contract 1388.  Accessed online  September 24, 2007, at
                    http://www.thelifelinegroup.org/lifeline/documents/comprehensive_
                    chemical_exposure_framework.pdf
                 Reiss R, Anderson EL, Lape J. 2003. A framework and case study for
                    exposure assessment in the voluntary  children's chemical evaluation
                    program. Risk Analysis 23(5):1069-1084.
                 Schwartz DA. 2006. The importance of gene-environment  interactions
                    and exposure assessment in understanding human diseases. J Expo
                    Anal Environ Epidem 16(6):474-476.
                 Solomon, Weiss. 2002. Mini-monograph: chemical contaminants in breast
                    milk: time trends and regional variability. Environ Health Perspect
                    110(6):339-347.
                 Tulve NS, Suggs JC, McCurdy T, Cohen Hubal EA. 2002. Frequency of
                    mouthing behavior in young children. J Expo Anal Environ Epidem
                    12:259-264.




-------
U.S. 104th Congress. 1996a. Food Quality Protection Act (FQPA). P.L. 104-
    170. Accessed online September 25, 2007, at http://www.epa.gov/
    pesticides/regulating/laws/fqpa/gpogate.pdf
U.S. 104th Congress. 1996b. Safe  Drinking Water Act  (SDWA)  amend-
    ments. P.L. 104-182. Accessed online September 25, 2007, at http://
    www.epa.gov/safewater/sdwa/text.html
U.S. EPA. 1992. Guidelines for exposure assessment. U.S. Environmental
    Protection Agency, Risk Assessment Forum, Washington, DC, 600Z-
    92/001,  1992.  Accessed online September 24,  2007,  at  http://
    cfpub.epa.gov/ncea/raf/recordisplay.cfm?deid=15263
U.S. EPA. 1995. Policy on Evaluating Health Risks to Children. Office of
    the Administrator, Washington, DC. Accessed online September 25,
    2007, at http://www.epa.gov/OSA/spc/2poleval.htm
U.S.  EPA. 2001.  Risk Assessment Guidance  for Superfund  (RAGS),
    volume III, part A: process for conducting probabilistic risk assessment.
    Office of Solid Waste and Emergency Response. EPA 540-R-02-002.
    Accessed online September 24, 2007, at http://www.epa.gov/
    oswer/riskassessment/rags3a/index.htm
U.S.  EPA. 2002.  Child-specific exposure factors handbook. Office  of
    Research and Development, Washington, DC. EPA/600/P-00/002B.
    Accessed online September 24, 2007, at http://cfpub.epa.gov/ncea/
    cfm/recordisplay.cfm?deid=55145
U.S. EPA. 2003a. Human health research strategy. Office of Research and
    Development, Washington, DC. EPA/600/R-02/050. Accessed online
    September 24, 2007, at http://www.epa.gov/nheerl/humanhealth/
    HHRS_final_web.pdf
U.S. EPA. 2003b. Framework for Cumulative Risk Assessment. (Fig. 1-3).
    U.S.  Environmental  Protection Agency, Office  of  Research  and
    Development,  National Center  for Environmental Assessment,
    Washington  Office,  Washington, DC.  EPA/600/P-02/001F, 2003.
    Accessed online September 24, 2007, at http://cfpub.epa.gov/ncea/
    raf/recordisplay.cfm?deid=54944
U.S.  EPA. 2004. Risk assessment principles  and practices.  Office  of
    Research and Development, Washington, DC. EPA/100/B-04/001.
    Accessed online October 25,  2007,  at http://www.epa.gov/osa/
    pdfs/ratf-final.pdf
U.S.  EPA. 2005. Guidance on selecting age groups for monitoring and
    assessing childhood exposures to environmental contaminants. Risk
    Assessment Forum, Washington, DC. EPA/630/P-03/003F. Accessed
    online on September 24, 2007, at http://cfpub2.epa.gov/ncea/cfm/
    recordisplay.cfm?deid=146583
U.S. EPA. 2006. A framework for assessing health risks of environmental
    exposures to children. National Center for Environmental Assess-
    ment, Office of Research and Development, Washington, DC. EPA/
    600/R-05/093F.  Accessed September 24, 2007, at http://cfpub.epa.
    gov/ncea/cfm/recordisplay.cfm?deid=158363
Weis BK, Balshaw D, Barr JR, Brown D, Ellisman M, Lioy P, et al. 2005.
    Personalized exposure assessment: promising approaches for human
    environmental health  research. Environ  Health  Perspect 113(7):
    840-848.
WHO.  2004. IPCS risk assessment terminology. Harmonization Project
    document no.  1, World Health Organization,  Geneva. Accessed
    online October  25, 2007, at  http://www.who.int/ipcs/methods/
    harmonization/areas/terminology/en/index.html
WHO.   2006.  Principles  for  evaluating  health  risks  in  children
    associated with exposure to chemicals. International Programme
    on  Chemical   Safety  (IPCS)  Environmental  Health   Criteria
    document (EHC-237). World Health Organization. Accessed online
    September  25,  2007,  at  http://whqlibdoc.who.int/publications/
    2006/924157237X_eng.pdf




-------
© 2008 Wiley-Liss, Inc.
               Birth Defects Research (Part A) 82:177-186 (2008)
Fetal Alcohol Syndrome (FAS) in C57BL/6 Mice Detected
     through Proteomics Screening of the Amniotic Fluid


                     Susmita Datta,1* Delano Turner,2 Reetu Singh,3* L. Bruno Ruest,3
                              William M. Pierce Jr.,2 and Thomas B. Knudsen3
              1 Department of Bioinformatics and Biostatistics, School of Public Health and Information Science,
                                  University of Louisville, Louisville, Kentucky 40292
                 2 Biomolecular Mass Spectrometry Laboratory, Department of Pharmacology and Toxicology,
                          School of Medicine, University of Louisville, Louisville, Kentucky 40292
             3 Department of Molecular, Cellular and Craniofacial Biology, Birth Defects Center, School of Dentistry,
                                  University of Louisville, Louisville, Kentucky 40292
                      Received 22 August 2007; Revised 11 December 2007; Accepted 12 December 2007


BACKGROUND: Fetal  Alcohol Syndrome (FAS), a severe consequence of the Fetal Alcohol Spectrum Disor-
ders, is  associated with craniofacial defects, mental  retardation, and stunted  growth. Previous studies in
C57BL/6J and C57BL/6N mice provide evidence that alcohol-induced pathogenesis follows early changes in
gene expression within specific molecular  pathways  in the embryonic headfold. Whereas the former (B6J)
pregnancies carry a high-risk for dysmorphogenesis following maternal exposure to 2.9  g/kg alcohol (two
injections spaced 4.0 h apart on gestation day 8), the latter  (B6N) pregnancies carry a  low-risk for malforma-
tions. The present study used this  murine  model to screen amniotic fluid for  biomarkers that  could poten-
tially discriminate  between FAS-positive and  FAS-negative pregnancies. METHODS: B6J and B6N litters
were treated with alcohol (exposed) or saline  (control) on day 8 of gestation. Amniotic  fluid  aspirated on
day 17 (n = 6 replicate litters per group) was  subjected to trypsin  digestion for analysis by matrix-assisted
laser desorption-time of flight mass spectrometry with the  aid  of denoising  algorithms, statistical testing,
and classification methods.  RESULTS:  We identified several peaks in  the  proteomics  screen that were
reduced consistently and  specifically in exposed B6J  litters. Preliminary characterization  by liquid chroma-
tography tandem mass spectrometry and multidimensional protein identification mapped the reduced peaks
to alpha fetoprotein (AFP). The predictive strength of AFP deficiency as a biomarker  for FAS-positive litters
was confirmed by area under the receiver  operating characteristic curve. CONCLUSIONS: These findings in
genetically susceptible mice support clinical observations  in maternal serum that implicate a decrease in
AFP  levels following prenatal alcohol  damage.  Birth Defects Research  (Part A) 82:177-186, 2008.   © 2008
Wiley-Liss, Inc.

Key words: alcohol; pregnancy;  FAS;  FASD; mouse; C57BL/6J; C57BL/6NCrl; amniotic  fluid; proteomics;
Random Forest; alpha fetoprotein
Presented at the 47th annual meeting of the Teratology Society, June 23-28,
2007, Pittsburgh PA.
Grant sponsor: NIH P20-RR/DE17702 (S. D.).
Grant sponsor: NIH/NIEHS P30ES014443-01A1 (S. D.)
Grant sponsor: NIH R01-AA13205 (T. B. K.).
Grant sponsor: Biomolecular Mass Spectrometry Laboratory; Grant number:
S10-RR11368.
Grant sponsor: State of Kentucky Physical Facilities Trust Fund (W. M. P.).
Grant sponsor: University of Louisville School of Medicine (W. M. P.).
Grant sponsor: University of Louisville Research Foundation (W. M. P.).
Grant sponsor: Heart and Stroke Foundation of Canada (L. B. R.).
L. Bruno Ruest's current address: Dept. of Biomedical Sciences, Baylor Col-
lege of Dentistry/TAMHSC, 3302 Gaston Ave., Dallas TX 75246.
Thomas B. Knudsen's current address: National Center for Computational
Toxicology, US Environmental Protection Agency,  109 T.W. Alexander Dr.
(B205-01), Research Triangle Park NC 27711.
     The authors declare they have no competing financial interests. Although Dr.
     Knudsen's current address is the U.S. Environmental Protection Agency this
     work was conducted and analyzed while he was on the faculty at University
     of Louisville. The content does not reflect the views of the Agency, nor does
     the mention of trade names or commercial products constitute endorsement
     or recommendations for use.
     Correspondence to: Susmita Datta, Dept. of Bioinformatics and Biostatistics,
     School of Public Health and Information Science, University of Louisville,
     Louisville, KY 40292. E-mail: susmita.datta@louisville.edu
     *Reetu Singh's current address: The Hamner Institutes for Health Sciences, 6
     Davis Drive, Research Triangle Park, NC 27709-2137.
     Published online 31 January 2008 in Wiley InterScience (www.interscience.
     wiley.com).
     DOI: 10.1002/bdra.20440
                                     Birth Defects Research (Part A): Clinical and Molecular Teratology 82:177-186 (2008)

-------
178
DATTA ET AL.
                 INTRODUCTION

  The Fetal Alcohol Syndrome (FAS) comprises a subset
of Fetal Alcohol Spectrum Disorders (FASD) and includes
a consistent pattern of physical abnormalities in the face
and eye, with intrauterine growth retardation and neuro-
developmental deficits (Jones  et al.,  1973). Although the
consequences  of prenatal alcohol damage  have  been
well-characterized in children, the  full range of effects in
FAS/FASD remains difficult to diagnose (Bertrand et al.,
2004). Developmental stage at the time of prenatal alco-
hol exposure, and  maternal  genotype  or  maternal-fetal
interactive  effects, coupled with differing patterns  and
amounts of alcohol consumption by the mother account
for  some of the known  variability and uncertainty in
alcohol-induced  end  points  (Streissguth  et  al.,  1996;
Viljoen et al., 2001; Sulik, 2005).
  Early detection of FAS/FASD is highly desired for the
purposes of early intervention, both prenatally in terms
of reducing alcohol consumption through the  remainder
of pregnancy, and postnatally,  in  terms of  initiating
measures that may  improve  the child's performance
(Streissguth et al., 1996). As such, research leading to dis-
covery  and  validation of biomarkers  that  can better
inform healthcare decisions and interventions in alcoholic
pregnancies is needed (Bearer et al., 2005; Goodlett et al.,
2005). Studies have shown the utility of alcohol-derived
fatty acid ethyl esters in the meconium as indicative of
maternal alcohol  consumption during pregnancy (Bearer
et al.,  2005).  Although  this  screening  may  ultimately
prove useful  to  quantify a risky  drinking pattern, the
extent to which fatty acid ethyl esters  address sensitive
stages  of exposure  or the risk  for alcohol-related  birth
defects is not yet clear. Identification of biomarker(s) of
effect could strengthen a prenatal  alcohol screening pro-
gram by linking exposure with fetal changes  (Goodlett
et al., 2005; Chen et al., 2005; Green et al., 2007). The dis-
covery and validation of general and specific biomarkers
for  FAS/FASD  in  well-defined  animal models  can
advance  this effort.
  In the  research described here, we tested the hypothe-
sis that protein complexity of the amniotic fluid (AF) can
change  in  association with the risk for alcohol-related
birth defects. AF is a colorless composite of water, protein,
minerals, electrolytes, hormones, environmental pollutants,
and exfoliated cells. During the normal course of
pregnancy the AF is conditioned by amniocytes that pro-
duce cytokines, lipids, prostaglandins, and growth factors
in response to local and endocrine signals, and by fetal
urination,  lung  secretion,  swallowing,  and  intestinal
absorption  (Cheung and  Brace, 2005).  In  amniocentesis
some AF is aspirated  for diagnostic purposes during the
second trimester, usually at  16-18 weeks  when  the AF
peaks in volume. Biochemical analysis  of AF can reveal
specific developmental disorders; for example, alpha-fetoprotein
(AFP) is elevated in the AF of fetuses with open neural tube
defects (NTDs), including anencephaly and spina bifida (Jones
et al., 2001). In contrast, AFP levels are sometimes abnor-
mally low  in AF  of fetuses with  Down's  syndrome
(trisomy 21) (Yamamoto  et al., 2001).  Some fetal  testis
proteins  in the AF were decreased in males born to alco-
hol users (Westney et al., 1991). Because the human AF
proteome database  comprises  more  than 400 different
gene products (Tsangaris et al., 2005), a  large scale analy-
sis of proteins (proteomics) may reflect  multiple changes
       that  can be systematically linked with alcohol-induced
       birth defects.
         Proteomics-based methods have  been used to study
       brains  from healthy  and chronic alcoholic  individuals
       (Lewohl et al., 2004; Alexander-Kaufman et al., 2006). To
       our knowledge, no such studies have been performed on
       AF of  alcoholic  pregnancies. The present study  used a
       well-defined murine model to screen AF for biomarkers
       that could potentially discriminate between FAS-positive
       and  FAS-negative  pregnancies.   This  model  comprises
       closely related C57BL/6 mouse strains that respond dif-
       ferently to maternal alcohol  exposure on gestation day
       (GD)8. Whereas C57BL/6J litters (B6J) carry a high-risk
       for dysmorphogenesis following maternal  exposure to
       two 2.9 g/kg injections of ethanol spaced 4.0 h
       apart on GD8,  C57BL/6NCrl litters  (B6N) carry a low-
       risk for malformations (Green et al., 2007). AF was har-
       vested on day 17 of gestation to capture the optimal yield
       of AF  (Cheung and Brace, 2005) at  a  stage in gestation
       that  can identify FAS-positive and  FAS-negative litters
       (Green et al.,  2007). We  applied statistical methods to
       proteomics to identify unique signatures in  the MALDI-
       TOF mass spectrum of AF that could be anchored to the
       risk for FAS and to differentiate between alcohol-exposed
       litters in the sensitive strain relative to other groups.


                 MATERIALS AND METHODS
                      Animals and Exposure
         C57BL/6J and C57BL/6NCrl mice 18-19  g in weight
       were  purchased from  The  Jackson Laboratory   (Bar
       Harbor,  ME)   and  from  Charles  River  Laboratories
       (Wilmington, MA), respectively. These lines derived from
       C57BL/6 by strict  inbreeding at The Jackson Laboratory
       (B6J) and the NIH followed by Charles River  (B6N) and
       differ by 1.6%  polymorphism in microsatellite markers
       (Hovland et al., 2000). Mice were housed in static
       microisolator cages with bedding that was absorbent,
       non-nutritive, and nontoxic. The colonies cohabited the same
       animal room and were maintained on a 12 h photoperiod
       (06.00-18.00 h light). Diet was Purina mouse  chow and
       tap water ad libitum. Mice were  acclimated to the  room
       for at least 10 days before breeding. The animal protocol
       for this study was reviewed and  approved by the Institu-
       tional Animal Care  and Use Committee at the  University
       of Louisville. Carbon dioxide asphyxiation  followed by
       cervical dislocation was the method of euthanasia.
         For timed pregnancies we bred experienced males to
       nulliparous females (20-22 g body weight on average) for
       4-6 h starting at 07.30-08.00 h. Detection of a vaginal plug
       at 13.30-14.00 h was regarded as evidence of coitus and
       this day was designated GD0. Dams showing 2-3 g weight
       gain at 09.30 h on GD8 were assumed pregnant (typically
       four to eight somite pair  stage). The experimental design
       had four treatment groups, with six to seven litters per
       group as follows: (I) control B6J; (II) alcohol B6J; (III) control
       B6N; and (IV) alcohol B6N. Treatment followed a standard model:
       22% (v/v) absolute ethanol in isotonic saline, given on GD8 as
       two i.p. injections
       spaced 4 h apart (Webster et al., 1980; Sulik and John-
       ston, 1983; Sulik et al., 1986; Kotch and Sulik, 1992; Green
       et al., 2007). Each injection doses the dam with 2.9  g/kg
       ethanol. Control litters received  vehicle (saline) alone in
       the same manner. All injections were at 0.5 mL per 30 g
       maternal body weight.
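
As a consistency check on the dosing figures above, the following sketch recomputes the ethanol dose per injection from the 22% (v/v) solution and the injection volume of 0.5 mL per 30 g body weight. The ethanol density of 0.789 g/mL (approximately, near 20°C) is an assumption that the text does not state, and the function name is purely illustrative.

```python
# Assumed density of absolute ethanol near 20 degrees C (not given in the text).
ETHANOL_DENSITY_G_PER_ML = 0.789


def ethanol_dose_g_per_kg(fraction_v_v=0.22, inj_ml=0.5, per_body_g=30.0,
                          density=ETHANOL_DENSITY_G_PER_ML):
    """Ethanol dose (g per kg body weight) delivered by one injection."""
    ethanol_ml = fraction_v_v * inj_ml        # mL of ethanol in the injected volume
    ethanol_g = ethanol_ml * density          # convert volume to mass
    return ethanol_g / (per_body_g / 1000.0)  # normalise to kg body weight


dose = ethanol_dose_g_per_kg()
print(f"{dose:.2f} g/kg per injection")
```

Under that density assumption the result rounds to the 2.9 g/kg per injection quoted in the text.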
                   AF Collection
  Pregnant dams were euthanized on GD17.  After hys-
terectomy, the uterus was examined for resorptions. Indi-
vidual intact  amniotic sacs  containing  the  fetus  were
carefully dissected  from the myometrium and decidua
using fine tweezers  and iridectomy scissors. AF was aspi-
rated from  each individual  AF cavity  using a  sterile
microsyringe and placed  in a sterile microcentrifuge tube
on wet ice. Fetuses  were inspected for evidence of gross
malformations, weighed,  and  fixed (after hypothermia) in
neutral-buffered  formalin. Phenotype  data  from  this
study were combined  with similar data from  our previ-
ous study (Green et al., 2007). Because the "mother" was
the unit of exposure in this model, the "litter"  comprised
the unit  of sampling and also was the basis for group
comparison.  Two-way ANOVA  (substrain,  treatment,
interaction) was fitted to explain the variability in the ter-
atological outcomes using GraphPad Prism version 4.02
for  Windows  (GraphPad Software,  San  Diego, CA;
www.graphpad.com). When ANOVA revealed a significant
group-wise effect (p < .05), post hoc analysis was performed
using Bonferroni-corrected multiple comparison
tests, as indicated.

              AF  Sample Preparation
  The AF was inspected for clarity.  Samples that were
cloudy or pink were rejected, as were samples  from dead
fetuses. AF samples were then  pooled for fetuses within
each litter, except any fetuses with NTDs were processed
individually so as not to contaminate the pooled AF with
uninformative samples.  The AF  yield  for  proteomics
analysis was at least 100-150 µL pooled AF for each litter
replicate (n = 6 litter replicates). In total we analyzed
24 pooled AF samples from the four groups that met the
criteria for analysis, with six samples coming from each
group plus a few smaller samples from the grossly
abnormal fetuses. AF samples were centrifuged at 800 × g
for 5 min at 4°C to remove amniocytes. Samples were
stored at -20°C until processing. A 50 µL aliquot of AF
was placed in a clean microtube with 10 µL 6 M urea,
vortexed, and left at room temperature for 20 min. The
denatured samples were reduced with 10 µL of 20 mM
dithiothreitol at 56°C for 45 min and alkylated with
10 µL of 55 mM iodoacetamide at room temperature in
the dark. Ultrapure water (10 µL) was added to dilute
the urea, followed by addition of 10 µL methylated trypsin
(Promega, Madison, WI; catalog #V5113, 100 ng/µL)
in 50 mM ammonium bicarbonate buffer. Samples were
incubated overnight at 37°C. Trypsinized digests were
desalted via C-18 ZipTip (Millipore, Billerica, MA; catalog
#ZTC18S096) by aspirating three to five times.
Digests were washed with 0.1% formic acid and eluted
from the ZipTip with 10 µL 60% acetonitrile in 0.1% formic
acid.

          MALDI-TOF and  Tandem MS
  Aliquots of AF trypsin hydrolysate were spotted to a
MALDI  target plate  using 10  mg/mL  alpha-cyanohy-
droxyl cinnamic acid solution as the matrix. Tryptic pep-
tide fragments were resolved on a Micromass  ToFSpec2E
(Micromass/Waters, Milford,  MA)  mass  spectrometer.
The instrument was  set to reflectron  mode  using a
337 nm nitrogen laser and the  instrument was operated
      in positive ion mode for the m/z range of 500 to 4,000 Da.
      Twelve  spectra were collected automatically using set
      locations on each sample well.  Each spectrum consisted
      of  40  laser firings  (480 total laser firings) averaged to
      improve signal-to-noise ratios. Internal tryptic hydrolysis
      peaks were used to calibrate the instrument to a mass accuracy
      of 75 ppm or less. Spectral data patterns were compared
      against a battery of in-house databases and web-based
      search resources for the pattern matching discussed later.
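
To put the stated mass accuracy in absolute terms, the short sketch below converts a relative tolerance in ppm to Daltons at a few m/z values within the acquired 500 to 4,000 range. The helper name and the sample m/z values are illustrative choices, not part of the original method.

```python
def ppm_to_da(mz, ppm=75.0):
    """Absolute mass tolerance (Da) corresponding to a relative error in ppm."""
    return mz * ppm / 1e6


# Illustrative m/z values spanning the acquisition range described in the text.
for mz in (500.0, 2000.0, 4000.0):
    print(f"m/z {mz:>6.0f}: +/- {ppm_to_da(mz):.4f} Da")
```

A tolerance of 75 ppm therefore corresponds to roughly 0.04 Da at the low end of the range and 0.3 Da at the high end.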

                Preprocessing of Mass Spectra
       We  applied three  preprocessing steps: standardization,
      denoising, and alignment. The first preserved  higher mo-
      lecular mass protein  fragments at  low abundance by
      accounting for the nonuniform baseline and variability in
      maximum  intensity across  the mass  spectra  by  the
      method of Satten et  al. (2004). In this method, each spec-
      trum was standardized using only  information from that
      spectrum. Let x denote mass-to-charge ratio (m/z) and let
      y(x) be the corresponding spectral  intensity. The spectra
      are  standardized by  replacing the  intensity  y(x) with
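      $y^*(x) = \{y(x) - Q_{0.5}(x)\}/\{Q_{0.75}(x) - Q_{0.25}(x)\}$, where $Q_\alpha(x)$
      is a local estimate of the $\alpha$th quantile of spectral intensities
      at m/z ratio $x$ given by
      (equation 1 is not recoverable from the source text)
      with
      $$Q_\alpha^{+}(x_0) = \min\{y : W_h(x_0, y) > \alpha\}, \qquad (2)$$
      where
      $$W_h(x_0, y) = \frac{\sum_{x:\,|x - x_0| \le h} \{1 - (x - x_0)^2/h^2\}\, I\{y(x) \le y\}}{\sum_{x:\,|x - x_0| \le h} \{1 - (x - x_0)^2/h^2\}} \qquad (3)$$
      and, for a set C, I{C} = 1 if C is true and = 0 otherwise.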
      Note that h is a user selectable width defining a neigh-
      borhood  of x0. In other words, we centered the spectra
      using  a (local) estimate of  the median spectral intensity
      and divided by a local estimate of the interquartile range.
      We used interquartile  range  as a  measure  of scale
      because it is  insensitive to outlier peak intensities. The
      function $W_h(x_0, y)$ is the proportion of weights $\{1 - (x - x_0)^2/h^2\}$
      that correspond to intensities y(x) that are less
      than or equal to y. This choice for $Q_\alpha$ is a variant of that
      proposed by Ducharme et al. (1995).
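
The standardization step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the local quantile is taken as the smallest intensity y whose weighted proportion W_h(x0, y) exceeds alpha, with quadratic weights restricted to the neighbourhood |x - x0| < h. The function names and the toy spectrum are hypothetical.

```python
def local_quantile(xs, ys, x0, h, alpha):
    """Smallest intensity y whose weighted proportion W_h(x0, y) exceeds alpha."""
    # Quadratic weights 1 - (x - x0)^2 / h^2, kept only inside |x - x0| < h.
    pairs = [(y, 1.0 - ((x - x0) / h) ** 2)
             for x, y in zip(xs, ys) if abs(x - x0) < h]
    total = sum(w for _, w in pairs)
    running = 0.0
    for y, w in sorted(pairs):
        running += w
        if running / total > alpha:
            return y
    return max(y for y, _ in pairs)  # guard against floating-point round-off


def standardize(xs, ys, h):
    """Centre each intensity by the local median and scale by the local IQR."""
    out = []
    for x0, y0 in zip(xs, ys):
        q25 = local_quantile(xs, ys, x0, h, 0.25)
        q50 = local_quantile(xs, ys, x0, h, 0.50)
        q75 = local_quantile(xs, ys, x0, h, 0.75)
        iqr = q75 - q25
        out.append((y0 - q50) / iqr if iqr > 0 else 0.0)
    return out


# Toy spectrum: low-level background with a single strong peak at index 10.
xs = list(range(20))
ys = [1, 2, 1, 3, 2, 1, 2, 3, 1, 2, 50, 2, 1, 3, 2, 1, 2, 3, 1, 2]
z = standardize(xs, ys, h=5)
```

Because each value is centred and scaled using only quantiles from its own neighbourhood, rescaling or shifting the whole spectrum leaves the standardized values unchanged, which is the property that makes spectra comparable across runs.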
        The standardized spectra (step 1) were next subjected
      to a denoising algorithm to separate noise from potential
      peaks   and for  additional  smoothing  and alignment.
      Although standardized  spectra have a  common scale and
      are fairly homoscedastic, they still contain  a mixture of
      noise  and signal.  Denoising ensures  that  the features
      used for  classification correspond to real m/z peaks and
      increases confidence in  the scientific validity of the classi-
      fication procedure (Sorace  and Zhan,  2003). Again,  we
      selected a method that uses only the information in a sin-
      gle spectrum. Note that the standardized spectral inten-
sity y*(x) can be negative; in fact the median of the y*(x)
values is typically zero. While it may be difficult to sepa-
rate  noise  from signal using those standardized inten-
sities that are positive, the  negative standardized inten-
sities presumably represent pure noise; therefore, we esti-
mated standard error based on  the negative intensities
for a spectrum (Satten et al., 2004). We used those m/z's
for which the standardized intensities were at least three
standard deviations apart from the average standardized
intensity.
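
The thresholding rule in this step can be sketched as below. It is an illustration under stated assumptions rather than the published algorithm: noise is taken to be symmetric about zero, so the noise standard deviation is estimated as the root-mean-square of the negative standardized intensities, and only intensities lying above the mean by at least three such deviations are retained. The function name and the one-sided reading of "apart" are our assumptions.

```python
import math


def denoise(mz, y_std, k=3.0):
    """Keep (m/z, intensity) pairs at least k noise SDs above the mean."""
    negatives = [v for v in y_std if v < 0]
    if not negatives:
        raise ValueError("no negative intensities to estimate noise from")
    # Assumption: noise is symmetric about zero, so the negative half alone
    # gives the noise scale as a root-mean-square about zero.
    sigma = math.sqrt(sum(v * v for v in negatives) / len(negatives))
    mean = sum(y_std) / len(y_std)
    return [(m, v) for m, v in zip(mz, y_std) if v - mean >= k * sigma]
```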
  All  spectra  following  standardization  (step  1)  and
denoising (step 2)  were then aligned by a simple binning
technique.  In  step 3  the "features" across spectra were
binned into common intervals of bandwidth = 0.1 Da.
Maximum  intensity within  an interval was assigned  to
the midpoint m/z of the interval. Following alignment the
features are henceforth defined as "peaks" that were
subjected to statistical analysis.
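
A minimal sketch of the binning step follows, assuming a common grid of 0.1 Da intervals anchored at zero (the text does not state where the grid starts) and a hypothetical function name. Each retained feature is assigned to its interval, the maximum intensity in the interval is kept, and the interval midpoint is used as the aligned m/z.

```python
import math


def bin_peaks(peaks, width=0.1):
    """Map (m/z, intensity) pairs to {interval midpoint: maximum intensity}."""
    binned = {}
    for mz, inten in peaks:
        idx = math.floor(mz / width)            # index of the interval
        mid = round((idx + 0.5) * width, 4)     # midpoint m/z of the interval
        binned[mid] = max(inten, binned.get(mid, float("-inf")))
    return binned
```

Applying the same grid to every spectrum is what makes the features comparable across samples; values that fall exactly on an interval boundary are sensitive to floating-point rounding in this simple form.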

         Statistical Analysis for Proteomics
  A  univariate analysis was  used to find any peaks that
would differentiate between control and exposed samples
in the  sensitive  (B6J)  substrain, between control  and
exposed samples in the insensitive (B6N)  substrain, and
between control groups in both substrains (B6N, B6J). For
each comparison  we  used  ANOVA,  taking the mean
intensities in the comparison groups and forming t-statistics
for each and every peak. Multiple hypothesis correction
(Benjamini and Hochberg, 1995) was applied to
control the  false discovery rate at  a 10% level. We also
pursued groupwise classification (e.g., FAS-positive vs.
FAS-negative)  using   a  Random  Forest  (RF) method
(Breiman, 2001). Considered  one of the best off-the-shelf
classifiers currently  available  (Satten  et  al., 2004), RF
returns a list of  variables (m/z)  that are deemed to be
most useful for classifying  the  group of  samples. For
each variable  (peak) an importance measure is provided
that represents the variable's ability to distinguish control
versus alcohol-treated samples, and therefore could  be a
candidate biomarker.  Due to the  randomness inherent to
the algorithm we ran  RF multiple times to obtain the best
classification rate.
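
The false discovery rate control mentioned above can be illustrated with the standard Benjamini-Hochberg step-up procedure. The sketch below is generic and is not taken from the authors' code; the function name is hypothetical, and the per-peak p-values are assumed to have been computed already.

```python
def benjamini_hochberg(pvalues, q=0.10):
    """Return booleans marking hypotheses rejected at false discovery rate q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    # Step-up rule: find the largest rank whose p-value is under rank * q / m.
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * q / m:
            cutoff_rank = rank
    rejected = [False] * m
    for i in order[:cutoff_rank]:
        rejected[i] = True
    return rejected
```

With q = 0.10, every peak ranked at or below the cutoff is declared significant, including any whose own p-value exceeded its individual threshold on the way up.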

    Provisional Peptide Fragment Identification
  Use of MALDI-TOF in proteomics studies prior to
trypsin digestion yields, for each protein, a limited number
of peaks spread across a large range of m/z ratios.
Provisional  identification of some  proteins is possible
based  on  searchable properties of the peptides, such as
molecular mass. This peptide  mass fingerprinting used
the Aldente search engine (Tuloup et al.,  2002; http://
www.expasy.org/tools/aldente/)  with   the   following
search parameters: molecular mass range taken to be 0 to
150 kDa; fixed modification  of cysteine residues by car-
boxyamidomethylation; variable  oxidation modification
of methionine; no restriction placed on isoelectric point;
and  species selected as Mus  musculus and genus Roden-
tia. Because each peptide may produce a  range of frag-
ments  after trypsinization the complexity of the  mass
spectrum  increases dramatically, resulting in more pep-
tide fragments that are displayed over a tighter range  of
m/z ratios  (e.g., mostly below 3,000 Da).  Thus, peptide
mass fingerprinting was coupled with  liquid chromatog-
raphy  with  tandem  MS (LC-MS/MS)  or multidimen-
       sional  protein identification  technology  (MuDPIT)  to
       derive sequence information on some of the peptides. For
       LC-MS/MS, AF samples were denatured and digested as
       above, desalted on a C18 spin column, and fractionated
       with strong cation exchange resin. Strong cation
       exchange fractions were concentrated to ~1 µL with a
       SpeedVac and diluted with 5% acetonitrile/0.1% formic
       acid to ~7.5 µL. Some fractions were subjected to analysis
       on a Waters CapLC coupled to a Waters Q-TOF API-US
       mass spectrometer. The LC eluate was coupled to a
       nano-LC sprayer and MS/MS  spectra were acquired with
       data-dependent scanning. Only ions with 2+, 3+, or 4+
       charges were selected for MS/MS analysis. MuDPIT analysis
       (Washburn et al., 2001; Wolters et al., 2001) included
       salt pulses given to free peptides from  cation-exchange
       resin,  reversed phase  resin separation,  and MS/MS  of
       eluted peptide fragments. For  both LC-MS/MS  and
       MuDPIT  the  MS/MS spectra were  searched  against
       Swiss-Prot database with Protein-Lynx  4.0.  The  mass
       error allowed  was 25 ppm and a  minimum three consec-
       utive residues were required for a positive match.
         Following  preliminary   characterization an in  silico
       trypsin digestion  was performed  to generate the peptide
       fragment  patterns  of  candidate  proteins. This method
       used   ProteinProspector v4.0.8  (http://prospector.ucsf.
       edu/), with "trypsin"  selected as the enzyme. Other pa-
       rameters selected included: peptide fragment mass range
       was 800-4,000 Da, minimum fragment length of five,
       maximum of two missed cleavages allowed, and cysteine
       modification using carbamidomethylation.
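
The in silico digestion can be sketched as below. This is an illustration only and does not reproduce ProteinProspector: it assumes the common trypsin rule (cleave after K or R unless the next residue is P), standard monoisotopic residue masses, singly protonated ions, and a mass window of 800 to 4,000 Da, which is our reading of the range given above. All function names are hypothetical.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON, CARBAMIDOMETHYL = 18.01056, 1.00728, 57.02146


def mh_plus(peptide):
    """Monoisotopic [M+H]+ with every cysteine carbamidomethylated."""
    residues = sum(MONO[aa] for aa in peptide)
    return residues + CARBAMIDOMETHYL * peptide.count("C") + WATER + PROTON


def cleave(seq):
    """Split after K or R, except when the next residue is P (assumed rule)."""
    pieces, start = [], 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and nxt != "P":
            pieces.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        pieces.append(seq[start:])
    return pieces


def digest(seq, max_missed=2, min_len=5, mass_range=(800.0, 4000.0)):
    """List (peptide, [M+H]+) allowing up to max_missed missed cleavages."""
    pieces = cleave(seq)
    out = []
    for i in range(len(pieces)):
        last = min(i + max_missed, len(pieces) - 1)
        for j in range(i, last + 1):
            pep = "".join(pieces[i:j + 1])
            mass = mh_plus(pep)
            if len(pep) >= min_len and mass_range[0] <= mass <= mass_range[1]:
                out.append((pep, round(mass, 4)))
    return out
```

Under these assumptions, `mh_plus("DETYAPPPFSEDK")` is about 1495.66 and `mh_plus("ADNKEECFQTK")` is about 1369.61, so a predicted fragment list of this kind can be compared directly against binned peak positions.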
                             RESULTS
                       Fetal Characteristics
         Results from teratological evaluation (this study) were
       combined with similar  data from  the  previous study
       (Green et al., 2007)  to gauge the net response between
       B6J and  B6N pregnancies. Results on GD17  are shown
       for mean incidence rate of resorptions, mean fetal weight,
       and mean percentage of viable fetuses with overt malfor-
       mations (Table 1). Mean resorption rates were 9.2% for
       the control and exposed B6N litters. Although resorption
       rates trended higher in exposed B6J litters (29.2% resorp-
       tions) versus control B6J litters  (13.1% resorptions), due
       to highly variable litter effects the differences in resorp-
       tion  rates were not  statistically significant. Mean  fetal
       weight was  reduced in both substrains following alcohol
       exposure  (9-11%  reduction  vs.  controls).   Two-way
       ANOVA (treatment, substrain, interaction)  identified sig-
       nificant treatment-related effects.  Malformations mostly
       involved the eye (coloboma, microphthalmia)  and were
       significant for treatment (p < .001) and treatment × substrain
       interaction (p = .006). Post hoc analysis localized the
       significant effects to  the exposed B6J group.  Malforma-
       tion rates were 26.0% in exposed B6J pregnancies versus
       6.0% in exposed B6N pregnancies.  Consistent  with  the
       previous study (Green et al., 2007), these findings suggest
       that we may define  sensitivity to prenatal  alcohol based
       on increased  risk for malformations  (and resorptions),
       because fetal weight reduction was evident in  either sub-
       strain under the treatment conditions employed here. B6J
       pregnancies carry a high risk for dysmorphogenesis following
       maternal exposure to 2.9 g/kg alcohol, whereas
       B6N pregnancies carry a low risk.

-------
                                         BIOMARKERS OF FAS/FASD
                                                     Table 1
              Fetal Effects of GD8 Alcohol Exposure in B6J and B6N Pregnancies Evaluated on GD17

Substrain        Group         Litters (n)   Resorptions       Fetal weight         Malformations
                                             (% per litter)    (mean g per fetus)   (% per litter)
B6J              Control       22            13.1 ± 3.6        0.725 ± 0.025        0.0 ± 0.0
B6J              Alcohol       16            29.2 ± 6.4        0.647 ± 0.022*       26.0 ± 7.8***
B6N              Control       20            9.2 ± 2.9         0.735 ± 0.021        1.9 ± 1.4
B6N              Alcohol       19            9.2 ± 5.5         0.667 ± 0.027**      6.0 ± 2.8
Two-way ANOVA    Treatment                   0.057             <0.001               <0.001
(p-value)        Substrain                   0.064             0.509                0.052
                 Interaction                 0.239             0.643                0.006

  Mean ± standard error, data compiled from previous (Green et al., 2007) and present studies.
  Bonferroni-corrected t test for control (2 × saline, GD8) versus alcohol (2 × 2.9 g/kg ethanol, GD8) groups within each substrain;
  *p < .05, **p < .01, ***p < .001.
  Comparative Analysis of Aligned Mass Spectra
  AF was aspirated on GD17 and pooled for proteomics
screen. Samples were pooled within a litter avoiding any
dead fetuses, bloody or cloudy AF aspirates, or fetuses
with open  NTDs. AF passing acceptance criteria (n =  6
per  group)  was  trypsinized  for  direct  analysis  by
MALDI-TOF mass spectrometry. Typical mass spectrum
profiles are shown for control and alcohol-exposed B6J
samples for the 500-4,000 m/z region (Fig. 1) and to demonstrate
the signal amplification on raw profiles preprocessed
through the standardization and denoising algorithms
(Fig. 2). Although peak sizes in the MALDI-TOF
profiles are not considered to reflect an accurate quantita-
tive  measure of peptide fragments,  the alignment of six
replicated litters per treatment group  returned a  robust
reduction in several  peaks across the different biological
                                         conditions of the experiment. Three biologically relevant
                                         groupwise  comparisons were considered: control  versus
                                         alcohol-exposed samples in  B6J (high-risk) pregnancies;
                                         control versus alcohol-exposed samples in B6N (low-risk)
                                         pregnancies; and control groups in B6J versus B6N preg-
                                         nancies.  The most important classifying m/z peaks  are
                                         shown in Table 2.
                                           Analysis of  the control  versus  alcohol-exposed  B6J
                                         samples returned three affected m/z peaks with intensities
                                         that were reduced in a highly significant manner,
                                         with p values well below 0.00001: m/z = 3,163.5, 1,495.8,
                                         and  1,369.6  (Table 2). These changes  were evident irre-
                                         spective  of the number of malformed fetuses per litter in
                                         the overall AF  sample prior to, or after, exclusions of
                                         NTDs or bloody/cloudy samples.  As such,  there was no
                                         bias  in those AF samples that met criteria in terms of the
                                         number  of malformed fetuses in  the  sample by  group,
[Figure 1 image: MALDI-TOF spectra in two panels (A, B); x-axis m/z from 500 to 3,500, y-axis relative intensity (%).]
Figure 1. Sample MALDI-TOF spectrum of murine amniotic fluid samples collected on GD17. Samples from control (A) and alcohol-exposed
(B) B6J mouse fetuses: 50 µL aliquots were digested with methylated trypsin and 1 µL aliquots were prepared in an alpha-cyano matrix
for linear and reflected MALDI-TOF.

-------
[Figure 2 image: two panels, "Raw" (left) and "Standardized and Denoised" (right); x-axis m/z.]

   Figure 2. Preprocessing effects. Example of raw (left panel) and preprocessed (right panel) mass spectra (1,000-1,500 m/z region).
relative to the number of malformed fetuses per group in
the overall sample prior to exclusions. In contrast, a com-
parison of B6N samples revealed no peaks that were dif-
ferentially affected by alcohol in comparison to the con-
trols.  Thus, preliminary  screening  of  AF peptides  re-
vealed a consistent display of peptide  fragments across
samples and a significant differential display of several
peaks in the high-risk (B6J) pregnancies, but not in  the
low-risk (B6N) pregnancies. One of these peaks (3,163.5)
also differed significantly between control pregnancies
of the two substrains.
  We next analyzed the AF peptide fragment profiles to
determine which specific  peaks would discriminate  the
FAS-positive group.  Two  of the top three altered peaks
in alcohol-exposed B6J  pregnancies, namely m/z  values
3,163.5 and 1,495.8, were  also amongst the most impor-
                       Table 2
    Significantly Altered (Reduced) Mass Spectrum
 Peaks Identified in the Preliminary Murine Amniotic
                Fluid Proteomics Screen

Group comparison               Significantly           Top-significant
(n = 6)                        altered peaks (m/z)     classifiers (m/z)
B6J control vs. exposed        3,163.5                 3,163.5
                               1,495.8                 1,495.8
                               1,369.6
B6N control vs. exposed        None                    None
B6J control vs. B6N control    3,163.5                 3,163.5
                                                       2,802.3
                                                       2,228.2

  Spectra of mass/charge (m/z) ratios from trypsinized samples
by MALDI-TOF in reflectron mode were preprocessed by
the three-step schema (standardized, denoised, aligned) and
subjected to statistical analysis (n = 6 independent litters per
analysis).
       tant variables to classify the samples (Table 2). Repeated
       use of the RF algorithm yielded classification accuracy as
       high as 75% versus the 50% predicted for a random classifier
       that ignores the data. Therefore, in a limited sample size
       of n = 6 we achieved reasonable  success in distinguish-
       ing the FAS-positive group using straight MALDI-TOF
       analysis  of the  trypsinized AF  sample.  Peak  1,495.8,
       which was the second most significant peak in terms of
       the univariate t test, emerged as a strong diagnostic bio-
       marker based on  importance measures for classifying the
       alcohol-exposed B6J proteomic  profiles  across multiple
       RF runs (Table 2). In contrast, repeated use of the RF
       algorithm failed to classify alcohol-exposed and control
       B6N  proteomic profiles.  This  method  yielded  21-28%
       classification accuracy, which is even worse than purely
       random assignment (50%) and well below the 75% accuracy
       with the B6J comparison. Again, this is consistent
       with the FAS phenotype anchor.
         Running the RF procedure  to compare B6J and  B6N
       control  groups yielded  classification  accuracies up  to
       65%,  with three  peaks  of  m/z = 3,163.5, 2,802.3, and
       2,228.2 emerging  as the top-significant  classifiers (Table
       2). Although marginal in accuracy by recursive RF, one
       of these peaks (3,163.5) was significantly different at a p value
       well below .00001. Another peak (2,228.2) trended
       toward the effect  but was not statistically significant, per-
       haps as a limitation of the small sample size (n  = 6).
                 Specificity-Sensitivity Analysis
         Area under the receiver operating characteristic (ROC)
       curve (AUROC) was computed to determine the predic-
       tive classification power (sensitivity/specificity) of diag-
       nostic peaks identified in the MALDI-TOF screen. To this
       end, we selected the top five peaks (peaks 1-5) with the
       largest absolute t statistic with regards to the capacity to
       differentiate AF proteomic profiles between control and

-------
[Figure 3 image: two ROC panels, each with x-axis 1-Specificity from 0.0 to 1.0.]
Figure 3. Specificity-sensitivity analysis of the five top significant classifier peaks in the FAS-positive diagnostic profile. ROC curves
were drawn for ROC characteristics of the top five peaks from MALDI-TOF based on linear discriminant classifier of alcohol-exposed
and control B6J samples (named peaks 1-5, based on p values). Bootstrapped samples were randomly divided into training and test sets
for linear discriminant analysis with varying classification cut-offs. ROC curves plotted 1-specificity versus sensitivity for peaks 1-4 (left
panel) and peaks 4-5 (right panel). Maximum area under the ROC curve (1.0) was achieved using peaks 1-4, and near-unity (0.92) with
peaks 4-5.
alcohol-exposed B6J pregnancies. The first three of these
(peaks 1-3) were statistically significant  after multiple
hypotheses  correction  using Benjamini  and  Hochberg
false discovery rate control at 10%. Due to the limitation
in sample sizes (n = 6), we generated 50 bootstrap sam-
ples from the  original samples to perform  a cross-valida-
tory calculation (Efron  and Tibshirani, 1993) and deter-
mine ROC of  a linear discriminant classifier using these
peaks (Datta and de Padilla, 2006). Bootstrapped samples
were randomly divided  into  training set (35  samples)
and test  set (15 samples)  from each group for linear dis-
criminant analysis (Fisher,  1936) with varying classifica-
tion cut-offs.  Based on results  for the test samples, we
plotted 1-specificity  values against the sensitivity values
in order  to construct the  ROC curves (Fig. 3). The maxi-
mum AUROC (1.0)  was achieved using intensity values
of peaks 1-3 or peaks 1-4, indicating ideal classification
performance.  We also  obtained unity  AUROCs con-
structed  with  any of the top three peaks together with
peaks  4-5;  however, AUROC dropped  slightly (0.92)
when the ROC curve was  drawn using only two chan-
nels (peaks 4 and 5).
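
For readers unfamiliar with the AUROC, the sketch below computes it directly from classifier scores on test samples. It relies on the standard identity that the area under the ROC curve equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. It is a generic illustration with a hypothetical function name, not the bootstrap procedure used in the study.

```python
def auroc(scores_pos, scores_neg):
    """Area under the ROC curve from positive and negative sample scores."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(scores_pos) * len(scores_neg))
```

A value of 1.0 means that some cut-off separates the two groups perfectly, and 0.5 is what a classifier that ignores the data would be expected to achieve.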
         Provisional Peptide Identification
  Several more formal approaches were used for further
characterization of discriminating peaks in the MALDI-TOF
screen. First, peptide mass fingerprinting was
applied to the MALDI-TOF  mass spectra.  Because we
were  dealing with clean AF samples we assumed the
complexity of proteins was low enough for at least some
useful information to be drawn regarding the most abun-
dant peptide species present. Among the top-scoring can-
didates  the  most frequent  occurrences were:  mouse
      alpha-fetoprotein precursor (P02772), mouse serum albu-
      min  precursor  (P07724),  and myosin regulatory  light
      chain 2 skeletal  muscle isoform (P97457).  These peptides
      were detected in essentially all B6J and B6N control  sam-
      ples. Many peptides showed  weaker redundancy across
      samples;  a typical example   is shown for control and
      alcohol-exposed  B6J and  B6N samples with  regards to
      the five top-scoring candidate peptides (Table 3). In  all
      four cases the top two scoring candidates were, respec-
      tively,  alpha-fetoprotein and  serum albumin.  Whereas
      both proteins belong to the albuminoid gene family the
      current AF  screen did not indicate differences  in the
      MALDI-TOF peptides derived from serum albumin.
        An effort was made using in silico trypsin digestion to
      map the top peaks identified by statistical analysis (Table
      2) onto the most abundant proteins identified  by peptide
      mass fingerprinting. Although we could not identify the
      most significantly  reduced peak in the  alcohol-exposed
      B6J pregnancies  (m/z = 3,163.5), the second-most signifi-
      cantly reduced peak, which was also the second-most im-
      portant classifying peak (m/z  = 1,495.8), was successfully
      mapped to mouse  alpha-fetoprotein precursor. The third-
      most significantly reduced peak (m/z = 1,369.6) was also
      identified as a potential peptide of mouse alpha-fetoprotein
      precursor, assuming one missed in silico cleavage.
      Analysis of some B6J  samples by LC-MS/MS and MuD-
      PIT further  implicated both  peaks  1,495.8 and  1,369.6
      (assuming one missed cleavage) as trypsin-induced  pep-
      tide  fragments  cleaved from mouse  alpha-fetoprotein
      precursor protein.  The related amino acid sequences cor-
      responded to  residues 514-526  (DETYAPPPFSEDK) and
      191-201  (ADNKEECFQTK),   respectively,  of  this  605
      amino acid protein. Furthermore, among the top 20
      altered peaks observed in alcohol-exposed B6J samples in
      terms of absolute t statistic, nine of  them (m/z = 1,495.8,

-------
                       Table 3
    Top Five Scoring Proteins (in Terms of Lowest
    p Value from MALDI-TOF Comparison) Shown
   for Representative Control and Alcohol-Exposed
       AF Samples in B6J and B6N Pregnancies

Name of candidate protein (UniProt)                  Primary      Amino    Coverage
                                                     accession    acids
                                                     number
B6J (saline)
  Mouse alpha-fetoprotein precursor                  P02772       605      43%
  Mouse serum albumin precursor                      P07724       608      18%
  N-terminal acetyltransferase complex ARD1
    subunit homolog A                                Q9QY36       235      44%
  Short/branched chain specific acyl-CoA
    dehydrogenase, mitochondrial precursor           Q9DBL1       432      30%
  Myosin regulatory light chain 2,
    skeletal muscle isoform                          P97457       169      38%

B6J (ethanol)
  Mouse alpha-fetoprotein precursor                  P02772       605      30%
  Mouse serum albumin precursor                      P07724       608      23%
  Phospholipid hydroperoxide glutathione
    peroxidase, mitochondrial precursor              O70325       197      42%
  Serine/threonine-protein kinase TBK1               Q9WUN2       729      22%
  Potassium channel tetramerisation                  Q9D7X1       259      35%

B6N (saline)
  Mouse alpha-fetoprotein precursor                  P02772       605      45%
  Mouse serum albumin precursor                      P07724       608      34%
  Receptor-interacting serine/threonine-protein
    kinase 2                                         P58801       539      34%
  Vinculin                                           Q64727       1,066    19%
  Caspase-4 subunit p10                              P70343       373      58%

B6N (ethanol)
  Mouse alpha-fetoprotein precursor                  P02772       605      36%
  Mouse serum albumin precursor                      P07724       608      21%
  Putative Sp100-related protein                     Q99388       208      49%
  Spermatid-specific linker histone H1-like
    protein                                          Q9QYL0       170      51%
  Succinyl-CoA ligase                                Q9Z218       433      33%

  Based on the Aldente search engine (http://www.expasy.org/
tools/aldente/).
1,369.6, 1,774.9, 1,556.8, 1,638.8, 1,337.7,  1,685.9,  897.5,
897.6) were predicted from an in silico trypsin cleavage
of  mouse alpha-fetoprotein  precursor protein.   These
matches  cover  different  regions  of the  protein and
include bins  of 100, 200, 300, and 500 amino acid resi-
dues. Therefore, we conclude from  these  findings that
reduced detection of mouse alpha-fetoprotein precursor
protein accounted for two  of three  major classifier  peaks
that can distinguish the alcohol-exposed B6J  litters. The
third classifier (peak 3,163.5)  differentiated between the
sensitive strain  and the insensitive strain  in  the  unex-
posed, but also significantly differentiated between the
exposed and unexposed in the sensitive strain; however,
the identity of this peptide was not determined in the
present study. By these criteria mouse alpha-fetoprotein
precursor protein levels were not altered in the alcohol-
exposed B6N pregnancies. A few B6J fetuses that may
have been dead or severely malformed for days prior to
AF procurement were excluded from AF pooling.
Although it is feasible to assay AF from individual grav-
ida, such abnormal fetuses must ultimately be procured
at a much earlier gestational stage to be useful. In fact,
the individual AF from a few severely malformed fetuses
from  the exposed  B6J group  that we did  examine by
MALDI-TOF profiles were not  informative because there
were  no abnormal fetuses to compare from the control
B6J or exposed B6N groups (not shown).
                                                 DISCUSSION

                                The optic primordium is a critical target of alcohol in
                              experimental  teratogenesis and  in  FASD (Green  et al.,
                              2007; Higashiyama  et al., 2007). Early gestational expo-
                              sure to alcohol reprograms genetic  networks during ini-
                              tiation of the FAS in mice (Green et al., 2007). That effect
                              was demonstrated in the GD8 mouse embryonic headfold
                              at 3 h  following  a single maternal injection of ethanol
                              (2.9  g/kg).  In  the aftermath of global genetic responses
                              that clearly differentiated high-risk (B6J) from  low-risk
                              (B6N) inbred  lines of C57BL/6 mice (Green et al., 2007),
                              results from the present study show that dysmorphogen-
                              esis was associated with eventual changes in the AF-com-
                              partment of the fetus that could be detected by MALDI-
                              TOF  mass  spectrometry and specialized data  analysis
                              methods. Because the current FAS  animal model  is not
                              highly  penetrant  the  pooled  AF  samples would have
                              included grossly unaffected as well as malformed fetuses.
                              Both pedigrees (B6J, B6N) were exposed to alcohol and
                              for both test  substrains we  measured  an effect  of the
                              alcohol exposure in terms of fetal weight reduction, but
                              only  one  substrain showed  a response  in terms of
                              increased malformation rates. In contrast to malforma-
                              tions, the fetal weight reduction was more evenly distrib-
                              uted  across a  litter. Thus, the general and  specific  bio-
                              markers for FAS/FASD that might emerge from such an
                              AF analysis on GD17  following acute maternal alcohol
                              intoxication  on GD8  can  only be anchored  to  the
                              increased risk  for malformations on a litter basis. These
                              changes may be summarized as  follows: (a) the  AF pro-
                              teome  in  alcohol-exposed B6J  pregnancies showed  a
                              highly significant  drop in the abundance of three peaks
                              (m/z =  3,163.5, 1,495.8,  1,369.6);  (b) ROC analysis  found
                              these peaks to be  highly sensitive and specific for  classi-
                              fying the susceptible group by exposure; (c)  two of these
                              peaks (1,495.8,  1,369.6) mapped  to mouse alpha-fetopro-
                              tein  precursor protein, as did 9 of  the  20 most altered
                              peaks based on in silico digestion; and (d) none  of these
                              peaks were found to  be altered by alcohol in the B6N
                              substrain. Taken together, these findings suggest that dis-
                              crete changes  to the AF proteome can be anchored to the
                              observed risk for alcohol-related  birth defects in a mouse
                              model for FAS. We interpret these  changes to represent
                              an ability to  better identify fetuses more  likely  to be
                              affected with FAS dysmorphology in association with the
                              incidence rate of detectable malformations.
                                Recently, alpha-fetoprotein has been  considered as a
                              biomarker for  perinatal distress  (Mizejewski, 2007).  Dis-
                              cordant  levels  of AFP  in  AF have  been indicative of

-------
structural defects in the brain and spinal cord (elevated)
or low birth weight-fetal growth restriction (reduced).  In
general,  developmental  regulation of AFP may be con-
nected with growth and differentiation or perinatal stres-
sors as reflected in the functional role attributed to AFP
in small molecule binding and transport (e.g., fatty acids,
retinoids, hormones, heavy  metals, drugs, and toxicants).
Although interesting for the pathogenesis of developmen-
tal defects,  this is dispensable for major organogenesis  as
shown by  the  lack  of malformations in AFP  knockout
mice (Gabant et al., 2002).
  Realization of alpha-fetoprotein as a general biomarker
for  FAS  has  practical  implications  for  understanding
alcohol's mode of action on the fetus  as well as potential
translation  to  clinical  diagnostics.  On one hand, alpha-
fetoprotein  is released from various cell types and gains
access to the extracellular fluids such  as the AF compart-
ment  (AF-AFP) and  maternal serum (MS-AFP). There-
fore, lower amounts of AF-AFP may secondarily yield
lower  MS-AFP levels that  could, ultimately, reflect an
increased risk of alcohol-related  malformations in babies
whose mothers drink  heavily during pregnancy. In fact,
low MS-AFP  was found to  predict FAS  correctly in 59%
of alcoholic pregnancies (Halmesmaki et al.,  1987). That
study  followed  several standard  diagnostic  proteins
(human placental lactogen, pregnancy specific beta-1-gly-
coprotein, and alpha-fetoprotein) in 35 pregnant problem
drinkers and  14 abstinent control women, and concluded
that low alpha-fetoprotein and pregnancy specific beta-1-
glycoprotein in maternal serum were  useful indicators  in
predicting  FAS. We  arrived  at  the same link  between
alpha-fetoprotein  and FAS through a  completely inde-
pendent, non a priori  discovery-based screen  of the AF
proteome and support the scientific justification for moni-
toring MS-AFP as  part of  prenatal care  of  drinking
women and early screening for alcohol-damaged fetuses.
  On  the  other hand, the  reduction in AF-AFP raises
questions regarding  alcohol's mode of action on the AF
proteome.   The presence of alpha-fetoprotein has  been
detected almost universally  in postimplantation embryos,
yolk sac, amnion, embryonic disc, and early  primitive
streak stages for  all  mammalian species studied so  far
(Mizejewski, 2004). Apart from the long-running debate
on  alpha-fetoprotein's  role in brain development,  ele-
vated  MS-AFP  is a clinical  biomarker for  NTDs such  as
spina  bifida or anencephaly (Brock and Sutcliffe, 1972).
The failure of neural  tube  closure  results in leakage  of
this serum  protein into  the AF and MS at higher levels
than normal.  In contrast, MS-AFP is abnormally low  in
some  pregnancies that  carry  trisomy 21  (Davis et al.,
1985).  The clinical "triple test" performed at 14-22 weeks
of pregnancy is used to screen fetal and placental prod-
ucts in serum samples of expectant mothers >35 years  in
age  to detect  trisomy 21 (Spencer et al., 1997; Caserta
et al.,  1998; Wald et al., 2006a,b; Mizejewski, 2007). This
test  measures alpha-fetoprotein levels along with uncon-
jugated estriol and human chorionic gonadotropin that
are reduced in trisomy 21 pregnancies.
  The alpha-fetoprotein  precursor is synthesized at high
levels  by fetal liver  cells and visceral yolk sac endoder-
mal cells.   Thus,  acute  gestational  exposure to  alcohol
likely  alters the AF proteome as a secondary consequence
of fetal  development. Because fetal  growth  retardation
was observed in both strains (B6J, B6N) but reduced AFP
was detected only  in one strain (B6J), the data do not
      suggest that growth  retardation may have  affected the
      AFP level produced  in the fetal liver. Furthermore, we
      are not aware  of studies that implicate liver dysfunction
      in FAS children; however, the  drop in AF-AFP could
      reflect the aftermath of acute gestational alcohol exposure
      on liver and/or yolk sac development. This might have
      implications on multiple tissues because  alpha-fetopro-
      tein  functions  as a binding protein  for  small  molecules
      such  as vitamin D,  estrogens, fatty  acids,  and metals
      (Mizejewski, 2004;  Gitlin and Boesman, 1966; Attardi and
      Ruoslahti, 1976). In addition to its  role as  a  molecular
      troubleshooter, alpha-fetoprotein contains sequence motifs
      that  render it a druggable target in diagnostics or thera-
      peutics (Mizejewski, 2004; Uriel, 1989). Perhaps the motif
      sequence DETYAPPPFSEDK (m/z  =  1,495.8) could pro-
      vide a molecular target for early FAS diagnosis or thera-
      peutic intervention,  through an understanding of the
      small molecules that  might bind to this peptide domain.
      Additional  studies  will be needed to establish  the  link
      between  dysregulation  of the embryonic  transcriptome
      (Green et al., 2007) and  disruption of alpha-fetoprotein in
      the AF proteome  (current  study) following gestational
      alcohol exposure.
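The m/z quoted above for the DETYAPPPFSEDK peptide can be sanity-checked with a short calculation. The sketch below assumes a singly protonated ion ([M+H]+) and standard monoisotopic residue masses; the text does not state the ion type or mass convention, so both are assumptions made for illustration, and the function name is hypothetical.

```python
# Monoisotopic residue masses (Da) for the amino acids present in the peptide.
RESIDUE_MASS = {
    "A": 71.03711, "D": 115.02694, "E": 129.04259, "F": 147.06841,
    "K": 128.09496, "P": 97.05276, "S": 87.03203, "T": 101.04768,
    "Y": 163.06333,
}
WATER = 18.01056   # one H2O accounts for the free N- and C-termini
PROTON = 1.00728   # assumed charge carrier for a singly charged ion


def singly_protonated_mz(peptide: str) -> float:
    """Return the monoisotopic m/z of the [M+H]+ ion of a linear peptide."""
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return neutral_mass + PROTON


mz = singly_protonated_mz("DETYAPPPFSEDK")
print(f"[M+H]+ = {mz:.2f}")
```

Under these assumptions the calculation gives about 1495.66, which is within roughly 0.15 Da of the reported 1,495.8.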
                    ACKNOWLEDGMENTS

        The authors are grateful to Kenneth Lyons Jones, M.D.
      of the University of California San Diego for thoughtful
      input.



                          REFERENCES

      Alexander-Kaufman K, James G, Sheedy D, et al. 2006. Differential pro-
         tein expression in the prefrontal white matter of human alcoholics: a
         proteomics study. Mol Psychiatry 11:56-65.
      Attardi B, Ruoslahti E. 1976. Foetoneonatal oestradiol-binding protein in
         mouse brain cytosol is alpha-fetoprotein. Nature 263:685-687.
      Bearer CF, Stoler JM, Cook JD, et al. 2005. Biomarkers of Alcohol Use in
         Pregnancy. Alcohol Research and Health, NIAAA publications 28:38-
         43 (pubs.niaaa.nih.gov/publications/arh28-1/38-43.pdf).
      Benjamini Y, Hochberg Y.  1995. Controlling the false discovery rate: a
         practical and powerful approach to multiple testing. J Royal Statist
         Soc Series B Methodol 57:289-300.
      Bertrand J,  Floyd RL, Weber MK, et al. 2004. National Task Force  on
         FAS/FAE.  Fetal  alcohol  syndrome:  guidelines  for referral and
         diagnosis. Atlanta, GA: Centers for Disease  Control and Preven-
         tion (http://www.cdc.gov/ncbddd/fas/documents/FAS_guidelines_
         accessible.pdf).
      Breiman L. 2001. Random Forests. Machine Learning 45:5-32.
      Brock DJH,  Sutcliffe RG. 1972. Alpha-fetoprotein in the antenatal diagno-
         sis of anencephaly and spina bifida. Lancet 2:197-199.
      Caserta D, Baldi M, Carta G, et al.  1998. Tri-test: clinical considerations
         on 1784 cases. Minerva Ginecol 50:73-75.
      Chen SY, Charness ME, Wilkemeyer MF, et al. 2005. Peptide-mediated
         protection from ethanol-induced neural tube defects. Dev Neurosci
         27:13-19.
      Cheung CY, Brace RA. 2005. Amniotic fluid volume and composition in
         mouse pregnancy. J Soc Gynecol Investig 12:558-562.
      Datta S, de Padilla LM. 2006. Feature selection and machine learning with
         mass spectrometry data for  distinguishing cancer and non-cancer
         samples. Statistical Methodology:  Special Issue on Bioinformatics
         3:79-92.
      Davis RO, Casper P, Huddleston JF, et al. 1985. Decreased levels of amni-
         otic fluid AFP associated with Down syndrome. Am J Obstet Gynecol
         153:541-544.
      Ducharme  GR, Gannoun A, Guertin M-C, et al. 1995. Reference values
         obtained by kernel-based estimation of quantile regressions. Biomet-
         rics 51:1105-1116.
      Efron B, Tibshirani RJ. 1993. An Introduction to the Bootstrap. New York:
         Chapman & Hall.
      Fisher RA. 1936. The Use of Multiple Measurements in Taxonomic Prob-
         lems. Ann Eugenics 7:179-188.
                                                                         Birth Defects Research (Part A) 82:177- 186 (2008)
Gabant P, Forrester L, Nichols J, et al. 2002. Alpha-fetoprotein, the major
    fetal serum protein, is not essential for embryonic development but
    is  required for female fertility. Proc Natl Acad Sci USA 99:12865-
    12870.
Gitlin D, Boesman M. 1966. Serum AFP, albumin, and gamma-G-globulin in the
    human conceptus. J Clin Invest 45:1826-1830.
Goodlett CR, Horn KH, Zhou FC. 2005.  Alcohol teratogenesis: mecha-
    nisms  of  damage and strategies  for  intervention.  Exp  Biol  Med
    230:394-406.
Green  ML, Singh AV, Zhang  Y,  et al. 2007. Reprogramming  of  genetic
    networks during initiation of fetal alcohol  syndrome. Dev Dynam
    236:613-631.
Halmesmaki E, Autti I, Granstrom ML, et al. 1987. Prediction of fetal alco-
    hol syndrome by maternal alpha fetoprotein, human placental lactogen
    and pregnancy specific beta 1-glycoprotein. Alcohol Suppl 1:473-476.
Higashiyama D, Saitsu H,  Komada M, et al. 2007. Sequential developmen-
    tal changes in holoprosencephalic mouse embryos exposed to ethanol
    during the gastrulation period. Birth Defects Res A Clin Mol  Teratol
    79:513-523.
Hovland DN Jr, Cantor RM, Lee GS, et al. 2000. Identification of a murine
    locus conveying susceptibility to cadmium-induced  forelimb malfor-
    mations. Genomics 63:193-201.
Jones EA, Clement-Jones M, James OFW, et al. 2001. Differences between
    human and  mouse  alpha-fetoprotein expression during early devel-
    opment. Ann Hum Genet 198:555-559.
Jones KL, Smith DW, Ulleland  CH, et al. 1973. Pattern of malformation in
    offspring of chronic alcoholic  mothers. Lancet 1:1267-1271.
Kotch LE, Sulik KK. 1992.  Experimental fetal alcohol syndrome: proposed
    pathogenic basis for a variety of associated  facial and brain anoma-
    lies. Am J Med Genet  44:168-176.
Lewohl JM, Van Dyk DD, Craft GE, et al.  2004.  The application  of pro-
    teomics to the human alcoholic brain. Ann NY Acad Sci 1025:14-26.
Mizejewski GJ. 2004.  Biological roles of alpha-fetoprotein  during preg-
    nancy and perinatal development. Exp  Biol Med 229:439-463.
Mizejewski GJ. 2007. Physiology  of alpha-fetoprotein as a biomarker for
    perinatal distress: relevance to adverse pregnancy outcome. Exp Biol
    Med (Maywood) 232:993-1004.
Satten  GA, Datta S, Moura H,  et  al. 2004. Standardization and denoising
    algorithms  for mass  spectra to  classify  whole-organism bacterial
    specimens. Bioinformatics 20:3128-3136.
Sorace  JM, Zhan M. 2003. A data review  and re-assessment of ovarian
    cancer serum proteomic profiling. BMC Bioinformatics 4:24.
Spencer K, Muller F, Aitken DA. 1997. Biochemical markers  of trisomy 21
    in  amniotic fluid. Prenat Diagn 17:31-37.
Streissguth AP, Barr HM,  Kogan  J, et al. 1996. Understanding  the occur-
    rence of secondary disabilities in clients with fetal alcohol syndrome
             (FAS) and fetal alcohol effects (FAE): Final report to the Centers for
             Disease Control and Prevention (CDC). Final Report to the Centers
             for  Disease  Control and  Prevention  (CDC), August,  1996, Seattle:
             University of Washington, Fetal Alcohol & Drug Unit, Tech. Rep. No:
             96-06.
         Sulik  KK. 2005. Genesis of alcohol-induced craniofacial dysmorphism.
             Exp Biol Med 230:366-375.
         Sulik KK, Johnston MC. 1983.  Sequence of developmental alterations fol-
             lowing acute ethanol exposure in mice: craniofacial features of the fe-
             tal alcohol syndrome. Am  J Anat 166:257-269.
         Sulik KK, Johnston MC, Daft PA, et al. 1986. Fetal alcohol syndrome and
             DiGeorge anomaly: critical ethanol exposure periods for craniofacial
             malformations as illustrated in an animal model. Am  J Med Genet
             (suppl) 2:97-112.
         Tsangaris G, Weitzdorfer R, Pollak D,  et al. 2005. The amniotic  proteome.
             Electrophoresis  26:1168-1173.
         Tuloup M, Hoogland C, Binz PA, et al. 2002. A new Peptide Mass Finger-
             printing tool on  ExPASy: ALDentE. Swiss Proteomics  Society  2002
             congress. Applied Proteomics. Lausanne (CH), Dec:3-5.
         Uriel J.  1989. The physiological role of alpha-fetoprotein in cell growth
             and differentiation. J Nucl Med Allied Sci 33:12-17.
         Viljoen DL, Carr LG, Foroud TM, et al. 2001. Alcohol dehydrogenase-2*2
             allele is associated  with  decreased prevalence of fetal alcohol  syn-
             drome in the mixed-ancestry population of the Western Cape Prov-
             ince, South Africa. Alcohol Clin Exp Res 25:1719-1722.
         Wald  NJ, Barnes IM, Birger R,  et al. 2006a.  Effect  on Down  syndrome
             screening performance of  adjusting for marker levels in a previous
             pregnancy. Prenat Diagn 26:539-544.
         Wald  NJ, Morris  JK, Ibison J,  et al. 2006b.  Screening in early pregnancy
             for  pre-eclampsia  using  Down syndrome quadruple  test markers.
             Prenat Diagn 26:559-564.
         Washburn MP, Wolters D, Yates JR. 2001. Large-scale analysis of the yeast
             proteome by multidimensional protein identification technology. Nat
             Biotechnol 19:242-247.
         Webster WS, Walsh DA, Lipson AH, McEwen SE. 1980. Teratogenesis af-
              ter acute alcohol exposure in inbred and outbred mice. Neurobehav
              Toxicol 2:227-234.
         Westney L, Bruney R, Ross B, et al. 1991. Evidence that gonadal hormone
             levels in amniotic fluid are decreased  in males born to  alcohol users
              in humans. Alcohol Alcoholism 26:403-407.
         Wolters DA, Washburn MP, Yates JR. 2001. An automated multidimen-
             sional protein identification technology for shotgun proteomics.  Anal
             Chem 73:5683-5690.
         Yamamoto R, Azuma M, Wakui Y, et al. 2001. Alpha-fetoprotein microhe-
             terogeneity:  a potential  biochemical marker for Down's syndrome.
             Clin Chim Acta 304:137-141.

-------
© 2009 Wiley-Liss, Inc.
               Birth Defects Research (Part A) 85:732-740 (2009)
  Inducible  70  kDa Heat  Shock  Proteins  Protect  Embryos

    from  Teratogen-Induced Exencephaly: Analysis Using

                           Hspa1a/a1b  Knockout  Mice


                       Marianne Barrier,1,2* David J. Dix,3 and Philip E. Mirkes1,2
   1Birth Defects Research Laboratory, Division of Genetics and Development, Department of Pediatrics, University of Washington,
                                            Seattle, Washington
            2Department of Veterinary Physiology and Pharmacology, Texas A&M University, College Station, Texas
       3Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina
                        Received 20 February 2009; Revised 28 May 2009; Accepted 28 May 2009


BACKGROUND: It is well known that a variety of teratogens induce neural tube defects in animals; however,
less is known about proteins that play a role in protecting embryos from teratogen-induced neural tube
defects. Previously, our laboratory has shown that embryos overexpressing the 70-kDa heat shock proteins
(HSPs) Hspa1a and Hspa1b were partially protected from the deleterious effects of exposure to hyperthermia
in vitro. METHODS: In the present studies, we have used a transgenic mouse in which both of the
stress-inducible HSPs Hspa1a and Hspa1b were deleted by homologous recombination. Time-mated
Hspa1a/a1b knockout (KO) and wildtype (WT) mice were exposed to hyperthermia in vivo on gestational day
8.5. RESULTS: Results show that 52% of the gestational day 15 fetuses from KO litters were exencephalic,
whereas only 20% of WT fetuses were affected. In addition, 6% of treated KO fetuses also exhibited eye
defects (microphthalmia and anophthalmia), defects not observed in WT fetuses exposed to hyperthermia.
Lysotracker red staining and caspase-3 enzyme activity were examined within 10 hours after exposure to
hyperthermia, and significantly greater levels of apoptosis and enzyme activity were observed in the KO
embryos compared with WT embryos. CONCLUSIONS: These results show that embryos lacking the Hspa1a
and Hspa1b genes are significantly more sensitive to hyperthermia-induced neural tube and eye defects, and
this increased sensitivity is correlated with increased amounts of apoptosis. Thus, these results also suggest
that Hspa1a and Hspa1b play an important role in protecting embryos from hyperthermia-induced congenital
defects, possibly by reducing hyperthermia-induced apoptosis.   Birth Defects Research (Part A) 85:732-740,
2009.   © 2009 Wiley-Liss, Inc.

Key words: Hsp70; Hspa1a; Hspa1b; hyperthermia; neural tube defect
                INTRODUCTION
  Neural tube defects (NTDs) are some of the most com-
mon congenital defects, with  approximately 4000  preg-
nancies per year,  or  12 per day, affected by an NTD in
the United States (Finnell et al.,  2000).  Neural tube
defects are related to  failure of the  embryonic neural
folds to fuse properly along the neuroaxis. Two common
forms of NTDs are anencephaly (incomplete closure of
the anterior neural folds)  and spina bifida (incomplete
closure along the posterior neural folds).  In  humans,
potential causative agents  for  NTDs  include retinoids,
pesticides,  organic  solvents,  ionizing radiation,  vinyl
chloride, water  nitrates, and  disinfection  by-products
(Finnell et  al., 2000; Padmanabhan, 2006). In addition, the
anticonvulsant medications carbamazepine and valproic
     Additional Supporting Information may be found in the online version of this
     article.
     Presented in part at the 46th annual meeting of the Teratology Society
     June 24 to  29, 2006 at the Loews Ventana Canyon Resort in Tucson,
     Arizona.
     The U.S. Environmental Protection Agency through its Office of Research
     and Development partially funded and collaborated in the research
     described here. It has been subjected to agency  review and approved for
     publication.
     Supported by National Institutes of Health grants  R01ES07026, R01ES08744,
     P30ES07033.
     Philip E. Mirkes's current address: Department of Veterinary Physiology and
     Pharmacology, Texas A&M University, College Station, Texas.
     *Correspondence to: Marianne Barrier, US EPA, ORD, Research Triangle Park,
     NC 27711. E-mail: barrier.marianne@epa.gov
     Published online 28 July 2009 in Wiley InterScience (www.interscience.
     wiley.com).
     DOI: 10.1002/bdra.20610
Birth Defects Research (Part A): Clinical and Molecular Teratology 85:732- 740 (2009)
acid have  also  been implicated in  NTDs in humans
(Lammer et al., 1987; Rosa, 1991). Several studies impli-
cate maternal hyperthermia as a risk factor for NTDs in
humans (Edwards, 1986;  Graham et al., 1998; Moretti
et al., 2005).
  Previous work from  our laboratory has shown that
hyperthermia primarily targets  cells  in  the developing
rodent central nervous system, inducing  excessive  levels
of cell death (Mirkes, 1985; Mirkes and Little, 1998, 2000).
In  addition to inducing  heat  shock protein   synthesis
(Lindquist  and Craig, 1988; Li  and Nussenzweig, 1996;
Nagata, 1996; Welch, 1987),  hyperthermia also rapidly
activates the extracellular signal-regulated protein kinases
(ERKs), c-JUN N-terminal  kinases  (JNK)  and  stress-
activated protein  kinase (p38) signal  transduction path-
ways in postimplantation  mouse embryos (Kyriakis and
Avruch, 1996; Verheij et al., 1996; Mirkes et al.,  2000).
  Although a variety of physical and chemical  agents are
known that can disrupt embryonic  development in ani-
mals and humans, little is known about  factors that  can
be  activated to protect  early postimplantation mamma-
lian embryos  from  these teratogens. Nonetheless,  an
extensive literature documents that heat shock proteins
(HSPs), particularly 70 kDa Hspa1a and Hspa1b, can pro-
tect cells from a variety of toxic exposures such as
heat,  radiation,  oxidative  stress,  and  chemical  toxins
(Jaattela et al.,  1992; Moseley, 1996;  Nollen et al., 1999;
Hunt et al., 2004; Mayer  and Bukau, 2005; Niu et  al.,
2006).
  In mice,  the 70 kDa Hspa (formerly known  as Hsp70)
family consists of 13 known members  in NCBI  Entrez
Gene (http://www.ncbi.nlm.nih.gov/). The majority of
these 13  Hspa  genes  and proteins are  constitutively
expressed   in  the  absence  of stress.  Constitutively
expressed   Hspa are  known  to  act  as  chaperones that
assist in folding,  transport, assembly, and function of
proteins in the cytoplasm, mitochondria, endoplasmic
reticulum,  and  nucleus  (Beckmann et al., 1990; Shi and
Thomas, 1992; Georgopoulos  and Welch, 1993). In addi-
tion to  these  constitutively expressed members  of  the
family, two other members, Hspa1a and Hspa1b, are rap-
idly induced in response to various stresses. In this arti-
cle, we will use  Hspa1a/a1b to indicate Hspa1a and
Hspa1b, unless specifically indicated otherwise. These in-
ducible Hspa presumably function in a manner similar to
the  constitutive Hspa; however, they do so in the context
of  stress-induced  alterations  in  cellular  metabolism
(Gabai et  al., 1995; Kampinga et al., 1995; Stege et  al.,
1995). In addition,  more recent evidence suggests that in-
ducible Hspa1a/a1b  play a direct role in protecting cells
from a variety of stresses by inhibiting stress-induced ap-
optosis (Mosser et al., 1997; Mosser et al., 2000; Didelot
et al., 2006). For example,  Hspa1a/a1b affect caspase-de-
pendent apoptosis by inhibiting translocation of the Bcl-2
family member Bax  (Stankiewicz et al.,  2005),  inhibiting
cytochrome c release from  the mitochondria, and prevent-
ing the activation of caspase-3 through modulation of the
apoptosome (Li et al., 2000; Matsumori et al., 2006; Steel
et al., 2004; Tsuchiya et al., 2003). Hspa1a/a1b are also
thought to  affect caspase-independent apoptosis, because a
protective   effect   was  seen  after  the  transfection  of
Hspa1a/a1b  in  cells treated with  exogenous  caspase
inhibitors  (Creagh et al.,  2000). Other  mechanisms by
which Hspa1a/a1b are believed  to affect stress-induced
apoptosis   include acting as an  effector  of the  antiapoptotic,
      prosurvival kinase Akt/PKB (Barati et al., 2006; Rafiee
      et al., 2006), inhibiting the activation of a key mitogen-
      activated protein kinase, JNK1 (c-Jun N-terminal kinase 1;
      Meriin et al., 1999; Yaglom et al., 1999; Gabai et al., 2000;
      Park et  al., 2001; Lee et al., 2005), and inhibiting the
      release  of the proapoptotic  protein Smac/DIABLO from
      myocyte mitochondria (Jiang  et al., 2005).
       Rat and  mouse postimplantation embryos  are capable
      of responding to heat stress with the induction of several
      HSPs (Mirkes, 1987; Walsh et al., 1987; Higo  et al., 1989;
      Bennett et al., 1990; Honda et al., 1991), the most promi-
      nent being Hspa1a/a1b. Hspa1a/a1b can be induced in
      day 10 rat  embryos (day 8.5 in mice)  by temperatures
      above 40°C. At these temperatures, synthesis  of Hspa1a/
      a1b can  be detected within 30 to 60 minutes after expo-
      sure (Mirkes, 1987),  and accumulation  of Hspa1a/a1b
      protein can be  detected within  2.5 hours (Mirkes  and
      Doggett, 1992).  Once  synthesized, Hspa1a/a1b protein
      can be detected in the embryo for up to 24 hours. Thus,
      temperatures that exceed the  normal growth  tempera-
      tures (37-38°C) by more than 3°C rapidly induce the syn-
      thesis and accumulation of specific HSPs.
       We hypothesize that the induction of Hspa1a/a1b after
      heat shock  helps protect embryos  from hyperthermia-
      induced  NTDs by  reducing  the  induction of apoptosis.
      The goal of the present study was  to observe Hspa1a/
      a1b protection of early postimplantation mouse embryos
      and to determine the  molecular mechanisms underlying
      these protective effects.


               MATERIALS AND METHODS
          Hspa1a/a1b Knockout and Wildtype Mice
       In the present studies,  we have used a transgenic
      mouse in which both of the  stress-inducible members of
      the  70-kDa  HSP family (Hspa1a/a1b) were  deleted by
      homologous  recombination  from C57BL/6J mice.  The
      University    of   Washington   received   heterozygous
      Hspa1a/a1b knockout (KO)  mice from  the USEPA.  The
      heterozygotes were then crossed to produce Hspa1a/a1b
      KO and wildtype (WT) lines, which were then maintained
      separately. Knockout  and  WT genotypes were verified
      with PCR  analysis of tail clips  using F191:5'GTVl  CAC
      TTT AAA CTC  CCT CC 3' and R644 5'CTG CTT CTC TTG
      GCT TCG3' primers. Knockout and WT  embryos were
      evaluated for somite number at several gestational time
      points from 8 days, 22 hours  to 9 days, 12 hours (4 litters
      per time point) to determine whether there is  a difference
      in developmental timing between the two strains.

               In Vivo Hyperthermia Exposure
       Females were mated overnight with males  of the same
      strain and checked for a vaginal plug at 8:00  AM the next
      morning. For females with a  plug, gestational day 0  was
      determined  to  start at midnight the night  before.  To
      determine pregnancy, the weight of each plugged female
      was recorded at the time of plug check and again on the
      morning of gestational day 8.5. Mice that demonstrated a
      weight gain of 2 g or more were assumed  to be preg-
      nant, and those that gained less were assumed not to be
      pregnant. At noon on gestational day 8.5, the pregnant
      dams were exposed for 10 minutes to a water bath set at
      38°C  for controls
and 43°C for hyperthermia treatment. For all treatments,
the pregnant dam was placed in  a 50-ml conical tube
with the tip removed for air flow and holes drilled into
the sides to allow water to circulate around the animal.
The restrained mouse was then partially submerged in
the water bath set at the appropriate temperature. Mater-
nal core body temperature was monitored and recorded
via a rectal probe (Skin Temp Probe from MiniMitter,
Bend, OR).  A Minilogger  (MiniMitter,  Bend,  OR)  was
used  to display  real-time  temperature  on a  computer
using in-house visualization software. The  depth of the
mouse  in the water bath was  adjusted as needed to
achieve  consistent temperature exposure levels. For the
control  treatment,  maternal   temperature was  kept
between 37° and 38°C.  For the hyperthermia treatment,
maternal temperature was  monitored so that it reached
43°C  at the 7-minute  mark and was held at 43°C  for the
remaining  3  minutes  of the   exposure.  Following
the 10-minute exposure, the animal was removed from
the water bath, patted dry briefly with paper towels, and
quickly  placed in a 38°C incubator and  monitored until
core temperature returned to  normal. The  treated mice
were then returned to their cage until the embryo collec-
tion time point. At the appropriate time  point, the preg-
nant dams were euthanized by cervical  dislocation, and
the gravid uteri were removed  for embryo extraction.
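The exposure profile described above (core temperature reaching 43°C at the 7-minute mark and held there for the remaining 3 minutes) lends itself to a simple check on a logged temperature trace. The following is only a sketch: the function name, the sample format, the 0.3°C tolerance, and the example readings are all hypothetical and are not taken from the in-house software mentioned in the text.

```python
def check_hyperthermia_trace(samples, target_c=43.0, ramp_end_min=7.0,
                             total_min=10.0, tol_c=0.3):
    """Check a logged core-temperature trace against the exposure profile.

    samples: list of (minutes_since_start, core_temp_c) tuples.
    Returns True when every reading in the hold window is within tol_c of
    target_c and no reading during the ramp overshoots target_c + tol_c.
    """
    hold = [t for m, t in samples if ramp_end_min <= m <= total_min]
    ramp = [t for m, t in samples if m < ramp_end_min]
    if not hold:
        return False  # no readings in the hold window, nothing to verify
    held = all(abs(t - target_c) <= tol_c for t in hold)
    no_overshoot = all(t <= target_c + tol_c for t in ramp)
    return held and no_overshoot


# Illustrative readings only (not measured data).
example_trace = [(0, 37.5), (2, 39.4), (4, 41.0), (6, 42.4),
                 (7, 43.0), (8, 43.1), (9, 42.9), (10, 43.0)]
print(check_hyperthermia_trace(example_trace))
```

A trace that fails to reach the target by the 7-minute mark, or drifts away from it during the hold, would return False and could be flagged for exclusion or review.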

         Gestational Day 15.5 Evaluations
  At least 10 hyperthermia- and 5 control-treated litters
from  each of the Hspa1a/a1b  KO and WT strains were eval-
uated for NTDs.  Time-mated females were treated at ges-
tational  day 8.5, and the litters were extracted at gesta-
tional day 15.5. For each litter, the numbers of live and
resorbed fetuses  were recorded, and the  fetuses were
evaluated  for  developmental   abnormalities. Differences
in percent  resorbed  and abnormal fetuses  among treat-
ment  groups were analyzed using a one-way ANOVA
with Bonferroni's  Multiple Comparison test performed
using  GraphPad  Prism  version   5.01   for   Windows
(GraphPad Software, San Diego, CA).
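The authors ran this analysis in GraphPad Prism. As a rough illustration of what a one-way ANOVA with a Bonferroni correction computes, the sketch below calculates the F statistic from group values and the per-comparison significance threshold for all pairwise comparisons. The function names are hypothetical, and treating each litter's percentage as one observation is an assumption, since the text does not state the unit of analysis. Obtaining a p-value additionally requires the F distribution (for example via `scipy.stats.f_oneway`), which is omitted here to keep the example dependency-free.

```python
from itertools import combinations


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    # Between-group sum of squares: group size times squared mean deviation.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    # Within-group sum of squares: deviations from each group's own mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def bonferroni_alpha(n_groups, alpha=0.05):
    """Per-comparison alpha when all pairwise comparisons are tested."""
    n_pairs = len(list(combinations(range(n_groups), 2)))
    return alpha / n_pairs


# Toy numbers chosen so the arithmetic is easy to follow (not study data).
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, df_b, df_w)
print(bonferroni_alpha(4))
```

With four strain-by-treatment groups there are six pairwise comparisons, so each would be tested at 0.05/6 under this scheme.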

   Tissue Collection for Western Blot Analysis
  Time-mated females were treated at  gestational  day
8.5. At  1, 2.5, and 5  hours after the 10-minute exposure
(either 38°C or 43°C), embryos were removed from the
uterus for analysis. Embryos from  three  litters  were col-
lected for each of the 12 strain (Hspa1a/a1b KO, WT) ×
treatment (hyperthermia, control)  × time point (1-,  2.5-,
5-hour)  combinations. Embryos dissected from the uterus
and surrounding membranes were  flash  frozen or stored
in RNAlater (AM7020, Applied Biosystems, Foster City,
CA).  Protein  was  extracted  with  the  mirVana  Kit
(AM1561,  Applied Biosystems,  Foster  City,  CA)  and
quantified using BCA protein assay (23227;  Pierce, Rock-
ford, IL). Samples were run on  12.5% PAGE gels (Protean
4 system,  BioRad) and  transferred to PolyScreen PVDF
hybridization transfer membrane (PerkinElmer,  Boston,
MA) using a semidry transfer apparatus (Ellard Instru-
mentation, Ltd., Monroe, WA). Immunoblot analysis was
performed  using 3% nonfat dry milk  in  Tris-buffered
saline/0.5% Tween for  blocking and antibody dilutions.
The  primary  antibodies used were for  the  inducible
forms of Hspa1a/a1b  (SPA-810; Stressgen, Ann Arbor,
MI) and beta-actin (A3854; Sigma-Aldrich, Inc., St. Louis,
        MO) as a loading control.  Membranes were incubated
        overnight (Hspa1a/a1b) or for 2 hours (actin). The secondary
        antibody used  was  HRP-linked  antimouse  secondary
        (NA931;  GE Healthcare Bio-Sciences  Corp., Piscataway,
        NJ).  ECL  Plus  Western  Blotting  Detection  System
        (RPN2132; GE Healthcare Bio-Sciences Corp., Piscataway,
        NJ)  was used for detection  of the antibodies, and  mem-
        branes were imaged with the Kodak Image Station 440CF
        (Eastman Kodak Co., Rochester, NY).


            Lysotracker Red Staining of Lysosomes in
                          Whole Embryos
          Time-mated  females were treated  at gestational day
        8.5.  At 10 hours after the 10-minute  exposure (38°C  or
        43°C), embryos were removed from the uterus for analy-
        sis.  At least  three  litters were  collected for each of the
        four strain (Hspa1a/a1b KO, WT) ×  treatment (hyper-
        thermia, control) combinations. Embryos dissected from
        the  uterus and  surrounding membranes  were stained
        with 5  µM  Lysotracker  Red  (Molecular Probes,  Inc.,
        Eugene, OR) in 1X Hank's balanced salt solution (Molec-
        ular Probes, Inc., Eugene, OR) for 30  minutes at 38°C in
        the  dark. Embryos were then fixed in 4% paraformalde-
        hyde overnight at 4°C. The fixed embryos were mounted
        on  microscope  slides using  ProLong  Gold antifade
        mounting media and  Coverwell imaging  chamber gas-
        kets (Invitrogen, Carlsbad, CA).  The head regions of the
        embryos  were   imaged  using  the   BioRad  Radiance
        2000MP microscope  (Bio-Rad  Laboratories,  Inc.,  Rich-
        mond, CA) with 20X  objective, 0.75NA (or 10X, 0.45NA)
        at the  Texas A&M Image  Analysis   Laboratory.  Lyso-
        tracker fluorescence was detected using 568-nm excitation
        wavelength and 590-nm  emission wavelength. Images
        were analyzed  using Metamorph (Universal  Imaging,
        West Chester, PA). In brief, each image was corrected by
        subtracting the background and transformed  to a binary
        image.  The pixels  present were then  counted to deter-
        mine the signal per unit area. Differences in Lysotracker
        staining among treatment groups were analyzed using a
        one-way ANOVA  with Bonferroni's  Multiple Compari-
        son  test performed using GraphPad  Prism version 5.01
        for Windows (GraphPad Software).
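The image quantification described above (background subtraction, conversion to a binary image, and counting positive pixels per unit area) can be sketched as follows. This is not the Metamorph procedure itself: the function name, the fixed background value, the threshold, and the toy image are all hypothetical and are used only to make the steps concrete.

```python
def signal_per_unit_area(image, background, threshold, pixel_area=1.0):
    """Fraction of image area occupied by above-threshold signal.

    image: 2D list of pixel intensities (rows of equal length).
    background: intensity subtracted from every pixel (assumed constant).
    threshold: a corrected pixel above this value counts as signal.
    pixel_area: area represented by one pixel, in the caller's units.
    """
    rows, cols = len(image), len(image[0])
    positive = 0
    for row in image:
        for value in row:
            corrected = max(value - background, 0)  # background subtraction
            if corrected > threshold:               # binarization step
                positive += 1
    positive_area = positive * pixel_area
    total_area = rows * cols * pixel_area
    return positive_area / total_area


# Toy 3 x 4 image (not real data): four pixels exceed the threshold.
toy_image = [
    [10, 12, 80, 90],
    [11, 200, 13, 9],
    [10, 10, 150, 12],
]
fraction = signal_per_unit_area(toy_image, background=10, threshold=50)
print(round(fraction, 3))
```

In practice the background would usually be estimated per image from a signal-free region rather than passed as a constant, and the resulting per-embryo fractions would then feed the group comparison.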


                     Caspase-3 Enzyme Assay
          Time-mated  females were treated  at gestational day
        8.5.  At 5 hours after the 10-minute exposure (either 38°C
        or 43°C), embryos were  removed from the uterus for
        analysis of caspase-3  enzyme activity. Three litters were
        collected  for each  of  the four strain  (Hspa1a/a1b KO,
        WT) × treatment (hyperthermia, control)  combinations.
        The  embryos were dissected from the uterus and sur-
        rounding membranes,  and then the heads were removed
        and  stored  in phosphate buffered saline  at -20°C for
        approximately 3 weeks until use. Each embryo head was
        processed individually by adding cell lysis buffer (Cas-
        pase-3  Cellular  Activity  Assay Kit PLUS,  AK-703; BIO-
        MOL Research Laboratories, Plymouth Meeting, PA) and
        disrupting the tissue by pipetting with a  standard p10
        pipette tip. Caspase-3 activity was  measured for each
        head using  the BIOMOL Caspase-3 Cellular Activity
        Assay Kit according  to  the  manufacturer's instructions
        and   using the  Power Wave XC Universal  Microplate
        Spectrophotometer plate reader (BIO-TEK Instruments,
Inc., Winooski, VT). Activity readings were taken every
10 minutes for 4 hours. Enzyme activity was determined
by the rate of cleavage of a caspase-3 colorimetric
substrate (DEVD-pNA) normalized to embryo protein
content (fmol/min/µg protein) as described in the
QuantiZyme Assay Kit. A one-way ANOVA with Bonferroni's
multiple comparison test was performed using GraphPad
Prism version 5.01 for Windows (GraphPad Software) to
determine significance in pairwise comparisons of
enzyme activity.
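
The rate calculation described above can be sketched as a least-squares slope of product against time, divided by the protein content of the sample. This is only an illustration under stated assumptions: the absorbance readings are assumed to have already been converted to fmol of pNA (the kit's conversion is not given here), and the data, function names, and protein amount are hypothetical.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def specific_activity(times_min, pna_fmol, protein_ug):
    """Cleavage rate normalized to protein content (fmol/min/ug protein)."""
    return slope(times_min, pna_fmol) / protein_ug


# One reading every 10 minutes for 4 hours gives 25 time points.
times = list(range(0, 241, 10))
# Hypothetical, perfectly linear product curve rising at 2 fmol pNA per minute.
pna = [5.0 + 2.0 * t for t in times]
# Hypothetical embryo head containing 4 ug of protein.
activity = specific_activity(times, pna, protein_ug=4.0)
```

With these made-up inputs the slope is 2 fmol/min and the specific activity is 0.5 fmol/min/µg protein; real readings would be noisier and might need the linear range of the curve to be selected first.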
[Figure 1A lane labels: Hspa1a/a1b Wild Type (Untreated, 38°C Control, 43°C Heat Shock); Hspa1a/a1b Knockout (Untreated, 38°C Control, 43°C Heat Shock); Hspa1a/a1b Protein standard]
                     RESULTS
  Hspa1a/a1b Expression in Hspa1a/a1b WT and
                    KO Embryos
  In the present study we used a transgenic mouse in
which both of the stress-inducible members of the
70-kDa HSP family (Hspa1a/a1b) were deleted by
homologous recombination. In preliminary experiments,
we performed western blot analyses to confirm the lack
of inducible HSPa1a/a1b protein. As shown in Figure 1,
we detected little if any expression of HSPa1a/a1b in KO
embryos compared to a robust induction of HSPa1a/a1b
in WT embryos. Untreated litters of KO and WT embryos
were evaluated for neural fold closure and somite
number at multiple gestational time points (8 days, 22 hours;
9 days, 6 hours; 9 days, 12 hours) to determine whether
there is a difference in developmental timing between
the two strains. Based upon a statistical analysis
(Mann-Whitney, unequal variance, two-tailed, unpaired) of
differences in somite number between KO and WT
embryos, only the difference at 9 days, 6 hours (WT, 20
somites; KO, 17 somites) was statistically significant at the
95% confidence level (see Supplemental Table S1).

 Hyperthermia-Induced NTDs in Hspa1a/a1b WT
                  and KO Embryos
  Hyperthermia- and control-treated fetuses of both
Hspa1a/a1b KO and WT litters were evaluated at
gestational day 15.5 to observe morphologic abnormalities
resulting from the treatments (Fig. 2). Among WT
fetuses, hyperthermia exposure at gestational day 8.5
induced exencephaly in 20% of hyperthermia-exposed
embryos (Fig. 2A; Table 1). Among KO fetuses,
hyperthermia exposure at gestational day 8.5 induced
exencephaly in 52% of hyperthermia-exposed embryos, a
significant 2.6-fold increase in the incidence of exencephaly
compared to WT fetuses (p < 0.001; Table 1). In addition,
6% of hyperthermia-treated KO embryos exhibited eye
defects (microphthalmia and anophthalmia), defects not
observed in WT embryos exposed to hyperthermia
(Fig. 2B; Table 1). Even in the absence of hyperthermia
exposure, KO embryos were more sensitive to
maldevelopment (2.5% exencephaly among KO fetuses vs. 0%
among WT fetuses) and in utero death (18% resorption
among KO embryos/fetuses vs. 6% among WT embryos/
fetuses; Table 1). There were significantly more affected
embryos (resorptions + defects) in hyperthermia-treated
KO litters than in control KO or hyperthermia-treated
WT litters (Table 1). These results show that embryos
lacking the inducible Hspa1a/a1b genes are more
sensitive to hyperthermia-induced neural tube and eye defects
than their counterparts that contain these genes.
In addition, embryos lacking the inducible Hspa1a/a1b
genes are more prone to maldevelopment (exencephaly,
resorption) even in the absence of a known teratogenic
exposure.

     [Figure 1B axis: Time After Exposure (hours); WT and KO at 1 hour and 2.5 hours]

     Figure 1. Western blot analysis of inducible HSPa1a/a1b protein
     expression in untreated, control, and heat shock-treated
     Hspa1a/a1b KO and WT embryos shows induction of HSPa1a/a1b in WT,
     but not KO, embryos. (A) Representative western blot image of
     HSPa1a/a1b protein expression in WT and KO samples as
     compared to a protein standard of inducible HSPa1a/a1b. (B) Chart
     summarizing the relative expression levels of HSPa1a/a1b protein
     at 3 time points after treatment. [Color figure can be viewed in the
     online issue, which is available at www.interscience.wiley.com.]
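
The fold increase quoted for exencephaly follows directly from the reported incidences. A minimal arithmetic check, using only the percentages stated in the text (the function name is illustrative):

```python
def fold_change(treated_pct, reference_pct):
    """Ratio of two incidences expressed in percent."""
    return treated_pct / reference_pct


# Exencephaly after hyperthermia: 52% of KO vs. 20% of WT (values from the text).
exencephaly_fold = fold_change(52, 20)
```

The ratio is about 2.6, matching the reported 2.6-fold increase. The same ratio cannot be formed for eye defects, because none were observed in the WT group.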

       Apoptosis in Hspa1a/a1b WT and KO Embryos

       To test the hypothesis that the increase in exencephaly
     among KO fetuses is related to an increase in
     hyperthermia-induced apoptosis during neural tube closure, we
     used the lysosome stain, Lysotracker Red, to assess the
     location and abundance of lysosomes as indicators of cell
     death. We focused our analysis on the head regions of
     the embryos, particularly along the anterior neural folds.
     There appeared to be a moderate increase in Lysotracker
     Red staining in the prosencephalon of the Hspa1a/a1b
     WT embryo in response to heat shock, whereas the
     increase was more widespread through
     much of the head region of the Hspa1a/a1b KO embryo
     (Fig. 3). Analysis of the signal per unit area showed
     significant increases in Lysotracker Red staining in response
     to hyperthermia in both Hspa1a/a1b WT (fourfold, p <
     0.01) and KO (sixfold, p < 0.001) embryos (Fig. 4). The
     signal levels after hyperthermia treatment were
     significantly higher (1.7-fold, p < 0.01) in the Hspa1a/a1b KO
     embryos than in the Hspa1a/a1b WT embryos (Fig. 4).

-------
736
BARRIER ET AL.

[Figure 2 panels: (A) Gestational day 15 C57Bl/6 Hspa1a/a1b WT mice after heat shock: unaffected, exencephaly. (B) Gestational day 15 C57Bl/6 Hspa1a/a1b KO mice after heat shock: unaffected, exencephaly, exencephaly and anophthalmia, microphthalmia, anophthalmia.]

Figure 2. Evaluation of C57Bl/6 Hspa1a/a1b-WT and KO mice on gestational day 15. Representative images of unaffected fetuses compared to
those with exencephaly and/or eye defects. [Color figure can be viewed in the online issue, which is available at www.interscience.wiley.com.]
  We also used a caspase-3 enzyme activity assay to
evaluate levels of apoptosis in embryo heads. Caspase-3
is an effector caspase, which is activated only during the
apoptotic process and is therefore a marker of apoptotic
activity. We observed a significant sevenfold (p < 0.05)
increase in caspase-3 activity in response to heat shock in
the Hspa1a/a1b-KO litters, but a nonsignificant threefold
increase in the Hspa1a/a1b-WT litters (Fig. 5). The
caspase-3 activity levels after heat shock treatment were
significantly higher (4.6-fold, p < 0.05) in the
Hspa1a/a1b-KO embryos compared with the Hspa1a/a1b-WT
embryos (Fig. 5).
                          DISCUSSION

          It is well known that a variety of teratogens induce
        NTDs in animals; however, less is known about proteins
        that play a role in protecting embryos from
        teratogen-induced NTDs. Thus, the first goal of the present study
        was to determine whether HSPa1a/a1b, a protein that is
        rapidly induced in response to various stresses including
        hyperthermia, protects early postimplantation mouse
        embryos from teratogen-induced malformations. To do
        this, we compared the levels of hyperthermia-induced
        defects in WT and HSPa1a/a1b null fetuses. Our results
        clearly show that embryos/fetuses lacking HSPa1a/a1b
        are significantly more sensitive to hyperthermia-induced

-------
                                                     Table 1
  Summary of evaluations of C57Bl/6 Hspa1a/a1b-WT and KO mice on gestational day 15. A one-way ANOVA
      analysis with Bonferroni's multiple comparison posttest was used to determine significance in pairwise
    comparisons of treatment/strain groups for percent malformations and resorptions (GraphPad Prism 5.01)

                                            Hspa1a/a1b WT               Hspa1a/a1b KO
                                          Control    Heat Shock       Control    Heat Shock
Total no. of litters                      8          11               5          11
Total no. of pups                         67         90               36         81
No. (%) with exencephaly                  0 (0%)     17 (20%)         1 (2.5%)   43 (52%)a
No. (%) with eye defects                  0 (0%)     0 (0%)           0 (0%)     5 (6%)
No. (%) resorbed                          4 (6%)     9 (9%)           9 (18%)    11 (12%)
No. (%) affected                          4 (6%)     26 (27.5%)       10 (20.5%) 56 (60%)b
  (all defects + resorptions)

  Significant comparisons with one-way ANOVA + Bonferroni's MCT.
  WT, wildtype; KO, knockout; MCT, multiple comparison posttest; Ctrl, control; HS, heat shock.
  a KO Ctrl vs. KO HS, p < 0.001; WT HS vs. KO HS, p < 0.001.
  b KO Ctrl vs. KO HS, p < 0.01; WT HS vs. KO HS, p < 0.001.

NTDs (exencephaly) and eye defects. Thus, one of the
functions of HSPa1a/a1b during neural tube closure is to
act as a suppressor of teratogenic effects, at least as it
applies to hyperthermia. Additional research is required
to determine whether HSPa1a/a1b can suppress defects
induced by other teratogens, particularly those teratogens
that do not induce the expression of HSPa1a/a1b.
  Despite the fact that embryos/fetuses lacking
HSPa1a/a1b are significantly more sensitive to
hyperthermia-induced NTDs (exencephaly) and eye defects,
approximately half of the embryos exposed to hyperthermia
[Figure 3 panel labels: Hspa1a/a1b Wildtype; Hspa1a/a1b Knockout]
Figure 3. Representative images of Lysotracker Red-stained
Hspa1a/a1b-WT and KO embryo head regions demonstrate
levels of apoptosis found along the neural tube following control
and heat shock treatment. For orientation, the general forebrain
(F), midbrain (M), and hindbrain (H) regions are indicated.
      exhibit apparently normal neural tube closure and eye
      development. Similarly, even in the presence of inducible
      HSPa1a/a1b, approximately 20% of the embryos exposed
      to hyperthermia fail to complete neural tube closure,
      resulting in exencephaly. Thus, although HSPa1a/a1b
      plays an important role in protecting embryos from
      hyperthermia-induced defects, this protection is not
      complete. This suggests that there must be other protective
      factors in early postimplantation embryos. One such
      factor may be Hsp25, which has been shown to be
      protective in cells (Landry et al., 1989; Lavoie et al., 1993); we
      have also shown that it is constitutively expressed in
      postimplantation rat embryos and is also significantly
      induced by exposure to hyperthermia (Mirkes et al.,
      1996). Given the potential to modulate the expression of
      heat shock proteins therapeutically (Herbst and Wanker,
      2007; Putics et al., 2008; Roesslein et al., 2008), it will be
      important to determine whether Hsp25 is also a suppressor
      of teratogenic effects. In addition, it is important to
      identify other suppressors of teratogenic effects. Recent
      work from our laboratory has shown that p53, a key
      cellular regulator that determines whether a cell will arrest
      or die in response to a variety of stresses, also suppresses
      hyperthermia-induced exencephaly in mouse fetuses
      (Hosako et al., 2009). Although far from conclusive, our
      finding that HSPa1a/a1b and p53, both of which are
      known to regulate the apoptotic pathway, function as
      suppressors of teratogenic effects suggests that apoptosis
      is causally linked to the failure of neural tube closure
      and the development of exencephaly in
      hyperthermia-treated embryos.

      [Figure 4 chart: Lysotracker Red (Overall Litter Averages); vertical axis 0.00 to 0.20; groups: Hspa1a/a1b Wildtype Control, Hspa1a/a1b Knockout Control, Hspa1a/a1b Wildtype Heat Shock, Hspa1a/a1b Knockout Heat Shock. Significant comparisons: WT Ctrl vs. WT HS, p < 0.01; KO Ctrl vs. KO HS, p < 0.001; WT HS vs. KO HS, p < 0.01.]

      Figure 4. Litter averages for Lysotracker Red signal in the head
      region of Control- and Heat Shock-treated Hspa1a/a1b KO and
      WT embryos. The increase in Lysotracker Red staining in
      response to heat shock was significant in both Hspa1a/a1b-KO
      and WT embryos. The level of staining after heat shock
      treatment was significantly higher in the Hspa1a/a1b-KO embryos
      than in the Hspa1a/a1b-WT embryos.

-------

[Figure 5 chart: Caspase-3 Enzyme Activity Assay (Overall Litter Averages); groups: Hspa1a/a1b Wildtype Control, Hspa1a/a1b Knockout Control, Hspa1a/a1b Wildtype Heat Shock, Hspa1a/a1b Knockout Heat Shock. Significant comparisons: KO Ctrl vs. KO HS, p < 0.05; WT HS vs. KO HS, p < 0.05.]

Figure 5. Litter average for caspase-3 enzyme specific activity in
control- and hyperthermia-treated Hspa1a/a1b KO and WT
embryo heads. The increase in caspase-3 enzyme activity in
response to heat shock was significantly greater in the
Hspa1a/a1b-KO embryos (sevenfold) than in the Hspa1a/a1b-WT
embryos (3.3-fold). The caspase-3 enzyme activity levels after
heat shock treatment were significantly higher in the
Hspa1a/a1b-KO embryos than in the Hspa1a/a1b-WT embryos.
  Although our evaluations of somite numbers in
untreated KO and WT embryos showed a trend of fewer
somites in the KO embryos at the 8 day, 22 hour and
9 day, 6 hour time points, only the latter difference is
statistically significant. This difference disappears by the
9 day, 12 hour time point. Trend lines fit to the somite
data for the KO (R-square = 0.9932) and WT (R-square =
0.9998) litters estimate an increase of one somite every
1.6 hours for KO and 1.8 hours for WT. Based on the
formulas for these trend lines, there would be an
approximately four-somite difference at the time of treatment on
gestational day 8.5 (~6.5 somites in KO and ~10.5 somites
in WT). Although we cannot rule out the possibility that
the nonsignificant trend of fewer somites in KO
compared with WT embryos around the time of exposure
to hyperthermia plays a role in the observed increase in
affected embryos in KO litters, it is unlikely that this is
the only contributing factor considering the increase in
maldevelopment observed in control-treated KO litters.
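
The somite estimate at the time of treatment can be roughly checked by extrapolating backward from the one significant time point. This sketch assumes each line passes exactly through the 9 day, 6 hour values (WT 20, KO 17), which the fitted trend lines need not do, so it only approximates the reported ~10.5 and ~6.5 somites; the function and variable names are illustrative.

```python
def somites_at(target_h, ref_h, ref_somites, hours_per_somite):
    """Linear back-extrapolation of somite number from a reference time."""
    return ref_somites - (ref_h - target_h) / hours_per_somite


REF_H = 9 * 24 + 6     # 9 days, 6 hours, expressed in hours
TREAT_H = 8 * 24 + 12  # gestational day 8.5, expressed in hours

# Rates from the text: one somite every 1.8 hours (WT) or 1.6 hours (KO).
wt_somites = somites_at(TREAT_H, REF_H, 20, 1.8)
ko_somites = somites_at(TREAT_H, REF_H, 17, 1.6)
gap = wt_somites - ko_somites
```

Under this simplification the estimates come out near 10 somites for WT and just under 6 for KO, a gap of roughly four somites, which agrees with the approximate difference stated in the text.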
  In addition to showing that embryos/fetuses lacking
HSPa1a/a1b are significantly more sensitive to
hyperthermia-induced NTDs (exencephaly) and eye defects,
our results show that embryos/fetuses lacking
HSPa1a/a1b are significantly more sensitive to the induction of
exencephaly and intrauterine demise in the absence of a
hyperthermia exposure. These results raise two
interesting questions: what is the cause of exencephaly and
resorptions in KO embryos/fetuses in the absence of
hyperthermia, and why are KO embryos more sensitive
than WT embryos in the absence of
hyperthermia-induced HSPa1a/a1b? The cause of exencephaly and
resorptions in KO embryos/fetuses could be either
developmental errors or an exposure that is teratogenic to KO
but not WT embryos/fetuses. Although we cannot rule
either of these causes in or out, the fact that our control
pregnant mice are restrained and immersed in a water
bath at 38°C suggests that this treatment might constitute
a teratogenic exposure in KO but not WT embryos.
Exencephaly was not observed in untreated litters from either
WT or KO breeding colonies. Perhaps a more interesting
question is why some KO embryos develop exencephaly
or die in utero without exposure to hyperthermia. It
might be that although Hspa1a/a1b are "inducible,"
these genes are also expressed constitutively at low levels
in the absence of any inducer. Although we do not have
definitive data suggesting that this is the case, the
western blot data presented in Figure 1 do show faint bands
at the position expected for HSPa1a/a1b. If HSPa1a/a1b
is indeed constitutively expressed in the absence of
Hspa1a/a1b inducers, this HSPa1a/a1b presumably plays
some role, one of which might be to inhibit abnormal
levels of programmed cell death. The absence of this
protective function in KO embryos then might sensitize
certain embryos to maldevelopment (exencephaly) or death
when confronted with developmental errors or a
low-level teratogenic exposure. Clearly, more research will be
needed to answer these intriguing questions.
  Given our initial finding that HSPa1a/a1b is a
suppressor of teratogenic effects, our second goal was to begin to
elucidate the molecular mechanisms underlying the
protective effects of HSPa1a/a1b. In this effort, we were
guided by our finding that hyperthermia-induced
exencephaly is accompanied by hyperthermia-induced
apoptosis in the neural folds and surrounding tissues (Fig. 3).
Although we do not know how the increase in apoptosis
is directly related to the observed exencephaly, the data
show an increase in cell death in tissues (neural folds)
known to play a key role in neural tube closure. The fact
that apoptosis is induced in other tissues as well does
not detract from our conclusion that apoptosis plays a
role in hyperthermia-induced exencephaly. This increase
in cell death might interfere with the normal closure of
the neural tube in this region and is consistent with the
morphology of the observed defect of exencephaly, in
which a section of the cranial region is absent. In
addition, we were guided by an extensive literature showing
that one of the important functions of HSPa1a/a1b is to
inhibit apoptosis at various points in the extrinsic and
intrinsic apoptotic pathways (Arya et al., 2007). Thus, we
hypothesized that the induction of HSPa1a/a1b after heat
shock helps to protect embryos from
hyperthermia-induced neural tube defects by reducing the level
of hyperthermia-induced apoptosis. Conversely, we
hypothesized that in the absence of the antiapoptotic
effects of HSPa1a/a1b, the increased exencephaly in null
embryos should be accompanied by an increase in
apoptosis. Results of the present studies show that embryos
lacking the inducible Hspa1a/a1b genes are significantly

-------
more sensitive to hyperthermia-induced NTDs and that
this increased sensitivity is associated with increased
levels of apoptosis. Although not formally proven, our
results from Hspa1a/a1b KO embryos support the
hypothesis that HSPa1a/a1b normally protects embryos
from hyperthermia-induced congenital defects, possibly
by reducing hyperthermia-induced apoptosis.
   Although our data clearly show that
hyperthermia-induced apoptosis is significantly elevated in neurulating
mouse embryos in the absence of HSPa1a/a1b, we do
not know the specific antiapoptotic mechanisms of
HSPa1a/a1b that are compromised and responsible for
the increased apoptosis. HSPa1a/a1b is known to block
apoptosis at a number of points in the apoptotic
pathway. For example, HSPa1a/a1b can inhibit apoptosis by
blocking translocation of Bax from the cytoplasm to the
mitochondria (Gotoh et al., 2004; Stankiewicz et al., 2005),
by inhibiting the formation of a functional apoptosome
complex through interaction with Apaf-1 (Beere et al.,
2000; Saleh et al., 2000), by blocking the activation of Bid
(Gabai et al., 2002), and by blocking the migration of AIF
from mitochondria to nuclei (Ravagnan et al., 2001). We
also know that hyperthermia both induces the rapid
expression of HSPa1a/a1b (Mirkes, 1987; Mirkes and
Doggett, 1992; Mirkes et al., 1994; Thayer and Mirkes,
1997; Mirkes et al., 1999) and activates the mitochondrial
apoptotic pathway in postimplantation rodent embryos
(Mirkes and Little, 2000; Mirkes et al., 2001; Little and
Mirkes, 2002; Little et al., 2003) by inducing the release of
cytochrome c from mitochondria and the subsequent
activation of both initiator (caspase-9) and effector caspases
(caspase-3, -6, and -7). Additional research is required to
determine the mechanisms by which HSPa1a/a1b
interacts with apoptotic pathways in rodent embryos and
thereby modulates the levels of programmed and/or
teratogen-induced apoptosis.


                ACKNOWLEDGMENTS
   The  authors   wish  to  dedicate   this  manuscript  to
Dr. Tom Shepard in honor of his numerous contributions
to the field of teratology. We also thank Sucheol Gil, Jen-
nifer Faske, Roula Mounemne, and Elizabeth M. Watson
for outstanding technical assistance.


                      REFERENCES
Arya R, Mallik M, Lakhotia SC. 2007. Heat shock genes - integrating cell
    survival and death. J Biosci 32:595-610.
Barati MT, Rane  MJ, Klein  JB,  McLeish KR. 2006. A proteomic screen
    identified  stress-induced chaperone proteins as  targets of Akt phos-
    phorylation in mesangial cells. J Proteome Res 5:1636-1646.
Beckmann  RP, Mizzen LE, Welch WJ. 1990. Interaction of Hsp  70 with
    newly  synthesized proteins: implications for protein folding and  as-
    sembly. Science 248:850-854.
Beere HM, Wolf BB, Cain K, et al. 2000. Heat-shock protein 70 inhibits
    apoptosis by preventing recruitment of procaspase-9 to the Apaf-1
    apoptosome. Nat Cell Biol 2:469-475.
Bennett GD, Mohl VK, Finnell RH. 1990. Embryonic and maternal heat
    shock responses to a teratogenic hyperthermic insult.  Reprod Toxicol
    4:113-119.
Creagh EM, Carmody RJ, Cotter TG. 2000. Heat shock protein 70 inhibits
    caspase-dependent and -independent apoptosis in Jurkat T cells. Exp
    Cell Res 257:58-66.
Didelot C,  Schmitt E, Brunet M, et al. 2006. Heat shock proteins:  endoge-
    nous modulators of apoptotic cell  death. Handb  Exp  Pharmacol
    (172):171-198.
Edwards MJ. 1986. Hyperthermia as a teratogen: a review of experimental
    studies and  their clinical significance. Teratog Carcinog Mutagen 6:
    563-582.
       Finnell RH, Gelineau-van Waes J, Bennett GD, et al. 2000. Genetic basis of
          susceptibility to environmentally induced neural tube defects. Ann N
          Y Acad Sci 919:261-277.
       Gabai VL, Mabuchi K, Mosser DD, Sherman MY. 2002. Hsp72 and stress
          kinase c-jun N-terminal kinase regulate the bid-dependent pathway in
          tumor necrosis factor-induced apoptosis. Mol Cell Biol 22:3415-3424.
       Gabai VL, Yaglom JA, Volloch V, et al. 2000. Hsp72-mediated suppression
          of c-Jun N-terminal kinase is implicated in development of tolerance
          to caspase-independent cell death. Mol Cell Biol 20:6826-6836.
       Gabai VL, Zamulaeva IV, Mosin AF, et al. 1995. Resistance  of  Ehrlich
          tumor cells to apoptosis can be due to accumulation of heat shock
          proteins. FEBS Lett 375:21-26.
       Georgopoulos C, Welch WJ. 1993. Role of the major heat shock proteins
          as molecular chaperones. Annu Rev Cell Biol 9:601-634.
       Gotoh T, Terada K, Oyadomari S, Mori M. 2004.  hsp70-DnaJ chaperone
          pair prevents nitric  oxide- and CHOP-induced apoptosis by inhibi-
          ting  translocation of  Bax to mitochondria.  Cell Death  Differ 11:
          390-402.
       Graham JM Jr, Edwards  MJ, Edwards MJ. 1998. Teratogen update: gesta-
          tional effects of maternal hyperthermia due to febrile illnesses  and re-
          sultant patterns of defects in humans. Teratology 58:209-221.
       Herbst M,  Wanker  EE. 2007. Small  molecule inducers  of heat-shock
          response reduce polyQ-mediated huntingtin aggregation. A possible
          therapeutic strategy. Neurodegener Dis 4:254-260.
       Higo  H, Lee JY, Satow Y, Higo K.  1989. Elevated expression of proto-
          oncogenes accompany enhanced induction of  heat-shock genes after
          exposure  of rat embryos in  utero to ionizing irradiation. Teratog
          Carcinog Mutagen 9:191-198.
       Honda K, Hatayama T, Takahashi K, Yukioka M.  1991. Heat shock pro-
          teins in human and mouse embryonic cells after exposure  to heat
          shock or teratogenic  agents. Teratog Carcinog Mutagen 11:235-244.
       Hosako H,  Francisco LE, Martin GS, Mirkes PE. 2009. The roles of p53
          and p21 in normal development and hyperthermia-induced  malfor-
          mations. Birth Defects Res B Dev Reprod Toxicol 86:40-47.
       Hunt  CR, Dix  DJ, Sharma  GG,  et  al. 2004. Genomic  instability and
          enhanced radiosensitivity in  Hsp70.1- and Hsp70.3-deficient mice.
          Mol Cell Biol 24:899-911.
       Jaattela M, Wissing D, Bauer PA, Li GC. 1992. Major  heat shock  protein
          hsp70 protects  tumor  cells from tumor necrosis  factor cytotoxicity.
          Embo J 11:3507-3512.
       Jiang B, Xiao W, Shi Y, et al. 2005. Heat shock pretreatment inhibited the
          release of Smac/DIABLO from mitochondria and apoptosis induced
          by hydrogen peroxide in cardiomyocytes and  C2C12 myogenic cells.
          Cell Stress Chaperones 10:252-262.
       Kampinga HH, Brunsting JF, Stege GJ, et al. 1995.  Thermal protein dena-
          turation and protein aggregation in cells made  thermotolerant by var-
          ious chemicals:  role of heat shock proteins.  Exp Cell Res 219:536-546.
       Kyriakis JM, Avruch J. 1996. Protein  kinase cascades  activated by stress
          and inflammatory cytokines. Bioessays 18:567-577.
       Lammer EJ, Sever LE, Oakley GP, Jr. 1987. Teratogen update: valproic
          acid. Teratology 35:465-473.
       Landry J, Chretien P, Lambert H, et al. 1989. Heat shock resistance con-
          ferred by expression of the human HSP27 gene in rodent cells. J Cell
          Biol 109:7-15.
       Lavoie JN,  Hickey  E, Weber  LA, Landry  J.  1993. Modulation of  actin
          microfilament dynamics and fluid phase pinocytosis by phosphoryla-
          tion of heat shock protein  27. J Biol Chem 268:24210-24214.
       Lee JS, Lee JJ, Seo JS. 2005. HSP70 deficiency results in activation of c-Jun
          N-terminal  Kinase,  extracellular signal-regulated kinase,  and cas-
          pase-3  in hyperosmolarity-induced apoptosis.  J Biol Chem 280:
          6634-6641.
       Li CY, Lee JS, Ko YG, et al. 2000. Heat shock protein 70 inhibits apoptosis
          downstream of cytochrome c release and upstream of caspase-3 acti-
          vation. J Biol Chem 275:25665-25671.
       Li GC, Nussenzweig A.  1996. Thermotolerance and heat shock proteins:
          possible involvement of Ku autoantigen in regulating Hsp70  expres-
          sion. EXS 77:425-449.
       Lindquist S, Craig EA. 1988. The heat-shock proteins. Annu Rev Genet
          22:631-677.
       Little  SA, Kim  WK, Mirkes PE. 2003. Teratogen-induced  activation  of
          caspase-6 and caspase-7 in early postimplantation mouse embryos.
          Cell Biol Toxicol 19:215-226.
       Little  SA, Mirkes PE. 2002. Teratogen-induced activation of caspase-9 and
          the mitochondrial apoptotic pathway in early postimplantation
          mouse embryos. Toxicol Appl Pharmacol 181:142-151.
       Matsumori  Y, Northington FJ, Hong SM, et al. 2006.  Reduction of cas-
          pase-8  and -9  cleavage  is  associated with  increased c-FLIP and
          increased binding of Apaf-1 and Hsp70 after neonatal hypoxic/ische-
          mic injury in mice overexpressing Hsp70. Stroke 37:507-512.
       Mayer MP, Bukau B. 2005. Hsp70 chaperones: cellular functions and mo-
          lecular mechanism. Cell Mol Life Sci 62:670-684.

-------
Meriin AB, Yaglom JA, Gabai VL, et al. 1999. Protein-damaging stresses
    activate c-Jun N-terminal kinase via inhibition of its dephosphorylation:
    a novel pathway controlled by HSP72. Mol Cell Biol 19:2547-2555.
Mirkes PE. 1985.  Effects of acute exposures to elevated temperatures on
    rat embryo growth and development in vitro. Teratology 32:259-266.
Mirkes PE. 1987. Hyperthermia-induced heat shock response and thermo-
    tolerance in postimplantation rat embryos. Dev Biol 119:115-122.
Mirkes PE, Cornel LM, Wilson KL, Dilmann WH. 1999. Heat shock pro-
    tein 70 (Hsp70) protects postimplantation murine embryos from the
    embryolethal effects of hyperthermia. Dev Dyn 214:159-170.
Mirkes PE, Doggett B. 1992. Accumulation of heat shock protein 72 (hsp
    72) in postimplantation rat embryos after exposure to various periods
    of hyperthermia (40 degrees -43 degrees C) in  vitro:  evidence  that
    heat shock protein 72 is a biomarker of heat-induced embryotoxicity.
    Teratology 46:301-309.
Mirkes PE, Doggett B, Cornel L. 1994. Induction of a  heat shock response
    (HSP 72) in rat embryos exposed to selected chemical teratogens. Ter-
    atology 49:135-142.
Mirkes PE, Little SA. 1998. Teratogen-induced cell death in  postimplanta-
    tion mouse embryos:  differential tissue sensitivity and hallmarks of
    apoptosis. Cell Death  Differ 5:592-600.
Mirkes PE, Little SA. 2000.  Cytochrome c release from mitochondria of
    early postimplantation murine embryos exposed  to 4-hydroperoxycy-
    clophosphamide, heat shock, and staurosporine. Toxicol Appl Phar-
    macol 162:197-206.
Mirkes PE, Little SA, Cornel L, et al. 1996. Induction of heat shock protein 27
    in rat embryos exposed to hyperthermia. Mol Reprod Dev 45:276-284.
Mirkes PE, Little SA, Umpierre CC. 2001. Co-localization  of active  cas-
    pase-3 and DNA fragmentation (TUNEL) in normal and hyperther-
    mia-induced abnormal mouse development. Teratology 63:134-143.
Mirkes PE, Wilson KL, Cornel LM. 2000. Teratogen-induced activation of
    ERK, JNK, and p38 MAP kinases in early  postimplantation murine
    embryos. Teratology 62:14-25.
Moretti  ME,  Bar-Oz B, Fried S, Koren G. 2005.  Maternal  hyperthermia
    and the risk  for neural tube defects in offspring: systematic review
    and meta-analysis. Epidemiology 16:216-219.
Moseley PL. 1996. Heat shock proteins:  a broader perspective. J Lab Clin
    Med 128:233-234.
Mosser DD, Caron AW, Bourget L,  et al. 1997. Role of the human heat
    shock protein hsp70  in protection  against stress-induced  apoptosis.
    Mol Cell Biol 17:5317-5327.
Mosser DD, Caron AW, Bourget L, et al. 2000. The chaperone function of
    hsp70  is required for protection against stress-induced  apoptosis.
    Mol Cell Biol 20:7146-7159.
Nagata K.  1996.  Regulation of thermotolerance  and ischemic tolerance.
    EXS 77:467-481.
Niu P, Liu L, Gong Z, et al. 2006. Overexpressed heat shock protein 70
    protects cells against DNA damage caused by ultraviolet C in a
    dose-dependent manner. Cell Stress Chaperones 11:162-169.
Nollen EA, Brunsting JF, Roelofsen H, et al. 1999. In vivo chaperone activ-
    ity of heat shock protein 70 and thermotolerance. Mol Cell Biol 19:
    2069-2079.
          Padmanabhan R. 2006. Etiology, pathogenesis and prevention of neural
              tube defects. Congenit Anom (Kyoto) 46:55-67.
          Park HS, Lee JS, Huh SH, et al. 2001. Hsp72 functions as a natural
              inhibitory protein of c-Jun N-terminal kinase. Embo J 20:446-456.
          Putics  A, Vodros  D, Malavolta M, et al. 2008. Zinc  supplementation
              boosts the  stress response in the elderly: Hsp70 status is linked to
              zinc  availability in peripheral  lymphocytes.  Exp  Gerontol  43:
              452-461.
          Rafiee  P, Theriot ME, Nelson VM,  et al. 2006. Human esophageal micro-
              vascular endothelial  cells  respond to acidic pH stress by PI3K/AKT
              and p38 MAPK-regulated induction of Hsp70 and Hsp27. Am J
              Physiol Cell Physiol 291:C931-945.
          Ravagnan L, Gurbuxani  S, Susin SA, et al. 2001. Heat-shock protein 70
              antagonizes apoptosis-inducing factor. Nat Cell Biol 3:839-843.
          Roesslein  M, Schibilsky  D, Muller L,  et  al. 2008.  Thiopental  protects
              human T lymphocytes from apoptosis in vitro via the expression of
              heat shock  protein 70. J Pharmacol Exp Ther 325:217-225.
          Rosa FW.  1991.  Spina bifida in infants of women treated with carbamaze-
              pine during pregnancy. N Engl J Med 324:674-677.
          Saleh A, Srinivasula SM,  Balkir L,  et al. 2000. Negative regulation of the
              Apaf-1 apoptosome by Hsp70. Nat Cell Biol 2:476-483.
          Shi Y,  Thomas JO. 1992.  The transport of proteins  into the  nucleus
              requires the 70-kilodalton heat shock protein or its cytosolic  cognate.
              Mol Cell Biol 12:2186-2192.
          Stankiewicz AR, Lachapelle G, Foo CP, et al. 2005. Hsp70 inhibits heat-in-
              duced apoptosis upstream of mitochondria by preventing Bax  trans-
              location. J Biol Chem 280:38729-38739.
          Steel R, Doherty JP, Buzzard K, et al.  2004. Hsp72 inhibits apoptosis
              upstream of the mitochondria and not through  interactions with
              Apaf-1. J Biol Chem 279:51490-51499.
          Stege GJ,  Kampinga HH, Konings AW. 1995. Heat-induced intranuclear
              protein aggregation and thermal radiosensitization.  Int J Radiat Biol
              67:203-209.
          Thayer JM, Mirkes  PE. 1997.  Induction of Hsp72 and transient nuclear
              localization of Hsp73 and Hsp72 correlate with the  acquisition and
              loss of thermotolerance in postimplantation rat embryos. Dev Dyn
              208:227-243.
          Tsuchiya D, Hong S, Matsumori Y, et al. 2003. Overexpression of rat heat
              shock protein 70 is associated  with  reduction of early mitochondrial
              cytochrome C release and subsequent DNA fragmentation after per-
              manent focal ischemia. J Cereb Blood Flow Metab 23:718-727.
          Verheij M, Bose R, Lin XH, et al. 1996. Requirement for ceramide-initiated
              SAPK/JNK signalling in stress-induced apoptosis. Nature 380:75-79.
          Walsh  DA, Klein NW, Hightower LE, Edwards MJ. 1987.  Heat shock and
              thermotolerance during early  rat embryo  development. Teratology
              36:181-191.
          Welch  WJ. 1987. The mammalian heat shock (or stress) response: a cellu-
              lar defense mechanism. Adv Exp Med Biol 225:287-304.
          Yaglom JA, Gabai VL, Meriin AB, et al.  1999. The function of HSP72 in
              suppression of  c-Jun N-terminal kinase activation can be dissociated
              from  its role in prevention of protein  damage. J Biol Chem 274:
              20223-20228.
Birth Defects Research (Part A) 85:732-740 (2009)

-------
                                                      Genes and Immunity (2008), 1-8
                                                      © 2008 Macmillan Publishers Limited All rights reserved 1466-4879/08 $32.00
                                                      www.nature.com/gene

ORIGINAL ARTICLE

Integrated analysis of genetic and proteomic data identifies biomarkers associated with adverse events following smallpox vaccination


DM Reif1, AA Motsinger-Reif2, BA McKinney3, MT Rock4, JE Crowe Jr4,5,6 and JH Moore7,8
1National Center for Computational Toxicology, US Environmental Protection Agency, Research Triangle Park, NC, USA; 2Department
of Statistics, Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA; 3Department of Genetics, University
of Alabama School of Medicine, Birmingham, AL, USA; 4Department of Pediatrics, Vanderbilt University Medical Center, Nashville,
TN, USA; 5Department of Microbiology and Immunology, Vanderbilt University Medical Center, Nashville, TN, USA; 6Program in
Vaccine Sciences, Vanderbilt University Medical Center, Nashville, TN, USA; 7Department of Genetics, Dartmouth Medical School,
Lebanon, NH, USA and 8Computational Genetics Laboratory, Dartmouth Medical School, Lebanon, NH, USA
Complex clinical outcomes, such as adverse reaction to vaccination, arise from the concerted interactions among the myriad
components of a biological system. Therefore, comprehensive etiological models can be developed only through the integrated
study of multiple types of experimental data. In this study, we apply this paradigm to high-dimensional genetic and proteomic
data collected to elucidate the mechanisms underlying the development of adverse events (AEs) in patients after smallpox
vaccination. As vaccination was successful in all of the patients under study, the AE outcomes reported likely represent the
result of interactions among immune system components that result in excessive  or prolonged immune stimulation. In this
study, we examined 1442 genetic variables (single nucleotide polymorphisms) and 108 proteomic variables (serum cytokine
concentrations) to model AE risk. To accomplish this daunting analytical task, we employed the Random Forests (RF) method
to filter the most important attributes, then we used the selected attributes to build a final decision tree model. This strategy is
well suited to integrated analysis, as relevant attributes may be selected from categorical or continuous data. Importantly, RF is
a natural approach for studying the type of gene-gene, gene-protein and protein-protein interactions we hypothesize to be
involved in the development of clinical AEs. RF importance scores for particular attributes take interactions into account, and
there may be interactions  across data types.  Combining information from  previous studies on AEs related to smallpox
vaccination with the genetic and proteomic attributes identified by RF, we built a comprehensive model of AE development that
includes the cytokines intercellular adhesion molecule-1 (ICAM-1 or CD54),  interleukin-10 (IL-10), and colony stimulating
factor-3 (CSF-3 or G-CSF) and a genetic polymorphism in the cytokine gene interleukin-4 (IL4). The biological factors included
in the model support our hypothesized mechanism for the development of AEs involving prolonged stimulation of inflammatory
pathways and an imbalance of normal tissue damage repair pathways. This study shows the utility of RF for such analytical
tasks, while both enhancing and reinforcing our working model of AE development after smallpox vaccination.
Genes and Immunity advance online publication, 16 October 2008; doi:10.1038/gene.2008.80

Keywords: smallpox; Random Forests; integrated analysis; genetic; proteomic; interactions
Introduction

Live attenuated vaccinia virus, delivered intradermally,
is  the vaccine given to immunize individuals against
smallpox.  Although vaccination of healthy adults with
vaccinia virus  induces  a protective response  in the
majority of  individuals immunized, vaccinia virus is
reactogenic in a significant number of  vaccinees.1 The
most common  adverse events (AEs) after vaccination
include fever, lymphadenopathy (swelling and tender-
ness of lymph nodes) and a generalized acneiform rash.
Correspondence: Dr DM Reif, National Center for Computational
Toxicology, US Environmental Protection Agency, D343-03, 109 TW
Alexander Drive, Research Triangle Park, NC 27711, USA.
E-mail: reif.david@epa.gov
Received 11 June 2008; revised and accepted 27 August 2008
Collectively, these clinical reactions suggest that indivi-
duals suffering AEs have immune responses beyond the
necessary magnitude, or sustain the immune response
longer than necessary.
  To elucidate the complex pathophysiology underlying
unwanted responses  to vaccination, we gathered high-
dimensional genetic and proteomic data in a cohort of
subjects in which  a  portion experienced  an  AE  after
primary immunization with Aventis Pasteur smallpox
vaccine.  Through  a  comprehensive  examination  of
systemic  (serum)  cytokine/chemokine  changes com-
bined with the characterization of polymorphisms in a
large panel of candidate genes, we sought to provide a
thorough portrayal of the complex genetic and proteomic
interplay behind the development of AEs. Knowledge
of  how  risk factors  in  a   subject's  genetic  back-
ground  interact  with dynamically changing  levels  of

-------
                                  Integrated analysis of smallpox vaccination data
                                                      DM Reif et al
         immunological proteins could shed light on important
         therapeutic  targets  or  pathways  to  direct   vaccine
         modification and pre-vaccination screening procedures.
           It is increasingly  gaining acceptance  that  complex
         clinical outcomes, such as  adverse reaction to  vaccina-
         tion, arise from the concerted interactions among  the
         myriad components  of a biological system.2 Complica-
         ting genetic factors,  such as multiple contributing loci
         and/or susceptibility alleles, incomplete penetrance and
         epistasis, are further convoluted by proteomic, metabo-
         lomic and environmental effects.3 If such a multiscale
         system is to be understood, then interactions among its
         many attributes must be considered.4 Although there is
         considerable intuitive  appeal to  the incorporation  of
         multiple types of biological data, simultaneous analysis
         of information on different scales of measurement (that
         is,  continuous proteomic data and categorical genetic
         data) creates additional analytical challenges. Therefore,
         appropriate  computational  analysis methods  must
         traverse large  numbers of input variables  and handle
         diverse data types. For this study, we employed a two-
         stage analytical strategy. The first step was to filter a list
         of  over 1500 genetic and  proteomic attributes, taking
         interactions within and across data types into  account,
         down to  an analytically  tractable  subset of candidates.
         The second  step involved  careful statistical and biolo-
         gical exploration of  the  filtered  subset of candidate
         attributes, resulting in a final model of AE development.
           For the first (filter) step, we implemented a random
         forest™  (RF)  approach.5  RF is  a machine  learning
         technique that builds a forest of classification  trees by
         sampling, with replacement, from the data and  selecting
         the attribute at each tree node from a random subset of
         all attributes. The RF method offers many advantages for
         the analysis of  diverse biological data. First, it can handle
         a  large number of  input attributes,  both discrete (for
         example, single nucleotide polymorphisms, or SNPs)
         and  continuous (for example,  microarray  expression
         levels or  data  from  high-throughput proteomic techno-
         logies). Second, RF estimates the relative importance of
         attributes in discriminating between classes (in this case,
         AE status), thus providing a metric for feature selection.
         Third, RF produces a highly accurate classifier with an
         internal unbiased estimate of generalizability during the
         forest-building  process.  Fourth,  RF is  robust in  the
         presence  of etiological heterogeneity and missing data.6
         Finally, learning is fast and computation  time is modest
         even for very large data sets.7
           In the second (modeling) step, we took advantage of
         the tractable number of attributes identified by the RF
         filter to explore thoroughly the statistical and biological
         relationships among the attributes and  AE outcomes.
         Decision  trees were  used  to  derive  a  descriptive,
         biologically interpretable model of the functional inter-
         actions among the  attributes associated with  systemic
         AEs. Our final model justified our multiscale  analysis
         strategy, in that it included  the cytokines intercellular
         adhesion molecule-1 (ICAM-1 or CD54),  interleukin-10
         (IL-10) and colony stimulating factor-3 (CSF-3 or G-CSF),
         as well as an SNP in interleukin-4 (IL4). Evaluating our
         final model from  an  immunological perspective,  we
         conclude that AEs in response to  smallpox vaccination
         result from the hyperactivation of inflammatory path-
         ways, leading  to excess recruitment and stimulation of
         monocytes in peripheral tissues. This model is consistent
      with work demonstrating overstimulation of inflamma-
      tory and tissue damage repair pathways  developed in
      earlier studies of AEs after smallpox vaccination.8-11
      Materials and methods
      Study subjects
      Vaccines, study  subjects  and  clinical vaccine  study
      design have been described in detail.9 Briefly, 148 (116
      with  recorded AE  information)  healthy adults were
      enrolled at the Vanderbilt University Medical Center as
      part of a multicenter study of primary immunization
      against smallpox  using the Aventis Pasteur smallpox
      vaccine at National Institutes of Health (NIH) Vaccine
      and Treatment Evaluation Units. NIH-DMID Protocol 02-
      054 was implemented. Volunteers  were eligible if they
      had no smallpox vaccination scar, no history of vaccinia
      virus immunization, normal renal and hepatic serum
      chemistry values,  no contraindications  against immuni-
      zation (pregnancy, immunosuppression or eczema) and
      negative  serum  test  results for  hepatitis  B surface
      antigen,  hepatitis C virus antibody, rapid plasma reagin
      and HIV-1 ELISA. There were a  total of 61 subjects for
      whom both genetic and proteomic data were gathered.
      Individuals were asked to self-identify race; white (60)
      and Asian (1) were the only categories  identified in this
      cohort. To facilitate comparison with earlier studies, and
      because there  was no statistical difference in age, gender
      or race according to AE status (data not shown), the data
      were not adjusted for these covariates.

      Clinical assessments
      Details of the clinical  assessments  have been described
      earlier.9  For   all  study subjects,  a team  of trained
      physicians and nurse providers  examined the medical
      history  and   clinical  symptoms to ensure consistent
      clinical  assessment. Subjects were examined  on five
      visits within the first month after vaccination and were
      assessed for occurrence of an AE. Collection of serum for
      cytokine measurements occurred at the evaluation just
      before vaccination  (baseline) and at  the  evaluation
      between days 5 and  7 post-vaccination  (acute phase).
      Although all  AEs were noted, only systemic AEs were
      considered in this study,  as we expected these to  be
      associated more strongly with serum cytokine expression
      than  would   an  AE  displayed only at the site  of
      inoculation. Systemic  AEs  included fever, generalized
      rash  and  lymphadenopathy.  Specifically,  fever was
      defined as an  oral temperature of > 38.3 °C. Generalized
      rash was defined  as  skin eruptions on non-contiguous
      areas in reference to  the  site of vaccination.  Detailed
      descriptions of the acneiform rashes considered in this
      study  have been  described.12 Lymphadenopathy was
      defined as enlargement or tenderness of regional lymph
      nodes  attributed to vaccination.  For subjects on which
      both  genetic  and  proteomic data were  gathered,  16
      subjects  experienced a systemic AE and 45 subjects did
      not experience an  AE.

      Identification of genetic polymorphisms
      The custom SNP panel used in this study was based  on
      the NCI SNP500 Cancer project13 and has been described
      earlier.14 The  majority  of SNPs included  on the panel
      target soluble factor mediators and signaling pathways,

-------
many  of  which  have  immunological  significance.
Genotyping  for SNPs  was  performed  using  DNA
amplified directly from  Epstein-Barr virus-transformed
B  cells  generated from  peripheral blood  samples col-
lected from each subject. Genotyping was performed at
the Core Genotyping Facility of  the  National Cancer
Institute (NCI, Gaithersburg, MD, USA). Genotypes were
generated   using  the  Illumina   GoldenGate   assay
technology. Of  the  1536  SNPs  assayed, a  total  of
1442 genotypes passed standard quality control filters.
In Reif et  al.,15 the  complete list  of SNPs analyzed is
available.

Quantification  of serum cytokine levels
Serum samples were obtained just prior to vaccination
(baseline)  and  6-9  days  after  vaccination (acute),  as
described  earlier in  detail.9 Serum samples were col-
lected in 5 ml Vacutainer serum separator tubes (Becton
Dickinson, San Jose, CA, USA) and were centrifuged at
700 x g for 10 min. The serum then was collected,
aliquoted  into   cryovials   (Sarstedt Inc.,  Numbrecht,
Germany)  and stored at -80 °C until assayed. Cytokine
concentrations  were  determined  using  rolling  circle
amplification technology-enhanced custom dual antibody
sandwich immunoassay arrays, as described.16-19
The expression  levels of  108  protein  analytes were
measured in 100 µl serum aliquots from the patient
samples. Glass  slides held 12 replicate spots  of mono-
clonal  capture  antibodies specific  for each  analyte.
Duplicate samples of sera were incubated for 2 h, washed
and then incubated with  secondary biotinylated poly-
clonal antibodies. The 'rolling circle' method  was then
used to amplify signals.17 Quality control measures were
used to optimize antibody pairs, minimize array-to-array
variation and standardize procedures of chip manufacturing.17
A Tecan LS200 unit was used to scan arrays and
customized software was  used  to  determine  mean
fluorescence intensities.  In addition, 15 serial dilutions
of recombinant analytes at known concentrations (stu-
died in parallel on each slide) were used to develop best-
fit equations for each analyte, and the  upper and lower
limits of quantitation were defined. Changes  in serum
cytokine  concentrations  were  calculated  as  percent
change from the subject's baseline value because of the
broad individual range of systemic cytokine expression
before and after immunization.
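
The normalization described above can be sketched as follows. This is a minimal illustration that assumes the conventional definition of percent change, 100 x (acute - baseline) / baseline; the function name, subject identifiers and concentration values are hypothetical and are not taken from the study.

```python
def percent_change(baseline, acute):
    """Percent change of the acute-phase value relative to the subject's own baseline."""
    if baseline == 0:
        raise ValueError("baseline concentration must be non-zero")
    return 100.0 * (acute - baseline) / baseline


# Hypothetical (baseline, acute) concentrations in arbitrary units.
subjects = {"S1": (50.0, 75.0), "S2": (400.0, 500.0)}

# Each subject is scaled by his or her own baseline, so subjects with very
# different absolute cytokine levels become comparable.
changes = {sid: percent_change(b, a) for sid, (b, a) in subjects.items()}
```

Although S2 shows the larger absolute increase, S1 shows the larger relative increase, which is the quantity used as the proteomic attribute.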

Random Forests
An  RF  is a  collection  of  decision   tree  classifiers,
where each tree in the forest has  been trained using a
bootstrap  sample of  individuals  from  the  data, and
each split  attribute  in the tree  is  chosen  from among
a  random  subset  of  attributes. Classification   of
individuals is on the basis of aggregate voting over  all
trees in the forest.
  Each tree in the forest was constructed as follows from
data having N=61  individuals  and M = 1552  explana-
tory (genetic plus proteomic) attributes:

(1)  The method chose a training sample by selecting
    N individuals, with replacement, from the entire data
    set.
(2)  At each node in the tree, m attributes were selected
    randomly from the entire set of M attributes in the
    data. The absolute magnitude of m was a function of
    the number of attributes in the data set and remained
    constant throughout the forest-building process.
(3)  The method chose the best split at the current node
    from among the subset of m attributes selected above.
(4)  We iterated the second and third steps until the tree
    was fully grown (no pruning).
Repetition of this algorithm yielded a forest of trees, each
of which had been  trained  on bootstrap  samples  of
individuals (see Figure 1). Thus, for a given  tree, certain
individuals  were left  out during  training. Prediction
error and attribute importance were estimated  from
these 'out-of-bag' individuals.
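
The two sampling operations in this procedure (bootstrap selection of individuals and random selection of m candidate attributes at a node) can be sketched in a few lines. This is an illustrative sketch only, not the randomForest implementation: the function names are hypothetical, and the value of m is an assumption (the square root of M, a common default for classification), as the text does not report the value used.

```python
import random


def bootstrap_split(n_individuals, rng):
    """Draw N indices with replacement; indices never drawn are out-of-bag."""
    in_bag = [rng.randrange(n_individuals) for _ in range(n_individuals)]
    out_of_bag = sorted(set(range(n_individuals)) - set(in_bag))
    return in_bag, out_of_bag


def candidate_attributes(n_attributes, m, rng):
    """Pick m distinct attribute indices to consider at a single tree node."""
    return rng.sample(range(n_attributes), m)


rng = random.Random(0)
N, M = 61, 1552          # data dimensions given in the text
m = int(M ** 0.5)        # ASSUMPTION: sqrt(M); the source does not state m

in_bag, out_of_bag = bootstrap_split(N, rng)     # step 1
node_attrs = candidate_attributes(M, m, rng)     # step 2, repeated at every node
```

On average roughly a third of individuals are left out of any one bootstrap sample, which is what makes the out-of-bag estimates possible.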
  The out-of-bag (unseen) individuals  were  used  to
estimate the  importance of particular attributes accord-
ing to the following logic: If randomly permuting values
of a  particular attribute did not affect  the predictive
ability of trees on out-of-bag samples, then that attribute
was  assigned a low  importance  score. If,  however,
randomly permuting the values of a particular attribute
drastically impaired the  ability  of trees to  correctly
predict the  class  of  out-of-bag  samples,  then the
importance score of that attribute was high.  By running
out-of-bag samples down entire trees during the permu-
tation procedure, attribute interactions were taken into
account when calculating importance scores,  as class was
assigned in the context of other attribute nodes  in the
tree.
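
The permutation logic above can be illustrated with a toy example. The sketch below assumes a single hand-written "tree" that splits only on attribute 0; the data, the tree and the function names are hypothetical. An actual RF computes the quantity tree by tree on each tree's own out-of-bag individuals and aggregates over the forest.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of samples whose predicted class matches the true class."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(predict, rows, labels, attr, rng, repeats=20):
    """Mean drop in accuracy when the values of column `attr` are shuffled."""
    base = accuracy(predict, rows, labels)
    drops = []
    for _ in range(repeats):
        column = [r[attr] for r in rows]
        rng.shuffle(column)
        permuted = [r[:attr] + [v] + r[attr + 1:] for r, v in zip(rows, column)]
        drops.append(base - accuracy(predict, permuted, labels))
    return sum(drops) / repeats


# Hypothetical out-of-bag samples: attribute 0 determines class, attribute 1 is noise.
oob_rows = [[i, (i * 7) % 10] for i in range(10)]
oob_labels = [int(i >= 5) for i in range(10)]


def toy_tree(row):
    """Stand-in for a trained tree that splits only on attribute 0."""
    return int(row[0] >= 5)


rng = random.Random(1)
importance = {a: permutation_importance(toy_tree, oob_rows, oob_labels, a, rng)
              for a in (0, 1)}
```

Shuffling the attribute the tree relies on degrades prediction, so it receives a positive score; shuffling the unused attribute changes nothing, so its score is zero.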
  The recursive partitioning trees  comprising an  RF
provide an explicit representation of attribute interaction
that is readily applicable to  the study  of  interactions
among multiple data types.20,21 These models may
uncover  interactions  among  genes, proteins and/or
environmental factors that do not exhibit strong marginal
effects. In addition,  tree methods are suited to dealing
with certain types of genetic heterogeneity, as splits near
the root node define separate model subsets in  the data.
RFs capitalize on the solid benefits of decision trees and
[Figure 1 diagram labels: Entire data set (M attributes); Bootstrap sample; Out-of-bag individuals; m attributes; Step 1; Step 2; Step 3]
Figure 1  Construction of individual trees using the Random Forest
method from a full data set of N individuals and M attributes.
Proceeding from the root node, individual subjects were classified
into terminal AE  status leaves according to the value  of that
individual's genetic or proteomic attribute at each node. The steps
correspond to those described in the text.

-------
         have  demonstrated excellent  predictive  performance
         when the forest is diverse (that is, trees are not highly
         correlated with each other) and composed of individually
         strong classifier trees.5,22 The RF method is a natural
         approach for  studying  gene-gene,  gene-protein  or
         protein-protein interactions because importance scores
         for  particular attributes take interactions  into account
         without demanding a pre-specified model.23

         Decision trees
         To  represent  the  interactions  among genetic and/or
         proteomic attributes associated with AEs, decision trees
         were chosen to build the  final  model because of their
         ready interpretability and explicit modeling of attribute
         interactions. The tree classified  individual subjects into
         AE  groups  by proceeding down a  dichotomous tree,
         where the genetic or proteomic attribute at each node (or
         split)  was  selected for  the gain  in  information  it
         provided. Gain in information was attributed  when
         knowledge about the variation in this attribute separated
         subjects into appropriate AE classes. When interpreting
         the tree, attributes at each node were taken in the context
         of attributes  at nodes closer to the root—thus allowing an
         explicit representation  of attribute interactions. To aug-
         ment  the  generalizability of  our  final  model,  we
         stipulated that at least five subjects must appear in each
         terminal  (status) leaf and used  10-fold cross-validation
         (CV) to estimate the predictive ability of the final model.
         Although CV accuracy was  reduced by allowing trees
         with less than  five  subjects in  terminal nodes,  CV
         accuracy proved  to be insensitive to changes in other
         tree parameters for these data. We used the implementa-
         tion of the C4.5 decision-tree algorithm provided  in the
         Weka machine learning software package to obtain our
         final model.24
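
The notion of information gain used to select node attributes can be made concrete with a small calculation. The sketch below computes the plain entropy-based gain for a binary split of a continuous attribute; C4.5 additionally normalizes this quantity (the gain ratio), which is omitted here for brevity. The attribute values, class labels and thresholds are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting a continuous attribute at `threshold`."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    remainder = sum(len(part) / n * entropy(part)
                    for part in (left, right) if part)
    return entropy(labels) - remainder


# Hypothetical percent changes in a cytokine and the corresponding AE status.
values = [10, 20, 30, 40, 120, 150, 180, 200]
labels = ["no AE"] * 4 + ["AE"] * 4

gain_clean = information_gain(values, labels, threshold=100)  # separates the classes
gain_poor = information_gain(values, labels, threshold=20)    # leaves classes mixed
```

A split that separates the AE classes cleanly removes all uncertainty about class membership, whereas a poorly placed threshold leaves most of it in place.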

         Data analysis strategy
         RF analysis  was performed using the freely available R
         package randomForest.25,26 This package is based on the
         original Fortran code available  at  the website cited in
         Breiman  and Cutler.27 RF was used to analyze data sets
         containing each biological data  type separately and in
         parallel, resulting in two stratified data sets (genetic only;
         proteomic only) and a combined data set  (both genetic
         and proteomic attributes). Genetic attributes were treated
         as categorical, whereas proteomic attributes were treated
         as continuous values.  For each genetic, proteomic, or
         combined data set, forests comprised of 10 000 trees were
         grown. Attribute importance was calculated using the
         out-of-bag permutation test described above. The relative
         importance  (rank) of functional genetic attributes and
         related proteomic  attributes was  determined  from the
         mean decrease in  the  Gini index using the out-of-bag
         permutation testing procedure. The relative importance
         determined  from  the  mean decrease in  classification
         accuracy produced nearly identical results both here and
         in extensive simulation studies.28
           The simulation studies28 used the current data as the
         basis  for a range of  simulated  models,  providing
         guidance for the parameters in the  analysis discussed
         here.  The  relative  rank  of  simulated  genetic and
         proteomic predictors was evaluated for a range of filter
         cutoffs and  on both stratified and combined data sets.
         Results from these data-based simulation studies demon-
         strated high confidence that AE-associated attributes

having relatively meager effects would be ranked in the
top 10% of attributes in RF analysis, and that analysis of
the combined (genetic and proteomic) data was generally
advantageous.  Therefore,  we chose the  top 10%  of
attributes as ranked by RF as candidates for inclusion
in our final model. To represent  the interactions among
genetic and/or proteomic attributes associated with AEs,
we built a decision tree model.
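
The first-stage filter amounts to ranking attributes by importance score and keeping the top 10%. A minimal sketch follows; the attribute names and scores are hypothetical, and the rounding rule (integer division, with at least one attribute kept) is an assumption, as the text does not say how fractional counts or ties were handled.

```python
def top_percent(importance, percent=10):
    """Return attribute names ranked in the top `percent` by importance score."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    keep = max(1, (len(ranked) * percent) // 100)  # ASSUMPTION: round down, keep >= 1
    return ranked[:keep]


# Hypothetical importance scores for 20 attributes.
scores = {f"attr_{i:02d}": float(i) for i in range(20)}

# Only the retained candidates are passed on to the decision tree stage.
candidates = top_percent(scores)
```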
  Biological interpretation of our final model was aided
by  Chilibot  (chip literature robot)  knowledge  mining
software.29  Chilibot  inferred   relationship  networks
among  the  attributes  in  the final model based  on
linguistic analysis  of  relevant   records  from public
biomedical  literature databases.  The natural  language
processing approach used  by Chilibot is  superior  to
standard co-occurrence text mining approaches because
parsing text into sentences can characterize the type of
relationship  (for  example,  inhibition or stimulation)
between  input terms.  The  terms  given  explicitly  to
Chilibot  as input  were  'ICAM-1,'  'IL-10,' 'IL-4'  and
'CSF-3,' as well as the alternate gene names 'CD54' and
'G-CSF' (for ICAM-1 and CSF-3, respectively). The
software automatically adds syntactic synonyms (for
example, 'IL 10,' 'IL-10,' 'IL10,' and so on) to the search
criteria. Because the goal of this study  is hypothesis
generation,  as opposed to  strict  hypothesis  testing,
Chilibot was used to aid in discovery rather than using
any pre-defined network relationships.
Results
Filtering of important attributes using RFs
Supplementary  Table 1 lists  all attributes having an
importance rank in the top 10% relative to all attributes
in the combined data set.  Figure 2 depicts the attribute
importance score landscape over the entire data set. This
landscape proved robust to changes in RF parameters,
provided  that a sufficiently large forest (10 000 trees) was
grown.  RF  identified  both  genetic   and  proteomic
attributes as  important discriminators  of AE status.
Approximately one-third of the attributes  identified as
important were genetic, with the remaining  two-thirds
being proteomic. Although this distribution among data

[Figure 2 plot: Importance scores for all attributes; axis: Attribute importance (High)]
Figure 2 Attribute importance 'landscape' showing the shape of
the importance curve ranking all attributes in the combined (genetic
plus  proteomic) data set.  Attributes above  the horizontal  line
indicate a relative importance rank in the top 10% (90th percentile)
of all attributes in the data set.

-------
types  may reflect systematic  patterns  concerning  the
etiology of AE  outcomes, the bias  toward proteomic
attributes probably arose out of the fact that the cytokine
array was specifically designed to capture variation in
important systemic mediators. In  contrast,  the genetic
data include candidate SNPs in and around genes having
a variety of immunological functions. In addition, with
multiple  SNPs  per  gene, correlation existing among
polymorphisms  (that is, haplotypes) could drive down
RF importance scores for particular SNPs, as RF might
select any SNP from within a  haplotype at a particular
node. Indeed, the IL4 SNP in our final model was part of
a group  of four SNPs in IL4 having nearly identical
importance scores,  and  Haploview  analysis showed
them to be in high linkage disequilibrium,  providing
evidence that these genetic polymorphisms are inherited
as a haplotype.30 In this context, linkage disequilibrium
has an impact akin to etiological heterogeneity, which is a
concern in any association study.  The  heterogeneity
concern is part of the rationale for using RF as a first-
stage filter that  identifies a handful (the top 10%) of
attributes  for  further  consideration.  The  effect  of
repeated samplings over many thousands of trees gives
all attributes an unbiased opportunity to demonstrate AE
association, even if importance  scores for groups of SNPs
in linkage  disequilibrium are slightly tamped  down.
Thus,  attributes  whose  importance  scores  may be
tamped down by phenomena such as linkage disequili-
brium still have a chance to surpass our 10% importance
threshold over a sufficiently large forest of resampled
trees,  whereas  slightly   down-weighted  importance
scores may push interesting attributes below an  overly
strict first-stage threshold in a  smaller forest. Consider-
ing the RF importance rank of attributes included in our
final model relative to all attributes in the combined data
set, all three proteomic attributes were ranked in the top
1%, and the IL4 SNP (rs2243290) was ranked in the top
5%. Relative to their respective data types, the IL4 SNP
was ranked in the top 1% among all attributes  in the
genetic data  set, and ICAM-1, CSF-3 and  IL-10 were
ranked in the top 1% among all proteomic attributes.
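The first-stage filter described above keeps only attributes whose importance rank falls in the top 10%. A minimal Python sketch of that ranking step follows; the function name `top_fraction` and the toy scores are hypothetical, and the importance values themselves would come from whatever RF implementation is used (none is computed here).

```python
def top_fraction(importance, fraction=0.10):
    """Return attribute names ranked in the top `fraction` by importance.

    `importance` maps attribute name -> importance score (higher is better).
    This is an illustrative helper, not the authors' code.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Rank attributes from most to least important.
    ranked = sorted(importance, key=importance.get, reverse=True)
    # Keep at least one attribute so the filter never returns nothing.
    keep = max(1, round(len(ranked) * fraction))
    return ranked[:keep]


# Hypothetical scores for 20 attributes, invented purely for illustration.
scores = {"attr_%02d" % i: 1.0 / (i + 1) for i in range(20)}
selected = top_fraction(scores)  # top 10% of 20 attributes -> 2 names
```

Ranking within one data type only (for example, genetic attributes alone) amounts to calling the same helper on the subset of scores for that type.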

Modeling the association of genetic and proteomic biomarkers
with AEs
Having  filtered  out the  noise using RFs,  we used  a
decision  tree representation  to  explore  interactions
among the attributes in our filtered list related  to AE
status. The final decision tree model is shown in Figure 3.
Our final model included four variables—three proteo-
mic  attributes and  one  genetic  attribute.  Change in
ICAM-1 concentration comprises the root node  of the
tree, with subsequent nodes composed of change in IL-10
concentration, a  SNP  in IL4, and  change  in  CSF-3
concentration. Imposing our minimum  of  five indivi-
duals  per terminal  (AE status) leaf, this tree correctly
classified 89% of individuals (with seven misclassifica-
tions)  in the  full data  set and achieved a  10-fold  CV
(prediction) accuracy of 75%.
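The distinction between accuracy on the full data set and 10-fold CV accuracy can be sketched as follows. This is a generic illustration in plain Python, assuming unstratified random folds; the toy labels and the majority-class "model" are hypothetical stand-ins and are not the decision tree or data used in the study.

```python
import random
from collections import Counter


def k_fold_indices(n, k=10, seed=0):
    """Split indices 0..n-1 into k roughly equal, shuffled folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_accuracy(X, y, fit, predict, k=10, seed=0):
    """Fraction of subjects predicted correctly while held out of training."""
    correct = 0
    for held_out in k_fold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in held_out)
    return correct / len(y)


# Stand-in "model": always predict the most common training label.
def fit_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]


def predict_majority(model, x):
    return model


# Hypothetical toy data: 6 AE and 14 non-AE subjects, no real features.
y = ["AE"] * 6 + ["non-AE"] * 14
X = [None] * len(y)
cv_acc = cross_validated_accuracy(X, y, fit_majority, predict_majority)
```

Because every prediction is made for a subject excluded from training, the CV figure estimates prediction accuracy, which is why it is typically lower than the accuracy obtained by re-classifying the full training set.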
  Figure 4 characterizes  the  biological relationships
among the attributes in the tree using Chilibot. Inter-
active relationships were characterized into one of three
types based on the verbs connecting pairs of attributes in
the biomedical  literature as   follows:  (1) Stimulatory
relationships were connected by verbs such as 'activate,'
'stimulate' or  'enhance.' (2) Inhibitory relationships were
Figure 3  Final model of genetic and proteomic factors contributing
to AE development. Each node (oval) constitutes a decision point
based on the genotype of genetic attributes (IL4 SNP) or whether the
concentration change from baseline in proteomic attributes (ICAM-
1, IL-10 and CSF-3) was above (upward-pointing arrows) or below
(downward-facing arrows) a calculated threshold.  Starting at the
root node (ICAM-1), subjects were classified into AE status leaves
(rectangles)  by proceeding  along the decision points  at each
attribute node. Given below each terminal leaf is the total number
of  subjects classified into that AE status group/the number of
subjects incorrectly assigned to that AE status group.
Figure 4 Biological relationships among the attributes in our final
model  characterized using Chilibot. Connections between each
attribute node (oval) are denoted according to the type of interactive
relationship they represent: stimulatory (solid), both  stimulatory
and inhibitory (dotted) or neutral (dashed). Arrowheads indicate
that interactions between particular biological attributes are  bi-
directional.
connected by  verbs such  as 'decrease,' 'attenuate'  or
'inhibit.' (3) Neutral relationships were assigned when
the nature of the relationship could not be determined
contextually. Mining the biomedical literature suggested
interactive relationships  connecting all  of  the  attribute
nodes in our final model.  Stimulatory,  inhibitory  or
neutral pair-wise interactive relationships were identi-
fied between each of ICAM-1, IL-10, IL4 and CSF-3.
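The three relationship types above, plus the mixed "both stimulatory and inhibitory" category shown in Figure 4, can be illustrated with a small verb lookup. This sketch is not Chilibot's algorithm; it only encodes the example verbs quoted in the text, and the function name and exact-match rule are assumptions made for illustration.

```python
# Example verbs quoted in the text; any real system would use a larger list.
STIMULATORY = {"activate", "stimulate", "enhance"}
INHIBITORY = {"decrease", "attenuate", "inhibit"}


def relationship_type(verbs):
    """Classify a pair of attributes from the verbs linking them in text.

    `verbs` is an iterable of base-form verbs found between the two terms.
    Anything that is neither stimulatory nor inhibitory is treated as neutral.
    """
    found = {v.lower() for v in verbs}
    stim = bool(found & STIMULATORY)
    inhib = bool(found & INHIBITORY)
    if stim and inhib:
        return "both"
    if stim:
        return "stimulatory"
    if inhib:
        return "inhibitory"
    return "neutral"
```

For example, `relationship_type(["enhance", "inhibit"])` returns `"both"`, corresponding to the dotted connections in Figure 4.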


                                               Genes and Immunity

-------
         Thorough examination of the networks inferred facili-
         tated the  biological  interpretation  of  the  final model
         discussed below.
         Discussion

         Our final model provides an immunologically plausible
         and testable biological mechanism of AE occurrence after
         smallpox vaccination that  includes both  genetic and
         proteomic factors. The analytical strategy used is appro-
         priate for the study of complex phenotypes, as outcomes
         such  as AE development likely result from the interplay
         of multiple genetic, proteomic  and environmental fac-
         tors.31,32 The  decision  tree trained on  the attributes
         passing our RF filter proposes a solid  biological model
         of AE development.
          The  attributes  included in  this  tree point to  an
         important role  of one  particular immune  cell type:
         monocytes. Monocytes  are  bone marrow-derived circu-
         lating blood cells that  are  precursors  of tissue macro-
         phages. Monocytes are recruited actively to the sites of
         inflammation, where they differentiate into macrophages
         in tissues. These macrophages  play important roles in
         coordinating both innate  and  adaptive immune re-
         sponses. Macrophages  are  activated by  microbial pro-
         ducts such as endotoxin and by T-cell cytokines such as
         interferon-γ.  Activated macrophages phagocytose and
         kill microorganisms, secrete pro-inflammatory cytokines
         and present antigens to helper T cells.
          The root node of the tree  we  developed is  ICAM-1
         (CD54), where small changes from baseline  concentra-
         tion (<11%) of ICAM-1 predict a non-AE response to
         vaccination and high changes from baseline  concentra-
         tion (>11%) point toward AE risk, depending on factors
         in subsequent  nodes. ICAM-1  is mainly expressed  on
         endothelial cells, T  cells,  B cells  and  monocytes.  It
         functions in cell-cell adhesion, which plays a crucial
         role in  monocyte differentiation  into  macrophages, as
         entry into tissues is  necessary.  In  addition,  ICAM-1
         expression is upregulated in mature monocytes,33 aiding
         in cell  adhesion and the eventual differentiation into
         macrophages.  Circulating  monocytes  are in  random
         contact with endothelial cells, and the adhesion molecule
         E-selectin slows the monocyte by inducing rolling of the
         monocyte  along  the  endothelial  surface  before  firm
         attachment to  vascular cell  adhesion molecule  1  or
         ICAM-1, which interact with integrins  on the monocyte
         surface. Once  the  monocyte  is  tightly bound, it then
         migrates between endothelial cells.34,35 Excessive levels
         of ICAM-1 might cause an 'over-recruitment' of mono-
         cytes into tissue,  triggering  an unnecessarily  active
         innate inflammatory response.
          For individuals with large changes in  ICAM-1, the next
         node in the tree is IL-10, where changes from baseline
         >85%  are associated with AEs.  IL-10 is produced  by
         activated macrophages and some helper T cells, and its
         major function is to inhibit activated macrophages and,
         therefore, to maintain homeostatic control of innate and
         cell-mediated immune reactions. Changes in IL-10 levels
         may indicate an imbalance in this delicate homeostasis,
         leading to AEs.
          For individuals with mild changes in IL-10 concentra-
         tion, the next node is an SNP in the gene encoding IL-4.
         In an earlier genetic study of two  vaccination cohorts
      (including a subset of individuals in the present data),
      this same IL-4 polymorphism was associated with AEs
      (P = 0.05 and  P = 0.06 in the first  and  second cohort,
      respectively).15 Interestingly,  by including proteomic
      factors in this study, our model indicates that the AE
      risk conferred by this SNP is dependent on proteomic
      context. IL-4 is a  cytokine produced mainly by the TH2
      subset of CD4+ helper T cells, whose functions include
      the induction of  the differentiation  of TH2 cells from
      naive CD4+ precursors, stimulation of IgE production by
      B  cells  and  suppression of  interferon-γ-dependent
      macrophage  functions.36-38  Although direct functional
      significance of the SNP is unknown, it is reasonable that
      the different  genotypes  could result in  functionally
      different versions of  the  IL-4 protein  or  in  different
      bioavailability levels of IL-4. The fact that multiple SNPs
      in IL4  achieved  nearly identical  importance  scores
      indicates that  variation within the IL4 gene region may
      be related functionally  to the development of AEs.
      Because of the intricate cross-talk between macrophages
      and the TH2  response in maintaining homeostasis, it is
      plausible that  the major IL4 genotype (CC) is associated
      with  calming  the activated macrophage response and
      directing the  acquired immune system to  progress in
      response to vaccine presentation, whereas the variant
      genotypes  (AC or AA) fail to  calm the innate response,
      presenting increased AE  risk.
        For individuals having one of the variant genotypes at
      IL4, the lowest node of the tree is CSF-3 (G-CSF). G-CSF
      is a cytokine produced by activated T cells, macrophages
      and endothelial cells at the sites of infection, which acts
      on  the bone  marrow to mobilize  and  increase  the
      production of neutrophils to replace those consumed in
      inflammatory  reactions. In our model, increased levels of
      CSF-3 after vaccination  (change >78%) indicated in-
      creased risk of suffering an AE. This finding implies
      another possible over-recruitment event in the develop-
      ment  of AEs,  as neutrophils have been associated with
      host  tissue damage  and failure  to terminate  acute
      inflammatory  responses.39 This  reaction is consistent
      with the types of AE symptoms observed in this study
      and with the overall proposed biological mechanisms of
      AE development.
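The decision path described in the preceding paragraphs can be written out as a short function. This is a sketch under stated assumptions: the thresholds (11%, 85%, 78%) and genotypes (CC, AC, AA) are taken from the text, but the non-AE labels for the CC branch and for CSF-3 changes at or below 78%, as well as the handling of values exactly at a threshold, are inferred rather than stated, and the function name is hypothetical. Figure 3 remains the authoritative description of the tree.

```python
def predict_ae(icam1_change, il10_change, il4_genotype, csf3_change):
    """Walk the four-node tree; changes are percent change from baseline.

    Returns "AE" or "non-AE". Leaf labels not spelled out in the text
    (CC genotype, CSF-3 change <= 78%) are assumed to be "non-AE".
    """
    # Root node: small ICAM-1 changes predict a non-AE response.
    if icam1_change <= 11:
        return "non-AE"
    # Second node: large IL-10 changes are associated with AEs.
    if il10_change > 85:
        return "AE"
    # Third node: IL4 SNP genotype; CC is the major genotype.
    if il4_genotype == "CC":
        return "non-AE"
    if il4_genotype not in {"AC", "AA"}:
        raise ValueError("expected genotype CC, AC or AA")
    # Lowest node: CSF-3 (G-CSF) change for variant genotypes.
    return "AE" if csf3_change > 78 else "non-AE"
```

Written this way, the context dependence noted above is explicit: the IL4 genotype is consulted only for subjects with a large ICAM-1 change and a mild IL-10 change.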
        The results  of  this study provide a viable biological
      hypothesis of  AE occurrence after smallpox vaccination
      that is experimentally testable. Our model includes both
      genetic and proteomic biomarkers. Allowing for such an
      integrative model  is  an  important  strength  of  our
      analytical strategy. It is increasingly recognized that the
      pathophysiology of complex clinical outcomes hinges on
      biological factors  acting on multiple levels.40 Therefore,
      the formulation of  robust etiological  models must take
      this inherent complexity into account and capitalize on
      the power of modern  experimental  data-generating
      techniques.
        We conclude that AEs after smallpox vaccination result
      from hyperactivation of inflammatory signals, leading to
      excess recruitment and  stimulation of  monocytes in
      peripheral tissues. Our  analysis  identifies  a set  of
      interacting genetic and proteomic candidates associated
      with AEs, such as ICAM-1, IL-10, IL4 and CSF-3. As the
      proteomic  measurements occurred early in the period
      after vaccination,  before most AEs presented themselves
      clinically, our model could be used as a diagnostic tool in
      the prediction of AEs. Of course, the ultimate goal of

-------
such a study is the identification and characterization of
biological risk factors contributing to the inappropriate
immune response to vaccination.  We present a hypo-
thesized mechanism of AE development that  targets
specific elements of systemic inflammatory pathways for
further study.
  Future studies should further evaluate the reproduci-
bility of the current model, given that the  number of
vaccinated subjects meeting the criteria for inclusion and
having both genetic  and proteomic data was relatively
small.  Ideally, our  model would  be  evaluated for
replication in an entirely independent sample. However,
the validity of our current model can be assessed through
the statistical process of internal CV (where our model
achieved a 75%  prediction  accuracy)  and through
comparison of these results with our earlier studies of
genetic15 or proteomic8 data alone. In this study, our RF
approach with the combined data identified all attributes
highlighted in the  earlier  proteomic  study8 (ICAM-1,
CSF-3, Eotaxin and TIMP-2) and two of the three genes
highlighted in the earlier genetic study15 (MTHFR and
IL4 but  not IRF1). Although the IRF1 polymorphisms
were   not   ranked  in  the  top  10%  of  all  attribute
importance  scores  in  the  combined data   set,  these
attributes would have passed the top 10% filter criteria
relative to  only genetic attributes. Given that the subset
of subjects used in this study (that is, those having both
genetic and proteomic  data) has  only partial overlap
with subjects in either of the earlier studies, we feel that
the current results are remarkably stable.
  Finally, our hypothesized model must be tested at the
bench. The functional consequences of genetic variability
in IL4  should be characterized fully. Time series studies
with dense measurement points are needed to shed light
on  the  dynamic interplay between  the  signaling of
ICAM-1, IL-10 and CSF-3.  Additional data are needed
on  the effects  of these cytokines in other physiological
compartments. Careful  assessment of external  factors
(such as nutrition, fitness and relevant environmental
exposures)  influencing  protein  expression  should be
considered  in future  studies. The results from this study
suggest that analysis of the  molecular and cellular  basis
of complex clinical phenomena will require an experi-
mental approach that takes into  account the broader
spatial and temporal physiological context of complex
biological systems.
Acknowledgements

This work was supported by the National Institutes of
Health  (NIH)/National Institute of Allergy  and  Infec-
tious Diseases  (NIAID) Vaccine Trials  and  Evaluation
Unit (contract N01-AI-25462, study DMID 02-054); NIH/
NIAID  (Grants R21-AI-59365,  K25-AI-064625 and R01-
AI-59694) and NIH/National Institute of General Med-
ical Sciences (NIGMS) (Grant  R01-GM-2758). Cytokine
analysis was a kind gift of Stephen Kingsmore, PhD, and
Molecular Staging Incorporated. Genotype analysis was
a kind gift of Stephen Chanock, MD, and the NCI Center
for Cancer Research. Kathryn Edwards, MD, coordinated
the original acquisition of the data  analyzed  for this
study.  The  United  States  Environmental  Protection
Agency (EPA),  through  its  Office  of Research and
Development,  collaborated in  the research  described
here. It has  been subjected  to Agency review and
approved for publication, although it does not necessa-
rily represent the views or policies of the US EPA.
References

 1 Kemper AR, Davis MM, Freed GL. Expected adverse events in a
   mass smallpox vaccination campaign. Eff Clin Pract 2002; 5: 84-90.
 2 Reif DM, White BC, Moore JH. Integrated analysis of genetic,
   genomic and proteomic  data. Expert Rev Proteomics 2004; 1:
   67-75.
 3 Manolio TA, Collins FS. Genes,  environment, health, and
   disease: facing up to complexity. Hum Hered 2007; 63: 63-66.
 4 Nicholson JK. Global systems biology, personalized  medicine
   and molecular epidemiology. Mol Syst Biol 2006; 3: 1-6.
 5 Breiman L. Random forests. Mach Learn 2001; 45: 5-32.
 6 Lunetta KL, Hayward LB, Segal J, Van EP. Screening large-
   scale association  study  data:  exploiting  interactions using
   random forests. BMC Genet 2004; 5: 32.
 7 Robnik-Sikonja M. Improving random forests. Proc  Eur Conf
   Mach Learn 2004; 3201: 359-370.
 8 McKinney BA, Reif DM,  Rock MT, Edwards KM, Kingsmore
   SF, Moore JH et al. Cytokine expression patterns associated
   with systemic adverse events following smallpox immuniza-
   tion. J Infect Dis 2006; 194: 444-453.
 9 Rock MT, Yoder SM, Talbot  TR, Edwards KM,  Crowe Jr JE.
   Adverse events after smallpox immunizations are associated
   with alterations in systemic cytokine levels. J Infect  Dis 2004;
   189: 1401-1410.
10 Rock MT, Yoder SM, Talbot  TR, Edwards KM,  Crowe Jr JE.
   Cellular immune responses to diluted and undiluted Aventis
   Pasteur smallpox vaccine. J Infect Dis 2006; 194: 435-443.
11 Talbot TR, Stapleton JT, Brady RC, Winokur PL, Bernstein DI,
   Germanson T et al. Vaccination success  rate  and reaction
   profile  with  diluted and undiluted  smallpox vaccine: a
   randomized controlled trial. JAMA 2004; 292: 1205-1212.
12 Talbot TR, Bredenberg HK,  Smith M, LaFleur  BJ,  Boyd A,
   Edwards KM. Focal and generalized folliculitis following
   smallpox vaccination among vaccinia-naive recipients. JAMA
   2003; 289: 3290-3294.
13 Garcia-Closas M, Malats N, Real FX, Yeager  M, Welch  R,
   Silverman D  et al. Large-scale evaluation of candidate genes
   identifies  associations between VEGF polymorphisms and
   bladder cancer risk. PLoS Genet 2007; 3: e29.
14 Packer BR, Yeager M, Burdett  L, Welch R, Beerman M,  Qi L et al.
   SNP500Cancer: a public resource for sequence validation, assay
   development, and frequency analysis for genetic variation in
   candidate genes. Nucleic Acids Res 2006; 34: D617-D621.
15 Reif DM, McKinney BA, Motsinger-Reif AA,  Chanock SJ,
   Edwards KM, Rock MT et al. Genetic basis for adverse events
   after smallpox vaccination. J Infect Dis 2008; 198: 16-22.
16 Kader HA,  Tchernev VT, Satyaraj E, Lejnine  S, Kotler G,
   Kingsmore SF et al. Protein microarray analysis of disease
   activity in pediatric inflammatory bowel disease demonstrates
   elevated serum PLGF, IL-7, TGF-beta1, and IL-12p40 levels in
   Crohn's disease and  ulcerative colitis patients  in remission
   versus active disease. Am J Gastroenterol 2005; 100: 414-423.
17 Perlee L, Christiansen J, Dondero R,  Grimwade  B, Lejnine S,
   Mullenix  M  et  al. Development and  standardization  of
   multiplexed  antibody microarrays for  use in  quantitative
   proteomics. Proteome Sci 2004; 2: 9.
18 Schweitzer B, Wiltshire S, Lambert J, O'Malley S, Kukanskis K,
   Zhu Z et al. Inaugural article:  immunoassays with rolling circle
   DNA  amplification: a versatile  platform for   ultrasensitive
   antigen detection. Proc Natl Acad Sci USA 2000; 97: 10113-10119.
19 Schweitzer B, Roberts S, Grimwade B, Shao W, Wang M, Fu Q
   et al. Multiplexed protein profiling on microarrays by rolling-
   circle amplification. Nat Biotechnol 2002; 20: 359-365.



-------
          20 Breiman L, Friedman JH, Olshen RA, Stone CJ. Classification
             and Regression Trees. Chapman & Hall: New York, 1984.
          21 Province MA, Shannon WD, Rao DC. Classification methods
             for confronting heterogeneity. Adv Genet 2001; 42: 273-286.
          22 Bureau A, Dupuis J, Falls K, Lunetta KL, Hayward B, Keith TP
             et al. Identifying SNPs predictive of phenotype using random
             forests. Genet Epidemiol 2005; 28: 171-182.
          23 McKinney BA, Reif DM,  Ritchie MD, Moore JH. Machine
             learning for  detecting  gene-gene interactions:  a review. Appl
             Bioinformatics 2006; 5: 77-88.
          24 Witten IH, Frank E. Data Mining: Practical Machine  Learning
             Tools   and  Techniques,  2nd  edn.  Morgan  Kaufmann:  San
             Francisco, 2005.
          25 Ihaka  R, Gentleman R. R: a language for  data analysis  and
             graphics. J Comput Graph Stat 1996; 5: 299-314.
          26 R Development Core Team. R: a language and environment
             for statistical computing. R foundation for statistical comput-
             ing. Available at http://www.R-project.org, 2006.
          27 Breiman L, Cutler A. Random forests. Available at
             http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm,
             2004.
          28 Reif DM, Motsinger AA, McKinney BA, Crowe Jr JE, Moore JH.
             Feature selection using  a random forests classifier for the
             integrated analysis of multiple data types. In: Proceedings of the
             IEEE Symposium on Computational Intelligence in Bioinformatics
             and Computational Biology, 2006, pp 171-178.
          29 Chen  H,  Sharp  BM.  Content-rich biological  network con-
             structed by  mining PubMed abstracts.  BMC  Bioinformatics
             2004; 8: 5-147.
          30 Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and
             visualization of LD and haplotype maps. Bioinformatics 2005;
             21: 263-265.
       31 Moore JH. The ubiquitous nature of epistasis in determining
          susceptibility to  common human diseases. Hum Hered  2003;
          56: 73-82.
       32 Wilke RA, Reif DM, Moore JH.  Combinatorial pharmacoge-
          netics. Nat Rev Drug Discov 2005; 4: 911-918.
       33 Most J, Schwaeble W, Drach  J, Sommerauer A, Dierich MP.
          Regulation  of the  expression of ICAM-1 on human mono-
          cytes and monocytic tumor cell lines.  J Immunol 1992; 148:
          1635-1642.
       34 Peters W, Charo IF. Involvement of chemokine receptor 2 and
          its ligand, monocyte chemoattractant protein-1, in the devel-
          opment of atherosclerosis: lessons from knockout mice. Curr
          Opin Lipidol 2001; 12: 175-180.
       35 Zittermann SI, Issekutz AC.  Basic fibroblast growth factor
          (bFGF, FGF-2) potentiates leukocyte recruitment to inflamma-
          tion by enhancing endothelial adhesion molecule expression.
          Am J Pathol 2006; 168: 835-846.
       36 Eslick J, Scatizzi JC, Albee L, Bickel E, Bradley K, Perlman H.
          IL-4 and IL-10 inhibition of spontaneous monocyte apoptosis
          is associated  with Flip upregulation. Inflammation 2004;  28:
          139-145.
       37 Mangan  DF, Robertson  B,  Wahl  SM.  IL-4  enhances
          programmed  cell  death  (apoptosis) in stimulated  human
          monocytes. J Immunol 1992; 148: 1812-1816.
       38 Soruri A, Kiafard Z, Dettmer C, Riggert J, Kohl J, Zwirner J. IL-
          4 down-regulates anaphylatoxin receptors in monocytes and
          dendritic cells and impairs  anaphylatoxin-induced migration
          in vivo. J Immunol 2003; 170: 3306-3314.
       39 Serhan CN, Savill J. Resolution of inflammation: the beginning
          programs the end. Nat Immunol 2005; 6: 1191-1197.
       40 Hood L. Systems biology: integrating  technology, biology, and
          computation.  Mech Ageing Dev 2003; 124: 9-16.
          Supplementary Information accompanies the paper on Genes and Immunity website (http://www.nature.com/gene)

-------
ELSEVIER
Available online at www.sciencedirect.com
ScienceDirect

Toxicology in Vitro 22 (2008) 296-300
www.elsevier.com/locate/toxinvit
      9,10-Phenanthrenequinone  induces  DNA  deletions  and  forward
                      mutations via oxidative mechanisms  in  the
                                yeast  Saccharomyces  cerevisiae

                Chester E. Rodriguez a,1, Zhanna Sobol b,1, Robert H. Schiestl b,*
          a Department of Pharmacology, Geffen School of Medicine, Center for the Health Sciences, Los Angeles, CA 90095-1735, USA
          b Departments of Pathology and Environmental Health Sciences, Geffen School of Medicine and School of Public Health, UCLA, Los Angeles, CA, USA
                                     Received 15 June 2007; accepted 4 September 2007
                                           Available online 12 September 2007
Abstract

   The estimated cancer risk from diesel exhaust particles (DEP) in the air is approximately 70% of the cancer risk from all air pollutants.
DEP is comprised of a complex mixture of chemicals whose carcinogenic potential has not been adequately assessed. The polycyclic aro-
matic hydrocarbon quinone 9,10-phenanthrenequinone (9,10 PQ) is a major component of DEP and a suspect genotoxic agent for DEP
induced DNA damage. 9,10 PQ  undergoes redox cycling to produce reactive oxygen species that can lead to oxidative DNA damage.
   We used two systems in the yeast Saccharomyces cerevisiae to examine possible differential genotoxicity of 9,10 PQ. The DEL assay
measures intra-chromosomal homologous recombination leading to  DNA deletions and the CAN assay measures forward mutations
leading to canavanine resistance. Cells were exposed to 9,10 PQ aerobically and anaerobically followed by DNA damage assessment.
The results indicate that 9,10 PQ  induces DNA deletions and point mutations in the presence of oxygen while exhibiting negligible effects
anaerobically. In contrast to the cytotoxicity observed aerobically, the anaerobic effects of 9,10 PQ seem to be cytostatic in nature, reduc-
ing growth without affecting cell  viability. Thus, 9,10 PQ requires oxygen for genotoxicity while different toxicities exhibited aerobically
and anaerobically suggest multiple mechanisms of action.
© 2007 Elsevier Ltd. All rights reserved.

Keywords: 9,10-Phenanthraquinone;  1,4-Benzoquinone; DNA deletions; DEL assay; Canavanine assay; Forward mutations; Saccharomyces cerevisiae;
Yeast
1. Introduction

  Exposure to diesel exhaust particles (DEP) in urban air
represents an important  cancer risk factor. According to
the  Multiple Air Toxics Exposure Study for  the  South
Coast Air Basin (MATES  II study) in  Southern Califor-
nia, DEP emissions represent 70% of all carcinogenic risk
from ambient measurements in the South Coast Air Basin
(2000). The levels of airborne DEP  have increased dra-
  Abbreviations: 9,10 PQ, 9,10-Phenanthrenequinone; 1,4 BQ, 1,4-Benzo-
quinone; DEP, Diesel Exhaust Particles.
 * Corresponding author. Tel.: +310 267 2087; fax: +310 267 2578.
   E-mail address: rschiestl@mednet.ucla.edu (R.H. Schiestl).
 1 Contributed Equally.

0887-2333/$ - see front matter © 2007 Elsevier Ltd. All rights reserved.
doi:10.1016/j.tiv.2007.09.001
matically over the last few  decades  due to the  increased
use of diesel-based engines, which provide higher fuel effi-
ciency and lower carbon dioxide emissions than gasoline-
based engines, but emit between 30-100 times more par-
ticulate  matter into  the atmosphere (Ma and Ma, 2002;
Peterson and  Saxon, 1996). It has been  estimated that
DEP constitute as much as 40% of the respirable particu-
late matter in  a city such  as Los Angeles where the daily
human intake has been estimated to be as much  as 300 µg
(Ma and Ma,  2002). Considering that the  lung  clearance
of DEP has been estimated to be about 18 days, signifi-
cant accumulation  can be  expected  to  take  place (Sun
et al., 1984).
   The composition of DEP consists of an inert carbona-
ceous core onto which a complex  mixture of chemical

-------
entities  is adsorbed. Over 450  different chemical species
have been identified in the organic layer of DEP including
transition metals, a variety of polycyclic aromatic hydro-
carbons (PAHs), nitroaromatic hydrocarbons,  quinones,
aldehydes, ketones, aliphatic hydrocarbons, and heterocy-
clic compounds (Li et al., 2004). It is not clear, however,
which components from the complex  mixture of chemicals
mediate carcinogenicity induced by  DEP,  but quinones
represent suspect agents in this process since DEP-induced
toxicity is consistent with quinone-like chemistry (Arimoto
et al., 1999; Bai  et al., 2001; Sagai et al., 1993). Quinones
such as 9,10 PQ are substantial constituents of DEP and
can mediate the production  of reactive  oxygen species,
potentially leading to oxidative DNA  damage (Bolton
et al., 2000). Moreover, quinones can also exert DNA dam-
age through their actions as electrophiles, resulting in cova-
lent adducts at nucleophilic centers of DNA. In particular,
9,10 PQ is one of the most toxic quinones found in DEP
with measured  levels of about 24 µg per gram of DEP
(Cho et al., 2004). In this study, the genotoxicity profile
of 9,10  PQ  was examined by two  different assays estab-
lished in the yeast S. cerevisiae. The yeast DEL assay is a
measure of  chromosomal rearrangements (Schiestl,  1989;
Schiestl et al., 1989) while the CAN assay represents a mea-
sure for point mutations (Whelan et  al., 1979).  The DEL
assay measures  a  DNA deletion event that occurs as a
result of  intrachromosomal  homologous  recombination
between 400 base pair repeats separated by 6 kilobase pairs
of DNA. An elevated frequency of homologous recombi-
nation  is associated with genomic  instability and  an
increased risk of cancer (Bishop and Schiestl, 2003). Thus,
the DEL assay measures a genotoxic  endpoint that is rele-
vant in carcinogenesis. The CAN assay measures resistance
to the toxic  compound canavanine. Canavanine is an ana-
log of arginine and enters the cell via the arginine-specific
permease CAN1. A forward point mutation event disrupts
the function of CAN1 thereby inhibiting the uptake of can-
avanine and rendering the cell  canavanine  resistant. The
goal of this study is to gain  insight  about how 9,10 PQ
leads to DNA damage. Incubations were carried out under
aerobic  and anaerobic conditions to distinguish  redox
cycling from the direct actions of the quinone. Since 9,10
PQ is  not considered  an electrophile, observable DNA
damage was expected to be strictly aerobic. For compari-
son purposes,  the genotoxicity  of  1,4-benzoquinone (1,4
BQ) was also examined as a quinone whose toxicity has
been exclusively  attributed to  its actions as an electrophile
and does not  undergo  redox cycling (Kondrova  et al.,
2007).

2. Materials and methods

2.1. Chemicals

   9,10-Phenanthrenequinone  (CAS 84-11-7),  1,4-benzo-
quinone (CAS 106-51-4) and Canavanine (CAS 543-38-
4) were  purchased from Aldrich (Milwaukee, WI).
2.2. Yeast strains and growth conditions

   The S.  cerevisiae strain used for the DEL assay is the
diploid  strain RS112 with the genotype: MATa/MATα,
ura3-52,  leu2-3,112/leu2-Δ98,  trp5-27/TRP5,   arg4-3/
ARG4,  ade2-40/ade2-101, ilv1-92/ILV1,   HIS3::pRS6/
his3Δ300, LYS2/lys2-801. This strain was created by
R.H. Schiestl (Schiestl and Prakash, 1988).
   The S.  cerevisiae strain used for the  canavanine  resis-
tance assay was wildtype Y433 (provided by P. Hieter) with
the following genotype: MATa ura3-52 leu2-Δ98 ade2-101
ilv1-92 his3-Δ200 lys2-801.
   RS112 and Y433 cells were grown on  YPAD (1% yeast
extract,  2% peptone, 2% dextrose, and 80.0 µg/ml adenine)
plates solidified with 2% agar.
   RS112 cells were  grown in  suspension using synthetic
complete liquid growth media lacking leucine (SC-leu med-
ium). Y433 cells were grown in YPAD (1% yeast extract,
2% peptone, 2% dextrose, and 80.0 µg/ml  adenine)  syn-
thetic liquid medium.

2.3. DEL Assay

   The yeast DEL assay was performed as previously
described with minor modifications (Brennan et al.,
1994). In brief, part of an RS112 colony was used to inoculate
5.0 ml of SC-leu liquid medium. After overnight incubation
at 30 °C and 275 rpm, the concentration of cells
was determined by hemocytometer counting, and aliquots
containing 1.0 × 10^7 cells were adjusted to 5.0 ml with
SC-leu medium before exposure to 9,10 PQ or 1,4 BQ for
17 hours at 30 °C. The final concentrations of 9,10 PQ in
the media were 2.5 µM, 5 µM, 10 µM, 15 µM and 20 µM
under aerobic conditions and 20 µM, 30 µM, 40 µM and
50 µM under anaerobic conditions. The final concentrations
of 1,4 BQ under both conditions were 60 µM and
100 µM. Quinones were dissolved in acetone, which constituted
0.4% of the final incubation mixture. Incubations
were carried out in 25 ml Erlenmeyer flasks equipped with
gas-tight rubber septa. Anaerobic conditions were achieved
by purging the cultures with nitrogen gas, using a syringe
needle through the septa, for one hour at 30 °C with shaking
prior to addition of the test compound. Aerobic experiments
were performed similarly, with the exception that
the flasks were not swept with nitrogen and the septa contained
a small opening to introduce air into the headspace.
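
The dosing arithmetic in this section can be sketched as follows. This is only an illustration: the overnight cell density of 4.0 × 10^7 cells/ml is a hypothetical value, and the function names are not taken from the study. The sketch assumes that the whole 0.4% acetone fraction is delivered as quinone stock solution.

```python
def aliquot_volume_ml(cells_per_ml, target_cells=1.0e7):
    """Volume of overnight culture (ml) that contains target_cells."""
    if cells_per_ml <= 0:
        raise ValueError("cell density must be positive")
    return target_cells / cells_per_ml


def stock_concentration_um(final_um, solvent_fraction=0.004):
    """Stock concentration (µM) needed so that a stock volume equal to
    solvent_fraction of the incubation gives final_um after dilution."""
    return final_um / solvent_fraction


# Hypothetical hemocytometer result: 4.0e7 cells/ml in the overnight culture.
volume_ml = aliquot_volume_ml(4.0e7)

# Stock needed for the highest aerobic 9,10 PQ concentration (20 µM).
stock_um = stock_concentration_um(20.0)

# Acetone stock volume (µl) added to a 5.0 ml incubation at 0.4% v/v.
stock_volume_ul = 5.0 * 0.004 * 1000.0

print(volume_ml, stock_um, stock_volume_ul)
```

Under these assumptions, 0.25 ml of culture would be brought to 5.0 ml, and 20 µl of a 5 mM acetone stock would give 20 µM 9,10 PQ.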
   Following incubation, cells were pelleted (3200 rpm,
10 minutes, 4 °C), washed, and re-suspended in 1 ml of
sterilized distilled water. Cells were then counted by hemocytometer,
diluted, and a volume corresponding to 50-100
cells was plated on synthetic complete (SC) medium and
synthetic complete medium lacking histidine (SC-his).
Plates were incubated at 30 °C for 2-3 days before being
analyzed for viable and histidine revertant colonies, respectively.
A dose-dependent doubling of the control recombination
frequency together with a statistically significant increase in
recombination frequency over the control is considered a
positive response in the yeast DEL assay (Kirpnick et al.,
2005).

-------
C.E. Rodriguez et al. / Toxicology in Vitro 22 (2008) 296-300

2.4. Canavanine-resistance assay

   The canavanine-resistance forward mutation assay was
carried out similarly to the DEL assay and as previously
described with minor modifications (Whelan et al., 1979).
In brief, 5.0 ml of YPAD liquid medium was
inoculated with a single Y433 colony. After overnight incubation
at 30 °C and 275 rpm, cells were counted with a
hemocytometer and adjusted to 1.0 × 10^7 cells/ml in a volume
of 5.0 ml of YPAD liquid medium before incubation
with 9,10 PQ or 1,4 BQ for seven hours under aerobic
and anaerobic conditions, as described for the
DEL assay. The final concentrations in the media were
20 µM 9,10 PQ (aerobic), 60 µM 9,10 PQ (anaerobic) and
100 µM 1,4 BQ. The resulting cultures were centrifuged
(3200 rpm, 10 minutes, 4 °C), washed, and re-suspended
in 1.0 ml of sterilized distilled water. Following counting
by hemocytometer, samples were diluted and a volume corresponding
to 100-200 cells was plated on SC medium
lacking arginine in the absence and presence of canavanine
at a concentration of 219 µM. Plates were then incubated
at 30 °C for 2-3 days and subsequently analyzed for viable
and canavanine-resistant mutants, respectively. As in
the DEL assay, a statistically significant increase and
doubling of the untreated frequency was considered a positive
effect.

3. Results

3.1. DNA deletions in diploid RS112 cells

   The DEL assay measures DNA deletions, a subgroup of
chromosomal rearrangement events. These events most
likely occur in response to DNA double-strand breaks
(Galli and Schiestl, 1998), and a wide range of clastogenic
compounds induce DEL recombination (Kirpnick et al.,
2005). Thus, DNA deletion induction is a gauge of a compound's
ability to cause DNA double-strand breaks. Table
1 shows the effect of 9,10 PQ and 1,4 BQ on the S. cerevisiae
diploid strain RS112 under aerobic and anaerobic conditions.
In the presence of oxygen, 9,10 PQ leads to significant
induction of DNA deletions starting at the 5 µM dose. Cell
viability and generation number begin to decrease significantly
at the 15 µM exposure concentration. In the absence of oxygen, 9,10 PQ
does not induce DNA deletions or diminish cell viability
even though the exposure concentrations range from
20 µM to 50 µM. At the 40 µM dose, 9,10 PQ induces a significant
decrease in generation number. The 50 µM dose
also decreases the generation number, but the decrease is
not statistically significant. The control quinone 1,4 BQ
does not lead to an increase in DNA deletions and does
not decrease cell viability up to 100 µM. This was true for
both aerobic and anaerobic conditions. There was a slight
but statistically significant decrease in generation number
for this quinone under both exposure conditions.

Table 1
DNA deletions, survival and generation number in diploid S. cerevisiae cells under aerobic and anaerobic conditions

9,10 PQ (Aerobic)
Dose (µM)   % Plating efficiency   Generations     DEL events/10^4 viable cells   Induction of DEL
0.0         57.9 ± 13.19           4.78 ± 0.14     0.93 ± 0.39                    1.00
2.5         49.06 ± 12.66          4.74 ± 0.23     1.46 ± 0.42                    1.57
5.0         50.41 ± 13.87          4.60 ± 0.24     1.84 ± 0.44*                   1.97
10.0        52.15 ± 6.93           4.58 ± 0.22     2.08 ± 0.52*                   2.23
15.0        23.42 ± 13.30*         0.96 ± 0.25**   7.98 ± 1.90**                  8.56
20.0        1.79 ± 1.03**          0.77 ± 0.58**   22.40 ± 15.03*                 24.03

9,10 PQ (Anaerobic)
Dose (µM)   % Plating efficiency   Generations     DEL events/10^4 viable cells   Induction of DEL
0.0         56.12 ± 10.17          4.89 ± 0.18     1.16 ± 0.60                    1.00
20.0        60.14 ± 16.6           4.66 ± 0.24     0.70 ± 0.24                    0.61
30.0        56.81 ± 8.12           4.49 ± 0.30     0.98 ± 0.25                    0.84
40.0        70.14 ± 7.28           3.77 ± 0.03**   1.54 ± 0.17                    1.33
50.0        40.07 ± 16.1           2.43 ± 1.56     1.86 ± 0.30                    1.60

1,4 BQ (Aerobic)
Dose (µM)   % Plating efficiency   Generations     DEL events/10^4 viable cells   Induction of DEL
0.0         47.64 ± 20.25          5.14 ± 0.06     1.34 ± 0.58                    1.00
60.0        82.83 ± 8.86           4.36 ± 0.40*    1.58 ± 0.13                    1.18
100.0       69.56 ± 20.29          4.43 ± 0.47     1.97 ± 0.18                    1.47

1,4 BQ (Anaerobic)
Dose (µM)   % Plating efficiency   Generations     DEL events/10^4 viable cells   Induction of DEL
0.0         72.66 ± 49.38          5.11 ± 0.17     1.3 ± 0.56                     1.00
60.0        122.44 ± 38.44         4.05 ± 0.38*    1.11 ± 0.03                    0.85
100.0       75.25 ± 5.28           4.17 ± 0.47*    1.46 ± 0.19                    1.12

(*) P value < 0.05, (**) P value < 0.01. Statistical significance determined by Student's t-test with comparison to the untreated control (zero dose). Data are
presented as the average ± standard deviation. DEL event frequency = number of deletion events per 10^4 viable cells.
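
The "Induction of DEL" column of Table 1 and the positive-response criterion from the Methods can be illustrated with a short sketch. It uses the rounded aerobic 9,10 PQ means from Table 1, so recomputed ratios may differ slightly from the tabulated ones, which were presumably derived from unrounded data. Applying the twofold cutoff dose by dose is an assumption made here for illustration; the function names are hypothetical.

```python
# Mean DEL events per 10^4 viable cells, 9,10 PQ under aerobic conditions (Table 1).
DEL_FREQ_AEROBIC_PQ = {0.0: 0.93, 2.5: 1.46, 5.0: 1.84, 10.0: 2.08, 15.0: 7.98, 20.0: 22.40}

# Doses (µM) whose DEL frequency is flagged as significant in Table 1.
SIGNIFICANT_DOSES = {5.0, 10.0, 15.0, 20.0}


def fold_induction(freqs):
    """Ratio of each dose's event frequency to the zero-dose control."""
    control = freqs[0.0]
    return {dose: value / control for dose, value in freqs.items()}


def positive_doses(freqs, significant, threshold=2.0):
    """Doses that are flagged significant and reach the fold-induction cutoff."""
    folds = fold_induction(freqs)
    return [dose for dose in sorted(folds)
            if dose > 0 and dose in significant and folds[dose] >= threshold]


for dose, fold in sorted(fold_induction(DEL_FREQ_AEROBIC_PQ).items()):
    print(f"{dose:5.1f} µM  {fold:6.2f}-fold")
print(positive_doses(DEL_FREQ_AEROBIC_PQ, SIGNIFICANT_DOSES))
```

With these rounded means, the 5 µM dose gives roughly a 1.98-fold induction, just short of a strict twofold cutoff, while 10 µM and above satisfy both conditions.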

3.2. CAN Mutations in haploid Y433 cells

   Mutations that confer canavanine resistance in yeast are
largely (90%) single base-pair alterations such as base substitutions
or frameshift mutations (Tishkoff et al., 1997).
The CAN assay in the current study provides information
about the base-damaging activity of 9,10 PQ
(Table 2). The point mutation frequency was determined
at a single exposure dose of 20 µM for 9,10 PQ and
100 µM for 1,4 BQ. At this concentration 9,10 PQ induced a
significant increase in point mutations as well as a significant
decrease in both cell viability and generation number.
In the absence of oxygen (tested at 60 µM; see Table 2),
there was no significant increase
in point mutations or decrease in cell viability for 9,10 PQ.
However, there was a significant reduction in generation
number. The control quinone 1,4 BQ led to an average
36-fold increase in point mutations in the presence of oxygen,
but the increase was not statistically significant because of
the large standard deviation. In the absence of oxygen, 1,4 BQ
led to a statistically significant 7.5-fold increase in point
mutations. Under both aerobic and anaerobic conditions,
1,4 BQ caused a significant decrease in cell viability and
generation number.

Table 2
Forward mutations, survival and generation number in haploid S.
cerevisiae cells under aerobic and anaerobic conditions

9,10 PQ (Aerobic)
Dose (µM)   % Plating efficiency   Generations     CANR events/10^6 viable cells   Induction of mutations
0.0         141.42 ± 9.63          4.99 ± 0.39     0.16 ± 0.19                     1.00
20.0        29.57 ± 4.06*          0.64 ± 0.66**   2.16 ± 0.49**                   13.50

9,10 PQ (Anaerobic)
Dose (µM)   % Plating efficiency   Generations     CANR events/10^6 viable cells   Induction of mutations
0.0         173.00 ± 94.40         5.26 ± 0.46     0.13 ± 0.05                     1.00
60.0        214.67 ± 30.89         2.63 ± 0.51**   0.61 ± 0.76                     4.69

1,4 BQ (Aerobic)
Dose (µM)   % Plating efficiency   Generations     CANR events/10^6 viable cells   Induction of mutations
0.0         141.42 ± 9.63          4.99 ± 0.39     0.16 ± 0.19                     1.00
100.0       19.96 ± 10.04*         0.13 ± 0.22**   5.78 ± 5.21                     36.13 (a)

1,4 BQ (Anaerobic)
Dose (µM)   % Plating efficiency   Generations     CANR events/10^6 viable cells   Induction of mutations
0.0         173.00 ± 94.40         5.26 ± 0.46     0.13 ± 0.05                     1.00
100.0       19.96 ± 10.04*         0.96 ± 0.66**   0.98 ± 0.34*                    7.54

(*) P value < 0.05, (**) P value < 0.01. Statistical significance determined
by Student's t-test with comparison to the untreated control (zero dose).
Data are presented as the average ± standard deviation. CANR frequency
= number of CANR events per 10^6 viable cells.
(a) Although 1,4 BQ leads to a 36-fold increase in CAN mutations, the
induction of point mutations is not statistically significant because the
actual frequency of CANR events per 10^6 viable cells is highly variable
(large standard deviation).

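
Why a 36-fold mean increase can fail to reach significance while a 7.5-fold increase succeeds may be easier to see from the test statistic itself. The sketch below computes a two-sample t statistic from the summary values in Table 2. The number of replicates is not stated in this section, so N = 3 per group is purely a hypothetical assumption. With equal group sizes this statistic coincides with the pooled Student's t statistic.

```python
import math


def t_statistic(mean_a, sd_a, mean_b, sd_b, n_a, n_b):
    """Two-sample t statistic computed from means and standard deviations."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


N = 3  # hypothetical number of replicates per group (not given in the text)

# 1,4 BQ at 100 µM versus the zero-dose control, CANR events per 10^6 viable cells.
t_aerobic = t_statistic(5.78, 5.21, 0.16, 0.19, N, N)
t_anaerobic = t_statistic(0.98, 0.34, 0.13, 0.05, N, N)

# Two-sided 5% critical value of Student's t with 2N - 2 = 4 degrees of freedom.
T_CRITICAL = 2.776

print(f"aerobic:   t = {t_aerobic:.2f}, exceeds critical value: {t_aerobic > T_CRITICAL}")
print(f"anaerobic: t = {t_anaerobic:.2f}, exceeds critical value: {t_anaerobic > T_CRITICAL}")
```

Under this assumed replicate count, the aerobic comparison has the larger mean difference but the smaller t statistic, because its standard deviation is nearly as large as its mean.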
4. Discussion

   The goal of this study was to examine the DNA damag-
ing activity of 9,10 PQ under aerobic and anaerobic condi-
tions. The DEL assay for DNA deletions and the CAN
assay for point mutations in  the yeast  S. cerevisiae pro-
vided a measure of genotoxicity. The findings of this study
reveal that 9,10 PQ leads to DNA  deletions and point
mutations only  in  the presence of oxygen. In the absence
of oxygen, 9,10  PQ caused a significant decrease in genera-
tion number in both the diploid and haploid strains but at
approximately three-fold higher concentrations than in the
presence of oxygen. This finding is consistent with previously
observed results where the IC50 for growth inhibition
was 13.81 µM under aerobic conditions and 36.00 µM
anaerobically (Rodriguez et al., 2004). The quinone 1,4
BQ does not induce DNA deletions and does not cause a
loss of cell viability in the diploid strain at concentrations
as high as 100 µM. However, it does cause a decrease in
generation number. In the haploid strain,  1,4 BQ leads to
a decrease in  cell viability and generation number in  the
presence of oxygen and leads only to a decrease in genera-
tion number  in the absence  of oxygen. This non-redox
cycling  quinone also  leads to forward  mutations in  the
absence  of oxygen. The  genotoxicity  probably  occurs
through direct interaction with DNA.
   The growth inhibition mediated by 9,10 PQ and 1,4 BQ
in the absence of oxygen is likely a result of GAPDH inhi-
bition. Both quinones have been previously shown to inhi-
bit  GAPDH   by  an  oxygen-independent   mechanism
(Rodriguez et al.,  2005).  Inhibition of  GAPDH activity
may have an effect on cell cycle progression because active
GAPDH has been shown to stimulate cell proliferation and
reverse cyclin  B  inhibition by other enzymes (Carujo et al.,
2006). The current  study concludes that the activity of 9,10
PQ  leads to  growth inhibition under both  aerobic  and
anaerobic conditions but leads to DNA deletions and point
mutations only in the presence of oxygen. Furthermore, the
electrophilic,  non-redox  cycling  1,4  BQ  leads  to point
mutations independent  of oxygen  but does not cause
DNA deletions.
Acknowledgements

   This study was supported by a UC Toxic Substances Research
and Teaching Lead Campus Program fellowship to
CR and ZS, and an EPA STAR fellowship to ZS.

-------
References

Arimoto, T., Yoshikawa, T., Takano, H., Kohno, M., 1999. Generation of
   reactive oxygen species and 8-hydroxy-2'-deoxyguanosine formation
   from diesel exhaust particle components in L1210 cells. The Japanese
   Journal of Pharmacology 80, 49-54.
Bai, Y., Suzuki, A.K., Sagai, M., 2001. The cytotoxic effects of diesel exhaust
   particles on human pulmonary artery endothelial cells in vitro: role of
   active oxygen species. Free Radical Biology & Medicine 30, 555-562.
Bishop, A.J.,  Schiestl, R.H., 2003. Role of homologous recombination in
   carcinogenesis. Experimental and Molecular Pathology 74, 94-105.
Bolton, J.L.,  Trush,  M.A., Penning, T.M., Dryhurst,  G., Monks, T.J.,
   2000.  Role of quinones in toxicology. Chemical Research in Toxicol-
   ogy 13, 135-160.
Brennan, R.J., Swoboda, B.E., Schiestl, R.H., 1994. Oxidative mutagens
   induce intrachromosomal recombination in yeast. Mutation Research
   308, 159-167.
Carujo, S., Estanyol, J.M., Ejarque, A., Agell, N., Bachs, O., Pujol, M.J.,
   2006. Glyceraldehyde 3-phosphate dehydrogenase is a SET-binding
   protein and regulates cyclin B-cdk1 activity. Oncogene 25, 4033-4042.
Cho, A.K., Di  Stefano, E.,  You, Y.,  Rodriguez, C.E., Schmitz, D.A.,
   Kumagai, Y., Miguel, A.H., Eiguren-Fernandez, A., Kobayashi, T.,
   Avol, E.,  Froines, J.R.,  2004. Determination of Four Quinones in
   Diesel Exhaust  Particles,  SRM  1649a,  and Atmospheric PM2.5.
   Aerosol Science and Technology 38, 68-81.
Galli, A., Schiestl, R.H., 1998. Effects of DNA double-strand and  single-
   strand breaks on intrachromosomal recombination events in cell-cycle-
   arrested yeast cells. Genetics 149, 1235-1250.
Kirpnick, Z., Homiski, M., Rubitski, E., Repnevskaya, M., Hewlett, N.,
   Aubrecht, J., Schiestl, R.H., 2005. Yeast DEL assay detects clastogens.
   Mutation Research 582, 116-134.
Kondrova, E., Stopka, P., Soucek, P., 2007. Cytochrome P450 destruction
   by  benzene metabolites 1,4-benzoquinone and 1,4-hydroquinone and
   the formation  of hydroxyl radicals in  minipig  liver microsomes.
   Toxicology In Vitro 21, 566-575.
Li, N., Alam, J., Venkatesan, M.I., Eiguren-Fernandez, A., Schmitz, D.,
   Di Stefano, E., Slaughter, N., Killeen, E., Wang, X., Huang, A.,
   Wang, M., Miguel, A.H., Cho, A., Sioutas, C., Nel, A.E., 2004. Nrf2 is
   a key transcription factor that regulates antioxidant defense in
   macrophages and epithelial cells: protecting against the proinflammatory
   and oxidizing effects of diesel exhaust chemicals. Journal of
   Immunology 173, 3467-3481.
Ma, J.Y., Ma, J.K., 2002. The dual effect of the particulate and organic
   components of diesel exhaust particles on the alteration of pulmonary
   immune/inflammatory responses and metabolic enzymes. Journal of
   Environmental Science and Health,  Part C: Environmental Carcino-
   genesis and Ecotoxicology Reviews 20, 117-147.
MATES-II, 2000. Multiple Air Toxics Exposure Study in the South Coast
   Air Basin (MATES-II). A report of the  South Coast Air Quality
   Management  District.  Available:  http://www.aqmd.gov/matesiidf/
   matestoc.htm.
Peterson, B., Saxon, A., 1996.  Global  increases in allergic respiratory
   disease: the possible role of diesel exhaust particles. Annals of Allergy,
   Asthma and Immunology 77, 263-268, quiz 269-270.
Rodriguez,  C.E.,  Shinyashiki, M., Froines,  J., Yu, R.C., Fukuto, J.M.,
   Cho,  A.K., 2004. An examination  of  quinone toxicity using the
   yeast  Saccharomyces  cerevisiae  model  system.  Toxicology  201,
   185-196.
Rodriguez, C.E., Fukuto, J.M., Taguchi, K.,  Froines, J., Cho, A.K., 2005.
   The interactions of 9,10-phenanthrenequinone with  glyceraldehyde-3-
   phosphate dehydrogenase (GAPDH), a potential site for toxic actions.
   Chemico-Biological Interactions 155, 97-110.
Sagai, M., Saito, H., Ichinose, T., Kodama, M., Mori, Y., 1993. Biological
   effects of diesel exhaust particles. I. In vitro production of superoxide
   and in vivo  toxicity in mouse. Free  Radical Biology & Medicine  14,
   37-47.
Schiestl,  R.H.,  Prakash, S., 1988. RAD1,  an excision repair gene of
   Saccharomyces cerevisiae,  is also  involved in recombination. Molec-
   ular and Cellular Biology 8, 3619-3626.
Schiestl, R.H.,  1989. Nonmutagenic carcinogens induce intrachromoso-
   mal recombination in yeast. Nature 337,  285-288.
Schiestl, R.H., Gietz,  R.D., Mehta, R.D., Hastings, P.J.,  1989. Carcin-
   ogens induce intrachromosomal recombination in yeast. Carcinogen-
   esis 10, 1445-1455.
Sun, J.D., Wolff,  R.K., Kanapilly, G.M., McClellan, R.O.,  1984. Lung
   retention and metabolic fate of inhaled benzo(a)pyrene associated with
   diesel exhaust particles. Toxicology  and Applied Pharmacology 73,
   48-59.
Tishkoff, D.X.,  Filosi, N., Gaida, G.M., Kolodner, R.D.,  1997. A novel
   mutation avoidance mechanism dependent  on S. cerevisiae RAD27 is
   distinct from DNA mismatch repair. Cell 88, 253-263.
Whelan,  W.L.,  Gocke,  E., Manney, T.R.,  1979.  The CAN1  locus of
   Saccharomyces cerevisiae:  fine-structure  analysis and forward muta-
   tion rates. Genetics 91, 35-51.

-------
Toxicology Letters 181 (2008) 148-156

Comparing single and repeated dosimetry data for perfluorooctane
sulfonate in rats☆

Leona A. Harris a,*, Hugh A. Barton b,1
a Department of Mathematics and Statistics, The College of New Jersey, Ewing, NJ 08628, USA
b National Center for Computational Toxicology, Environmental Protection Agency, Research Triangle Park, NC 27711, USA
ARTICLE   INFO

Article history:
Received 27 May 2008
Received in revised form 10 July 2008
Accepted 10 July 2008
Available online 29 July 2008

Keywords:
Perfluorooctane sulfonate
Pharmacokinetic modeling
PBPK
                                        ABSTRACT
Perfluorooctane sulfonate (PFOS) is a member of a class of perfluorinated chemicals used in a variety
of consumer and industrial applications because of their oleophobic and hydrophobic properties. It has
been shown to cause toxicity in adult and developing laboratory animals. Because PFOS has also been
shown to be widely distributed throughout the environment, there have been concerns about its poten-
tial health risk to humans. Limited  pharmacokinetic data for PFOS are available in rodents and humans,
while epidemiological studies of workers and extensive toxicity studies in rodents have been performed.
The existing pharmacokinetic and toxicity database in rodents can be useful in the cross-species extrap-
olations  needed to evaluate  and interpret internal dosimetry in humans. A mathematical model that
describes the disposition of PFOS in adult rats following intravenous, oral, and chronic dietary exposures
was developed to gain a better understanding of the pharmacokinetics of PFOS and to determine whether
single-dose kinetics are predictive of repeated-dose kinetics. In order to characterize existing time-course
data, time-dependent and concentration-dependent changes in the pharmacokinetic parameters for uri-
nary  and biliary clearance and liver distribution were needed. Whether these  time-dependent changes
represent inconsistencies across experiments, effects of aging in the rats, or chemically induced changes
in pharmacokinetics remains to be determined.
                                               © 2008 Elsevier Ireland Ltd. All rights reserved.
1. Introduction

   Perfluorooctane sulfonate (PFOS) and related perfluoroalkyl
acids are used for a variety of consumer and industrial applications
because of their oleophobic and hydrophobic properties.
These chemicals have been used in fabric protection products, the
coatings of paper plates and microwave popcorn bags, firefighting
foams, herbicides, denture cleaners, shampoos, and floor polishes
(Lau et al., 2007; OECD, 2002; Trudel et al., 2008). The largest
manufacturer (3M) ceased manufacturing PFOS and substances known
or suspected to generate PFOS through degradation pathways
between 2000 and 2002. PFOS has been shown to
be persistent and widely distributed in the environment. It has
been detected in the liver and blood of wildlife across the globe
in North America, Europe, Asia, and Antarctica, in blood samples
of 3M fluorochemical-production workers, as well as in blood samples
of non-occupationally exposed humans, at levels which appear
to be declining since the phase-out (Calafat et al., 2007; Giesy and
Kannan, 2001; Olsen et al., 2005, 2008).
   Epidemiological studies focused on workers and more recently
on the general population, notably pregnant women and their offspring,
have observed limited effects that may not be reproduced in
other studies (Alexander and Olsen, 2007; Fei et al., 2008). Toxicity
studies have evaluated effects in adult and developing animals.
Rats and nonhuman primates have been used in repeated exposure
studies of adults. Liver enlargement has been observed in rats and
monkeys (Seacat et al., 2002, 2003a), while hepatocellular adenomas
were observed in rats in a chronic toxicity study (3M Company,
2002). That study also observed evidence of thyroid effects. Limited
malformations arose from developmental exposures, which may
be due to maternal toxicity rather than direct effects of PFOS (Lau
et al., 2004, 2007). However, offspring of mice and rats exposed
during pregnancy demonstrated dose-dependent toxicity, notably
mortality of newborn pups shortly after birth at high doses. Direct
correlations of some effects with blood or liver concentrations of
PFOS have been reported, but limited data exist for describing PFOS
pharmacokinetics (Johnson et al., 1979a,b).

 ☆ This work was reviewed by US EPA and approved for publication but does not
necessarily reflect official Agency policy. Mention of trade names or commercial
products does not constitute endorsement or recommendation by EPA for use.
 * Corresponding author.
   E-mail addresses: harrisl@tcnj.edu (L.A. Harris), habarton@alum.mit.edu
(Hugh A. Barton).
 1 Current address: Pharmacokinetics, Dynamics and Metabolism Department,
Pfizer Global Research & Development, Groton, CT 06340, USA.

0378-4274/$ - see front matter © 2008 Elsevier Ireland Ltd. All rights reserved.
doi:10.1016/j.toxlet.2008.07.014

-------
   Johnson et al. (1979b) found that PFOS was well absorbed when
a single oral dose of potassium PFOS (mean dose, 4.2 mg/kg) was
administered to three male Charles River CD rats. In this study, 95%
of the dose was absorbed within 24 h. In a similar study, Johnson
et al. (1979a) found that PFOS distributes primarily to the liver and
the blood when a single intravenous dose of potassium PFOS (mean
dose, 4.2 mg/kg) was administered to six male Charles River CD rats.
By 89 days, 25% of the dose was found in the liver, 3% of the dose
was found in the plasma, and 42.8% of the dose had been excreted
in the urine and feces. Therefore, since there is no evidence to suggest
that PFOS is further metabolized, the half-life for elimination
from the body, t1/2, for male rats appears to be greater than 89
days. Johnson et al. (1984) showed that cholestyramine administered
in feed to rats for several days after an intravenous dose of
potassium PFOS increases the fecal elimination of PFOS substantially
and, hence, PFOS appears to undergo marked enterohepatic
recirculation.
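
The inference that the whole-body half-life exceeds 89 days can be checked with a rough calculation. The sketch below assumes simple first-order, single-compartment elimination and treats the 42.8% excreted by day 89 as the only loss from the body; both are simplifying assumptions made here for illustration and are not part of the cited studies.

```python
import math


def half_life_days(fraction_eliminated, elapsed_days):
    """Half-life implied by first-order loss of a given fraction over a given time."""
    if not 0.0 < fraction_eliminated < 1.0:
        raise ValueError("fraction_eliminated must be between 0 and 1")
    rate_constant = -math.log(1.0 - fraction_eliminated) / elapsed_days  # per day
    return math.log(2.0) / rate_constant


# 42.8% of the intravenous dose excreted in urine and feces by day 89.
t_half = half_life_days(0.428, 89.0)
print(f"implied half-life: {t_half:.0f} days")
```

Because less than half of the dose had been excreted by day 89, any such estimate necessarily lies above 89 days; under these assumptions it is roughly 110 days.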
   PFOS and related chemicals have become a major focus in environmental
toxicology (Lau et al., 2007; Trudel et al., 2008). The
persistence and wide distribution of PFOS in the environment have
caused concerns about its potential health risk to humans. Limited
pharmacokinetic data for PFOS in humans have been used to
estimate serum elimination half-lives between 5 and 9 years, considerably
longer than the serum elimination half-life of 7.5 days
for adult rats, though the half-life for elimination from the body
was greater than 89 days in rats (Johnson et al., 1979b; OECD,
2002; Olsen et al., 2007). While there is also  a limited pharma-
cokinetic database in rodents, extensive toxicity studies in rodents
have been performed. Therefore a mathematical model describing
the pharmacokinetics of PFOS in rats would be useful in performing
cross-species extrapolations needed to evaluate internal dosimetry
in humans.
   The objective of this research was to develop a  physiologi-
cally based pharmacokinetic (PBPK) model of PFOS that could be
used to gain a better understanding of the pharmacokinetics of
PFOS in rats following a variety of exposure  scenarios. The  PBPK
model described in this  paper was used to evaluate the existing
time course data for PFOS concentrations in the liver, plasma, red
blood cells, urine, and feces following intravenous, oral, and chronic
dietary exposures to determine whether single-dose kinetics  are
predictive of repeated-dose kinetics. In this evaluation, inconsis-
tencies among the different exposure scenarios were observed and
led to the use of time-dependent and concentration-dependent
changes in the pharmacokinetic parameters for PFOS to obtain rea-
sonable predictions of the time course data. The time-dependent
and concentration-dependent changes  presented in this paper
are hypotheses that require additional experimental investigation
to better understand the biological processes underlying these
changes.

2.  Methods

2.1.  Model structure

   The pharmacokinetics of PFOS in adult male and nonpregnant female rats can
be described using standard PBPK compartmental  modeling (Gerlowski and Jain,
1983; Nestorov, 2003). The PBPK model presented in this paper describes the body
as a set of tissue compartments representing the organs and tissues in the body
interconnected by blood flow through the compartments. See Fig. 1 for a graphi-
cal representation of the model. A system of differential equations based on mass
balance is used to track the uptake and absorption of PFOS into the blood stream,
transport of PFOS from one tissue compartment to the  next via the plasma, and
elimination of PFOS from the body via the urine and feces.  For purposes of this
model, plasma and serum are assumed interchangeable. The system variables, rep-
resenting the amounts of PFOS in the various tissue compartments, are defined in
Table 1 and the model parameters are defined in Tables 2 and 3. Each tissue compartment
in the model is described as a diffusion-limited compartment (including
sub-compartments for the capillary blood in the tissue and the tissue cells), consistent
with an assumption that the rate of diffusion across the cell membrane is slow
compared to the blood flow rate into the tissue (Gerlowski and Jain, 1983). Since
PFOS has been shown to distribute primarily to the liver and blood (Johnson et al.,
1979a), compartments representing the blood, liver, and rest of the body have been
included in the model. In addition, a two-compartment gastrointestinal tract has
been included as a route of entry.

[Figure 1 schematic. Labeled elements: plasma (free PFOS and albumin-bound PFOS, P:A), red blood cells, rest of body plasma, rest of body tissue, liver plasma, liver tissue, upper and lower gastrointestinal tract, oral dose, urine, feces.]
Fig. 1. Conceptual representation of a physiologically based pharmacokinetic model
for PFOS exposure in rats. The boxes represent tissue compartments and the arrows
represent plasma flow to/from the tissue compartments and diffusional transfer
across the cell membrane.
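
The idea of a diffusion-limited compartment can be made concrete with a small mass-balance sketch. The code below is a generic textbook form with one central plasma pool and one tissue split into capillary plasma and tissue cells. It is not the authors' model or equation set, and every parameter value is hypothetical. The permeability-area product is set well below the plasma flow so that membrane diffusion is the slow step.

```python
# Hypothetical parameter values, chosen only to illustrate the structure.
Q = 1.0           # plasma flow to the tissue (L/h)
PA = 0.05         # permeability-area product for membrane diffusion (L/h)
P = 4.0           # tissue:plasma partition coefficient (unitless)
V_PLASMA = 0.010  # central plasma volume (L)
V_TP = 0.002      # tissue capillary plasma volume (L)
V_T = 0.020       # tissue cell volume (L)


def simulate(dose_mg=1.0, hours=50.0, dt=0.001):
    """Explicit Euler integration of the three mass-balance equations."""
    a_plasma, a_tp, a_t = dose_mg, 0.0, 0.0
    for _ in range(int(round(hours / dt))):
        c_plasma = a_plasma / V_PLASMA
        c_tp = a_tp / V_TP
        c_t = a_t / V_T
        flow = Q * (c_plasma - c_tp)       # net delivery by plasma flow (mg/h)
        diffusion = PA * (c_tp - c_t / P)  # net transfer across the membrane (mg/h)
        a_plasma -= flow * dt
        a_tp += (flow - diffusion) * dt
        a_t += diffusion * dt
    return a_plasma, a_tp, a_t


a_plasma, a_tp, a_t = simulate()
total = a_plasma + a_tp + a_t
ratio = (a_t / V_T) / (a_tp / V_TP)
print(f"total amount = {total:.6f} mg, tissue:plasma concentration ratio = {ratio:.3f}")
```

Because this toy system has no elimination, the total amount stays at the dose, and at equilibrium the tissue-to-capillary-plasma concentration ratio approaches the partition coefficient. Both properties give a quick sanity check on the mass balance.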
Table 1
Model variables and concentrations
Variable   Description                                                   Calculation
APlasF     Amount of PFOS in the plasma (mg)                             State variable
AP:A       Amount of PFOS-bound albumin in plasma (mol)                  State variable
ARBC       Amount of PFOS in the red blood cells (mg)                    State variable
AU         Amount of PFOS in the urine (mg)                              State variable
AR         Amount of PFOS in the rest of body tissue (mg)                State variable
ARP        Amount of PFOS in the rest of body plasma (mg)                State variable
AUGI       Amount of PFOS in the upper gastrointestinal tract (mg)       State variable
ALGI       Amount of PFOS in the lower gastrointestinal tract (mg)       State variable
AF         Amount of PFOS in the feces (mg)                              State variable
AL         Amount of PFOS in the liver tissue (mg)                       State variable
ALP        Amount of PFOS in the liver plasma (mg)                       State variable
CPlas      Total concentration of PFOS in plasma (mg/L)                  [illegible]
CPlasF     Free concentration of PFOS in plasma (mg/L)                   CPlasF = CPlas - CP:A × MWP × 1000
CAP        Free concentration of albumin in plasma (M)                   CAP = CA - CP:A
CP:A       Concentration of PFOS-bound albumin in plasma (M)             [illegible]
CRBC       Concentration of PFOS in red blood cells (mg/L)               [illegible]
CR         Concentration of PFOS in rest of body tissue (mg/L)           CR = [illegible]
CRP        Concentration of PFOS in rest of body plasma (mg/L)           CRP = ARP/[illegible]
CL         Concentration of PFOS in liver tissue (mg/L)                  CL = AL/VL
CLP        Concentration of PFOS in liver plasma (mg/L)                  [illegible]
-------
L.A. Harris, Hugh A. Barton / Toxicology Letters 181 (2008) 148-156
Table 2
Physiological parameters defined for the PFOS PBPK model

Parameter  Description                                     Value        Source
BW         Rat body weight                                 0.288 kg     Johnson et al. (1979a)
VPlas      Volume of plasma                                0.0090 L     Johnson et al. (1979a)
VRBC       Volume of red blood cells                       0.0076 L     Johnson et al. (1979a)
VBld       Volume of blood                                 0.0166 L     Calculated as VPlas + VRBC
fPlas      Fraction of plasma in blood                     0.5434       Calculated as VPlas/(VPlas + VRBC)
VL         Volume of liver                                 0.0105 L     Brown et al. (1997)
VLP        Volume of liver plasma                          0.0012 L     Brown et al. (1997)
VLT        Volume of liver tissue                          0.0093 L     Calculated as VL - VLP
VRoB       Volume of rest of body                          0.2090 L     Calculated as 0.82(BW) - VBld - VL
VRoBP      Volume of rest of body plasma                   0.0045 L     Brown et al. (1997)
VRoBT      Volume of rest of body tissue                   0.2045 L     Calculated as VRoB - VRoBP
Qc         Blood flow rate by the heart (cardiac output)   5.5432 L/h   Brown et al. (1997)
QPlas      Plasma flow rate by the heart                   3.0122 L/h   Calculated as fPlas Qc
QK         Plasma flow rate to the kidneys                 0.4247 L/h   Brown et al. (1997)
QL         Plasma flow rate to the liver                   0.5512 L/h   Brown et al. (1997)
QR         Plasma flow rate to the rest of body            2.5875 L/h   Calculated as QPlas - QL
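
The "Calculated as" entries in Table 2 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the constant and key names are chosen here and are not taken from the authors' MATLAB code. Because the tabulated volumes are rounded to four decimals, recomputing fPlas from them gives about 0.542 rather than the tabulated 0.5434, so the tabulated fraction is used for the plasma flow.

```python
# Values copied from Table 2 (rat physiology). Names are illustrative.
BW = 0.288            # body weight, kg
V_PLAS = 0.0090       # plasma volume, L
V_RBC = 0.0076        # red blood cell volume, L
V_L = 0.0105          # liver volume, L
V_LP = 0.0012         # liver plasma volume, L
V_ROBP = 0.0045       # rest of body plasma volume, L
Q_C = 5.5432          # cardiac output, L/h
F_PLAS_TABLE = 0.5434 # plasma fraction of blood as printed in Table 2


def derived_parameters():
    """Recompute the derived rows of Table 2 from the measured rows."""
    v_bld = V_PLAS + V_RBC                # blood volume
    f_plas = V_PLAS / v_bld               # plasma fraction from rounded volumes
    v_lt = V_L - V_LP                     # liver tissue volume
    v_rob = 0.82 * BW - v_bld - V_L       # rest of body volume
    v_robt = v_rob - V_ROBP               # rest of body tissue volume
    q_plas = F_PLAS_TABLE * Q_C           # plasma flow, using the tabulated fraction
    return {
        "V_Bld": v_bld,
        "f_Plas": f_plas,
        "V_LT": v_lt,
        "V_RoB": v_rob,
        "V_RoBT": v_robt,
        "Q_Plas": q_plas,
    }


if __name__ == "__main__":
    for name, value in derived_parameters().items():
        print(f"{name} = {value:.4f}")
```
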
2.2.  Blood compartment

    The blood has sub-compartments for the plasma and red blood cells. The equa-
tions representing the blood compartment describe the transport of PFOS through
the systemic plasma to/from the tissue compartments, the binding of PFOS to the
plasma protein albumin, the diffusion of PFOS into the red blood cells, and the elim-
ination of PFOS into the urine. The differential equation for the amount of PFOS in
the plasma is given by
\[
\frac{dA_{Plas}}{dt} = Q_L (C_{LP} - C_{Plas}) + Q_R (C_{RP} - C_{Plas})
  + PA_{RBC} \left( \frac{C_{RBC}}{P_{RBC}} - C_{PlasF} \right) - Cl_{ur} C_{PlasF}
\]
The first two terms describe the transport of PFOS through the plasma to/from the
liver and the rest of the body, respectively; the third term represents the diffusion
of PFOS across the red blood cell membrane; and the fourth term describes uri-
nary clearance as a first-order process removing PFOS from the arterial plasma.
This simplified description of kidney filtration is commonly used in PBPK mod-
eling when the kidney is lumped together with the other tissues  in the rest of
body compartment (Krishnan and Andersen,  1994; Nestorov, 2003). In the model
the urinary clearance rate is described as a fraction of the kidney plasma flow rate
(Clur = fur QK).
    It has been shown that PFOS binds to the albumin in the blood (Jones et al., 2003).
This affects the availability of PFOS to distribute to the other tissue compartments
(Mendel, 1992). Here we assume that the total plasma concentrations of PFOS, CPlas,
are available to distribute to the liver and the rest of body compartments, while
only the free concentrations of PFOS in the plasma (CPlasF = CPlas - CP:A x MWP x 1000)
are available to distribute to the red blood cells. The differential equation for the
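                                                amount of PFOS bound to albumin in the plasma is given by

                                                \[
                                                \frac{dA_{P:A}}{dt} = V_{Plas} \left( k_{on} \, \frac{C_{PlasF}}{1000 \, MW_P} \, C_{AF} - k_{off} \, C_{P:A} \right)
                                                \]

                                                where CAF = CA - CP:A is the free concentration of albumin available to bind to PFOS
                                                in the plasma.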
                                                   The differential equation for the amount of PFOS in the red blood cells is given
                                                by

                                                \[
                                                \frac{dA_{RBC}}{dt} = PA_{RBC} \left( C_{PlasF} - \frac{C_{RBC}}{P_{RBC}} \right)
                                                \]

                                                where PARBC is the permeability cross-product for the red blood cells, PRBC is the
                                                RBC-to-plasma partition coefficient, and CRBC/PRBC is the concentration of PFOS free
                                                to leave the red blood cells.
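
As a minimal sketch of the two relations used in this section, the functions below convert total to free plasma concentration and evaluate the red blood cell exchange term. The function names are hypothetical, and the exchange term assumes the form PARBC (CPlasF - CRBC/PRBC) implied by the definitions above. The factor 1000 converts g/mol to mg/mol so that the bound molar concentration is expressed in mg/L.

```python
MW_P = 499.12  # molecular weight of PFOS, g/mol (Table 3)


def free_plasma_conc(c_plas_mg_per_l, c_pa_molar, mw_p=MW_P):
    """Free PFOS in plasma (mg/L): total minus albumin-bound PFOS.

    c_pa_molar is the PFOS-albumin complex concentration in mol/L.
    """
    return c_plas_mg_per_l - c_pa_molar * mw_p * 1000.0


def rbc_exchange_rate(pa_rbc, c_plas_free, c_rbc, p_rbc):
    """Net uptake of PFOS into red blood cells (mg/h).

    Positive when free plasma PFOS exceeds the PFOS free to leave the cells.
    """
    return pa_rbc * (c_plas_free - c_rbc / p_rbc)


if __name__ == "__main__":
    c_free = free_plasma_conc(5.0, 1.0e-5)
    print(f"free plasma concentration: {c_free:.4f} mg/L")
    print(f"RBC exchange rate: {rbc_exchange_rate(0.002, c_free, 0.0, 801.3):.3e} mg/h")
```
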


                                                2.3.  Liver compartment

                                                   The diffusion-limited compartment representing the liver has sub-
                                                compartments for the liver plasma and the liver tissue. The differential equation
                                                for the amount of PFOS in the liver plasma is given by
                                                \[
                                                \frac{dA_{LP}}{dt} = Q_L (C_{Plas} - C_{LP}) + PA_{Liv} \left( \frac{C_L}{P_L} - C_{LP} \right) + k_{au} A_{UGI} + k_{al} A_{LGI}
                                                \]
                                                The first two terms describe the transport of PFOS to/from the systemic plasma
                                                and diffusional transfer between the plasma and tissue, while the last two terms
                                                describe the distribution of PFOS to the liver from the upper and lower sections of
Table 3
Biochemical and kinetic parameters defined for the PFOS PBPK model

Parameter  Description                                    Value                  Source
kau        Rate of absorption from the upper GI tract     0.114 h^-1             Estimated
kal        Rate of absorption from the lower GI tract     1.0 h^-1               Estimated
kul        Rate of transfer from upper-lower GI tract     0.01 h^-1              Estimated
kb         Maximum rate of biliary elimination            8.0 h^-1               Estimated
a          Biliary elimination rate of decrease           5.0 h^-1               Estimated
fur        Fraction of urinary clearance                  0.28                   Estimated
kf         Rate of fecal elimination                      5.0 h^-1               Estimated
PARBC      Red blood cells permeability cross-product     0.002(BW)^0.75 L/h     Estimated
PARoB      Rest of body permeability cross-product        0.001(BW)^0.75 L/h     Estimated
PALiv      Liver permeability cross-product               0.00025(BW)^0.75 L/h   Estimated
Kd         Equilibrium disassociation constant            10^-7 M                Estimated
kon        Association rate for PFOS-albumin              10^5 M^-1 s^-1         Estimated
koff       Disassociation rate for PFOS-albumin           10^-2 s^-1             Calculated as Kd kon
MWA        Molecular weight of albumin                    66,000 g/mol           Seshagiri and Adiga (1989)
MWP        Molecular weight of PFOS-                      499.12 g/mol           3M (2001), PubChem (2008)
MWP:A      Molecular weight of P:A complex                66,499.12 g/mol        Calculated as MWA + MWP
N          Number of binding sites                        1                      Assumption
CA         Total concentration of albumin in plasma       0.00041 M              Teeguarden and Barton (2004)
BMax       Maximum binding capacity                       0.00041 M              Calculated as N x CA
PL         Liver-to-plasma partition coefficient          8.66                   Estimated
PRBC       RBC-to-plasma partition coefficient            801.3                  Estimated
PRoB       Rest of body-to-plasma partition coefficient   0.47                   Estimated
-------
the gastrointestinal tract due to oral absorption and enterohepatic recirculation. The
differential equation for the amount of PFOS in the liver tissue is given by
\[
\frac{dA_L}{dt} = PA_{Liv} \left( C_{LP} - \frac{C_L}{P_L} \right) - k_b(t) A_L
\]
In the first term, PALiv represents the liver permeability cross-product, PL represents
the liver-to-plasma partition coefficient, and CL/PL represents the concentration of
PFOS free to leave the liver tissue. The second term represents the biliary elimination
of PFOS from the hepatocytes, the cells of the liver tissue that produce bile,
to the lower gastrointestinal tract. A time-dependent biliary elimination rate that
decreases with time, kb(t) = kb/(at + 1), was used to capture the elimination dynamics
of PFOS in feces. The parameter a is a measure of the rate of decrease (see Table 3).
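
To make the time dependence concrete, the sketch below evaluates the biliary rate with the Table 3 values, assuming the form kb(t) = kb/(at + 1) with t in hours. The function name is illustrative and not taken from the original implementation.

```python
K_B = 8.0  # maximum rate of biliary elimination, 1/h (Table 3)
A = 5.0    # rate of decrease, 1/h (Table 3)


def biliary_rate(t_hours, k_b=K_B, a=A):
    """Biliary elimination rate (1/h) at time t, assuming k_b / (a*t + 1)."""
    return k_b / (a * t_hours + 1.0)


if __name__ == "__main__":
    # The rate starts at k_b and falls quickly over the first day.
    for t in (0.0, 1.0, 24.0, 24.0 * 89):
        print(f"t = {t:7.1f} h  k_b(t) = {biliary_rate(t):.5f} 1/h")
```

With these values the rate falls to one sixth of its maximum within the first hour, which is how the model keeps liver amounts high while fecal elimination declines.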

2.4. Rest of body compartment

   The rest of body compartment represents the rest of the organs and tissues in
the body not included in the blood and liver compartments. It is a diffusion-limited
compartment that has sub-compartments representing the plasma and tissue cells
in the remaining organs and tissues. The differential equations for the amounts of
PFOS in the rest of body plasma and the rest of body tissue cells are given by
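\[
\frac{dA_{RP}}{dt} = Q_R (C_{Plas} - C_{RP}) + PA_{RoB} \left( \frac{C_R}{P_{RoB}} - C_{RP} \right)
\]
\[
\frac{dA_R}{dt} = PA_{RoB} \left( C_{RP} - \frac{C_R}{P_{RoB}} \right)
\]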
As in the blood and the liver compartments, these equations describe the transport of
PFOS to/from the systemic plasma and diffusion transfer across the cell membrane.
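
The same two-term pattern (plasma flow plus membrane diffusion) applies to every diffusion-limited compartment in the model. A generic sketch, with hypothetical function and argument names, is given below. It assumes concentrations are amounts divided by the sub-compartment volumes and omits compartment-specific terms such as biliary elimination or gastrointestinal input.

```python
def diffusion_limited_rates(q, pa, p, c_plas, a_plasma, a_tissue, v_plasma, v_tissue):
    """Rates of change (mg/h) for the plasma and tissue sub-compartments.

    q        plasma flow to the compartment (L/h)
    pa       permeability cross-product (L/h)
    p        tissue-to-plasma partition coefficient
    c_plas   systemic plasma concentration entering the compartment (mg/L)
    """
    c_p = a_plasma / v_plasma        # concentration in compartment plasma
    c_t = a_tissue / v_tissue        # concentration in tissue cells
    exchange = pa * (c_t / p - c_p)  # net diffusion from tissue into plasma
    d_plasma = q * (c_plas - c_p) + exchange
    d_tissue = -exchange
    return d_plasma, d_tissue


if __name__ == "__main__":
    # Rest of body values from Tables 2 and 3, empty compartment, 1 mg/L inflow.
    pa_rob = 0.001 * 0.288 ** 0.75
    print(diffusion_limited_rates(2.5875, pa_rob, 0.47, 1.0, 0.0, 0.0, 0.0045, 0.2045))
```
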

2.5. Absorption from the gastrointestinal (GI) tract

   A two-compartment model is used to describe the uptake of PFOS into gastroin-
testinal tract following oral administration, absorption of PFOS into the liver via the
portal venous system, and elimination of PFOS in the feces. The differential equation
for the amount of PFOS in the upper gastrointestinal tract is given by
\[
\frac{dA_{UGI}}{dt} = -k_{au} A_{UGI} - k_{ul} A_{UGI}
\]
The first term represents the transfer of PFOS to the liver due to absorption via the
portal venous system and the second term represents the transfer of PFOS to the
lower gastrointestinal tract. The differential equation for the amount of PFOS in the
lower gastrointestinal tract is given by
\[
\frac{dA_{LGI}}{dt} = k_{ul} A_{UGI} + k_b(t) A_L - k_{al} A_{LGI} - k_f A_{LGI}
\]
The first two terms represent input from the upper gastrointestinal tract and from
the liver tissue via the bile; the third term describes the transfer of PFOS to the liver;
and the fourth term describes fecal elimination.
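
The term-by-term description above can be written as a small rate function. This is a sketch under stated assumptions: the name k_ul for the upper-to-lower transfer rate and the function name are chosen here for illustration, the biliary rate is assumed to follow kb/(at + 1), and the parameter values are those of Table 3.

```python
K_AU = 0.114  # absorption from upper GI tract, 1/h
K_AL = 1.0    # absorption from lower GI tract, 1/h
K_UL = 0.01   # transfer from upper to lower GI tract, 1/h
K_F = 5.0     # fecal elimination, 1/h
K_B = 8.0     # maximum biliary elimination rate, 1/h
A = 5.0       # biliary rate of decrease, 1/h


def gi_rates(t_hours, a_ugi, a_lgi, a_liver):
    """Rates of change (mg/h) of PFOS in the upper and lower GI tract."""
    k_b_t = K_B / (A * t_hours + 1.0)  # time-dependent biliary rate
    d_ugi = -K_AU * a_ugi - K_UL * a_ugi
    d_lgi = K_UL * a_ugi + k_b_t * a_liver - K_AL * a_lgi - K_F * a_lgi
    return d_ugi, d_lgi


if __name__ == "__main__":
    # 1 mg in the upper GI tract at t = 0, nothing elsewhere.
    print(gi_rates(0.0, 1.0, 0.0, 0.0))
```
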

2.6. Routes of elimination

   The model describes two primary routes for the elimination of PFOS from the
body: urine and feces. The differential equations for the amounts of PFOS in the
urine and feces are given by

\[
\frac{dA_U}{dt} = Cl_{ur} C_{PlasF}
\]
                                                                       daily body weights and food consumption levels needed to estimate daily oral doses
                                                                       for each of the dose groups. Although rats eat episodically during the 12-h dark cycle
                                                                       (Yuan, 1993), the relatively slow clearance of PFOS makes it reasonable to treat the
                                                                       dietary exposure as a daily bolus oral dose. The PBPK model treats a single, oral dose
                                                                       as an initial amount in the upper gastrointestinal tract, so an iterative process of
                                                                       starting and stopping the model each day, adding the new daily dose to the current
                                                                       amount in the upper gastrointestinal tract, was used to simulate these chronic dietary
                                                                       exposures.
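
The stop, add dose, restart procedure can be illustrated on the upper gastrointestinal compartment alone. The sketch below is not the full PBPK model: it assumes only first-order loss from the upper GI tract at the combined Table 3 rates (0.114 + 0.01 per hour), for which the amount after 24 h has a closed form, and the function name and dose values are hypothetical.

```python
import math

K_AU = 0.114  # absorption from upper GI tract, 1/h (Table 3)
K_UL = 0.01   # transfer to lower GI tract, 1/h (Table 3)


def simulate_daily_doses(daily_doses_mg):
    """Return the upper GI amount (mg) right after each daily bolus dose."""
    a_ugi = 0.0
    amounts_after_dose = []
    for dose in daily_doses_mg:
        a_ugi += dose  # restart the day with the new dose added
        amounts_after_dose.append(a_ugi)
        # run the (here analytical) model for 24 h, then stop
        a_ugi *= math.exp(-(K_AU + K_UL) * 24.0)
    return amounts_after_dose


if __name__ == "__main__":
    # Hypothetical daily doses, e.g. interpolated from weekly intake data.
    print(simulate_daily_doses([0.010, 0.011, 0.012]))
```

In the full model the same loop would carry every state variable across the daily restart, not only the upper GI amount.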

                                                                       2.8.  Model implementation and parameterization

                                                                          The model was implemented and parameterized using the software MATLAB
                                                                       (The MathWorks, Inc., Natick, MA). Organ-specific and species-specific parameters
                                                                       (e.g. cardiac output, plasma flow rates, and organ volumes) obtained from the lit-
                                                                       erature are given in Table 2. An intravenous data set (Johnson et al., 1979a) was
                                                                       used to calculate the tissue-to-plasma partition coefficients, PRBC, PL, and PRoB. This
                                                                       study reported that at 89 days following a single intravenous dose of 4.2 mg/kg of
                                                                       potassium PFOS, 42.8% of the dose had been excreted in urine and feces and 25.21%,
                                                                       2.81%, and  0.47% of the dose remained  in the liver, plasma, and red blood cells,
                                                                       respectively. Assuming that the distributional pseudo-equilibrium between the tis-
                                                                       sues and plasma was attained at 89 days (Lam et al., 1982; Khor and Mayersohn,
                                                                       1991), this data was used to estimate each partition coefficient as the ratio of tissue
                                                                       concentration to plasma concentration at 89 days (see Table 3). In this model, the
                                                                       total plasma concentration of PFOS at  89 days estimated by Johnson et al. (1979a)
                                                                       was used to estimate PL and PRoB; however, it is assumed that only the free plasma
                                                                       concentration of PFOS is available to distribute to the red blood cells. Therefore PRBC was
                                                                       estimated as the ratio of the red blood cell concentration of PFOS to the free plasma
                                                                       concentration of PFOS at 89 days. Here the free plasma concentration of PFOS at 89
                                                                       days is calculated by solving the binding equation T = (BMax F)/(Kd + F) + F for F, using
                                                                       the total plasma concentration at 89 days, T, as estimated by Johnson et al. (1979a).
                                                                       The value of PRBC estimated here may be an upper bound in light of recent measurements
                                                                       with human blood that did not detect PFOS in red blood cells (Ehresman et
                                                                       al., 2007).
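
The partition coefficient estimates can be approximated from the numbers quoted above. The sketch below is an illustration, not the authors' calculation: it assumes the dose is 4.2 mg/kg times the 0.288 kg body weight, that liver concentration is based on the liver tissue volume VLT, and that the binding equation T = BMax F/(Kd + F) + F is solved as a quadratic in F. Under these assumptions the results come out close to, though not identical with, the Table 3 values of 8.66 and 801.3.

```python
import math

BW = 0.288             # kg (Table 2)
DOSE_MG_PER_KG = 4.2   # single intravenous dose
V_LT = 0.0093          # liver tissue volume, L (assumed basis for liver concentration)
V_PLAS = 0.0090        # plasma volume, L
V_RBC = 0.0076         # red blood cell volume, L
MW_P = 499.12          # g/mol
K_D = 1.0e-7           # M
B_MAX = 0.00041        # M
FRACTION_OF_DOSE = {"liver": 0.2521, "plasma": 0.0281, "rbc": 0.0047}  # day 89


def free_concentration(total_molar, b_max=B_MAX, k_d=K_D):
    """Solve T = b_max*F/(k_d + F) + F for the non-negative root F (mol/L).

    Rearranged: F^2 + (b_max + k_d - T)*F - k_d*T = 0. The root is written
    in a form that avoids subtracting two nearly equal numbers.
    """
    b = b_max + k_d - total_molar
    disc = b * b + 4.0 * k_d * total_molar
    return 2.0 * k_d * total_molar / (b + math.sqrt(disc))


def partition_coefficients():
    """Return (P_L, P_RBC) estimated from the day-89 tissue distribution."""
    dose_mg = DOSE_MG_PER_KG * BW
    c_liver = FRACTION_OF_DOSE["liver"] * dose_mg / V_LT
    c_plas = FRACTION_OF_DOSE["plasma"] * dose_mg / V_PLAS
    c_rbc = FRACTION_OF_DOSE["rbc"] * dose_mg / V_RBC
    free_molar = free_concentration(c_plas / (MW_P * 1000.0))
    c_plas_free = free_molar * MW_P * 1000.0  # back to mg/L
    return c_liver / c_plas, c_rbc / c_plas_free


if __name__ == "__main__":
    p_l, p_rbc = partition_coefficients()
    print(f"P_L ~ {p_l:.2f}, P_RBC ~ {p_rbc:.1f}")
```
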
                                                                          The remaining  chemical-specific  parameters were estimated by  fitting  the
                                                                       model to experimental data using a two-step process. In the first step, a sub-model
                                                                       consisting of all of the model compartments except the upper gastrointestinal tract
                                                                       was fit to the  data in the intravenous study (Johnson et al.,  1979a).  The upper
                                                                       gastrointestinal tract was excluded from the model in this step because this com-
                                                                       partment is only needed to simulate the pharmacokinetics of PFOS following oral
                                                                       uptake. Therefore the parameters associated with this compartment, kau and kul, do
                                                                       not contribute to the distribution of PFOS following intravenous administration.
                                                                       In the second step of the parameter estimation process, the parameters obtained
                                                                       from the first step were fixed and used to estimate kau and kul by fitting the entire
                                                                       model to an oral gavage data set (Johnson et al., 1979b). Refer to Table 3 for a list
                                                                       of the resulting parameter values obtained from fitting the model to experimental
                                                                       data.


                                                                       3.  Results

                                                                          The PBPK model described in this paper was used to predict
                                                                       the pharmacokinetics of PFOS  in rats following intravenous, oral,
                                                                       and chronic dietary exposures. Model simulations tracking the time
                                                                       courses of PFOS concentrations in the liver, plasma, red blood cells,
                                                                       urine, and feces are  compared to experimental data to determine
                                                                       whether single-dose PFOS kinetics are predictive of repeated-dose
                                                                       kinetics.
\[
\frac{dA_F}{dt} = k_f A_{LGI}
\]
2.7. Dietary modeling

   The model was used to simulate a 104-week dietary study reported by the 3M
Company (2001, 2002). In the study, male and female Sprague-Dawley rats were
assigned to six dose groups (40-70 rats/sex/dose level) and were fed diets containing
potassium PFOS concentrations of 0 ppm, 0.5 ppm, 2 ppm, 5 ppm, or 20 ppm. This
paper will not address Dose Group 1 (the control group given 0 ppm) and Dose
Group 6 (a recovery group given 20 ppm). The intent of the 104-week study was
to analyze the sub-chronic and chronic effects of PFOS exposure in rats. To study
the sub-chronic effects of PFOS exposure at weeks 4 and  14, a small group of rats
(5 rats/sex/dose level/sacrifice) were added to each dose group. These results were
reported by Seacat et al. (2003a,b). Over the 104-week period of the study, body
weight changes, food consumption levels, and mean daily intake levels (mg/kg/day)
were measured consistently. These measurements were obtained weekly for the first
16 weeks and then monthly for the remainder of the study.  In order to use the model
to simulate these daily dietary exposures, linear interpolation was used to obtain
                                                                          3.1. Single-dose kinetics

                                                                              Two  single-dose  pharmacokinetic studies  (Johnson  et al.,
                                                                          1979a,b) were used to estimate model parameters and determine
                                                                          whether the model is capable of accurately predicting PFOS kinetics
                                                                          following acute exposures. Results from simulating a single intra-
                                                                          venous dose of 4.2 mg/kg of potassium PFOS  in  six male rats are
                                                                          shown in Fig. 2. The graphs compare model simulations to the data
                                                                          to illustrate the ability of the model to replicate the distribution of
                                                                          PFOS to the liver, plasma, red blood cells, urine, and feces for 90 days
                                                                          following exposure. The results from the intravenous data set suggest
                                                                          that the elimination of PFOS in feces decreases over time while
                                                                          liver amounts remain high. In order to simulate this behavior, the
                                                                          biliary elimination rate was treated as a decreasing function of time.
                                                                          In the model, a mean body weight of 288 g, the initial body weight of
                                                                          the rats in the Johnson et al. (1979a) study, was used as an estimate
-------
[Figure 2: panel (a) x-axis: Time (days); panel (b) tissues: Plasma, Liver.]
Fig. 2. Intravenous exposure in male rats. Model simulations are compared to experimental data for a single intravenous dose of 4.2 mg/kg of potassium PFOS in male rats
(Johnson et al., 1979a). (a) Model simulations (solid and dashed curves) track the cumulative percent of dose in urine and feces over time. (b) Model simulations predict the
percent of dose in tissues on day 89.
of the body weight for the 90-day period following exposure. It is
possible that the body weights may have increased to more than
500 g during this time (Charles River Laboratories, 2008); however,
body weight changes were not reported in the Johnson et al. (1979a)
study. Using body weight changes for an equivalent 90-day period
as reported in a 2-year study (3M Company, 2002) and using the
parameters listed in Tables 2-3, the model is able to reproduce the
results of the IV study shown in Fig. 2. Because similar results were
obtained with a fixed body weight and with a body weight that
    increased over time, and because the extent of the body weight changes in
    the Johnson et al. (1979a) study is unknown, the parameter values
    obtained using a fixed body weight are reported here. The model was also
    used to simulate a single oral dose of 4.2 mg/kg of potassium PFOS
    in three male rats. These results are shown in Fig. 3. The graphs
    show that while the model is able to characterize the elimination
    of PFOS in urine over time, it was difficult to fit the model to the
    plasma and feces time courses using the limited number of data
    values.

[Figure 3: x-axis: Time (hrs); legend entries: Data: Urine, Model: Feces, Data: Feces, Data: Plasma.]
Fig. 3. Oral administration in male rats. Model simulations (solid and dashed curves) are compared to experimental data for a single oral gavage dose of 4.2 mg/kg of potassium
PFOS in male rats (Johnson et al., 1979b).
-------
Table 4
Model parameter values used to reflect time-dependent and dose-dependent
changes in urinary elimination and liver partitioning during chronic dietary
exposures in male rats

Parameter                        Value
fur                              0.01
b (high-dose)                    0.3
b (mid-dose and mid-high dose)   0.6
b (low-dose)                     0.9
r (all doses) (h)                3360
c (all doses)                    7
3.2.  Repeated-dose kinetics
   To capture the biological changes observed in the plasma and
liver PFOS concentration data, the urinary elimination rate constant,
fur, and the liver-to-plasma partition coefficient, PL, were
treated as functions of time. To account for the decreasing plasma
levels, the following Hill function is used to describe the urinary
elimination rate constant as an increasing function of time:
\[
f_{ur}(t) = f_{ur} + \frac{b \, t^4}{t^4 + r^4}
\]
                                                                  where fur(t) has an initial value of fur and increases to a maximum
                                                                  value of fur + b, and r is a measure of the rate of increase. Similarly,
                                                                  to account for decreasing liver concentrations, the liver-to-plasma
                                                                  partition coefficient is described as a decreasing function of time
                                                                  in the following way:
   Using the parameters in Table 3 that were estimated from the
intravenous and oral data sets and a smaller fraction of urinary
clearance, fur (see Table 4), the model was used to predict PFOS
liver and plasma levels in male rats for 104 weeks of daily dietary
exposures. Results from simulating Dose Groups 2-5 of the study
are shown in Fig. 4. The graphs  suggest that the model predicts
liver and plasma concentrations reasonably well in earlier weeks.
However, decreasing plasma and liver concentrations in the data
between weeks 14 and 53 suggest that there are unknown biologi-
cal changes that cause increased elimination of PFOS from the body
and changes in the liver partitioning (see Fig. 4d). These biological
changes make it impossible for the model to predict PFOS kinetics
in later weeks using the same set of parameter values.
                                                                     The parameter r in the equations for fur(t) and PL(t) was chosen
                                                                   in such a way that the changes in the urinary elimination and
                                                                   liver partitioning would occur between weeks 14 and 53. Refer to
                                                                   Tables 4-5 for a list of the parameter values used to reflect these
                                                                   changes during chronic dietary exposures in male and female rats.
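
A short sketch of the time-dependent urinary fraction, using the male-rat values in Table 4, shows why r = 3360 h places the change between weeks 14 and 53. It assumes the Hill function has the form fur + b t^4/(t^4 + r^4), which matches the stated initial value fur, maximum fur + b, and role of r; the function name is illustrative. Since 3360 h is 20 weeks, the increase is half complete at week 20.

```python
F_UR0 = 0.01            # initial fraction of urinary clearance (Table 4)
R_HOURS = 3360.0        # Hill parameter r, h (Table 4); equals 20 weeks
B_BY_DOSE = {"low": 0.9, "mid": 0.6, "mid-high": 0.6, "high": 0.3}  # Table 4
HOURS_PER_WEEK = 168.0


def f_ur(t_hours, b, f_ur0=F_UR0, r=R_HOURS):
    """Urinary clearance fraction at time t, assuming a Hill exponent of 4."""
    t4 = t_hours ** 4
    return f_ur0 + b * t4 / (t4 + r ** 4)


if __name__ == "__main__":
    for week in (0, 14, 20, 53, 104):
        value = f_ur(week * HOURS_PER_WEEK, B_BY_DOSE["high"])
        print(f"week {week:3d}: f_ur = {value:.3f}")
```
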
                                                                     Results from simulating these time-dependent changes in males
                                                                   are shown in Figs. 5-7. These figures illustrate that the changes
                                                                   needed to accurately predict PFOS plasma levels were dependent
                                                                   on the dose level. The high-dose group (20 ppm) required less of an
                                                                   increase in fur than the low (0.5 ppm), mid- (2 ppm), and mid-high
                                                                   (5 ppm) dose groups. For the high-dose group (see Fig. 5), the frac-
[Figure 4: panels (a) to (d), each titled "104 Week Dietary Study - Male"; x-axis: Weeks.]
Fig. 4. Chronic dietary exposures in male rats. Model simulations (solid curves), without time-dependent changes in the urinary elimination and liver partitioning, are
compared to data from a 104-week dietary study reported by 3M Company (2002) in which male rats were fed diets containing (a) 0.5 ppm, (b) 2 ppm, (c) 5 ppm, and (d)
20 ppm of potassium PFOS.

-------
L.A. Harris, Hugh A. Barton / Toxicology Letters 181 (2008) 148-156
[Figure 5: two panels titled "104 Week Dietary Study - Male"; x-axis: Weeks.]
Fig. 5. Time-dependent changes in urinary elimination and liver partitioning during
chronic (high-dose) dietary exposures in male rats. Model simulations (solid curves)
are compared to data for the high-dose group (20 ppm).

Table 5
Model parameter values used to reflect time-dependent changes in urinary
elimination and liver partitioning during chronic dietary exposures in female rats

Parameter                                  Value
f_ur                                       0.01
b (low-dose, mid-dose, mid-high-dose)      0.3
r (all doses) (h)                          3360
c (all doses)                              7

tion of urinary clearance increases over time to a maximum value
of f_ur + b = 0.31. Fig. 6 shows that the mid- and mid-high-dose levels
required a similar level of increase in f_ur. For these dose groups,
the fraction of urinary clearance increases over time to a maximum
value of f_ur + b = 0.61. In order to predict the plasma levels in
the low-dose group, an even larger fraction of urinary clearance is
required (see Fig. 7). In the low-dose group, the fraction of urinary
elimination increases over time to a maximum value of f_ur + b = 0.91.
   Time-dependent changes in the urinary elimination and liver
partitioning were also required to simulate chronic dietary exposures
in female rats for the low-, mid-, and mid-high-dose groups.
However, the changes in f_ur were not dose-dependent as in the male
rats. For each of these dose groups, the fraction of urinary clearance
was increased over time to a maximum value of f_ur + b = 0.31. For
the high-dose group, plasma levels reached pseudo-steady state (as
in the data) and, hence, only time-dependent changes in the liver
partitioning were required. Results from simulating these
time-dependent changes for all of the dose groups are shown in Fig. 8.

Fig. 7. Time-dependent changes in urinary elimination and liver partitioning during
chronic (low-dose) dietary exposures in male rats. Model simulations (solid curves)
are compared to data for the low-dose group (0.5 ppm).

   The results in Figs. 4-8 suggest that the urinary elimination rate
may be a function of the age of the animals, i.e., as the animals age
the urinary elimination rate increases. This provides some support
for the need to decrease the original estimate of f_ur from 0.28 to 0.01.
The original estimate of f_ur was calculated using an intravenous
study in which male rats were 8 weeks old at the onset of exposure,
approximately the age of the animals at the start of the dietary
study, so it might have been expected that the same value of f_ur
could be used.
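
The role of the parameters b, r, and c can be illustrated with a small sketch.
The functional form of the time dependence is not stated in this part of the
text, so the sigmoidal (Hill-type) curve below is an assumption, chosen only
because it rises from the baseline fraction to a maximum of baseline + b. The
function name `f_ur_t`, the reading of r as the time of half-maximal increase,
and the reading of c as a steepness exponent are likewise hypothetical. The
values of r and c are borrowed from Table 5 (female rats) purely for
illustration; the b values are those implied by the male-rat maxima quoted above.

```python
# Illustrative only: the Hill-type form is an assumption, not the authors' stated equation.
F_UR_BASE = 0.01   # baseline urinary fraction (revised estimate quoted in the text)
R_HOURS = 3360.0   # parameter r from Table 5 (h); ASSUMED to be the half-rise time
C_EXP = 7.0        # parameter c from Table 5; ASSUMED to be a steepness exponent

# Increments b implied by the maxima (0.31, 0.61, 0.91) quoted for male rats.
B_MALE = {
    "high dose (20 ppm)": 0.30,
    "mid and mid-high dose (2 and 5 ppm)": 0.60,
    "low dose (0.5 ppm)": 0.90,
}


def f_ur_t(t_hours, b, f_ur=F_UR_BASE, r=R_HOURS, c=C_EXP):
    """Hypothetical urinary fraction rising from f_ur toward f_ur + b over time."""
    if t_hours <= 0:
        return f_ur
    hill = t_hours**c / (r**c + t_hours**c)  # 0 at t = 0, 0.5 at t = r, tends to 1
    return f_ur + b * hill


if __name__ == "__main__":
    hours_per_week = 7 * 24
    for group, b in B_MALE.items():
        values = [f_ur_t(week * hours_per_week, b) for week in (0, 20, 104)]
        print(group, [round(v, 3) for v in values])
```

Under these assumptions the fraction sits halfway between baseline and maximum
at t = r (3360 h, or 20 weeks) and is effectively at its maximum by the end of
the 104-week study.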

4. Discussion

   The pharmacokinetics and resulting tissue dosimetry for
perfluoroalkyl acids have been under increasing investigation due to
the availability of measured blood concentrations in human
populations (Calafat et al., 2007; Olsen et al., 2005, 2008) and the
substantial differences in half-lives reported for different species
(summarized in Lau et al., 2007 and Trudel et al., 2008). In addition,
plasma concentrations of perfluoroalkyl acids have been measured
fairly frequently in the animal toxicity  studies facilitating toxi-
[Figure 6, panels (a) and (b): each titled "104 Week Dietary Study - Male"; x-axis: Weeks.]
Fig. 6. Time-dependent changes in urinary elimination and liver partitioning during chronic (mid-dose and mid-high dose) dietary exposures in male rats. Model simulations
(solid curves) are compared to data for the (a) mid-dose (2 ppm), and (b) mid-high dose group (5 ppm).

-------
[Figure 8, panels (a)-(d): each titled "104 Week Dietary Study - Female"; x-axis: Weeks.]
Fig. 8. Time-dependent changes in urinary elimination and liver partitioning during chronic dietary exposures of female rats. Model simulations (solid curves) are compared
to data from the 104-week dietary study in which female rats were fed diets containing (a) 0.5 ppm, (b) 2 ppm, (c) 5 ppm, and (d) 20 ppm of potassium PFOS. The high-dose
group only required changes in the liver partitioning.
cological evaluations based upon plasma concentrations, rather
than the typical analyses for environmental chemicals based upon
exposure doses. Measurements of PFOS concentrations in a 2-year
dietary study were a major focus of the analysis presented here (3M
Company, 2001, 2002). These analyses offer the potential to better
interpret human biomonitoring data if the pharmacokinetics are
reasonably well understood.
   In contrast to perfluorooctanoic acid (PFOA), for which fairly
robust pharmacokinetic data sets exist in several species (Kudo and
Kawashima, 2003), PFOS has been the subject of fewer investigations.
Data for PFOA and PFOS in monkeys were previously analyzed
using a model proposing saturable resorption in the kidneys, which
results in greater clearance at high plasma concentrations and slower
clearance at lower concentrations, and hence in a longer apparent
half-life (Andersen et al., 2006). However, pharmacokinetic
analyses for PFOA in rats have not demonstrated
concentration-dependent clearance (Wambaugh et al., submitted for publication).
Concentration-dependent clearance can result in apparent
discrepancies between single-dose pharmacokinetic studies, such as blood
time course data following intravenous or oral dosing, and repeated
dosing studies, particularly when the single-dose studies do not
involve an adequate range of concentrations. Such discrepancies
have led to proposals for altering pharmacokinetic parameters such
as the volume of distribution (Washburn et al., 2005).
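
The saturable-resorption argument can be made concrete with a minimal sketch.
It assumes, hypothetically, that the kidney filters the chemical at a constant
clearance and that a transporter returns it to plasma with Michaelis-Menten
kinetics, so that net clearance rises as plasma concentration rises. The
parameter names (`cl_filt`, `t_max`, `k_t`, `volume`) and their values are
invented for illustration and are not taken from Andersen et al. (2006).

```python
import math


def net_renal_clearance(conc, cl_filt=1.0, t_max=9.0, k_t=10.0):
    """Net clearance = filtration clearance minus saturable resorption clearance.

    Excretion rate = cl_filt * conc - t_max * conc / (k_t + conc); dividing by
    conc gives cl_filt - t_max / (k_t + conc). All values are hypothetical.
    """
    return cl_filt - t_max / (k_t + conc)


def apparent_half_life(conc, volume=1.0, **kwargs):
    """Apparent half-life ln(2) * V / CL at a given plasma concentration."""
    return math.log(2) * volume / net_renal_clearance(conc, **kwargs)


if __name__ == "__main__":
    for conc in (0.1, 1.0, 10.0, 100.0, 1000.0):
        print(conc, round(net_renal_clearance(conc), 3), round(apparent_half_life(conc), 2))
```

With these made-up values, resorption is close to saturation at high
concentrations, so clearance approaches the filtration value and the apparent
half-life shortens; at low concentrations most of the filtered chemical is
resorbed and the apparent half-life lengthens.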
   The modeling reported here indicates discrepancies among
different studies as well as an apparent need for time-dependent
changes in pharmacokinetic parameters for PFOS over the course of
a 2-year study in rats. Based upon the measured data in the study,
changes in body weight and food consumption levels over time
were modeled. To account for decreasing plasma and liver levels
in the data, time-dependent changes in urinary elimination and
liver partitioning were made. Other possible changes that were not
modeled in this effort include decreasing absorption over time and
increasing fecal elimination over time. However, the latter would
involve increasing biliary elimination, which essentially has been
shut down due to the decreased fecal elimination in the single-dose
intravenous study. Whether the necessary time-dependent
changes are a consequence of the PFOS exposure itself, the aging
of the animals, or some combination of the two is unclear. During
the writing of this manuscript, Tan et al. (2008) published a
related model for the disposition of perfluoroalkyl acids in rats and
monkeys. The authors similarly determined that time-dependent
changes were needed to accurately characterize the data. They had
similar difficulties fitting the oral PFOS data set and chose to
represent the data from the intravenous study and the oral study as
a single oral dose; their analyses did not include the dietary study
data analyzed here.
   Using only plasma concentrations, it is not possible to completely
distinguish between changes in tissue distribution and clearance,
so the time-dependent changes described here are hypotheses
requiring additional experimental investigation. Measurements of
daily PFOS urinary and fecal excretion, as well as plasma and
tissue concentrations during a repeated-dose study, would be useful
in testing these hypotheses. While the major gender differences
observed for PFOA are not observed with PFOS, there were modest
differences in the pharmacokinetic parameters required for
describing the two sexes in the chronic study. No
concentration-dependent effects were observed for the females, while time- and
concentration-dependent changes in clearance were utilized for
describing the chronic male rat plasma data. These results are
inconsistent with the saturable resorption hypothesis in that the
highest clearance is predicted for the lowest concentration, rather
than the highest. Some additional data on PFOS in blood, urine,
and feces have recently been presented (Chang S, Hart JA,
Ehresman DJ, Butenhoff JL, personal communication) that may provide
additional perspectives on the analyses presented here and by Tan
et al. (2008). Developing a clearer understanding of the
pharmacokinetics of PFOS will be beneficial for better interpreting the
relationship between blood concentrations measured in the animal
toxicity studies and those measured in humans.


Conflict of interest statement

   There  are no conflicts of interest.


Acknowledgement

   The authors would like to thank Jennifer Seed for useful com-
ments on this article.


References

3M Company, 2001. 104-Week dietary chronic toxicity study with perfluorooctane
   sulfonic acid potassium salt (PFOS T-6295) in rats. 3M Study Report, Covance
   Study No.: 6329-183. Analytical laboratory report on determination of the
   presence and concentration of perfluorooctanesulfonate (PFOS) in the serum of
   Sprague-Dawley rats exposed to potassium perfluorooctanesulfonate via gavage.
   3M Report No. FACT TOX-002, LRN-U2121.
3M Company, 2002. 104-Week dietary chronic toxicity and carcinogenicity study
   with perfluorooctane sulfonic acid potassium salt (PFOS; T-6295) in rats. Final
   Report. 3M Company, St. Paul, MN. January 2, 2002. U.S. EPA Docket AR-226-1051a.
Alexander, B.H., Olsen, G.W., 2007. Bladder cancer in perfluorooctanesulfonyl
   fluoride manufacturing workers. Ann. Epidemiol. 17, 471-478.
Andersen, M.E., Clewell III, H.J., Tan, Y.M., Butenhoff, J.L., Olsen, G.W., 2006.
   Pharmacokinetic modeling of saturable, renal resorption of perfluoroalkylacids in
   monkeys - probing the determinants of long plasma half-lives. Toxicology 227,
   156-164.
Brown, R.P., Delp, M.D., Lindstedt, S.L., Rhomberg, L.R., Beliles, R.P., 1997.
   Physiological parameter values for physiologically based pharmacokinetic models.
   Toxicol. Ind. Health 13, 407-484.
Calafat, A.M., Wong, L.Y., Kuklenyik, Z., Reidy, J.A., Needham, L.L., 2007.
   Polyfluoroalkyl chemicals in the U.S. population: data from the National Health and
   Nutrition Examination Survey (NHANES) 2003-2004 and comparisons with
   NHANES 1999-2000. Environ. Health Perspect. 115, 1596-1602.
Charles River Laboratories, 2008. Weight Chart for the Crl:CD(SD) Rat.
   http://www.criver.com/research_models_and_services/research_models/CD_IGS.html.
Ehresman, D.J., Froehlich, J.W., Olsen, G.W., Chang, S.C., Butenhoff, J.L., 2007.
   Comparison of human whole blood, plasma, and serum matrices for the determination
   of perfluorooctanesulfonate (PFOS), perfluorooctanoate (PFOA), and other
   fluorochemicals. Environ. Res. 103, 176-184.
Fei, C., McLaughlin, J.K., Tarone, R.E., Olsen, J., 2008. Fetal growth indicators and
   perfluorinated chemicals: a study in the Danish National Birth Cohort. Am. J.
   Epidemiol. 168, 66-72.
Gerlowski, L.E., Jain, R.K., 1983. Physiologically based pharmacokinetic modeling:
   principles and applications. J. Pharm. Sci. 72, 1103-1127.
Giesy, J.P., Kannan, K., 2001. Global distribution of perfluorooctane sulfonate in
   wildlife. Environ. Sci. Technol. 35, 1339-1342.
Johnson, J.D., Gibson, S.J., Ober, R.F., 1979a. Extent and route of excretion and
   tissue distribution of total carbon-14 in rats after a single intravenous dose of
   FC-95-14C. Project No. 8900310200, Riker Laboratories, Inc., St. Paul, MN. U.S.
   EPA Docket AR-226-0006.
Johnson, J.D., Gibson, S.J., Ober, R.F., 1979b. Absorption of FC-95-14C in rats after a
   single oral dose. Project No. 8900310200, Riker Laboratories, Inc., St. Paul, MN.
   U.S. EPA Docket AR-226-0007.
Johnson, J.D., Gibson, S.J., Ober, R.F., 1984. Cholestyramine-enhanced fecal
   elimination of carbon-14 in rats after the administration of ammonium [14C]
   perfluorooctanoate or potassium [14C] perfluorooctanesulfonate. Fund. Appl.
   Toxicol. 4, 972-976.
Jones, P.D., Hu, W., De Coen, W., Newsted, J.L., Giesy, J.P., 2003. Binding of
   perfluorinated fatty acids to serum proteins. Environ. Toxicol. Chem. 22, 2639-2649.
Khor, S.P., Mayersohn, M., 1991. Potential error in the measurement of tissue to blood
   distribution coefficients in physiological pharmacokinetic modeling: residual
   tissue blood. I. Theoretical considerations. Drug Metab. Dispos. 19, 478-485.
Krishnan, K., Andersen, M.E., 1994. Physiologically based pharmacokinetic modeling
   in toxicology. In: Hayes, A.W. (Ed.), Principles and Methods in Toxicology, 3rd ed.
   Raven Press, Ltd., New York, pp. 149-188.
Kudo, N., Kawashima, Y., 2003. Toxicity and toxicokinetics of perfluorooctanoic acid
   in humans and animals. J. Toxicol. Sci. 28, 49-57.
Lam, G., Chen, M., Chiou, W.L., 1982. Determination of tissue to blood partition
   coefficients in physiologically-based pharmacokinetic studies. J. Pharm. Sci. 71,
   454-456.
Lau, C., Butenhoff, J.L., Rogers, J.M., 2004. The developmental toxicity of
   perfluoroalkyl acids and their derivatives. Toxicol. Appl. Pharmacol. 198, 231-241.
Lau, C., Anitole, K., Hodes, C., Lai, D., Pfahles-Hutchens, A., Seed, J., 2007.
   Perfluoroalkyl acids: a review of monitoring and toxicological findings. Toxicol. Sci. 99,
   366-394.
Mendel, C.M., 1992. The free hormone hypothesis: distinction from the free hormone
   transport hypothesis. J. Androl. 13, 107-116.
Nestorov, I., 2003. Whole body pharmacokinetic models. Clin. Pharmacokinet. 42,
   883-908.
OECD, 2002. Co-Operation on Existing Chemicals: Hazard Assessment of
   Perfluorooctane Sulfonate (PFOS) and Its Salts. Joint Meeting of the Chemicals
   Committee and the Working Party of Chemicals, Pesticides and Biotechnology,
   Organisation for Economic Cooperation and Development.
Olsen, G.W., Huang, H.Y., Helzlsouer, K.J., Hansen, K.J., Butenhoff, J.L., Mandel, J.H.,
   2005. Historical comparison of perfluorooctanesulfonate, perfluorooctanoate,
   and other fluorochemicals in human blood. Environ. Health Perspect. 113,
   539-545.
Olsen, G.W., Burris, J.M., Ehresman, D.J., Froehlich, J.W., Seacat, A.M., Butenhoff, J.L.,
   Zobel, L.R., 2007. Half-life of serum elimination of perfluorooctanesulfonate,
   perfluorohexanesulfonate, and perfluorooctanoate in retired fluorochemical
   production workers. Environ. Health Perspect. 115, 1298-1305.
Olsen, G.W., Mair, D.C., Church, T.R., Ellefson, M.E., Reagen, W.K., Boyd, T.M., Herron,
   R.M., Medhdizadehkashi, Z., Nobiletti, J.B., Rios, J.A., Butenhoff, J.L., Zobel, L.R.,
   2008. Decline in perfluorooctanesulfonate and other polyfluoroalkyl chemicals
   in American Red Cross Adult Blood Donors, 2000-2006. Environ. Sci. Technol.
   42, 4989-4995.
PubChem, 2008. http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=3736298.
Seacat, A.M., Thomford, P.J., Hansen, K.J., Olsen, G.W., Case, M.T., Butenhoff, J.L.,
   2002. Subchronic toxicity studies on perfluorooctanesulfonate potassium salt
   in cynomolgus monkeys. Toxicol. Sci. 68, 249-264.
Seacat, A.M., Thomford, P.J., Hansen, K.J., Clemen, L.A., Eldridge, S.R., Elcombe, C.R.,
   Butenhoff, J.L., 2003a. Sub-chronic dietary toxicity of potassium
   perfluorooctanesulfonate in rats. Toxicology 183, 117-131.
Seacat, A.M., Thomford, P.J., Hansen, K.J., Clemen, L.A., Eldridge, S.R., Elcombe, C.R.,
   Butenhoff, J.L., 2003b. Erratum to "sub-chronic dietary toxicity of potassium
   perfluorooctanesulfonate in rats". Toxicology 192, 263-264.
Seshagiri, P.B., Adiga, P.R., 1989. Comparison of biotin binding protein of pregnant
   rat serum with rat serum albumin. J. Biosci. 14, 221-231.
Tan, Y.M., Clewell III, H.J., Andersen, M.E., 2008. Time dependencies in
   perfluorooctylacids disposition in rat and monkeys: a kinetic analysis. Toxicol. Lett. 177,
   38-47.
Teeguarden, J.G., Barton, H.A., 2004. Computational modeling of serum-binding
   proteins and clearance in extrapolations across life stages and species for endocrine
   active compounds. Risk Anal. 24, 751-770.
Trudel, D., Horowitz, L., Wormuth, M., Scheringer, M., Cousins, I.T., Hungerbühler, K.,
   2008. Estimating consumer exposure to PFOS and PFOA. Risk Anal. 28, 251-269.
Wambaugh, J.F., Barton, H.A., Setzer, R.W., submitted for publication. Comparing
   models for PFOA pharmacokinetics using Bayesian analysis. J. Pharmacokinet.
   Pharmacodyn.
Washburn, S.T., Bingman, T.S., Braithwaite, S.K., Buck, R.C., Buxton, L.W., Clewell, H.J.,
   Haroun, L.A., Kester, J.E., Rickard, R.W., Shipp, A.M., 2005. Exposure assessment
   and risk characterization for perfluorooctanoate in selected consumer articles.
   Environ. Sci. Technol. 39, 3904-3910.
Yuan, J., 1993. Modeling blood/plasma concentrations in dosed feed and dosed
   drinking water toxicology studies. Toxicol. Appl. Pharmacol. 119, 131-141.

-------
Journal of Environmental Science and Health Part C, 27:57-90, 2009

Copyright © Taylor & Francis Group, LLC

ISSN: 1059-0501 (Print); 1532-4095 (Online)

DOI: 10.1080/10590500902885593
Predictive Models for Carcinogenicity and Mutagenicity:
Frameworks, State-of-the-Art, and Perspectives



E. Benfenati,1 R. Benigni,2 D. M. DeMarini,3 C. Helma,4
D. Kirkland,5 T. M. Martin,6 P. Mazzatorta,7
G. Ouedraogo-Arras,8 A. M. Richard,9 B. Schilter,7
W. G. E. J. Schoonen,10 R. D. Snyder,11 and C. Yang12

1Istituto di Ricerche Farmacologiche "Mario Negri," Milano, Italy
2Istituto Superiore di Sanità, Environment and Health Department, Rome, Italy
3Environmental Carcinogenesis Division, US EPA, Research Triangle Park, North
Carolina, USA
4In Silico Toxicology, Basel, Switzerland
5Covance Laboratories Ltd, Harrogate, United Kingdom
6Sustainable Technology Division, National Risk Management Research Laboratory,
US EPA, Cincinnati, Ohio, USA
7Nestlé Research Center, Quality and Safety Department, Lausanne, Switzerland
8L'Oréal, Safety Research Department, Aulnay-sous-Bois, France
9National Center for Computational Toxicology, US EPA, Research Triangle Park,
North Carolina, USA
10Schering-Plough Research Institute, Oss, The Netherlands
11Schering-Plough Research Institute, Summit, New Jersey, USA
12Center for Food Safety and Applied Nutrition, Food and Drug Administration, College
Park, Maryland, USA

Mutagenicity and carcinogenicity are endpoints of major environmental and
regulatory concern. These endpoints are also important targets for development of
alternative methods for screening and prediction due to the large number of chemicals
of potential concern and the tremendous cost (in time, money, animals) of rodent
carcinogenicity bioassays. Both mutagenicity and carcinogenicity involve complex
cellular processes that are only partially understood. Advances in technologies and
generation of new data will permit a much deeper understanding. In silico methods for
predicting mutagenicity and rodent carcinogenicity based on chemical structural
features, along with current mutagenicity and carcinogenicity data sets, have performed
well for local prediction (i.e., within specific chemical classes), but are less successful
for global prediction (i.e., for a broad range of chemicals). The predictivity of in silico
methods can be improved by improving the quality of the database and endpoints used
for modelling. In particular, in vitro assays for clastogenicity need to be improved to
reduce false positives (relative to rodent carcinogenicity) and to detect compounds that
do not interact directly with DNA or have epigenetic activities. New assays emerging to
complement or replace some of the standard assays include Vitotox™, GreenScreen GC,
and RadarScreen. The needs of industry and regulators to assess thousands of
compounds necessitate the development of high-throughput assays combined with
innovative data-mining and in silico methods. Various initiatives in this regard have
begun, including CAESAR, OSIRIS, CHEMOMENTUM, CHEMPREDICT, OpenTox, EPAA,
and ToxCast™. In silico methods can be used for priority setting, mechanistic studies,
and potency estimation. Ultimately, such efforts should lead to improvements in
application of in silico methods for predicting carcinogenicity to assist industry and
regulators and to enhance protection of public health.

Key Words: Carcinogenicity; mutagenicity; QSAR; in silico; predictive methods

Received January 29, 2009; accepted March 9, 2009.
Address correspondence to E. Benfenati, Head, Laboratory of Environmental
Chemistry and Toxicology, Istituto di Ricerche Farmacologiche "Mario Negri," Via La Masa
19, Milan 20156, Italy. E-mail: benfenati@marionegri.it


INTRODUCTION

The EC-funded SCARLET (Structure-activity relationships leading experts
in mutagenicity and carcinogenicity) project was designed to investigate the
current status and future application of predictive models for carcinogenicity
and mutagenicity. The project organized a workshop in Milan, Italy, 2-4 April
2008. Participants discussed the potentials and the issues of the methods but
also considered societal and industrial needs and the possibilities of
interactions with and incorporation of a wider body of scientific research dealing with
carcinogenicity and toxicity in general.
    Studies on carcinogenicity and mutagenicity span diverse fields such as
biology, toxicology, biochemistry, and chemistry. These studies are important not
only from a scientific point of view but also for the societal consequences
related to the toxicity of carcinogenic and mutagenic compounds. To explore the
different perspectives, different tools may be preferable. In silico tools are those
based on computer programs and include so-called (quantitative)
structure-activity relationships, i.e., (Q)SAR methods. In silico tools offer advantages for
several scenarios, particularly where test data are unavailable or prohibitively
expensive and time consuming to generate. For this reason, it is useful to
provide a general overview of the evolving field of carcinogenicity and mutagenicity
studies toward the goal of enhancing in silico approaches. More details on
the SCARLET project and the workshop can be found on the Internet (1).
    It is not feasible or practical to report all the contributions and positions
related to the use of in silico methods for predicting mutagenicity and
carcinogenicity. Here we will organize the discussion in five areas to assess the utility
of in silico tools for different potential scenarios:

1. The scientific framework of carcinogenicity and mutagenicity studies;
2. The needs of industry and regulators;
3. State of the art of the methods of in silico prediction for carcinogenicity and
   mutagenicity;
4. Connecting new and traditional data into a new scenario;
5. Conclusions.


THE SCIENTIFIC FRAMEWORK OF CARCINOGENICITY AND
MUTAGENICITY STUDIES

The meeting opened with a brief overview of the emerging knowledge of
mechanisms underlying mutagenesis and carcinogenesis. The importance of the
"mutagenesis paradigm" was emphasized because this is the general step-wise
process by which a cell deals with DNA damage. Most chemical mutagens do not
directly induce mutations (i.e., a change in nucleotide sequence in the DNA).
Instead, mutagens induce DNA damage, which can be, for example, a DNA
adduct (a molecule bound covalently to a nucleotide) or a single- or
double-strand break. In either case, the primary nucleotide sequence is not changed.
A complex array of signalling pathways detects the DNA damage and directs
the cell to do one of three general things: repair the damage, convert the
damage to a mutation, or signal the cell to die (apoptosis). Thus, mutagenesis is
a cellular process involving enzymatic activities and usually DNA replication.
Consequently, mutagens make DNA damage and cells make mutations.
Modelling this complex process directly may be possible in the future as a clearer
understanding of the underlying signalling pathways emerges.
    Carcinogenesis is now clearly understood to proceed as a Darwinian
process in which cells with a growth advantage are selected in a step-wise
fashion, resulting in a tumor. Both mutational and epigenetic (non-mutational)
events are involved and are necessary for this process. Epigenetic events are
changes in gene expression and do not involve a change in nucleotide sequence
(mutation). Gene expression can be modulated by methylation of DNA or
methylation or acetylation of histones, which are proteins surrounding the
DNA. Carcinogenesis can be initiated by either a genetic (mutational) or
epigenetic (non-mutational) event. However, both are ultimately necessary for
the formation of a tumor.
    More and more studies are directed toward development of alternative
tests to animal studies. However, given the high frequency of irrelevant
positives from in vitro mammalian cell tests, unless there is improved accuracy
of these tests to predict in vivo genotoxic or carcinogenic hazard, many
    substances will be classified inappropriately; i.e., falsely labelled hazardous
    or  even banned. Improvements  to basic cell culture are needed  to  avoid
    reactions between test substance and culture medium that can result  in
    production of clastogenic levels of hydrogen peroxide. A review of the top
    concentration (10 mM) for testing non-toxic and freely soluble substances
    is urgently  needed to see if this concentration is justified.  Many  different
    measures of cytotoxicity can be used to determine an appropriately  cytotoxic
    top concentration, but there is now evidence that these do not always  select
    the same concentration. If a measure is used that underestimates the toxicity,
    then higher-than-warranted concentrations may be used and positive  artefacts
    may occur. Also, the levels of cytotoxicity required in chromosomal aberration
    and mouse lymphoma assays are largely based on old published data, and
    it is not certain  that these can be substantiated if more  modern protocols
    are used (2). These and other factors that may contribute to a high rate of
    irrelevant positive results were highlighted by Kirkland et al. (3). A 3-year
    research program is underway that is funded by the EU cosmetic  industry
    association COLIPA and supported by the UK National Institute for the 3Rs,
    together with ECVAM, to  evaluate changes in study design that would reduce
    the frequency of irrelevant positive results.
        Also discussed was the concept of non-covalent DNA interactions. For ex-
    ample, DNA groove-binding or intercalation, which has not been modelled ad-
    equately, might also explain many "false positive" findings. Nearly one half of
    the marketed drugs that are not structurally alerting but are still positive in
    in vitro cytogenetics assays may operate through DNA intercalation and topoi-
    somerase poisoning (4, 5). This conclusion has been drawn both from cell-based
    testing and 3-D DNA docking evaluations.


    Some Examples of Recent In Vitro Techniques
        Recently, three different assays were examined for implementation as mu-
    tagenicity, genotoxicity, and/or clastogenicity assays; i.e., Vitotox, GreenScreen
    GC, and RadarScreen. For clastogenicity also, in vivo experiments have to  be
    performed with rodents.

    Vitotox
        The  Vitotox  assay  (Thermo  Technologies,  Finland)  is performed  in
    Salmonella  typhimurium  strain TA104 (6-8). The assay is based on the ac-
    tivation of an SOS repair system by genotoxic compounds. In these bacteria, a
    luciferase gene of the bacterium Vibrio fischeri is introduced by molecular design,
    and this gene is under the transcriptional control of the recN promoter. This
    recN promoter, in turn, is strongly repressed, which prevents the luciferase
    gene expression under control conditions. In the presence of a DNA-damaging
    genotoxic compound,  the RecA regulator protein recognizes the resulting
free ends or mismatches in DNA. This results in a cascade of biochemical
interactions and reactions leading to the de-repression of the strong recN
promoter and subsequent transcription of the luciferase gene. This will enhance
the production of the luciferase protein. After addition of luciferin and ATP,
the synthesis of light can then be visualized. Light enhancement is an
indication for genotoxicity. A second S. typhimurium TA104 strain, constitutively
expressing the luciferase gene, is used as a control strain for measuring
cytotoxicity. A ratio score of 1.5 between the genotoxicity and cytotoxicity strains
is used as a cut-off for real genotoxicity.

Table 1: Predictive values of Vitotox, GreenScreen GC, and RadarScreen assays in
comparison with scores for full Ames tests for sets of 156, 81, and 154
compounds, respectively. The compounds are references, a specific set of
pre-clinical and clinical developmental compounds of Organon, and a specific
clastogenic set of steroidal compounds (20)

                 Vitotox         GreenScreen GC    RadarScreen
                 %       n       %       n         %       n
Sensitivity      90      48      39      33        55      47
Specificity      90      108     98      48        52      107
Concordance      90      156     74      81        53      154

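The ratio cut-off described for the Vitotox read-out can be sketched in a few lines. This is only an illustration: the function name, the normalisation of each strain's light signal to its solvent control, and the choice to count a ratio of exactly 1.5 as positive are assumptions, not details given in the text.

```python
def vitotox_genotoxic(geno_induction, cyto_induction, cutoff=1.5):
    """Classify one read-out with the genotoxicity/cytotoxicity ratio cut-off.

    geno_induction: light signal of the recN-luciferase strain,
                    treated / solvent control (assumed normalisation)
    cyto_induction: light signal of the constitutive control strain,
                    treated / solvent control (assumed normalisation)
    """
    if cyto_induction <= 0:
        raise ValueError("control-strain signal must be positive")
    ratio = geno_induction / cyto_induction
    # Counting a ratio exactly at the cut-off as positive is an assumption.
    return ratio >= cutoff


print(vitotox_genotoxic(3.0, 1.0))  # True: signal rises well above the control strain
print(vitotox_genotoxic(1.2, 1.0))  # False: below the 1.5 cut-off
```

Dividing by the control-strain signal means that a compound which merely suppresses light output through cytotoxicity is not mistaken for a non-genotoxic one.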
   The assay can be performed in the presence or absence of S9 fraction of rat
liver homogenates induced with Aroclor 1254. With the reference compounds,
4-nitroquinoline-1-oxide and benzo[a]pyrene, reproducible results were found
in independent assays. Table 1 shows the results of these tests in comparison
with the Ames test. The concordance of the Vitotox test and the full Ames test
was 90%.  The sensitivity was 90% for the true positives, and the specificity
was 90% for the true negatives. The potential advantage of this assay relative
to the standard Ames assay is that smaller volumes/amounts of sample  are
needed, cytotoxicity is determined, and it is faster (results are obtained in 3 h
versus 3 days).
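The three predictive values reported in Table 1 are related by simple arithmetic, which the following sketch makes explicit. The function names are illustrative. Sensitivity is taken as the percentage of reference-positive compounds scored positive, specificity as the percentage of reference-negative compounds scored negative, and concordance as the percentage of all compounds scored correctly. The second function recomputes concordance from the published percentages and group sizes, so small rounding differences are to be expected.

```python
def predictive_values(tp, fn, tn, fp):
    """Return (sensitivity, specificity, concordance) in percent from counts."""
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    concordance = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, concordance


def concordance_from_rates(sens_pct, n_pos, spec_pct, n_neg):
    """Recompute concordance (%) from sensitivity, specificity, and group sizes."""
    correct = sens_pct / 100.0 * n_pos + spec_pct / 100.0 * n_neg
    return 100.0 * correct / (n_pos + n_neg)


# Table 1 rows: (sensitivity %, n positives, specificity %, n negatives)
table_1 = {
    "Vitotox": (90, 48, 90, 108),
    "GreenScreen GC": (39, 33, 98, 48),
    "RadarScreen": (55, 47, 52, 107),
}
for assay, row in table_1.items():
    print(assay, round(concordance_from_rates(*row)))
```

The recomputed values round to 90, 74, and 53, matching the concordance row of Table 1.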

GreenScreen GC and RadarScreen
   The GreenScreen GC (Gentronix, UK) and RadarScreen (reMYND, Belgium)
assays are carried out in yeast (Saccharomyces cerevisiae). The advantage
of these yeast assays is that chromosomal segregation in the meiotic phase
in yeast resembles that of vertebrates. In these assays, the focus is on the
activation of the RAD54 promoter, which is the analogue of the RAD51 promoter in
vertebrates (9). Both RAD54 and RAD51 are recombinational repair genes that
belong to the class of RAD52 genes, which are involved in the repair of
chromosomal double-strand break damage (10-13). In the GreenScreen GC strain,
the RAD54 gene has been replaced by the gene encoding Green Fluorescent
Protein (GFP) (14-17), whereas in the RadarScreen strain the β-galactosidase
    (LacZ) gene was introduced (18, 19). The end-point measurement with GFP is
    assessed after 24 h and with β-galactosidase after 6 h. The control analysis for
    cytotoxicity can be carried out in the same sample by absorbance measurement
    with a spectrophotometer.
        A drawback of the GFP system is that some compounds gave autofluorescence
    at the wavelength of GFP. Another issue is that the addition of the S9
    fraction caused quenching of the GFP signal. With a specific set of 100
    candidate pharmaceuticals, autofluorescence occurred with 15 compounds, whereas
    for 15 other compounds activation with the rat-liver S9 fraction was
    essential. These problems could be overcome with the use of the β-galactosidase
    enzyme. This enzyme can cleave 6-O-β-galactopyranosyl-luciferin (Promega,
    Madison, WI) into free luciferin, which in turn can be quantified with luciferase
    of the firefly. At the wavelength of 650 nm, no quenching was found with the
    rat-liver S9 fraction. Due to luminometry, this assay became very sensitive,
    leading to a better identification of clastogenic compounds. With the reference
    compounds, methyl methanesulphonate and benzo[a]pyrene, reproducible results
    were found in independent assays.
        With both the GreenScreen GC and RadarScreen assays, the predictivity
    with the full Ames assay was relatively low compared with  the Vitotox eval-
    uation, leading to sensitivity of 39% and 55% and specificity of 98% and 52%,
    respectively. Table 2 shows the results of these tests compared with the clasto-
    genicity test. This new RadarScreen assay should  be a good predictor for the
    identification of real clastogenic and/or carcinogenic compounds. For example,
    among 40 steroidal compounds (20) a predictivity score concordance of 82.5%
    was calculated. With respect  to the compounds tested in the full Ames test,
    12  Ames negative compounds were identified with  clastogenic human lym-
    phoblast or CHO  assays.  For these 12 compounds, a concordance of 66% was
    found between the RadarScreen and Ames assay. Overall analysis of 132 com-
    pounds with this RadarScreen assay, compared with the available in vitro clas-
    togenicity/aneuploidy assays,  found that the sensitivity for the positive com-
    pounds was 80%, the specificity for the negative compounds was 77%, and the
    concordance was 78% (Table 2). This same analysis for the GreenScreen GC
    assay with 44 compounds showed a sensitivity of 22%, a specificity of 95%, and
    a concordance of 57% (Table 2).

Table 2: Predictive values for Vitotox, GreenScreen GC, and RadarScreen assays in
comparison with scores for in vitro clastogenicity/aneuploidy for sets of 132, 44,
and 130 compounds, respectively. The compounds are references, a specific set of
pre-clinical and clinical developmental compounds of Organon, and a specific
clastogenic set of steroidal compounds (20)

                 Vitotox         GreenScreen GC    RadarScreen
                 %       n       %       n         %       n
Sensitivity      29      85      22      23        80      83
Specificity      89      47      95      21        77      47
Concordance      51      132     57      44        78      130

Table 3: Sensitivity values of Vitotox and RadarScreen assays separately and
combined according to scores for in vitro mutagenicity in the Ames test for a set of
156 compounds, for in vitro clastogenicity assays for a set of 85 compounds, and
for in vitro carcinogenicity assays using tests as in (21) for a set of 50 compounds,
these being references, a specific set of pre-clinical and clinical developmental
compounds of Organon, and a specific clastogenic set of steroidal compounds
(20)

                                     Sensitivity (+ and -)
                         Mutagenicity     Clastogenicity    Carcinogenicity
                         n/N      %       n/N      %        n/N      %
Vitotox                  43/48    90      5/17     29       3/10     30
RadarScreen              26/47    55      66/83    80       40/49    82
Vitotox + RadarScreen    23/24    96      66/85    78       41/50    82

   A positive score in a mutagenicity, genotoxicity, clastogenicity, or carcino-
genicity assay is seen as a strong negative factor for a compound that leads
most times to removal of the compound. Introduction and the combined usage
of the Vitotox and RadarScreen tests may lead to a better deselection and/or
ranking of the lead compounds in the early phase  of the selection process with
respect to genotoxic and clastogenic compounds. The sensitivity of the Vitotox
and RadarScreen assays together was 96% compared with the Ames assay, 78%
with clastogenicity tests, and 82% with in vitro carcinogenicity tests described
in (21) (Table 3).
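How combining two assays can raise sensitivity may be illustrated with a small sketch. It assumes that the battery scores a compound positive when either assay is positive; the text does not state the combination rule, and the compound calls below are invented for illustration rather than taken from the Organon data set.

```python
def sensitivity(calls, reference):
    """Percent of reference-positive compounds that the assay also calls positive."""
    hits = [call for call, ref in zip(calls, reference) if ref]
    if not hits:
        raise ValueError("no reference-positive compounds")
    return 100.0 * sum(hits) / len(hits)


def combine_either(calls_a, calls_b):
    """Battery call: positive when at least one assay is positive (assumed rule)."""
    return [a or b for a, b in zip(calls_a, calls_b)]


# Hypothetical calls for six compounds; True means positive.
reference = [True, True, True, True, False, False]
vitotox = [True, False, True, False, False, False]
radarscreen = [False, True, True, False, True, False]
battery = combine_either(vitotox, radarscreen)

print(sensitivity(vitotox, reference))      # 50.0
print(sensitivity(radarscreen, reference))  # 50.0
print(sensitivity(battery, reference))      # 75.0
```

Under an either-positive rule the battery can never be less sensitive than its best member, but every false positive of either assay is carried over, so specificity should be checked alongside sensitivity.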
   Thus, the combination of the Vitotox and RadarScreen in vitro assays leads
to a very high predictivity for in  vitro mutagenicity, clastogenicity, and in vitro
carcinogenicity tests. Moreover,  with the introduction of the AhR assays with
rat H4IIE and human HepG2 cells, a possible risk factor for non-genotoxic
carcinogens might be identified early in the selection process.


THE NEEDS OF INDUSTRY AND REGULATORS

The chemical industry is facing increasing challenges regarding the safety
assessment of chemicals by regulatory agencies throughout the world. In
particular, in Europe REACH (Registration, Evaluation, Authorisation and
Restriction of Chemicals) took effect on 1 June 2007. Due to the large number
of chemicals to evaluate, REACH promotes the use of alternative methods to
animal testing to achieve these goals within a reasonable timeframe. REACH
requires that a large number of substances be tested for genotoxicity and, be-
cause positive results in vitro usually trigger additional in vivo testing, a large
role for predicting/confirming these results is envisaged.
        For the cosmetic industry, the seventh amendment to the European Cos-
    metic Directive will also phase out the use of animals for testing ingredients
    starting in March 2009. Therefore, the use of non-animal methods is manda-
    tory for the seventh amendment, whereas it is a recommendation for REACH.
    Both regulations affect not only companies located in Europe but also those
    exporting chemicals to Europe. There are incentives for seeking  alternative
    methods (in vitro and in silico methods) in hazard/risk assessment. It should
    be emphasized that structure-activity concepts are already used, for example,
    by the U.S. Environmental Protection Agency (EPA).


    The EC Research Projects
        To deal with this new situation and these new regulatory requirements, the
    European Union (EU) has promoted a wide range of initiatives to adequately
    prepare for REACH. Several EC-funded projects relate exclusively or partly to
    the REACH regulation and in silico models, and others will start soon.
        CAESAR (Computer Assisted Evaluation of industrial chemical Sub-
    stances According to Regulations) is an EC-funded project, coordinated by  E.
    Benfenati (Milan), that is devoted to developing in silico models for REACH for
    five endpoints: carcinogenicity, mutagenicity, reproductive toxicity, bioconcen-
    tration factor, and skin sensitization. All models will be made freely available
    at the Web site: http://www.caesar-project.eu.
        Because the purpose is related to regulation, careful  attention has been
    given to the source of the data to be used for QSAR models, including checking
    the suitability of the experimental procedure  according to REACH. Further-
    more, great care has been given to the data quality from a  chemical and toxic-
    ity point of view. Many mistakes have been found in the data, even when data
    have been taken from recent papers. Double  checking of all chemical struc-
    tures found that more than 10% of the chemical structures  were incorrect (22).
    Conversely,  some sources, such as the EPA DSSTox database (23), were found
    to be of high quality.
        OSIRIS (Optimized  Strategies for Risk Assessment of Industrial Chem-
    icals based on Intelligent Combinations of Non-Test and Test Information) is
    an EC-funded project, coordinated by G. Schüürmann (Leipzig), that is devoted
    to developing intelligent testing strategies for  REACH. In silico, in vitro, and
    in vivo data will be integrated with exposure scenario. Software will be devel-
    oped to assist users in determining if the data are sufficient for modelling  or
    if further experiments are  needed. Models from CAESAR also will be incorpo-
    rated within OSIRIS.
        Other EC-funded projects are  highly  relevant to supporting REACH
    legislation,  even  if the  project descriptions were not dedicated  specifically
    to REACH. We  already introduced  SCARLET, coordinated  by E. Benfenati
    (Milan).  The EC-funded project  CHEMOMENTUM, coordinated by P. Bala
(Warsaw), will produce software for automatic QSAR modelling. The user will
have the capability to build up workflows and extract data from databases and
chemical structures from repositories. Chemical descriptors will be calculated
automatically, including 3D descriptors, and a battery of algorithms will be
available for modelling. A grid-based approach will allow different parts of the
centralized workflow to be physically present in different locations without af-
fecting software performance. This user-friendly scheme will allow simplified
and seamless modelling, whereas today the user has to switch between the
different components. CHEMOMENTUM also will incorporate docking tools. The
output of the model will also be available in QSAR Model Reporting Format as
specified by requirements of the Organization for Economic Cooperation and
Development (OECD). Models for REACH will be available that utilize the
data produced within CAESAR.
   CHEMPREDICT is an EC-funded project coordinated by E. Benfenati (Mi-
lan). This project is devoted to developing models focused on new simplified
chemical descriptors that are based on the simplified molecular input line en-
try system (SMILES) format. Endpoints for modelling include carcinogenicity
and genotoxicity.
   OpenTox is a new EC project coordinated by B.  Hardy (DouglasConnect)
that started in 2008. It is focused on the development of an open-source frame-
work providing unified access to toxicity data, (Q)SAR models, and validation
procedures.
   The European Chemicals Bureau (ECB), among others, has promoted stud-
ies aimed at clarifying the state of the art in the field of (Q)SARs for muta-
gens and carcinogens. These studies have been described in official EU reports
(24, 25) as well as in scientific publications (26, 27).
   Several initiatives have been implemented to help companies fulfill the
requirements of legislation. The  EPAA (European  Platform for Alternative
Approaches to Animal testing) is one such initiative (28) that aims to facili-
tate efforts to share knowledge and research to accelerate implementation and
acceptance of the 3Rs  (Replacement, Reduction, Refinement). The EPAA was
founded in 2005 as a joint initiative of the European Commission, companies,
and trade associations. This partnership consists of a steering committee, a
mirror group that provides critical  input to the steering committee,  and five
working groups.  The EPAA will map currently  used and ongoing projects in
the area of the 3Rs, implement projects where there are gaps, and promote the
acceptance and the use of the 3Rs.  The EPAA is seeking to partner not only
with European stakeholders but with international initiatives as well.


The Situation in the Pharmaceutical Industry
    Within the pharmaceutical industry, the focus of structure-activity mod-
elling is typically on the development of new drugs with a beneficial effect
    on a particular disease at an acceptable low dose level. Treatment should im-
    prove the well-being of the patient, whereas unwanted side-effects should be
    absent or very low. However, if the drug has to be administered at relatively
    high dose levels, side effects may be induced due to supra-pharmacology or off-
    target pharmacology. Moreover, at high dosages, the compound itself and/or its
    metabolites may induce adverse side effects, such as genotoxicity, carcinogenic-
    ity, reprotoxicity, hepatotoxicity, nephrotoxicity, cardiotoxicity, neurotoxicity, or
    blood or skin toxicity. Within the portfolio of Organon, the toxicity failure rate
    from 1960 to 2000 was due mainly to genotoxicity/carcinogenicity (20%) and re-
    protoxicity (20%), followed by hepatotoxicity (12%), nephrotoxicity (12%), and
    cardiotoxicity (12%). This portfolio is completely different from that of Roche
    during the same  period of time in which genotoxicity/carcinogenicity reached
    only a level of 6% and reprotoxicity a level of 2%. On the other hand, hepato-
    toxicity, nephrotoxicity, and cardiotoxicity reached levels of 20%, 4%, and 16%,
    respectively. These figures can easily be explained by virtue of the focus on
    reproductive medicine within Organon. One of the main topics in this field is
    male  and female  contraception as well as hormone replacement  therapy. The
    effects of estrogens, progestagens, and androgens with respect to these toxicity
    areas are well known. An overall reduction in the attrition rate of compounds
    with 20% to 50% failure rates by means of introducing early toxicity screening
    might lead to a sharp cost reduction in drug development (29-31).
        The strategy within the pharmaceutical industry should be to start
    toxicity screens as early as possible in the discovery process. Possible
    toxicity screens could be divided into mutagenicity, genotoxicity, clastogenic-
    ity, cytotoxicity, cytochrome P450 induction, and nuclear receptor activation.
    These predictive toxicity assays should be an integral part of the "ranking and
    selection" process of candidate drugs, should be on a medium to high through-
    put level,  should only use a small amount of compound (1 to 20 mg), and should
    be carried out with a limited amount of manpower. Implementation of in
    silico procedures with structure-activity relationship (SAR) programs, such as
    DEREK, TOPKAT, MultiCASE, and Mutalert, can already be initiated at the
    stage of hit selection. In vitro assays for measuring mutagenicity,
    genotoxicity, clastogenicity, non-genotoxic carcinogenicity, cytotoxicity,
    nuclear receptor activation, and cytochrome P450 enzyme activation, as well as
    competition assays, can be implemented from the point of lead selection to the
    final choice of the development candidate in the preclinical selection phase.


    Genotoxicity Tests within the Pharmaceutical  Industry
        According to  the guidelines of the U.S. Food and Drug Administration
    (FDA), four different endpoints testing mutagenicity and  clastogenicity are
    considered for a  new drug approval or food ingredients notifications.  These
    tests are:

1.  An in vitro test for gene mutation in bacteria, i.e., Salmonella reverse
    mutation;
2.  An in vitro test with cytogenetic evaluation of chromosomal damage, e.g.,
    in vitro micronucleus or in vitro chromosome aberration;
3.  An in vitro mammalian mutation test, i.e., in vitro mouse lymphoma Tk+/- assay;
    and
4.  An in vivo test for chromosomal damage using rodent hematopoietic cells,
    i.e., micronucleus assay.

   These in vitro tests are still relatively time-consuming, have a relatively
low throughput, and require a relatively large amount of compound.


Non-genotoxic Carcinogens within the Pharmaceutical  Industry
   Activation of the AhR with dioxin (TCDD) leads in both rats and humans
to  an increased incidence of liver tumors (32-34). TCDD is, therefore, classi-
fied as a non-genotoxic carcinogen. The difficulty in using AhR  activation as
a marker for non-genotoxic carcinogenesis lies in the fact that not all AhR in-
ducers are necessarily non-genotoxic carcinogens. For instance, grapes, fruits,
and vegetables, which are generally considered very healthy, contain chemicals
that enhance AhR activity and,  thus, affect AhR-driven pathways. The main
question is whether these protective mechanisms increase or decrease tumor
incidence. Thus, a compound that activates the AhR can either be beneficial
or  carcinogenic. Because it is  difficult to predict in which class  an AhR acti-
vator will fall, it is advisable to  steer away from AhR activation during drug
development, even though good compounds may be thrown out with the bad.
   For the identification of at least one specific group of non-genotoxic carcino-
gens, it is relevant to measure  the activation of the rat and/or human AhR. For
this, simple cellular assays are available in both rat H4IIE and human HepG2
cells, both of which make use  of the metabolism of 3-cyano-7-ethoxycoumarin
(CEC) by cytochrome P450 enzyme 1A1 and/or 1A2—enzymes that are induced
following AhR activation. For TCDD and 3-methylcholanthrene, the activity is
similar in both cell lines. However, species-specific differences have also been
observed; e.g., indigo activity is dominant in the rat cell line, whereas indirubin
is a much stronger AhR agonist in the human cell line. On the other hand, com-
pounds such as menadione activated only the human AhR receptor, whereas
flutamide, Org D, PCB156, and PCB157 were specific for the rat receptor (35).
This species difference has also  been observed for a number of Organon pro-
prietary compounds.
   As mentioned above, existing in silico models may "miss"  prediction  of
many genotoxicity tests, which in turn may be related at least in part to the
fact that the existing computational programs cannot detect non-covalent DNA
    interactions. Very few DNA intercalators were employed in the training sets of
    these models; even if they were, most of them were classical fused planar multi-
    ring compounds such as acridines rather than the atypically structured inter-
    calators reported recently (4, 5). Because of the complexity of the SARs and the
    probable need to account for electrostatic and other (i.e., van der Waals, hydro-
    gen bonding) effects, it will be difficult to improve programs such as DEREK or
    MCASE to identify such molecules. As the genotoxicity database grows, how-
    ever, it may be possible to establish a predictive tool for non-covalent chemi-
    cal/DNA interactions.


    The Food Industry and the Need for In Silico Models for
    Carcinogenicity and Mutagenicity
        In recent years there has been mounting concern about food as a source of
    exposure to potentially toxic chemicals. It has been estimated that there are
    over five million man-made chemicals known, of which 70,000 are in use today.
    The application of continuously improving analytical methods has revealed
    that many of these chemicals can enter the food chain and result in human
    exposure.
        Food chemical risk assessment is the scientific process used to characterize
    the health significance of potentially harmful chemicals in food. Classically, it
    comprises four steps: (1) hazard identification; (2) hazard  characterization; (3)
    exposure assessment; and (4) risk characterization. In general, hazard identifi-
    cation and characterization rely on toxicological data obtained in experimental
    animals, mainly rodents.  Because toxicological information is limited or absent
    for the majority of the inadvertent food-borne chemicals, the assessment of the
    health significance of such chemicals is difficult or impossible.  Nevertheless,
    the detection of such chemicals in food products  may trigger not only heavy
    management action (e.g., public recall) but also public concern resulting in loss
    of consumer confidence for the food supply. In such situations, the availability
    of reliable tools to establish levels of safety concern without hard toxicological
    data appears of particular importance to ensure adequate  consumer protection
    without undue overconservatism. This should ultimately  allow optimal use of
    the limited resources available.
        Solutions to this general issue are not straightforward. Obviously,  experi-
    mental toxicology is not a practical tool to deal with situations requiring fast
    decision  making. Furthermore, even if sufficient facilities to perform toxicolog-
    ical testing within a relevant time frame were available, it still can be ques-
    tioned whether testing a large number of substances would be a rational and
    practical approach. In this context,  in silico predictive models have obvious
    advantages in terms of time, cost, and animal protection.
        In silico strategies are already proactively and successfully used for pre-
    clinical screening in  pharmaceutical discovery pipelines in which an  early
identification of toxicological hazard offers a clear competitive advantage. Such
efforts allow the exclusion of chemicals that could potentially produce unac-
ceptable adverse effects in further regulatory toxicology tests. The situation
of the food industry is different and requires the development of alternative
models with the following specific characteristics:


•  Risk rather than hazard-based. In the food context, the most likely appli-
   cation of computational toxicology models would be in the establishment
   of the level of  safety concern associated with  the  inadvertent/accidental
   presence of a chemical in products. This requires not  only qualitative in-
   formation on the potential hazardous properties of the chemical (e.g., car-
   cinogenicity) but also quantitative information (e.g., carcinogenic potency),
   allowing the derivation of a margin of exposure (MoE) with the estimated
   intake. The interpretation of the size of the MoE (e.g., allowing for various
   uncertainties such as inter- and intra-species differences) would likely help
   to make decisions at the management level.
•  Reliable, high sensitivity.  Most (Q)SAR predictive  models  suffer from in-
   herent poor sensitivity; i.e., the ability to correctly identify true positives
   (36).  Modellers,  partially because  they are often confronted with non-
   representative  datasets, have focused their attention  on identification of
   toxicophores that are overly general and, as a result, models tend to have
   many false positives. This has made computational  toxicology a useful tool
   for high-throughput screening (HTS) but different strategies should be op-
   timized if the target is to have a low number of false  negatives or a high
   concordance.
•  Global. Compounds found in foods and food ingredients present a high
   structural  diversity and complexity that may be greater  than  synthetic
   pharmaceuticals (37) and, therefore, require the development of global in
   silico models.
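The margin of exposure mentioned in the first point above can be illustrated with a short calculation. The sketch assumes the common convention in which the MoE is the ratio of a potency-based reference dose (for example, a dose derived from animal carcinogenicity data or from a model prediction) to the estimated human intake, both expressed in the same units. The function name and the numerical values are hypothetical.

```python
def margin_of_exposure(reference_dose, estimated_intake):
    """Ratio of a potency-based reference dose to estimated intake.

    Both arguments must use the same units (here assumed to be
    micrograms per kg body weight per day).
    """
    if reference_dose <= 0 or estimated_intake <= 0:
        raise ValueError("doses must be positive")
    return reference_dose / estimated_intake


# Hypothetical values: predicted reference dose of 10000 ug/kg bw/day
# and an estimated dietary intake of 2 ug/kg bw/day.
moe = margin_of_exposure(10000, 2)
print(moe)
```

A larger MoE indicates a lower level of concern; how large it must be to allow for inter- and intra-species differences is a risk-management judgment that the calculation itself does not settle.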
    Ideally, in silico toxicology  strategies  should predict adverse  effects  in
the human population. Because the toxicological training databases currently
available consist mainly of in vitro and animal data, which have serious limitations
for predicting human situations (38), the development of such models will always
constitute a significant challenge. Their practical application in the food sector
will depend on their potential to accurately predict endpoints that are cur-
rently used to make food safety management decisions. This includes the need
to establish confidence limits. The acceptance of these models will be possible
only if the analysis is fully transparent. Therefore, the promotion of validated,
freely available tools based on open-source codes, such as those developed by
ECB and EPA, is recommended.

    STATE-OF-THE-ART OF IN SILICO METHODS FOR PREDICTION OF
    CARCINOGENICITY AND MUTAGENICITY

    To properly evaluate the utility of current in silico methods, it has to be clari-
    fied that different purposes are envisioned and, thus, different evaluations are
    possible. Also, for this reason, practical applications of in silico programs for
    prediction of mutagenicity or carcinogenicity have ranged in utility from indis-
    pensable to useless. Next we list some of the different ways that in silico tools
    for prediction of mutagenicity and carcinogenicity can be employed.


    Priority Setting
        Models are usually used to set priorities among chemicals for further test-
    ing. For this use, several (Q)SAR models and databases are commercially (or
    freely) available, typically for alert-identification and read-across. Due to the
    differences in the systems (knowledge-based versus artificial intelligence, SAR
    versus QSAR, applicability domains, and extent to which mode-of-action  is
    considered in the model development), combining several  models appears  to
    be more sound than relying on a single one. The so-called global and local
    models may be used for the purpose of priority setting. Because regular up-
    dates of the models are released, care should be taken to re-assess their per-
    formances  on  a regular basis. Therefore, active participation of industry  is
    needed to evaluate and improve existing in silico models so they can meet
    their  needs. For REACH, although the chemicals to be characterized have
    been chosen on the basis of the amounts produced annually, it may be valu-
    able to further prioritize them using global SAR. Thus, those chemicals  that
    pose major concern may be identified and given a higher priority for further
    evaluation.


    Mechanistic Investigation
        When using human knowledge-based systems such as Derek (Lhasa Ltd.)
    or OncoLogic (U.S. EPA) or any model based on a mechanistic understanding
    (as opposed to models based purely on statistics), it is possible to  gain insight
    into the mechanism underlying the mutagenicity/carcinogenicity.


    Quantitative Evaluation of the Potency
        In REACH this evaluation is requested in the case of genotoxic compounds
    to assess if the expected exposure level for the scenario of use of the chemical
    compound will produce an unacceptable risk.
       Although a particular model may provide results unacceptable for a cer-
    tain use, for a different purpose the same model can be useful. For instance,
    a global model with overall prediction accuracy of 65-70%, as can be the  case

-------
                       Predictive Models for Carcinogenicity and Mutagenicity  71

for carcinogenicity models, might be considered unsuitable as a substitute for
traditional testing methods; however, the same model might be useful when
combined with other considerations as a means for prioritization. Furthermore,
even within the same regulation, such as REACH, in some cases a classifier
model can be useful, for instance to identify the presence of a certain
mutagenic fragment, whereas for other purposes, such as in support of risk
assessment,  a quantitative  model with less uncertainty is necessary because
the toxic effect has to be considered along with the exposure level.
   There are many publications on the databases, structural alerts, and mod-
els, and a number of commercial products are built on these. Publicly available
genetic toxicity  and carcinogenicity data sets include CCRIS (39),  EPA Gene-
tox (40), NTP (41), IARC (21, 42), EPA IRIS (43), U.S. FDA CRADA database
(44), Tokyo-Eiken (45), Mutants  (46), CPDB (47, 48), ISSCAN (49), and pri-
mary publications. Commercial databases derived from these sources also are
available from Leadscope (50), Lhasa (51), and MDL (52). These databases
provide test results from both regulatory-accepted test protocols and other
screening methods.
   Based on these data, commercial prediction models are currently mar-
keted in the form of global models, including MultiCASE (53), TOPKAT (54),
MDL QSAR  (55), and Leadscope FDA Model Applier (56), whereas OncoLogic
(57)  and LAZAR (58) are freely available  for use. Derek from Lhasa offers a
knowledge-base classification. Toxtree is  a  software tool developed by ECB
(through IdeaConsult Ltd.) that is able to estimate different types of toxic haz-
ards by applying structural rules; it is an open-source, freely available appli-
cation that can  be downloaded from the ECB Web site (59). The new module
predicts mutagenicity and carcinogenicity by applying a revised, updated list
of Structural Alerts (SA), and, when applicable, three QSARs for congeneric
classes (see  details in the Toxtree scientific  manual) (24, 25). The Leadscope
system provides data-mining and prediction methods based on both biologi-
cal and chemical databases. Currently, researchers at the U.S. FDA and U.S.
EPA use Leadscope for chemical and biological read-across based on databases
and to build predictive models. Many of the above prediction methods tend to
rely heavily  on chemical structures and summarized biological endpoint data.
Quite often these summarized endpoints are too far removed from the original
experimental measures and, hence, have lost at least some of their biological
context.
   In mutagenicity and carcinogenicity, it is possible to distinguish between
(a) coarse-grain methods, relying on the recognition of SAs; (b) fine-tuned ap-
proaches, which include Quantitative Structure-Activity Relationships (QSAR)
methods for  congeneric classes of chemicals (same chemical scaffold, same pre-
sumed mechanism of action); and (c) global QSAR models that  attempt to
combine elements of the previous two approaches and address more chemical
classes (60).
        Furthermore, some studies are based on human expertise, which identifies
    a series of structural alerts, whereas other tools are based on techniques used
    to discover the presence of relationships not yet known; data-mining tools are
    often used in the latter case. Examples of models that codify human knowledge
    include HazardExpert, OncoLogic, Toxtree, and DEREK. MultiCASE, Lead-
    scope,  and LAZAR use software based on the data-driven discovery  of geno-
    toxic or chemical fragments identified by specific automated algorithms. There
    are also mechanistic-based models that rely  on prediction of likely chemical
    reactions (61).
        A comparative analysis of existing lists of SAs derived for the prediction
    of rodent carcinogenicity has indicated that they have a prediction accuracy
    of 65% for the rodent carcinogenicity bioassay and an even  better prediction
    accuracy of 75% for Salmonella mutagenicity results; i.e., surprisingly, rodent
    carcinogenicity SAs predict mutagenicity better than they predict rodent car-
    cinogenicity (26). In addition, these SAs and the Salmonella assay have been
    shown to be equally predictive of the rodent carcinogenicity data. Overall, the
    SAs are a powerful tool for coarse-grain characterization of the chemicals; i.e.,
    for description of sets of chemicals, preliminary hazard characterization, cate-
    gory formation for regulatory purposes, or for selecting subsets of chemicals to
    submit to fine-tuned QSAR analyses for priority setting. A previous analysis on
    the priority setting criteria adopted by the U.S. National Toxicology Program
    in selecting chemicals to be bioassayed has shown that the structural criteria
    adopted to short-list suspect chemicals were able to enrich  the target up to
    ten times. In fact, 70% of the chemicals bioassayed as suspect carcinogens (i.e.,
    SAs or positive Salmonella data)  were carcinogens, whereas only 7% of the
    chemicals bioassayed based on production/exposure considerations were car-
    cinogenic (62). This result points to the high reliability of SAs for priority set-
    ting. On the other hand, the SAs are overly general and are not well suited as
    a tool for discriminating between positives and negatives of  congeners within
    a chemical class;  this is the role of the local, fine-tuned QSARs for congeneric
    classes (27).
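
The "up to ten times" enrichment quoted above follows directly from the two hit rates reported in (62). A minimal sketch of the arithmetic is given below; the function name is hypothetical and is used only for illustration.

```python
def enrichment_factor(selected_rate_pct: float, baseline_rate_pct: float) -> float:
    """Ratio of the carcinogen yield under targeted selection to the baseline yield."""
    if baseline_rate_pct <= 0:
        raise ValueError("baseline rate must be positive")
    return selected_rate_pct / baseline_rate_pct


# Percentages quoted in the text (62): 70% of the chemicals bioassayed as
# suspect carcinogens were carcinogenic, versus 7% of those bioassayed on
# production/exposure grounds.
factor = enrichment_factor(70, 7)
print(f"enrichment: {factor:.0f}-fold")  # prints "enrichment: 10-fold"
```
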
        A survey of local QSARs for congeneric classes of chemicals has shown that
    these are classified into (a) models for the gradation of the potency of the pos-
    itives (mutagens or carcinogens) and (b) those that discriminate between pos-
    itives and negatives. This is a crucial difference with respect to models such
    as those for aquatic toxicity, where it is assumed that all the  chemicals can be
    scaled along one axis of potency, ranging from highly potent  to weakly potent
    chemicals. In mutagenicity and carcinogenicity this does not generally hold
    true; i.e., models  for potency most often fail to separate positives from nega-
    tives. Thus, the models can be applied in two phases: first to separate positives
    from negatives, and second to assess the potency of the chemicals predicted as
    positive in the first phase (63).
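
The two-phase application described above can be sketched as follows. This is an illustration only: the classifier and the potency model are passed in as arbitrary callables, and the two toy models shown are hypothetical stand-ins rather than any published QSAR.

```python
from typing import Callable, Optional, Sequence

Descriptors = Sequence[float]


def two_phase_predict(
    x: Descriptors,
    is_positive: Callable[[Descriptors], bool],
    potency: Callable[[Descriptors], float],
) -> Optional[float]:
    """Phase 1: classify. Phase 2: estimate potency only for predicted positives."""
    if not is_positive(x):
        return None  # predicted negative: no potency estimate is attempted
    return potency(x)


# Hypothetical stand-in models, for illustration only.
def toy_classifier(x: Descriptors) -> bool:
    return x[0] > 0.5


def toy_potency(x: Descriptors) -> float:
    return 2.0 * x[1] + 1.0


active = two_phase_predict([0.9, 1.5], toy_classifier, toy_potency)
inactive = two_phase_predict([0.1, 1.5], toy_classifier, toy_potency)
```

Keeping the two models separate reflects the point made above: a potency model is not asked to separate positives from negatives, a task at which such models most often fail.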
        The survey on QSARs included (a) a short list of promising models; (b) re-
    calculation of the statistics; and (c) most importantly, the performance of real
external predictivity tests. The latter consisted of selecting from the literature
test chemicals falling in the same applicability domains of the training sets,
but that had never been considered by the authors of the models. The QSARs
selected were all scientifically interpretable, had good internal statistics and
cross-validation, but varied widely in their external predictivity. The QSARs
for potency  had an external  prediction ability  in the range 30-70% correct
(percentage  of chemicals whose potency was correctly predicted within 1 log
unit). On the other hand, the  QSARs for activity (yes/no) had an external pre-
diction ability in the range 70-100% correct. This indicates that classification
estimates (e.g., yes/no) generally are much more reliable than estimating data
points or relative potency rankings. This also confirms the dichotomy between
QSARs for potency and QSARs for positivity/negativity (64).
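
The potency criterion used above (a prediction counts as correct when it falls within 1 log unit of the observed value) can be written as a short scoring function. This is a minimal sketch; it assumes that observed and predicted potencies are already expressed on a log scale, and the function name is hypothetical.

```python
from typing import Sequence


def fraction_within_log_unit(
    observed: Sequence[float],
    predicted: Sequence[float],
    tolerance: float = 1.0,
) -> float:
    """Fraction of chemicals whose predicted log potency is within `tolerance`
    log units of the observed log potency."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(1 for o, p in zip(observed, predicted) if abs(o - p) <= tolerance)
    return hits / len(observed)


# Invented example values (log scale), for illustration only.
score = fraction_within_log_unit([1.0, 2.0, 3.5, 0.0], [1.5, 3.5, 3.0, 0.2])
```
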
   Another important result of the survey was that internal validation mea-
sures (e.g., cross-validation, other statistics) are not good predictors of external
predictivity (26, 64). Hence, they should be considered only as a means for bet-
ter describing the performances of training sets. It should be emphasized that
the external predictivity of high quality, local QSARs (70-100%) is in the same
range as the intra-assay agreement of the generally reliable and reproducible
Salmonella mutagenicity assay (80-85%) (65). Hence, the uncertainty inher-
ent to the two methods is comparable. In addition, as indicated above, rodent
carcinogenicity SAs correlate with rodent carcinogenicity bioassay results on a
large database to the same extent as the Salmonella assay results.
   The OECD  guidelines point out that to facilitate the consideration of a
(Q)SAR model for regulatory purposes, the model should be associated with
a mechanistic interpretation if possible (66).  Furthermore, mechanistically
based (Q)SARs  provide (a) a common ground of discussion for modelers, tox-
icologists, and regulators; (b)  additional tools for minimizing the possibility
of chance correlation; (c) intelligible information to guide synthetic chemists
in preparing safer chemicals;  and (d) a rational foundation for developing a
QSAR science.
   On the other hand, the large amount of legacy data and the anticipated ex-
plosion of toxicity-related information expected to be generated over the next
year (see, for instance, the ToxCast™ program below), call for the application
of flexible data-mining tools. In the past, for instance, Bursi and coworkers (67)
showed results of a global SAR model for mutagenicity with accuracy similar
to that of the intra-assay experiments for the Ames test mentioned above. Re-
cently, Gini and Ferrari showed similar results obtained within the CAESAR
project (68). It should be mentioned that for mutagenicity, modelling data
sets consisting of several thousand compounds are available and, in this
case, modern computer techniques are well suited to screening a wide range
of possibilities. In this way, it is possible to explore relationships between
the presence of a certain fragment and the toxicity and, thus, to mimic the
process  done manually by the human experts. Thus, data-mining tools can  be
used to  explore data in new ways and to identify novel toxicity mechanisms.
    A further contribution to global QSAR modelling was presented by Toropov
    and coworkers (69), who described models that predict potency for
    carcinogenicity and mutagenicity using simple descriptors based on the
    SMILES format.


    Examples of QSAR Methods Based on Data-mining
    for Carcinogenicity
        We  will now  present some data-mining studies in more  detail. Martin
    and coworkers developed several  different QSAR methodologies for  acute
    aquatic toxicity in order to  model large, noncongeneric data sets (70). The
    methodologies include the hierarchical clustering method, the FDA MDL
    QSAR method, the single-model method, and the nearest-neighbor method.
    These methods were shown to yield excellent  prediction results (70). The
    hierarchical clustering approach uses Ward's method (71) to divide an ex-
    perimental toxicity training set into a series of structurally similar  clus-
    ters where each cluster is assumed  to represent a common mode-of-action.
    The structural similarity  is  defined  in  terms of 2-D and 3-D descrip-
    tors. A genetic algorithm-based technique is used to generate statistically
    valid QSAR models for each cluster. The toxicity for a given query  com-
    pound is estimated using the average of the  predictions  from the  clus-
    ter models whose chemicals are structurally  the most similar to the query
    compound.
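
The clustering step can be sketched in a few lines. The code below is not the implementation of Martin and coworkers: it assumes that each chemical is described by a plain numeric descriptor vector, it merges clusters greedily using Ward's criterion (the increase in within-cluster sum of squares), and it replaces the genetic algorithm-based cluster models with the mean toxicity of the single nearest cluster. All function names are hypothetical.

```python
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of descriptor vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * dist2


def ward_clusters(points, n_clusters):
    """Greedy agglomerative clustering; returns clusters as lists of point indices."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        # Pick the pair of clusters whose merge is cheapest under Ward's criterion.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(
                [points[k] for k in clusters[ij[0]]],
                [points[k] for k in clusters[ij[1]]],
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


def predict(query, points, toxicity, clusters):
    """Stand-in for the per-cluster QSAR models: mean toxicity of the nearest cluster."""
    def dist2(cluster):
        cen = centroid([points[k] for k in cluster])
        return sum((q - x) ** 2 for q, x in zip(query, cen))

    nearest = min(clusters, key=dist2)
    return sum(toxicity[k] for k in nearest) / len(nearest)


# Invented descriptor vectors and toxicity values, for illustration only.
points = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
toxicity = [1.0, 3.0, 10.0, 20.0]
clusters = ward_clusters(points, 2)
estimate = predict([4.9, 5.0], points, toxicity, clusters)
```
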
        The FDA MDL QSAR method  is a variation of the clustering methodol-
    ogy of Contrera and coworkers (72).  In this method, the prediction for each
    test chemical is made using a unique model that is fit to the chemicals  from
    the entire training set that are the most similar to the test compound. In the
    single-model method, a multilinear regression (MLR) model is fit to the entire
    data set using molecular descriptors as independent variables. In the nearest-
    neighbor method, the predicted toxicity is simply the average of the chemicals
    in the training set that are most similar to the test chemical.
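
The nearest-neighbor method is the simplest of the four and can be sketched directly. The similarity measure is an assumption made here for illustration: chemicals are represented as sets of structural features and compared with the Tanimoto coefficient, which the text above does not specify. The function names and the choice of k are likewise hypothetical.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def nearest_neighbor_predict(query: frozenset, training, k: int = 3) -> float:
    """Average toxicity of the k training chemicals most similar to the query.

    `training` is a list of (feature_set, toxicity) pairs.
    """
    if not training or k < 1:
        raise ValueError("need a non-empty training set and k >= 1")
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    neighbors = ranked[:k]
    return sum(tox for _, tox in neighbors) / len(neighbors)


# Invented feature sets and toxicity values, for illustration only.
training = [
    (frozenset({"a", "b", "c"}), 2.0),
    (frozenset({"a", "b"}), 4.0),
    (frozenset({"x", "y"}), 10.0),
    (frozenset({"a"}), 6.0),
]
query = frozenset({"a", "b", "c"})
pred_k2 = nearest_neighbor_predict(query, training, k=2)
pred_k3 = nearest_neighbor_predict(query, training, k=3)
```
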
        The predictive ability of the QSAR methods developed by Martin and
    coworkers  was evaluated using several different  carcinogenicity  datasets.
    First, the methods were evaluated using a small congeneric set of aromatic
    amines. Franke and coworkers reported that they were able to develop excel-
    lent correlations for this data set using multilinear regression  models (73). It
    was shown that cross validation might overestimate the predictive ability of
    regression models if it consists of only refitting  the model coefficients to the
    training sets for the different cross-validation folds. It is suggested that one
    could obtain a  conservative  estimate of the potential prediction accuracy of
    multilinear regression methods by using a genetic algorithm to fit a new mul-
    tilinear model to the training set for  each cross validation fold. The different
    QSAR methods achieved prediction concordances of 60-66% (averaged over the
    different sex-species sets) for the aromatic amines data sets.
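
The difference between the two cross-validation schemes discussed above can be sketched as follows. With `reselect=False` the descriptor is chosen once on the full data set and only the coefficients are refitted in each fold, which is the potentially optimistic procedure. With `reselect=True` the whole model-building step is repeated on each training fold. As a deliberate simplification, the genetic algorithm is replaced here by picking the single descriptor most correlated with the response, and the multilinear model by a one-descriptor line; all names are hypothetical.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares slope and intercept for one descriptor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def abs_corr(xs, ys):
    """Absolute Pearson correlation (0.0 if either variable is constant)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return abs(sxy) / (sxx * syy) ** 0.5 if sxx and syy else 0.0


def select_descriptor(X, y):
    """Simplified stand-in for the genetic algorithm: best single descriptor."""
    return max(range(len(X[0])), key=lambda j: abs_corr([row[j] for row in X], y))


def cross_validate(X, y, n_folds=5, reselect=True, seed=0):
    """Return (observed, predicted) pairs for all held-out chemicals."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    fixed = select_descriptor(X, y)  # chosen with knowledge of the test chemicals
    pairs = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        X_train = [X[i] for i in train]
        y_train = [y[i] for i in train]
        j = select_descriptor(X_train, y_train) if reselect else fixed
        slope, intercept = fit_line([row[j] for row in X_train], y_train)
        pairs.extend((y[i], slope * X[i][j] + intercept) for i in held_out)
    return pairs


# Invented data: the response depends only on the first descriptor.
X = [[float(i), float((i * 7) % 5)] for i in range(10)]
y = [2.0 * i + 1.0 for i in range(10)]
conservative = cross_validate(X, y, reselect=True)
optimistic = cross_validate(X, y, reselect=False)
```

On real data with many candidate descriptors, comparing the errors of the two result lists gives a rough indication of how much the fixed-descriptor scheme overstates predictive ability.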
   Next, the predictive ability of the QSAR methods was evaluated using the
larger data sets contained in the Carcinogenic Potency Database (CPDB) (47).
Each sex-species dataset contained ~600-750 noncongeneric chemicals. The
fractions of carcinogenic compounds in the data sets were 47%, 45%, 43%, and
44% for the male rat, female rat, male  mouse, and female mouse sex-species
datasets, respectively. The QSAR methods of Martin and coworkers achieved
prediction concordances of about 61-63%, sensitivities of 48-60%, and speci-
ficities of 65-73% (averaged  over the different  sex-species sets) from 5-fold
cross-validations. The results achieved for  the CPDB  were similar to  those
achieved by QSAR-based approaches in the two NTP training exercises (74).
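
For reference, the three statistics quoted above are computed from the counts of true and false positives and negatives. The sketch below uses the standard definitions (concordance as the overall fraction correct, sensitivity as the fraction of carcinogens predicted positive, specificity as the fraction of noncarcinogens predicted negative); the function name and example calls are hypothetical.

```python
def classification_stats(observed, predicted):
    """Concordance, sensitivity, and specificity for binary calls (truthy = positive)."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)
    tn = sum(1 for o, p in pairs if not o and not p)
    fp = sum(1 for o, p in pairs if not o and p)
    fn = sum(1 for o, p in pairs if o and not p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one observed positive and one observed negative")
    return {
        "concordance": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Invented calls for ten chemicals, for illustration only.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
stats = classification_stats(observed, predicted)
```
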
   It has been suggested that in order to successfully model large, noncon-
generic carcinogenicity data sets, one should develop a series of more focussed
QSAR models (75). To test this  strategy the predictive  ability of class-specific
models was compared with the predictive ability of the noncongeneric QSAR
methods described above. The training set for each of the class-specific models
consisted of only the chemicals in that  particular class while for the noncon-
generic QSAR methods the training data for the different classes were pooled
together.  The comparison was performed using a data set of 280 chemicals
taken from the NTP database (75). Benigni and Richard assigned the chemi-
cals in the NTP data set to ten different structural classes (e.g., electrophilic
alkylating agents and halogenated aliphatic compounds). The data set was
separated randomly into training (80%) and prediction sets (20%) 10 times.
Sampling was  done  so that there was  an equal number of cancer and non-
cancer scores for each chemical class (for both the training and prediction sets).
The results for the 10 different prediction sets were pooled together. The class-
based models achieved an average prediction concordance of about 58%. The
hierarchical  and nearest-neighbor methodologies achieved slightly lower pre-
diction concordances of 55% and 57%, respectively. These results indicate that
it may be possible to correlate  noncongeneric datasets without  manually di-
viding the datasets into classes  (although one could argue that the results are
inconclusive due to the low prediction concordances). The prediction
concordances were lower for the NTP data set than for the CPDB and aromatic
amine data sets because composite cancer scores were modelled.
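
The sampling scheme used in this comparison (ten random 80/20 splits, with equal numbers of cancer and noncancer scores in each chemical class) can be sketched as below. How the balance was achieved is not stated above; this sketch assumes that the majority label within each class is randomly downsampled. The function name, record layout, and example data are hypothetical.

```python
import random
from collections import defaultdict


def balanced_split(records, train_fraction=0.8, seed=0):
    """Split chemical IDs into training and prediction sets.

    `records` is a list of (chem_id, chem_class, is_carcinogen) tuples. Within
    each class the two labels are downsampled to equal size (an assumption),
    then split by `train_fraction`.
    """
    rng = random.Random(seed)
    groups = defaultdict(lambda: {True: [], False: []})
    for chem_id, chem_class, label in records:
        groups[chem_class][bool(label)].append(chem_id)
    train, test = [], []
    for by_label in groups.values():
        n = min(len(by_label[True]), len(by_label[False]))
        n_train = round(n * train_fraction)
        for ids in by_label.values():
            chosen = rng.sample(ids, n)  # random subset in random order
            train.extend(chosen[:n_train])
            test.extend(chosen[n_train:])
    return train, test


# Invented records: two structural classes with unequal label counts.
records = (
    [(f"A{i}", "alkylating", True) for i in range(10)]
    + [(f"A{i}", "alkylating", False) for i in range(10, 25)]
    + [(f"B{i}", "halo_aliphatic", i < 5) for i in range(10)]
)
train, test = balanced_split(records)
splits = [balanced_split(records, seed=s) for s in range(10)]  # ten repetitions
```
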

INCORPORATING NEW AND TRADITIONAL DATA INTO A NEW
SCENARIO

Carcinogenicity and Mutagenicity Data: New Initiatives to
Improve Access and Utility for Modelling
   A number of new initiatives are underway to improve access to exist-
ing public  carcinogenicity  and mutagenicity data for use in modelling, to
    encourage use of less summarized activity classifications, to create linkages
    and structure-searchable access to publicly available sources of toxicity data,
    and to infuse new types of high-throughput biological test data (i.e., biochemi-
    cal, cell-based, etc.) along with chemical structure considerations into the pre-
    diction modelling paradigm (76, 77). These various initiatives offer the promise
    of moving the current paradigm for toxicity prediction toward one that can be
    applied more broadly and confidently to larger swaths of chemical  space and
    to a greater diversity of toxicity endpoints (78).
        Current structure-based SAR models for prediction of chemical carcino-
    genicity and mutagenicity rely on a relatively small number of publicly avail-
    able data resources  in which the data being modelled are typically highly
    summarized and aggregated  representations of the actual experimental re-
    sults (i.e., positive and negative calls) (79). The Berkeley Carcinogenic Potency
    Database (CPDB) (47), which includes bioassay results for more than 1500 sub-
    stances curated from literature reports, has been commonly employed in this
    way for past SAR modelling studies. EPA's DSSTox Database Network (23)
    offers elaborated and quality-reviewed structure-data file (SD file)
    representations of public toxicity data sets, such as the CPDB-All Species (CPDBAS)
    Summary Tables (48) as well as expanded data linkages and coverage of chem-
    ical space for carcinogenicity and mutagenicity. In particular, the most recently
    published DSSTox CPDBAS SD file includes a number of new species-specific
    summary activity fields, along with a species-specific normalized score for car-
    cinogenic potency (TD50) and sex/species-specific tumor incidences (80). To fur-
    ther facilitate use of these summary activity fields and associated data within
    the CPDBAS data file, these chemical structure-associated data have been de-
    posited in the large, publicly available PubChem database (81) as seven "bioas-
    says"  or PubChem AIDs (PubChem Assay Identifiers); i.e., AIDs for CPDBAS
    Mutagenicity, Rat, Mouse, Hamster, Dog-Primates, SingleCellCall, and Mul-
    tiCellCall. Separate indexing of component activity classifications within the
    PubChem system allows a user to take full advantage of the tools  and capa-
    bilities within PubChem for "read-across" and SAR clustering in bioassay and
    structure space; i.e., allowing for comparisons of CPDB compounds  across the
    entire PubChem inventory (millions of compounds, hundreds of assays).
        The entire DSSTox published data file inventory (>16,000 records, >8,000
    unique compounds) has been  deposited/updated within PubChem, enabling a
    user to cross-reference between the two systems. A user can now link directly
    from  PubChem substance and bioassay listings to DSSTox data files, docu-
    mentation, and Source chemical data pages where available (e.g., to  the CPDB
    or National Toxicology Program online chemical data  pages). Having estab-
    lished this direct correspondence between PubChem and DSSTox CIDs (Com-
    pound/structure IDs), the DSSTox Structure-Browser (82) now incorporates a
    direct link from the DSSTox structure-search results page to the corresponding
    PubChem Compound (CID) summary results page, allowing less experienced
users (e.g., toxicologists, risk assessors) to directly access relevant PubChem
bioassay information, links, and data for similar substances.
    The concept of chemical "toxicity profiling" has gained new prominence
and importance in the field of toxicity prediction and recently has been rec-
ommended as a long-term goal  for toxicity screening and assessment by a
prominent advisory committee and U.S. government  agencies (83, 84). Tox-
icity profiling can occur  at  two  levels:  (1) profiles of in vivo responses  are
made possible by increasing availability of detailed observational data and ex-
perimental measures associated with chronic toxicity studies, and (2) newer
in vitro high-throughput screening (HTS) data offer a means for broadly char-
acterizing a chemical's biological profile in terms of target interactions, path-
way perturbations, and cellular responses. Several initiatives are currently un-
derway to harness legacy toxicity data from diverse domains of study (cancer
bioassays, genetic toxicity, developmental and reproductive toxicity, skin sen-
sitization, etc.) into hierarchical toxicity data models suitable for building re-
lational databases (85, 86). These historical reference data are necessary to
anchor and validate  new predictive toxicology approaches based on alterna-
tive in vitro test methods as well as to eventually move away from current in
vivo rodent test systems to mechanistic pathway-based test data more directly
relevant to humans.
    The ToxCast™ project within the U.S. EPA's National Center for Compu-
tational Toxicology is a prominent example of a new predictive toxicology ini-
tiative that is aggressively harnessing legacy in vivo data as well as employing
new HTS in vitro technologies and toxicity profiling concepts (76, 87). As part
of this effort, the ToxRef database has been built to house in vivo toxicity data
in standardized data model representations, and it is being populated with ro-
dent bioassay study data (chronic, developmental, reproductive) for hundreds
of pesticidal active ingredients registered for use in the United States across
a range of toxicity investigation areas. Public release of various representa-
tions of these data, in conjunction with other ToxCast™ data, occurred in late
2008, through venues such as DSSTox and PubChem, and the new EPA AC-
ToR (Aggregated Computational Toxicology Resource) system (88, 89), which
will house all ToxCast™ data as well as supporting publicly available data.
Phase I of the ToxCast™ effort is generating data in hundreds of HTS bio-
chemical and cell-based assays for 320 selected compounds, mostly pesticidal
actives (90), for which a  rich profile of toxicity data exists within ToxRef or
other public sources. The goal is to develop candidate predictive signatures for
various toxicity endpoints to undergo further testing and validation in Phase
II.
    Microarray data  (i.e.,  information on the changes in gene  expression of
many genes) are becoming more available each year, and structure-annotated
data bases containing such information will soon be available for general use.
The complex array of genes that are mutated or whose expression is modulated
    is only incompletely understood at this point. However, microarray data, along
    with epigenetic and genetic data, will permit a more precise modelling of the
    cancer process and improve predictive toxicology in silico.
        The generation of large amounts of new HTS and in vitro data, coupled
    with enrichment and elaboration of reference in vivo toxicity data in the public
    domain, offers new challenges and opportunities to SAR modellers. Although
    SAR modelling has had many successes and has been an extremely valuable
    tool for toxicity screening and prioritization in the absence of biological data,
    the limitations of a structure-only approach to prediction are well known and
    enumerated in the literature. In general, chemical class-based approaches that
    offer a greater chance of mechanistic coherence can be applied more confidently
    to prediction, but in a narrow range of chemical space. In contrast, global SAR
    prediction approaches can be applied more broadly, albeit usually with  less
    confidence. The concepts of chemical similarity and chemical class can be use-
    fully employed in the new paradigm to focus investigation on regions of HTS
    activity space, exploring differences within the space. By the same token, clus-
    ters within HTS activity profile space (such as in a heat map representation)
    can be projected onto chemical space, potentially implicating members of mul-
    tiple chemical classes and offering new SAR hypotheses for further investiga-
    tion. HTS, or "fast biology" results can also potentially be employed as "biolog-
    ical descriptors" in a traditional QSAR paradigm; i.e., coupled with traditional
    chemical descriptors for SAR/QSAR model construction (91).
        Finally, adding layers of richness to data model representations of legacy
    in vivo toxicity results, such as in the ToxRef database, yields a great variety of
    new activity profile representations, or "endpoints," for use in guiding and an-
    choring SAR and HTS predictive toxicology investigations. A smaller region of
    chemical space associated with a  more focused activity profile, in turn, is likely
    to be offset by the greater potential mechanistic coherence of the data and,
    therefore, greater potential for modelling and prediction success. The  effective
    incorporation of SAR concepts into ToxCast™ and similar toxicity modelling
    efforts will be crucial for their ultimate success.

    Two Case Studies of Data-mining in a New, Broader Scenario
        These case studies describe the approaches investigated within regulatory
    agencies to go beyond the traditional QSAR paradigm for predictive toxicology
    and to include biology more explicitly into the QSAR process.

    Case Study 1: Genetic Toxicity Data-mining with Integrated
    Database
        The predictive data-mining methodology was applied to data from vari-
    ous regulatory agencies and industry partners. Some findings from this case
study were recently published (92) in which the FDA CRADA SAR (Cooper-
ative Research and Development Agreement Structure Activity Relationship)
genetic toxicity database (2006 version) was integrated with proprietary
industry data. The proprietary data were shared in the form of Leadscope
structural-feature statistics, without the actual connection tables. The 3220
chemicals in this integrated database were 30% drugs, 22% food ingredients,
and 48% industrial
chemicals, the latter including agricultural  chemicals. Various  data sources
were integrated according to the ToxML criteria (80, 86). For compounds to be
scored for assessment (e.g., test calls), ToxML requires the  data to be accom-
panied by information on test system, including conditions such as controls,
dosage regimen, and cytotoxicity (92). The ToxML data model and data entry
tool (ToxML Editor) are freely and publicly available (93). The genetic toxic-
ity profile of these chemicals was analyzed by structural features across the
various strains of Salmonella typhimurium reverse mutagenesis (Salmonella
mutation), mouse lymphoma mutation, in vitro chromosome aberration (ivt
CA), and in vivo micronucleus (micronucleus).
    Structural features associated with point mutations—in particular,  base
substitutions or frame shifts—were found.  For example,  alkyl halides are
highly correlated with mutagenicity in  all Salmonella strains, whereas aryl
halides are not. Well-known groups such as epoxides, nitro, nitroso, and
quinones are highly associated with all four genetic toxicity endpoints. On
the other hand, structural features such as azo, benzimidazole, and quinolines
are correlated with mutagenicity but not with clastogenicity. Features such as
alkenyl ketones, aryl aldehyde, pyrazine (H), and base nucleosides are associ-
ated only with clastogenicity. When the four genetic toxicity outcomes are cor-
related using the structural features, two mutagenicity tests using Salmonella
and mouse lymphoma  correlated well. Salmonella mutagenesis  outcome also
correlated well with that of in vitro chromosome aberrations. However, in vivo
micronucleus did not correlate well with in vitro chromosome aberrations. If a
chemical is ivt CA negative, it will probably be micronucleus negative. It is im-
portant to note that these genetic toxicity screening tests should be used more
as a profile rather than as an individual predictor for carcinogenicity. These
structural features can be further refined to form structural alerts grouped by
chemical reactivity.
    Structural alerts representing positive carcinogenicity/negative genotox-
icity were extracted from the data set.  Many  structural alerts  for genotoxic
carcinogens  for general industrial chemicals were consistent with literature
results  (94). The landscape of these alerts  changed significantly when the
structure space changed from industrial chemicals toward drugs. Several new
structural alerts representing non-genotoxic carcinogens were presented. One
of the structural alerts included the statin analogs.
    To further understand the biology of  the statins, various target or-
gan lesions  from the  chronic studies were compared with the  SAR-ready
    carcinogenicity database (45).  In chronic studies, liver lesions included cen-
    trilobular necrosis, hypertrophy, and vacuolar degeneration of perilobular hep-
    atocytes, cellular atypia, fatty change, and bile duct hyperplasia. Liver organ
    weight increases and ALT/AST enzyme level increases were also noticed dur-
    ing the chronic studies. These statin analogs also showed thyroid lesions, thy-
    roid organ weight increase, and CPK enzyme increase. In the carcinogenicity
    database, adenoma, carcinoma, and fibrosarcoma of liver were observed as well
    as thyroid follicular cell adenoma. The increased incidence of liver and thyroid
    tumors is connected by a well-known mechanism of thyroid-stimulating hor-
    mone instigating the liver microsomal enzymes (95). The same connection was
    also reported previously from  data-mining the CPDB database (96). As pre-
    sented in this case, understanding non-genotoxic carcinogens requires under-
    standing of biological mechanisms involved in target organ  effects. Therefore,
    a quality database providing in-depth target organ findings in chronic studies
    along with the carcinogenicity and genetic toxicity data can be vitally impor-
    tant to further our knowledge of carcinogens.


    Case Study 2: NTP High-Throughput Screening and
    Understanding Genotoxicity and Carcinogenicity
        One  of the questions that can be posed of a database of genetic toxic-
    ity screening tests  is whether the current genetic  toxicity tests sufficiently
    reflect the  different mechanisms involved in  carcinogenesis. In this regard,
    the HTS campaign initiated by the U.S. National Toxicology Program (NTP)
    is  worth discussing. The objectives of that project are to develop methods to
    screen and prioritize the nomination of chemicals for rodent bioassays and
    to look for approaches to gain insights on modes of action for various
    toxicity endpoints. Potentially, some of these data and methods can lead to
    improvements in predictive toxicology. NTP initially selected 1408 chemicals
    and tested them against 24 bioassays. The chemicals included food ingredients (13%),
    agricultural chemicals (17%), drugs and hormones (20%), and general indus-
    trial chemicals (50%). The  24  bioassays included caspase and kinase activi-
    ties of NCGC cell  lines (3T3,  BJ, SHSY5, H4IIe, Hek293, HepG2,  HUVEC,
    Jurkat, N2A, IkB signalling protein, JNK Alpha), and SKNSH, MRC5, Renal,
    and Mesenchymal assays. The bioassay panel also  includes 7 FRED and 13
    NCGC strains for cell viability (97).
        When compounds were clustered against the bioassays and cell viabilities,
    most of the compounds were not differentiated by  activities; however, some
    blocks of chemicals did separate based on cell viabilities. The differentiation
    of cell viabilities and activities against compounds increased markedly when
    the observations were based on smaller units of structural features (i.e., frag-
    ments of molecules) rather than individual compounds. Using cell viability
    to probe for rodent acute toxicity, biological and chemical  fingerprints were
investigated. The bioassay profile also has been compared with the genotox-
icity and carcinogenicity endpoints. Of the  1408 chemicals, 543 had rodent
carcinogenicity data, 1112 had Salmonella data, 344 had mammalian muta-
genesis data, 428 had ivt CA data, and 223 had micronucleus data within
the database sources mentioned above. At the compound level, there were too
many missing values to permit meaningful correlations between the bioassays
and toxicity data (i.e., too few cases where the same compound had all types of
assay data across the data sources).
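
The coverage problem described above can be made concrete with the counts quoted in the text. The following is a minimal sketch that uses only those counts; the variable names are illustrative, and the bound on complete cases is simple arithmetic rather than a figure reported by the authors.

```python
# Counts quoted in the text for the 1408-chemical NTP set.
TOTAL = 1408
counts = {
    "rodent carcinogenicity": 543,
    "Salmonella": 1112,
    "mammalian mutagenesis": 344,
    "ivt CA": 428,
    "micronucleus": 223,
}

# Fraction of the 1408 chemicals that have data for each endpoint.
coverage = {name: n / TOTAL for name, n in counts.items()}

# A compound with data for every endpoint must belong to the smallest
# set, so the number of complete cases cannot exceed the smallest count.
max_complete_cases = min(counts.values())

for name, frac in coverage.items():
    print(f"{name:>24}: {counts[name]:4d} ({frac:.1%})")
print(f"upper bound on complete cases: {max_complete_cases} "
      f"({max_complete_cases / TOTAL:.1%})")
```

Even in the best case, no more than 223 of the 1408 chemicals (about 16%) could have all five data types, and the true overlap is likely smaller, which is consistent with the authors' remark about missing values at the compound level.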
    When the observations were based on structural features and statistics
were recalculated, the trends reported previously among the four genetic tox-
icity tests in the first data-mining case study were  again  observed. Struc-
tural features correlating either positively or negatively across the different
endpoints were selected to recalculate the statistics. Based on the 1408 data
set, Salmonella mutagenesis  (R  = 0.52)  and in vitro chromosome aberra-
tions (R = 0.39) showed some correlations to rodent carcinogenicity, in which
rodent carcinogenicity was defined as induction of tumors in both  rat and
mouse.
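
One way to picture a feature-based recalculation is sketched below. The text does not specify the exact statistics used, so this is an assumption: outcomes are pooled per structural feature, a positive rate is computed for each endpoint, and a Pearson correlation is taken across features. All compounds, features, and outcomes are hypothetical toy data.

```python
from math import sqrt

# Hypothetical toy records: (structural features, endpoint -> 1/0 outcome).
# A missing key means the compound was not tested for that endpoint.
compounds = [
    ({"aromatic amine", "halide"}, {"salmonella": 1, "carcinogenicity": 1}),
    ({"aromatic amine"},           {"salmonella": 1}),
    ({"epoxide"},                  {"salmonella": 1, "carcinogenicity": 1}),
    ({"epoxide", "halide"},        {"carcinogenicity": 1}),
    ({"alkyl nitro"},              {"salmonella": 1, "carcinogenicity": 0}),
    ({"alkyl nitro"},              {"salmonella": 0, "carcinogenicity": 0}),
    ({"ester"},                    {"salmonella": 0, "carcinogenicity": 0}),
    ({"ester", "halide"},          {"salmonella": 0}),
]

def positive_rate(feature, endpoint):
    """Fraction of tested compounds carrying `feature` that are positive."""
    results = [out[endpoint] for feats, out in compounds
               if feature in feats and endpoint in out]
    return sum(results) / len(results) if results else None

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

features = sorted({f for feats, _ in compounds for f in feats})
rates = {f: (positive_rate(f, "salmonella"),
             positive_rate(f, "carcinogenicity")) for f in features}

# Keep only features that have data for both endpoints.
paired = [(a, b) for a, b in rates.values()
          if a is not None and b is not None]
r = pearson([a for a, _ in paired], [b for _, b in paired])
print(f"feature-level R = {r:.2f} over {len(paired)} features")
```

In this toy set, compounds tested for only one endpoint still contribute to the per-feature rates, which is the practical advantage of moving from individual compounds to structural features when data are sparse.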
    From this method, structural features positive for both rodent  carcino-
genicity and genotoxicity included aromatic amines, azo, epoxides,  halides,
and nitroso/nitrosamine.  Pyridine (H) was found to correlate more with non-
genotoxic carcinogens. Genotoxic but not carcinogenic features included alkyl
nitro, benzimidazole, quinoline, and 1,4-diamino benzene features. These re-
sults are quite consistent with earlier reports on the classes of rodent carcino-
genicity (98, 99). Using the feature analyses, a compound-class-driven cate-
gorization of the correlations between bioassays and toxicity endpoints was
conducted by 2-D clustering of correlations between the various toxicity and
bioassays. A heat-map generated from these data was used to tease  out par-
ticular biological assays correlating highly with rodent carcinogenicity for a
specific set of compound classes.
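
The 2-D clustering behind such a heat map can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the clustering method (average linkage on Euclidean distance), the compound classes, the assay names, and the correlation values are all hypothetical.

```python
def euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def cluster_order(vectors):
    """Average-linkage agglomerative clustering; returns the leaf order."""
    clusters = [[i] for i in range(len(vectors))]

    def dist(c1, c2):
        total = sum(euclid(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        pairs = [(p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters))]
        a, b = min(pairs, key=lambda pq: dist(clusters[pq[0]], clusters[pq[1]]))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0]

# Hypothetical correlations between each assay and rodent carcinogenicity,
# computed separately within each compound class (rows) for each assay (columns).
classes = ["aromatic amines", "esters", "epoxides", "alcohols"]
assays = ["assay A", "assay B", "assay C", "assay D"]
corr = [
    [0.8, 0.1, 0.7, 0.0],
    [0.0, 0.6, 0.1, 0.7],
    [0.7, 0.2, 0.8, 0.1],
    [0.1, 0.8, 0.0, 0.6],
]

row_order = cluster_order(corr)                               # cluster classes
col_order = cluster_order([list(c) for c in zip(*corr)])      # cluster assays

# Print the reordered matrix; similar rows and columns end up adjacent.
print(" " * 16 + "  ".join(assays[j] for j in col_order))
for i in row_order:
    cells = "  ".join(f"{corr[i][j]:7.1f}" for j in col_order)
    print(f"{classes[i]:<16}{cells}")
```

After reordering, classes with similar assay profiles sit next to each other, so blocks of high correlation for a specific set of compound classes become visible by eye.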
    To summarize, a battery of genotoxicity screening tests can be used for
profiling compounds to understand carcinogenicity potential. Structural alerts
with genotoxic-carcinogenic outcome probabilities stratified by potency can be
developed based on the feature-based  methodology. The current NCGC bioas-
says from the first NTP HTS campaign may not correlate well with genotoxic-
ity and carcinogenicity at a global compound level. However, toxicity of a par-
ticular class of chemicals correlates relatively well with some of the biological
assays. In addition, NTP has expanded its screen to include other cell-based
and activity assays.
    The importance of the two case studies presented above is not to emphasize
the efficacy of any one particular test or assay, but rather to approach the genetic
toxicity and carcinogenicity problem in a new way by linking chemical struc-
tures to biological effects with the introduction of new molecular-level assays
to yield potential mechanistic insights. In the long run, predictive data-mining
    methods can help in revisiting data, testing strategies, and past presumptions
    involved in risk assessment.

    CONCLUSIONS

    In QSAR models for carcinogenicity (mainly) and mutagenicity, there are still
    a number of open problems and unresolved issues. Most of these involve the
    point of application (e.g., screening vs. late-stage)  and interpretation of soft-
    ware output. "False positives" generated by  programs  such  as DEREK or
    MC4PC are greatly reduced by applying expert knowledge that takes into
    account the chemical context of the alert or biophore, and whether hydrolysis or
    metabolism is likely to convert the molecule to a true alert as defined
    originally and confirmed in the literature. The "false negative" designation, on the
    other hand, suggests that the learning sets on which the prediction systems
    were based initially may lack some  essential  knowledge. Studies using both
    a cell-based system and 3-D DNA docking/electrostatic modelling have shown
    that many "false negatives" can be  explained by non-covalent binding (e.g.,
    DNA intercalation  or groove binding), and that the genotoxicity of such inter-
    actions is largely the result of topoisomerase II inhibition. This type of chemi-
    cal/DNA interaction was and still is poorly understood and, consequently, may
    be trained  inappropriately  in the learning sets of the most commonly used in
    silico programs. Moreover, because most of these putative intercalating agents
    do not possess classical, planar intercalating structures, simple visual inspec-
    tion does not allow prediction of non-covalent binding to DNA.
        In order to significantly improve in silico  models  for carcinogenicity and
    mutagenicity, it is crucial to understand and accept that there are still
    problems with the experimental methods used to study these endpoints. Thus,
    several of the problems that appear as (Q)SAR problems are actually typical
    of the general limitations of the current experimental techniques and state of
    the knowledge. (Q)SAR models, for their part, are more amenable to statistical
    treatment of the data, which highlights their accuracy, sensitivity, specificity,
    reliability, and predictivity. Different tools are useful, depending on whether
    the model is a classifier (SAR) or a regression model.
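
    For the classifier case, the statistics named above follow directly from a confusion matrix. The sketch below uses the standard definitions; the function name and the counts are hypothetical and chosen only for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Summary statistics for a binary (SAR-type) classifier."""
    total = tp + fp + tn + fn
    return {
        "concordance": (tp + tn) / total,   # overall accuracy
        "sensitivity": tp / (tp + fn),      # share of true positives detected
        "specificity": tn / (tn + fp),      # share of true negatives detected
    }

# Hypothetical validation set: 100 chemicals, 10 of them carcinogens.
# The illustrative model detects only 2 of the 10 carcinogens.
metrics = classification_metrics(tp=2, fp=2, tn=88, fn=8)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

    With these illustrative counts the concordance is 0.90 although the sensitivity is only 0.20: on an unbalanced data set a model can look accurate overall while missing most of the active chemicals, because every false negative lowers sensitivity but barely moves concordance.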
        In the case of the in vitro models designed  to replace in vivo methods, it is
    more and more common to  have a statistical evaluation for the false positives
    and negatives of the method. However, similar objective  appraisal is missing
    in animal models when compared with human  toxicity. The extrapolation from
    animal models to humans is not an easy task given the paucity of data on the
    latter.  In addition, the variability of the in vivo data is poorly described.
        These intrinsic scientific problems are complicated by different purposes
    and intended uses of the models.  Models for  regulations are  strictly linked
    to legal specifications that  depend  on the specific regulation. Even within the
same regulation, different uses of the modelling tools are possible. For
example, within the REACH legislation, carcinogenicity has to be described as
a category for prioritization and classification and labelling. However, contin-
uous quantitative estimates of potency are also needed to estimate the risk of
the carcinogen within a given scenario of exposure.
    There are several possible uses of the in silico models, such as prioritisa-
tion, screening, mechanistic studies, and support for risk assessment. Today
the acceptance of in silico tools and predictive models, some based on incorpo-
ration of newer HTS data, for carcinogenicity as alternative methods to in vitro
or in vivo testing, is still highly debated, and the varied discussions at the
workshop demonstrated this. One reason for this is the limited knowledge and
experience accumulated so far with the proposed alternative methods. It was mentioned
that even the Ames test, when originally proposed, was severely criticized. De-
spite this, after decades, mutagenicity is now a requested endpoint in several
regulations.
    It may happen that a scientist strongly supports and advocates for use of
a particular method with which he or she is comfortable. However, the technical
and theoretical differences between the models should receive less
emphasis than the advantages for the user and the practical utility of
models for a certain application. As Galileo said, hypotheses have to be
experimentally evaluated. In the case of in silico models, they have to be proven
to work for their intended  use. There are at least two possible applications:
in silico  models can classify potential carcinogens or  mutagens or they can
predict a potency from a continuous  model, with an estimated confidence in-
terval, namely the error range.  Depending on the errors in the quantitative
predictions or classifications, the method can be used as a screening tool or as
a substitute for in vitro or in vivo testing if the error is acceptable. In the case
of models for regulatory purposes, where conservative measures for ensuring
public safety are of primary concern, the errors that give rise to false negatives
are much more relevant and high sensitivity is thus critical. The measures of
sensitivity and specificity are much more important than that of concordance.
This holds true both for classification and continuous QSAR models. The
predicted residuals are more appropriate than just R².
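
For a continuous model, the difference between fitted and predicted residuals can be illustrated with a leave-one-out calculation. The sketch below is an assumption about how such a check might be done, not a procedure taken from the text: it fits a one-descriptor linear model to hypothetical data, then compares the fitted R² with a cross-validated Q² built from predicted residuals (PRESS).

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def r_squared(xs, ys):
    """R² from residuals of the model fitted to all data."""
    slope, intercept = fit_line(xs, ys)
    my = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

def q_squared_loo(xs, ys):
    """Q² from leave-one-out predicted residuals (PRESS)."""
    my = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        # Refit without compound i, then predict compound i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - press / ss_tot

# Hypothetical descriptor values and potencies; the last compound is a
# high-leverage point that dominates the fit.
descriptor = [1.0, 2.0, 3.0, 4.0, 10.0]
potency = [1.2, 0.8, 1.1, 0.9, 3.0]

r2 = r_squared(descriptor, potency)
q2 = q_squared_loo(descriptor, potency)
print(f"R2 = {r2:.2f}, Q2 (leave-one-out) = {q2:.2f}")
```

In this toy example the fitted R² looks good because a single compound drives the regression, while Q² is negative: once that compound is held out, the model predicts it worse than the mean would. This is the kind of weakness that predicted residuals expose and R² alone hides.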
    To cope with the scientific problems of a better understanding and predic-
tion of carcinogenicity and  mutagenicity, new  efforts have to be planned and
organized, integrating different tools, not only in silico. We discussed the new
possibilities  of a large screening of chemical substances and some ongoing ini-
tiatives. The challenge is to reinforce and expand knowledge on toxicity phe-
nomena by introducing and using new experimental data that are more easily
generated than with the classical in vivo methods. The use of these data is
changing the perspective of toxicity evaluation. The huge amount  of data will
offer new ways to explore relationships between data of different origin as a
contribution to the understanding of toxicity. In this evolving scenario there
     will be a need for powerful data-mining tools that are capable of extracting
     knowledge from a complex multidimensional space. New initiatives requiring
     a paradigm shift are increasing, and the hope is that a better understanding of
     toxicity phenomena will be achieved for ensuring safer chemicals and stream-
     lined, yet protective regulatory procedures.



     ACKNOWLEDGEMENTS

     We acknowledge the financial support of the European Commission, project
     SCARLET, contract no. SP5A-CT-2007-044166. This manuscript has been re-
     viewed by the National Health and Environmental Effects Research Labora-
     tory and National Center for Computational Toxicology, U.S. Environmental
     Protection Agency, and is approved for publication. Approval does not signify
     that the contents reflect the views of the Agency, nor does mention  of trade
     names or commercial products constitute endorsement or recommendation for
     use.


     REFERENCES

       1.  StruCture-Activity Relationships Leading ExperTs in mutagenicity and
     carcinogenicity (SCARLET), European Commission Project SP5A-CT-2007-044166.
     Available at: http://scarlet-project.eu

       2.  Moore, MM, Honma,  M, Clements,  J, Bolcsfoldi, G,  Burlinson, B, Cifone, M,
     Clarke, J, Delongchamp, R, Durward, R, Fellows, M, Gollapudi, B, Hou, S, Jenkinson, P,
     Lloyd, M, Majeska, J, Myhr,  B, O'Donovan, M, Omori,  T, Riach, C, San, R, Stankowski,
     LF Jr, Thakur, A, Van Goethem, F, Wakuri,  S, Yoshimura, I. Mouse Lymphoma Thymi-
     dine Kinase Gene Mutation Assay: Follow-up Meeting of the International Workshop
     on Genotoxicity Tests-Aberdeen, Scotland, 2003—Assay acceptance criteria, positive
     controls, and data evaluation. Environ Mol Mutagen 2006;47:1-5.

       3.  Kirkland,  D, Pfuhler, S, Tweats, D, Aardema, M,  Corvi,  R, Darroudi,  F,
     Elhajouji, A, Glatt, H, Hastwell, P,  Hayashi, M,  Kasper, P, Kirchner, S, Lynch, A,
     Marzin, D, Maurici, D, Meunier, J-R, Müller, L, Nohynek, G, Parry, J, Parry, E, Thybaud,
     V, Tice, R, van Benthem, J, Vanparys, P, White, P. How to reduce false positive results
     when undertaking in vitro genotoxicity testing and thus avoid unnecessary follow-up
     animal tests: Report of an ECVAM Workshop. Mutation  Research 2007;628:31-55.

       4.  Snyder, RD, Ewing, DE, Hendry, LB. Evaluation of DNA intercalation  potential
     of pharmaceuticals and other chemicals by cell-based  and three-dimensional computa-
     tional approaches. Environ Mol Mutagen 2004;44:163-173.

       5.  Snyder, RD, Ewing, D, Hendry, L. DNA intercalative potential of marketed drugs
     testing positive in in vitro cytogenetics assays. Mutation Res 2006;609:47-59.

       6.  van der Lelie, D, Regniers, L,  Borremans, B, Provoost, A, Verschaeve, L. The
     VITOTOX   test,  an SOS bioluminescence  Salmonella typhimurium test to measure
     genotoxicity kinetics. Mutagenesis 1997;20:449-454.

       7.  Verschaeve, L, Van Gompel, J, Thilemans, L, Regniers, L, Vanparys, P, van der
     Lelie, D. VITOTOX® bacterial genotoxicity and toxicity test for the rapid screening of
     chemicals. Environ Mol Mutagen 1999;33:240-248.

  8.  Van Gompel, J, Woestenborghs, F, Beerens, D, Mackie, C, Cahill, PA, Knight,
AW, Billinton, N, Tweats, DJ, Walmsley, RM. An assessment of the utility of the yeast
GreenScreen assay in pharmaceutical screening. Mutagenesis 2005;20:449-454.

  9.  Pastink, A, Eeken, JCJ, Lohman, PHM.  Genomic integrity and the repair  of
double-strand DNA breaks. Mut Res 2001;480-481:37-50.

 10.  Clever, B, Interthal, H, Schmuckli-Maurer, J, King, J, Sigrist, M, Heyer, W-D.
Recombinational repair in yeast: functional interactions between Rad51 and Rad54
proteins. EMBO J 1997;16:2535-2544.

 11.  Sonoda, E, Sasaki, MS, Buerstedde, J-M, Bezzubova, O, Shinohara, A, Ogawa, H,
Takata, M, Yamaguchi-Iwai, Y, Takeda, S. Rad51-deficient vertebrate cells accumulate
chromosomal breaks prior to cell death. EMBO J 1998;17:598-608.

 12.  Arbel, A, Zenvirth, D, Simchen, G. Sister chromatid-based DNA repair is medi-
ated by RAD54, not by DMC1 or TID1. EMBO J 1999;18:2648-2658.

 13.  Dronkert,  MLG, Beverloo,  HB, Johnson, RD,  Hoeijmakers, JHJ,  Jasin, M,
Kanaar, R. Mouse RAD54 affects DNA Double-strand break repair and sister chromatid
exchange. Mol Cell Biol 2000;20:3147-3156.

 14.  Van Gompel, J, Woestenborghs, F, Beerens, D, Mackie, C, Cahill, PA, Knight,
AW, Billinton, N, Tweats, DJ, Walmsley, RM. An assessment of the utility of the
yeast GreenScreen assay in pharmaceutical screening. Mutagenesis 2005;20:449-454.

 15.  Billinton, N, Barker, MG, Michel,  CE, Knight, AW, Heyer,  W-D, Goddard, NJ,
Fielden, PR, Walmsley, RM. Development of a green fluorescent protein reporter for a
yeast genotoxicity biosensor. Biosensors Bioelectronics 1998;13:831-838.

 16.  Cahill, PA, Knight, AW, Billinton, N, Barker, MG, Walsh, L, Keenan, PO,
Williams, CV, Tweats, DJ, Walmsley, RM. The GreenScreen genotoxicity assays: a
screening validation programme. Mutagenesis 2004;19:105-119.

 17.  Knight, AW, Billinton, N, Cahill, PA, Scott, A, Harvey, JS, Roberts, KJ, Tweats,
DJ, Keenan, PO, Walmsley, RM. An analysis  of results from 305 compounds tested with
the yeast RAD54-GFP genotoxicity assay (GreenScreen GC) - including relative
predictivity of regulatory tests and rodent carcinogenesis and performance with
autofluorescent and coloured compounds. Mutagenesis 2007;22:409-416.

 18.  Cole,  GM, Schild,  D, Lovett, ST,  Mortimer, RK. Regulation of RAD54-  and
RAD52-lacZ gene fusions in  Saccharomyces cerevisiae  in response to DNA damage.
Mol Cell Biol 1987;7:1078-1084.

 19.  Averbeck, D, Averbeck, S. Induction of the genes  RAD54 and RNR2 by various
damaging agents in Saccharomyces cerevisiae. Mut Res 1994;315:123-138.

 20.  Joosten, HFP, van Acker, FAA, van den Dobbelsteen, DJ, Horbach, GJMJ, Krajnc, EI.
Genotoxicity of hormonal steroids. Toxicol Lett 2004;151:113-134.

 21.  International Agency for Research on Cancer (IARC). Available at:
http://www.iarc.fr/ENG/Databases/index.php.

 22.  Zhao, C, Boriani, E, Chana, A, Roncaglioni, A, Benfenati, E. A new hybrid
system of QSAR models for predicting bioconcentration factor (BCF). Chemosphere
2008;73:1701-1707.

 23.  EPA DSSTox. U.S. Environmental Protection Agency's Distributed
Structure-Searchable Toxicity (DSSTox) Database Network. 2009. Available at:
http://www.epa.gov/ncct/dsstox/

 24.  Benigni, R, Bossa, C, Jeliazkova, NG, Netzeva, TI, Worth, AP. The Benigni/Bossa
rulebase for mutagenicity and carcinogenicity—a module of Toxtree. Report EUR 23241
     EN. 2008. Luxembourg, Office for the Official Publications of the European Communi-
     ties. EUR—Scientific and Technical Report Series.

      25.   Benigni, R,  Bossa,  C, Netzeva, TI, Worth, AP.  Collection and evaluation of
     (Q)SAR models for mutagenicity and carcinogenicity. EUR 22772 EN. Luxembourg, Of-
     fice for the Official Publications of the European Communities. EUR—Scientific and
     Technical Research Series; 2007.

      26.   Benigni, R, Bossa, C. Predictivity and reliability of QSAR models: the case of
     mutagens and carcinogens. Toxicol Mechanisms Meth 2008;18:137-47.

      27.   Benigni, R, Netzeva, TI, Benfenati, E, Bossa, C, Rainer, R, Helma, C, Hulzebos, E,
     Marchant, C, Richard, A, Woo, Y-T, Yang, C. The expanding role of predictive toxicology:
     an update on the (Q)SAR models for mutagens and carcinogens. J Environ Sci Health
     C Environ Carcinog Ecotoxicol Revs 2007;25:53-97.

      28.   European  Partnership for Alternative Approaches to animal testing (EPAA).
     Available at: http://ec.europa.eu/enterprise/epaa/brochure.htm

      29.   Brown, D, Superti-Furga, G. Rediscovering the sweet spot in drug discovery. Drug
     Discovery Today  2003;8:1067-1077.

      30.   DiMasi, JA.  Risks in new drug development: Approval success rates for investi-
     gational drugs. Clin Pharmacol Ther 2001;69:297-307.

      31.   Kola, I, Landis, J. Can the pharmaceutical industry reduce attrition rates? Na-
     ture Reviews Drug Discovery 2004;3:711-715.

      32.   Fingerhut, MA, Halperin, WE, Marlow, DA, Piacitelli, LA, Honchar, PA, Sweeney,
     MH, Greife, AL, Dill, PA, Steenland, K, Suruda, AJ. Cancer mortality in workers
     exposed to 2,3,7,8-tetrachlorodibenzo-p-dioxin. N Engl J Med 1991;324:212-218.

      33.   Manz, A, Berger, J, Dwyer, JH, Flesch-Janys, D, Nagel,  S, Waltsgott, H. Can-
     cer mortality among workers  in chemical plant  contaminated with dioxin. Lancet
     1991;338:959-964.

      34.   Zober, A, Messerer, P, Huber, P. Thirty-four-year mortality follow-up of BASF
     employees exposed to 2,3,7,8-TCDD after the 1953 accident. Int Arch Occup Environ
     Health 1990; 62:139-157.

      35.   Westerink, WMA, Stevenson, JCR, Schoonen, WGEJ. Pharmacologic profiling of
     human and rat cytochrome P450 1A1 and 1A2 induction and competition. Arch Toxicol
     2008;82:909-921. doi:10.1007/s00204-008-0317-7.

      36.   Snyder, RD,  Smith, MD. Computational prediction of genotoxicity:  room for im-
     provement. Drug Discovery Today 2005;10:1119-1124.

      37.   Ertl, P, Roggo, S, Schuffenhauer, A. Natural Product-likeness score and its appli-
     cation for prioritization of compound libraries. J Chem Inf Model 2008;48:68-74.

      38.   Contrera,  JF, Jacobs, AC, DeGeorge, JJ. Carcinogenicity testing and the eval-
     uation of regulatory requirements for  pharmaceuticals.  Regul Toxicol Pharmacol
     1997;25:130-145.

      39.   CCRIS (Chemical Carcinogenesis Research Information System), developed and
     maintained by National Cancer Institute. Available at:
     http://toxnet.nlm.nih.gov/cgi-bin/sis/htmlgen?CCRIS

      40.   Gene-Tox: Created by U.S. EPA. Available at:
     http://toxnet.nlm.nih.gov/cgi-bin/sis/htmlgen?GENETOX

      41.   National Toxicology Program (NTP). Available at: http://ntp.niehs.nih.gov

 42.  International Programme on Chemical Safety (IPCS). INCHEM, a product co-
operation between  the International  Programme  on Chemical  Safety  (IPCS) and
the Canadian Centre  for Occupational Health and Safety (CCOHS). Available at:
http://www.inchem.org/

 43.  IRIS EPA. Integrated risk information system for carcinogen classification, orig-
inally prepared in US EPA. Available at: http://www.epa.gov/iris/index.htm

 44.  FDA Cooperative Research and Development Agreement database. Leadscope.
Available at: http://www.leadscope.com/fda_databases/

 45.  Tokyo-Eiken, Tokyo Metropolitan Institute of Public Health providing primary
mutagenicity of food additives for about 300 chemicals. Available at:
http://www.tokyo-eiken.go.jp/henigen/

 46.  The Mutants, a mutagenicity/genetic toxicity database sponsored
by Dr. Motoi Ishidate. Available at:
http://members.jcom.home.ne.jp/mo-ishidate/index.html

 47.  CPDB Summary Tables.  Summary Table of Chemicals in the Carcinogenic Po-
tency Database: Results for Positivity,  Potency (TD50), and Target Sites. 2007. Avail-
able at: http://potency.berkeley.edu/chemicalsummary.html

 48.  Gold, LS, Slone,  TH, Williams, CR, Burch, JM, Stewart, TW, Swank, AE, Bei-
dler, J, Richard, AM. DSSTox Carcinogenic  Potency Database Summary Tables—All
Species, SDF Files and Documentation, CPDBAS_v5c_1547_20NOV2008. 2008.
Available at: http://www.epa.gov/ncct/dsstox/sdf_cpdbas.html

 49.  Benigni, R, Bossa, C, Richard, AM, Yang, C. A novel approach: chemical
relational databases, and the role of the ISSCAN database on assessing chemical
carcinogenicity. Ann Ist Super Sanità 2008;44:48-56.

 50.  Leadscope Inc., Ohio, USA. http://www.leadscope.com

 51.  Lhasa Limited, Leeds, UK. http://www.lhasalimited.org/

 52.  RTECS, Registry of toxic effects of chemical substances, created by NIOSH,
maintained and marketed by MDL Symyx. Available at:
http://mdl.com/products/predictive/rtecs/index.jsp

 53.  MultiCASE MC4PC. Available at: http://www.multicase.com/products/prod01.htm

 54.  TOPKAT. Available at:  http://accelrys.com/products/discovery-studio/toxicology/

 55.  MDL QSAR. Available at: http://www.mdl.com/products/predictive/qsar/index.jsp

 56.  Leadscope FDA Model Applier. Available at:
http://www.leadscope.com/product_info.php?products_id=66

 57.  OncoLogic. A computer system to evaluate the carcinogenic potential of chemi-
cals. Available at: http://www.epa.gov/oppt/newchems/tools/oncologic.htm

 58.  LAZAR (Lazy Structure Activity Relationship). Available at:
http://www.predictive-toxicology.org/lazar/

 59.  Toxtree. Available at: http://ecb.jrc.it/qsar/

 60.  Benigni,  R. Structure-activity relationship studies of chemical mutagens and
carcinogens:  mechanistic  investigations  and  prediction  approaches.  Chem  Revs
2005;105:1767-1800.

 61.  Sello, G, Sala,  L, Benfenati,  E. Predicting  toxicity:  A  mechanism of action
model of chemical mutagenicity. Mutat Res Fundam Mol Mech Mutag 2001;479:141-
171.

      62.  Fung, VA, Huff, J, Weisburger, EK, Hoel, DG. Predictive strategies for select-
     ing 379 NCI/NTP chemicals evaluated for carcinogenic potential: scientific and public
     health impact. Fund Appl Toxicol 1993;20:413-36.

      63.  Benigni, R. Structure-activity relationship studies of chemical mutagens and
     carcinogens:  mechanistic  investigations  and  prediction  approaches.  Chem  Revs
     2005;105:1767-800.

      64.  Benigni, R, Bossa, C. Predictivity of QSAR. J Chem Inf Model. 2008;48:971-980.

      65.  Piegorsch, WW, Zeiger, E.  Measuring  intra-assay agreement for the Ames
     Salmonella assay. In:  Rienhoff, O, Lindberg DAB (Eds.),  Lecture Notes in Medical In-
     formation (Statistical Methods in Toxicology), Springer-Verlag, Berlin; 1991: 35-41.

      66.  OECD. The Report from the Expert Group on (Quantitative) Structure Activity
     Relationship ([Q]SARs) on the Principles for the Validation of (Q)SARs. Paris, OECD.
     OECD Series on Testing and Assessment No. 49; 2004.

      67.  Kazius, J, McGuire, R, Bursi, R. Derivation and validation of toxicophores for
     mutagenicity prediction. J Med Chem 2005;48:312-320.

      68.  Ferrari, T,  Gini, G.  A new Predictive  Model  of Mutagenicity,  with statisti-
     cal  analysis and validation using data-mining tools in  WEKA.  Poster presented at
     SCARLET workshop, April 2-4, 2008, Milan, Italy. Available at:
     http://www.scarlet-project.eu/posters/Ferrari_T-scarlet.pdf

      69.  Toropov, AA,  Toropova, AP, Benfenati,  E. QSAR  modelling of carcinogenicity
     and mutagenic potentials by optimal SMILES-based descriptors. Poster presented at
     SCARLET workshop, April 2-4, 2008, Milan, Italy. Available at:
     http://www.scarlet-project.eu/posters/Toropov_A-scarlet.pdf

      70.  Martin, TM, Harten, P, Venkatapathy, R, Das, S, Young, DM. A hierarchical clus-
     tering methodology  for the estimation of toxicity. Toxicol Mech Method 2008;18:251-
     266.

      71.  Romesburg, HC. Cluster Analysis for Researchers. Belmont, CA: Lifetime Learn-
     ing Publications; 1984.

      72.  Contrera, JF,  Matthews, EJ, Benz, RD. Predicting the carcinogenic potential of
     Pharmaceuticals in  rodents using molecular structural similarity and E-state  indices.
     Regulat Toxicol Pharmacol 2003;38:243-259.

      73.  Franke, R, Gruska, A, Giuliani, A, Benigni, R. Prediction of rodent carcinogenic-
     ity of aromatic amines: a quantitative structure-activity relationships model. Carcino-
     genesis 2001;22 (9):1561-1571.

      74.  Bristol, DW,  Wachsman,  JT, Greenwell, A. The NIEHS Predictive-Toxicology
     Evaluation Project.  Environ Health Persp 1996;104(Supplement 5):1001-1010.

      75.  Benigni, R, Richard, AM. QSARS of mutagens and carcinogens: Two  case studies
     illustrating problems in the construction of models for noncongeneric chemicals. Mutat
     Res 1996;371:29-46.

      76.  Dix,  DJ, Houck, KA,  Martin,  TM, Richard, AM,  Setzer, W, Kavlock, RJ. The
     ToxCast™  program for prioritizing toxicity testing of environmental chemicals. Tox
     Sci 2007;95:5-12.

      77.  Richard, AM, Yang, C, Judson, R.  Toxicity Data Informatics:  Supporting a New
     Paradigm for Toxicity  Prediction. Tox Mech Meth 2008;18:103-118.

      78.  Richard, AM. Future of Predictive Toxicology:  An Expanded View of  "Chem-
     ical Toxicity"—Future of Toxicology  Perspective. Chem Res  Toxicol 2006;19:1257-
     1262.

 79.  Richard, AM, Williams, CR. Public sources of mutagenicity and carcinogenicity
data: Use in structure-activity relationship models. In: Benigni R (Ed.), QSARs of Mu-
tagens and Carcinogens. New York: CRC Press; 2002:145-173.

 80.  Richard, AM, Yang, C, Judson, R. Toxicity Data Informatics: Supporting a New
Paradigm for Toxicity Prediction. Tox Mech Method 2008;18:103-118.

 81.  NCBI PubChem. National Institutes of Health, National Library of Medicine,
National Center for Biotechnology Information, PubChem Project. 2007. Available at:
http://pubchem.ncbi.nlm.nih.gov/

 82.  Transue,  T,   Richard,   AM.  U.S.  Environmental   Protection   Agency
DSSTox  Structure-Browser v2.0.  2008.  Available  at:  http://www.epa.gov/dsstox-
structurebrowser/

 83.  National Research Council (NRC). Toxicity Testing in the 21st Century: A Vision
and a Strategy. Washington, DC: National Academies Press; 2007.

 84.  Collins, FS, Gray, GM, Bucher, JR. Transforming Environmental Health Protec-
tion. Science 2008;319: 906-907.

 85.  Martin,  TM, Houck, KA, McLaurin, K, Richard, AM, Dix, DJ. Linking Regu-
latory Toxicological Information on Environmental Chemicals with High-Throughput
Screening (HTS) and Genomic  Data. The Toxicologist CD. J Soc Toxicol 2007;96:219-
220.

 86.  Yang, C, Benz, RD, Cheeseman, MA. Landscape of current toxicity databases and
database standards. Curr Opin Drug Discovery Develop 2006;9:124-133.

 87.  Houck, K, Dix,  D, Judson, R, Martin, M,  Wolf,  M, Kavlock, R, Richard, AM.
DSSTox EPA ToxCast  High Throughput Screening Testing Chemicals Structure-Index
File: SDF File and Documentation: TOXCST_v3a_320_12FEB2009. 2008. Available at:
http://www.epa.gov/ncct/dsstox/sdf_toxcst.html.

 88.  Judson,  R, Richard, AM, Dix, D,  Elloumi, F, Martin, M, Cathey, T, Transue, T,
Spencer,  R, Wolf, M. ACToR—Aggregated Computational  Toxicology Resource. Toxicol
Appl Pharmacol  2008;233:7-13.

 89.  ACToR.  Aggregated Computational  Toxicology Resource.  Available at: http://
actor.epa.gov/actor/

 90.  EPA ToxCast. U.S. Environmental Protection Agency's National Center for
Computational Toxicology ToxCast™ Program. 2009. Available at:
http://www.epa.gov/comptox/toxcast/

 91.  Zhu, H, Rusyn, I, Richard, AM,  Tropsha,  A. The  Use of Cell Viability Assay
Data Improves the Prediction Accuracy  of Conventional Quantitative Structure Activ-
ity Relationship Models of Animal Carcinogenicity. Enviro Health Persp 2008;116:506-
513.

 92.  Yang, C, Arnby, CH, Arvidson, K, Aveston, S, Benigni, R, Benz, RD, Boyer, S,
Contrera, J, Dierkes, P, Han, X, Jaworska, J, Kemper, RA,  Kruhlak, NL, Matthews, EJ,
Rathman, JF, Richard, AM. Understanding Genetic Toxicity through Data-mining: The
Process of Building Knowledge by Integrating Multiple Genetic Toxicity Databases. Tox
Mech Method 2008;18:277-295.

 93.  Leadscope Inc. ToxML editor. Ohio, USA. http://www.leadscope.com/toxmLeditor.

 94.  Ashby, J.  Fundamental  structural alerts  to potential carcinogenicity  or non-
carcinogenicity. Environ  Mutagen 1985;7:919-921.

 95.  Klassen, CD. Casarett & Doull's Toxicology: The basic science of poisons. New
York: McGraw-Hill; 1996.

-------
90  E. Benfenati et al.


      96.   Yang, C, Richard, AM, Cross, KP. The Art of Data-mining the Minefields of
     Toxicity Databases to Link Chemistry to Biology. Curr Comp-Aided Drug Design
     2006;2:135-150.

      97.   Xia, M, Huang, R, Witt, KL, Southall, N, Fostel, J, Choi, MH, Jadhav, A, Smith,
     CS, Inglese, J, Portier, CJ, Tice, RR, Austin, CP. Compound Cytotoxicity Profiling
     Using Quantitative High-Throughput Screening. Environ Health Persp 2008;116:284-291.
     Available at: http://www.ehponline.org.

      98.   Gold, LS, Bernstein, L, MaGaw, R, Slone, TH. Interspecies extrapolation in
     carcinogenesis: Prediction between rats and mice. Environ Health Persp 1989;81:211-219.

      99.   Benigni,  R,  Giuliani,  A,  Franke,  R,  Gruska,  A.  Quantitative  structure-
     activity relationships of mutagenic and carcinogenic aromatic amines. Chem  Revs
     2000;100:3697-3714.

-------
TOXICOLOGICAL SCIENCES 102(1), 15-32 (2008)
doi: 10.1093/toxsci/kfm286
Advance Access publication November 17, 2007
  Predicting  Maternal  Rat  and Pup Exposures:  How Different are They?
                                        Miyoung Yoon*,†,1 and Hugh A. Barton‡,2

     *National Research Council Research Associateship Program at U.S. Environmental Protection Agency, Research Triangle Park, North Carolina;
   †US EPA Human Studies Facility, 104 Mason Farm Road, Chapel Hill, North Carolina 27599; and ‡National Center for Computational Toxicology,
                           U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711

                                     Received September 17, 2007; accepted November 14, 2007
  Risk  and  safety  assessments  for early life  exposures  to
environmental chemicals or pharmaceuticals based on cross-
species extrapolation would greatly benefit from information on
chemical dosimetry  in  the young.  Although  relevant toxicity
studies involve exposures during multiple life stages, the mother's
exposure dose is frequently  used for  extrapolation of rodent
toxicity findings to humans and represents a substantial source of
uncertainty. A compartmental  pharmacokinetic model augmented
with biological information on factors changing during lactation
and  early  postweaning  was developed.  The model uses adult
pharmacokinetics,  milk  distribution, and relevant  postnatal
biology to  predict dosimetry  in the  young for chemicals. The
model  addressed three  dosing  strategies employed in toxicity
studies (gavage, constant ppm diet, and adjusted ppm diet) and
the impact of different pharmacokinetic properties such as rates
of clearance, milk distribution, and volume of distribution on the
pup exposure doses and internal dosimetry. Developmental delays
in clearance and recirculation of chemical in excreta from the pup
to mother were evaluated. Following comparison  with data  for
two chemicals, predictions were made for theoretical chemicals
with a range of characteristics. Pup exposure was generally lower
than the mother's with  a shorter half-life,  lower  milk transfer,
larger volume of distribution, and gavage dosing, while higher
with longer half-life, higher  milk transfer, smaller volume of
distribution, and dietary exposures. The present  model demon-
strated pup exposures do not always parallel the mother's. The
model predictions can be used to help design early life toxicity and
pharmacokinetic studies and better interpret study findings.
  Key Words: early life dosimetry; biological modeling; lactational
exposure.
  Disclaimer: This work was reviewed by EPA and approved for publication,
but does not necessarily reflect official Agency policy. Mention of trade names
or commercial products does not constitute endorsement or recommendation by
EPA for use.
  1 Present address: The Hamner Institutes of Health Sciences, 6 Davis Drive,
Research Triangle Park, NC 27709.
  2 To whom correspondence  should be addressed at  National Center for
Computational Toxicology, B205-1, Office of Research and Development, US
Environmental Protection Agency, 109 TW Alexander Dr., Research Triangle
Park, NC 27711. Fax: (919)-541-1994. E-mail: habarton@alum.mit.edu.

Published by Oxford University Press 2007.
         Evaluating potential risks from early life exposures is more
      challenging than evaluating risks in adults, in part, because the
      relevant toxicity  studies  including  one-  or  two-generation
      reproductive, developmental, and developmental neurotoxicity
      studies involve multiple life stages (e.g.,  gestation,  lactation,
      and postnatal growth of offspring). Currently, the average daily
      dose given to the  mother is used for extrapolation to humans,
      even  when  the  effects are observed in  the  offspring. To
      improve extrapolation  of  animal toxicity  data to humans,
      information on chemical dosimetry in the young during critical
      developmental windows would be  needed (Barton,  2005).
      However, dosimetry data from early life exposures are scarce
      for environmental chemicals, and even for pharmaceuticals, in
      multigeneration studies. Poorly  characterized pup dosimetry
      during lactational and early postweaning periods is a substantial
      source  of uncertainty in the extrapolation of rodent toxicity
      findings to humans along with uncertainty in the identification
      of critical developmental windows. In recent years, predictions
      of  perinatal  internal  exposures  have  been  made  using
      computational  pharmacokinetic  modeling for a  number  of
      environmental  and  pharmaceutical  chemicals  (reviewed  in
      Corley et  al.  (2003)).  However, it is  generally  difficult to
      develop a full physiologically based pharmacokinetic model
      because of limitations on pharmacokinetic information during
      the relevant  periods as well as  limited information regarding
      physiological parameters for early life stages (e.g., gestation,
      lactation,  and early postweaning).
         Knowledge  of  pup dosimetry can contribute not only to
      applying  study  results  in evaluating  risks  but  also  for
      improving toxicity study designs. A critical factor determining
      chemical  concentrations in pups  would  be the  extent and
      pattern of maternal  chemical exposure because  it determines
      chemical concentrations in the adult animal in repeated dosing
      scenarios  (Saghir  et al., 2006; Yuan, 1993). Several different
      dosing  approaches are used in toxicity studies including diet,
      drinking  water,  gavage,  and,   if  appropriate,   dermal and
      inhalation. Although dietary exposure often represents a rele-
      vant  exposure method  for chemicals,  in  some  cases  it is
      difficult to use due to technical problems preparing chemical-
      fortified diet or a  need to accurately determine maternal dose
      levels  leading  to the  use  of gavage dosing.  Only limited

-------
                                                      YOON AND BARTON
                                                           TABLE 1
                                        List of the Theoretical Test Compound Categories

No.  Chemical category                          Abbreviation   Chemical properties
1    Base case (Vd^a = 0.7, Pm = 1)             Base           Uniformly distributed throughout the body/approximately
                                                               distributed to total body water;
                                                               milk concentration equals the maternal blood concentration
2    Small volume of distribution (Vd = 0.2)    SmVd           Limited distribution to tissues;
                                                               highly bound to plasma protein
3    Large volume of distribution (Vd = 2.5)    LgVd           Distribution to storage depot
4    High milk transfer (Pm = 10)               HighPm         Milk concentration greatly exceeds maternal blood level
5    Moderate milk transfer (Pm = 3)            MidPm          Milk concentration is moderately higher than maternal blood level
6    Low milk transfer (Pm = 0.1)               LowPm          Milk concentration is lower than maternal blood level

  Note. We evaluated 16 different theoretical test compounds in the present study. Two test compounds were defined for each of the six chemical categories, one
with a short half-life and the other with a long half-life. Categories 2-6 include 10 different theoretical compounds, each of which possesses the same chemical
properties as the base case except for the one factor varied to the value noted in parentheses. For base case chemicals, the impacts of delayed elimination capacity
or excreta recirculation were evaluated for each half-life. Otherwise, prenatal development of elimination and no recirculation were modeled for all the test
compounds in categories 2-6. The details of pharmacokinetic parameter values used for each test compound are listed in Table 3.
  ^a Vd represents the body weight-normalized volume of distribution (l/kg) here and in other tables.
consideration has  been  given to  potential differences in the
amount  of  chemical transferred   to  the  suckling  neonates
resulting from gavage versus dietary administration of a
compound (Arnold et al., 2000). There has been a concern about the
potential overexposure during lactation due to highly increased
maternal food consumption during this period, which has been
discussed as a potential cause of misinterpretation of increased
neonatal toxicity during lactation (Hanley and Watanabe, 1985).
To that end, a modified  dietary  administration  regimen  is
sometimes employed in reproductive toxicity studies, which
adjusts the chemical  concentration in diet based on historical
food intake data during lactation to maintain relatively  constant
exposures  during this period (Hanley et al., 2002).
   The present study was intended to provide a tool to predict
pup dosimetry  using  limited biological  and pharmacokinetic
properties of test compounds and, thus, to help design toxicity or
pharmacokinetic studies in early  postnatal periods as well as to
help understand findings  from such studies.  The goals of this
research were to evaluate whether  one could use data  on adult
pharmacokinetics and milk transfer in conjunction with  a bi-
ologically based model to predict pup dosimetry to a reasonable
approximation and then evaluate  how different pharmacokinetic
properties  (e.g., rates of clearance and milk distribution) would
affect the pup exposure doses (e.g., from milk) and circulating
concentrations.   A  classical  compartmental  pharmacokinetic
modeling  approach was  employed, which was supported by
biological information on changing factors (e.g., increasing pup
body weight) during lactation and early postweaning period.
Three   dosing   approaches   employed   in   toxicity  studies
(i.e., unadjusted ppm diet, adjusted ppm diet, and gavage) were
simulated to compare resulting maternal and neonatal dosimetry.
Model performance was evaluated by comparing the model to
previously reported lactational exposure data for two chemicals.
Subsequently, exposures were  simulated for 16  theoretical
   compounds with  a variety of different  characteristics bench-
   marked from environmental chemicals and/or pharmaceuticals.
   The properties  of these theoretical test compounds are listed in
   Table  1,  categorized in  six  cases  varying  elimination  rate,
   volume of distribution,  and milk transfer. The present model
   simulates  pharmacokinetics in the dam and pups for the parent
   compound only. This model enables simultaneous consideration
   of several factors with potential  effects on pup exposures  and
   consequently provides a means to predict the overall impact of
   these factors on dosimetry in the young. From the results of this
   modeling  exercise, we have begun to derive  general  insights
   about pup exposures and different study designs for chemicals
   with different properties.
                            METHODS

                           Model Structure

     The model  was  coded  and  all  the simulations were performed  using
   acslXtreme (version 2.0.1.7, Aegis, Inc., Huntsville, AL). The structure of the
   biologically based pharmacokinetic model for chemical exposures of the dam
   and pups is illustrated in Figure 1. Simulation of chemical exposures was
   performed for 28 days after birth, of which the first 21 days were the lactational
   period followed by 1 week postweaning. The model describes changes in body
   weight, milk production and consumption, and food consumption during those
   4 weeks as well as exposure  by three methods—gavage, unadjusted feeding, or
   adjusted feeding. All abbreviations and symbols used in describing the model
   structure are listed in the legend of Figure 1. Equations for parameter values
   that change over the duration of simulation are presented in Table 2. Values for
   two other parameters, BWd and Vm, which change during the simulation period
   were  incorporated in the model using TABLE functions in acslXtreme as
   explained later in this section. All  other chemical parameters are listed in
   Table 3.

   Model Structure for the Dam

     The present  model uses  a  one-compartment  pharmacokinetic  model
   structure, in which the dam and pups were each represented as one central

-------
                                         BIOLOGICAL MODELING OF RAT EARLY LIFE DOSIMETRY
[Figure 1 schematic. Panel labels: Maternal Exposure (Food, Unadjusted or Adjusted, vs. Gavage); Feeding Dose; Gavage Dose; Dam; Milk; Pups;
Neonatal Exposure: Lactational (Weaning after PND21); Neonatal Exposure: Post weaning; Recirculation (Birth to PND14).]
  FIG. 1.  Schematic representation of BBPK model for chemical exposures during lactation and early postweaning. Abbreviations used in the present model are
as follows: Dam (subscript d), the compartment for the mother; Kad, first-order absorption constant for the dam (per hour); Vd, volume of distribution of the dam (l);
Ked, first-order rate constant for chemical elimination from the dam excluding milk secretion (per hour); KL, rate constant for chemical secretion via milk from
the dam (per hour); Milk (subscript m), the conceptual compartment for milk; Vm, volume of milk secreted from the dam/ingested by the N pups (l/day); Pups
(subscript p), the compartment for the pups, as a litter; Kap, first-order absorption constant for the pups (per hour); N, the number of pups per litter; Vp, volume of
distribution of an individual pup (l); Kep, first-order rate constant for chemical elimination from the pups (per hour); RFDd, rate of feed dosing in the dam (g/h);
RFDp, rate of feed dosing in the pups (g/h). Chemical concentration in each compartment is expressed as C with a subscript for the corresponding compartment:
Cd, concentration in the dam (mg/l); Cm, concentration in the milk (mg/l); Cp, chemical concentration in the pups (mg/l).
compartment. Elimination of a test chemical (e.g., metabolic or urinary) was
described as a first-order process with a rate constant, Ked (per hour); saturable
metabolism was not modeled. Elimination of chemical from the dam through
milk was modeled as a separate process determined by another elimination rate
constant, KL (per hour) (Fig. 1). The chemical absorption process of the dam
was a first-order process described by Kad (per hour). The model structure for
dosing is detailed in a following section. The volume of distribution of the dam
compartment (Vd) was defined as the product of the body weight-normalized
volume of distribution (Vdd, l/kg) and the body weight of the dam (BWd, kg).
   Changes in the amount of the test chemical in the dam (mg/h) are described
in Equations 1-3, where Ad represents the amount of chemical in the dam (mg).
The overall chemical change in the dam was a function of the chemical
absorbed from the absorption site (RABd), elimination (Ked X Ad), and
secretion via milk (KL X Ad) (Equation 1).
                 dAd/dt = RABd - Ked X Ad - KL X Ad.                 (1)

                 RABd = dABd/dt = Ka
-------
                                                                TABLE 2
                               Equations Incorporated in the Model to Describe Changing Parameters^a

                                                   Simulation periods
Parameter        Week 1          Week 2          Week 3          Week 4                                References
BWp              0.0022 + 0.0028 X PND - 0.00014 X PND^2 + 0.0000052 X PND^3 (all weeks)               Doerflinger and Swithers (2004)
FOODd            (12 + 82 X PND)/(8.6 + PND) (weeks 1-3)         (65000000 X e^(-0.67 X PND)) + 17     Shirley (1984)
FOODp^b          0               0               0.569 + (9.74 - 0.569)/(1 + e^((22 - PND)/0.9))       Redman and Sweney (1976)
AJ  Unadj^c      1               1               1               1
    Adj^d        FID/121^e       FID/202         FID/269         1                                     Shirley (1984)
R   Prenatal^f   1               1               1               1
    Delay^g      (1 X PND)/(PND + 7) (all weeks)

  Notes. BWp, body weight of the pup (kg, individual pup); FOODd, daily food consumption by the dam (g); FOODp, daily food consumption by the pup
(g, individual pup); AJ, feeding dose adjustment factor; R, ratio of Kep/Ked, defining developmental pattern of elimination capacity in the pup.
  ^a These equations were written in a DISCRETE block in the acslXtreme CSL file.
  ^b The FOODp equation was in effect from PND17 onward.
  ^c "Unadj" represents the unadjusted feeding dosing simulation.
  ^d "Adj" represents the adjusted feeding dosing simulation.
  ^e FID refers to the reference intake and the numbers represent the food intake by the dam during the indicated simulation week.
  ^f "Prenatal" refers to development of adult elimination capability occurring before birth, so it was modeled as constant after birth.
  ^g "Delay" refers to delayed development of elimination capacity simulated in the model. The elimination capacity was modeled to reach the half maximum on
PND7.
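
The body weight and food intake expressions of Table 2 can be evaluated directly. The following is a minimal Python sketch, not the authors' acslXtreme code; the function names are hypothetical, and applying the FOODd hyperbola only to the lactation period is an assumption made here.

```python
def pup_body_weight_kg(pnd):
    """BWp polynomial from Table 2 (kg, individual pup) as a function of postnatal day."""
    return 0.0022 + 0.0028 * pnd - 0.00014 * pnd**2 + 0.0000052 * pnd**3


def dam_food_g_per_day(pnd):
    """FOODd hyperbola from Table 2 (g/day).

    Assumption of this sketch: the expression is used for the lactation
    period (PND 0-21) only.
    """
    return (12 + 82 * pnd) / (8.6 + pnd)


if __name__ == "__main__":
    for day in (1, 7, 14, 21):
        print(day, round(pup_body_weight_kg(day), 4), round(dam_food_g_per_day(day), 1))
```

In a DISCRETE-block style of use, these functions would be called once per simulated day and the result held constant until the next day.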
                                                                TABLE 3
                       Chemical Parameters and Simulation Conditions for the 16 Theoretical Test Compounds

                                              Vd (l/kg)                   Ka (per hour)    Ke (per hour)
No.  Category                 Half-life   Dam (Vdd)   Pup (Vdp)   Pm      Dam     Pup      Dam       Pup         Recirculation
1    Base                     Short       0.7         0.7         1       2       2        0.7^a     0.7^a       No
2                             Long        0.7         0.7         1       2       2        0.03^b    0.03^b      No
3    w/Developmental delay    Short       0.7         0.7         1       2       2        0.7       0.7 X R     No
4                             Long        0.7         0.7         1       2       2        0.03      0.03 X R    No
5    w/Recirculation          Short       0.7         0.7         1       2       2        0.7       0.7         Yes
6                             Long        0.7         0.7         1       2       2        0.03      0.03        Yes
7    SmVd                     Short       0.2         0.2         1       2       2        0.7       0.7         No
8                             Long        0.2         0.2         1       2       2        0.03      0.03        No
9    LgVd                     Short       2.5         2.5         1       2       2        0.7       0.7         No
10                            Long        2.5         2.5         1       2       2        0.03      0.03        No
11   HighPm                   Short       0.7         0.7         10      2       2        0.7       0.7         No
12                            Long        0.7         0.7         10      2       2        0.03      0.03        No
13   MidPm                    Short       0.7         0.7         3       2       2        0.7       0.7         No
14                            Long        0.7         0.7         3       2       2        0.03      0.03        No
15   LowPm                    Short       0.7         0.7         0.1     2       2        0.7       0.7         No
16                            Long        0.7         0.7         0.1     2       2        0.03      0.03        No

  Note. Abbreviations used for describing category are from Tables 1 and 2. Values in bold highlight the base case for comparison purposes and the changes in
individual parameters defining specific cases.
  ^a t1/2 = 1 h.
  ^b t1/2 = 24 h.

-------
   When modeling developmental delays in elimination, Kep was modeled to
be proportionally related with the mother's value, i.e., following Kep = R X
Ked, where R indicates the developmental pattern of pups' elimination capacity.
Delayed development of elimination was modeled using a Michaelis-Menten
type curve (Table 2).
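
The relationship Kep = R X Ked can be illustrated with a short Python sketch. It assumes the Michaelis-Menten type form R = PND/(PND + 7), which has a maximum of 1 and reaches half maximum on PND7 as stated in Table 2; the function and argument names are hypothetical and are not taken from the original model code.

```python
def elimination_ratio(pnd, delayed=True, half_max_pnd=7.0):
    """Ratio R = Kep/Ked on a given postnatal day.

    delayed=False corresponds to prenatal development of elimination (R = 1).
    delayed=True uses a Michaelis-Menten type curve with maximum 1 and
    half maximum at half_max_pnd.
    """
    if not delayed:
        return 1.0
    return pnd / (pnd + half_max_pnd)


def pup_elimination_rate(ked_per_hour, pnd, delayed=True):
    """Kep = R X Ked (per hour)."""
    return elimination_ratio(pnd, delayed) * ked_per_hour


if __name__ == "__main__":
    # Short half-life base case: Ked = 0.7 per hour (Table 3).
    for day in (1, 7, 14, 21, 28):
        print(day, round(pup_elimination_rate(0.7, day), 3))
```

With this form the pup never quite reaches the adult rate constant within the 28-day simulation; on PND28 the ratio is 0.8.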

Structure for Lactational Transfer

   The dam and pup  compartments were connected with a conceptual milk
compartment, the volume of which (Vm, l) does not refer to actual existing
volume  as  a separate compartment, but rather represents the postnatal day
(PND)-dependent volume of milk produced per day. It was assumed that the
pups consume all the milk produced without any delay between production and
ingestion. The rate of milk production and suckling was assumed to be constant
without  any circadian variation. A simplified description of milk intake was
used that did not describe separate suckling episodes throughout the day,  but
rather used a continuous input to the pups at a constant rate.
   The rate constant for lactational transfer/secretion of chemical (KL) was
derived  from  two  predetermined  factors, the Vm  and the milk  partition
coefficient or ratio of the chemical concentration in milk to the dam's central
compartment (Pm). Pm was employed in the model as an index of the extent of
chemical transfer into milk relative to the levels of chemical in plasma (i.e.,
mother's central compartment). It was assumed that the chemical concentration
in milk  was in instant equilibration with  maternal blood and consequently
parallels her concentration. Pm was assumed to be constant during the whole
lactational period. By definition,

                            Cm = Pm X Cd.                          (7)

The amount of chemical secreted in milk (Am, mg) can be expressed as
a function of the clearance to milk (Clm, l/h) and the milk concentration
(Cm, mg/l) expressed as the product of the concentration in the dam and the
milk partition coefficient:

                  dAm/dt = Clm X Cm = Clm X Cd X Pm.                 (8)

The milk clearance is the volume of milk produced per day (Vm) divided by 24
h. However, the model was specified in terms of rate constants, so we need to
derive the milk elimination rate constant KL (per hour) on the ith PND:

                        dAm/dt = KL X Cd X Vd.                       (9)

Setting Equations 8 and 9 equal and rearranging terms gives:

                        KL = Pm X Vm/(24 X Vd).                     (10)

Now the rate of lactational transfer of chemical (RML) is expressed as:

                      RML = KL X Ad = dAML/dt,                      (11)

where AML represents the amount of chemical secreted through milk obtained
by the  integration of Equation  11 and available for absorption to the pups in
Equation 6.
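
The derivation of the milk rate constant can be checked numerically. The Python sketch below computes KL from Pm, Vm, and Vd and confirms that the clearance form (Equation 8) and the rate-constant form (Equation 9) give the same transfer rate. The dam body weight, milk volume, and dam concentration in the example are hypothetical illustration values, not values from the study.

```python
def dam_volume_of_distribution(vdd_l_per_kg, bw_dam_kg):
    """Vd (l) = body weight-normalized volume (l/kg) X dam body weight (kg)."""
    return vdd_l_per_kg * bw_dam_kg


def milk_rate_constant(pm, vm_l_per_day, vd_l):
    """KL (per hour) = Pm X Vm / (24 X Vd)."""
    return pm * vm_l_per_day / (24.0 * vd_l)


if __name__ == "__main__":
    pm = 1.0            # base case milk partition coefficient (Table 1)
    vdd = 0.7           # base case Vd, l/kg (Table 1)
    bw_dam = 0.3        # hypothetical dam body weight, kg
    vm = 0.03           # hypothetical milk production, l/day
    cd = 2.0            # hypothetical dam concentration, mg/l

    vd = dam_volume_of_distribution(vdd, bw_dam)
    kl = milk_rate_constant(pm, vm, vd)

    rate_clearance_form = (vm / 24.0) * cd * pm   # Clm X Cd X Pm
    rate_constant_form = kl * cd * vd             # KL X Cd X Vd
    print(kl, rate_clearance_form, rate_constant_form)
```

Because Vm and the dam body weight both change with postnatal day, KL would be recomputed each simulated day rather than treated as a fixed chemical constant.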

Model  Structure for Dosing
   The model included exposure of the dam by three dosing approaches—
gavage, unadjusted feeding (constant ppm in diet), or adjusted feeding (weekly
changes in ppm in diet).  All these were intended to provide the same target
dose, 15 mg/kg/day either throughout the study (gavage, adjusted feeding) or at
the appropriate baseline  period (unadjusted  feeding). After weaning, direct
dosing to the pups by gavage or unadjusted feeding was included in the model
for 1 week. The dosing to the pup was modeled for an individual pup, rather
than for the combined litter of eight pups. In the case of dietary exposure, the
pup started eating the same diet as the dam on PND 17 as described later. After
weaning, the pups were assumed to consume unadjusted diet.

   Gavage.  Gavage dosing was  modeled as a bolus dose scheduled  once
a day  (coded  in a DISCRETE block in acslXtreme)  at the target dose of
                                                                          15 mg/kg/day (ODOSEO). For gavage dosing, the daily dose to the dam or pups
                                                                          (Dosed or Dosep in Equations 3 and 6) was defined as:
                                                                                                 Dose = ODOSEO X BW.
                                                                          (12)
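
As a rough illustration of gavage dosing in a one-compartment structure, the Python sketch below scales the target dose by body weight (Equation 12) and evaluates the amount in the central compartment after a single bolus using the standard closed-form solution for first-order absorption and elimination. It is a simplification and not the published model: milk secretion, repeated daily doses, and changing body weight are ignored, complete absorption is assumed, and the body weight and function names are hypothetical.

```python
import math


def gavage_dose_mg(target_mg_per_kg, bw_kg):
    """Equation 12: daily dose (mg) = target dose (mg/kg/day) X body weight (kg)."""
    return target_mg_per_kg * bw_kg


def amount_after_bolus(dose_mg, ka, ke, t_h):
    """Amount (mg) in the central compartment t_h hours after one oral bolus.

    Closed-form solution for first-order absorption (ka) and first-order
    elimination (ke), both per hour, assuming complete absorption.
    """
    if math.isclose(ka, ke):
        return dose_mg * ka * t_h * math.exp(-ka * t_h)
    return dose_mg * ka / (ka - ke) * (math.exp(-ke * t_h) - math.exp(-ka * t_h))


if __name__ == "__main__":
    dose = gavage_dose_mg(15.0, 0.3)   # 0.3 kg is a hypothetical dam body weight
    for t in (0.0, 0.5, 1.0, 2.0, 6.0, 24.0):
        # Ka = 2 and Ke = 0.7 per hour are the short half-life base case (Table 3).
        print(t, round(amount_after_bolus(dose, 2.0, 0.7, t), 4))
```

Dividing the amount by the volume of distribution gives the corresponding concentration.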
                                                                            Dietary  exposure.  To  simulate  feeding  exposure, two  factors  that
                                                                          determine the amount and rate of chemical input via diet were  incorporated
                                                                          in the model, the amount of food intake per day (FOOD, g/day) and the diurnal
                                                                          pattern of food consumption using the  mean percentage of total food intake
                                                                          during 1-h intervals (FOODPC, %/h). Feeding exposure for each consecutive
                                                                          simulation day was modeled as a continuous addition of the  chemical to the
                                                                          absorption site of the dam  or pups at a specific rate (RFC, g/h) using the
                                                                          TABLE function in acslXtreme.
                                                                                              RFC = FOODPC/100 X FOOD.
                                                                          (13)
                                                                          Integrating Equation 13 gives the amount of food  consumed (AFC, g). The
                                                                          amount of feed consumed per day (FOOD) was introduced using the DISCRETE
                                                                          block for the dam and when applicable, for the pups, to accommodate daily
                                                                          changes. The values for FOOD were determined by the equations in Table 2.
                                                                            The chemical concentration in diet (FEEDO, mg/g diet) to achieve the target
                                                                          dose (TARGET, 15 mg/kg/day) was derived using the mean food consumption
                                                                          (FID, g/kg/day) by the dam during gestation or during the first week of lactation
                                                                          as a reference point, depending on the simulation scenarios.
                                                                                                 FEEDO = TARGET/FID.
                                                                          (14)
                                                                         For unadjusted feeding, FEEDO was used for the whole duration of simulation
                                                                         without any modification. In order to simulate adjusted feeding exposure, a feeding
                                                                         dose-adjustment factor (AJ) was incorporated in the model to appropriately reduce
                                                                         chemical concentration in food during lactation based on the extent of increase in
                                                                         food intake during lactation compared to the reference intake (FID) (Table 2).
                                                                                                  AJ = FID/Intake_i,                 (15)
                                                                                                  FEED = FEEDO X AJ,                 (16)
       where Intake_i indicates the mean food intake (g/kg/day) during the ith week of
       lactation and FEED represents the adjusted chemical concentration in food
       (mg/g food). Intake_i was adapted from historical intake data (Shirley, 1984) for
       which the intake by the dam and pups was not discriminated, as is typical in
       toxicity studies. Adjustment of chemical concentration in food was modeled on
       a weekly basis and only for lactation period,  so AJ = 1  was used for the
       postweaning period  returning the  concentration  in  food to  the initial
       concentration. Consequently,  adjusted  and  unadjusted  feed concentrations
       were the same after weaning (Table 2). Although the direct dosing of the pups
       through food during the first week after weaning was expressed as unadjusted
       feeding,  some toxicity studies also adjust the diet during this period.
          The rate of chemical dosing via feeding (RFD, mg/h) is:
                                                                                                  RFD = FEED X RFC.
                                                                          (17)
       Hence, the dose  from dietary exposure (FDOSE, mg) (i.e., the amount of
       chemical consumed via diet) was obtained by integrating RFD and then utilized
       as Dose in Equations 3 and 6 when dietary administration was simulated.
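
The dietary dosing equations (13 to 17) can be sketched in Python as follows. The weekly intakes of 121, 202, and 269 g/kg/day are the denominators shown for the adjusted feeding case in Table 2; the reference intake FID of 80 g/kg/day, the hourly intake percentage, and the function names are hypothetical values chosen only for illustration.

```python
def initial_feed_concentration(target_mg_per_kg_day, fid_g_per_kg_day):
    """Equation 14: FEEDO (mg/g diet) = TARGET / FID."""
    return target_mg_per_kg_day / fid_g_per_kg_day


def adjustment_factor(fid_g_per_kg_day, intake_g_per_kg_day):
    """Equation 15: AJ = FID / Intake for the given week of lactation."""
    return fid_g_per_kg_day / intake_g_per_kg_day


def feed_dosing_rate(feed_mg_per_g, foodpc_percent_per_h, food_g_per_day):
    """Equations 13 and 17: RFD (mg/h) = FEED X (FOODPC/100 X FOOD)."""
    rfc_g_per_h = foodpc_percent_per_h / 100.0 * food_g_per_day
    return feed_mg_per_g * rfc_g_per_h


if __name__ == "__main__":
    target = 15.0                       # mg/kg/day, target dose
    fid = 80.0                          # hypothetical reference intake, g/kg/day
    weekly_intake = [121.0, 202.0, 269.0]

    feed0 = initial_feed_concentration(target, fid)
    for week, intake in enumerate(weekly_intake, start=1):
        feed = feed0 * adjustment_factor(fid, intake)   # Equation 16
        print(week,
              round(feed0 * intake, 1),   # unadjusted daily dose, mg/kg/day
              round(feed * intake, 1))    # adjusted daily dose, mg/kg/day
```

Under these assumptions the adjusted diet returns the daily dose to the 15 mg/kg/day target in each week of lactation, while the unadjusted diet delivers a dose that rises in proportion to food intake.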
          Recirculation  of excreta.  Neonatal rats are known to be unable to
       eliminate  wastes  without maternal stimulation  for  several days after birth
       (Henning, 1981). Hence, it was expected that much of the chemical eliminated
       from the pups returned to the dam through this process. In order to simulate the
       recirculation  of excreta between the dam and pups,  the amount of chemical
       eliminated from the pups (AEP, g) was modeled as an additional chemical input
       to the dam without loss for the first 2 weeks of postnatal period (Fig. 1). AEP
       was added to the Dose in Equation 3 during those 2 weeks, where AEP  was
                                                               . X AD).

                               Model Parameterization

          The present model  incorporated known changes  in biological parameters
       during lactation and the early postweaning period. Modeled kinetic properties
for the test compounds were benchmarked using data from real chemicals. The
rationale for biological and chemical parameterization of the current model is
detailed in the Supplementary section. Parameter values were obtained from
literature when possible, but several assumptions were made due to limited data
availability for these early life stages in rats. Efforts were made to obtain values
within the same  study and/or for the same species of rats whenever possible.
Biological/pharmacokinetic parameters  changing during lactation  and early
postweaning were modeled with either linear interpolations between reported
time points of measurements using the TABLE function in acslXtreme or curve
fitting  to reported  data  points using nonlinear  regression  tools in Prism 4
(GraphPad Software, Inc., San Diego, CA). Equations from fitted curves are
listed in Table 2. The values used in TABLE functions for BWd and Vm are
reported in the Supplementary section. The equation for consumption of food
and the equation for utilizing the TABLE value for daily milk volume were
written in a DISCRETE block, so that they varied on a daily basis, but stayed
constant during each of the 24 h. When converting data points from previously
published figures in the literature into numbers in order to incorporate them in
the model, DigitizeIt software was used (version 1.5, www.digitizeit.de).
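
   The two interpolation behaviors described above can be sketched as follows. This is an illustrative Python analogue, not acslXtreme syntax: `table_lookup` mimics linear interpolation between tabulated points (clamping outside the tabulated range is an assumption made here), and `daily_hold` mimics a value that is updated once per day and then held for 24 h, as the DISCRETE block is described as doing. Both function names are hypothetical.

```python
from bisect import bisect_right

def table_lookup(x, xs, ys):
    """Linear interpolation between tabulated points (xs must be ascending).

    Values outside the tabulated range are clamped to the end points;
    this clamping is an assumption of the sketch.
    """
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    j = bisect_right(xs, x)  # xs[j-1] <= x < xs[j]
    frac = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
    return ys[j - 1] + frac * (ys[j] - ys[j - 1])

def daily_hold(t_hours, xs_days, ys):
    """Evaluate the table once per day and hold that value for the next 24 h."""
    day = int(t_hours // 24)
    return table_lookup(day, xs_days, ys)
```

   With this arrangement a quantity such as daily food consumption changes from one day to the next but is identical at hour 25 and hour 47 of the simulation.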

Biological Parameters
   The biological data incorporated in the present model are shown in Figures 2
and 3.
   Body weights.  Maternal body weight (BWd) changes during lactation and
postweaning periods were incorporated  into  a TABLE function  based on
published values for Sprague-Dawley rats (Shirley,  1984). The growth rate of
the neonates was derived from pup body weight data for Sprague-Dawley rats
(Doerflinger and Swithers, 2004) as shown in Figure 2 and Table 2.
   Food consumption.   Food intake for the dam was adapted from the same
study from which the body weight data were obtained (Shirley, 1984). Shirley
reported  the  total observed intake  as  maternal intake,  although  food
consumption by  the pups in later part of lactation was observed. In order to
calculate food intake solely by the dam, an estimated amount of food eaten by
a whole litter (average size reported as 9.5 pups)  was subtracted from the total
reported intake. For this purpose, the amount of food consumed per day by an
individual pup (g/pup/day) reported in another study with Sprague-Dawley rats
(Redman  and  Sweney,  1976)  was multiplied  by 9.5. The onset of diet
consumption by the pups was introduced as PND17 in the present model. The
estimated mother-only intake values (g/dam/day) incorporated in the model are
plotted in Figure 2 along with the observed total intake values during the last
5 days of lactation for comparison purposes. The pattern of feeding by the pups
was  modeled  with intake starting on  PND17  and rapidly  increasing until
PND21,  followed  by a continuous  increase over the  postweaning  period
(Redman and Sweney, 1976) (Fig. 2).
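
   The subtraction used to estimate dam-only intake can be written out as a short sketch. The litter size (9.5) and the onset of pup feeding (PND17) are taken from the text; the function name and the numbers in the usage note are hypothetical and are not values from Shirley (1984) or Redman and Sweney (1976).

```python
LITTER_SIZE = 9.5            # average litter size reported for the intake study
PUP_FEEDING_ONSET_PND = 17   # onset of diet consumption by the pups in the model

def dam_only_intake(total_intake_g, pup_intake_g_per_pup, pnd):
    """Estimate food eaten by the dam alone (g/dam/day).

    Before pups start eating, all observed intake is attributed to the dam.
    From PND17 onward, the estimated litter intake is subtracted.
    """
    if pnd < PUP_FEEDING_ONSET_PND:
        return total_intake_g
    return total_intake_g - LITTER_SIZE * pup_intake_g_per_pup
```

   For example, with a hypothetical observed total of 70 g/day and a hypothetical pup intake of 1 g/pup/day on PND18, the dam-only estimate is 70 - 9.5 = 60.5 g/day.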

   Dietary dose adjustment.  To simulate the adjusted diet dosing regimen by
reducing  the  chemical concentration on a weekly basis,  the feeding dose
adjustment factor AJ described in Equations 15 and 16 was included in the model
as shown in Table 2. Values  for FID and Intake, were derived from gestational
and lactational food intake  data using Sprague-Dawley rats (Shirley,  1984).

   Diurnal variation in feeding behavior.  The diurnal feeding behavior of
the dam and its changing pattern during the lactation and postweaning periods
were included in  the model based on data for Wistar rats (Strubbe and Gorrisen,
1980). It was  incorporated  in the model as % total  daily  intake  per hour
(FOODPCd, %/h) as described earlier, for which four different patterns were
defined for each week of lactation and  postweaning using TABLE functions
applied sequentially for the corresponding simulation week.
   The diurnal fluctuation of feeding rates was also applied in modeling food
intake by the pups. From PND17 to weaning, it was assumed that the pups follow
the same feeding pattern as the dam during this period, i.e., circadian variations in
feeding were not yet obvious  (Doerflinger and Swithers, 2004; Redman and
Sweney,  1976). The diet consumption pattern in Sprague-Dawley rat pups on
PND25 was adapted to represent the feeding pattern during the postweaning
periods in the current model (Redman and Sweney, 1976). As in the case of the
dam, the feeding pattern was incorporated in the model as hourly rates of food
intake (FOODPCp, % total daily intake/h) using a TABLE function.

[FIG. 2, plot not reproduced: panels show food intake, measured (dam + pups) and estimated (dam only), and pup body weight (BW, pup), each plotted against PND.]

  FIG. 3.  Theoretical milk yield volume over lactation. "Experimental"
represents data points from Knight et al. (1984). "Modeled" represents
recreated Vm values used for creating the TABLE function. The connecting
solid line shows the simulated Vm values over time. "Theoretical" represents
Vm values calculated to match the suggested caloric requirement from milk for
the growing pups (Stolc et al., 1966).
   Milk consumption.   In this model, the maternal milk yield equals the milk
intake  by the pups with no losses. Daily milk intake  by the pups was
incorporated in the model as a combination of experimentally measured values
for early lactation and theoretical values based on the energy requirement for
growing  rat  pups for the later part of lactation.  Milk  consumption was
simplified using continuous  suckling, rather than attempting to  capture its
episodic occurrence. Since pup intake of milk and food were not available from
a single study, multiple sources were utilized to create estimates that were also
evaluated to ensure that the caloric intake was consistent with the growth of the
pups.
   In order to construct a milk intake curve for the  early period of lactation,
milk yield data determined from Wistar rats with a litter size of 10 were utilized
(Knight et al., 1984). We adopted the values for PNDs 2 and 6 determined by
the tritiated water dilution technique for the TABLE function for milk intake
during the first week of lactation. For the later part of lactation, theoretically
derived milk intake values were incorporated in the model, based on the
suggested caloric requirement of pups (Stolc et al., 1966), calories provided by
independent feeding from PND17 and onwards (Redman and Sweney, 1976),
and the observed ability of neonatal rats to respond to caloric deficit and
consume either milk or diet at appropriate levels to match their energy needs
(Henning, 1981). The milk intake in the second and third weeks of lactation
was set to meet the suggested caloric requirement of 45 kcal/100 g body weight
(Stolc et al., 1966), either provided solely by milk from PND7 to PND16 or
provided both by  milk and  diet from  PND17 onwards,  i.e., milk  volume
suckled by the pups during these last 5 days was estimated to fulfill the energy
requirement not already provided by the diet. The overall milk intake pattern in
the current model consists of two experimental data points for PND2 and PND6
and 15 estimated  values for PNDs7-21 that were utilized in  the TABLE
function. A smooth transition  from the experimentally measured milk intake to
the calculated values was possible because the milk intake values calculated
using the two approaches were very similar for PND6 (Fig. 3). Since the Knight
et al.  (1984) data were for milk intake by 10 pups, daily milk intake per
kilogram pup body weight was calculated using the reported body weight in the
same paper, and  then the daily milk yield for 8 pups  was calculated using pup
body weight simulated in the  model (Doerflinger and Swithers, 2004). Caloric
values  of rat milk were derived from the milk composition data and the energy
        value  of its components (Bornschein  et al.,  1977; Luckey et al., 1954).
        Physiological fuel energy value of 3.41 kcal/g for Certified Rodent Diet 5002,
        which is often used in multigeneration toxicity studies (Hanley et al., 2002;
        Hinderliter et al., 2005), was utilized to calculate caloric value of rat chow
        (www.labdiet.com). Sometimes a different diet (e.g., Diet 5008), which has
        higher calories (i.e., 3.50 kcal/g) than Diet 5002, is used for lactating dams and
        the diet switched after weaning (Howdeshell et al., 2007; Rayner et al., 2007).
        However, these small differences in energy value of diets were not expected to
        make a substantial difference either in milk intake estimation or food intake by
        the pups during late lactation and postweaning period. For instance, the
        estimated Vmilk value was at most 2.5% lower when Diet 5008 was used in the
        simulation. The constructed curve for changing milk intake was incorporated in
        the model to simulate the daily milk intake for the eight pups (Vm, l/day). Circadian
        variation in milk intake  was not modeled,  i.e., constant suckling throughout
        a day  was assumed as suggested  from  a few studies (Godbole  et al., 1981;
        Redman and Sweney, 1976).
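
        The energy-balance reasoning used for late lactation can be sketched as a small calculation. The caloric requirement (45 kcal/100 g body weight, treated here as a daily requirement) and the diet energy value (3.41 kcal/g) come from the text. The caloric density of rat milk is not stated in this section, so it is left as an input, and the value used in the usage note is purely hypothetical; the function name is also an assumption.

```python
REQUIREMENT_KCAL_PER_100_G_BW = 45.0  # suggested caloric requirement quoted in the text
DIET_KCAL_PER_G = 3.41                # Certified Rodent Diet 5002, from the text

def milk_volume_per_pup_ml(bw_g, diet_intake_g, milk_kcal_per_ml):
    """Milk volume needed to cover the energy not already supplied by diet.

    bw_g             : pup body weight (g)
    diet_intake_g    : diet eaten by the pup that day (g); zero before PND17
    milk_kcal_per_ml : caloric density of milk (kcal/ml), supplied by the caller
    """
    required_kcal = REQUIREMENT_KCAL_PER_100_G_BW * bw_g / 100.0
    diet_kcal = DIET_KCAL_PER_G * diet_intake_g
    deficit_kcal = max(required_kcal - diet_kcal, 0.0)
    return deficit_kcal / milk_kcal_per_ml
```

        For a hypothetical 20-g pup eating no diet and a hypothetical milk energy density of 1.5 kcal/ml, the requirement is 9 kcal and the milk volume is 6 ml; once the pup eats diet, the milk volume falls by the calories the diet supplies.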

        Chemical Properties of Theoretical Test Compounds

          The present model was  run  for a series of hypothetical chemicals with
        different characteristics denoted as six categories (Table 1). A total of 16
        different chemicals were simulated incorporating different chemical parameters
        and conditions in the model (Table 3). These chemical parameter values were
        either benchmarked from actual chemicals or derived from a few assumptions
        detailed here.

          Oral bioavailability.   Oral bioavailability of the test chemical  administered
        via gavage, feeding, and  milk transfer was assumed to be 100% by setting the
        value of F (used in Equations 3  and 6) as 1.
          Absorption.  The absorption of the chemical was simulated as a rapid
        process both in the dam and pups. The absorption constants for the dam and
        pups were assumed to be the same (Kad = Kap) and to be constant over the
        duration of simulation.
          Distribution.  The volumes of the distribution for the dam (Vd, l) or pups
        (Vp, l) were calculated by multiplying the body weight-normalized volume of
        distribution scalar (Vd, l/kg) for the dam or pups with its body weight (kg). The
        same Vd values were used both for the dam and the individual pup (Vdd =
        Vdp) and kept constant during the whole simulation. Three values of Vd were
        used to simulate different distribution scenarios: limited distribution to tissues
        (Vd = 0.2), distribution to total body water (Vd = 0.7), and distribution to
        a storage depot (Vd = 2.5).
          Extent of milk transfer.   A ratio of the concentration in the dam's milk to
        that in her plasma (Pm) was used to describe the extent of chemical transfer to
        milk (Equation 7) since milk was assumed to be in instant equilibrium with the
        mother's concentration in her central compartment. The ratio was constant
        during the whole lactational period. Four Pm values were adopted in the present
        model: Pm = 0.1 (milk < plasma), Pm = 1 (milk = plasma), Pm = 3, and
        Pm = 10 (milk > plasma). Milk concentrations compared to the dam's plasma
        or blood for several environmental chemicals and pharmaceuticals in rats fall
        into the Pm ranges used, including perfluorooctanoate (ratio ≈ 0.1),
        2,4-dichlorophenoxyacetic acid (2,4-D, ratio ≈ 1), tetrachloroethylene (milk
        to blood partition coefficient ~ 10), zidovudine (milk to serum ratio ~ 1), and
        ranitidine (milk to serum ratio ≈ 10) (Alcorn and McNamara, 2002;
        Byczkowski et al., 1994; Hinderliter et al., 2005; McNamara et al., 1996;
        Sturtz et al., 2006). Biologically, milk concentration can reflect distribution
        dependent on physical chemical properties (e.g., partition coefficient) and also
        biochemical properties (e.g., active transport and protein binding). In the
        present model, Pm was varied with a fixed Vd, i.e., 0.7 l/kg, so high Pm cases
        would reflect chemicals actively transferred into milk rather than highly
        lipophilic chemicals for which a higher Vd would be expected as well as
        high Pm.
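
        The distribution and milk transfer assumptions reduce to two simple relations, sketched below. Equation 7 is not reproduced in this section, so the form Cm = Pm × Cd is an assumption, although it agrees with the Pm labels in the text (Pm = 0.1 for milk < plasma, Pm = 10 for milk > plasma). The function names and the amount used in the usage note are hypothetical.

```python
VD_SCENARIOS_L_PER_KG = (0.2, 0.7, 2.5)  # limited, total body water, storage depot
PM_VALUES = (0.1, 1.0, 3.0, 10.0)        # milk-to-plasma ratios explored in the model

def plasma_concentration_mg_per_l(amount_mg, vd_l_per_kg, bw_kg):
    """One-compartment concentration: amount divided by (Vd scalar * body weight)."""
    return amount_mg / (vd_l_per_kg * bw_kg)

def milk_concentration_mg_per_l(cd_mg_per_l, pm):
    """Milk concentration in instant equilibrium with the dam's plasma (assumed form)."""
    return pm * cd_mg_per_l
```

        For a 0.25-kg dam with Vd = 0.7 l/kg, a hypothetical body burden of 1.75 mg gives a plasma concentration of about 10 mg/l; the corresponding milk concentrations for the four Pm values are about 1, 10, 30, and 100 mg/l.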
   Elimination.  The elimination rate constants in the dam (Ked) and pup
(Kep) were set to the same values and treated as constant, except when
investigating the impact of developmental delays. The elimination rate
constants were chosen to give half-lives (t1/2 = 0.693/Ke) of 1 and 24 h,
representing rapid and slower elimination, respectively, in nonlactating rats.
During lactation, the half-life in the dam can differ because the total elimination
includes excretion via milk which can shorten the overall half-life. A half-life of
24 h was considered the longest reasonable to consider in the current model
structure because it does not directly include gestational exposures, which
would be expected to result in a substantial body burden carrying over into
lactation.

                                   TABLE 4
   Chemical Parameters and Simulation Conditions for the Compounds for Model Benchmarking of 2,4-D and OTA

                            Vd (l/kg)                Ka (per hour)      Ke (per hour)
                            Dam      Pup      Pm     Dam     Pup      Dam       Pup           Recirculation
OTA
  Prenatal development      0.43     0.43     0.6    2       2        0.0067    0.0067        No
  Developmental delay       0.43     0.43     0.6    2       2        0.0067    0.0067 × R    No
2,4-D
  Adult female Vd           0.18     0.18     1.1    3.9     3.9      0.33      0.33          No
  Smaller Vd                0.077    0.077    1.1    3.9     3.9      0.33      0.33          No

  Note. The parameter values for OTA and 2,4-D were adapted from previously published adult female pharmacokinetic parameters (Li et al., 1997; Timchalk,
2004). To simulate the developmental delay in elimination capacity for OTA, R implemented the function in Table 2. For 2,4-D lactational exposure simulations,
the volume of distribution was reduced to the smaller Vd reported for this chemical (Timchalk, 2004).

   Allometric  scaling  was not employed when incorporating the  elimination
constant in the model, so the half-life does not  change with age. Hence, Ked and
Kep were constant, independent  of body weight or corresponding age over the
period of simulation, except when modeling developmental delays in the pups
(Table 2). Kep was expressed as a proportion of Ked using R = Kep/Ked. When
R = 1,  the pups'  overall elimination capability was at adult levels at birth.
Alternatively,  a delayed  pattern  of  elimination  was  modeled,  with half
maximum activity reached on PND7. This development pattern was based on
critical changes in  neonatal kidney morphology and function observed within
the first postnatal week (Kavlock and Gray, 1982). However, it should be noted
that it only represents one possible scenario of changes in elimination capacity
during  rat  development,  and this formula  does not  refer to any  specific
metabolic  or renal process  (see Supplementary section  for other possible
developmental patterns).
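
   The relation between half-life and the elimination constant, and the role of R, can be sketched as follows. The conversion Ke = ln 2 / t1/2 matches the 0.693/Ke relation in the text. The maturation curve is a stand-in only: the actual function for R is given in Table 2, which is not reproduced here, so a simple saturating form that reaches half-maximum on PND7 is assumed, and its name and Hill exponent are hypothetical.

```python
import math

def ke_from_half_life(t_half_h):
    """Elimination rate constant (per hour) from half-life (h): Ke = ln(2) / t1/2."""
    return math.log(2) / t_half_h

def elimination_ratio(pnd, half_max_pnd=7.0, hill=1.0):
    """Assumed maturation curve for R = Kep/Ked.

    Rises from 0 at birth toward 1 (adult capacity) and equals 0.5 at half_max_pnd.
    This functional form is an illustration, not the function used in the paper.
    """
    return pnd ** hill / (half_max_pnd ** hill + pnd ** hill)

def pup_ke(ked_per_h, pnd, delayed):
    """Kep = R * Ked; R = 1 when elimination is assumed mature at birth."""
    r = elimination_ratio(pnd) if delayed else 1.0
    return r * ked_per_h
```

   Under these assumptions, half-lives of 1 and 24 h correspond to Ke of about 0.693 and 0.029 per hour, and a pup on PND7 in the delayed scenario eliminates at half the dam's rate.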
   Defining base cases.  It was necessary to  set a point of reference to which
other simulation results could be compared. This was indicated as a "base case".
The  pharmacokinetic  parameters  for  the  base chemicals  are  shown  in
Table 3. Two base  chemicals were defined, one for short half-life and the other
for longer half-life compounds. For the base  case simulations,  elimination
capacity in the pups was assumed to have already reached an adult level at birth
(R = 1), milk concentrations were equal to maternal plasma concentrations, and
distribution was to  total body  water. Recirculation of excreta between the dam
and pups was not included in the base case simulation. To evaluate the impact
of selected biological or chemical factors on  the extent of neonatal exposure,
these factors were varied one factor at a time  using alternative values listed in
Table 3, and the simulation results were compared to those from the base case.
Each chemical property was varied with all other properties fixed to base case
values  in  the current study to explore the  range  of model predictions.
Alternatively, one can run the  model with different sets of chemical parameters
based on the properties of actual chemicals. This approach is required for the
model evaluation by comparison with measured data  for specific compounds.
                           Model Evaluation
   The  model  performance  was  evaluated  using  previously  published
lactational transfer studies for 2,4-D and ochratoxin A (OTA) in rats. Model-
   predicted concentrations of these chemicals in the dam, pups, and milk were
   compared to published values. All the model assumptions and parameters
   described earlier (e.g., milk consumption and pup growth) were applied in the
   two benchmarking simulations, except that 2,4-D- or OTA-specific
   pharmacokinetic parameters and the reported experimental conditions replaced
   the theoretical values and study designs (Table 4).
   Simulation of OTA Exposure via Milk in Rats

      Placental and lactational  transfer  of OTA in Sprague-Dawley rats were
   measured in a cross-fostering study (Hallen et al., 1998). The model was used
   to simulate exposure of the dam to OTA during lactation and the first week after
   weaning;  the  predicted concentrations  in the dam, pups, and milk were
   compared to the published values for PNDs 14 and 21. We benchmarked the
   model to the data from one of the cross-fostering groups in which the pups were
   born from an unexposed dam and nursed by a foster mother exposed to the
   toxin throughout premating, gestation, and lactation. Therefore, it was
   necessary to consider the chemical in the dam resulting from the prelactational
   exposure. It was reported that OTA was given to the dam by gastric intubation
   at 50 μg/kg/day five times per week during five consecutive weeks including
   2 weeks of premating and 3 weeks of gestation followed by 7 days/week for the
   3 weeks of lactation (Hallen et al., 1998). Since the present model does not
   have either premating or gestational periods, the  amount of OTA still in the
   dam resulting from the 5 weeks of exposure before birth was assumed to be
   equal  to that from 5 weeks of repeated exposure for an adult female. Hence, the
   prelactational exposure was simply modeled as repeated oral gavage of OTA to
   the adult female following the  dosing regimen described above.  To simulate
   this repeated OTA exposure in the adult female, selected parameter values were
   changed, i.e., the milk yield was set to zero, and the body weight of the  dam
   was held constant at early gestational weight, 0.25 kg (Shirley,  1984). OTA
   pharmacokinetic parameters were  adapted  from  the two-compartment model
   structure of Li et al. (1997) using adult female Sprague-Dawley rats. The beta
   phase elimination  constant  and  steady-state volume  of distribution were
   adopted as  Ke and Vd for  the current one-compartment model. The rapid
   absorption  constant,  Ka =  2,  was utilized  for  OTA  simulation  based on
   observation of efficient absorption of OTA from the gastrointestinal tract after
   oral administration in F344 rats  (Zepnik et al., 2003). A developmental delay in
   elimination was also modeled (Table 4).
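
   The OTA dosing regimen described above can be laid out as a schedule of dosing days. This is a sketch under stated assumptions: the text gives five doses per week for 5 prelactational weeks and daily doses for 3 weeks of lactation, but not which days of the week were dosed, so the first five days of each week are assumed. The function names are hypothetical.

```python
PRELACTATION_WEEKS = 5      # 2 weeks premating + 3 weeks gestation
LACTATION_WEEKS = 3
DOSE_UG_PER_KG = 50.0       # gavage dose reported for OTA
PRELACTATION_BW_KG = 0.25   # dam body weight held constant before lactation

def ota_dosing_days():
    """Day offsets (0 = first dosing day) on which a gavage dose is given."""
    days = []
    for week in range(PRELACTATION_WEEKS):
        # Assumed: dosing on the first five days of each week, none on the last two.
        days.extend(week * 7 + d for d in range(5))
    lactation_start = PRELACTATION_WEEKS * 7
    days.extend(range(lactation_start, lactation_start + LACTATION_WEEKS * 7))
    return days

def prelactation_dose_ug():
    """Amount given per gavage before lactation, at the fixed 0.25-kg body weight."""
    return DOSE_UG_PER_KG * PRELACTATION_BW_KG
```

   This gives 25 prelactational and 21 lactational doses (46 in total), with 12.5 μg per dose while the dam's body weight is held at 0.25 kg.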

   Simulation of 2,4-D  Exposure via Milk in Rats
      Sturtz et al. (2006) administered 2,4-D to Wistar rat dams with eight pups per
   litter during lactation using an adjusted feeding method. 2,4-D concentrations in
   the serum  of the dam and pups  and in the milk were determined on PND16. In
   this study, the 2,4-D concentration was adjusted by comparing the extent of food
   intake to the most recent intake for the two preceding days during lactation.
   Lacking details, we approximated this study design and modeled the diet
   concentration adjustment as a weekly event using the food intake in the first week
   of lactation. The model was evaluated for predicting the dosimetry for the lowest
   exposure dose only, i.e., 15 mg/kg/day, because elimination of 2,4-D in rats has
   been shown to saturate at higher exposure levels, e.g., 50 mg/kg and above
   (Gorzinski et al., 1987). The value for Pm was 1.1 according to the findings by
   Sturtz et al. (2006) at 15 mg/kg/day. As listed in Table 4, two different
   simulations were performed to compare the model with the experimental data.
   The first simulation (adult female Vd) used the 2,4-D pharmacokinetic
   parameters derived from the nonlactating adult female rats (Griffin et al.,
   1997; Timchalk, 2004). In the second simulation (smaller Vd), the volume of
   distribution (Vd) was shifted to the lower end of the values reported for
   2,4-D (Timchalk, 2004), while keeping the elimination rate constant unchanged.

  FIG. 4.  Simulation of OTA concentration in the dam, pups, and milk. Predicted concentrations in the dam, pup, and milk are indicated as sim Cd, sim Cp, and
sim Cm, respectively. The published data points on PNDs 14 and 21 are represented with mean ± SD (Hallen et al., 1998). They are plotted in the middle of the
corresponding day of measurement. The predictions using prenatal and delayed developmental patterns of elimination are illustrated in A and B, respectively.

                     Dose Metric Calculation
   Daily dose metrics simulated include the maximum concentration of the
chemical (Cmax, mg/l), 24-h cumulative area under the curve (AUC, mg × h/l),
and daily dose (mg/kg/day). The percentage of AUCp to AUCd was also
reported as a dose metric of relative risk for the pups, as suggested in Corley
et al. (2003). All the dose metrics were calculated for three selected PNDs (4,
16, and 28) representative of early lactation, (moderately) late lactation, and
early postweaning.
   The differences between the values on PND_i and PND_{i-1} were referred to as
the daily dose metrics on the ith PND for the 24-h AUC, the relative risk, and Cmax.
The daily dose for gavage dosing was a fixed value of 15 mg/kg/day. In the
case of dietary exposure, the daily dose on PND_i is given by Equation 18.
                    $(\mathrm{FDOSE}_i - \mathrm{FDOSE}_{i-1}) / \mathrm{BW}_i$.          (18)
During lactation, the daily dose to the pups was the sum of chemical input
from milk and diet (if applicable), where the milk dose on PND_i is given by
Equation 19.
                     $(\mathrm{AML}_i - \mathrm{AML}_{i-1}) / \mathrm{BW}_i$.          (19)
This gives the daily dose to the pups as the sum of milk dose (Equation 19) and
dietary dose (Equation 18), which is applicable only for the last 5 days of
lactation (i.e., PNDs 17-21).
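
   The daily differencing behind these dose metrics can be sketched as follows. The sketch assumes that the model tracks cumulative quantities (for example FDOSE or a cumulative AUC) sampled at the end of each PND and that the daily value is the increment over the preceding day, normalized by body weight where a dose in mg/kg/day is wanted. Function names and the example numbers are hypothetical.

```python
def daily_increments(cumulative):
    """Daily values from a cumulative series: value on day i minus value on day i-1."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]

def daily_dose_mg_per_kg(cumulative_mg, bw_kg):
    """Daily dose on day i: (cumulative_i - cumulative_{i-1}) / BW_i.

    cumulative_mg[i] is the cumulative amount at the end of day i (index 0 = start);
    bw_kg[i] is the body weight on day i.
    """
    return [
        (cumulative_mg[i] - cumulative_mg[i - 1]) / bw_kg[i]
        for i in range(1, len(cumulative_mg))
    ]

def relative_risk_percent(auc_pup, auc_dam):
    """AUCp expressed as a percentage of AUCd."""
    return 100.0 * auc_pup / auc_dam
```

   For example, a hypothetical cumulative dietary dose of 0, 3, and 9 mg at the end of three successive days for a 0.25-kg animal corresponds to daily doses of 12 and 24 mg/kg/day.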
                           RESULTS

Evaluation of Model Performance
   The model performance was assessed using the published data
for 2,4-D and OTA. Figure 4  compares model simulations with
previously reported OTA data (Hallen et al., 1998). The reported
Cd and  Cm values  fell in the  predicted concentration  ranges
either simulated with  adult female pharmacokinetic parameters
(Fig. 4A)  or with  delayed  development of  pup  elimination
incorporated (Fig. 4B). The Cp was reasonably predicted in both
cases  showing 13% underestimation and 20%  overestimation
compared to the measured value on PND14, respectively.
   Table  5  reports the simulated 2,4-D concentrations  in the
dam, pups, and milk on PND 16 from two different simulations
with varying Vd.  The predicted 2,4-D levels in the dam, pups,
and milk were consistently about one-third of reported  values
with the adult female  Vd (Table 5). With a smaller Vd,  higher
values were achieved that were closer to the measured  values
(Table 5). Incorporating recirculation and delayed development
of elimination in the model made little difference compared to
the predicted 2,4-D levels with adult Vd (data not shown).
   These  two  limited  tests  indicate  that  the  model   can
reasonably approximate  pup exposures when  data  on  milk
transfer and kinetics in the nonlactating female  are available.

Predicted Dose Metrics for Theoretical Chemicals

   Exposure levels in the dam and pups were predicted and
several internal or external dose metrics were calculated as
indicated in the "Methods" section. Since it was not practical to
report all the predictions from the simulations, selected results
are highlighted below. Complete listings of predicted dose
metrics (Cmax, AUC, and daily exposure dose) as well as
figures showing all the simulated concentration profiles over
time are in the Supplementary section.

                           TABLE 5
       Benchmarking the Model to Published 2,4-D Data

                      Dam (Cd) (mg/l)   Pup (Cp) (mg/l)   Milk (Cm) (mg/l)
Sturtz et al.(a)      26.09 ± 4.24      6.34 ± 1.68       28.92 ± 3.04
Adult female Vd(b)    4.6-9.1           1.5-2.1           5.1-10
SmVd(b,c)             8.0-18            6.6-9.5           8.9-19

  (a) Mean ± SE as reported in Sturtz et al. (2006).
  (b) Predicted Cmin-Cmax values were reported for each dose metric.
  (c) The volume of distribution for 2,4-D was varied to the smaller value
reported for this chemical (Timchalk, 2004).

  FIG. 5.  Predicted concentrations and AUC values for base case compounds. The base case represents a rapidly absorbed chemical distributed to total body
water. It partitions to milk with a concentration equal to maternal plasma. Simulations are for the 3-week lactational period followed by 1-week postweaning. The
24-h AUC for dams and pups were calculated for each indicated PND.
[Plot not reproduced. Panels: (A) short half-life compound; (B) long half-life compound. Each panel shows dam and pup profiles for gavage, unadjusted feed, and adjusted feed versus PND, with AUCd and AUCp.]

Base Case Comparison of Alternative Exposure Methods,
  Short Half-life

  The concentrations in the dam and her pups predicted during
the 4-week simulation and the resulting 24-h AUC values on
selected PNDs for the short half-life base case compound are
shown in Figure 5A. In the dam, gavage dosing resulted in
                                         higher peak concentrations  compared to  the two  dietary
                                         exposures.  For dietary exposures, the  peak levels reflected
                                         the changing pattern of food intake during lactation and after
                                         weaning. When the dam was dosed  via  unadjusted feeding
                                         (i.e.,  constant ppm in diet), the peak levels were observed to
                                         increase during the first week of lactation as expected  from
                                         a rapid increase in dam's food consumption in this period, while
                                         the peaks in the second and third week of lactation increased only
                                         slightly. This behavior differed from the expectation based on
                                         food intake by the dam which increased during these latter two
                                         weeks, although not as extensively as in the first week (Fig. 2).
                                         The similarity of the peak concentrations is attributable to the
                                  BIOLOGICAL MODELING OF RAT EARLY LIFE DOSIMETRY
altered  feeding  pattern of the dam  during lactation which
results in less  fluctuation  in feeding rates during  the  day.
Although the peak concentrations were approximately constant
during the second and third weeks, internal exposure  (AUCd)
on PND16 was higher than on PND4 indicating that the dam's
increased food  intake  indeed resulted  in  greater exposure
(Fig. 5A). If the chemical  concentration in  diet was adjusted
during lactation, peak values were lower than those from the
unadjusted  dosing regimen. As intended for the adjusted diet
protocol, the internal exposure was maintained at approximately
constant levels  during lactation  as implied  by the similar
predicted AUCd for PNDs 4 and 16 (Fig. 5A). Since the
adjustment was simulated only for the lactational period, and
thus the chemical concentration in the diet was the same for adjusted and
unadjusted exposures after weaning, the concentration profiles and AUCd of
the dam for the postweaning period were identical for the two
dietary dosing methods (Fig. 5A).
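
The difference between the unadjusted and adjusted dietary protocols can be sketched with simple arithmetic: the daily dose is the dietary concentration (ppm, i.e., mg chemical per kg diet) times food intake divided by body weight, and the adjusted protocol lowers the ppm as intake rises so that the dose stays at the target. The 15 mg/kg/day target is the value used for gavage in this study; the body weight and food intake figures below are hypothetical placeholders, not values from the model.

```python
def dietary_dose(ppm, food_kg_per_day, bw_kg):
    """Daily dose (mg/kg/day) from a diet containing `ppm` mg chemical per kg diet."""
    return ppm * food_kg_per_day / bw_kg

def adjusted_ppm(target_mg_per_kg_day, food_kg_per_day, bw_kg):
    """Dietary concentration (ppm) needed to deliver the target daily dose."""
    return target_mg_per_kg_day * bw_kg / food_kg_per_day

TARGET = 15.0     # mg/kg/day, the targeted dose stated for gavage
DAM_BW = 0.30     # kg, hypothetical and held constant for simplicity

# Hypothetical dam food intake (kg diet/day); it rises during lactation.
food_intake = {
    "pre-lactation": 0.020,
    "lactation week 1": 0.035,
    "lactation week 2": 0.050,
    "lactation week 3": 0.060,
}

# Unadjusted protocol: ppm is fixed at the level that meets the target
# before lactation and is never changed.
fixed_ppm = adjusted_ppm(TARGET, food_intake["pre-lactation"], DAM_BW)

for period, food in food_intake.items():
    unadjusted = dietary_dose(fixed_ppm, food, DAM_BW)
    ppm_needed = adjusted_ppm(TARGET, food, DAM_BW)
    adjusted = dietary_dose(ppm_needed, food, DAM_BW)
    print(f"{period:18s} unadjusted {unadjusted:5.1f}  "
          f"adjusted {adjusted:5.1f} mg/kg/day (diet at {ppm_needed:5.1f} ppm)")
```

With these assumed intakes, the unadjusted dose rises with food consumption while the adjusted dose stays at the target, which is the qualitative pattern described for AUCd above.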
  The concentration  in the  pups  (Cp)  during  lactation was
substantially lower than in the mother (Cd) for  all three dosing
approaches  (Fig. 5A).  Gavage dosing was predicted to have the
greatest difference,  with the  pup's peak concentrations  about
50 fold lower than the  mother's (see Supplementary tables). The
time profile for the chemical concentration  in the pups during
lactation did  not parallel  the mother's. When the  dam was
gavaged, the  pup concentration was predicted  to continuously
decrease as they grew, while the mother's concentration stayed
constant. The rate of chemical input through milk appeared not
to be fast enough to prevent chemical dilution by growth of the
pup, i.e., increasing pup volume of distribution. In simulations of
feeding exposure, the profiles of peak concentrations in mother
and pup were very different from PND17 onwards, which
was attributable to the initiation of consumption of dosed feed
by the pups (Fig. 5A). During the 4th week of simulation, the
pups and dam were either dosed directly by gavage or ate the
unadjusted  diet.  The pups'  internal  exposure  during this
postweaning  period was the same  as the  mother's exposure
when gavaged. Pups  fed chemical via  diet had higher peak
concentrations and a larger AUC than their mother's (Fig. 5A),
which was  attributable to the higher food consumption per kg
body  weight of the pups  compared to adults.  Compared to
lactational exposure, the postweaning exposure  of pups was
much higher for both gavage and feeding. The differences in pup
peak concentrations between lactation and postweaning exposures
were greater for gavage dosing, but in terms of AUC, the
extent of the lactation versus postweaning difference was
similar in gavage and feeding.
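
The growth-dilution argument made above for the gavaged dam's pups can be illustrated with a quasi-steady-state approximation: for a rapidly eliminated chemical, the amount in the pup is roughly input rate divided by the elimination rate constant, and the concentration is that amount divided by the volume of distribution, which scales with body weight. The sketch below is an assumption-laden simplification, not the authors' model; the half-life, milk-borne input rates, and pup body weights are hypothetical. Vd = 0.7 L/kg is the base case value given in the Figure 7 caption.

```python
import math

def quasi_steady_conc(input_mg_per_day, half_life_h, vd_l_per_kg, bw_kg):
    """Approximate pup concentration (mg/L) when elimination is fast
    relative to growth: C = input / (ke * Vd * BW)."""
    ke_per_day = math.log(2) / half_life_h * 24.0
    return input_mg_per_day / (ke_per_day * vd_l_per_kg * bw_kg)

HALF_LIFE_H = 2.0   # hypothetical short half-life
VD = 0.7            # L/kg, base case volume of distribution

# Hypothetical pup values: input via milk doubles while body weight triples.
early = quasi_steady_conc(input_mg_per_day=0.010, half_life_h=HALF_LIFE_H,
                          vd_l_per_kg=VD, bw_kg=0.010)
late = quasi_steady_conc(input_mg_per_day=0.020, half_life_h=HALF_LIFE_H,
                         vd_l_per_kg=VD, bw_kg=0.030)

print(f"early: {early:.3f} mg/L, late: {late:.3f} mg/L, "
      f"late/early = {late / early:.2f}")
```

Under these assumed numbers the concentration falls even though the chemical input rises, because the volume of distribution grows faster than the input.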


Base Case Comparison of Alternative Exposure Methods,
  Long Half-life

  Figure 5B  shows the predicted  concentration profiles in the
dam and pups for the  long half-life compound. The daily peak
concentrations  were  the   highest  with unadjusted  feeding
exposure, and the AUCd from this dosing  method was also
greater than with the gavage or adjusted feeding approaches in the dam.
Similar to the short half-life case, the predicted AUCd levels
indicated that the exposures to the dam via gavage and adjusted
feeding were very similar. The slight increase in
concentration in the gavaged dam after weaning was due to
cessation of chemical excretion in milk. The concentration
profiles in the pups roughly paralleled the mother's for the three
dosing methods, though they were lower and showed smaller daily
fluctuations (although this may partially be due to modeling
continuous suckling rather than episodic behavior). In the case
of long half-life chemical exposure, the concentrations in the
pups, Cp, approached the mother's levels, especially for the first
week, but were still lower. Adjusted feeding and gavage
produced very similar lactational exposure in the pups in terms
of both peak concentrations and AUCp during the later part of
lactation until the pups initiated their own feeding (Fig. 5B). The
impact on Cp of the pups' independent feeding on treated diet
was not as great as that observed for the short half-life compound
exposure. Postweaning exposure of the pups was the same as that of the
dam for gavage, but higher than the mother's for feeding exposure.
When compared to the lactational exposure, the postweaning
exposures were higher for both gavage and
feeding, but the difference between the two periods was not as
great as was the case for the short half-life compound in terms of
peak concentration and the predicted AUCp (Figs. 5A and 5B).
When comparing the lactation and postweaning exposures, the
greatest difference was caused by adjusted feeding, followed by
gavage and then unadjusted feeding. It should be noted that the
exposure to the dam determines the pups' exposure during
lactation, while the pups' own exposure determines their
exposure level after weaning. It was also noticeable that dietary
exposure produced higher peak concentrations in the postweaning
pups compared to gavage in the long half-life chemical
exposure simulations, while it resulted in lower peak concentrations
than gavage for the short half-life chemicals (Fig. 5B).


      Changes in Volume of Distribution

         Simulations  were performed  to determine  the  impact  of
      changing chemical  properties on the extent of pup exposure.
      By changing Vd and Pm, the exposures of the dam  and pups
      varied to different extents from the base case predictions, but
      some of the patterns remained similar to those for the base
      cases (see Figures in Supplementary section). Only distinctive
      features resulting from these variations are described here. The
      AUC values were utilized to compare the pups' exposure to the
      dam's and among different periods of postnatal life. The impact
      of varying  chemical properties on the dose metrics other than
      AUC including the concentration profiles over time and Cmax
      can  be found in the Supplementary section. The predicted
      AUCp and  AUCd values were compared with the volume of
      distribution (Vd) varying from the base case scenarios. Results
      from gavage dosing are shown for a short half-life compound
      as an example and  unadjusted  feeding  for  long half-life
                                                  YOON AND BARTON
[Figure 6. Panel A: short half-life, gavage. Panel B: long half-life, unadjusted feeding. Each panel plots AUC against PND (4, 16, 28) for Vd = 0.2, Vd = 0.7 (base), and Vd = 2.5.]
  FIG. 6.  Impact of varying Vd on predicted AUC values of the dam (open bars) and pups (closed bars) for two selected exposure scenarios. Other
pharmacokinetic parameters were the same as for the base case (e.g., Pm = 1).
compound (Fig. 6). The smaller the Vd, the higher the relative
level of AUCp compared to AUCd. The impact of varying Vd
on AUCp was observed for both the lactation and postweaning
periods in terms of absolute values. However, it should be
noted that the relative levels of AUCp and AUCd during
postweaning were the same regardless of Vd values, although
the absolute values were different (Fig. 6). The impact of
changing Vd on the relative AUCp to AUCd during lactation
was more pronounced for the case of the long half-life
compound. With the smallest Vd used in the model, AUCp
values were greater than AUCd during lactation by more than
twofold (Fig. 6B). Similar trends were observed for the other
two dosing methods (data in Supplementary section). Comparing
dosing methods, the relative level of AUCp to AUCd
was greater in gavage dosing simulations, though to only
a small extent, compared to the two dietary methods for all Vd
cases. Unadjusted and adjusted feeding methods resulted in
similar relative exposure levels for the pups compared to the
mother for all Vd used.
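
The postweaning observation, that absolute AUC values change with Vd while the pup-to-dam ratio does not, follows directly from linear one-compartment kinetics when both animals receive the same gavage dose per kg. The sketch below assumes that the elimination rate constant (rather than clearance) is held fixed when Vd is varied, and uses a hypothetical 2-h half-life; the three Vd values and the 15 mg/kg/day dose are taken from the text. At steady state, the AUC over one dosing interval is dose / (ke * Vd).

```python
import math

def steady_state_auc(dose_mg_per_kg, half_life_h, vd_l_per_kg):
    """AUC over one dosing interval at steady state (mg*h/L) for a linear
    one-compartment model: AUC = dose / (ke * Vd)."""
    ke = math.log(2) / half_life_h
    return dose_mg_per_kg / (ke * vd_l_per_kg)

DOSE = 15.0          # mg/kg/day by gavage, as targeted in the study
HALF_LIFE_H = 2.0    # hypothetical; assumed equal in dam and weaned pup

results = {}
for vd in (0.2, 0.7, 2.5):
    auc_dam = steady_state_auc(DOSE, HALF_LIFE_H, vd)
    auc_pup = steady_state_auc(DOSE, HALF_LIFE_H, vd)  # same dose/kg, same kinetics
    results[vd] = (auc_dam, auc_pup)
    print(f"Vd = {vd}: AUCd = {auc_dam:6.1f}, AUCp = {auc_pup:6.1f}, "
          f"AUCp/AUCd = {auc_pup / auc_dam:.2f}")
```

This sketch does not reproduce the lactational period, where the pup's input comes from milk and its body weight is changing, so the ratio there does depend on Vd as described above.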

Changes in Milk Partitioning

  Three Pm values in addition to the base case with milk
concentrations equal to maternal concentrations (Pm = 1) were
assessed: milk concentrations 10-fold lower (Pm = 0.1),
threefold higher (Pm = 3), and 10-fold higher (Pm = 10) than
the mother's plasma concentrations. Predicted AUC values using
gavage dosing for the short half-life compounds and adjusted
feeding for the long half-life compounds are illustrated in Figure 7,
while predictions for the other two dosing regimens are listed in
the Supplementary section. The impact of varying Pm on the
pup exposure was observed mostly during lactation, as shown
by increased (or decreased) AUCp values on PNDs 4
and 16 with higher (or lower) Pm, while PND28 values were
unaffected as this is 7 days after exposure to milk ceased. For
short half-life compounds, increasing Pm resulted in less
difference in internal exposures for dams (AUCd) and pups
(AUCp) (Fig. 7A). The AUCd values decreased with increasing
Pm, which also contributed to reducing the difference between
mother and pups, though the extent of this decrease was smaller
than the corresponding increase in AUCp. Long half-life
compounds exhibited pup exposures ranging from much lower
than maternal to much greater than maternal with increasing
Pm, which substantially contributed to the decline in the
maternal exposure with increasing milk transfer (Fig. 7B). The
other two dosing methods produced similar trends (see
Supplementary section). When comparing dosing strategies,
[Figure 7. Panel A: short half-life, gavage. Panel B: long half-life, adjusted feeding. Each panel plots AUC on PNDs 4 and 16 for Pm = 0.1, Pm = 1 (base), Pm = 3, and Pm = 10, with a single PND28 group for all Pm values.]
  FIG. 7.  Impact of varying Pm on predicted AUC values of the dam (open bars) and pups (closed bars) for two selected exposure scenarios. Other
pharmacokinetic parameters were the same as the base case (e.g., Vd = Vp = 0.7).
[Figure 8. Panel A: short half-life compound. Panel B: long half-life compound. Each panel plots pup concentration against PND for prenatal versus delayed development of elimination, with a bar chart of AUCp as % of base on PNDs 4, 16, and 28.]
  FIG. 8.  Predicted concentration of the pups with delayed development of elimination. These plots present the results using unadjusted feeding. AUCp is shown
as % of AUCp from the corresponding half-life base case simulation on each selected PND. The base case assumes prenatal development of elimination to adult
capacity.
the relative level of AUCp to AUCd was greater in gavage
dosing simulations, although to a small extent, compared to the
two dietary methods, for all varied Pm cases. The relative level
of AUCp to AUCd was almost the same for unadjusted feeding
and adjusted feeding methods for each simulation case (see
Supplementary section).
Developmental Delay in Elimination

  In order to compare the base case with a different pattern for
postnatal  development of elimination capacity, one possible
scenario of a developmental delay was simulated (DevKe in
Supplementary  section).  The  delay  was  described  with
a Michaelis-Menten profile (i.e., a rectangular hyperbola) with
half maximal capacity on PND7. Figure 8 shows the
concentration profiles over time in pups with delayed
development of elimination compared with those born with adult
capacity (base case) for unadjusted feeding exposures. The
delay in elimination capacity in the  pups resulted in higher
concentrations  and AUCp  compared to the base  case.  The
greatest impact was  predicted to occur during the first week
with a two- to threefold increase in AUCp for the short half-life
chemical (Fig. 8A) and a smaller increase for the long half-life
chemical (Fig. 8B). The impact was reduced as the elimination
capacity approached adult levels  for both half-life cases. The
impact on peak concentrations appeared to be more persistent.
Other dosing approaches showed similar trends in terms of the
timing of the maximum effect on AUC, the profiles of the
impact over time, and greater impacts in the short half-life case,
though the extent of the maximum effect varied according to the
dosing method and half-life of the chemical.
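
The delay scenario described above can be sketched as a rectangular hyperbola in postnatal age. The functional form below, fraction of adult capacity = PND / (7 + PND), is an assumption consistent with "half maximal capacity on PND7"; the exact equation used for DevKe is given only in the Supplementary section and may differ. The fold-increase estimate further assumes that, for a short half-life chemical, AUC is approximately inversely proportional to the elimination rate constant at each age.

```python
def elimination_fraction(pnd, pnd_half=7.0):
    """Fraction of adult elimination capacity at postnatal day `pnd`,
    assuming a Michaelis-Menten-type (rectangular hyperbola) maturation
    profile with half-maximal capacity at `pnd_half`."""
    return pnd / (pnd_half + pnd)

def approx_auc_fold_increase(pnd, pnd_half=7.0):
    """Rough AUCp increase relative to a pup born with adult capacity,
    assuming AUC scales as 1/ke (quasi-steady state, short half-life)."""
    return 1.0 / elimination_fraction(pnd, pnd_half)

for pnd in (4, 7, 16, 28):
    frac = elimination_fraction(pnd)
    fold = approx_auc_fold_increase(pnd)
    print(f"PND{pnd:>2}: {frac:.2f} of adult capacity, "
          f"approx. {fold:.2f}-fold AUCp vs base case")
```

Under these assumptions the PND4 estimate is 2.75-fold, which falls within the two- to threefold increase reported for the short half-life chemical during the first week; the long half-life case would not follow this simple inverse scaling.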

                                         Role of Excreta Recirculation

                                           Concentrations in  the pups with  or without recirculation
                                         were  compared  using  predictions  from  adjusted  feeding
                                         simulations  in Figure 9. Excreta  recirculation between the
                                         dam and pups had a small impact,  with a maximum effect of
                                         a 10-30%  increase in AUCp  for the long half-life case only
                                         (Fig.  9B).  Cp was  higher than in the base case during the
                                         second week for the long half-life chemical simulation. The
                                         impact of recirculation on concentrations and thus AUCp for
                                         the short half-life chemical was negligible (Fig. 9A). Results
                                         for other exposure methods were similar (see Supplementary
                                         section).

                                         Predicted Pups' Daily Dose

                                           Daily doses  to the pup (mg/kg/day) were calculated using
                                         the model and compared to the daily  dose administered to the
                                         dam for three selected PNDs (Fig. 10). Daily doses to the pup
                                         (mg/kg/day)  from  six  different  cases   were  graphed  and
                                         compared to the predicted dam's  daily doses, with  gavage
[Figure 9. Panel A: short half-life compound. Panel B: long half-life compound. Each panel plots pup concentration against PND for the base case versus recirculation, with a bar chart of AUCp as % of base on PNDs 4, 16, and 28.]
  FIG. 9.  Predicted concentration in pups with recirculation of excreta simulated. Simulation results using adjusted feeding are presented. AUCp is shown as %
of AUCp from the corresponding half-life base case simulation on each selected PND.
providing the targeted 15 mg/kg/day. Overall, the amount of
chemical delivered per kilogram body weight of pups per day
was smaller than that of the dam during lactation for short
half-life compounds for all simulation cases (Fig. 10A). Differences
between the dam's and the pups' doses during lactation were
smaller with a smaller Vd or a higher Pm for short half-life
chemicals (Fig. 10A). For long half-life compounds, pups
received doses comparable to the dam's. The daily doses to the
pups were predicted to be higher than the dam's on PND4
when Vd was smaller or Pm was greater than 1 during the first
week of lactation, while in the later period of lactation these
cases were predicted to have levels similar to the dam's
(Fig. 10B). After weaning, gavage dosing was predicted to
provide the same dose to the dam and pups, while feeding was
predicted to result in a higher chemical dose to the pups
(Figs. 10A and 10B).
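
A back-of-the-envelope version of the pup's lactational daily dose can be written as milk concentration times milk intake divided by pup body weight, with milk concentration taken as Pm times the maternal plasma concentration. The sketch below is a simplified illustration, not the authors' model: it uses the dam's 24-h mean concentration (AUCd / 24), and the maternal AUC, milk intake, and pup body weight are hypothetical. The four Pm values are those examined in the text.

```python
def pup_milk_dose(pm, dam_auc24_mg_h_per_l, milk_l_per_day, pup_bw_kg):
    """Approximate pup daily dose via milk (mg/kg/day).

    Milk concentration is taken as Pm times the dam's mean plasma
    concentration over 24 h (AUC / 24)."""
    mean_milk_conc = pm * dam_auc24_mg_h_per_l / 24.0   # mg/L
    return mean_milk_conc * milk_l_per_day / pup_bw_kg

# Hypothetical inputs (not from the paper).
DAM_AUC24 = 60.0      # mg*h/L
MILK_INTAKE = 0.003   # L/day per pup
PUP_BW = 0.012        # kg

doses = {pm: pup_milk_dose(pm, DAM_AUC24, MILK_INTAKE, PUP_BW)
         for pm in (0.1, 1.0, 3.0, 10.0)}
for pm, dose in doses.items():
    print(f"Pm = {pm:>4}: pup dose = {dose:.3f} mg/kg/day")
```

The sketch holds the maternal AUC fixed across Pm values. In the full model, AUCd decreases as Pm increases, so the true increase in pup dose with Pm would be somewhat less than proportional.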
                       DISCUSSION

  The purpose of this study was to answer the question: to how
much chemical are pups exposed during lactation and early
postweaning? Although pharmacokinetic information on lactational
transfer and dosimetry in developing rat pups has been
noted as a critical data need in extrapolating early life toxicity
studies to humans, only limited information is available
(Barton et al., 2006). Pharmacokinetic studies involving early
life stages are technically challenging and approaches are less
standardized than for adults, even for pharmaceuticals. The
current model demonstrated that it could provide insights into
what dosimetry would be during lactation and early postweaning
periods in dams and pups. These insights can be
applied to pharmacokinetic and toxicity study design,
interpretation of toxicity study results, and risk assessment
applications.
     The present model for rats was  based on  known changes in
   biology  and exposure  characteristics  during early  postnatal
   periods together with chemical  kinetics adapted from adults.
   Model evaluation against limited published  lactational transfer
   data for OTA and 2,4-D indicated that reasonable predictions
   could be made using this information whether the chemical was
   given via diet (2,4-D)  or  by gavage  (OTA). While two-  or
   threefold discrepancies  were observed by comparison with
   some  data,  this  was  considered  acceptable  for  obtaining
   plausible initial  estimates for a range of chemicals.
     Model predictions were  obtained  to  derive  insights  for
   a range of considerations important  in one-generation toxicity
   settings,  such  as characterizing  exposures during  lactation,
   comparison of pup exposures to the mother's,  comparison of
   exposures via  milk  to  direct  exposure  after weaning, and
   evaluation of alternative dosing  approaches. The present study
   clearly  showed that exposure of the pups frequently does not
[Figure 10. Panel A: short half-life compound. Panel B: long half-life compound. Each panel has three bar charts (gavage, unadjusted feed, adjusted feed) showing daily dose on PND4, PND16, and PND28. Legend: Dam, Pup-Base, Pup-SmVd, Pup-LgVd, Pup-LowPm, Pup-MidPm, Pup-HighPm, Pup-RCC, Pup-All cases.]
  FIG. 10.  Predicted daily exposure dose (mg/kg/day) of the test chemical to the pups from milk or direct dosing. For each dosing method, seven different
exposure scenarios were simulated, and the resulting pup daily doses were compared to the mother's dose for three selected days. For PND28, the daily doses to the
pups are the same for all cases, so only one column was plotted for each exposure scenario.
parallel maternal  exposure.  However,  it  was  possible to
delineate  several general features  of the  neonatal exposure
pattern. The extent of lactational exposure of the pups to short
half-life compounds was generally lower than that of the dam,
while it was often comparable or even  higher than maternal
levels for  longer half-life compounds. Factors such  as a lower
concentration in milk than in maternal blood and a volume of
distribution  larger than total body water tended to result in
lower  pup  exposures compared  to  dam and  consequently
greater differences between maternal and neonatal exposures
for both half-life compounds. Less difference was predicted
between the maternal and neonatal exposures when the milk
concentrations were greater than maternal blood levels, and the
volume of distribution was smaller than total body water level.
With these characteristics, the pup exposure level could exceed
that of the dam for long half-life compounds. It should also
be noted that even when the test compound's half-life is
short, the pup exposure level could be comparable to the
mother's if there are factors that tend to elevate exposure, e.g.,
a smaller volume of distribution, a delay in development of
elimination, and recirculation. Pup exposures comparable to
the mother's for a short half-life compound would be more
likely with a dietary exposure regimen than with gavage
because gavage tended to result in more pronounced differences
between maternal and neonatal exposure. Recirculation
of excreta between the dam and the pups can contribute to pup
exposure for longer half-life compounds when elimination is
largely by renal clearance rather than by metabolism, by raising
maternal exposure. Recirculation would be a particularly
important factor to consider in cross-fostering studies where
the pups would expose the "nonexposed" foster mother.
                                           Predictions from the  present model can be used  as initial
                                          estimates for better designing and interpreting toxicity studies.
                                          Current findings suggest that there are some cases  for which
one should think about alternative exposure methods other than
milk to  achieve  exposure to the pups. Particular  attention
would  be needed when the pup exposure was predicted to be
extremely low, e.g., less than 1% of maternal exposure, which
was observed when the milk concentration was much less than
the mother's  blood concentration  or when the  volume of
distribution was large for a short half-life compound. In these
cases, one  would need to be careful determining the windows
of susceptibility or interpreting the potency  of the chemical. If
pup  exposure was  estimated from maternal levels  for  such
a chemical, then  the period of lactation could be interpreted
incorrectly as not being a susceptible period  when instead there
was  simply a lack of adequate exposure in the pups and the
lactational  period had effectively not been evaluated.  If this
were the case, an  alternative study design such as direct dosing
of the pups would need to be considered (Moser et al., 2005).
Conversely, if the chemical exhibited toxicity with this very
low level of milk transfer, then it can be suspected of being a very
potent compound. Model predictions also could help in
amending  the study design by  providing  information as to
whether  toxicities seen at certain doses may be related to
excessively high  pup exposure. During lactation, a notable
example of this case is high milk transfer of a chemical with
a long  half-life.
  The  model predictions could be informative  about whether
abrupt  changes in exposure levels after weaning, due to the
initiation of direct dosing  of  pups,  result  in  toxicity. For
example, predictions from the current modeling  showed that
postweaning exposure  was  much  higher  than lactational
exposure when gavaging short half-life chemicals. This abrupt
increase in concentration in  the pup could produce premature
death or other toxicities in the neonates right  after weaning,
which could be misinterpreted as a critical window for the test
chemical. Long half-life chemicals  tended  to show a  greater
similarity between  lactational  and  postweaning exposures,
though not in every case (e.g., low Pm). In the postweaning
period, the pup exposure is predicted to be even higher than
adult exposure under dietary exposure conditions due to the
higher feeding rate of pups on a body weight basis compared to
adults, while it is the same when gavage dosing is used.
Collectively, initial estimates of pup exposure from the current
model  could  provide valuable  information to  understand
a chemical's effect based upon  limited information  on adult
pharmacokinetics and  milk  transfer  to facilitate design of
follow-up pharmacokinetic or toxicity studies.
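The body-weight scaling behind the dietary comparison above can be illustrated with a short sketch. This is not the authors' model; it only converts a diet concentration into a daily dose per unit body weight. The function name and all numeric values (diet concentration, food intake, body weights) are hypothetical placeholders chosen to show the direction of the effect.

```python
def dietary_dose_mg_per_kg(diet_ppm: float, food_g_per_day: float, body_weight_g: float) -> float:
    """Daily dose (mg/kg body weight/day) from feed containing diet_ppm mg chemical per kg feed.

    The ratio food_g_per_day / body_weight_g is dimensionless, so the result
    carries the units of diet_ppm re-expressed per kg body weight.
    """
    return diet_ppm * food_g_per_day / body_weight_g


# Hypothetical values: a weanling pup eats more feed per gram of body weight than an adult.
adult_dose = dietary_dose_mg_per_kg(diet_ppm=100.0, food_g_per_day=20.0, body_weight_g=300.0)
pup_dose = dietary_dose_mg_per_kg(diet_ppm=100.0, food_g_per_day=8.0, body_weight_g=60.0)

# Under gavage the administered dose is fixed per kg body weight by design,
# so pup and adult receive the same value.
gavage_dose = 10.0  # mg/kg/day, hypothetical

print(f"dietary: adult {adult_dose:.2f}, pup {pup_dose:.2f} mg/kg/day")
print(f"gavage:  adult {gavage_dose:.2f}, pup {gavage_dose:.2f} mg/kg/day")
```

With the same diet, the pup's higher feed intake per unit body weight yields the higher dose, whereas gavage removes that difference.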
  Although different dosing methods can lead to very different
pharmacokinetic  outcomes (Saghir  et al.,  2006), there have
been few direct studies  of the potential differences when the
dam is given  a chemical via gavage  versus  dietary  adminis-
tration.  Arnold et al. (2000)  reported  that similar doses of
hexachlorobenzene or Aroclor 1254 administered by two
different dosing methods, gavage or feeding, did not result in
similar exposures in suckling neonates. The  marked increase in
maternal food consumption during lactation was suggested to
   account for the difference, which was  also supported by the
   current model predictions. The administration of chemicals at
   different dose rates, as exemplified by  gavage versus dietary
   administration, could result in differences in the developmental
   outcomes, particularly those dependent upon peak concentration.
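The dependence of peak concentration on dose rate can be sketched with the standard one-compartment, first-order elimination equations. The sketch below is an illustration under simplifying assumptions that are not taken from the paper: gavage is treated as an instantaneous once-daily bolus, dietary intake as a constant-rate input spread over the day, and the dose, volume of distribution, and half-lives are hypothetical.

```python
import math


def elimination_rate(half_life_h: float) -> float:
    """First-order elimination rate constant (1/h) from half-life (h)."""
    return math.log(2.0) / half_life_h


def peak_daily_bolus(dose_mg_per_kg: float, vd_l_per_kg: float,
                     half_life_h: float, interval_h: float = 24.0) -> float:
    """Steady-state peak concentration (mg/L) for a repeated instantaneous bolus."""
    k = elimination_rate(half_life_h)
    return (dose_mg_per_kg / vd_l_per_kg) / (1.0 - math.exp(-k * interval_h))


def level_constant_intake(dose_mg_per_kg: float, vd_l_per_kg: float,
                          half_life_h: float, interval_h: float = 24.0) -> float:
    """Steady-state concentration (mg/L) when the same daily dose enters at a constant rate."""
    k = elimination_rate(half_life_h)
    return (dose_mg_per_kg / interval_h) / (k * vd_l_per_kg)


def peak_ratio(half_life_h: float, dose: float = 10.0, vd: float = 0.7) -> float:
    """Bolus peak divided by constant-intake level for the same daily dose."""
    return (peak_daily_bolus(dose, vd, half_life_h)
            / level_constant_intake(dose, vd, half_life_h))


short_ratio = peak_ratio(half_life_h=2.0)    # short half-life: bolus peak far exceeds dietary level
long_ratio = peak_ratio(half_life_h=200.0)   # long half-life: the two dosing patterns converge

print(f"peak ratio, short half-life: {short_ratio:.2f}")
print(f"peak ratio, long half-life:  {long_ratio:.2f}")
```

Under these assumptions the ratio is well above 1 for a short half-life and close to 1 for a long half-life, which is consistent with the expectation that peak-dependent outcomes are most sensitive to dosing method for rapidly eliminated chemicals.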
     The  amount and pattern of  lactational  exposure  from
   adjusted dietary dosing for longer half-life compounds  were
   predicted to be more similar to gavage than unadjusted diet as
   shown  by  similar internal exposure levels and peak concen-
   trations before the onset of pup's independent feeding on solid
   diet.  Unadjusted  feeding  generally  produced higher pup
   exposure than  these two approaches.  These predictions will
   help not only  to  choose an appropriate  dosing approach to
   achieve a pup exposure level similar to its mother's but also to
   provide insight into what to  expect  when  different dosing
   methods are applied. Dietary exposure is often considered to be
   the most relevant dosing approach for life-stage testing, unless
   there is a  technical problem, so we sought  to determine for
   what chemicals this dosing method would be most relevant.
   When the  chemical is rather slowly eliminated and the milk
   transfer is moderate (i.e., the Pm being between 1 and 3) and/or
   the volume of distribution is slightly smaller than the total body
   water, then one can expect that the dam and pups would be
   exposed at similar levels via feeding. Again, it should be taken
   into consideration that the unadjusted diet exposure in a number
   of cases for long half-life  compounds will lead to higher
   exposures  than expected  from the predetermined target dose
   due to the  increased maternal food intake during lactation.
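The difference between unadjusted and adjusted dietary dosing can be made concrete with a small calculation. This is a sketch only: the function names are hypothetical, and the food intake and body weight values are placeholders rather than data from the paper. It assumes the diet concentration is set once before lactation (unadjusted) or recalculated from current food intake and body weight (adjusted).

```python
def achieved_dose(diet_ppm: float, food_g_per_day: float, body_weight_g: float) -> float:
    """Achieved dose (mg/kg body weight/day) for a diet at diet_ppm (mg/kg feed)."""
    return diet_ppm * food_g_per_day / body_weight_g


def adjusted_diet_ppm(target_dose: float, food_g_per_day: float, body_weight_g: float) -> float:
    """Diet concentration (mg/kg feed) needed to hit target_dose (mg/kg body weight/day)."""
    return target_dose * body_weight_g / food_g_per_day


target = 10.0  # mg/kg/day, hypothetical target dose

# Diet concentration fixed from pre-lactation intake (hypothetical: 20 g/day, 300 g dam).
fixed_ppm = adjusted_diet_ppm(target, food_g_per_day=20.0, body_weight_g=300.0)

# During lactation food intake rises sharply (hypothetical: 60 g/day, 320 g dam).
unadjusted_dose = achieved_dose(fixed_ppm, food_g_per_day=60.0, body_weight_g=320.0)
lactation_ppm = adjusted_diet_ppm(target, food_g_per_day=60.0, body_weight_g=320.0)
adjusted_dose = achieved_dose(lactation_ppm, food_g_per_day=60.0, body_weight_g=320.0)

print(f"unadjusted diet: {unadjusted_dose:.1f} mg/kg/day (target {target:.1f})")
print(f"adjusted diet:   {adjusted_dose:.1f} mg/kg/day at {lactation_ppm:.1f} ppm")
```

With these placeholder values the unadjusted diet delivers well over the target dose during lactation, while lowering the diet concentration restores it.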
     Current  modeling highlighted some  critical data needs in
   predicting postnatal dosimetry; time-dependent changes in milk
   partitioning are one example.  The milk  concentration was
   assumed to parallel maternal  plasma  levels  by a  constant
   proportionality, the milk to plasma ratio, throughout lactation.
   Known changes in milk during lactation may call into question
   this assumption for at least some classes of chemical (Luckey
   et al., 1954). This would matter especially for chemicals whose
   partition to milk is highly affected by milk composition.  Such
   chemicals  include  highly  fat-soluble  compounds  due  to
   significantly higher fat content in milk the first few days after
   birth and its rapid decline thereafter. Ingestion of highly
   lipophilic compounds  such as hexachlorobenzene and poly-
   chlorinated biphenyls  by suckling pups was observed to  be
   elevated shortly after birth correlated with high milk fat content
   (Arnold et al., 2000). Secretion of chemicals  that bind to milk
   proteins may also be affected by such time-dependent changes
   in milk composition. Thus, more accurate estimation of milk to
   blood   ratios  for  test  compounds  would  be obtained by
   measuring  them experimentally throughout  lactation. It also
   would be important to determine milk composition throughout
   lactation because  currently available  information  is  very
   limited. With additional data, methods could be developed to
   predict passive distribution  of chemicals based upon  milk
   composition  and  the physical chemical properties  of the
   chemical.
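The constant-proportionality assumption discussed above can be written as milk concentration = Pm x maternal plasma concentration, with pup dose following from milk intake and pup body weight. The sketch below is illustrative only; the function name and every numeric value are hypothetical, and the elevated early-lactation Pm is a made-up value used to show how a time-dependent milk to plasma ratio would change the estimate.

```python
def pup_dose_from_milk(maternal_plasma_mg_per_l: float, pm: float,
                       milk_intake_ml_per_day: float, pup_weight_g: float) -> float:
    """Pup dose (mg/kg/day) from milk, assuming milk concentration = pm * maternal plasma."""
    milk_conc_mg_per_l = pm * maternal_plasma_mg_per_l
    milk_intake_l = milk_intake_ml_per_day / 1000.0
    pup_weight_kg = pup_weight_g / 1000.0
    return milk_conc_mg_per_l * milk_intake_l / pup_weight_kg


# Constant Pm throughout lactation (the simplifying assumption).
dose_constant_pm = pup_dose_from_milk(
    maternal_plasma_mg_per_l=2.0, pm=3.0, milk_intake_ml_per_day=5.0, pup_weight_g=20.0)

# Hypothetical higher Pm shortly after birth, e.g. for a lipophilic chemical
# when milk fat content is elevated; the value 4.5 is a placeholder.
dose_early_pm = pup_dose_from_milk(
    maternal_plasma_mg_per_l=2.0, pm=4.5, milk_intake_ml_per_day=5.0, pup_weight_g=20.0)

print(f"constant Pm: {dose_constant_pm:.2f} mg/kg/day")
print(f"early-lactation Pm: {dose_early_pm:.2f} mg/kg/day")
```

Because the estimated pup dose scales linearly with Pm, any error in the assumed ratio carries through proportionally, which is why measured values across lactation would tighten the predictions.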
  While the current model has value as  a tool to make initial
predictions,  it is  also  plausible that adult pharmacokinetics
during lactation would  differ from those in nonlactating adults
or that there were major age-dependent changes in the pups that
could impact a specific chemical.  There are changes  in  fluid
handling during lactation in the dam, so distribution character-
istics or  renal  clearance of  a chemical  could be affected
although rats do not show decreased circulating albumin levels
until the final days of pregnancy, in contrast to humans (Stock
et al., 1980). Developmental pharmacokinetic changes in pups
can be extensive during the  first weeks of life as exemplified by
changes in  enzymes and transporters (Li et al., 2002; Yoon
et al., 2006). The developmental pattern of elimination capacity
used here is one possible example of a developmental delay in
elimination capacity, based on critical changes during neonatal
kidney maturation (Kavlock and Gray, 1982). If one knew
important pharmacokinetic  determinants in  the adult,  e.g.,
metabolism by a specific enzyme, binding to a specific protein,
or  movement via  a transporter,  this  could potentially be
incorporated into the current model (see Supplementary
section for examples).  As the  complexity and detail increase,
it would be necessary to shift to using a physiologically based
pharmacokinetic model in place of the one-compartment model
used here. Collecting data to evaluate the  model predictions for
any specific compound would be essential.
  It will be  necessary to  modify the model to  expand its
general uses as well. Examples would include accounting for
metabolism or transport and related saturation kinetics as well
as lactational transfer of metabolites, modeling very long half-
life  compounds,  and  modeling  premating  and  pregnancy
periods. The activity of saturable active transport systems can
be important for the pharmacokinetics of pharmaceutical  and
environmental chemicals, including for  distribution to  milk.
For example, high lipophilicity is not the only factor that can
lead to high milk concentrations. The milk concentration of the
lipophilic carboxylic acid herbicide 2,4-D was  shown to be
similar to maternal  blood (Sturtz et al.,  2006),  while less
lipophilic (or more water  soluble) compounds  like antiviral
drugs  have  shown several fold  higher milk concentrations
compared to  mother's  blood  due to active  transport in  rats
(Alcorn  and  McNamara, 2002).  Early  life  toxicity testing
usually involves the dam's  exposure to the test chemical prior
to impregnation and continued exposure during gestation  and
lactation.  As a result, the neonate (and the fetus) is  exposed to
the  test chemical  and  its  metabolites prenatally and during
lactation. Hence, simulation of the gestational carry over would
be critical in predicting  lactational exposure accurately for very
long half-life compounds.
  We have demonstrated  a  modeling  approach  to predict
maternal and pup external and internal exposures by combining
biological information in  the literature  on  body  weight
changes, milk production,  and food  consumption  with adult
nonpregnant rat pharmacokinetics  and  information on  milk
distribution. The resulting predictions can be used to (1) design
      pharmacokinetic studies to evaluate the predictions, (2) evaluate
      early life toxicity study design choices (e.g., exposure method),
      and (3) develop hypotheses to interpret toxicity study findings.
      Extending this  approach by  incorporating  more pharmacoki-
      netic determinants in the model for distribution to milk and
      early postnatal  pharmacokinetics  would  improve  predictions
      for  specific  chemicals  as  well as  facilitate  development of
      further generalizations concerning the extent of pup exposures
      for chemicals with different pharmacokinetic characteristics.
      The current modeling analyses predicted substantial differences
      between maternal and  pup  external  and  internal  exposures
      indicating that risk assessment approaches based upon maternal
      exposure doses  are of limited utility when considering  early
      childhood exposures.
                       SUPPLEMENTARY DATA

         The Supplementary section is available online at
      http://toxsci.oxfordjournals.org/.
                               FUNDING

         National Research Council Research Associateship Award at
      the US Environmental Protection Agency to M.Y. (EPA-NRC
      # CR82879001).
                        ACKNOWLEDGMENTS

         The authors acknowledge Dr. Suzanne Fenton for her review and comments.



                             REFERENCES

      Alcorn, J., and McNamara, P. J. (2002). Acyclovir, ganciclovir, and zidovudine
        transfer into rat milk. Antimicrob. Agents Chemother. 46, 1831-1836.
      Arnold, D. L., Bryce, F. R., Clegg, D. J., Cherry, W., Tanner, J. R., and
        Hayward, S. (2000). Dosing via gavage or diet for reproduction studies:
        A pilot  study using two fat-soluble compounds-hexachlorobenzene and
        aroclor 1254. Food Chem. Toxicol. 38, 697-706.
      Barton, H. A. (2005). Computational pharmacokinetics during developmental
        windows of susceptibility. J. Toxicol. Environ. Health A 68, 889-900.
      Barton, H. A., Pastoor, T. P., Baetcke, K., Chambers, J. E., Diliberto, J.,
        Doerrer, N. G., Driver, J. H., Hastings, C. E., Iyengar, S., Krieger, R., et al.
        (2006).  The  acquisition and  application  of  absorption,  distribution,
        metabolism, and excretion (ADME) data in  agricultural chemical  safety
        assessments. Crit. Rev.  Toxicol. 36, 9-35.
      Bornschein, R. L., Fox, D. A., and Michaelson, I. A. (1977). Estimation of
        daily exposure in neonatal rats receiving lead via dam's milk. Toxicol. Appl.
        Pharmacol. 40, 577-587.
      Byczkowski,  J.  Z., Kinkead,  E. R., Leahy, H. F., Randall,  G. M., and
        Fisher, J. W.  (1994). Computer simulation of the lactational transfer of
        tetrachloroethylene  in rats using  a physiologically based model. Toxicol.
        Appl. Pharmacol. 125, 228-236.
      Corley, R. A., Mast, T. J., Carney, E. W., Rogers, J. M., and Daston, G. P.
        (2003). Evaluation of  physiologically based models of pregnancy and
  lactation for their application in children's health risk assessments. Crit. Rev.
  Toxicol. 33, 137-211.
Doerflinger,  A., and Swithers, S. E. (2004). Effects of diet and handling on
  initiation of independent ingestion in rats. Dev. Psychobiol. 45, 72-82.
Godbole, V. Y., Grundleger, M. L., Pasquine, T. A., and Thenen, S. W. (1981).
  Composition of rat milk from day 5 to 20 of lactation and milk intake of lean
  and preobese Zucker pups. J. Nutr. 111, 480-487.
Gorzinski, S. J., Kociba, R. J., Campbell, R. A., Smith, F. A., Nolan, R. J., and
  Eisenbrandt, D. L. (1987). Acute, pharmacokinetic, and subchronic toxicological
  studies of 2,4-dichlorophenoxyacetic acid. Fundam. Appl. Toxicol. 9, 423-435.
Griffin,  R. J., Godfrey,  V.  B., Kim, Y. C.,  and Burka, L.  T. (1997). Sex-
  dependent differences in the disposition of 2,4-dichlorophenoxyacetic acid in
  Sprague-Dawley rats, B6C3F1 mice,  and  Syrian hamsters. Drug Metab.
  Dispos. 25, 1065-1071.
Hallen, I. P., Breitholtz-Emanuelsson, A., Hult, K., Olsen, M., and
  Oskarsson, A. (1998). Placental and lactational transfer of ochratoxin A in
  rats. Nat. Toxins 6, 43-49.
Hanley, T. R., Jr, Breslin, W. J., Quast, J. F., and Carney, E. W. (2002).
  Evaluation of spinosad in a two-generation dietary reproduction study using
  Sprague-Dawley rats. Toxicol. Sci. 67, 144-152.
Hanley, T. R., Jr, and Watanabe, P. G. (1985). Measurement of solid feed
  consumption patterns in neonatal rats by 141Ce-radiolabeled microspheres.
  Toxicol. Appl. Pharmacol. 77, 496-500.
Henning, S. J. (1981). Postnatal development: Coordination of feeding,
  digestion, and metabolism. Am. J. Physiol. 241, G199-G214.
Hinderliter, P. M., Mylchreest, E., Gannon, S. A., Butenhoff, J. L., and
  Kennedy, G. L., Jr (2005). Perfluorooctanoate: Placental and lactational
  transport pharmacokinetics in rats. Toxicology 211, 139-148.
Howdeshell, K. L., Furr, J., Lambright, C. R., Rider, C. V., Wilson, V. S., and
  Gray, L. E., Jr (2007). Cumulative effects of dibutyl phthalate and
  diethylhexyl phthalate on male rat reproductive tract development: Altered
  fetal steroid hormones and genes. Toxicol. Sci. 99, 190-202.
Kavlock, R. J., and Gray, J. A. (1982). Evaluation of renal function in neonatal
  rats. Biol. Neonate 41, 279-288.
Knight, C. H., Docherty, A. H., and Peaker, M. (1984). Milk yield in rats in
  relation to activity and size of the mammary secretory cell population.
  J. Dairy Res. 51, 29-35.
Li, N., Hartley, D. P., Cherrington, N. J., and Klaassen, C. D. (2002). Tissue
  expression, ontogeny, and inducibility of rat organic anion transporting
  polypeptide 4. J. Pharmacol. Exp. Ther. 301, 551-560.
Li, S., Marquardt, R. R., Frohlich, A. A., Vitti, T. G., and Crow, G. (1997).
  Pharmacokinetics of ochratoxin A and its metabolites in rats. Toxicol. Appl.
  Pharmacol. 145, 82-90.
   Luckey, T. D., Mende, T. J., and Pleasants, J. (1954). The physical and
     chemical characterization of rat's milk. J. Nutr. 54, 345-359.
   McNamara, P. J., Meece, J. A., and Paxton, E. (1996). Active transport of
     cimetidine and ranitidine into the milk of Sprague Dawley rats.
     J. Pharmacol. Exp. Ther. 277, 1615-1621.
   Moser, V. C., Walls, I., and Zoetis, T. (2005). Direct dosing of preweaning
     rodents in toxicity testing and research: Deliberations of an ILSI RSI Expert
     Working Group. Int. J. Toxicol. 24, 87-94.
   Rayner, J. L., Enoch, R. R., Wolf, D. C., and Fenton, S. E. (2007).
     Atrazine-induced reproductive tract alterations after transplacental and/or
     lactational exposure in male Long-Evans rats. Toxicol. Appl. Pharmacol. 218, 238-248.
   Redman, R. S., and Sweney, L. R. (1976). Changes in diet and patterns of
     feeding activity of developing rats. J. Nutr. 106, 615-626.
   Saghir, S. A., Mendrala, A. L., Bartels, M. J., Day, S. J., Hansen, S. C.,
     Sushynski, J. M., and Bus, J. S. (2006). Strategies to assess systemic
     exposure of chemicals in subchronic/chronic diet and drinking water studies.
     Toxicol. Appl. Pharmacol. 211, 245-260.
   Shirley, B. (1984). The food intake of rats during pregnancy and lactation. Lab.
     Anim. Sci. 34, 169-172.
   Stock, B., Dean, M., and Levy, G. (1980). Serum protein binding of drugs
     during and after pregnancy in rats. J. Pharmacol. Exp. Ther. 212, 264-268.
   Stolc, V., Knopp, J., and Stolcova, E. (1966). Iodine, solid diet, water and milk
     intake by lactating rats and their offsprings. Physiol. Bohemoslov. 15,
     219-225.
   Strubbe, J. H., and Gorissen, J. (1980). Meal patterning in the lactating rat.
     Physiol. Behav. 25, 775-777.
   Sturtz, N., Bongiovanni, B., Rassetto, M., Ferri, A., de Duffard, A. M., and
     Duffard, R. (2006). Detection of 2,4-dichlorophenoxyacetic acid in rat milk
     of dams exposed during lactation and milk analysis of their major
     components. Food Chem. Toxicol. 44, 8-16.
   Timchalk, C. (2004). Comparative inter-species pharmacokinetics of
     phenoxyacetic acid herbicides and related organic acids: Evidence that
     the dog is not a relevant species for evaluation of human health risk.
     Toxicology 200, 1-19.
   Yoon, M., Madden, M. C., and Barton, H. A. (2006). Developmental
     expression of aldehyde dehydrogenase in rat: A comparison of liver and lung
     development. Toxicol. Sci. 89, 386-398.
   Yuan, J. (1993). Modeling blood/plasma concentrations in dosed feed and
     dosed drinking water toxicology studies. Toxicol. Appl. Pharmacol. 119,
     131-141.
   Zepnik, H., Volkel, W., and Dekant, W. (2003). Toxicokinetics of the
     mycotoxin ochratoxin A in F 344 rats after oral administration. Toxicol.
     Appl. Pharmacol. 192, 36-44.
-------
Journal of Exposure Science and Environmental Epidemiology (2009), 1-10
© 2009 Nature Publishing Group All rights reserved 1559-0631/09/$32.00
www.nature.com/jes



Research needs for community-based risk assessment: findings from a multi-disciplinary workshop


YOLANDA ANITA SANCHEZa, KACEE DEENERb, ELAINE COHEN HUBALc, CARRIE KNOWLTONa,
DAVID REIFc AND DEBORAH SEGALd

aAssociation of Schools of Public Health, Washington, DC, USA
bU.S. Environmental Protection Agency, Office of Research and Development, National Center for Environmental Assessment, Washington, DC, USA
cU.S. Environmental Protection Agency, Office of Research and Development, National Center for Computational Toxicology, Washington, DC, USA
dU.S. Environmental Protection Agency, Office of Research and Development, National Center for Environmental Research, Washington, DC, USA


Communities face exposures to multiple environmental toxicants and other non-chemical stressors. In addition, communities have unique activities and
norms that influence exposure and vulnerability. Yet,  few studies quantitatively consider  the role of cumulative  exposure and additive impacts.
Community-based risk assessment (CBRA) is a new approach for risk assessment that aims to address the cumulative stressors faced by a particular
community, while incorporating a community-based participatory research framework. This paper summarizes an Environmental Protection Agency
(EPA) sponsored workshop, "Research Needs for Community-Based Risk Assessment." This workshop brought together environmental and public
health scientists and practitioners for fostering an innovative discussion about  tools, methods, models, and approaches for CBRA. This workshop was
organized around three topics: (1) Data and Measurement Methods; (2) The Biological Impact of Non-Chemical Stressors and Interaction with
Environmental Exposures; and (3) Statistical and Mathematical Modeling. This report summarizes the workshop discussions, presents identified research
needs, and explores future research opportunities in this emerging field.
Journal of Exposure Science and Environmental Epidemiology advance online publication, 25 February 2009; doi:10.1038/jes.2009.8

Keywords: population-based studies, epidemiology, exposure modeling, analytical methods, empirical/statistical modeling, PBPK modeling.
Introduction

Communities  face  myriad  exposures  to  environmental
toxicants   and  other  non-chemical   stressors.  Although
environmental  epidemiology  studies  do consider multiple
risk factors, few studies have quantitatively considered the full
range of complex interactions among the multiple environmental
agents (chemical, biological, and social stressors)
within a targeted population or geographic area that
influence health outcomes. The handful of quantitative
cumulative risk assessments conducted to date consider only
the additive impacts of chemical agents that share a common
mode of action (Castorina et al., 2003; Payne-Sturges et al.,
2004a; Payne-Sturges et al., 2004b; Caldas et al., 2006; U.S.
Environmental Protection Agency, 2006a) or a  common
exposure media (Fox et al., 2004; Teuschler et al., 2004). In
addition,  some studies  have investigated the interaction of
environmental stressors that lead to negative health outcomes
(Cary et al., 1997; Morrison et al., 1998; Erren et al., 1999).
These studies, however, provide little information on
susceptibility factors, interactive effects of biological
responses, or social stressors that may modify toxic response.

1. Address all correspondence to: Deborah Segal, US EPA, Office of
Research and Development, National Center for Environmental Research,
Room 3108, 1025 F. Street, NW, Washington, DC 20004, USA. Tel.:
+ 202 343 9797. Fax: + 202 233 0677. E-mail: Segal.deborah@epa.gov
Received 8 September 2008; accepted 29 December 2008

      The 1996 Food Quality Protection Act expanded risk
    assessment for evaluating chemical mixtures and  environ-
    mental contaminants that target similar body mechanisms
    (U.S.  Congress,  1996). Cumulative  risk assessment was
    discussed  extensively  in  the International   Life  Sciences
    Institute (ILSI, 1999) publication, titled A Framework for
    Cumulative Risk Assessment, which focused on the cumula-
    tive toxicity, exposure, and risk  characterization of multiple
    environmental  contaminants. However, the methods  for
    cumulative risk assessment are still evolving (U.S. Environ-
    mental  Protection   Agency,  2003;  U.S.   Environmental
    Protection   Agency,  2007a).   The  2003   Environmental
    Protection   Agency   (EPA)   publication,  Framework for
    Cumulative Risk Assessment, widely expanded the definition
    of cumulative  risk   assessment to  include  non-chemical
    stressors and the concept of population vulnerability. The
    EPA defined cumulative risk assessment as "an analysis,
    characterization and possible quantification of the combined
    risks to health and/or the environment from multiple agents
    or stressors" (U.S. Environmental Protection Agency, 2003).
  The key aspects of the EPA Framework for Cumulative
Risk Assessment are:  (1) understanding the combined effects
of more  than one  agent/stressor;  (2)  considering  non-
chemical stressors; (3) focusing on identifying and character-
izing  vulnerable human and  ecological  populations; and
(4) using a place-based or population-based analysis for risk
assessments, which elicits community expertise. This  EPA
framework generated a paradigm shift in risk assessment by
greatly expanding the concept of an environmental "stressor"
to include chemical,  biological, physical, and  psychosocial
agents (U.S. Environmental Protection Agency, 2003). The
framework also includes the concept of population vulner-
ability — certain  disadvantaged,  underserved, and over-
burdened communities face  conditions that can exacerbate
environmental burdens. More specifically, mechanisms  of
vulnerability identified in the framework include differences
in individual or population  susceptibility,  exposure, prepa-
redness, and ability to recover (U.S. Environmental Protec-
tion   Agency,  2003;   National  Environmental   Justice
Advisory Committee, 2004).
  Communities  generally  share a  common  geographic
location and/or common experience (traditions, diet, beha-
vioral  norms).  Therefore,  cumulative risk assessment for
communities is inherently place based (for a discussion  on
place-based public health, see Patychuk, 2007; Yeboah,
2005). Community norms influence diet and activities that
determine  how individuals come  into contact with environ-
mental  contaminants   (U.S.   Environmental  Protection
Agency,  2003,  2006b;  National Environmental  Justice
Advisory  Committee,  2004).  In  addition, non-chemical
stressors may affect the  health outcome of  exposure  to
environmental  contaminants (White et  al.,  2007).  For
example, stress has been shown  to exacerbate  lead toxicity
(Bellinger et al., 1988; Tong et al., 2000; Cory-Slechta et al.,
2004). The EPA has reflected this understanding of place-
based cumulative risk assessment in the 2006 Human Health
Multi-Year  Plan,  by  expanding  the  long-term goal  of
"Research on  Cumulative  Risk"  to include  research  on
community-based cumulative risk assessment. This includes
the development and  application of tools and approaches for
assessing community risk, and the application of community-
based tools,  as well as approaches for assessing exposure to
environmental  contaminants and non-chemical  stressors
(U.S.  Environmental  Protection  Agency,  2006b;  U.S.
Environmental Protection Agency, 2007a).
  Therefore,  community-based risk assessment (CBRA) is
defined here as a model of risk assessment that addresses
multiple chemical and non-chemical stressors faced by a
particular  community, while incorporating a  community-
based participatory research framework and a transparent
process to instill confidence and trust among the community
members (U.S.  Environmental  Protection Agency,  2003,
2007b; National Environmental Justice Advisory Committee,
2004). CBRA may include characteristics of a community that
    cannot be  identified and  assessed through traditional  risk
    assessment paradigms, such as social and cultural dynamics
    of the community or resources, strengths, and relationships
    within the community (Israel et al., 1998). Community  and
    stakeholder involvement is critical to harnessing community
    knowledge and better understanding complex cumulative
    exposures (U.S.  Environmental Protection Agency, 2003;
    National Environmental Justice Advisory Committee, 2004;
    Menzie et al., 2007). Traditionally, this  community knowl-
    edge has been difficult to obtain using conventional research
    and risk assessment methods (Israel et al., 1998; Israel et al.,
    2001; O'Fallon  and Dearry, 2001; Corburn, 2005). Com-
    munity-based  participatory  research frameworks  can  aid
    efforts to involve the community, and can integrate insightful
    community  information  that can  advance environmental
    health research (O'Fallon and Dearry, 2002; Corburn, 2005;
    Israel et al., 2005) and risk assessment.
      CBRA reflects  the recommendations put forth in  the
    National   Environmental  Justice  Advisory  Council's
    (NEJAC)  report,  titled  "Ensuring  Risk  Reduction in
    Communities with Multiple Stressors: Environmental Justice
    and Cumulative Risks/Impacts," and echoes the interests of
    the EPA's Office of Environmental Justice and Office of
    Children's  Health Protection  (National Environmental
    Justice Advisory  Committee,  2004;  U.S. Environmental
    Protection  Agency, 2006b).  CBRA evolved from the EPA
    and the  NEJAC publications regarding frameworks  for
    cumulative risk  assessment, which indicated the need for
    dealing with risk on a community-by-community basis (U.S.
    Environmental Protection  Agency, 2003).
      Including  CBRA  in   the  regulatory decision-making
    process poses  some significant challenges. These challenges
    include  the need  to assess  toxicity of mixtures,  measure
    vulnerability of populations, and evaluate interactions
    among multiple stressors,  chemical or non-chemical (U.S.
    Environmental Protection Agency,  2003). Additional chal-
    lenges are  associated  with the  need  to  partner  with
    stakeholders to obtain community knowledge.
    Traditional EPA risk assessments evaluate the hazardous
    properties  of  substances,  assess  the  extent  of  human
    exposure, and characterize the risk of adverse health effects
    (National Research Council, 1983).  Many  of  these  risk
    assessments aim to protect  the most sensitive individuals and/
    or  groups  in  the  general  population. In contrast, CBRA
    would characterize additional community-level stressors  and
    measures of vulnerability to help inform risk evaluation  and
    decision-making at the local level.  This may include how
    community-level stressors  and vulnerability factors  interact
    with environmental contaminant exposures to impact the
    overall risk to the individuals within the defined community.
      To  address the scientific  challenges inherent in CBRA, the
    Office  of  Research and  Development (ORD)  at EPA
    sponsored a workshop, titled "Research Needs for Community-Based
    Risk Assessment." This workshop focused on
three topics:  (1) Data and Measurement Methods; (2) The
Biological Impact of Non-Chemical Stressors and Interac-
tion with Environmental Exposures; and (3) Statistical and
Mathematical Modeling (U.S. Environmental Protection
Agency, 2007b). This paper  provides an  overview of the
workshop, presents research needs identified based on results
of the workshop, and highlights themes to further advance
the science behind CBRA.
Workshop overview

On 18 and 19 October 2007, in Research Triangle Park, NC,
the EPA's ORD sponsored the
"Research Needs for Community-Based Risk Assessment"
workshop. This  multi-disciplinary workshop was coordi-
nated by a small organizing committee that developed four
basic questions regarding CBRA, which became the frame-
work of the workshop:

•  What research has been conducted?
•  What is the current  state of the  science?
•  What are the research needs?
•  How can community-based information be quantified in a
   way that is useful for EPA risk assessments?

   Approximately 85 people attended the workshop. Participants
included EPA employees, contractors, and fellows
spanning eight offices and five regions. Other participants
were affiliated with the National Institute of Environmental
Health Sciences  (NIEHS), academia, other  research insti-
tutes,  local government,  or  community advocacy  groups.
Before the workshop,  participants were encouraged to read
the  2007   Environmental  Health  Perspectives Mini-Mono-
graph on  cumulative risk assessment (Callahan and Sexton,
2007;  DeFur et al., 2007; Menzie et al., 2007; Ryan et al.,
2007;  Sexton and Hattis, 2007). The workshop was divided
into  three topics,  as  described  earlier. Day  one  of the
workshop  included presentations on the state of the science
for  each   topic.  Small  breakout  sessions  on  each topic
occurred on day  two, at  which 10-20 participants discussed
the research needs for  their respective topics. Session chairs
compiled and summarized the research needs identified by the
breakout group participants,  and these were presented back
to the full group for further  discussion. A summary report
that  included details  of the workshop  presentations and
discussion  was prepared (U.S. Environmental Protection
Agency, 2007b).

Session I: Data and Measurement Methods
Measuring chemical and  non-chemical stressors,  susceptibil-
ity factors, and health  outcomes at the community-level will
play an important role in CBRA. Examples include
measuring personal and community exposures to multiple
chemical stressors, monitoring time-activity behavior,
   measuring markers of susceptibility, and tracking early health
   outcomes. This session explored currently available tools and
   methods.
      Session I included three  presentations: "Development of
   Nanoscaled Sensor Systems for Detecting and Monitoring
   Environmental  Chemical Agents"  by  Desmond  Stubbs of
   Oak Ridge  Center for Advanced Studies (ORCAS); "Data
   Collection Platforms for Integrated Longitudinal Surveys of
   Human Exposure-Related Behavior" by Paul Kizakevich of
   RTI International;  and  "Assessment Methods for Commu-
   nity-Based Risk Assessment"  by  Elaine Faustman of the
   University of Washington.
      The first speaker, Dr. Stubbs, focused on the application of
   emerging  technologies  for measuring exposures to chemical
   stressors. He summarized results of an earlier workshop co-
   sponsored by the EPA and  the ORCAS as the background
   and context  for  his   presentation (U.S.  Environmental
   Protection Agency  and  Oak  Ridge Center  for  Advanced
   Studies, 2006c). Both the EPA and the  NIEHS identified the
   need for a rugged, lightweight, low-cost, wearable, real-time
   sensor  capable  of multi-analyte  detection  with minimal
   burden to the individual. The "gold standard" was defined
   as  the  ability  to  simultaneously  detect multiple chemical
   agents in  the field with the same sensing system and to link
   this data to a specific biological event. Such a device would be
   capable of remote data acquisition, location recording, and
   measurement of both  the concentration and frequency of
   environmental exposure. Dr. Stubbs identified  the  ongoing
   research on several devices  for use in  exposure assessment,
   including  passive  radio-frequency identification tags,  an
   electronic nose  (i.e., "dog-on-a-chip"),  microelectromagnetic
   sensors, and interferometric optical  sensors. He then discussed
   microfabricated cantilever array platforms and the potential
   for these to  provide  lightweight, wearable multi-analyte
   sensors. Dr. Stubbs also described the possibility of linking
   the pea-size sensing and telemetry  unit  to a receiver unit the
   size of a small personal  digital assistant,  designed to be carried
   in a pocket. The personal digital  assistant unit could  have
   analysis and display capability,  and support global position-
   ing and bio-monitoring device interfaces. Preliminary results
   suggest that these devices are capable of real-time detection
   (sub-second scale) of low vapor pressure chemical compounds
   in the sub-parts-per-billion range. The potential power of these
   new small-scale technologies for  measuring personal and
   community-level exposures  to  a  wide range  of chemical
   constituents and stressors was recognized by many CBRA
   workshop participants.
      The  second  speaker,  Dr. Kizakevich,  focused  on  ap-
   proaches  for  measuring  exposure-related behaviors  for
   assessing risk.  He presented details on  the development of a
   system that integrates multiple  real-time data  collection
   streams  and  survey  modes on  a  handheld  Pocket  PC
   platform (Whitmore and Kizakevich, 2004). The objectives
   of  this research are  to develop, validate,  and evaluate
-------
     Sanchez et al.
                                                                               Research needs for community-based risk assessment
innovative methods for collecting TALE (time/activity/location/
exertion-level) data, dietary consumption data, and data on
use  of  consumer products,  including pesticide products,
household  cleaning products, and  personal care products.
The system  integrates  diaries and questionnaires with a
collection  of wireless peripheral  devices for  monitoring
physical and  physiological data.
  The RTI researchers are also exploring different methods
for  collecting data  and  evaluating  these  methods  using
feedback from the study population. Three Pocket PC diary
modes were studied: interactive menus, voice questionnaires,
and passive periodic  photos.  Innovations, such as passive
microenvironment identification (i.e., beacons), passive exer-
tion assessment, wireless product use event markers, wireless
interfaces,  intelligent  prompting,  GPS tracking,  and auto-
mated daily review for collecting the  data both accurately and
with a low participant burden are also being investigated.  The
system design emphasizes easy reconfiguration for supporting
varied study  requirements, investigator needs, and partici-
pant preferences.  Dr. Kizakevich noted that data collected
during piloting of these approaches will be made available
after the next round of monitoring.  Originally, the goal  was
to determine the  best method for  collecting the exposure-
related behavior data. However, based on the  first round of
evaluation, it is clear  that the best technology for collecting
behavior data will be determined by the objectives of a given
study or risk assessment. Results of  the RTI research  will
provide  information that  will  allow  investigators   and
communities  to determine which method is best for their
needs.
  In the third presentation, Dr. Faustman considered study
approaches and  data requirements for characterizing  the
exposure and risk factors to assess  individual- and commu-
nity-level   risks.  She presented three  types  of  studies
conducted  by the University  of Washington investigators to
understand pesticide exposures in children  (Thompson et al.,
2003, 2008; Vigoren et al., 2007). The  three studies presented
were a  community-based participatory research project, a
longitudinal multiple sampling project aimed at understand-
ing between-  and within-family variability, and  a longitudinal
cohort study. Through these three examples, Dr. Faustman also
highlighted the importance of collaboration between researchers
and community members. Throughout her
presentation, Dr.  Faustman emphasized the need for study
designs  to  integrate  the wide range of  data required to
conduct  CBRAs.  Although  the  researchers  typically  had
access to general  statistics on pesticide usage,  an important
insight into the potential sources and pathways was obtained
from community participants  that   proved   integral   for
understanding exposures. Her final  message focused on the
need to develop  and incorporate biomarkers of exposure,
susceptibility, and effect into studies to identify vulner-
able groups and understand risks. Genomic and gene
expression  analysis technologies are  being applied in some of
    the studies by the University of Washington, and have the
    potential to improve prediction of exposure-response and at-
    risk individuals in communities.
       These presentations provided insight into the tremendous
    challenges  and wide range of data needs  associated with
    characterizing stressors for CBRA. All the speakers identified
    the need for efficient tools for monitoring personal exposures
    to better identify vulnerable groups, understand significant
    exposure  pathways,  and develop  targeted  interventions.
    Novel measurement methods for monitoring environmental
    stressors (small-scale  sensors),  collecting exposure-related
    behavior data (wireless,  real-time  survey  methods),  and
    developing biomarkers of exposure, susceptibility, and effect
    (genomic and gene expression analyses) were highlighted in
    the context of CBRA.

    Session II: The Biological Impact of Non-Chemical
    Stressors and Interaction with other Environmental
    Exposures
    There is a recognized need to incorporate non-chemical
    stressors into cumulative risk assessment (U.S. Environmen-
    tal Protection Agency, 2003; National Environmental Justice
    Advisory Committee, 2004).  Most public  health research on
    non-chemical stressors has focused on the health effects
    from exposure to chronic stress (Negro-Vilar,  1993; Bjorn-
    torp, 2001; Kramer et al., 2001; Maccari et  al., 2003; Strine
    et al., 2004;  Wright, 2005; Tamashiro et al.,  2007;  Suglia
    et al., 2008). However, CBRA provides  an opportunity to
    investigate  the non-chemical stressors that might interact with
    environmental contaminants. Therefore, this session focused
    on understanding the  health impacts  of  non-chemical
    stressors,  specifically chronic  stress,  and  their  ability to
    interact with exposure to environmental toxicants affecting
    the risk of adverse health  outcomes.
       Session  II included three  presentations:  "Social Stress,
    Stress Hormones,  and Neurotoxins", by James Herman of
    the  University   of  Cincinnati;  "Intersections  of  Social
    Ecology, Neurobehavioral Development,  and Environmental
    Contamination"  by  Bernard Weiss  of the  University of
    Rochester;  and  "Social  Environment as  a  Modifier of
    Chemical Exposures" by Robert  Wright of  the Harvard
    University  School of Public Health.
       Dr. Herman described the biological systems that mediate
    stress responses. Herman and Seroogy (2006) had broadly
    defined stress as a "real or perceived threat to homeostasis."
    The  secretion  of glucocorticoid  hormones, particularly
    cortisol, functions to return the body to homeostasis after
    stress. However, a prolonged secretion of cortisol and other
    glucocorticoids due  to chronic stress inhibits neurogenesis.
    This can contribute to deleterious effects on  the body and
    brain, including immune system dysfunction, depression, and
    cognitive decline (Herman and Seroogy, 2006). Dr. Herman
    also  highlighted that this  process can  exacerbate other
    affective disease states, such as schizophrenia and bipolar
-------
disease. He also emphasized the potential for the interaction
of specific environmental neurotoxicants and chronic stress,
because they both represent "hits" on a target system in the
multi-hit hypothesis of toxicity (White et al., 2007). More-
over, both environmental neurotoxicants and chronic stress
can  modulate glucocorticoid  secretion,  which  can work
together to potentiate the effects on nerve cells and neurons.
This phenomenon has been shown by exposure to lead and
chronic stress (Bellinger et al., 1988; Tong et al., 2000; Cory-
Slechta et al., 2004; Bellinger, 2008).
  Dr. Weiss, the second speaker, focused on the interaction
between exposure to neurotoxicants and social disadvantage,
referring to his review of children's vulnerability to environ-
mental contaminants (Weiss  and Bellinger,  2006). He used
lead exposure as a case study and discussed the sometimes-
difficult-to-quantify effects on intelligence quotient and
behavior  associated with  lead exposure.  Dr.  Weiss  also
emphasized  the  disparate  exposure of people of lower
socioeconomic status to  both  lead  and  chronic  stress,
explaining that lead is only one example of differential
exposure of this population to neurotoxins. Other examples
of differential exposure to neurotoxins include environmental
tobacco  smoke  (Barbeau  et al., 2004a;   Barbeau et al.,
2004b), pesticides (Sexton et al., 2006), and mercury (Payne-
Sturges and Gee, 2006).
  The third speaker, Dr. Wright, integrated the information
presented by the other speakers with his detailed discussion of
the  effects of environmental contaminants  on  neuronal
function. He also used lead exposure as a case study. Lead
exposure  stimulates neurotransmitter release causing inap-
propriate  firing of  neurons  and  the blockage of calcium
channels  required  for  proper  neuron function.  Citing
numerous animal studies, Dr. Wright illustrated that positive
social environments can mitigate the effects of environmental
toxicants, such as lead (Guilarte et al., 2003; Weaver et al.,
2004). Similar studies in humans have shown that social
determinants can alter susceptibility to environmental con-
taminants (Tong et al., 2000; Clougherty et al., 2007). A new
birth  cohort,  the  Early  Life  Exposure  in   Mexico to
Environmental Toxicants Project (ELEMENT), has been
established for investigating these interactions.  This project
will  examine stress, lead  exposure, iron  deficiency,  and
neurodevelopment with a holistic perspective. The long-term
goals of ELEMENT are to: (1) identify factors that increase
and/or decrease metal toxicity; (2) understand  the biology
of metal  neurotoxicity;  (3) prevent toxicity; and (4) treat
toxicity after it has occurred, by finding  the  appropriate
intervention(s).
  Together,  these three presentations provided a compre-
hensive picture of current  knowledge about how the brain
responds to chronic stress and how this response can interact
with  exposure to environmental contaminants. This over-
view  helped  set the stage for the breakout sessions,  which
charged the session participants with identifying gaps in our
   understanding of the connections between chronic stress
   and environmental contaminant toxicity.

   Session III: Statistical and Mathematical Modeling
   There are statistical and modeling challenges involved in
   viewing organisms and the environment as they really are: an
   integrated whole. Traditional biostatistical approaches, such
   as linear regression, data stratification or transformation, and
   others are  useful,  yet have important limitations when
   handling  high-dimensional  data  of disparate  types.  The
   Session III  discussions included integrating data that vary
   across space and time, pooling datasets drawn from multiple
   sources, and creating accessible and user-friendly methods for
   public participation.
      This session  included  three  presentations:  "Community
   Based Risk Assessment—A Statistician's  Perspective" by
   Louise Ryan of the Harvard School of Public Health; "A
   Multi-Site Time Series Study of Hospital Admissions and Fine
   Particles: A Case-Study for National Public Health Surveil-
   lance" by Francesca Dominici of the Bloomberg School of
   Public Health at Johns  Hopkins  University; and  "Risk
   Assessment/Risk Communication: Understanding the Commu-
   nity" by Thomas Schlenker of the Public Health Depart-
   ment of Madison-Dane County, WI.
      The  first  speaker,  Dr.  Ryan,  discussed  examples of
   community-focused  research studies that were  similar in
   terms of  having sparse data, a  clever combination of data
   from multiple sources,  and the  inclusion of spatiotemporal
   modeling in the study designs.  The most successful studies
   integrated both  personal  and community-level  data to
   overcome issues of sparse data  and unknown confounding
   factors (Ryan, 2008). As uncertainty tends to  be large when
   dealing with data collected  in real-world communities,  it is
   important to measure characteristics of the  community in
   addition  to individuals. Appropriate statistical techniques,
   such as spatiotemporal and  hierarchical models, are of great
   practical  use  in  such studies  that require  synthesis of
   information from multiple sources. However, researchers
   must be cautioned against overinterpreting model results and
   placing too much emphasis on p-values disconnected from
   other relevant information. For  complex  problems,  the
   results must undergo rigorous  sensitivity analyses  in order
   to fine-tune the models. Dr. Ryan  called for continued work
   for  developing  tools  capable  of combining information
   measured on multiple scales and degrees of uncertainty, so
   that the community-based models  are robust with respect to
   time, space, and other perturbations.
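
As an illustration of how a hierarchical model can compensate for sparse community data, the following minimal Python sketch applies partial pooling under a normal-normal model. It is not taken from the workshop or from Ryan (2008); the function name `partial_pool`, the assumption that the within-community and between-community variances are known, and all numbers are hypothetical.

```python
def partial_pool(group_means, group_sizes, within_var, between_var):
    """Shrink each community mean toward a precision-weighted overall mean.

    Assumes a normal-normal model with known variances (an assumption made
    for illustration only):
      within_var  -- variance of individual measurements within a community
      between_var -- variance of true community means around the overall mean
    """
    if len(group_means) != len(group_sizes) or not group_means:
        raise ValueError("group_means and group_sizes must be non-empty and equal length")
    if any(n <= 0 for n in group_sizes) or within_var <= 0 or between_var <= 0:
        raise ValueError("sizes and variances must be positive")

    # Sampling variance of each observed community mean.
    sampling_vars = [within_var / n for n in group_sizes]

    # Precision-weighted estimate of the overall mean.
    weights = [1.0 / (v + between_var) for v in sampling_vars]
    overall = sum(w * m for w, m in zip(weights, group_means)) / sum(weights)

    shrunk = []
    for mean, v in zip(group_means, sampling_vars):
        # Shrinkage factor: near 1 for sparse data, near 0 for well-sampled groups.
        shrink = v / (v + between_var)
        shrunk.append(shrink * overall + (1.0 - shrink) * mean)
    return overall, shrunk


# Hypothetical example: one sparsely sampled and one well-sampled community.
overall_mean, pooled_means = partial_pool(
    group_means=[10.0, 20.0],
    group_sizes=[1, 100],
    within_var=4.0,
    between_var=1.0,
)
```

The sparsely sampled community is pulled strongly toward the overall mean, while the well-sampled one barely moves. Rerunning the sketch over a range of plausible `between_var` values is one simple form of the sensitivity analysis described above.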
      Dr. Dominici, the second speaker, discussed the utility of a
   national system for tracking population health. She stated
   that population health research could be advanced rapidly by
   integrating the existing databases  (each containing  separate
   information on environmental,  social, and economic factors
   that impact health) and by designing new statistical models to
   describe the associated risk factors. Dr. Dominici highlighted
-------
how multi-site  studies comparing day-to-day variations in
hospital  admission  rates  with  day-to-day  variations  in
pollution  levels  within  the same community are used  to
estimate city-specific pollution effects relative to confounding
effects, such as trend, season, and weather (Dominici et al.,
2006). Results have  indicated that effects  are  consistent
across location, and that there is a lag between air pollution
exposure and respiratory effect.  These preliminary  results
indicate that flux in the levels of air pollution affects health.
Such  studies provide  an  impetus for  linking national
databases and developing appropriate  analysis methods to
investigate risk  at the  local  level. Owing  to  the small
attributable risk for air pollution and  the large number of
potential confounders,  single-site studies  generally  display
increased  statistical error. Therefore, a national system for
analyzing data  from  multiple locations in  a  systematic
fashion is necessary to reliably assess population health.
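
The gain in precision from combining sites can be illustrated with a small Python sketch of fixed-effect inverse-variance pooling. This is a simplified stand-in, not the method of Dominici et al. (2006); the function name `pool_inverse_variance` and the city-level estimates and standard errors are invented for illustration.

```python
import math


def pool_inverse_variance(estimates, std_errors):
    """Combine site-specific effect estimates using inverse-variance weights.

    Each site is weighted by 1 / SE^2, so more precise sites count more.
    Returns the pooled estimate and its standard error.
    """
    if len(estimates) != len(std_errors) or not estimates:
        raise ValueError("estimates and std_errors must be non-empty and equal length")
    if any(se <= 0 for se in std_errors):
        raise ValueError("standard errors must be positive")

    weights = [1.0 / (se ** 2) for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical city-specific effect estimates (illustrative numbers only).
city_estimates = [0.010, 0.004, 0.007, 0.012]
city_std_errors = [0.006, 0.005, 0.008, 0.007]

pooled, pooled_se = pool_inverse_variance(city_estimates, city_std_errors)
```

Under these assumptions the pooled standard error is smaller than that of any single city, which is the statistical motivation for analyzing multiple locations together.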
  Dr. Schlenker, the final session speaker, emphasized that
accurate  and valid risk assessments cannot be carried out
unless there  is an understanding of  the community and
communication between the community and researchers. He
illustrated  this  point with examples of community-based
studies involving lead and manganese, in which commu-
nicating a story about the "life" of these metals in the body
instead of merely providing data  and scientific jargon about
internal disposition was crucial to success (Schlenker, 1989).
Dr. Schlenker proposed that providing examples of how a
model has been (or can be) used at the community level is the
best way to take complicated models and move them into a
context in which they can be trusted and understood by the
community. By including communities in the analysis process,
research can benefit fully from community guidance and
case-specific advice.
  Collectively, the speakers for Session III identified  several
key  requirements for  successful  modeling  for  CBRA—
including  data  collected across  spatiotemporal scales, in-
formation on multiple communities for elucidating commu-
nity-specific  risk   factors,   comprehensive  community
involvement,  and appropriate statistical  analysis methods.
The speakers identified the need for continued development
of methods for analyzing disparate data types, integrating
existing (and nascent)  databases, and working  to  mean-
ingfully include communities in all stages of research.

Workshop results

Emerging Themes
The major outcome  of the workshop is the resulting list of
research  needs (see  Table  1)  and a  list of suggestions to
enhance  CBRA (see Table 2) elicited from the summary
document.  Many broad ideas  were  mentioned  in  more
than one workshop  session topic,  suggesting emerging  and
crosscutting themes. Three overarching themes, which
    encompass the individual research needs, were identified:
    (1) scientific  tools and methods  to  better measure  and
    evaluate exposures and health  outcomes at a  community
    level; (2) environmental health infrastructure; and (3) com-
    munity involvement processes.
       The  need  for  scientific  tools (methods and  models) to
    better measure and evaluate exposures and health outcomes
    at a community  level was identified throughout the work-
    shop. On the basis of workshop presentations of the state of
    the art  for monitoring and modeling,  it was clear that some
    very sophisticated tools are available  and that much of the
    research effort could  be focused on adapting and applying
    these tools to the  specific objectives of a CBRA. In Session I,
    the potential for emerging monitoring technologies to provide
    low-burden, real-time data on the full range of community
    environmental stressors  was identified. Furthermore, Session
    III participants  suggested  that statistical  techniques  are
    needed  to better evaluate health  outcomes at the community
    level, including techniques for synthesizing information from
    multiple datasets, reducing limitations of small population size,
    and characterizing group-level effects. Workshop participants
    also identified the need for adjustments to the traditional risk
    framework. Session II  participants recommended amend-
    ments in the  risk paradigm to incorporate vulnerability  and
    non-chemical stressors.   Session I  participants identified a
    need  for  methodology  and modeling  changes to include
    qualitative data and incorporate social (e.g., poverty, access
    to  medical  care,  chronic  stress)  variables  as modeling
    parameters. A major outcome of the workshop was the
    recognition that a new conceptual model for risk assessment
    may be needed.
       A second theme that emerged throughout the workshop
    was the need for an environmental health infrastructure to
    address the current gaps in data  and data accessibility to
    foster multidisciplinary research required for CBRA. Session
    III participants suggested a better infrastructure is needed to
    create enhanced access to existing databases and develop
    transparent modeling methods for diverse disciplines, entities
    of government, research groups, and community organiza-
    tions. All three sessions advocated for an enhanced
    infrastructure, which  could ensure multiple levels of local,
    state, tribal, and federal entities working together on CBRA.
    In addition, all the sessions resulted in the acknowledgment
    of the need to facilitate cross-disciplinary teams within public
    health practice and research, social  science, and environ-
    mental  health science.
       A final recurring theme was the need for community
    involvement.  This will require the establishment and fostering
    of effective working  relationships between  the  community
    and researchers, as well as between the community and govern-
    ment agencies. This may involve training on the options
    available for community involvement (such as  the use of a
    community-based participatory  research framework) within
    government  agencies and  among research institutes.  In

-------
 Table 1. Research needs for community-based risk assessment (CBRA) by workshop session and emerging theme

                                                                                                                        Emerging themes
                                                                                                                        #1     #2    #3
 Session I: Data Needs and Measurement Methods for CBRA                                                                          X
 Develop  metrics, indicators, and biomarkers for exposure and health tracking surveillance
 Develop  simple and low-cost monitoring methods for pollutants and pathogens at the individual and community level over space and   X
 through time (including real time)
 Develop  simple and low-cost monitoring methods for non-chemical stressors at the individual and community level over space and     X
 through time (including real time)
 Develop  enhanced sensor technologies for providing real-time data on individual and community level measures of exposure to        X
 environmental stressors
 Create accessible and well-documented databases with links to the full range of exposure information, to include an infrastructure for          X
 facilitating addition of data by investigators and the sharing of data and tools used to characterize environmental stressors
 Identify and adapt indices used currently in social sciences for measuring community-level psychosocial health                       X
 Translate more qualitative social indices into a form that is useful for quantitative risk assessments                                  X

 Session II: The Biological Impact of Non-Chemical Stressors and Interaction with Other Environmental Exposures
 Review social variables of importance for  health in the context of the EPA risk assessment                                        X
 Develop  approaches for incorporating vulnerability into risk assessment models                                                  X
 Develop  techniques to incorporate important social variables as modeling parameters                                             X
 Develop  techniques to use community characteristics as proxies  of psychosocial exposure                                          X
 Understand the interaction (chemical dose-response relationships) of chemical and non-chemical stressors, specifically psychosocial     X
 stress
 Obtain data on baseline variability of psychosocial stress hormones among the population in order to understand inter- and intra-             X
 individual variability
 Develop  tools  to monitor psychosocial stress levels in real time (develop biomarkers) at  individual and community levels              X
 Incorporate psychosocial stress into physiologically based pharmacokinetic (PBPK) and physiologically based pharmacodynamic       X
 (PBPD) models
 Examine differential activity patterns  between social groups                                                                    X

 Session III: Statistical and Mathematical Modeling for CBRA
 Compare various monitoring and modeling techniques to assess value and ease of use                                             X
 Develop  techniques to integrate existing datasets on population health for future  predictions/modeling                              X
 Develop  and apply advanced statistical techniques to: characterize group-level effects, synthesize information from multiple datasets,    X
 extrapolate data across communities,  reduce limitations of small population studies, account for possible underestimation of exposure,
 etc.
 Increase the ability of hierarchical Bayesian models to add data from multiple sources and scales                                   X
 Develop  spatiotemporal models that can adjust for information at multiple scales and levels of accuracy (temporal, spatial, or data from   X
 multiple sources)
 Develop  better geospatial techniques to characterize communities                                                               X
 Explore emerging geospatial tools (e.g., Google Earth)                                                                        X
 Develop hierarchical datasets gathered at multiple levels that can be mapped, collected, organized, and accessed by community members          X
 Improve  methods for interpreting biomonitoring data                                                                         X
 Develop  transparent modeling methods that  can be used collaboratively with the community                                       X     X
 Better communicate methods and results of complex models                                                                   X

 Emerging themes: #1, Scientific tools and methods to better evaluate health outcomes at a community level (including a new framework for risk assessment);
 #2, Environmental health infrastructure; #3, Community involvement processes.
 addition,  paradigm  shifts  within  agencies  and  research
 institutes  may   be   necessary   to  initiate  CBRA  (U.S.
 Environmental Protection Agency, 1999; National Environ-
 mental Justice Advisory Committee,  2004).  Although not
 discussed explicitly, community-based participatory
 research and community involvement in decision-making are
 not without challenges. For example, truly involving the
 community  as active participants  in  research  or decision-
 making  is expensive and requires  a great deal of resources.
 Additionally,  there  may  be a  lack  of trust,  as  well  as
    differences in  goals, values,  and perspectives between  the
    community members, scientists (Israel et al., 2005),  and risk
    assessors.

    Review of Research Needs
    Some research needs and other suggestions identified at the
    workshop (Tables 1 and 2) include using tools, methods, or
    approaches that  exist, but are not currently  being  applied
    to risk  assessment. Session I  indicated the  need to refine
    sensor technologies for providing real-time data on commu-
-------
Table 2.  Crosscutting suggestions to enhance community-based risk assessment (CBRA) by emerging theme

                                                                                                          Emerging themes
                                                                                                           #1    #2   #3

Cross-Cutting Suggestions
Develop a new framework to integrate all chemical, non-chemical, and vulnerability issues into risk assessment                    X
Establish attributes of successful and unsuccessful case studies (deliberative processes where communities partner with the EPA).       X
Integrate community knowledge for risk assessment                                                                   X
Develop tools/methods to elicit community knowledge for risk assessment                                                 X
Establish models, tools, and frameworks  from other disciplines (specifically the ecological sciences) that would be useful for human     X
health risk assessment
Create access to databases that give information at the local level                                                        X
Integrate multidisciplinary teams to undertake CBRA research                                                          X
Integrate multi-agency  (federal, state, local) partnerships to address CBRA                                                 X
Utilize community training modules on basic environmental health and risk assessment                                       X
Focus on research that is directly usable by community or its local health or environmental department (community-driven research)   X
Establish training modules in academia/agencies on how to conduct community-based participatory research                      X
Emerging themes: #1, Scientific tools and methods to better evaluate health outcomes at a community level (including a new framework for risk assessment);
#2, Environmental health infrastructure; #3, Community involvement processes.
nity  environmental  stressors.  Further examples  of the
existing tools and methods are techniques in use by social
scientists  or ecologists.  Session II suggested  the  need to
identify and apply indices for measuring  community-level
psychosocial health  (e.g.,  community  cohesion)  and use
community characteristics as proxies  of psychosocial expo-
sure. All the session groups identified the need for methods to
work effectively with community members, such as how to
elicit community knowledge and address research problems
that are applicable to the community.  Such approaches may
already be  in use among social science and public health
disciplines, but risk assessors may be able to further refine
these  methods for their work. Altogether, it is important to
understand models,  tools,  and  frameworks  from  other
disciplines that would  be useful for  human  health  risk
assessment.
  Other identified research needs included the  development
of new tools and methods that could be useful for CBRA.
This may require multidisciplinary  collaborations to create
novel techniques and modeling approaches. For example,
techniques used for translating more qualitative social indices
into quantitative risk assessments were  addressed in Session I.
Session III  identified   the  need  to  develop  transparent
modeling  methods that  can be used  collaboratively within
the community. Additional needs addressed in  Session  II
include approaches for incorporating  vulnerability  into risk
assessment models and  techniques  for  including important
social variables (e.g., poverty, access to medical care, chronic
stress, etc.)  as modeling  parameters.

Next steps

Multiple actions have been taken since the
workshop to advance CBRA research. First, EPA's
    National  Center  for  Environmental  Research  (NCER)
    compiled the workshop proceedings, which are now available
    online. This document includes a copy of most presentations,
    a final agenda,  and a summary report capturing all the
    presentations  and  discussions   of  the  workshop  (U.S.
    Environmental Protection  Agency,  2007b).  Second,  the
    NCER created a Listserv to disseminate information and
    resources relevant to CBRA (to enlist, see: http://www.epa.gov/ncer/CBRA
    web site *forthcoming*).  Most of the
    information is digested in a monthly bulletin. Third, NCER
    is establishing a CBRA Science Page on its Web site, which
    will provide  information regarding  CBRA  to the general
    public and to the research community.  In addition to these
    NCER activities, other  parts of the  EPA  are  supporting
    CBRA.  For  example,  the EPA's Risk Assessment Forum
    has been  a  proponent  of CBRA.  The  EPA's CARE
    (Community  Action for a Renewed  Environment) program
    is  a  community-based  cooperative agreement  program,
    which helps to build broad-based partnerships for reducing
    environmental risks at the  local level.  Also, the  EPA's
    Region 6 is partnering with the Ponca Tribe of Northern
    Oklahoma, EPA's Office of Research and Development, the
    University  of  North  Texas,  and  the  Oklahoma  State
    University  to conduct  a cumulative risk  assessment of the
    Tribe,  examining   holistically  the  effects   of   numerous
    environmental stressors on tribal  lands.
       The next step to support CBRA within the EPA  is for
    NCER to  incorporate CBRA into  its  extramural research
    program.   NCER's  mission is   to  support  high-quality
    research by the nation's leading scientists, which will improve
    the  scientific  basis  for  decisions on  national environmental
    issues to help the  EPA  achieve  its goals.  In 2009,  EPA/
    NCER plans to issue  a  Request for Applications (RFA)
    soliciting research to further the field of CBRA.  This RFA

Journal of Exposure  Science and Environmental Epidemiology (2009), 1-10

-------
Research needs for community-based risk assessment
                                                                                                                                Sanchez et al.
will  aim to  address  some  of  the  major research  needs
identified in this workshop.
 Acknowledgements

 The authors are grateful to the many workshop presenters for
 their thoughtful  contributions,  and the  workshop  partici-
 pants for their valuable insights and discussions. The authors
 would like to acknowledge Nigel Fields for his breakthrough
 concepts  on community-based risk  assessment,  and for his
 energy and inspiration.
 Disclaimer

 This  publication was developed under Cooperative Agree-
 ment No. #X3-83085001 awarded by the US Environmental
 Protection  Agency (EPA)  to the Association of Schools  of
 Public Health (ASPH). It has not been formally  reviewed by
 the EPA. The views expressed in this paper are solely those  of
 the  authors  and  do  not  necessarily  reflect those of the
 Agency.  The  EPA  does   not  endorse   any  products   or
 commercial services mentioned in this publication.
 References

 Barbeau E.M., Krieger N., and Soobader M.J. Working class matters:
    socioeconomic disadvantage, race/ethnicity, gender, and smoking in NHIS 2000.
    Am J Public Health 2004a: 94(2): 269-278.
 Barbeau E.M., McLellan D., Levenstein C., DeLaurier G.F., Kelder G., and
    Sorensen G. Reducing occupation-based disparities related to tobacco: roles for
    occupational health and organized labor. Am J Ind Med 2004b: 46(2): 170-179.
 Bellinger D., Leviton A., Waternaux C., Needleman H., and Rabinowitz M. Low-
    level lead exposure, social class, and infant development. Neurotoxicol Teratol
    1988: 10(6): 497-503.
 Bellinger D.C.  Lead neurotoxicity and socioeconomic status: conceptual  and
    analytical issues. Neurotoxicology 2008: 29(5): 828-832.
 Bjorntorp P. Do stress reactions cause abdominal obesity and comorbidities? Obes
    Rev 2001: 2(2): 73-86.
 Caldas E.D., Boon P.E., and Tressou J. Probabilistic assessment of the cumulative
    acute exposure  to organophosphorus  and carbamate  insecticides in the
    Brazilian diet. Toxicology 2006: 222(1-2):  132-142.
 Cary R., Clarke S.,  and Delic J.  Effects of combined exposure to  noise  and
    toxic substances-critical review of the literature. Ann Occup Hyg  1997: 41(4):
    455-465.
 Callahan M.A.,  and Sexton K. If cumulative risk assessment is the answer, what is
    the question? Environ Health Perspect 2007:  115(5): 799-806.
 Castorina R.,  Bradman  A., McKone T.E., Barr D.B.,  Harnly M.E.,  and
    Eskenazi  B.  Cumulative  organophosphate pesticide  exposure  and  risk
    assessment among pregnant women living in an agricultural community:  a
    case  study from the CHAMACOS cohort. Environ Health Perspect 2003:
    111(13): 1640-1648.
 Clougherty J.E., Levy J.I., Kubzansky L.D., Ryan P.B., Suglia S.F., Canner
    M.J., and Wright  R.J. Synergistic effects  of traffic-related air pollution and
    exposure to violence on urban asthma etiology. Environ Health Perspect 2007:
    115(8): 1140-1146.
 Corburn J.  Street Science: Community  Knowledge and Environmental Health
    Justice. MIT Press, Cambridge, Massachusetts, 2005.
 Cory-Slechta D.A., Virgolini M.B., Thiruchelvam M., Weston D.D., and Bauter
    M.R. Maternal stress modulates the effects of developmental lead exposure.
    Environ Health Perspect 2004:  112(6): 717-730.
    DeFur P.L., Evans G.W., Cohen Hubal E.A., Kyle A.D., Morello-Frosch R.A.,
        and Williams D.R. Vulnerability  as a  function  of individual and  group
        resources in cumulative risk assessment. Environ Health Perspect 2007: 115(5):
        817-824.
    Dominici F., Peng R.D., Bell M.L., Pham L., McDermott A., Zeger S.L., and
        Samet  J.M.  Fine particulate  air pollution  and hospital admission  for
        cardiovascular and respiratory diseases. JAMA 2006: 295(10): 1127-1134.
    Erren T.C., Jacobsen M., and Piekarski C. Synergy between asbestos and smoking
        on lung cancer risks. Epidemiology 1999:  10(4): 405-411.
    Fox M.A., Tran N.L., Groopman J.D., and  Burke T.A. Toxicological resources
        for cumulative risk: an example with hazardous air pollutants. Regul Toxicol
        Pharmacol 2004: 40(3): 305-311.
    Guilarte T.R., Toscano C.D.,  McGlothan J.L., and Weaver S.A. Environmental
        enrichment reverses cognitive and molecular deficits induced by developmental
        lead exposure. Ann Neurol 2003: 53(1): 50-56.
    Herman J.P., and Seroogy K. Hypothalamic-pituitary-adrenal axis,
        glucocorticoids, and neurologic disease. Neurol Clin 2006: 24(3): 461-481, vi.
    International Life  Sciences  Institute. A Framework  for  Cumulative  Risk
        Assessment. In: Mileson B., Faustman E., Olin  S., Ryan  P.B.,  Ferenc S.,
        and Burke T. (eds.). Washington, DC, 1999.
    Israel B.A.,  Schulz A.J., Parker E.A.,  and Becker A.B. Review of community-
        based  research: assessing partnership  approaches to improve public health.
        Annu Rev Public Health 1998:  19: 173-202.
    Israel  B.A., Schulz A.J.,   Parker  E.A.,  and Becker  A.B.  Community-
        based  participatory research:  policy  recommendations  for promoting  a
        partnership approach in  health research.  Educ  Health  (Abingdon)  2001:
        14(2):  182-197.
    Israel B.A., Parker E.A., Rowe Z., Salvatore  A., Minkler M., Lopez J., Butz A.,
        Mosley A., Coates  L.,  Lambert G.,  Potito P.A., Brenner  B., Rivera M.,
        Romero H., Thompson B., Coronado G., and Halstead S. Community-based
        participatory  research:  lessons learned  from the Centers  for  Children's
        Environmental Health and  Disease Prevention  Research. Environ Health
        Perspect 2005: 113(10):  1463-1471.
    Kramer M.S., Goulet L., Lydon J., Seguin L., McNamara H.,  Dassa C., Platt
        R.W., Chen M.F., Gauthier H., Genest J., Kahn S., Libman M., Rozen R.,
        Masse A., Miner L., Asselin G., Benjamin A., Klein J., and Koren G. Socio-
        economic disparities in preterm birth:  causal pathways and mechanisms.
        Paediatr Perinat Epidemiol 2001: 15(Suppl 2): 104-123.
    Maccari S., Darnaudery M., Morley-Fletcher  S.,  Zuena A.R., Cinque  C.,
        and Van Reeth O. Prenatal stress and long-term consequences: implications
        of glucocorticoid  hormones.  Neurosci  Biobehav  Rev  2003:   27(1-2):
        119-127.
    Menzie  C.A.,  MacDonell M.M.,  and Mumtaz  M. A phased approach for
        assessing combined effects from multiple stressors. Environ Health Perspect
        2007:  115(5): 807-816.
    Morrison H.I., Villeneuve P.J., Lubin J.H., and Schaubel D.E. Radon-progeny
        exposure and lung cancer risk in a cohort of Newfoundland fluorspar miners.
        Radiat Res 1998: 150(1): 58-65.
    National Environmental Justice Advisory Committee. Ensuring Risk Reduction in
        Communities  with Multiple Stressors:  Environmental Justice  and Cumulative
        Risks/Impacts, Vol.  EPA  300/R-04/903.  U.S.  Environmental  Protection
        Agency, Washington, DC, 2004.
    National Research Council. Risk Assessment in the Federal Government: Managing
        the Process. National Academy Press, Washington, DC, 1983.
    Negro-Vilar A.  Stress and  other  environmental  factors  affecting  fertility
        in men and women: overview. Environ Health Perspect 1993: 101(Suppl 2):
        59-64.
    O'Fallon  L.R.,  and  Dearry  A.  Commitment of  the National  Institute
        of Environmental  Health   Sciences  to  community-based  participatory
        research for  rural health. Environ Health Perspect 2001: 109(Suppl 3):
        469-473.
    O'Fallon L.R., and Dearry A. Community-based participatory research as a tool
        to  advance environmental health sciences. Environ  Health  Perspect 2002:
        110(Suppl 2): 155-159.
    Patychuk D.L. Bridging place-based research  and action for health. Can J Public
        Health 2007:  98(Suppl 1):  S70-S73.
    Payne-Sturges D., and Gee G.C. National  environmental health measures for
        minority and low-income populations: tracking social disparities in environ-
        mental health. Environ Res 2006: 102(2):  154-171.
    Payne-Sturges D.C., Burke T.A., Breysse P., Diener-West M., and Buckley T.J.
        Personal exposure meets  risk  assessment:  a  comparison of measured and

-------
    modeled exposures and risks in an urban community. Environ Health Perspect
    2004a: 112(5): 589-598.
Payne-Sturges D.C., Schwab M., and Buckley T.J. Closing the research loop: a
    risk-based approach  for communicating results  of  air pollution exposure
    studies. Environ Health Perspect 2004b: 112(1): 28-34.
Ryan L. Combining data from multiple sources, with applications to environ-
    mental risk assessment. Stat Med 2008: 27(5): 698-710.
Ryan P.B., Burke T.A., Cohen Hubal E.A., Cura J.J., and McKone T.E. Using
    biomarkers to inform cumulative risk assessment. Environ Health Perspect
    2007: 115(5):  833-840.
Schlenker T. The effects of lead in Milwaukee's water. Wis Med J 1989: 88(10):
    13-15.
Sexton  K., Adgate J.L.,  Fredrickson  A.L.,  Ryan A.D.,  Needham L.L.,  and
    Ashley D.L. Using biologic markers in blood to assess exposure to multiple
    environmental  chemicals for  inner-city children 3-6 years of age. Environ
    Health Perspect 2006: 114(3): 453-459.
Sexton  K., and Hattis D. Assessing cumulative health  risks from exposure to
    environmental  mixtures  -  three fundamental questions.  Environ  Health
    Perspect 2007: 115(5): 825-832.
Strine T.W.,  Ford  E.S.,  Balluz L., Chapman D.P.,  and  Mokdad A.M. Risk
    behaviors and health-related quality of life among adults with asthma: the role
    of mental health status. Chest 2004: 126(6): 1849-1854.
Suglia S.F., Ryan L., Laden F., Dockery D.W., and Wright R.J. Violence
    exposure,  a chronic  psychosocial  stressor,  and  childhood lung function.
    Psychosom Med 2008: 70(2): 160-169.
Tamashiro K.L.,  Nguyen M.M.,  Ostrander M.M.,  Gardner  S.R.,  Ma L.Y.,
    Woods S.C., and Sakai R.R. Social stress and recovery: implications for body
    weight and body composition. Am J Physiol Regul Integr Comp Physiol 2007:
    293(5): R1864-R1874.
Teuschler L.K., Rice G.E., Wilkes C.R., Lipscomb  J.C., and Power F.W. A
    feasibility  study of cumulative risk assessment methods for drinking water
    disinfection by-product mixtures. J Toxicol Environ Health A 2004: 67(8-10):
    755-777.
Thompson B., Coronado G.D., Grossman J.E., Puschel K., Solomon C.C., Islas
    I., Curl C.L., Shirai J.H., Kissel J.C., and Fenske R.A. Pesticide take-home
    pathway among children of agricultural workers: study design, methods, and
    baseline findings. J Occup Environ Med 2003: 45(1): 42-53.
Thompson B., Coronado G.D., Vigoren E.M.,  Griffith W.C.,  Fenske R.A.,
    Kissel  J.C., Shirai J.H.,  and  Faustman E.M.  Para ninos  saludables:  a
    community intervention trial to reduce organophosphate pesticide exposure in
    children of farmworkers. Environ Health Perspect 2008: 116(5): 687-694.
Tong S., McMichael A.J., and Baghurst P.A. Interactions between environmental
    lead exposure and sociodemographic factors on cognitive development. Arch
    Environ Health 2000:  55(5): 330-335.
U.S. Congress. Food Quality Protection Act of 1996. Public Law 104-170, United
    States of America; 1996.
    U.S. Environmental Protection Agency. Concepts, Methods, and Data Sources for
        Cumulative Health Risk Assessment  of Multiple  Chemicals,  Exposures and
        Effects:  A Resource Document. EPA Office of Research and Development,
        Washington, DC, 2007a. EPA/600/R-06/013F.
    U.S. Environmental Protection Agency. EPA's Framework for Community-Based
        Environmental Protection. EPA Office of Policy and Office of Reinvention,
        Washington, DC, 1999. EPA 237/K-99/001.
    U.S. Environmental Protection Agency. EPA's Framework for Cumulative Risk
        Assessment.  EPA  Office of Research and Development,  Washington, DC,
        2003. EPA 630/P-02/001F.
    U.S. Environmental Protection Agency. Organophosphorus Cumulative Risk
        Assessment—2006 Update.  EPA Office of Pesticide Programs, Washington,
        DC, 2006a. EPA HQ-OPP-2006-0618.
    U.S. Environmental Protection Agency. Human Health Research Program Multi-
        Year Plan  (FY 2006-2013).  EPA  Office of Research and  Development,
        Washington, DC, 2006b.
    U.S. Environmental Protection Agency. Proceedings of the U.S. EPA
        Workshop  on  Research  Needs for Community-Based Risk  Assessment.
        EPA Office of Research and  Development, Research Triangle Park, NC,
        2007b.  Available  at:   http://es.epa.gov/ncer/cbra/presentations/ll_18_07/
        1 l_18_07_workshop.html.
    U.S. Environmental Protection  Agency and  Oak  Ridge  Center  for Advanced
        Studies.  Nanotechnology Applications in Environmental Health: Big Plans for
        Little Particles. EPA Office of Research and Development, Research Triangle
        Park, NC, 2006c. Available  at: http://www.epa.gov/ncct/communications.
        html.
    Vigoren E.M., Griffith W.C., Krogstad F.T.O., Coronado G.D., Thompson B.,
        and  Faustman  E. Formal  Uncertainty Analysis in  the Interpretation of
        Organophosphate  Pesticide Metabolite Concentrations. In: Society for Risk
        Analysis Annual Meeting; Dec 9-12, 2007; San Antonio, TX, 2007.
    Weaver I.C., Cervoni  N., Champagne F.A., D'Alessio A.C., Sharma S., Seckl
        J.R., Dymov S.,  Szyf  M.,  and Meaney M.J.  Epigenetic programming by
        maternal behavior.  Nat  Neurosci 2004: 7(8): 847-854.
    Weiss  B., and  Bellinger D.C.  Social  ecology  of children's vulnerability to
        environmental pollutants. Environ Health Perspect 2006: 114(10): 1479-1485.
    White  L.D., Cory-Slechta  D.A.,  Gilbert M.E., Tiffany-Castiglioni E.,  Zawia
        N.H., Virgolini  M., Rossi-George A.,  Lasley S.M., Qian Y.C., and Basha
        M.R. New and evolving concepts in the neurotoxicology of lead. Toxicol Appl
        Pharmacol 2007: 225(1): 1-27.
    Whitmore R., and Kizakevich P. Progress Report: Data Collection Platforms for
        Integrated Longitudinal Surveys of Human Exposure-Related Behavior. EPA
        Grant Number R831541; 2004.
    Wright R.J. Stress and atopic disorders. J Allergy Clin Immunol 2005:  116(6):
        1301-1306.
    Yeboah D.A. A framework for place based health planning.  Aust Health Rev 2005:
        29(1): 30-36.

-------
Journal of Exposure Science and Environmental Epidemiology (2008), 1-6
© 2008 Nature Publishing Group All rights reserved 1559-0631/08/$30.00
www.nature.com/jes


Review

Exposure science and the U.S. EPA National Center for Computational Toxicology


ELAINE A. COHEN HUBALa, ANN M. RICHARDa, IMRAN SHAHa, JANE GALLAGHERb, ROBERT KAVLOCKa,
JERRY BLANCATOa AND STEPHEN W. EDWARDSb

aNational Center for Computational Toxicology, US Environmental Protection Agency, Research Triangle Park, North Carolina, USA
bNational Health and Environmental Effects Laboratory, US Environmental Protection Agency, Research Triangle Park, North Carolina, USA


The emerging field of computational toxicology applies mathematical and computer models and molecular biological and chemical approaches to explore
both qualitative and quantitative relationships between sources of environmental pollutant exposure and adverse health outcomes. The integration of
modern computing with molecular biology and chemistry will allow scientists to better prioritize data, inform decision makers on chemical risk
assessments and understand a chemical's progression from the environment to the target tissue within an organism and ultimately to the key steps that
trigger an adverse health effect. In this paper, several of the major research activities being sponsored by Environmental Protection Agency's National
Center for Computational Toxicology are highlighted. Potential links between research in computational toxicology and human exposure science are
identified. As with the traditional approaches for toxicity testing and hazard assessment, exposure science is required to inform design and interpretation
of high-throughput assays. In addition, common themes inherent throughout National Center for Computational Toxicology research activities are
highlighted for emphasis as exposure science advances into the 21st century.
Journal of Exposure Science and Environmental Epidemiology advance online publication, 5 November 2008; doi:10.1038/jes.2008.70

Keywords: exposure modeling, toxicology, bioinformatics, toxicogenomics.
Introduction

Computational  toxicology  is  a  new  and  high-priority
research area in US Environmental Protection Agency (US
EPA, 2003; Kavlock et al., 2007). Defined as the application
of mathematical and  computer  models to predict adverse
effects and to better understand the mechanism(s) through
which  a given  chemical  induces  harm,  computational
toxicology provides approaches to explore both qualitative
and  quantitative relationships  between sources of environ-
mental pollutant exposure and adverse health outcomes. This
integration of modern computing with molecular biology and
chemistry will allow scientists to better prioritize data, inform
decision makers on chemical risk assessments and understand
a chemical's  progression from the environment to the target
tissue within  an organism and ultimately to the key steps that
trigger an adverse health effect.

Address all correspondence to: Dr. Elaine A. Cohen Hubal, National
Center for Computational Toxicology, US Environmental Protection
Agency, Mail Drop B205-01, Research Triangle Park, NC 27711, USA.
Tel.: +1 919 541 4077, Fax: +919 685 3334.
E-mail: hubal.elaine@epa.gov
Received 18 September 2008; accepted 23 September 2008

  In February 2005, US EPA established the National
Center for Computational Toxicology (NCCT) to conduct
    and sponsor research in this area. The overall goal of ORD's
    research program  on Computational Toxicology  is to use
    emerging technologies to improve  quantitative risk assess-
    ment  and  reduce  uncertainties  in  the  source-to-adverse
    outcome continuum by ultimately providing a systems-level
    understanding of biological processes and their perturbation.
    The importance and relevance of this mission  and NCCT-
    initiated research has received strong support with the recent
    release of the National Academy of Sciences report calling
    for  a  transformative shift  in toxicity  testing  and  risk
    assessment (NRC, 2007). The report, Toxicity Testing in the 21st
    Century: A Vision and a Strategy, calls for a collaborative
    effort across the toxicology community to rely less on animal
    studies and  more  on in  vitro tests using human  cells and
    cellular components to identify chemicals with  toxic effects.
    A framework for implementing this long-range  vision  is
    provided by the recently formalized  collaboration between
    two NIH institutes (NIEHS and NHGRI) and the EPA to
    use  high-speed,  automated screening methods  to  efficiently
    test  compounds for potential toxicity (Collins et al.,  2008).
      These  high  visibility   efforts  in  toxicity  testing  and
    computational toxicology raise important research  questions
    and  opportunities for exposure scientists.  The  National
    Academies report authors (NRC, 2007) emphasize that
    population-based data and human exposure information are

-------
     Cohen Hubal et al.
                                                                                                    Exposure and complex
required at each step of their vision for toxicity testing, and
that these data will continue to play a critical role in
guiding both the development and the use of the toxicity information.
  The  NCCT  Computational  Toxicology  program has
identified  the need  to  include  exposure information  for
chemical prioritization, modeling system response to chemi-
cal exposures across multiple levels of biological organization
and  linking information on  potential  toxicity  of environ-
mental  contaminants to real-world health outcomes  (e.g.,
complex disease).  As a  starting  point,  several  common
themes  have emerged among the NCCT research projects.
These themes are of particular interest to exposure scientists as
we consider how to best incorporate the tools of computa-
tional toxicology into exposure research  as well as how  to
best contribute to research in computational toxicology. The
research conducted in the NCCT is designed  to address the
need for:  (1)  characterization of the target  system across
levels of biological organization; (2) improved linkages across
the source-to-outcome continuum; and (3) a shift from linear
source-to-dose paradigm to  a systems-based approach.  In
addition, the complexity of the systems under study and the
multidimensional nature of data produced using emerging
technologies require extensive collaboration and advanced
environmental informatics capabilities.  In this paper, with
these common  themes  in mind:  potential  links between
research conducted in the  US EPA's National  Center for
Computational Toxicology and human exposure science are
discussed; the need for exposure  science to address chemical
screening, prioritizing and toxicity testing in the 21st century
is  identified;  and  priority  research  areas  for  exposure
scientists are proposed.
NCCT research activities

ToxCast: Prioritizing the Toxicity Testing of Environmental
Chemicals
Globally there is a  need to characterize potential risk  to
human  health  and the environment that  arises from the
manufacture and use of tens of thousands  of chemicals.  In
2007, US EPA's NCCT launched ToxCast™ to develop a
cost-effective  in vitro approach for prioritizing the toxicity
testing of large numbers of chemicals in a short period of time
(Dix et al., 2007).  Using  data from state-of-the-art high-
throughput screening  (HTS) bioassays developed  in the
pharmaceutical industry, ToxCast™  is  building  computa-
tional models to forecast the potential human toxicity  of
chemicals.  The  premise  underlying  ToxCast™   is  that
toxicological  response  is driven by interactions  between
chemicals and biomolecular targets. For  most environmental
chemicals the protein targets and biological effects underlying
potential  adverse effects  have yet to be identified   or
characterized.  The strategy of ToxCast™ is  to focus on a
diverse range of assays and data types to identify  potential
   targets. The ToxCast™ program will apply a multiple target
   matrix approach to address this goal. The matrix contains an
   expanded  number  of  potential  targets  whose  chemical
   interactions may be characterized by in silico models,
   biochemical assays, cell-based in vitro assays (based on both
   human and animal tissues) and nonmammalian models. The
   resulting data span levels of biological organization: mole-
   cular,  cellular,  tissue and  whole organism.  The  overall
   pattern across many assays and data types will be used to
   develop a fingerprint or bioactivity profile that can be used as
   a predictor of toxicity. These hazard predictions will provide
   EPA regulatory programs with  science-based information
   helpful in prioritizing chemicals for more detailed toxicolo-
   gical evaluations and lead to more efficient use  of animal
   testing. The  resulting data will also provide  insights into
   modes of action of chemical toxicity in an unprecedented and
   unbiased  manner. This,  in  turn,  has  implications for
   identifying potentially susceptible populations, both from a
   life-stage viewpoint and from a genetic (polymorphic)
   standpoint, as toxicity pathways intersect with disease
   pathways.
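
The fingerprint idea described above can be illustrated with a minimal sketch. It assumes that each assay result has already been reduced to a binary hit call; the chemical names, assay names, values, and the simple "fraction of assays active" ranking are hypothetical and are not the ToxCast™ method itself.

```python
# Hypothetical HTS hit calls: chemical -> {assay: 1 if active, 0 if inactive}.
# All names and values are invented for illustration only.
ASSAY_HITS = {
    "chem_A": {"receptor_binding": 1, "reporter_gene": 1,
               "cytotoxicity": 0, "zebrafish_dev": 1},
    "chem_B": {"receptor_binding": 0, "reporter_gene": 0,
               "cytotoxicity": 0, "zebrafish_dev": 0},
    "chem_C": {"receptor_binding": 0, "reporter_gene": 1,
               "cytotoxicity": 1, "zebrafish_dev": 0},
}


def bioactivity_profile(hits, assay_order):
    """Return the hit calls as a fixed-order tuple (the 'fingerprint')."""
    return tuple(hits[assay] for assay in assay_order)


def rank_by_activity(assay_hits):
    """Build one profile per chemical and rank by fraction of active assays."""
    # Use one shared, sorted assay order so profiles are comparable.
    assays = sorted(next(iter(assay_hits.values())))
    profiles = {chem: bioactivity_profile(hits, assays)
                for chem, hits in assay_hits.items()}
    # Most active first; ties are broken alphabetically by chemical name.
    ranked = sorted(profiles,
                    key=lambda chem: (-sum(profiles[chem]) / len(assays), chem))
    return assays, profiles, ranked


if __name__ == "__main__":
    assays, profiles, ranked = rank_by_activity(ASSAY_HITS)
    print(assays)
    for chem in ranked:
        print(chem, profiles[chem])
```

A real profile would span hundreds of end points and use concentration-response data rather than binary calls; the sketch only shows how results from different assay types can be lined up into one comparable vector per chemical.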
      The  ToxCast™  program is  being implemented  using  a
   tiered multiphase approach (see www.epa.gov/ncct/toxcast).
   In phase I, over 300 well-characterized chemicals have been
   profiled in over 400 HTS end  points. These end points
   include biochemical assays  of protein  function,  cell-based
   transcriptional reporter  assays, multi-cell interaction assays,
   transcriptomics  on primary cell cultures and developmental
   assays in zebrafish embryos. Almost all of the phase I
   compounds have been tested in traditional toxicology tests,
   including developmental toxicity, multigeneration studies and
   subchronic and chronic rodent bioassays. Phase I ToxCast™
   signatures will  be defined and evaluated by the ability to
   predict outcomes  from  existing mammalian toxicity testing
   and identify toxicity pathways that are relevant  to  human
   health effects.
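
One way to picture how a signature might be evaluated against existing mammalian toxicity results is a simple confusion-matrix summary. The sketch below is an assumption for illustration: the chemicals, the predicted and observed calls, and the choice of sensitivity, specificity, and balanced accuracy as metrics are hypothetical and are not taken from the ToxCast™ program.

```python
def evaluate_signature(predicted, observed):
    """Compare signature predictions with observed in vivo outcomes.

    Both arguments map chemical name -> bool (True means toxic).
    Returns sensitivity, specificity, and balanced accuracy.
    """
    tp = fp = tn = fn = 0
    for chem, is_toxic in observed.items():
        called_toxic = predicted[chem]
        if called_toxic and is_toxic:
            tp += 1
        elif called_toxic and not is_toxic:
            fp += 1
        elif not called_toxic and is_toxic:
            fn += 1
        else:
            tn += 1
    # Guard against empty classes so the function never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    balanced_accuracy = (sensitivity + specificity) / 2
    return sensitivity, specificity, balanced_accuracy


# Hypothetical calls for five chemicals (invented for illustration).
OBSERVED = {"A": True, "B": True, "C": False, "D": False, "E": True}
PREDICTED = {"A": True, "B": False, "C": False, "D": True, "E": True}

if __name__ == "__main__":
    sens, spec, bal = evaluate_signature(PREDICTED, OBSERVED)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f} balanced={bal:.2f}")
```

Balanced accuracy is used here only because reference sets of toxic and nontoxic chemicals are often unequal in size; any other agreement metric could be substituted.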
      ToxCast™ phase II, scheduled to launch in FY09, will bring
   the total number of chemicals screened to nearly 1000. These
   additional compounds  will  represent  broader  chemical
   structure  and use classes and  some pharmaceutical agents
   with known adverse side effects, to evaluate the predictive
   bioactivity signatures developed in phase I. As a result of the
   memorandum  of understanding   (MOU)  with  National
   Toxicology Program/NIEHS and NIH Chemical Genomics
   Center/NHGRI, additional  chemical screening capability  is
   being made accessible and it is now projected that more than
   5000 chemicals will be entering the high-throughput screen-
   ing  program of the NCGC  within the  next year.  It  is
   anticipated that successful conclusion of ToxCast™ phases I
   and II will provide EPA regulatory programs with a tool for
   rapidly and efficiently screening compounds and prioritizing
   further toxicity testing.
[Figure 1 labels: Stressor; Environmental Source; Ambient Exposure; Personal
Exposure; Internal Exposure (Tissue Dose); Dose to Cell; Dose of Stressor
Molecules; Perturbation; Population; Individual; Tissue; Cell; Biological
Molecules; Disease Incidence/Prevalence; Disease State (Changes to Health
Status); Dynamic Tissue Changes (Tissue Injury); Dynamic Cell Changes
(Alteration in Cell Division, Cell Death); Dynamic Changes in Intracellular
Processes]
Figure 1. Cascade of exposure-response processes for integrating exposure science and toxicogenomic mode-of-action information.

  As computational analyses of ToxCast™ phase I data begin,
the need to consider exposure potential for selecting phase II
chemicals as well as for providing real-world relevance for
interpretation of toxicity screening has been identified by
NCCT. As ToxCast™ and related research activities provide
information on key events required to incorporate mode-of-
action information along the continuum, similar key exposure
metrics at comparable resolution will need to be identified
(Figure 1). This would build upon the great strides made in the
last 20 years on PBPK modeling and expand on that success
as emerging technologies blur the boundaries between
exposure and effects sciences. The ultimate goal would be an
integrated program in which biomarkers of exposure and
bioindicators of effects are jointly determined and can be used
to enhance biologically based dose-response models by
providing measured parameters linking relevant exposures to
the probability of an adverse outcome (NRC, 2007).

Distributed Structure-Searchable Toxicity Database
Network: Informatics for Environmental Health Risk
Assessment
Specific activities  at the NCCT  include research to define
chemical properties that can be used as indicators of potential
toxicity for use in prioritization of toxicity testing as well as to
construct computational models of chemical interactions with
biological  systems for human  health risk  assessment.  This
research requires  creation of flexible databases covering  a
broad range  of chemical space so that the wide range  of
multidimensional data spanning levels of biological organiza-
tions  across  the  source-to-outcome  continuum   can be
accessed, combined and interpreted using novel approaches.
   The  Distributed  Structure-Searchable  Toxicity  (DSSTox)
Database Network is creating a chemical data foundation for
improved structure-activity and predictive toxicology capabilities,
and broad linkages to chemical data resources across and
     outside of EPA (Richard et al., 2006). The  DSSTox website
     (US EPA, 2008) publishes downloadable, chemical structure
     files associated with toxicity data in a variety of formats, along
     with documentation and links to source information,  quality
     review  procedures   and  guidance  for  users.  Standardized
     chemical structure annotation of a diverse array of toxicol-
     ogy-related data and  resources,  coupled with  the  online
     DSSTox Structure  Browser, are providing  structure-search-
     ability  and direct  access  to these  data (including  EPA's
     Integrated Risk Information System, Fathead Minnow Acute
     Toxicity database and High-Production Volume Chemical lists,
     the National Toxicology Program (NTP) Bioassay
     database, as well  as estrogen-receptor binding  data,  rodent
     carcinogenicity  data  and most recently, gene expression data).
     The DSSTox  project is  also  providing  primary  structure-
     annotation and cheminformatics support to both the NTP HTS
     and EPA  ToxCast™ programs in conjunction with NCCT's
     Aggregated Computational Toxicity Resource (ACToR) pro-
     ject, slated for public release in late 2008 (Richard et al., 2008).
     The latter is  providing a  relational  database  platform  for
     surveying vast  Internet data resources pertaining to environ-
     mental toxicology  (hundreds  of thousands of  chemicals),
     including high- and medium-production  chemical  lists and
     exposure-related data (Judson et al., in press). ACToR will also
     serve  as the primary storage and  analysis resource for  the
     ToxCast HTS data, linking these data to standardized historical
     toxicological test results and broader chemical resources.
        Similarly, it  is imperative that exposure data be accessible
     and linked to the  rapidly  growing   base of toxicity data.
     Development of consolidated data and knowledge bases for
     exposure is a high priority. Existing tools and platforms that
are currently being implemented with environmental toxicity
information should be considered to provide the most useful
links  to existing  toxicity  and environmental health data.
Relevance and value  of exposure information for toxicology
and risk assessment will increase dramatically if links to these
data are immediately apparent to an investigator searching the
universe of toxicity and health data for a given compound.
Chemical structure-annotation of exposure-related data, such
as could be provided  by DSSTox,  and incorporation of such
data into the  new ACToR resource,  will greatly enhance
linkages between  these exposure  data  and toxicity-related
human health end points. In a preliminary demonstration, a
DSSTox file of 60 chemical structures was created to index
chemical-related content within the EPA Children's Total
Exposure to Persistent Pesticides and Other Persistent Organic
Pollutants (CTEPP) database (US EPA, 2006). Ideally,
text tables within CTEPP PDF documents would be converted,
tagged and indexed in web-accessible files such that
chemical structure-searches  on the  Internet  could  locate
relevant exposure  content. These sorts  of linkages  have the
potential to bring the toxicology and exposure science research
communities into closer alignment and foster more productive
interaction.
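
The kind of linkage described above can be sketched in a few lines of Python. This is
an illustration only: the record layout, the `structure_id` key, the identifiers and the
helper names `index_by_structure` and `linked_structures` are hypothetical and are not
taken from DSSTox, ACToR or CTEPP. The sketch shows how a shared chemical-structure
identifier could let exposure and toxicity records be retrieved together.

```python
from collections import defaultdict

# Hypothetical records: identifiers and contents are illustrative, not real data.
toxicity_records = [
    {"structure_id": "STRUCT-001", "source": "toxicity_db", "endpoint": "rodent carcinogenicity"},
    {"structure_id": "STRUCT-002", "source": "toxicity_db", "endpoint": "estrogen-receptor binding"},
]
exposure_records = [
    {"structure_id": "STRUCT-001", "source": "exposure_db", "medium": "indoor air"},
    {"structure_id": "STRUCT-003", "source": "exposure_db", "medium": "house dust"},
]


def index_by_structure(*record_sets):
    """Group records from any number of data sets under their structure identifier."""
    index = defaultdict(list)
    for records in record_sets:
        for record in records:
            index[record["structure_id"]].append(record)
    return dict(index)


def linked_structures(index):
    """Return the identifiers that carry both toxicity and exposure records."""
    required = {"toxicity_db", "exposure_db"}
    return sorted(
        structure_id
        for structure_id, records in index.items()
        if required <= {record["source"] for record in records}
    )


index = index_by_structure(toxicity_records, exposure_records)
print(linked_structures(index))
```

In this toy data set only one structure appears in both sources, so a search on that
structure would surface its exposure content alongside its toxicity content.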

v-Liver™: Characterizing Toxicity Pathways and
Extrapolating Dose-Response
ToxCast™ has generated an unprecedented amount of rodent
data for discovering in vitro biomarkers of adverse outcomes in
vivo, which will be vital for prioritizing  chemicals for further
testing.  Rodent liver  toxicity is currently the most frequent
cause for  the  regulation of orally consumed environmental
chemicals. The Virtual Liver Project (v-Liver™) will utilize
ToxCast™  and other public and agency  data to  aid  in
extrapolating  in  vitro assays  to  clinical outcomes   across
chemicals, doses, genders, life stages and populations  (Kav-
lock et  al., 2007).  Virtual Tissues  offer a novel translational
paradigm  for  predicting  target  organ toxicity by  fusing
molecular and cellular systems modeling for physiologically
relevant simulation (Knudsen and  Kavlock, 2008). The goal
of v-Liver™ is to  quantitatively simulate liver injury  due to
chronic chemical  exposure  by  modeling the linkage  of
perturbed molecular  pathways with  adaptive or  adverse
processes leading to changes of cell state, and the integration
of this response through a dynamic cellular network giving rise
to macroscopic tissue alterations. Histopathology is  currently
the clinical gold standard for estimating adverse liver outcomes.
In the long term, the Virtual Liver's ability to
quantitatively predict tissue lesions from molecular and cellular
network dynamics will help in accurately assessing human
risks from exposure to environmental stressors.
   The first phase  of v-Liver™ is a proof of concept focused
on a subset of ToxCast™ chemicals and apical toxicity end
points.  These  initially include  a  subset  of pathways from
nuclear  receptor activation to proliferative sublobular lesions in
    rodents through a combination of cellular mitogenic, mutagenic
    and regenerative proliferation processes. Currently, quantitative
    molecular, cellular and tissue-level data are being gathered
    on chemicals with known toxicological
    profiles to cross-validate the in silico modeling approach. In
    addition,  qualitative and quantitative information on normal
    and pathologic processes across levels of biological organiza-
    tion is being curated in a knowledge base for virtual tissue
    construction. Finally,  the  multiscale  molecular, cellular and
    tissue responses are simulated via an agent-based modeling
    (ABM) approach. Here, the liver tissue is being conceptualized
    as an ecosystem of heterogeneous cells. The ABM approach
    attempts to faithfully model the microanatomic heterogeneity
    of the complex  hepatic acinus as  a network of parenchymal
    and nonparenchymal "agents" in a  nutrient and xenobiotic
    gradient. Agents model the dynamic behavior of liver cells to
    their microenvironment by processing endogenous and xeno-
    biotic  inputs through molecular circuits. ABM  approaches
    encapsulate molecular, cellular and tissue complexity, effectively
    enabling in silico simulation of functional liver unit(s) across
    species, chemicals, doses  and times.  The modularity  of  the
    approach also simplifies integration with physiologic models at
    the organism scale.
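
A minimal agent-based sketch may help make the idea concrete. It is not the v-Liver™
model: the one-dimensional row of cells, the linear gradients, the damage rule, the
thresholds and all names (`CellAgent`, `gradient`, `simulate`) are assumptions chosen
for illustration. Each agent reads its local xenobiotic and nutrient levels, updates an
internal damage value, and changes state accordingly.

```python
from dataclasses import dataclass

# Illustrative thresholds (assumptions, not calibrated values).
STRESS_THRESHOLD = 1.0
DEATH_THRESHOLD = 3.0


@dataclass
class CellAgent:
    """One cell in a hypothetical row running from blood inlet to outlet."""
    position: int
    state: str = "normal"  # "normal", "stressed" or "dead"
    damage: float = 0.0

    def step(self, xenobiotic: float, nutrient: float) -> None:
        """Toy 'molecular circuit': xenobiotic adds damage, nutrient supports repair."""
        if self.state == "dead":
            return
        self.damage = max(0.0, self.damage + xenobiotic - 0.5 * nutrient)
        if self.damage >= DEATH_THRESHOLD:
            self.state = "dead"
        elif self.damage >= STRESS_THRESHOLD:
            self.state = "stressed"
        else:
            self.state = "normal"


def gradient(n_cells: int, inlet: float, outlet: float) -> list:
    """Linear concentration gradient across the row of cells."""
    if n_cells == 1:
        return [inlet]
    slope = (outlet - inlet) / (n_cells - 1)
    return [inlet + i * slope for i in range(n_cells)]


def simulate(n_cells: int = 5, n_steps: int = 10, dose: float = 1.0) -> list:
    """Run the agents for n_steps and return the final state of each cell."""
    cells = [CellAgent(position=i) for i in range(n_cells)]
    xenobiotic = gradient(n_cells, dose, 0.2 * dose)
    nutrient = gradient(n_cells, 1.0, 0.4)
    for _ in range(n_steps):
        for cell, x, n in zip(cells, xenobiotic, nutrient):
            cell.step(x, n)
    return [cell.state for cell in cells]


print(simulate())
print(simulate(dose=0.0))
```

Because each agent only sees its local inputs, the tissue-level pattern of injury emerges
from the gradient rather than being specified directly, which is the property the ABM
approach relies on.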
     The  liver's response to environmental  chemicals spans
    multiple levels of organization — from molecular interactions
    to  alterations   in tissue  structure.   Novel  computational
    approaches are  required  to  ensure that information  on
    biological  effects  is developed at environmentally  relevant
    exposures. Integrating exposure, organism-level ADME and
    Virtual Tissues  is vital for  assessing the risk of  adverse
    outcomes in human populations. Significant work in exposure
    modeling  has  focused  on  application  of  mass-balance
    approaches to model chemical fate and transport from source
    to  individual to  internal  dose.  This  research  should  be
    continued  with   specific  emphasis  on  developing  inputs
    required  to  predict  response  of  biological   systems  to
    environmental perturbations such as those required for
    v-Liver™. In addition, there continue to be significant
    challenges associated with modeling and predicting individual
    and population interaction with the environment. Similar to
    tissue-level biological  systems, imbedded complexity (e.g.,
    feedback,  multiple  scales,  multiple  stressors, etc.)   of  the
    higher-level  systems  requires  consideration  of a  novel
    approach. Conceptualizing a  population as an ecosystem of
    heterogeneous individuals and the individual as an ecosystem
    of heterogeneous  behavior will facilitate holistic modeling of
    human-environment interaction.  Use cases  for data  rich
    compounds  should be developed  to  test utility  of  this
    approach for identifying key exposure events that can support
    tissue-level predictions. In the long term, the source-to-outcome
    modeling paradigm may become more integrative
    by  capitalizing  on emerging multiscale systems modeling
    approaches such  as those being  applied  in  the v-Liver™
    project.
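
As a simple illustration of the mass-balance approach mentioned above, the sketch below
tracks internal concentration in a single well-mixed compartment. The one-compartment
structure, first-order elimination, the parameter values and the function names are
assumptions made for this example; they do not come from any specific EPA exposure model.

```python
def simulate_internal_concentration(intake_mg_per_h, volume_l, elimination_per_h,
                                    hours, dt_h=0.01):
    """Euler integration of dC/dt = intake/V - k*C, starting from C = 0 (mg/L)."""
    concentration = 0.0
    n_steps = int(round(hours / dt_h))
    for _ in range(n_steps):
        rate_of_change = intake_mg_per_h / volume_l - elimination_per_h * concentration
        concentration += rate_of_change * dt_h
    return concentration


def steady_state_concentration(intake_mg_per_h, volume_l, elimination_per_h):
    """Analytical steady state, where intake balances elimination: C = intake/(k*V)."""
    return intake_mg_per_h / (elimination_per_h * volume_l)


# Hypothetical parameter values, for illustration only.
early = simulate_internal_concentration(1.0, 10.0, 0.5, hours=1.0)
late = simulate_internal_concentration(1.0, 10.0, 0.5, hours=40.0)
target = steady_state_concentration(1.0, 10.0, 0.5)
print(early, late, target)
```

The simulated concentration rises toward the analytical steady state, which is the
quantity a tissue-level model would take as its sustained input under constant exposure.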

Mechanistic Indicators of Childhood Asthma Study:
Understanding Environmental Factors of Complex Disease
Ultimately, the systems of primary interest to human-health
risk assessors are those at the individual and population level.
Emerging tools in molecular biology provide the potential to
develop cellular and molecular indicators of exposure that
can be  used to  assess  the  vulnerability  of humans  to
environmental  stressors.  The  Mechanistic  Indicators  of
Childhood  Asthma (MICA)  Study has been designed  to
incorporate  state-of-the-art  technologies to  examine  the
physiological  and  environmental factors that  interact  to
increase the risk of asthmatic responses
(http://www.epa.gov/dears/studies.htm). Collected markers of susceptibility,
exposure   and  effects  are  being  used  to  analyze and
characterize  combined risk factors  that relate  to asthma
severity in a cohort of children. The MICA study provides an
opportunity  to  advance  a  system-based  approach  for
evaluating   complex   relationships   among   environmental
factors, physiological biomarkers  and health outcomes. This
study  also  provides a platform  for applying and testing
computational  approaches to  evaluate multifactorial-multi-
dimensional  data  that are  becoming standard  output  of
environmental health  and ecogenetic studies (NRC,  2008)
and to use these data for hypothesis  development.
   The MICA study is primarily  a clinically based observa-
tional  children's study. Multiple  measures of health status,
asthma severity, environmental exposure and gene expression
have been  collected in  a case/control cohort of 200 children
(aged  9-12 years).   Environmental  samples  have  been
collected with a focus  on  three broad classes of particulate-
associated chemicals: volatile organic compounds, metals and
polycyclic aromatic hydrocarbons. In the NCCT component
of the MICA  study, advanced statistical and machine
learning  methods  are  being applied in combination with
mechanistic  information  to  evaluate  multiple  types  of
biomarker data collected  in MICA (similar approaches for
combining data presented by Reif et al., in press). Methods
and tools are being applied to evaluate and visualize gene
expression data in novel ways. These approaches are being
used to characterize the relationship between rat and human
response,  to  characterize the relationship  between  gene
expression and biological response and to evaluate the utility
of gene expression data collected in a human cohort study for
understanding relationships between  exposure, susceptibility
and early  effects (Reif et al., in  preparation; Heidenfelder
et al., submitted).
   The MICA study provides a case example of how
exposure information and computational toxicology have the
potential to provide a mechanistic interpretation of biomo-
nitoring data, whether these data are "classical" concentra-
tion measurements  or  "toxicogenomic" markers in relation
to exposure patterns, routes and pathways. As methods for
assessing health risks resulting from exposures to individual
environmental pollutants improve, environmental health
scientists are turning attention toward characterizing relationships
between multiple environmental factors and complex
    disease.  The  MICA  study  is  an example of  this shift.
    Computational  tools and  approaches for efficiently char-
    acterizing exposure potential of environmental compounds
    are required for screening and prioritizing as well as for
    environmental health studies. One possibility is to formulate
    an exposure classification index based on a limited set of
    metrics designed to efficiently cover exposure space. Applica-
    tion of environmental informatic approaches may help to
    identify the critical metrics for representing personal exposure
    over time, place, life stage and lifestyle or behavior. Such an
    approach could also inform exposure data collection at the
    personal, residential, community and ambient level.  The
    MICA study also  serves as an example of the type of study
    outlined in Figure  1  where biomarkers  of exposure and
    bioindicators of effects  are jointly determined and linked
    computationally to support modeling  for risk assessment.
   Exposure science for computational toxicology

   Clearly the new field of computational toxicology provides
   significant opportunities for exposure scientists.  The chal-
   lenge is to move forward and consider new approaches for
   measuring, modeling and assessing exposure to address 21st
   century research needs for environmental health risk assess-
   ment.  Consideration of analogies in hazard assessment may
   help to inform our path forward.
      The NRC Vision (2007) of a shift to characterizing toxicity
   pathways requires a commensurate  shift to characterizing
   exposure across all levels of biological organization (Figure 1).
   Interpretation of toxicogenomic hazard data  requires con-
   textual relevance. Pathways identified using HTS approaches
   such as those being  developed in the ToxCast  program are
   being  anchored to  apical  end  points using  conventional
   toxicity data. Similarly, understanding relevant perturbations
   leading to these toxicogenomic end points requires anchoring
   stressors to real-world human exposure (e.g., biomonitoring
   data and other conventional exposure metrics). As illustrated
   in the examples below, new approaches to  risk assessment
   require exposure  science  to  extend  beyond traditional
   boundaries  and predict exposures down  to the molecular
   level. This requires consideration of the interactions between
   exposure and effect and highlights the need for interdisciplin-
   ary teams to define these interactions.
      Suter (1999) notes that, conventionally, risk assessment
   considered the process by which a release of a contaminant
   results in exposure of a target or receptor. Induction of effects
   is assumed to occur after the exposure process, facilitating
   separate analysis of exposure and hazard. However, as risk
   assessment  has moved to  address  risks  resulting from
   exposures to multiple stressors, this assumption is no longer
   appropriate. Effects  at one organizational level affect others
resulting  in  complex health  outcomes.   So,  rather  than
considering  flow  of  contaminants  along  the  source-to-
outcome continuum, there is a need to  characterize cascades
of  alternating processes  and states in  an  overall network
(Suter, 1999).
  As  toxicity testing relies more on evaluating  the mode  of
action  for  compounds,  systems  approaches  describing the
molecular basis  of disease (Loscalzo et  al., 2007) are being
considered for risk assessment purposes (Edwards and Preston,
2008). With this approach, molecular networks for disease can
be generated (Schadt and Lum, 2006) and used to derive key
event networks for use in mode  of action determination.  The
resulting  networks describe  the  overall connectivity  of the
system along with the perturbations of that system resulting in
certain disease states. Mode of action can now be defined as the
perturbations of this  "normal" state by  a  specific stressor  or
mixture.  Such a holistic systems approach  demands exposure
metrics and models to characterize key  stressors at a level  of
resolution commensurate with that of the response or effects.
An  example  of this  type  of  approach  has previously  been
demonstrated for ecological risk  assessment (Ankley et al.,
submitted;  Ekman et al.,  2007, 2008).  Development  and
application of toxicogenomic molecular indicators of exposure
(e.g.,  Sen  et al., 2007)  and nanotechnology-based  sensors
(Weis et al., 2005) provide the potential to mechanistically link
traditional exposure metrics and  end points measured in HTS
assays. Together, a focus on mode of action and characteriza-
tion of stressors  at all levels of biological organization  enables
the vision for toxicity  testing in the 21st century set forth by the
NRC  (NRC, 2007)  by providing a framework in  which  to
interpret  toxicity pathway perturbations.
  The NCCT research program and the NRC vision present
tremendous   challenges   and  opportunities  for  exposure
science. In  May 2008, US  EPA established  a Community
of Practice in Exposure Science for Toxicity Testing, Screening
and Prioritization (ExpoCoP) to provide a forum for
promoting the advancement of exposure science to begin to
address some of the challenges alluded to in this paper (US
EPA, 2008). We look forward to broad participation from
the  exposure  science  community   as  we  continue   this
important dialog.

Disclaimer

The  US  Environmental  Protection  Agency,  through  its
Office of Research and Development, funded and managed
the research described here. It has been subjected to the Agency's
administrative review and approved for publication.
References

Ankley,  et al.  Endocrine disrupting chemicals  in  fish: developing exposure
   indicators and predictive models of effects based on mechanism of action.
   submitted.
    Collins F.S., Gray G.M., and Bucher J.R. Transforming environmental health
       protection. Science 2008: 319: 906-907.
    Dix D.J., Houck K.A., Martin M.T., Richard A.M., Setzer R.W., and Kavlock
       R.J. The ToxCast program for prioritizing toxicity testing of environmental
       chemicals. Toxicol Sci 2007: 95(1): 5-12.
    Edwards S.W., and Preston R.J. Systems biology and mode of action based risk
       assessment. Toxicol Sci 2008 doi:10.1093/toxsci/kfn190.
    Ekman D.R., Teng Q., et al. NMR analysis of male fathead minnow urinary
       metabolites: a potential approach for studying impacts of chemical exposures.
       Aquat Toxicol 2007: 85(2): 104-112.
    Ekman D.R., Teng Q., et al. Investigating compensation and recovery of fathead
       minnow (Pimephales promelas)  exposed to 17alpha-ethynylestradiol with
       metabolite profiling. Environ Sci Technol 2008: 42(11):  4188-4194.
    Heidenfelder B.L., Reif D.M., Cohen Hubal E.A., Hudgens E.E., Bramble L.A.,
       Wagner J.G., Harkema J.R., Morishita M., Keeler G.J., Edwards S.W., and
       Gallagher J.E. Comparative microarray analysis and pulmonary morpho-
       metric  changes in brown Norway  rats  exposed to ovalbumin and/or
       concentrated air particulates. Submitted.
    Judson R., Richard A., Dix D., Houck K., Elloumi F., Martin M., Cathey T.,
       Transue T.R., Spencer R., and Wolf M. ACToR: aggregated computational
       toxicology resource. Toxicol Appl Pharmacol, Available online 18 July 2008.
    Kavlock R.J., Ankley G., Blancato J., Breen M., Conolly R., Dix D., Houck K.,
       Hubal E., Judson R.,  Rabinowitz J.,  Richard A., Setzer R.W., Shah I.,
       Villeneuve D., and Weber E. Computational toxicology a state of the science
       mini review. Toxicol Sci 2007: 103(1): 14-27.
    Knudsen T.B., and Kavlock R.J. Comparative Bioinformatics and Computational
       Toxicology, Abbot B., and Hansen D. (Eds.). 3rd edn. Taylor & Francis 2009.
    Loscalzo J., Kohane I., and Barabasi A.L. Human disease classification in the
       postgenomic era:  a complex systems approach to human pathobiology.  Mol
       Syst Biol 2007: 3: 124.
    National Research Council of the National Academies (NRC).  Toxicity Testing in
       the 21st Century: A Vision and A Strategy. The National Academies Press,
       Washington, DC, 2007.
    National Research Council  of the National Academies (NRC). The National
       Children's Study Research Plan: A Review. The National Academies Press,
       Washington, DC, 2008.
    Richard A., Yang C., and Judson R. Toxicity data informatics: supporting a new
       paradigm  for toxicity prediction. Toxicol Mech Meth 2008: 18: 103-118.
    Richard A.M., Gold L.S.,  and Nicklaus M.C. Chemical structure indexing of
       toxicity data on the Internet:  Moving toward a flat world. Curr Opin Drug
       Discov Devel 2006: 9(3):  314-325.
    Reif D.M., et al. Integrating demographic, clinical,  and environmental exposure
       information to identify  genomic biomarkers associated with subtypes of
       childhood asthma. 2008 Joint Annual Conference ISEE/ISEA. Pasadena, CA.
    Reif D.M., Motsinger A.A., McKinney B.A.,  Edwards  K.M.,  Chanock  S.J.,
       Rock M.T., Crowe Jr J.E., and Moore J.H. Integrated analysis of genetic and
       proteomic data identifies biomarkers associated with systemic adverse events
       following  smallpox vaccination. Genes Immun, published online 16 October
       2008. doi:10.1038/gene.2008.80.
    Schadt E.E., and Lum P.Y. Thematic review series: systems biology approaches to
       metabolic and cardiovascular disorders. Reverse engineering gene networks to
       identify key drivers of complex disease phenotypes. J Lipid Res 2006: 47:
       2601-2613.
    Sen B., Mahadevan B., and DeMarini D.M. Transcriptional responses to complex
       mixtures — a review. Mutat Res 2007: 636(2007): 144-177.
    Suter G.W. Developing conceptual models for complex ecological risk assessments.
       Hum Ecol Risk Assess 1999: 5(2): 375-396.
    U.S. EPA. A Framework for a Computational Toxicology Research Program. Office
       of Research and  Development, Washington,  DC, 2003 EPA 600/R-03/065
       http://www.epa.gov/comptox/publications/comptoxframework06_02_04.pdf.
    U.S. EPA. A Pilot Study of Children's Total Exposure to Persistent Pesticides and
       Other Persistent  Organic Pollutants (CTEPP). Volume  I:  Final  Report.
       Contract  Number 68-D-99-011, U.S. Environmental Protection Agency,
       Office of  Research and  Development, Research Triangle Park,  NC, 2006
       Available  online at http://www.epa.gov/heasd/ctepp/ctepp_report.pdf.
    U.S. EPA. EPA Community of Practice: Exposure  Science for Toxicity Testing,
       Screening, and Prioritization 2008: http://www.epa.gov/ncct/practice_community/exposure_science.html.
       Accessed September 16, 2008.
    Weis  B.K., Balshaw  D., Barr J.R., Brown D., Ellisman M.,  Lioy P., et al.
       Personalized exposure assessment: promising approaches for human environ-
       mental health research. Environ Health Perspect  2005: 113(7): 840-848.

-------
                     Reproductive Toxicology, In Press

    Fetal malformations and early embryonic gene expression response in
           cynomolgus monkeys maternally exposed to thalidomide

Makoto Ema a,*, Ryota Ise b, Hirohito Kato c, Satoru Oneda d, Akihiko Hirose a, Mutsuko
Hirata-Koizumi a, Amar V. Singh e, Thomas B. Knudsen f, and Toshio Ihara c

        a Division of Risk Assessment, Biological Safety Research Center, National Institute
          of Health Sciences, Tokyo, Japan
        b Shin Nippon Biomedical Laboratories (SNBL), Ltd., Tokyo, Japan
        c Shin Nippon Biomedical Laboratories (SNBL), Ltd., Kagoshima, Japan
        d SNBL USA, Ltd., Everett, WA, USA
        e Contractor to NCCT, Lockheed-Martin, Research Triangle Park NC, USA 27711
        f National Center for Computational Toxicology (NCCT), U.S. Environmental
          Protection Agency, Research Triangle Park NC, USA 27711

      *   Corresponding author:
          Makoto Ema, DVM, PhD.
          Division of Risk Assessment, Biological Safety Research Center, National Institute
          of Health Sciences, 1-18-1, Kamiyoga, Setagaya-ku, Tokyo 158-8501, Japan
          Tel: +81-3-3700-9878
          Fax: +81-3-3700-1408
          E-mail: ema@nihs.go.jp

-------
ABSTRACT

The present study was performed to determine experimental conditions for thalidomide
induction of fetal malformations and to understand the molecular mechanisms underlying
thalidomide teratogenicity in cynomolgus monkeys. Cynomolgus monkeys were orally
administered thalidomide at 15 or 20 mg/kg/day on days 26-28 of gestation, and fetuses were
examined on days 100-102 of gestation. Limb defects such as micromelia/amelia, paw/foot
hyperflexion, polydactyly, syndactyly, and brachydactyly were observed in seven of eight
fetuses. Cynomolgus monkeys were orally administered thalidomide at 20 mg/kg on day 26 of
gestation, and whole embryos were removed from the dams 6 h after administration. Three
embryos each were obtained from the thalidomide-treated and control groups. Total RNA was
isolated from individual embryos, amplified to biotinylated cRNA and hybridized to a custom
Non-Human Primate (NHP) GeneChip® Array. Altered genes were clustered into genes that
were up-regulated (1,281 genes) and down-regulated (1,081 genes) in thalidomide-exposed
embryos. Functional annotation by Gene Ontology (GO) categories revealed up-regulation of
actin cytoskeletal remodeling and insulin signaling, and down-regulation of pathways for
vasculature development and the inflammatory response. These findings show that thalidomide
exposure perturbs a general program of morphoregulatory processes in the monkey embryo.
Bioinformatics analysis of the embryonic transcriptome following maternal thalidomide
exposure has now identified many key pathways implicated in thalidomide embryopathy, and
has also revealed some novel processes that can help unravel the mechanism of this important
developmental phenotype.

Key words: Thalidomide; Teratogenicity; Fetal malformation; Gene expression profile;
Embryo; Cynomolgus monkey

-------
1. INTRODUCTION

      Thalidomide (α-phthalimidoglutarimide) was synthesized in West Germany in 1953 by
the Chemie Grünenthal pharmaceutical firm, and was marketed from October 1957 into the
early 1960s. It was used for treating nausea and vomiting late during pregnancy and was also
said to be effective against influenza. The first case of the phocomelia defect, although not
recognized at the time as drug-related, was presented by a German scientist in 1959;
subsequently, malformed children were reported in 31 countries [1]. A pattern of defects of
limbs as well as the ocular, respiratory, gastrointestinal, urogenital, cardiovascular and nervous
systems caused by maternal thalidomide exposure during early pregnancy was observed. Limb
defects such as phocomelia, amelia, micromelia, oligodactyly, and syndactyly were the most
common malformations [2]. After removal from the global market in 1962, thalidomide was
reintroduced in 1998 by the biotechnology firm Celgene as an immunomodulator for treatment
of erythema nodosum leprosum, a serious inflammatory condition of Hansen's disease, and in
orphan status for treating Crohn's disease and several other diseases [1].

      Animal species are not equally susceptible or sensitive to the teratogenicity of chemical
agents, and some species respond more readily than others [3]. For thalidomide, a variety of
developmental toxic effects were reported in 18 animal species, but the responses have been
highly variable across species. Limb defects that mimic human thalidomide embryopathy have
only been observed and replicated in a few strains of rabbits and in primates [1,3,4]. Eight of 9
subhuman primates treated with thalidomide showed characteristic limb reduction
malformations ranging from amelia to varying degrees of phocomelia at a dosage and timing
comparable to those observed in human thalidomide embryopathy [3,5]. Since the first report of
thalidomide appeared 50 years ago, considerable information regarding the therapeutic
applications of this drug has accumulated, but the mechanisms by which thalidomide produces
congenital malformations are still not well understood [2,3,5].

      The nonhuman primate Macaca fascicularis (cynomolgus monkey) is widely used in
prenatal developmental studies because of its year-round rather than seasonal breeding behavior
[6]. Kalter [5] noted that nonhuman primates, especially macaques and baboons, are favorable
for mechanistic studies; however, only two full reports of the teratogenicity of thalidomide in
cynomolgus monkeys are available [7,8]. In those studies, cynomolgus monkeys were given
thalidomide by gavage at doses of 5 to 30 mg/kg/day during gestation days 20 to 30, and fetuses
were examined morphologically. The findings of these studies determined the critical period
and doses of thalidomide required for production of fetal malformations in this macaque species.
Although amounts taken were not always accurately recorded in humans, available documents
show that typical malformations resulted from the ingestion of as little as 25 mg three times a
day or 100 mg/day for 3 days during the sensitive period, equivalent to an astonishingly small
dosage of about 1 mg/kg/day [5]. In teratology studies using cynomolgus monkeys, the timing
of dosing was comparable to that in humans and the doses were estimated to be 5 to 30 times
higher than those which produced typical malformations in humans [5,7,8].









      Knowledge of the patterns of altered gene expression in embryonic target organs on a
global scale is an important consideration for understanding the mechanisms of teratogenesis
[9-13]. The application of cDNA microarray technology, a genome-wide analysis technique, to
cynomolgus monkeys facilitates the rapid monitoring of a large number of gene alterations in
this species [14]. In order to obtain information about the molecular mechanisms underlying the
detrimental effects of thalidomide teratogenicity, the present study determined the
experimental conditions required to produce thalidomide-induced fetal defects that mimicked
human abnormalities in cynomolgus monkeys and then profiled altered patterns of gene
expression in these embryos during the critical period. The dosing used in the present study was
15 or 20 mg/kg/day thalidomide given by gavage to pregnant dams at days 26-28 of gestation
for teratological evaluation, and 20 mg/kg given on day 26 for gene expression profiling 6 h
post-treatment.









2. MATERIALS and METHODS




2.1. Teratological evaluation




        The teratology study was performed at SNBL USA, Ltd. (Everett, WA, USA) in
compliance with the Animal Welfare Act and recommendations set forth in The Guide for the
Care and Use of Laboratory Animals [15]. Only females showing 25-32 day menstrual cycles
were used in these experiments. Each female monkey was paired with a male of proven fertility
for three days between days 11 and 15 of the menstrual cycle. When copulation was confirmed, the
median day of the mating period was regarded as day 0 of gestation. Pregnancy was confirmed
on day 20 or day 25 by ultrasound (SSD-4000, Aloka Co., Mitaka, Japan) under sedation
induced by intramuscular injection of 5% ketamine hydrochloride (Sigma Chemical Co., St.
Louis, MO, USA). The monkeys were given (±)-thalidomide (Lot no. SEH7050, Wako Pure
Chemical Industries, Ltd., Osaka, Japan) at 15 or 20 mg/kg/day by oral administration using
gelatin capsules (Japanese Pharmacopoeia grade) on days 26 to 28 of gestation. The dosage was
adjusted to the body weight on day 25 of gestation. Cesarean section was performed on days
100-102 of gestation under deep anesthesia induced by intramuscular injection of 5% ketamine
hydrochloride (0.1-0.2 ml/kg) and inhalation of isoflurane (0.5-2.0%, Baxter, Liberty Corner,
NJ, USA). Salivation was inhibited by atropine (0.01 mg/kg, Phenix Pharmaceutical, St. Joseph,
MO, USA). Fetal viability was recorded, and the fetuses were euthanized by intraperitoneal
injection of pentobarbital and phenytoin solution (Euthasol®, Virbac Corp., Fort Worth, TX,
USA). Fetuses were sexed and examined for external anomalies after confirmation of the
arrested heartbeat. After the completion of external examinations, fetuses were examined for
internal abnormalities.
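The day-numbering convention described above (day 0 of gestation is the median day of the three-day mating period) can be sketched as simple date arithmetic. This is only an illustration: the function names and the calendar dates are hypothetical and do not come from the study records.

```python
from datetime import date, timedelta


def gestation_day_zero(mating_days):
    """Return the median day of an odd-length mating period (day 0 of gestation)."""
    days = sorted(mating_days)
    if len(days) % 2 == 0:
        raise ValueError("expected an odd number of mating days")
    return days[len(days) // 2]


def date_of_gestation_day(day_zero, gestation_day):
    """Return the calendar date that corresponds to a given gestation day."""
    return day_zero + timedelta(days=gestation_day)


# Hypothetical three-day mating period (not taken from the study).
mating = [date(2006, 1, 10), date(2006, 1, 11), date(2006, 1, 12)]
day0 = gestation_day_zero(mating)

# Dosing window (days 26-28) and cesarean window (days 100-102) from the text.
dosing = [date_of_gestation_day(day0, gd) for gd in (26, 27, 28)]
cesarean = [date_of_gestation_day(day0, gd) for gd in (100, 101, 102)]
print("Day 0:", day0)
print("Dosing days:", dosing)
print("Cesarean window:", cesarean)
```

Anchoring every later event to day 0 keeps the dosing and cesarean windows consistent across animals that were mated on different calendar dates.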









2.2. Microarray experiments




        The animal experiments were performed at Shin Nippon Biomedical Laboratories
(SNBL), Ltd. (Kagoshima, Japan) in compliance with the Guideline for Animal
Experimentation (1987), and in accordance with the Law Concerning the Protection and Control
of Animals (1973) and the Standards Relating to the Care and Management of Experimental
Animals (1980). This study was approved by the Institutional Animal Care and Use Committee
of SNBL and performed in accordance with the ethics criteria contained in the bylaws of the
SNBL committee.









        Each female monkey was paired with a male of proven fertility for one day between
day 11 and day 15 of the menstrual cycle. Pregnant females, aged 5-8 years and weighing
2.84-3.76 kg on day 22 of gestation, were allocated randomly to two groups, each with three
monkeys, and housed individually. The monkeys were dosed with (±)-thalidomide (Lot
no. SDH7273/SDJ3347, Wako Pure Chemical Industries, Ltd., Osaka, Japan) at 0 or 20 mg/kg
by oral administration of a gelatin capsule on day 26 of gestation, which was during the critical
period for thalidomide-induced teratogenesis [7,8]. Dosage was adjusted to the body weight on
day 22 of gestation. Control monkeys received the capsule only.
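As a worked example of the weight-adjusted dosing described above, the sketch below multiplies the nominal dose (mg/kg) by the day-22 maternal body weight. The body weights are those listed for the thalidomide group in Table 1; the function name is hypothetical, and the resulting capsule contents are illustrative arithmetic rather than values reported by the study.

```python
def capsule_dose_mg(dose_mg_per_kg, body_weight_kg):
    """Return the amount of compound (mg) for one animal at a weight-adjusted dose."""
    if dose_mg_per_kg < 0 or body_weight_kg <= 0:
        raise ValueError("dose must be non-negative and body weight positive")
    return dose_mg_per_kg * body_weight_kg


# Day-22 maternal body weights (kg) for the thalidomide group, from Table 1.
body_weights = {"101": 2.97, "102": 3.01, "103": 3.14}

for embryo, weight in body_weights.items():
    amount = capsule_dose_mg(20, weight)
    print(f"Embryo {embryo}: {amount:.1f} mg thalidomide per capsule")
```

Control animals correspond to a dose of 0 mg/kg, so the same function returns 0 mg for them (capsule only).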









2.3. RNA sample collection




        Hysterectomy was performed under terminal anesthesia at 6 h after the administration
of thalidomide on day 26 of gestation. Whole embryos were rapidly removed from the uterus
using a stereomicroscope and immersed in sterilized physiological saline. Three embryos each
in the thalidomide-treated and control groups were obtained for RNA analysis and stored at
-70 °C until further processing. Maternal age, maternal body weight and the date of collection
for these samples are shown in Table 1. Embryos were processed simultaneously, and aside from
the blocking factors in Table 1, all 6 samples were handled concurrently through RNA isolation
and hybridization.
          Table 1: Procurement of cynomolgus embryos at SNBL for microarray study
Group        Embryo  Maternal age  Maternal bw     Date of embryo       *.cel filename
                     in years      in kg (day 22)  collection (day 26)  (NIHS)
control      001     6             3.76            Nov. 2, 2006         137255bpcynall.cel
control      002     1             2.84            Dec. 2, 2006         137256bpcynall.cel
control      003     8             3.68            Dec. 2, 2006         137257bpcynall.cel
thalidomide  101     5             2.97            Oct. 30, 2006        137258bpcynall.cel
thalidomide  102     6             3.01            Nov. 6, 2006         137259bpcynall.cel
thalidomide  103     8             3.14            Nov. 24, 2006        137260bpcynall.cel

2.4. RNA preparation and labeling




     At Gene Logic Inc. (Gaithersburg, MD, USA), total RNA was isolated from each day-26
embryo using the TRIzol method and RNeasy columns, amplified to cRNA, and biotin-labeled
for analysis on the Affymetrix NHP GeneChip® Array according to protocols from Affymetrix
(Santa Clara, CA, USA). The 28S/18S rRNA ratio of isolated RNA was assessed using a
Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA), and the RNA was found to be of
sufficiently high quality. Biotinylated cRNA was finally cleaned up and fragmented by limited
hydrolysis to a distribution of cRNA fragment sizes below 200 bases.









2.5. Affymetrix NHP GeneChip® Array and hybridization




        Biotinylated cRNA samples from control and exposed embryos (n=3 each) were
hybridized using Biogen Idec's (NASDAQ: BIIB) proprietary Affymetrix NHP GeneChip®
Array platform. This microarray chip contains a comprehensive representation of the
cynomolgus genome derived from Biogen Idec's proprietary sequencing efforts, and
Gene Logic (www.genelogic.com/) subsequently obtained the exclusive rights to provide it as a
service (personal communication, Jun Mano, Gene Logic). The steps for hybridization followed
a protocol described in the Gene Logic GeneChip® Analysis Manual (Gaithersburg, MD, USA).
Probe-sets for this analysis consisted of cynomolgus expressed sequence tags (ESTs), published
rhesus monkey ESTs, predicted coding sequences from the rhesus genome, and human genes
not represented by monkey sequences. Because of the incomplete state of annotation for the
cynomolgus genome at the time this study was undertaken, we used human, mouse and rat gene
annotations to characterize monkey genes on the NHP GeneChip® Array. This reasonably
assumes that most cynomolgus sequences are well-annotated by human ortholog information.
After hybridization, the GeneChip® arrays were scanned and raw signal values were subjected
to subsequent normalization and processing.









2.6. Microarray data processing and analysis




     Probe-level data normalization from the 6 *.cel files used the robust multichip average
(RMA) method with perfect-match (PM) but not mismatch (MM) data from the microarrays.
RMA returns a single file containing the 51,886 probes in 6 columns of normalized data,
representing the log2-intensity of each probe. To query differential transcript abundance
between sample groups, the log2 ratio of treated (Q) to reference (R) was computed for all 6
samples, with R being the average of the 3 controls. The 6 columns were centered to MEDIAN
= 0.00 and scaled to STDEV = 0.50 [10,12]. These data were loaded into GeneSpring GX7.3
software (Agilent Technologies, Redwood City, CA, USA) for one-way analysis of variance
(ANOVA) by treatment group. Due to the small sample size (n=3) and limited annotation of the
cynomolgus genome, for this preliminary analysis we relaxed the selection criterion by not
applying a false-discovery rate filter. Genes or probes passing the statistical (ANOVA) filter at a
P-value of 0.05 were subjected to K-means clustering, yielding cluster Set 1 and Set 2, which were
up-regulated and down-regulated, respectively, in the thalidomide-exposed versus control
embryos. Entrez gene identifiers were used for bioinformatics evaluation
(http://www.ncbi.nlm.nih.gov/).
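The ratio, centering and scaling steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it starts from already RMA-normalized log2 intensities, uses the sample standard deviation (the text does not say which estimator was used), and runs on a small made-up matrix. The function names and the toy values are hypothetical.

```python
from statistics import mean, median, stdev


def log2_ratios(log2_matrix, control_cols):
    """Subtract each probe's mean control log2 intensity (R) from every sample."""
    ratios = []
    for row in log2_matrix:
        reference = mean(row[c] for c in control_cols)
        ratios.append([value - reference for value in row])
    return ratios


def center_and_scale(matrix, target_sd=0.5):
    """Center each column to median 0 and rescale it to the target standard deviation."""
    adjusted_columns = []
    for column in zip(*matrix):
        med = median(column)
        centered = [value - med for value in column]
        sd = stdev(centered)  # sample standard deviation (an assumption)
        if sd == 0:
            raise ValueError("column has zero variance and cannot be scaled")
        adjusted_columns.append([value * target_sd / sd for value in centered])
    return [list(row) for row in zip(*adjusted_columns)]


# Toy log2 intensities: rows are probes, columns are 3 controls then 3 treated.
toy = [
    [8.0, 8.3, 7.8, 9.0, 9.4, 8.9],
    [5.2, 5.1, 4.9, 4.0, 4.2, 3.9],
    [10.0, 10.3, 9.6, 10.1, 10.0, 9.8],
    [6.5, 6.2, 6.6, 7.5, 7.7, 7.2],
]
ratios = log2_ratios(toy, control_cols=(0, 1, 2))
normalized = center_and_scale(ratios)
for row in normalized:
    print([round(value, 3) for value in row])
```

Because the intensities are already on a log2 scale, the log2 ratio of a sample to the reference is a simple difference, which is why the first step subtracts rather than divides.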









3. RESULTS




3.1. Teratological evaluation

      To confirm thalidomide embryopathy in the cynomolgus colony under the conditions
used for this study, pregnant dams were given thalidomide at 15 and 20 mg/kg on days 26-28 of
gestation. Four fetuses were obtained at each dose for teratological evaluation (Table 2).
Although we did not observe a clear dose-response in this limited number of fetuses, we did
observe a number of cases with limb defects consistent with human thalidomide embryopathy.
Figure 1 shows the external appearance of fetuses of dams exposed to thalidomide on gestation
days (GD) 26-28. Bilateral amelia in the fore-/hindlimbs was noted in one female fetus at 20 mg/kg,
and bilateral micromelia in the hindlimbs was observed in four fetuses at 15 mg/kg. Deformities
of the paw and/or foot, including hyperflexion, ectrodactyly, polydactyly, syndactyly,
brachydactyly, and/or malpositioned digits, were observed in all fetuses at 15 mg/kg and in two
fetuses at 20 mg/kg. Tail anomalies were found in one fetus at 15 mg/kg and three fetuses at
20 mg/kg. Small penis was noted in one fetus in each of the thalidomide-treated groups. No
internal abnormalities were noted in any of the thalidomide-treated fetuses examined here. This
confirmed the relevant sensitivity of cynomolgus embryos to thalidomide, based on a maternally
administered dose of 15-20 mg/kg during days 26-28 of gestation.









3.2. Genes altered by thalidomide




       The embryonic transcriptome was evaluated at 6 h after 20 mg/kg maternal thalidomide
exposure on day 26. For this analysis, we used a proprietary Non-Human Primate (NHP)
microarray having representation of the cynomolgus genome (see Methods for details). The
NHP array includes 18,293 cynomolgus genes and 8,411 Rhesus genes as well as genes from
several other species. The 6-array dataset conforming to MIAME standards resides in the Gene
Expression Omnibus repository (www.ncbi.nlm.nih.gov/geo/) under platform accession number
GPL8393 (series GSM389350-389355). A thalidomide-sensitive subset of genes in the
embryonic transcriptome was reflected in the high percentage of present calls for genes whose
expression levels showed >1.5-fold difference between thalidomide-treated and control
embryos.









     Statistical (ANOVA) analysis identified 2,362 genes that differed significantly between
control and thalidomide groups (P < 0.05). The heat map for these genes showed a clear pattern
(Figure 2). K-means clustering partitioned them into primary sets of up-regulated (1,281) genes
and down-regulated (1,081) genes for thalidomide relative to control embryos. Corresponding
files for the up-regulated (Set 1) and down-regulated (Set 2) genes are provided as a
supplement.
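The per-gene test and the split into up- and down-regulated sets can be illustrated with a small sketch. It computes the one-way ANOVA F statistic for two groups of three samples and compares it with the tabulated 5% critical value of F with (1, 4) degrees of freedom (about 7.71). Two simplifications are assumptions of this sketch and not the authors' procedure: the critical-value comparison stands in for the P-value reported by GeneSpring, and the direction of the mean difference stands in for K-means clustering. Function names and example values are hypothetical.

```python
from statistics import mean

# Tabulated upper 5% point of the F distribution with (1, 4) degrees of freedom;
# valid only for two groups of three samples each.
F_CRITICAL_1_4 = 7.71


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of sample groups."""
    values = [v for group in groups for v in group]
    grand_mean = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / df_between) / (ss_within / df_within)


def classify_probe(control, treated, f_critical=F_CRITICAL_1_4):
    """Label a probe 'up', 'down' or 'unchanged' (sign-based stand-in for K-means)."""
    if one_way_anova_f([control, treated]) <= f_critical:
        return "unchanged"
    return "up" if mean(treated) > mean(control) else "down"


# Hypothetical normalized log2 ratios for three probes (3 controls, 3 treated).
probes = {
    "probe_a": ([0.1, -0.1, 0.0], [1.0, 1.2, 0.8]),
    "probe_b": ([1.0, 1.2, 0.8], [0.1, -0.1, 0.0]),
    "probe_c": ([0.0, 0.5, -0.5], [0.1, 0.6, -0.4]),
}
for name, (control, treated) in probes.items():
    print(name, classify_probe(control, treated))
```

With only two treatment groups, the one-way ANOVA is equivalent to a two-sample t-test with pooled variance, so the same ranking of probes would result from either test.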

      Table 2. Morphological findings in fetuses of cynomolgus monkeys given thalidomide on days 26-28
Dose                            15 mg/kg                           20 mg/kg
Fetus no.                  1       2       3       4        5       6       7       8
Gender                     Female  Male    Female  Female   Male    Male    Male    Female

Target
  Findings

Forelimb
  Amelia                   -       -       -       -        -       -       -       B
Paw
  Hyperflexion               B        -
  Ectrodactyly               L        -
  Accessory digit(s) *         L        —
  Polydactyly *               -        R
              Brachydactyly -        -

Hindlimb
  Micromelia                B
  Amelia                    -        -

Foot
  Hyperflexion               -        B
  Ectrodactyly               -        B
  Polydactyly                -        -
  Syndactyly                 R        -
  Brachydactyly              -        -
  Malpositioned digit(s)       -        -
                  B
B        B        B
         B
         R

         B

         L
B
R
                                                         B
           B
B
Craniofacial

Trunk

Tail
  Bent or curled tail
  Short tail

External Genital Organs
  Small penis
- : No anomaly was observed.
+: Anomaly was observed.
B: Bilateral anomaly was observed.
R: Unilateral (right side) anomaly was observed.
L: Unilateral (left side) anomaly was observed.
* Polydactyly denotes the presence of (almost) complete extra digits; accessory digit denotes incomplete
"digit-like tissue" attached to a normal digit.

Figure 1. Malformed fetuses of cynomolgus maternal monkeys exposed to thalidomide on GDs 26-28.
A) The fetus of a maternal monkey given thalidomide at 15 mg/kg/day exhibiting brachydactyly in the
paw, micromelia in the hindlimb, hyperflexion, ectrodactyly and brachydactyly in the foot and curled
tail. B) The fetus of a maternal monkey given thalidomide at 20 mg/kg/day exhibiting amelia in the fore-
and hindlimb and bent tail.

Figure 2. Molecular abundance profiles of the thalidomide-sensitive genes in the cynomolgus
embryonic transcriptome on day 26 of gestation. RNA was isolated from day 26 embryos 6 h after
maternal exposure to 20 mg/kg thalidomide or vehicle control. Values represent Log2 ratios of
treated/reference, where the reference is an average of all three controls for each gene. ANOVA returned
2,362 genes that were significantly different between the groups (n=3, P < 0.05). The heat map visualizes
the genes in rows and the embryos in columns, and the histogram shows the distribution of genes in each
cluster. Columns left to right: 1-3 from control embryos (#001, #002, #003) and 4-6 from thalidomide
embryos (#101, #102, #103). Genes were partitioned by K-means clustering into two primary expression
clusters with 1,281 up-regulated genes (red) and 1,081 down-regulated genes (green).

3.3. Annotation systems




     Ranking functional categories of genes in an expression cluster is an important step to
unravel the cellular functions and pathways represented in the differentially expressed gene list.
To derive the highest-ranking biological themes across the up-/down-regulated gene lists,
Entrez gene IDs were annotated by Gene Ontology (GO) category using the Database for
Annotation, Visualization, and Integrated Discovery (http://apps1.niaid.nih.gov/david/). Table 3
lists the significantly over-represented themes when the 1,281 up-regulated genes (Table 3A)
and 1,081 down-regulated genes (Table 3B) were mapped by GO category. We used level-4
annotation for Biological Process, Cellular Component and Molecular Function as well as
curated pathways from the KEGG (Kyoto Encyclopedia of Genes and Genomes) open source
pathway resource to obtain categories passing by Fisher exact test (P < 0.05). For clarity, we
limited the categories in Table 3 to those having at least 10 hits (for sensitivity) and no more
than 50 hits (for specificity). Supplemental Table 3 provided
electronically includes the gene identifiers for each category.
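The over-representation calculation behind Table 3 can be sketched from the four counts reported for each category (Count, List Total, Pop Hits, Pop Total). The sketch below computes a fold-enrichment ratio and a one-sided hypergeometric tail probability, which is the one-sided Fisher exact test for this 2x2 layout. Both functions are hypothetical illustrations; DAVID may report a modified, more conservative form of the test, so the P-values produced here need not match the tabulated ones. As an observation from the tabulated numbers only, the first row of Table 3A (15, 694, 100, 13532) gives a ratio of about 2.92, which agrees with the value listed in the last column for that row.

```python
from math import comb


def fold_enrichment(count, list_total, pop_hits, pop_total):
    """Return (count / list_total) divided by (pop_hits / pop_total)."""
    return (count / list_total) / (pop_hits / pop_total)


def hypergeom_upper_tail(count, list_total, pop_hits, pop_total):
    """Return P(X >= count) when list_total genes are drawn without replacement
    from pop_total genes, of which pop_hits belong to the category."""
    upper = min(list_total, pop_hits)
    favorable = sum(
        comb(pop_hits, k) * comb(pop_total - pop_hits, list_total - k)
        for k in range(count, upper + 1)
    )
    return favorable / comb(pop_total, list_total)


# First row of Table 3A: nucleobase, nucleoside, nucleotide and nucleic acid transport.
row = dict(count=15, list_total=694, pop_hits=100, pop_total=13532)
print("fold enrichment:", round(fold_enrichment(**row), 2))
print("one-sided P:", hypergeom_upper_tail(**row))
```

Exact integer binomial coefficients are used so that the very large counts involved do not overflow floating point before the final division.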









     Integrated biological processes evident across the up-regulated categories addressed the
regulation of cellular growth, including cell cycle progression, DNA repair and nucleic acid
transport. Other up-regulated biological processes addressed the regulation of metabolism, the
cytoskeletal cycle, heart development and vesicle transport. Many of these processes were
logically reflected in the ontologies for cellular components addressing the nucleo-ribosomal
system, the microtubule network, and molecular functions for GTPase activity and actin binding.
Up-regulated signaling pathways (KEGG) included several oncogenic growth pathways as well
as the TGF-beta, GnRH and insulin signaling pathways.

Table 3A. GO-annotated biological categories for genes up-regulated in the embryo following maternal thalidomide exposure

Category    Term                                                            Count  PValue  List Total  Pop Hits  Pop Total  Log2 Fold Change

GOTERM_BP_4 Biological Process (level 4)
GO:0015931  nucleobase, nucleoside, nucleotide and nucleic acid transport      15   0.001         694       100      13532  +2.92
GO:0050658  RNA transport                                                      13   0.002         694        87      13532  +2.91
GO:0050657  nucleic acid transport                                             13   0.002         694        87      13532  +2.91
GO:0051236  establishment of RNA localization                                  13   0.002         694        87      13532  +2.91
GO:0051028  mRNA transport                                                     11   0.007         694        79      13532  +2.71
GO:0045941  positive regulation of transcription                               40   0.000         694       326      13532  +2.39
GO:0007507  heart development                                                  15   0.006         694       128      13532  +2.28
GO:0051276  chromosome organization and biogenesis                             45   0.000         694       394      13532  +2.23
GO:0006281  DNA repair                                                         28   0.001         694       267      13532  +2.04
GO:0022618  protein-RNA complex assembly                                       12   0.035         694       116      13532  +2.02
GO:0031325  positive regulation of cellular metabolic process                  42   0.000         694       416      13532  +1.97
GO:0009893  positive regulation of metabolic process                           44   0.000         694       445      13532  +1.93
GO:0051169  nuclear transport                                                  14   0.035         694       145      13532  +1.88
GO:0016481  negative regulation of transcription                               28   0.003         694       300      13532  +1.82
GO:0006461  protein complex assembly                                           27   0.005         694       295      13532  +1.78
GO:0045786  negative regulation of progression through cell cycle              19   0.022         694       209      13532  +1.77
GO:0009892  negative regulation of metabolic process                           38   0.002         694       436      13532  +1.70
GO:0031324  negative regulation of cellular metabolic process                  32   0.009         694       387      13532  +1.61
GO:0000074  regulation of progression through cell cycle                       42   0.005         694       526      13532  +1.56
GO:0051726  regulation of cell cycle                                           42   0.005         694       529      13532  +1.55
GO:0007010  cytoskeleton organization and biogenesis                           41   0.008         694       526      13532  +1.52
GO:0016192  vesicle-mediated transport                                         39   0.013         694       509      13532  +1.49

GOTERM_CC_4 Cellular component (level 4)
GO:0005830  cytosolic ribosome (sensu Eukaryota)                               10   0.017         743        76      14201  +2.51
GO:0005681  spliceosome                                                        16   0.004         743       134      14201  +2.28
GO:0000785  chromatin                                                          22   0.001         743       194      14201  +2.17
GO:0031965  nuclear membrane                                                   15   0.012         743       136      14201  +2.11
GO:0012506  vesicle membrane                                                   13   0.030         743       125      14201  +1.99
GO:0005874  microtubule                                                        23   0.005         743       233      14201  +1.89
GO:0005635  nuclear envelope                                                   18   0.015         743       182      14201  +1.89
GO:0005768  endosome                                                           18   0.028         743       196      14201  +1.76
GO:0005694  chromosome                                                         32   0.011         743       385      14201  +1.59
GO:0030529  ribonucleoprotein complex                                          41   0.047         743       584      14201  +1.34

GOTERM_MF_4 Molecular Function (level 4)
GO:0051427  hormone receptor binding                                           10   0.001         578        57      12599  +3.82
GO:0051020  GTPase binding                                                     11   0.003         578        78      12599  +3.07
GO:0003712  transcription cofactor activity                                    41   0.000         578       311      12599  +2.87
GO:0003779  actin binding                                                      27   0.002         578       302      12599  +1.95
GO:0008234  cysteine-type peptidase activity                                   15   0.027         578       172      12599  +1.90

KEGG_PATHWAY
hsa05220    Chronic myeloid leukemia                                           10   0.016         225        74       4214  +2.53
hsa05222    Small cell lung cancer                                             11   0.016         225        87       4214  +2.37
hsa05215    Prostate cancer                                                    11   0.016         225        87       4214  +2.37
hsa04350    TGF-beta signaling pathway                                         11   0.020         225        90       4214  +2.29
hsa04912    GnRH signaling pathway                                             11   0.026         225        94       4214  +2.19
hsa04910    Insulin signaling pathway                                          14   0.025         225       134       4214  +1.96

Table 3B. GO-annotated biological categories for genes down-regulated in the embryo following maternal thalidomide exposure

Category    Term                                                            Count  PValue  List Total  Pop Hits  Pop Total  Log2 Fold Change

GOTERM_BP_4 Biological Process (level 4)
GO:0008284  positive regulation of cell proliferation                          24   0.000         556       240      13532  -2.43
GO:0007517  muscle development                                                 16   0.006         556       177      13532  -2.20
GO:0009889  regulation of biosynthetic process                                 18   0.005         556       207      13532  -2.12
GO:0006417  regulation of translation                                          14   0.027         556       174      13532  -1.96
GO:0032940  secretion by cell                                                  23   0.004         556       287      13532  -1.95
GO:0001944  vasculature development                                            15   0.026         556       191      13532  -1.91
GO:0045045  secretory pathway                                                  18   0.020         556       239      13532  -1.83
GO:0051246  regulation of protein metabolic process                            23   0.008         556       307      13532  -1.82
GO:0006873  cellular ion homeostasis                                           16   0.031         556       214      13532  -1.82
GO:0006954  inflammatory response                                              22   0.012         556       301      13532  -1.78
GO:0016192  vesicle-mediated transport                                         35   0.004         556       509      13532  -1.67
GO:0042127  regulation of cell proliferation                                   34   0.005         556       499      13532  -1.66
GO:0019752  carboxylic acid metabolic process                                  36   0.012         556       572      13532  -1.53
GO:0046907  intracellular transport                                            40   0.043         556       714      13532  -1.36

GOTERM_CC_4 Cellular component (level 4)
GO:0005625  soluble fraction                                                   21   0.004         602       244      14201  -2.03
GO:0005768  endosome                                                           15   0.039         602       196      14201  -1.81
GO:0005789  endoplasmic reticulum membrane                                     28   0.031         602       435      14201  -1.52
GO:0044432  endoplasmic reticulum part                                         30   0.047         602       494      14201  -1.43
GO:0005624  membrane fraction                                                  44   0.026         602       749      14201  -1.39
GO:0005783  endoplasmic reticulum                                              46   0.049         602       827      14201  -1.31

GOTERM_MF_4 Molecular Function (level 4)
GO:0030594  neurotransmitter receptor activity                                 14   0.000         531        99      12599  -3.36
GO:0051020  GTPase binding                                                     11   0.002         531        78      12599  -3.35
GO:0016747  transferase activity, transferring other than amino-acyl groups    15   0.028         531       188      12599  -1.89
GO:0004175  endopeptidase activity                                             31   0.012         531       463      12599  -1.59

KEGG_PATHWAY
hsa04640    Hematopoietic cell lineage                                         12   0.005         223        85       4214  -2.67
hsa04612    Antigen processing and presentation                                10   0.024         223        80       4214  -2.36

Results for the embryo 6 h after a teratogenic dose of thalidomide (20 mg/kg) on day 26 of gestation for 1,281 significantly up-regulated genes
(Table 3A) and 1,081 significantly down-regulated genes (Table 3B) based on the population of arrayed genes. The annotation system used the
NIH/NIAID Database for Annotation, Visualization, and Integrated Discovery (DAVID) at level 4. Count refers to the number of altered genes in
the ontology (min = 10 and max = 50). P Value refers to results from Fisher exact test (P < 0.05); List Total refers to the number of annotated
genes on the array; Pop Hits and Pop Total refer to the number of annotated genes in the database for the category and overall; Log2 Fold
Change is computed as the mean Log2 (treated / control) for genes in the category. Note: see electronic supplement Table 2 for gene identifiers
in each listing.



     Integrated biological processes evident across the down-regulated categories addressed
ion homeostasis and cellular secretion. These processes were logically reflected in the
ontologies for cellular components addressing the endoplasmic reticulum, GTPase activity
and transferases. Other down-regulated biological processes addressed cell growth, muscle
and vasculature development, and the inflammatory response, consistent with KEGG
pathways for hematopoietic cells and antigen processing.









4. DISCUSSION









     The results from this study show that a teratogenic dose of thalidomide (20 mg/kg)
significantly alters global gene expression profiles in the cynomolgus monkey embryo within
6 h of exposure on day 26 of gestation. Bioinformatics analysis of the embryonic
transcriptome following maternal thalidomide exposure revealed up-regulation in several
signaling pathways with roles in morphogenesis and oncogenesis (e.g., TGF-beta, insulin
signaling), and down-regulation of the endoplasmic reticulum and inflammatory response. As
might be anticipated, this implies a broad reaction of the embryo to the mechanism of
thalidomide and a generalized reprogramming of pathways known to be important in
development and teratogenesis.









    The dosing scenario used in the present study was 15 or 20 mg/kg/day thalidomide given
by gavage to pregnant dams at days 26-28 of gestation for teratological evaluation, and 20
mg/kg given on day 26 for gene expression profiling 6 h post-treatment. The teratological
exposure induced limb malformations consistent with earlier studies with thalidomide in
pregnant macaques. For example, it was previously reported that two fetuses with amelia
were obtained from two of four cynomolgus monkeys given thalidomide by gavage at 10
mg/kg/day on days 32 to 42 after commencement of menses (approximately equivalent to
days 20 to 30 of gestation) and that the fetal malformations were similar to malformations
reported in children whose mothers had taken thalidomide during pregnancy [7]. Forelimb
malformations in the cynomolgus fetus were noted following a single oral administration of
thalidomide on days 25, 26 or 27 of gestation at 10 and 30 mg/kg and daily administration on
days 25 to 27 of gestation at 5 mg/kg, and both fore- and hindlimb malformations were
observed following a single oral administration on day 25 or 28 of gestation at 30 mg/kg [8].
The present study, taken together with the previous studies [7,8], indicates that orally
administered thalidomide induces fetal malformations in cynomolgus monkeys similar to
those in human pregnancies and furthermore localizes the vulnerable period to days 25 to 28 of
gestation and the effective doses to 5 to 30 mg/kg/day.









        Given the limitations of working with this species, the preliminary application of a
custom NHP microarray, the analysis at one dose and time point, and the incomplete state of
annotation of the macaque genome, the current study design focused on RNA collected from
individual embryos rather than the specific target organ system (forelimb, hindlimb). Ideally, a
follow-up study on focused gene expression analysis should be performed for specific
embryonic limbs in which malformations have been induced with thalidomide; however, the
present study is among the first to provide genomic information on the initial changes in gene
expression occurring in macaque embryos during the critical events following a teratogenic
dose of thalidomide. A total of 43 and 26 functional categories of redundant genes were up-
and down-regulated, respectively, based on the GO annotation system for human LocusLink
identifiers.









        Statistically, the top-ranked 20 up-regulated genes included 4 hits to cell shape and
polarity genes: KIAA0992 (twice), FMNL2, FMNL3. Palladin, encoded by the KIAA0992
gene, plays a role in cytoskeletal organization, embryonic development, cell motility, and
neurogenesis [16]. Formin-related proteins play a role in Rho GTPase-dependent regulation
of the actin cytoskeletal cycle and have been implicated in morphogenesis, cell movement
and cell polarity [17]. Several genes in the focal adhesion/actin cytoskeleton pathway were
up-regulated. The guanine nucleotide exchange factors (GEFs) DOCK1, which forms a complex
with RhoG, and VAV2 and ARHGEF7, which act on Rho family GTPases, play a fundamental
role in small G-protein signaling pathways that regulate numerous cellular processes
including actin-cytoskeletal organization [18-22]. To further understand the mechanisms of
thalidomide-induced teratogenicity, the regional and developmental stage of expression for
these genes and corresponding proteins should be determined; however, these preliminary
findings suggest that thalidomide perturbs a general program involving the up-regulation of
Rho family GTPases and their GEFs.









        One candidate pathway for the control of cytoskeletal remodeling evident in studies
of early induction of the Fetal Alcohol Syndrome (FAS) in mouse embryos is the receptor
tyrosine kinase (RTK) signaling pathway, which mediates the effects of insulin-like growth
factors [12]. Genes in the RTK insulin signaling pathway were significantly up-regulated by
thalidomide treatment as in FAS. AKT1 and GSK3β, which were up-regulated by thalidomide, are key
genes in this pathway. AKT1, a serine-threonine protein kinase, is regulated by PDGF and
insulin through PI-3 kinase signaling [23-25]. GSK3β, a substrate of AKT, is a
proline-directed serine-threonine kinase that was initially identified as phosphorylating and
inactivating glycogen synthase [26]. IGF-I and IGF-II are expressed in the anterior and
posterior mesodermal cells of the developing limbs [27-29]. IGF-I can influence chick limb
outgrowth [29-31] and regulate muscle mass during early limb myogenesis [32]. Although
these facts may implicate IGF signals as a potential mediator of thalidomide embryopathy, the
present study did not find significant expression or thalidomide-induced alteration in the
global pattern of several key transcripts in this signaling pathway, including IGFBPs 1-3, 5, 6
and 7, IGF1, IGF1R, and IRS 1-4 (data not shown). It is certainly plausible that thalidomide
exposure may locally alter upstream events in IGF-I signaling without necessarily altering the
molecular abundance profiles of the pathway in the developing limb of monkey embryos. On
the other hand, our preliminary microarray analysis does find evidence for up-regulation of
GSK3β and AKT1 transcripts that are downstream in the insulin signaling pathway. Effects
on TGF-beta and WNT signaling may be critical here. Thalidomide-induced oxidative stress
in chick embryos can enhance signaling through BMPs (bone morphogenetic proteins),
leading to up-regulation of the WNT antagonist Dickkopf1 (Dkk1) and subsequent cell death
[33]. We note here a significant up-regulation of genes in the TGF-beta pathway and
similarities with genes in the cytoskeletal cycle and WNT pathways for the murine FAS [12].









        Some of the responsive genes found in this study are known to play roles in vascular
development pathways. For example, vascular endothelial growth factor (VEGF) was
down-regulated and platelet-derived growth factor receptor β (PDGFRβ) was up-regulated
during early stages in thalidomide embryopathy. VEGF is a key stimulator of vascular cell
migration and proliferation and acts directly on endothelial cells, whereas PDGF attracts
connective tissue cells that can also stimulate angiogenesis. The reciprocal effect on these
transcript profiles, potentially leading to an overall decrease in VEGF/PDGFRβ activities,
might be predicted to interfere with vascular cell recruitment and proliferation in the
developing embryo or limb. It is well known that thalidomide reduces the activity or
production of VEGF and TNF-α, leading to inhibition of angiogenesis [34]. The present
microarray data are consistent with this effect. Furthermore, VEGF stimulates PDGFRβ and
induces tyrosine phosphorylation [35]. The reciprocal effect that maternal thalidomide
exposure had on these transcripts may suggest that a key event in the programming or
induction of vascular cells or their progenitors has been disrupted within 6 h after exposure.
This notion is supported by the study of D'Amato et al. [36], which suggested that limb defects
caused by thalidomide were secondary to inhibition of blood vessel growth in the developing
limb bud. Down-regulation of the vascular development program is consistent with this notion
and with the supposition that correct limb bud formation requires a complex interaction of both
vasculogenesis and angiogenesis during development [37]. Perhaps these genes might be
considered as potential biomarkers of thalidomide-induced teratogenesis in cynomolgus
monkeys. A recent study with the teratogenic thalidomide metabolite, CPS49, has shown
direct evidence for suppression of endothelial angiogenic sprouting and failure to establish a
normal vascular network as a key event in thalidomide embryopathy [38]. CPS49 mimics the
antiangiogenic properties, but not the anti-inflammatory properties, of thalidomide.
        Finally, the inflammatory response pathway was found to be significantly
down-regulated in the early thalidomide embryome. Although down-regulation of the
inflammatory response might be anticipated to protect the embryo, studies in laboratory
animals have implicated a role for reactive oxygen species (ROS) in thalidomide
embryopathy [39]. In that study, thalidomide was found to preferentially increase ROS in
embryonic limb cells from a sensitive species (rabbit) but not the insensitive species (rat).
Down-regulation of the inflammatory pathways in thalidomide-exposed monkey embryos
reinforces this notion.









        In conclusion, these findings show that thalidomide exposure perturbs a general
program of morphoregulatory processes in the cynomolgus monkey embryo. Bioinformatics
analysis has now identified many key pathways implicated in thalidomide embryopathy in
cynomolgus monkeys, and has also revealed some novel processes that can help unravel the
mechanism of this important developmental phenotype. Several pathways, including actin
cytoskeleton remodeling and downstream insulin signaling-related genes, in addition to
vascular development pathways, may provide candidate biomarkers for key events underlying
the teratogenicity of thalidomide in primates. To clarify the molecular mechanisms, further
studies must examine protein expression, phosphorylation, and other modifications in the
precursor target organ system.
ACKNOWLEDGEMENTS









        This work was partially supported by Health and Labour Sciences Research Grants
(Research on Regulatory Science of Pharmaceuticals and Medical Devices:
H16-Kenkou-066; Research on Risk of Chemical Substances: H17-Kagaku-001) from the
Ministry of Health, Labour and Welfare of Japan. The bioinformatics analysis was performed
at the National Center for Computational Toxicology, US EPA. The authors are grateful to Dr.
Robert MacPhail of EPA's National Health and Environmental Effects Research Laboratory
for helpful comments on the manuscript.









        Disclaimer: The U.S. EPA, through its Office of Research and Development,
collaborated in the research described here. It has been subjected to agency review and
approved for publication. The authors declare they have no competing financial interests.
REFERENCES




[1] Schardein JL, Macina OT. Thalidomide. In: Human Developmental Toxicants: Aspects of
    Toxicology and Chemistry. Boca Raton: CRC Press, Taylor & Francis Group; 2007:127-41.
[2] Hansen JM, Carney EW, Harris C. Differential alteration by thalidomide of the
    glutathione content of rat vs. rabbit conceptuses in vitro. Reprod Toxicol
    1999;13:547-54.
[3] Schardein JL. Thalidomide: the prototype teratogen. In: Chemically Induced Birth Defects,
    3rd edition, revised and expanded. New York: Marcel Dekker Inc; 2000:89-120.
[4] Teo SK, Denny KH, Stirling DI, Thomas SD, Morseth S, Hoberman AM. Effects of
    thalidomide on developmental, peri- and postnatal function in female New Zealand white
    rabbits and offspring. Toxicol Sci 2004;81:379-89.
[5] Kalter H. Thalidomide. In: Teratology in the Twentieth Century: Congenital
    Malformations in Humans and how their Environmental Causes were Established.
    Amsterdam: Elsevier Science; 2003:167-75.
[6] Yoshida T. Introduction. In: The TPRC Handbook on the Care and Management of the
    Laboratory Cynomolgus Monkey, Yoshida T, Fujimoto K eds. Tokyo: Springer Japan;
    2006:1-3.
[7] Delahunt CS, Lassen LJ. Thalidomide syndrome in monkeys. Science 1964;146:1300-5.
[8] Hendrickx AG. The sensitive period and malformation syndrome produced by
    thalidomide in the crab-eating monkey (Macaca fascicularis). J Med Primatol
    1973;2:267-76.
[9] Finnell RH, Gelineau-van Waes J, Eudy JD, Rosenquist TH. Molecular basis of
    environmentally induced birth defects. Annu Rev Pharmacol Toxicol 2002;42:181-208.
[10] Singh AV, Knudsen KB, Knudsen TB. Computational systems analysis of
    developmental toxicity: design, development and implementation of a birth defects
    systems manager (BDSM). Reprod Toxicol 2005;19:421-39.
[11] Daston GP. Genomics and developmental risk assessment. Birth Defects Res (Part A)
    2007;79:1-7.
[12] Green ML, Singh AV, Zhang Y, Nemeth KA, Sulik KK, Knudsen TB. Reprogramming
    of genetic networks during initiation of the fetal alcohol syndrome. Dev Dyn
    2007;236:613-31.
[13] Knudsen TB, Kavlock RJ. Comparative bioinformatics and computational toxicology.
    In: Developmental Toxicology volume 3, Target Organ Toxicology Series (B Abbott and
    D Hansen, editors). New York: Taylor and Francis; 2008:311-60.
[14] Gene Logic. NHP GeneChip® Array Service [accessed September 14, 2007]
    http://www.genelogic.com/docs/pdfs/NHP W.pdf
[15] Institute of Laboratory Animal Research, Commission on Life Sciences, National
    Research Council. Guide for the Care and Use of Laboratory Animals. Washington DC:
    The National Academies Press; 1996.
[16] Otey CA, Rachlin A, Moza M, Arneman D, Carpen O. The
    palladin/myotilin/myopalladin family of actin-associated scaffolds. Int Rev Cytol
    2005;246:31-58.
[17] Yayoshi-Yamamoto S, Taniuchi I, Watanabe T. FRL, a novel formin-related protein,
    binds to Rac and regulates cell motility and survival of macrophages. Mol Cell Biol
    2000;20:6872-81.
[18] Marignani PA, Carpenter CL. Vav2 is required for cell spreading. J Cell Biol
    2001;154:177-86.
[19] Brugnera E, Haney L, Grimsley C, Lu M et al. Unconventional Rac-GEF activity is
    mediated through the Dock180-ELMO complex. Nature Cell Biol 2002;4:574-82.
[20] Katoh H, Negishi M. RhoG activates Rac1 by direct interaction with the
    Dock180-binding protein Elmo. Nature 2003;424:461-4.
[21] Rosenberger G, Jantke I, Gal A, Kutsche K. Interaction of alphaPIX (ARHGEF6) with
    beta-parvin (PARVB) suggests an involvement of alphaPIX in integrin-mediated
    signaling. Hum Mol Genet 2003;12:155-67.
[22] Shin EY, Woo KN, Lee CS, Koo SH, Kim YG, Kim WJ, Bae CD, Chang SI, Kim EG.
    Basic fibroblast growth factor stimulates activation of Rac1 through a p85 βPIX
    phosphorylation-dependent pathway. J Biol Chem 2004;279:1994-2004.
[23] Burgering BM, Coffer PJ. Protein kinase B (c-Akt) in phosphatidylinositol-3-OH kinase
    signal transduction. Nature 1995;376:599-602.
[24] Franke TF, Yang SI, Chan TO, Datta K, Kazlauskas A, Morrison DK, Kaplan DR,
    Tsichlis PN. The protein kinase encoded by the Akt proto-oncogene is a target of the
    PDGF-activated phosphatidylinositol 3-kinase. Cell 1995;81:727-36.
[25] Kohn AD, Kovacina KS, Roth RA. Insulin stimulates the kinase activity of RAC-PK, a
    pleckstrin homology domain containing ser/thr kinase. EMBO J 1995;14:4288-95.
[26] Cross DA, Alessi DR, Cohen P, Andjelkovich M, Hemmings BA. Inhibition of glycogen
    synthase kinase-3 by insulin mediated by protein kinase B. Nature 1995;378:785-9.
[27] Streck RD, Wood TL, Hsu MS, Pintar JE. Insulin-like growth factor I and II and
    insulin-like growth factor binding protein-2 RNAs are expressed in adjacent tissues
    within rat embryonic and fetal limbs. Dev Biol 1992;151:586-96.
[28] van Kleffens M, Groffen C, Rosato RR, van den Eijnde SM, van Neck JW,
    Lindenbergh-Kortleve DJ, Zwarthoff EC, Drop SL. mRNA expression patterns of the
    IGF system during mouse limb bud development, determined by whole mount in situ
    hybridization. Mol Cell Endocrinol 1998;138:151-61.
[29] Stephens TD, Bunde CJ, Fillmore BJ. Mechanism of action in thalidomide teratogenesis.
    Biochem Pharmacol 2000;59:1489-99.
[30] Dealy CN, Kosher RA. Studies on insulin-like growth factor-I and insulin in chick limb
    morphogenesis. Dev Dyn 1995;202:67-79.
[31] Dealy CN, Kosher RA. IGF-I, insulin and FGFs induce outgrowth of the limb buds of
    amelic mutant chick embryos. Development 1996;122:1323-30.
[32] Mitchell PJ, Johnson SE, Harmon K. Insulin-like growth factor I stimulates myoblast
    expansion and myofiber development in the limb. Dev Dyn 2002;223:12-23.
[33] Knobloch J, Shaughnessy JD Jr, Rüther U. Thalidomide induces limb deformities by
    perturbing the Bmp/Dkk1/Wnt signaling pathway. FASEB J 2007;21:1410-21.
[34] Eisen T, Boshoff C, Mak I, Sapunar F, Vaughan MM, Pyle L, Johnston SR, Ahern R,
    Smith IE, Gore ME. Continuous low dose Thalidomide: a phase II study in advanced
    melanoma, renal cell, ovarian and breast cancer. Br J Cancer 2000;82:812-7.
[35] Ball SG, Shuttleworth CA, Kielty CM. Vascular endothelial growth factor can signal
    through platelet-derived growth factor receptors. J Cell Biol 2007;177:489-500.
[36] D'Amato RJ, Loughnan MS, Flynn E, Folkman J. Thalidomide is an inhibitor of
    angiogenesis. Proc Natl Acad Sci USA 1994;91:4082-5.
[37] Seifert R, Zhao B, Christ B. Cytokinetic studies on the aortic endothelium and limb bud
    vascularization in avian embryos. Anat Embryol (Berl) 1992;186:601-10.
[38] Therapontos C, Erskine L, Gardner ER, Figg WD, Vargesson N. Thalidomide induces
    limb defects by preventing angiogenic outgrowth during early limb formation. PNAS
    Early Edition 2009; www.pnas.org/cgi/doi/10.1073/pnas.0901505106.
[39] Hansen JM, Harris KK, Philbert MA, Harris C. Thalidomide modulates nuclear redox
    status and preferentially depletes glutathione in rabbit limb versus rat limb. J Pharmacol
    Exp Ther 2002;300:768-76.
-------
            Systems Biology in Reproductive Medicine
Inhibition of Rat and Human Steroidogenesis by Triazole Antifungals

Journal:                       Systems Biology in Reproductive Medicine
Manuscript ID:                 draft
Manuscript Type:               Research Article
Date Submitted by the Author:
Complete List of Authors:      Dix, David; U.S. EPA
Keywords:                      myclobutanil, CYP17, testosterone, triadimefon, propiconazole

                        Manuscript Central
 URL: http:/mc.manuscriptcentral.com/sbrm Email: AARS@comcast.net
-------
Inhibition of Rat and Human Steroidogenesis by Triazole Antifungals

Amber K. Goetz1,2, John C. Rockett1, Hongzu Ren1, Inthirany Thillainadarajah1,
David J. Dix1

1 Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle
Park, NC 27711, USA; 2 Department of Environmental and Molecular Toxicology, North Carolina
State University, Raleigh, NC 27695, USA

Correspondence and reprint requests:
Dr. David Dix
National Center for Computational Toxicology (D343-03)
Office of Research & Development
U.S. Environmental Protection Agency
Research Triangle Park, NC 27711
E-mail: dix.david@epa.gov
Tel: 919 541 2701

Running Title: Triazole effects on Steroidogenesis

Disclaimer: The United States Environmental Protection Agency through its Office of Research and
Development funded and managed the research described here. It has undergone Agency review and
been approved for publication.
Abstract

Environmental chemicals that alter steroid production could interfere with male reproductive
development and function. Three agricultural antifungal triazoles (myclobutanil, propiconazole and
triadimefon) that are known to modulate expression of cytochrome P450 (CYP) genes and
enzymatic activities were tested for effects on steroidogenesis in rat in vivo and in vitro, and human
in vitro model systems. Hormone production was measured in testis organ cultures from untreated
adult and neonatal rats, following in vitro exposure to 1, 10, or 100 µM of myclobutanil or
triadimefon. Myclobutanil and triadimefon reduced media levels of testosterone by 40-68% in the
adult and neonatal testis culture, and altered steroid production in a manner that indicated CYP17
hydroxylase/17,20 lyase (CYP17A1) inhibition at the highest concentration tested. Rat to human
comparison was explored using the H295R (human adrenal adenocarcinoma) cell line. Following 48
hour exposure to myclobutanil, propiconazole or triadimefon at 1, 3, 10, 30, or 100 µM, there was an
overall decrease in estradiol, progesterone and testosterone by all three triazoles. These data indicate
that myclobutanil, propiconazole and triadimefon are weak inhibitors of testosterone production in
vitro. However, in vivo exposure of rats to triazoles resulted in increased serum and intra-testicular
testosterone levels. This discordance could be due to higher concentrations of triazoles tested in
vitro, and differences within an in vitro model system lacking neuroendocrine control.

Key Words: myclobutanil, triadimefon, propiconazole, testosterone, CYP17
Introduction

        Conazoles are triazole fungicides used for crop protection and pharmaceutical treatment of
fungal infections. They inhibit cytochrome P450 (CYP) 51 by competitively binding to the heme
component of the CYP enzyme (Ghannoum and Rice, 1999). In fungal cells this binding depletes
ergosterol levels, which leads to a build up of precursor sterols in the cellular membrane, disrupting
turgor pressure and triggering cytotoxicity. Conazoles disrupt other CYPs, including steroidogenic
CYPs (CYP17A1, CYP19A1), and consequently there is concern that inadvertent exposure to
agricultural conazoles may inhibit steroidogenesis and adversely affect reproduction in humans and
other mammalian species (Zarn et al., 2003).

        It has been shown that some triazole conazoles inhibit aromatase (CYP19) conversion of
testosterone to estrogen (Andersen et al., 2002; Trosken et al., 2004; Vinggaard et al., 2000). In
addition, there are reports that have investigated the effects of agricultural triazoles on testosterone
synthesis. The triazoles hexaconazole and flusilazole inhibited testosterone synthesis in Leydig cell
culture and increased the incidence of Leydig cell tumors in rats (Inchem IPCS, 1990, 1995). In
contrast, the triazoles myclobutanil and triadimefon did not cause Leydig cell tumors in rats, but did
increase serum testosterone levels in adult male rats after 14 days. This occurred without affecting
serum luteinizing hormone (LH) or estradiol, and without stimulating Leydig cell hyperplasia
(Goetz et al., 2007; Tully et al., 2006).

        The current study was designed to investigate the effect of myclobutanil, propiconazole and
triadimefon on testis testosterone synthesis in vivo or in vitro, and to determine whether and how
exposure to these triazoles alters testis testosterone levels in conjunction with increased serum
testosterone levels. In the first experiment, the effects of triadimefon following exposure in vivo on
serum and testis testosterone levels were measured in rats to test the hypothesis that increased testis
testosterone production contributes to the reported elevated serum testosterone levels. We selected
triadimefon because it had caused the most robust increases in serum testosterone in previous rat
experiments (Goetz et al., 2007; Tully et al., 2006). Serum LH was measured to determine if it
contributed to altered testosterone production, and serum estradiol to test if inhibition of aromatase
occurred in vivo. In the second experiment, rat in vivo to in vitro comparisons were explored by
measuring testosterone production in organ cultures of neonatal and adult rat testes
exposed to varying concentrations of either myclobutanil or triadimefon. In the third experiment,
rat to human comparisons were explored by measuring hormone production (testosterone,
estradiol, progesterone) in the H295R cell line following exposure to myclobutanil, propiconazole
or triadimefon.

Materials and Methods

Animal Husbandry. All animal procedures were approved by the U.S. Environmental Protection
Agency's National Health and Environmental Effects Research Laboratory Institutional Animal Care
and Use Committee. All animals were purchased from Charles River Laboratories (Raleigh, NC) and
housed in an Association for Assessment and Accreditation of Laboratory Animal Care
International accredited facility. Animals were individually housed in polypropylene boxes
containing Alpha-Dri® bedding (Shepherd Specialty Papers, Watertown, TN), and subjected to a 12
hour:12 hour light:dark cycle under controlled temperature (22 ± 2°C) and humidity (40-60%).
Animals were provided unlimited access to LabDiet 5002 Rodent Diet (PMI LabDiet, Richmond, IN)
and water. Feed was prepared by Bayer CropScience (Kansas City, MO) as part of a Materials
Cooperative Research and Development Agreement between the USEPA and the US Triazole Task
Force. Control animals were fed 5002 Certified Rodent Diet. The triadimefon treated group received
feed containing 1800 ppm triadimefon.

In vivo dosing and sample collection. The treated animals started dietary exposure on postnatal day
(PND) 60. Male Wistar Han rats (n = 15) were fed rat chow 5002 containing 1800 ppm triadimefon.
Treatment lasted 30 days; body weight and feed were weighed on a weekly basis to determine dose.
Animals were tail bled 14 days into dosing (PND 74) for testosterone measurements. On day 30
(PND 90) animals were decapitated between 08:30 and 10:30 and then necropsied. Trunk blood was
collected for serum measurements of testosterone, estradiol, luteinizing hormone, and prolactin.
Liver, epididymis, ventral prostate, seminal vesicle, and pituitary were collected and weighed. Testes
were weighed, snap frozen in liquid nitrogen, and stored at -80°C until analysis. The frozen right
testis was homogenized in cold Dulbecco's PBS (Gibco, Grand Island, NY) with an Ultra-Turrax T25
homogenizer (Janke-Kunkel IKA, Boutersem, Belgium), and centrifuged at 4°C for 10 min at 4,000
rcf using a Beckman J2-21M centrifuge. Supernatant was then centrifuged with a 5417R centrifuge
(Eppendorf, Westbury, NY) at 20,000 rcf for 10 min at 4°C. Supernatant was collected and stored at
-80°C until analysis of intratesticular testosterone.
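
The weekly feed and body-weight records translate into a dose through simple arithmetic: a dietary concentration in ppm is mg of compound per kg of feed, so dose (mg/kg body weight/day) = ppm x daily feed intake (kg) / body weight (kg). The sketch below is illustrative only. The function name and the example feed intake and body weight are assumptions chosen for the illustration, not measurements from this study.

```python
def dietary_dose_mg_per_kg_day(feed_ppm, feed_intake_g_per_day, body_weight_g):
    """Convert a dietary concentration into a daily dose.

    feed_ppm is mg of test chemical per kg of feed; intake and body
    weight are given in grams and converted to kilograms here.
    """
    feed_kg_per_day = feed_intake_g_per_day / 1000.0
    body_weight_kg = body_weight_g / 1000.0
    return feed_ppm * feed_kg_per_day / body_weight_kg


# Hypothetical rat: 25 g feed/day at 357 g body weight on the 1800 ppm diet.
example_dose = dietary_dose_mg_per_kg_day(1800, 25.0, 357.0)
print(f"{example_dose:.1f} mg/kg/day")
```

With these assumed inputs the result is on the order of the 126 mg/kg/day mean dose the authors report in the Results; actual per-animal values would follow from the weekly records.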

Rat in vitro testis culture. Testes from adult (PND 90-100, n = 5-8) and neonatal (PND 1, n = 5
litters) Sprague Dawley rats were used in the in vitro study. Testis parenchyma was sliced into ~100
mg pieces for each adult testis and neonatal testes were left intact. Testis tissue was incubated in 1.5
ml of M199 media (Gibco, Grand Island, NY) supplemented with 0.2% bovine serum albumin
(Sigma, St. Louis, MO) and 10% charcoal/dextran treated fetal bovine serum (Hyclone, Logan, UT).
Human chorionic gonadotropin (hCG; Sigma, St. Louis, MO), an LH receptor agonist, was added at
100 mU/mL to all treatment groups except the negative controls to stimulate testosterone
production. Technical grade (>95% purity) myclobutanil (LKT
Laboratories Inc., St. Paul, MN) or triadimefon (Bayer CropScience, Kansas City, KS) was added to
a final concentration of 1, 10, or 100 µM. Each test chemical was premixed with ethanol to aid
dilution; the final ethanol volume was 0.05% of the total culture volume. Positive control medium
(plus hCG, minus test chemical) and negative control medium (minus hCG and test chemical) both
contained 0.05% ethanol. Tissues were incubated at 34°C in 2.0 ml siliconized tubes rotated at ~10
rpm. At three time points, 0.5, 1.5, and 2.5 hr, the media was removed and replenished with fresh
media containing the appropriate triazole and dose concentration. All treatments were replicated in
triplicate for each adult rat.

H295R cells. H295R human adrenocortical carcinoma cells were obtained from the American
Type Culture Collection (ATCC CRL-2128; ATCC, Manassas, VA) and grown in 75 cm2 flasks with
12.5 ml of supplemented medium at 37°C with a 5% CO2 atmosphere. H295R cultures were
performed by Dr. Xiaowei Zhang in the Department of Zoology at Michigan State University, East
Lansing, MI. Supplemented medium was a 1:1 mixture of Dulbecco's modified Eagle's medium
with Ham's F-12 Nutrient mixture (Sigma, St. Louis, MO) with 15 mM HEPES buffer. The medium
was supplemented with 1.2 g/L Na2CO3, ITS+ Premix (1 ml Premix/100 ml medium), and 12.5 ml/
500 ml NuSerum (BD Biosciences, San Jose, CA). Final component concentrations in the medium
were as follows: 15 mM HEPES, 6.25 µg/ml insulin, 6.25 µg/ml transferrin, 6.25 ng/ml selenium,
1.25 mg/ml bovine serum albumin, 5.35 µg/ml linoleic acid, and 2.5% NuSerum. The medium was
changed two to three times per week and cells were detached from flasks for subculturing by use of
trypsin/EDTA (Sterile 1X Trypsin-EDTA; Life Technologies Inc., Grand Island, NY). Cells were
exposed to triazoles in 6-well Tissue Culture Plates (Nalgene Nunc Inc., Rochester, NY). To
minimize potential effects of hormones in the serum during exposures, NuSerum was replaced with
2.5% charcoal dextran treated FBS (HyClone Laboratories, Inc., Logan, UT) immediately before cells
were exposed to triazoles at 1, 3, 10, 30, or 100 µM concentration in dimethyl sulfoxide (DMSO;
Sigma-Aldrich, St. Louis, MO) for 48 h. At the end of this culture period media was collected and
shipped frozen to EPA for hormone measurements.
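
The supplement volumes and the final concentrations quoted for the H295R medium can be cross-checked with two lines of dilution arithmetic. The sketch below is a consistency check under stated assumptions: the helper names are hypothetical, the stated volumes (500 ml and 100 ml) are treated as the final volumes, and the stock concentration it prints is only what the quoted numbers imply, not a vendor specification.

```python
def percent_v_v(added_ml, total_ml):
    """Volume percent of a supplement in the finished medium."""
    return 100.0 * added_ml / total_ml


def implied_stock(final_conc, dilution_factor):
    """Stock concentration implied by a final concentration and a fold dilution."""
    return final_conc * dilution_factor


# NuSerum: 12.5 ml per 500 ml of medium.
nu_serum_percent = percent_v_v(12.5, 500.0)

# ITS+ Premix: 1 ml per 100 ml (a 100-fold dilution) giving 6.25 ug/ml insulin.
insulin_stock_ug_per_ml = implied_stock(6.25, 100.0)

print(nu_serum_percent, insulin_stock_ug_per_ml)
```

The first value reproduces the 2.5% NuSerum figure given in the text. The second shows that the quoted final insulin concentration corresponds to a premix about 100 times more concentrated.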
56
57
58
59
60
                                                                                                 6
                       URL: http:/mc.manuscriptcentral.com/sbrm  Email: AARS@comcast.net
                                   Previous  I     TOC

-------
Page 8 of 30                         Systems Biology in Reproductive Medicine


1
2
3         H295R cell viability. Protocol is adapted from Hilscherova et al., 2004.  Cells were visually
4
5         inspected under a microscope to evaluate viability and cell numbers. Cell viability was determined
6

          using the live/dead cell viability kit (Molecular Probes, Eugene, OR).

Hormone Measurements. Rat intratesticular levels of testosterone, androstenedione,
17alpha-hydroxyprogesterone, and progesterone were measured from the culture media using
Coat-A-Count 125I radioimmunoassay kits (Diagnostic Products Corporation, Los Angeles, CA). Serum
LH was measured using the rat dissociation-enhanced lanthanide fluorometric immunoassay (DELFIA)
(Haavisto et al., 1993). Serum prolactin (PRL) was measured by radioimmunoassay according to
manufacturer's instructions, using materials supplied by the National Hormone and Pituitary Agency
for LH and PRL: iodination preparation I-6, reference preparation RP-3, and antisera S-9. Iodination
material was radiolabelled with 125I (DuPont/New England Nuclear) by a modification of the
chloramine-T method (Greenwood et al., 1963). Con6 control standards (Diagnostic Products
Corporation, Los Angeles, CA) were used to verify assay quality. Lactate dehydrogenase (LDH) levels
in the media, an indicator of cytotoxicity, were measured using an LDH detection kit (Roche
Diagnostics, Indianapolis, IN). Estradiol, progesterone, and testosterone levels in the H295R
culture media were assayed in duplicate using appropriate Coat-A-Count radioimmunoassay (RIA) kits
(Diagnostic Products Corporation, Los Angeles, CA) according to manufacturer's instructions.

Statistical analysis. The average of the three replicates for each adult rat testis was used for
statistical analysis. The litter was the unit of analysis for the neonatal culture. Rat in vitro
hormone data and LDH data, both log10 transformed, were analyzed using the SAS GLM procedure (SAS
Institute Inc., Cary, NC). Since each adult or litter was exposed to all eight treatments within the
testis culture assay, animal or litter was used as a blocking factor within analyses of variance.
Pairwise t-tests were used to test for any differences between treatment groups and the control. Rat
in vivo hormone data (log10 transformed), body and tissue weight data were analyzed using a two-tailed
t-test. Statistical significance between control and treatment was set at p<0.05. H295R hormone data
were analyzed using ANOVA; measures with p<0.05 were considered significant. Student's t-test was
used for further comparisons between control and treatment groups.
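
The blocked analysis described in this section can be sketched in a few lines. The following is a minimal illustration only, not the authors' SAS GLM code: it log10-transforms hormone values and computes the treatment F statistic for a randomized complete block design, with animal or litter as the block. The function name `blocked_anova_f` and the example values are hypothetical.

```python
import math

def blocked_anova_f(rows):
    """Treatment F statistic for a randomized complete block design.

    rows: one list per block (animal or litter), holding one raw hormone
    value per treatment, in the same treatment order for every block.
    Values are log10-transformed before analysis.
    Returns (F, df_treatment, df_error).
    """
    logged = [[math.log10(v) for v in row] for row in rows]
    n_blocks, n_trt = len(logged), len(logged[0])
    grand = sum(map(sum, logged)) / (n_blocks * n_trt)
    block_means = [sum(row) / n_trt for row in logged]
    trt_means = [sum(row[j] for row in logged) / n_blocks for j in range(n_trt)]

    # Partition the total sum of squares into block, treatment and error parts.
    ss_total = sum((x - grand) ** 2 for row in logged for x in row)
    ss_block = n_trt * sum((m - grand) ** 2 for m in block_means)
    ss_trt = n_blocks * sum((m - grand) ** 2 for m in trt_means)
    ss_error = ss_total - ss_block - ss_trt

    df_trt = n_trt - 1
    df_error = (n_trt - 1) * (n_blocks - 1)
    f_stat = (ss_trt / df_trt) / (ss_error / df_error)
    return f_stat, df_trt, df_error

# Hypothetical values (not study data): 3 animals (blocks) x 3 treatments.
example = [[10.0, 8.0, 5.0], [12.0, 9.0, 7.0], [11.0, 10.0, 4.0]]
f_stat, df_trt, df_error = blocked_anova_f(example)
print(f"F({df_trt}, {df_error}) = {f_stat:.2f}")
```

Multiplying every value in one block by a constant leaves F unchanged, which illustrates why blocking on log-transformed data is useful here: differences in overall hormone level between animals or litters are removed before treatments are compared.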

Results

In vivo triadimefon effects. The mean dose received was 126 mg triadimefon/kg body weight/day.
Feed intake of treated males was on average 10% less than the control males over the course of
dosing. During the first week of treatment, body weights were significantly decreased 8-10% at 1800
ppm triadimefon compared to the controls; however, the rate of body weight gain was similar
between treatment groups throughout the remainder of the study (Figure 1). Four individuals (one
control and three treated) were removed from the study due to factors not related to treatment.

Liver weights were increased (27% absolute, 37% adjusted for body weight) after 30 days of
exposure (Table 1). Absolute pituitary and paired epididymal weights were decreased (9.8% and
5.2%, respectively) at necropsy. There were no treatment effects on the androgen-dependent tissues
(ventral prostate and seminal vesicle). Serum testosterone levels were unaffected following 2 weeks
of exposure; however, they were increased after 4 weeks of exposure, along with intra-testicular
testosterone levels (Table 1). Serum levels of estradiol, LH, and prolactin were elevated, but not
significantly.

Rat in vitro testis cultures. Testosterone production remained fairly constant (<11% change) in the
positive control adult testis tissue after each successive time point (0.5, 1.5, and 2.5 hr; hCG
stimulated) (Figure 2A). Testosterone levels increased ~88% between the 0.5 and 1.5 hr time points
in the positive control neonatal testis, and remained constant between the 1.5 and 2.5 hr time points
(Figure 2B). Accounting for incubation time, testosterone production decreased slowly after each
successive time point in the negative control adult and neonatal testis cultures that were not
administered hCG and triazole treatment. The control data suggest hCG continued to stimulate
testosterone production in both adult and neonatal testis cultures, and testosterone production
decreased over time without hCG in the testis cultures.

Following administration of each triazole in the adult testis cultures, testosterone levels were
reduced by a statistically significant amount only at the highest concentration tested (Figure 2A).
Testosterone levels were reduced by myclobutanil or triadimefon by less than 50% following 100
µM treatment compared to the respective control at 0.5 h with hCG. At 1.5 and 2.5 h, testosterone
levels decreased 63-68% following 100 µM triadimefon treatment and ~50% following 100 µM
myclobutanil treatment (Figure 2A). In the neonatal testis culture, testosterone levels were reduced
following myclobutanil (100 µM) and triadimefon (10 and 100 µM) treatment, which suggests that
triadimefon might be the stronger inhibitor of the two (Figure 2B). The reduction in neonatal
testosterone production by myclobutanil and triadimefon was more pronounced at the 0.5 h time point
than at the 1.5 and 2.5 h time points. This may be due to testosterone levels increasing over time,
as was seen in the control groups containing hCG. Neonatal testosterone production was reduced ~57%
by triadimefon (10 and 100 µM) at 1.5 and 2.5 h. The 100 µM myclobutanil treatment reduced neonatal
testosterone production by 65% and 40% at 1.5 and 2.5 h, respectively. The variability among the
time points was higher in the neonatal testis culture, likely due to the use of the litter vs.
individual adults for statistical analysis.

The hormone pattern after chemical treatment suggests CYP17A1 was inhibited. Media levels of
androstenedione were reduced by both chemicals in the adult (Figure 3A) and neonatal (Figure 3B)
testis cultures, but not at all concentrations and time points that reduced testosterone levels. In
the adult testis cultures, the levels of 17alpha-hydroxyprogesterone (Figure 4A) and progesterone
(Figure 5A) were increased by myclobutanil and triadimefon at lower concentration levels than those
that affected the androgens. In the neonatal testis cultures, the levels of
17alpha-hydroxyprogesterone (Figure 4B) and progesterone (Figure 5B) were also increased. The
progestins were not detected in the media of the neonatal testis cultures at all time points. LDH
levels were variable in the adult testis culture, which made cytotoxicity evaluation difficult.
Analysis of the log-transformed LDH data found no significant differences among the treatment
groups, with the exception of 100 µM triadimefon at the 2.5 hour time point. At this concentration
and time point, LDH levels were significantly lower than the control (p < 0.011, data not shown).
Cytotoxicity in the neonatal testis cultures could not be evaluated since LDH was not detected in
these cultures.

H295R cell viability. No adverse effects on cell growth or viability were observed following triazole
treatments (data not shown).

H295R hormone assays. Relative changes in estradiol, progesterone and testosterone levels by
myclobutanil, propiconazole, or triadimefon in H295R cells are shown in Figure 6A/B. Myclobutanil
reduced the levels of estradiol in a concentration-dependent manner. Propiconazole and triadimefon
produced an increase in estradiol at the lower doses of 1 and 3 µM but a decrease as the dose
increased. All three triazoles decreased levels of progesterone across the doses assessed, except
that progesterone returned to control levels at the highest dose of triadimefon. Testosterone levels
were decreased by all three triazoles in a consistent manner. In addition to the overall decrease in
hormone levels, estradiol levels were consistently greater than progesterone and testosterone levels
following propiconazole and triadimefon treatment, indicating a possible disruption in the conversion
of estradiol, i.e. CYP17A1 activity. Established serum standards, a low, medium, and high control,
were used to quantitate variation among hormone assays. The intra-assay coefficient of variation
(CoV) range was 1-13%, and the inter-assay CoV was 14-19% among the low, medium, and high
controls for testosterone assays. The androstenedione intra-assay CoV range was 0-9% and
inter-assay CoV was 7-17%. The 17alpha-hydroxyprogesterone intra-assay CoV range was 0-12% and
inter-assay CoV was 2-11%. The progesterone intra-assay CoV range was 2-12% and inter-assay
CoV was 19-21%.
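
For readers unfamiliar with the statistic, the coefficient of variation is the standard deviation expressed as a percentage of the mean. The sketch below is illustrative only: the function names and the control readings are hypothetical, not the study's data, and the exact formula the authors used is an assumption. Intra-assay CoV is computed here from replicates within one run, and inter-assay CoV from the means of separate runs.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)

def intra_and_inter_assay_cov(runs):
    """runs: list of assay runs, each a list of replicate readings of one control.

    Returns (intra-assay CoVs, one per run; inter-assay CoV of the run means).
    """
    intra = [coefficient_of_variation(run) for run in runs]
    inter = coefficient_of_variation([mean(run) for run in runs])
    return intra, inter

# Hypothetical readings of a "medium" control in three separate assays.
runs = [[9.0, 10.0, 11.0], [11.0, 12.0, 13.0], [8.0, 8.5, 9.0]]
intra, inter = intra_and_inter_assay_cov(runs)
print([round(c, 1) for c in intra], round(inter, 1))
```

Reporting both values separates replicate noise within a single kit run from drift between runs, which is why the inter-assay ranges quoted in the text tend to be the larger of the two.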

Discussion

Toxicology studies using animals and in vitro cellular or tissue preparations have been used to
study the toxic effects and mechanism of action of chemicals, to determine the effective and safe dose
of drugs in humans, and to assess the risk of toxicity from chemical exposures. In vitro testing
allows for a specific evaluation of a chemical's ability to alter the synthesis of steroids; however,
the absorption, distribution, metabolism, and elimination (ADME) of the chemical are not accounted
for in such testing (Gray et al., 1997). It has been demonstrated that several conazoles disrupt CYP
enzymes, including steroidogenic CYPs, and consequently there is concern that inadvertent exposure
to agricultural conazoles may inhibit steroidogenesis and adversely affect reproduction in humans
and other mammalian species (Zarn et al., 2003). In the present study we examined each chemical's
ability to affect testosterone production, which may be helpful in understanding the mode of action
for the reproductive effects observed in toxicology studies of conazoles. This series of experiments
was designed to address the hypothesis that the triazoles myclobutanil and triadimefon elicit their
reproductive effects through inhibition of testosterone production in the testis.

Results from this set of experiments demonstrate that all three triazoles were weak inhibitors
of testosterone production in vitro and suggest that, at least for myclobutanil and triadimefon,
inhibition of CYP17A1 occurs in vitro. Inhibition of CYP17A1 has been reported with several
imidazole compounds (Ayub and Levell, 1987; Engelhardt et al., 1991), and triazole compounds such
as hexaconazole (Lloyd, 1991) and flusilazole (Inchem IPCS, 1995). Inhibition of the LH signal
transduction pathway, as stimulated by hCG, may have contributed to lowered testosterone
production. Although this hypothesis was not tested in our study design, the increased progestin
levels in the cultures treated with myclobutanil or triadimefon suggest that hCG stimulation of the LH
pathway was not altered.

The data suggest that changes in hormone levels were not due to direct cytotoxicity. Indirect
evidence from the progestin hormone data suggests an absence of cytotoxicity; levels of these
hormones continued to rise with increasing concentrations of the triazoles. This effect was also
observed in the adult testis culture, where no significant cytotoxicity was observed.

Inhibition of testosterone synthesis was the hypothesis used to explain the increased incidence
of Leydig cell tumors observed with the triazoles hexaconazole and flusilazole, and a similar
hypothesis was proposed that ketoconazole would induce similar tumors by the same mechanism if
tested under the USEPA guideline criteria (i.e., maximum tolerated dose and length of exposure)
(Cook et al., 1999). Triadimefon and myclobutanil appear to inhibit CYP17A1 in vitro and decrease
testosterone levels in vitro, but increase serum testosterone levels in vivo and do not induce Leydig
cell tumors. The differences among the triazoles and imidazoles may be a result of the strength with
which different conazoles bind and inhibit metabolizing and/or steroidogenic CYPs. This marked
reduction in testosterone production by individual conazoles is likely due to a threshold dose
response as part of the mode of action for the production of Leydig cell tumors in rats.

In the dietary exposure experiment, triadimefon increased serum testosterone levels, similar to
previous reports on triadimefon, myclobutanil and propiconazole (Goetz et al., 2007; Inchem IPCS,
1985; Tully et al., 2006). Serum testosterone levels were not increased during the first two weeks of
dosing, suggesting this effect developed over time. The increased intra-testicular level of
testosterone by triadimefon suggests increased testis testosterone production contributes to the
increase in serum testosterone. In addition, a treatment-related hepatic adaptive response
(presumably causing a decrease in liver testosterone metabolism and clearance) also likely
contributed to the increased circulating levels of testosterone.

The in vitro data demonstrated the opposite effect on testosterone production, producing a
discrepancy between the in vitro and in vivo results. Several triazole compounds have been examined
in both in vivo and in vitro studies, such as fluconazole (Hanger et al., 1988) and R76713 and its
enantiomers (Wouters et al., 1990). In both cases, the inhibitory effects observed in vitro on
androgen synthesis and the concomitant increase in the precursor progestins (Wouters et al., 1990),
which are indicative of some effect on the CYP17A1 enzyme, were not repeated in the in vivo
experiments. As noted in these other studies, it is likely that the in vitro triadimefon concentrations
which significantly inhibit testosterone production (100 µM) are not achieved through dietary
exposure in the adult rat, and another mechanism of action is stimulating the increased serum
testosterone levels at lower concentrations of triadimefon.

A similar situation occurs with the triazoles tebuconazole and epoxiconazole (Taxvig et al.,
2007, 2008). Reproductive toxicity studies with tebuconazole and epoxiconazole demonstrated
female virilization (increased anogenital distance) by both triazoles; fetal male feminization
following exposure to tebuconazole with concomitant decrease in fetal testosterone levels; and
increased testosterone levels in the dams following epoxiconazole, which was observed following
myclobutanil exposure to pregnant dams as well (Goetz et al., 2007). Although the overall results
from these reproductive toxicity studies show that many of the azole fungicides behave similarly
following gestational exposure, the profile of action in vivo varies. The route of exposure (oral);
method of exposure, whether dietary or by gavage with a vehicle such as corn oil; and duration of
exposure will all certainly have a significant impact on the outcome of the studies. It is important to
define the effects observed at the high dose levels in these reproductive toxicity studies and
extrapolate them to low dose effects.

The mechanism of action responsible for the increased serum and testis testosterone levels is
not clear. The elevated testosterone levels would be expected to reduce LH levels through a negative
feedback loop. However, triadimefon exposure did not significantly alter serum LH, suggesting that
the hypothalamic-pituitary-gonadal (HPG) axis may be altered. Triadimefon could be acting as an
androgen receptor (AR) antagonist to stimulate increased testosterone production, but this is unlikely
since high levels of triadimefon/triadimenol are needed to inhibit AR function (Okubo et al., 2004).
In addition, the androgen-dependent ventral prostate and seminal vesicle weights were unaffected in
this study, and other studies do not show much evidence of AR antagonism in androgen-dependent
tissues (Goetz et al., 2007; Tully et al., 2006). Triadimefon has been reported to be an aromatase
inhibitor in vitro, but this inhibition was not evident in serum estradiol within our study and others
(Goetz et al., 2007; Tully et al., 2006), which suggests that an altered HPG regulation from reduced
estradiol levels did not occur.

One explanation for a disrupted HPG axis that has not been explored is by means of altered
neurotransmitters within the hypothalamus. Triadimefon exposure has been shown to affect behavior,
presumably by altering neurotransmitters within the brain (Crofton et al., 1988; Reeves et al., 2004a,
2004b; Walker and Mailman, 1996). If triadimefon is affecting neurotransmitters within the
hypothalamus, this mechanism could disrupt the HPG axis. This hypothesis was investigated by
measuring serum PRL levels as a proxy of altered dopamine levels within the hypothalamus (Waeber
et al., 1983), but there was no effect. The mechanism by which triadimefon increases testis
testosterone production requires further investigation and should include examination of
hypothalamic-pituitary axis regulation.

In summary, triadimefon, myclobutanil and propiconazole show weak inhibition of
testosterone production in vitro in rat testis cultures, and the hormone data suggest inhibition of
CYP17A1. However, in vivo triadimefon increased rat testis testosterone production and serum
testosterone levels. The mechanism of action for increased testis testosterone levels in vivo is
unresolved, but may involve a disruption of the HPG axis. Further studies into the effects of
triadimefon and other triazoles on the pituitary, hypothalamus, testis, and steroidogenesis will be

needed to fully elucidate the dose-response of mechanisms relevant to human health risk assessments.



Acknowledgements:

AKG was supported by U.S. EPA and N.C. State University Cooperative Training Agreement No.

CT826512010.

References

Andersen, H.R., Vinggaard, A.M., Rasmussen, T.H., Gjermandsen, I.M., Bonefeld-Jørgensen, E.C.
(2002). Effects of Currently Used Pesticides in Assays for Estrogenicity, Androgenicity, and
Aromatase Activity In Vitro. Toxicology and Applied Pharmacology. 179, 1-12.

Ayub M. and Levell M. J. (1987). Inhibition of Testicular 17alpha-hydroxylase and 17,20 lyase but
not 3beta-hydroxysteroid dehydrogenase or 17beta-hydroxysteroid oxidoreductase by Ketoconazole
and Other Imidazole Drugs. Journal of Steroid Biochemistry. 28, 521-531.

Cook J. C., Klinefelter G. R., Hardisty J. H., Sharpe R. M., Foster P. M. D. (1999). Rodent Leydig
Cell Tumorigenesis: A Review of the Physiology, Pathology, Mechanisms, and Relevance to
Humans. Critical Reviews in Toxicology. 29, 169-261.

Crofton K. M., Boncek V. M., Reiter L. W. (1988). Hyperactivity Induced by Triadimefon, A
Triazole Fungicide. Fundamental and Applied Toxicology. 10, 459-465.

Engelhardt D., Weber M. M., Miksch T., Abedinpour F., Jaspers C. (1991). The Influence of
Ketoconazole on Human Adrenal Steroidogenesis: Incubation Studies with Tissue Slices. Clinical
Endocrinology. 35, 163-168.

Ghannoum, M.A. and Rice, L.B. (1999). Antifungal Agents: Modes of Action, Mechanisms of
Resistance, and Correlations of the Mechanisms with Bacterial Resistance. Clinical Microbiology
Reviews. 12, 501-517.

Goetz, A.K., Ren, H., Schmid, J.E., Blystone, C.R., Thillainadarajah, I., Best, D.S., Nichols, H.,
Strader, L.F., Narotsky, M.G., Wolf, D.C., Rockett, J.C., Dix, D.J. (2007). Disruption of Testosterone
Homeostasis as a Mode of Action for the Reproductive Toxicity of Triazole Fungicides in the Male
Rat. Toxicological Sciences. 95, 227-239.

Gray L. E., Kelce W. R., Wiese T., Tyl R., Gaido K., Cook J., Klinefelter G., Desaulniers D., Wilson
E., Zacharewski T., Waller C., Foster P., Laskey J., Peel J., Giesy J., Laws S., McLachlan J., Breslin
W., Cooper R., Giulio R. Di, Johnson R., Purdy R., Mihaich E., Safe S., Sonnenschein C., Welshons
W., Miller R., McMaster S., Colborn T. (1997). Endocrine Screening Methods Workshop: Detection
of Estrogenic and Androgenic Hormonal and Antihormonal Activity for Chemicals that Act Via
Receptor or Steroidogenic Enzyme Mechanisms. Reproductive Toxicology. 11, 719-750.

Greenwood F. C., Hunter W. M., Glover J. S. (1963). The Preparation of 131I-Labelled Human
Growth Hormone of High Specific Radioactivity. The Biochemical Journal. 89, 114-123.

Hanger, D.P., Jevons, S., and Shaw, J.T.B. (1988). Fluconazole and Testosterone: In Vivo and In
Vitro Studies. Antimicrobial Agents and Chemotherapy. 32, 646-648.

Haavisto AM, Pettersson K, Bergendahl M, Perheentupa A, Roser JF, Huhtaniemi I. (1993). A
Supersensitive Immunofluorometric Assay for Rat Luteinizing Hormone. Endocrinology. 132,
1687-1691.

Hilscherova, K., Jones, P.D., Gracia, T., Newsted, J.L., Zhang, X., Sanderson, J.T., Yu, R.M.K., Wu,
R.S.S., Giesy, J.P. (2004). Assessment of the effects of chemicals on the expression of ten
steroidogenic genes in the H295R cell line using real-time PCR. Toxicological Sciences. 81, 78-89.

Inchem IPCS, FAO/WHO (1985). Joint Meeting on Pesticide Residues: 733. Triadimefon, Pesticide
residues in food evaluations Part II Toxicology.

Inchem IPCS, FAO/WHO (1990). Joint Meeting on Pesticide Residues: 810. Hexaconazole,
Pesticide Residues in Food Evaluations in Toxicology.

Inchem IPCS, FAO/WHO (1995). Joint Meeting on Pesticide Residues: 896. Flusilazole, Pesticide
Residues in Food Evaluations Part II Toxicological & Environmental.

Lloyd S. C. (1991). Effects of hexaconazole (ICIA523) on the steroidogenic function of isolated rat
and human Leydig cells. ICI Central Toxicology Laboratory.

Okubo T., Yokoyama Y., Kano K., Soya Y., Kano I. (2004). Estimation of the Estrogenic and
Antiestrogenic Activities of Selected Pesticides by MCF-7 Cell Proliferation Assay. Archives of
Environmental Contamination and Toxicology. 46, 445-453.

Reeves R., Thiruchelvam M., Cory-Slechta D. A. (2004a). Development of Behavioral Sensitization
to the Cocaine-Like Fungicide Triadimefon is Prevented by AMPA, NMDA, DA D1 but not DA D2
Receptor Antagonists. Toxicological Sciences. 79, 123-136.

Reeves R., Thiruchelvam M., Cory-Slechta D. A. (2004b). Expression of Behavioral Sensitization to
the Cocaine-Like Fungicide Triadimefon is Blocked by Pretreatment with AMPA, NMDA and DA
D1 Receptor Antagonists. Brain Research. 1008, 155-167.

Taxvig, C., Hass, U., Axelstad, M., Dalgaard, M., Boberg, J., Andersen, H.R., Vinggaard, A.M.
(2007). Endocrine-Disrupting Activities In Vivo of the Fungicides Tebuconazole and Epoxiconazole.
Toxicological Sciences. 100, 464-473.

Taxvig, C., Vinggaard, A.M., Hass, U., Axelstad, M., Metzdorff, S., Nellemann, C. (2008).
Endocrine-Disrupting Properties In Vivo of Widely Used Azole Fungicides. International Journal of
Andrology. 31, 170-177.

Trosken, E.R., Schloz, K., Lutz, R.W., Volkel, W., Zarn, J.A., Lutz, W.K. (2004). Comparative
Assessments of the Inhibition of Recombinant Human CYP19 (aromatase) by Azoles Used in
Agriculture and as Drugs for Humans. Endocrine Research. 30, 387-394.

Tully, D.B., Bao, W., Goetz, A.K., Blystone, C.R., Ren, H., Schmid, J.E., Strader, L.F., Wood, C.R.,
Best, D.R., Narotsky, M.G., Wolf, D.C., Rockett, J.C., Dix, D.J. (2006). Gene Expression Profiling
in Liver and Testis of Rats to Characterize the Toxicity of Triazole Fungicides. Toxicology and
Applied Pharmacology. 215, 260-273.

Vinggaard, A.M., Hnida, C., Breinholt, V., Larson, J.C. (2000). Screening of Selected Pesticides for
Inhibition of CYP19 Aromatase Activity In Vitro. Toxicology In Vitro. 14, 227-234.

Waeber C., Reymond O., Reymond M., Lemarchand-Beraud T. (1983). Effects of Hyper- and
Hypoprolactinemia on Gonadotrophin Secretion, Rat Testicular Luteinizing Hormone/Human
Chorionic Gonadotropin Receptors and Testosterone Production by Isolated Leydig Cells. Biology of
Reproduction. 28, 167-177.

Walker Q. D. and Mailman R. B. (1996). Triadimefon and Triadimenol: Effects on Monoamine
Uptake and Release. Toxicology and Applied Pharmacology. 139, 227-233.

Wouters, W., De Coster, R., van Dun, J., Krekels, M.D.W.G., Dillen, A., Raeymaekers, A., Freyne,
E., Van Gelder, J., Sanz, G., Venet, M., Janssen, M. (1990). Comparative Effects of the Aromatase
Inhibitor R76713 and of its Enantiomers R83839 and R83842 on Steroid Biosynthesis In Vitro and In
Vivo. Journal of Steroid Biochemistry and Molecular Biology. 37, 1049-1054.

Zarn, J.A., Bruschweiler, B.J., Schlatter, J.R. (2003). Azole Fungicides affect Mammalian
Steroidogenesis by Inhibiting Sterol 14alpha-demethylase and Aromatase. Environmental Health
Perspectives. 111, 255-261.

-------
Page 22 of 30
                         Systems Biology in Reproductive Medicine
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
Table 1: Weight and hormone measurements (± SEM) from control and treated animals following 30 days dietary exposure to triadimefon.

Parameter                                Control (0 ppm)    Triadimefon (1800 ppm)    Percent change (% of control)
Body weight (g)                          366.08 ± 5.63      338.25 ± 5.84**           92.4
Liver weight (g)                         12.977 ± 0.282     16.499 ± 0.447***         127.1
Liver weight adjusted for body weight    3.54               4.88                      137.8
Testis weight (g)                        3.474 ± 0.055      3.485 ± 0.066             -
Epididymides (g)                         1.162 ± 0.016      1.102 ± 0.020*            94.8
Ventral prostate (g)                     0.379 ± 0.020      0.371 ± 0.017             -
Seminal vesicle (g)                      1.148 ± 0.044      1.133 ± 0.052             -
Pituitary (mg)                           10.2 ± 0.02        9.2 ± 0.02*               90.2
Serum T (14 day) (ng/mL)                 2.63 ± 0.60        2.72 ± 0.46               -
Serum T (30 day) (ng/mL)                 2.15 ± 0.39        6.20 ± 1.26**             288.3
Intra-testicular T (ng/mL)               33.70 ± 5.71       70.76 ± 12.05**           210.0
LH (ng/mL)                               0.433 ± 0.069      0.680 ± 0.128             -
Estradiol (pg/mL)                        15.04 ± 1.30       18.08 ± 1.10              -
Prolactin (ng/mL)                        4.81 ± 0.82        8.00 ± 1.99               -

Control animals (n = 14) and treated animals (n = 12); * p < 0.05, ** p < 0.01, *** p < 0.0001.
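
The percent-change entries in Table 1 appear to express the treated group mean as a percentage of the control mean (for example, 338.25 / 366.08 gives 92.4). The short Python sketch below reproduces that arithmetic for a few rows. The function name and dictionary layout are illustrative assumptions, not part of the manuscript, and the means are the rounded values printed in the table.

```python
# Group means copied from Table 1: (control 0 ppm, triadimefon 1800 ppm).
# Only the rounded means printed in the table are available here.
table1_means = {
    "Body weight (g)": (366.08, 338.25),
    "Liver weight (g)": (12.977, 16.499),
    "Serum T, 30 day (ng/mL)": (2.15, 6.20),
    "Intra-testicular T (ng/mL)": (33.70, 70.76),
}


def percent_of_control(control_mean: float, treated_mean: float) -> float:
    """Return the treated mean expressed as a percentage of the control mean."""
    return 100.0 * treated_mean / control_mean


for parameter, (control, treated) in table1_means.items():
    value = percent_of_control(control, treated)
    print(f"{parameter}: {value:.1f}% of control")
```

Because the table reports rounded means, a recomputed value can differ from the printed one in the last digit (the 30-day serum T row is one such case).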
          FIGURE LEGENDS

          Figure 1: Body weight of control and triadimefon treated rats over the course of dosing. Solid line = control, dashed line = triadimefon 1800 ppm treatment. ** = p < 0.01, *** = p < 0.001.

          Figure 2: In Vitro testosterone production by the adult (A) and neonatal (B) testis after myclobutanil (Myc) and triadimefon (Tri) exposure. Asterisks indicate a significant difference (* p < 0.05, ** p < 0.01, and *** p < 0.001) between the treatment group and control group ((+), hCG with no chemical) at each time point. (-) refers to tissue not stimulated by hCG and without chemical treatment.

          Figure 3: In Vitro androstenedione production by the adult (A) and neonatal (B) testis after myclobutanil (Myc) and triadimefon (Tri) exposure. Asterisks indicate a significant difference (* p < 0.05, ** p < 0.01, and *** p < 0.001) between the treatment group and control group ((+), hCG with no chemical) at each time point. (-) refers to tissue not stimulated by hCG and without chemical treatment.

          Figure 4: In Vitro 17alpha-hydroxyprogesterone production by the adult (A) and neonatal (B) testis after myclobutanil (Myc) and triadimefon (Tri) exposure. Asterisks indicate a significant difference (* p < 0.05, ** p < 0.01, and *** p < 0.001) between the treatment group and control group ((+), hCG with no chemical) at each time point. (-) refers to tissue not stimulated by hCG and without chemical treatment.

          Figure 5: In Vitro progesterone production by the adult (A) and neonatal (B) testis after myclobutanil (Myc) and triadimefon (Tri) exposure. Asterisks indicate a significant difference (* p < 0.05, ** p < 0.01, and *** p < 0.001) between the treatment group and control group ((+), hCG with no chemical) at each time point. (-) refers to tissue not stimulated by hCG and without chemical treatment.

          Figure 6: Hormone measures from H295R cell media. (A) Each panel presents progesterone, estradiol and testosterone changes, for each of the three triazoles (myclobutanil, propiconazole, triadimefon). (B) Each panel presents results with all three triazoles, for each of the three hormones. All data presented as percent change relative to controls. * p < 0.05, ** p < 0.01, *** p < 0.001.
Figure 1: [Plot not recovered. Series: Control; Triadimefon 1800 ppm. X-axis: Post-natal Day (60 to 90).]
Figure 2A: [Plot not recovered. X-axis: Treatments (Myc 1, Myc 10, Myc 100, Tri 1, Tri 10, Tri 100).]
Figure 2B: [Plot not recovered. X-axis: Treatments (Myc 1, Myc 10, Myc 100, Tri 1, Tri 10, Tri 100).]
Figure 3A: [Plot not recovered. X-axis: Treatments (Myc 1, Myc 10, Myc 100, Tri 1, Tri 10, Tri 100).]
Figure 3B: [Plot not recovered. X-axis: Treatments (Myc 1, Myc 10, Myc 100, Tri 1, Tri 10, Tri 100).]
Figure 4A: [Plot not recovered. X-axis: Treatments (Myc 1, Myc 10, Myc 100, Tri 1, Tri 10, Tri 100).]
Figure 4B: [Plot not recovered. X-axis: Treatments (Myc 1, Myc 10, Myc 100, Tri 1, Tri 10, Tri 100).]
Figure 5A: [Plot not recovered. X-axis: Treatments (Myc 1, Myc 10, Myc 100, Tri 1, Tri 10, Tri 100).]
Figure 5B: [Plot not recovered. X-axis: Treatments (Myc 1, Myc 10, Myc 100, Tri 1, Tri 10, Tri 100).]
Figure 6: [Plot not recovered. Panel A, y-axis in percent.]
-------
                                        Regulatory Toxicology and Pharmacology xxx (2009) xxx-xxx
                                             Contents lists available at ScienceDirect
                              Regulatory Toxicology and Pharmacology
                                   journal  homepage: www.elsevier.com/locate/yrtph
Evaluation of high-throughput genotoxicity assays  used  in  profiling  the US  EPA
ToxCast™ chemicals

Andrew W. Knight a,*, Stephen Little b, Keith Houck b, David Dix b, Richard Judson b, Ann Richard b, Nancy McCarroll c, Gregory Akerman c, Chihae Yang d, Louise Birrell a, Richard M. Walmsley a
a Gentronix Ltd., CTF Building, 46 Grafton Street, Manchester, M13 9NT, UK
b National Center for Computational Toxicology (D343-03), Office of Research and Development, US Environmental Protection Agency, Research Triangle Park, NC 27711, USA
c Health Effects Division, Office of Pesticide Programs, US Environmental Protection Agency, 1200 Pennsylvania Ave., NW (MC 7509P), Washington, DC 20460, USA
d Office of Food Additive Safety (HFS-275), Center for Food Safety and Applied Nutrition, US Food and Drug Administration, College Park, MD 20740, USA
ARTICLE   INFO

Article history:
Received 27 April 2009
Available online xxxx

Keywords:
Genotoxicity
In vitro
High-throughput screening
ToxCast
Pesticides
Hazard assessment
GreenScreen HC
CellCiphr
CellSensor
p53
GADD45 alpha
ABSTRACT
Three high-throughput screening (HTS) genotoxicity assays, GreenScreen HC GADD45a-GFP (Gentronix Ltd.), CellCiphr p53 (Cellumen Inc.) and CellSensor p53RE-bla (Invitrogen Corp.), were used to analyze the collection of 320 predominantly pesticide active compounds being tested in Phase I of the U.S. Environmental Protection Agency's ToxCast™ research project. Between 9% and 12% of compounds were positive for genotoxicity in the assays. However, results of the varied tests only partially overlapped, suggesting a strategy of combining data from a battery of assays. The HTS results were compared to mutagenicity (Ames) and animal tumorigenicity data. Overall, the HTS assays demonstrated low sensitivity for rodent tumorigens, likely due to screening at a low concentration, coverage of only selected genotoxic mechanisms, lack of metabolic activation and difficulty detecting non-genotoxic carcinogens. Conversely, HTS results demonstrated high specificity, >88%. Overall concordance of the HTS assays with tumorigenicity data was low, around 50% for all tumorigens, but increased to 74-78% (vs. 60% for Ames) for those compounds producing tumors in rodents at multiple sites and, thus, more likely genotoxic carcinogens. The aim of the present study was to evaluate the utility of HTS assays to identify potential genotoxicity hazard in the larger context of the ToxCast project, to aid prioritization of environmentally relevant chemicals for further testing and assessment of carcinogenicity risk to humans.
© 2009 Elsevier Inc. All rights reserved.
* Corresponding author. Fax: +44 161 606 7337.
E-mail address: andrew.knight@gentronix.co.uk (A.W. Knight).

0273-2300/$ - see front matter © 2009 Elsevier Inc. All rights reserved.
doi:10.1016/j.yrtph.2009.07.004

1. Introduction

   The ToxCast™ project of the National Center for Computational Toxicology (NCCT) at the U.S. Environmental Protection Agency (EPA) aims to develop a cost-effective approach for rapidly prioritizing the toxicity assessment of large numbers of environmentally relevant chemicals (EPA ToxCast, 2009). Using data from state-of-the-art high-throughput screening (HTS) and high-content imaging bioassays, ToxCast is building computational models to forecast the potential toxicity of these compounds to humans. These hazard predictions will provide EPA regulatory programs with science-based information helpful in prioritizing compounds for more detailed toxicological evaluations and, ultimately, lead to more efficient use of animal testing. Investigating modern HTS assays for genotoxicity is a key element of the ToxCast project and an important component of toxicity risk assessment. The hypothesis is that useful and predictive data can be obtained by integrating the results from large batteries of individual HTS tests. Such integrated data will supplement traditional assays for evaluating chemical genotoxicity in support of carcinogenicity assessments, and provide useful insights into mechanisms of toxicity, which in turn will impact on risk assessment (Dix et al., 2007; Houck and Kavlock, 2008).
   To test this hypothesis, an initial set of 320 compounds was chosen based on a significant amount of accessible historical toxicological data and was designated as the ToxCast Phase I Chemical Dataset. Most of the ToxCast Phase I compounds are pesticide active compounds with hazard assessment toxicological data from the US EPA Office of Pesticide Programs (EPA OPP, 2009a).
   The regulatory requirement for mutagenicity testing to support a pesticide registration is found in the EPA's Code of Federal Regulations (CFR) Title 40 Part 158 (EPA CFR, 2009). In 1991, the Agency revised these guidelines to identify specific mutagenicity testing to be performed; these appear in the OPP Pesticide Assessment Guidelines, Subdivision F, Hazard Evaluation: Human and Domestic Animals. The Subdivision F guideline was revisited in 1996 by the Office of Prevention, Pesticides and Toxic Substances (OPPTS) and is presented as the OPPTS Harmonized Test Guidelines, Series 870 Health Effects Volume I-III (EPA OPPTS, 2009). The existing EPA test battery is a three-tiered system that includes tests for gene mutation in bacteria (Ames test); gene mutation in mammalian cells (e.g., mouse lymphoma, Chinese hamster ovary (CHO) cells, CHO strain AS52 and/or Chinese hamster V79 lung fibroblasts); and chromosomal aberrations in vitro (mammalian cells) or in vivo (mouse micronucleus assay). If positive results are obtained in one or more of the first tier assays, subsequent tests are performed to confirm these results; pesticides may also be assessed for potential germinal cell effects (Dearfield et al., 1991; Waters et al., 1993). This approach is similar to the strategy developed by the United Kingdom and is in general agreement with strategies devised by other European Economic Community nations (Cimino, 2006). Within the statutory limitations of the Federal Insecticide, Fungicide, and Rodenticide Act (FIFRA), such mutagenicity testing is required for all general use pesticides, including terrestrial, aquatic, greenhouse, forestry, domestic outdoor and indoor for both food crop and non-food uses (EPA FIFRA, 2009).
   Environmental regulators and researchers have looked to the pharmaceutical industry for modern techniques to assess genotoxicity (Muller et al., 1999; US Federal Register, 2008). However, the established genetic toxicity tests currently recognized by regulatory authorities, such as the Ames test, chromosome aberration test, in vitro micronucleus test or in vitro mouse lymphoma thymidine kinase mutation assay, are not suited to higher throughput compound screening as they require too much compound, have too low a throughput, and require too much time to conduct (Custer and Sweder, 2008). In vitro assays can be overly sensitive when used for broad chemical screening because they do not account for lower in vivo bioavailability and because in vitro systems lack deactivation mechanisms. In some cases, prokaryotes give negative results for compounds known to specifically interact with eukaryotic targets such as chromatin DNA and the different proteins involved in DNA metabolism or chromosome segregation (Walmsley, 2005), leading to low assay sensitivity and falsely negative predictions of in vivo hazard. Nevertheless, the Ames test is often used as a predictor of rodent carcinogenesis (Ames et al., 1973a,b; Matthews et al., 2006). Conversely, it has recently been recognized that in vitro mammalian genotoxicity assays, such as the micronucleus test, demonstrate low specificity (specificity being the ability to correctly identify rodent non-carcinogens), leading to falsely positive predictions of in vivo hazard (Kirkland et al., 2005, 2007). This may in part be explained by the highly toxic and physiologically unattainable doses applied in these assays in order to detect bona fide carcinogens (Kirkland, 1992). A prerequisite for the ToxCast project is that such assays should be in an HTS format so that the assessment of large numbers of compounds can be conducted efficiently and cost effectively, while also attempting to maintain accuracy in their prediction of in vivo hazard.
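
The terms sensitivity, specificity and concordance used in this discussion can be made concrete with a small Python sketch. It assumes assay calls are scored against rodent carcinogenicity outcomes in a 2 x 2 table; the function name and the example counts are hypothetical and are not taken from the study.

```python
def assay_performance(tp: int, fp: int, tn: int, fn: int):
    """Score assay calls against rodent carcinogenicity outcomes.

    tp: carcinogens called positive    fn: carcinogens called negative
    tn: non-carcinogens called negative fp: non-carcinogens called positive
    """
    sensitivity = tp / (tp + fn)                    # carcinogens correctly flagged
    specificity = tn / (tn + fp)                    # non-carcinogens correctly cleared
    concordance = (tp + tn) / (tp + fp + tn + fn)   # overall agreement
    return sensitivity, specificity, concordance


# Hypothetical counts chosen only to illustrate a low-sensitivity,
# high-specificity assay; they are not results from this paper.
sens, spec, conc = assay_performance(tp=10, fp=5, tn=95, fn=90)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} concordance={conc:.3f}")
```

With roughly equal numbers of carcinogens and non-carcinogens, an assay that misses most carcinogens but rarely flags non-carcinogens ends up with concordance near 50%, which is the pattern the text describes.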
   The pharmaceutical industry has employed scaled-down versions of regulatory assays and other screening approaches. High-throughput, microplate-based bacterial mutagenicity assays such as Ames II (Flückiger-Isler et al., 2004), SOS umuC Chromotest (Reifferscheid et al., 1991; Reifferscheid and Heil, 1996) and Vitotox (Muto et al., 2003) have been shown to be effective screening alternatives to the standard Ames test and have demonstrated high concordance with this test using selected compounds. The development of higher throughput, miniaturized versions of in vitro mammalian tests presents a greater challenge due to the fragility of the cell lines, the difficulty of automating labor-intensive multi-step processes and the requirement for manual, and often more subjective, scoring. This said, a protocol for a higher throughput micronucleus test using mouse lymphoma cells and automated scoring by flow cytometry has been developed (Bryce et al., 2008). However, the number of validation compounds and the diversity of their chemistry are currently limited, and scoring by flow cytometry will ultimately place a limit on the testing throughput attainable. There is also a microplate format thymidine kinase mutation assay, although the assay still requires many weeks for dose setting and the selection of revertants for counting (Chen and Moore, 2004).
   The purpose of this paper is to evaluate recently developed HTS methodologies from Gentronix Ltd., Cellumen Inc., and Invitrogen Corp., the latter performed by the National Institutes of Health Chemical Genomics Center (NCGC), in the context of the particular chemical space and larger aims of the ToxCast project. These assays each reflect different aspects of the cellular response to a genotoxic challenge, which can lead to DNA damage, mis-repair and mutations and, ultimately, to tumorigenesis and carcinogenesis.
   The Gentronix 'GreenScreen HC' assay uses a human lymphoblastoid TK6 cell line (Watanabe et al., 1995), which has been genetically modified by incorporating a green fluorescent protein (GFP) reporter based on the regulation of the human GADD45a (growth arrest and DNA damage) gene (Hastwell et al., 2006). GADD45a mediates the cell's responses to genotoxic stress and the GFP fluorescence reporter includes p53 regulatory elements that ensure specific and dose-dependent response from the gene reporter. Validation studies have shown that the assay responds positively to all classes of genotoxic damage, but unlike other genetic toxicity assays, appears particularly suited to achieving both high sensitivity and specificity for discriminating genotoxic rodent carcinogens from non-carcinogens and carcinogens acting by epigenetic mechanisms (Hastwell et al., 2006). The Cellumen 'CellCiphr' cytotoxicity profiling panel consists of fluorescent probes for 10 cell features of cytotoxicity, of which one is a DNA damage response in human HepG2 cells, measured by p53 activation via a fluorescent anti-p53 antibody (Vernetti et al., 2009). The Invitrogen 'CellSensor' assay uses a beta-lactamase reporter gene under the control of p53 response elements stably integrated into HCT-116 cells (Brattain et al., 1981). This test system employs proprietary 'GeneBLAzer' technology based on fluorescence resonance energy transfer (FRET) (Zlokarnik et al., 1998). p53 is known to act in a 'gate keeper' role during the cell cycle, ensuring genetic and cellular integrity (Lane, 1992). Through sequence specific and also non-specific binding, p53 can act in a variety of ways in the cell in response to genotoxic stress. When DNA damage is sensed it can activate DNA repair proteins, it can hold the cell cycle at the G1/S regulation checkpoint until repair is effective, or it can initiate apoptosis if the DNA damage cannot be repaired. Thus p53 is central to many of the cell's DNA damage response pathways and anti-cancer mechanisms (Liu and Kulesz-Martin, 2001). In this way the p53 assays serve as a broader screen and will also respond to the effects of cytotoxicity-induced cellular regeneration and neoplasia that can lead to tumor formation.
   Fundamentally, carcinogenesis, and to a lesser extent mutagenesis, are complex multistage and multi-pathway processes. There are more than 100 distinct types of cancer, and corresponding subtypes of organ specific tumors. In addition, cells need to pass through several separate physiological changes to develop the novel capabilities acquired during tumor development, and breach the cell's inherent defense mechanisms (Hanahan and Weinberg, 2000). Genotoxicity, i.e., direct or indirect DNA damage which may result in DNA mutation or changes in chromosome structure or number, is only one dimension in the process and indeed some carcinogens act through non-genotoxic modes of action (MOAs). For example, in a wide survey of marketed pharmaceuticals, Brambilla and Martelli (2009) reported that out of 315 drugs with both genotoxicity and carcinogenicity data, 24% were carcinogenic in at least one sex of mice or rats but tested negative in genotoxicity assays, while the same percentage were positive in both carcinogenicity and genotoxicity assays. It is not sufficient to say that a compound which produces positive results in in vitro mutation assays will be a genotoxic carcinogen in vivo (EPA, 2007). There are many forms of DNA damage that genotoxic compounds can cause, including single or double strand breaks, alteration of bases, formation of covalent adducts, oxidative damage, and a myriad of alterations that disrupt the process of DNA replication or repair. In turn, the cell has a multitude of sensors, transducers and effectors that orchestrate the response of the cell to the genotoxic assault (Harper and Elledge, 2007; Norbury and Hickson, 2001). Thus no single HTS assay, whether based on specific mutation events or particular cellular responses to DNA damage, will do a comprehensive job of accurately predicting animal carcinogenicity. Rather, the hope is that each HTS assay will be a surrogate for one of the dimensions in the process and that, by combining the results of a battery of diverse assays in a statistical manner, they will provide a satisfactory screen for potential carcinogenic compounds (Waters et al., 1988).
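
As a rough illustration of combining a battery of assays, the Python sketch below applies a simple "at least k positive calls" rule to per-assay results. This rule is a stand-in chosen for brevity; the text does not specify the statistical method, and the function name, threshold parameter and example calls are all hypothetical.

```python
from typing import Dict


def battery_call(results: Dict[str, bool], min_positive: int = 1) -> bool:
    """Return True when at least `min_positive` assays in the battery are positive.

    `results` maps an assay name to its positive/negative call for one compound.
    """
    positives = sum(1 for call in results.values() if call)
    return positives >= min_positive


# Hypothetical calls for a single compound (not data from the study).
compound_calls = {
    "GreenScreen HC": True,
    "CellCiphr p53": False,
    "CellSensor p53RE-bla": False,
}

print(battery_call(compound_calls))                  # any-positive rule
print(battery_call(compound_calls, min_positive=2))  # stricter two-of-three rule
```

Lowering `min_positive` favors sensitivity at the cost of specificity, and raising it does the reverse, which is one reason partially overlapping assays can be worth combining.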
   The results from screening the 320 ToxCast compounds in the three HTS assays have been compared to one standard genotoxicity assay in the form of historical mutagenicity (Ames) data, available for a subset of the test compounds, and animal tumorigenicity studies conducted in support of regulatory requirements. The main objective was to assess the performance of the HTS data alongside results obtained from a conventional genotoxicity assay for which the HTS assays provide supportive complementary information, considering both as inputs to the prediction of rodent carcinogenicity endpoints. In vivo tumorigenicity data for this set of compounds have been extracted from the recently published chronic toxicity results from the US EPA ToxRefDB database, which includes potency results for multiple rodent species and various sites of tumor development (Martin et al., 2009; EPA ToxRefDB, 2009). Finally, the assays have been assessed for their practical utility for deployment in HTS campaigns applied to screening hundreds or thousands of diverse chemicals.
2. Materials and methods

2.1. Selection of the test compound set

   The compounds chosen for Phase I of the ToxCast program were selected to include a wide diversity of chemical classes and modes of action. Of the many groups of environmental compounds that require prioritization by regulatory authorities, pesticide compounds are amongst the most characterized and prevalent in toxicity databases, thus providing a comprehensive set of toxicity endpoints against which to judge the performance of the HTS assays. From the approximately 1000 conventional pesticide actives registered by EPA, the selection was narrowed down based on: (i) the degree of overlap with other databases such as those of the National Toxicology Program (NTP, 2009) and EPA DSSTox (EPA DSSTox, 2009) and (ii) suitability for HTS assays, i.e., giving lower priority to inorganics, organometallics, highly lipophilic compounds (high AlogP as a measure of octanol/water partitioning) and smaller volatile compounds with molecular weights <150 (Dix et al., 2007).
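
The HTS-suitability criteria in (ii) can be sketched as a simple flagging function in Python. Only the molecular weight cutoff of 150 comes from the text; the AlogP cutoff, the `Candidate` class, the function name and the example compounds are hypothetical placeholders for illustration.

```python
from dataclasses import dataclass
from typing import List

MW_CUTOFF = 150.0    # from the text: molecular weights < 150 get lower priority
ALOGP_CUTOFF = 6.0   # hypothetical: the text says only "high AlogP"


@dataclass
class Candidate:
    name: str
    mol_weight: float
    alogp: float
    inorganic: bool = False
    organometallic: bool = False


def low_priority_reasons(c: Candidate) -> List[str]:
    """List the reasons a candidate would be given lower priority for HTS."""
    reasons = []
    if c.inorganic:
        reasons.append("inorganic")
    if c.organometallic:
        reasons.append("organometallic")
    if c.alogp > ALOGP_CUTOFF:
        reasons.append("highly lipophilic")
    if c.mol_weight < MW_CUTOFF:
        reasons.append("molecular weight below 150")
    return reasons


# Hypothetical candidates, not compounds from the ToxCast set.
for cand in (Candidate("compound A", 320.0, 3.2), Candidate("compound B", 120.0, 7.1)):
    print(cand.name, low_priority_reasons(cand) or "no flags")
```

Returning the list of reasons rather than a single yes/no keeps the prioritization auditable, since the text describes lowered priority, not outright exclusion.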
   The final compound set comprises 309 unique structures: 291 pesticide actives, 8 metabolites and 10 industrial chemicals with relevance to other toxicological programs. Of the 291 pesticide actives, 273 are registered and 18 are unregistered. Three compounds (bensulide, diclofop methyl and prosulfuron) were included in triplicate, randomly distributed in the compound set to allow for the assessment of assay reproducibility. Five compounds (3-iodo-2-propynylbutylcarbamate, chlorsulfuron, dibutyl phthalate, EPTC and fenoxaprop-ethyl) were included in duplicate but sourced from different suppliers in order to check for potential differences in test results with source and small differences in purity. Compound procurement and preparation were handled by Compound Focus Inc., a subsidiary of BioFocus DPI (South San Francisco, CA, USA). Regarding chemical sources, including replicates: 193 chemicals were purchased from Sigma-Aldrich (St. Louis, MO, USA), principally Pestanal standards from Riedel-de Haën; 101 from Crescent Chemicals Co. (Islandia, NY, USA); 5 from ChemService Inc. (West Chester, PA, USA); 4 from Wako Chemicals USA Inc. (Richmond, VA, USA); 2 from Alfa Aesar (Ward Hill, MA, USA); 1 from Acros (Geel, Belgium); 1 from TCI America (Portland, OR, USA); 2 from Battelle (in turn obtained from Sigma-Aldrich and Matrix Scientific (Columbia, SC, USA)); and 11 chemicals were procured from the EPA National Pesticide Standard Repository (which came from various chemical companies). All compounds were coded and blind-tested by the assay operators and only identified at the end of the study for the purposes of comparative data analysis. The compounds were supplied to the assay operators as frozen aliquots dissolved in dimethylsulfoxide (DMSO) at a concentration of 20 mM.

                         2.2. GreenScreen HC assay (Gentronix Ltd.)

                            The GreenScreen HC assay uses two genetically modified TK6
                         cell lines: the GADD45a-GFP reporter strain (GenM-T01) and a control
                         strain (GenM-C01) containing an out-of-frame EGFP gene, such
                         that a functional and fluorescent GFP protein is not produced. The
                         control strain was used to allow effective correction  for any test
                         compound's  autofluorescence,  or  non-specific induced  cellular
                         fluorescence, that may  otherwise give a false indication of GFP
                         induction in the reporter strain. A suspension of 2 × 10^6 cells per
                         ml in a proprietary assay medium is added to a dilution series
                         of the test  compound, and separately to a standard genotoxicant
                         (methylmethane  sulfonate), in 96-well microplates. At  24 and
                         48 h time-points during incubation of the microplate at 37 °C, 5%
                         CO2, the measurement of the induction in cellular fluorescence
                         was  indicative of genotoxicity, while  the measurement of the
                         reduction in optical absorbance, proportional to the inhibition of
                         cell proliferation, was used to quantify general cytotoxicity, both
                         with reference to statistically defined thresholds (Hastwell et al.,
                         2006). Measurements were made in an Ultra 384 (Tecan, Theale,
                         UK)  microplate reader.  Recently the microplate layout has been
                         modified for HTS  applications,  testing  12 compounds per plate.
                         Each compound was tested over 3 serial dilutions (200, 100 and
                         50 µM), with the top test concentration limited by the 1% v/v tolerance
                         of DMSO in the GreenScreen HC assay. Using this strategy,
                         only very potent genotoxins active at lower concentrations would
                         cause significant cytotoxicity, triggering  re-testing from a lower
                         concentration. A full description of the  protocol employed in this
                         exercise  has been published elsewhere (Knight  et al., 2009).
                            Cytotoxicity was assessed  by the percentage reduction in cell
                         proliferation  (relative cell density) compared to that achieved in
                         the vehicle-treated controls.  It is important to note that this is
                         not a measure of cell viability or death. If the cell density  relative
                         to a vehicle-treated control fell  below 80% at 1  test concentration
                         the compound was deemed cytotoxic and if extended over 2 or 3
                         concentrations, strongly cytotoxic.  Otherwise the compound was
                         considered negative for cytotoxicity. Fluorescence  induction in
                         the test strain was corrected for both cell density and autofluores-
                         cence with reference to the control strain. If induction of GFP fluo-
                         rescence relative to a vehicle-treated control exceeded 50% at 1
                         test concentration the compound was deemed genotoxic and if ex-
                         tended over 2 or 3 concentrations, strongly genotoxic. Otherwise
                         the compound was considered negative for genotoxicity.
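
The call rules described above can be illustrated with a minimal sketch. The function names and the way the readings are represented (relative cell density as a percentage of the vehicle-treated control, corrected GFP induction as a percentage above the vehicle-treated control) are assumptions made for illustration; only the 80% and 50% thresholds and the 1 versus 2 or 3 concentration rule come from the text.

```python
def count_to_call(n_hits):
    """Map the number of test concentrations crossing a threshold to a call."""
    if n_hits == 0:
        return "-"    # negative
    if n_hits == 1:
        return "+"    # positive at 1 test concentration
    return "++"       # strong positive: 2 or 3 test concentrations


def cytotoxicity_call(relative_cell_density_pct):
    """Call from relative cell density (% of vehicle control), one value per concentration."""
    hits = sum(1 for density in relative_cell_density_pct if density < 80.0)
    return count_to_call(hits)


def genotoxicity_call(gfp_induction_pct):
    """Call from corrected GFP induction (% above vehicle control), one value per concentration."""
    hits = sum(1 for induction in gfp_induction_pct if induction > 50.0)
    return count_to_call(hits)


# Hypothetical readings at 200, 100 and 50 µM (illustrative values, not study data)
print(cytotoxicity_call([62.0, 75.0, 91.0]))   # ++
print(genotoxicity_call([58.0, 31.0, 12.0]))   # +
```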

                         2.3. CellCiphr Cytotox Profiling Panel—p53 endpoint. (Cellumen Inc.)

                            The DNA damage response in human-derived
                         HepG2 cells was assessed by measurement of p53 activation
                         via a fluorescent anti-p53 antibody, one of 10 phenotypic end-
                         points in the Cellumen CellCiphr Cytotox Profiling Panel (Vernetti
                         et al., 2009). The 20 mM stock of test compound was diluted with
                         100% DMSO in 9 twofold dilution steps to create 10 concentrations
                         from 200 to 0.39 µM. HepG2 cells in log-phase growth were seeded
-------
                                  A.W. Knight et al./Regulatory Toxicology and Pharmacology xxx (2009) xxx-xxx
in collagen-coated 384-well microplates at 3 cell concentrations
corresponding to different exposure times [4.3 × 10^3 (acute),
2.4 × 10^3 (early), 1.2 × 10^3 (chronic) cells/well], allowed to settle
for 30 min at room temperature and incubated at 37 °C, 5% CO2
for 16 h prior to treatment. Following exposure, cells were fixed
in 4% buffered formalin, the nuclei were stained with Hoechst 33342,
and phospho-p53 was stained using Alexa Fluor 488 antibody. Plates were
read on an Arrayscan HCS Reader (Thermo-Fisher) using channel
1 to quantitate valid objects (cells) defined by Hoechst staining
and channel 2 to quantitate p53 activity as the amount of Alexa
Fluor 488 antibody fluorescence in the area defined by the nuclear
dye. Data were normalized to vehicle-treated control cells.
Responses were measured at 3 different time-points (30 min (acute),
24 h (early) and 72 h (chronic)) for each of 10 concentrations of
the test compounds (serial dilutions from 200 µM), tested in duplicate
along with positive and negative controls, to collect the data needed to
determine the half-maximal activity (AC50). AC50 values were
determined by fitting the data to the Hill equation using the Condoseo
module of Genedata Screener (Genedata AG, Basel, Switzerland).
A positive result was concluded if the p53 AC50 was
calculated to be below 200 µM, provided the AC50 was lower than
the IC50 for cell loss/cytotoxicity at that time point.
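
As a worked illustration of the concentration series and the positive-call rule for this assay, the sketch below generates 9 twofold dilution steps from 200 µM and applies the AC50 criteria. The function names are hypothetical, and treating a missing IC50 as satisfying the AC50 < IC50 condition is an assumption that the text does not state.

```python
def twofold_series(top_um, n_steps):
    """Return the top concentration followed by n_steps twofold dilutions (µM)."""
    return [top_um / (2 ** i) for i in range(n_steps + 1)]


def p53_positive(ac50_um, ic50_um=None, top_um=200.0):
    """Positive if the AC50 is below the top concentration and below the IC50.

    Assumption: ic50_um=None means no cytotoxicity IC50 was obtained, which is
    treated here as satisfying the AC50 < IC50 condition.
    """
    if ac50_um is None or ac50_um >= top_um:
        return False
    return ic50_um is None or ac50_um < ic50_um


series = twofold_series(200.0, 9)
print(len(series))            # 10
print(round(series[-1], 2))   # 0.39
```

Nine halvings of 200 µM give 200/512 ≈ 0.39 µM, which matches the lowest concentration quoted for the panel.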

2.4. Invitrogen CellSensor p53RE-bla HCT-116 assay (NCGC)

   HCT-116 cells with a stably integrated beta-lactamase reporter
gene under control of p53 response elements were suspended in
OPTI medium including 0.5% dialyzed FBS, and plated onto 1536-well
assay plates at a density of 4000 cells/well (5 µl of assay medium
with cells suspended at 8 × 10^5 cells/ml) using a Flying Reagent
Dispenser (Aurora Discovery, Carlsbad, CA). After the plates were
incubated at 37 °C, 5% CO2 for 6 h, 23 nL of compounds dissolved
in DMSO, positive controls or DMSO alone were transferred to
the assay plate by a pin tool (Kalypsys, San Diego, CA), resulting
in a 217-fold dilution. The final compound concentration in the
5 µl assay volume ranged from 1.2 nM to 92 µM in 15 concentrations.
Nutlin-3 was used as a positive control. The plates were
then incubated for a further 16 h at 37 °C, 5% CO2. Subsequently,
1 µl of GeneBLAzer™ B/G FRET substrate (Invitrogen, CA) was
added, the plates were incubated at room temperature for 2 h,
and fluorescence intensity at 460 and 530 nm emission was measured
with excitation at 405 nm using an EnVision microplate
reader (Perkin Elmer, Shelton, CT). Data were expressed as the ratio
of emissions at 460 nm/530 nm. For primary data analysis, readings
for each titration point were first normalized relative to the
Nutlin-3 control (12 µM, 100%) and wells containing the vehicle
only (basal, 0%), and then corrected by applying a pattern correction
algorithm using control plates containing the DMSO diluent
alone. Concentration-response titration points for each compound
were fitted to the Hill equation and concentrations of half-maximal
activity (AC50) and maximal response (efficacy) values were calculated
(Invitrogen, 2009). A positive result was concluded if the p53
AC50 was calculated to be below 92 µM.
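
The plating, dilution and normalization arithmetic in this section can be checked with a short sketch. The function names are hypothetical. The fold dilution is approximated as well volume divided by transferred volume, and the top concentration is assumed to derive from the 20 mM DMSO stocks described in the compound set section; the pattern correction step is not reproduced.

```python
def cells_per_well(volume_ul, cells_per_ml):
    """Cells dispensed into one well from a suspension of known density."""
    return volume_ul * cells_per_ml / 1000.0


def pin_dilution_fold(well_volume_nl, transfer_nl):
    """Approximate fold dilution, ignoring the small transferred volume."""
    return well_volume_nl / transfer_nl


def top_concentration_um(stock_mm, fold):
    """Final concentration (µM) after diluting a stock (mM) by the given fold."""
    return stock_mm * 1000.0 / fold


def percent_activity(ratio, basal_ratio, control_ratio):
    """Scale a 460/530 nm ratio so vehicle wells are 0% and the positive control is 100%."""
    return 100.0 * (ratio - basal_ratio) / (control_ratio - basal_ratio)


print(cells_per_well(5, 8e5))                 # 4000.0
print(int(pin_dilution_fold(5000, 23)))       # 217
print(round(top_concentration_um(20, 217)))   # 92
```
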
3. Reference data

   Two comparative data sets were compiled: collated data from
the Salmonella typhimurium reverse mutagenesis assay (Ames test)
extracted from multiple public sources; and chronic toxicity re-
sults from the EPA ToxRefDB.

3.1. Ames test data

   In order to provide a link and a fair comparison between the
new in vitro HTS genotoxicity data generated in this project and
                       the chronic toxicity data from rodent bioassays, results from a reg-
                       ulatory in vitro assay, the bacterial Salmonella mutagenesis (Ames)
                       test, were compiled from the Leadscope toxicity database (2008)
                       that includes data from a variety of sources: the US National Insti-
                       tute of Environmental Health Sciences' National Toxicology Pro-
                       gram  (NTP,  2009);  the  US   National  Library  of Medicine's
                       Chemical Carcinogenesis Research Information System  (CCRIS,
                       2009); the  US National Institute for  Occupational Safety and
                       Health's  Registry of Toxic Effects of Chemical Substances (RTECS,
                       2009); the  Carcinogenic Potency Database (CPDB, 2009); and the
                       Tokyo Metropolitan Institute of Public Health Mutagenicity of Food
                       Additives Database (TMIPH, 2009). Where conflicting data were
                       observed between different  studies, a straightforward algorithm
                       was applied to arrive at a positive, intermediate or negative score.
                       The procedure was based on the number of reported positive and
                       negative studies, as well as rules based on content, i.e., the mini-
                       mum assay information a data source should provide before it is
                       graded and the completeness and quality of the data record (Yang
                       et al., 2008). Using this approach Ames data were publicly available
                       for  108 of the 309 unique test compound structures. The Ames data
                       and the details of this analysis are provided in the supplementary
                       information Appendix A and  on the ToxCast  home page (EPA
                       ToxCast,  2009).

                       3.2. Chronic toxicity data

                          Rodent tumorigenesis data from chronic, two-year bioassays
                       were collated  from the ToxRefDB  database, as of  October 2008
                       (Martin et al., 2009; EPA ToxRefDB, 2009). These data allowed dis-
                       crimination of tumorigenicity based on occurrence in rats or mice,
                       males or females, and single or multiple sites within each animal
                       category. For the purposes of statistical comparison, positive and
                       negative scores were given in three categories: (i) rodent tumorigen
                       (rat and/or mouse); (ii) multiple species tumorigen; and
                       (iii) multiple site tumorigen. In these three categories data were
                       available for 273, 212 and  273 test compounds, respectively.  In
                       addition, positive results were ranked by potency.  Hence further
                       comparisons were made to a subset of the most potent tumorigens
                       by selecting those compounds with a potency <15 mg/kg/day.
                       4. Results

                          Full results for all 320 test chemicals are provided in the supple-
                       mentary information Appendix B and on the ToxCast home page
                       (EPA ToxCast, 2009). This comprises a comparative genetic toxicity
                       database comparing results for the 309 unique compounds, provid-
                       ing chemical name, CAS number, DSSTox chemical IDs, color in
                       solution, qualitative (positive/negative) and quantitative (lowest
                       effective concentration or AC50) genotoxicity  results, a positive
                       or negative call for Ames mutagenicity data and ToxRefDB tumor-
                       igenicity data with a delineation between rodent, multiple species,
                       multiple site and most potent rodent tumorigens.

                       4.1. Physicochemical properties

                          For assays based on optical endpoints, there is a potential for
                       interference to occur  in the measurement of optical absorbance
                       and fluorescence from compounds that are autofluorescent, col-
                       ored or that precipitate from aqueous  solution upon  incubation
                       with cells and assay medium. A preliminary assessment of the degree
                       of interference for the ToxCast compound set was made
                       using the GreenScreen HC assay, where measurements of conventional
                       optical absorbance and fluorescence are made. In this assay
                       each compound was initially tested from 200 µM, and 58 compounds
                       (18.1%) were found to form significant precipitates after
-------
incubation with test cells and assay medium. Since higher optical
absorbance readings can artificially increase the estimation of cell
density, and  reduce sensitivity for assessment of cytotoxicity, in
this project, these compounds were re-tested from a 2- or 4-fold
lower starting concentration to confirm the accuracy of the initial
test result. After optimizing the  test concentration, only 9 com-
pounds (2.9%)  produced  optical  absorbance values greater  than
20% that of the cell density achieved by the vehicle-treated con-
trols. Fourteen compounds (4.5%) were visibly colored in solution;
however, the color was not sufficiently intense to limit the analysis.
Ninety-three compounds  (30.1%) were greater than 20% more
fluorescent than the background autofluorescence of the vehicle-
treated control strain and assay medium. However, the use of the
non-fluorescent control  strain to correct the measured fluores-
cence induction coupled with a simple corrective method, based
on fluorescence polarization measurements for the most autofluorescent
compounds to enhance the discrimination of GFP,
adequately compensated for the test compound's autofluorescence
in all cases (Knight et al., 2002). Thus for testing compounds from a
concentration of 200 µM, these results suggest that the physicochemical
properties of the test compounds would present minimal
challenge for an assay with optical endpoints.

4.2. Reproducibility

   Table 1 shows the genotoxicity and cytotoxicity results for the
GreenScreen  HC and CellCiphr p53 assays for the 8 compounds
with replicate samples  randomly distributed in the 320 ToxCast
compound set.  The  data show that both  assays were highly
consistent within each set of replicates, giving identical genotoxi-
city results and closely matching cytotoxicity results. Upon inspec-
tion of the raw data for  GreenScreen HC, dibutyl phthalate and
                                prosulfuron showed very similar dose-dependent trends in cyto-
                                toxicity responses between replicates. Only one test of prosulfuron
                                produced a reduction in relative cell density just over the signifi-
                                cance threshold at 48 h and only at the highest test concentration
                                (200 µM). There appears to be no difference in the reproducibility
                                results based on chemical supplier. The CellSensor p53 assay gave
                                negative results for all replicates and thus data for this assay are not
                                included in Table 1.

                                4.3. Effective concentration ranges

                                  The supply of pre-solubilized library compounds in DMSO solu-
                                tion automatically restricts the highest concentrations that can be
                                tested because of the inherent cytotoxicity associated with DMSO.
                                Compounds were tested up to 200 µM in the GreenScreen HC and
                                CellCiphr p53 assays and to 92 µM in the CellSensor p53 assay.
                                Comparing the concentrations at which a significant positive genotoxicity
                                result was recorded allows an assessment of the sensitivity
                                of each assay in terms of chemical concentration, bearing in mind
                                the caveats of the restricted concentration range and defined dilution
                                strategy. The average AC50 values for a positive result in each assay
                                (LEC in the case of GreenScreen HC) were as follows: GreenScreen
                                HC (65 µM) > CellCiphr p53 (58 µM) > CellSensor p53 (31 µM).
                                These findings indicate that all assays show similar chemical sensitivity
                                for a genotoxicity response.

                                4.4. Screening hit rates for genotoxicity and cytotoxicity

                                  The  following  data  analysis  is  based on  the  309  unique
                                compounds in the set omitting replicates. Where data for the rep-
                                licate compounds were not  consistent for the  same assay, the
                                strongest positive response was taken as the 'definitive' result for
Table 1
Genotoxicity and cytotoxicity results for the 8 compounds which were represented in duplicate or triplicate in the ToxCast chemical set. Groups of 2 replicates are from different
chemical suppliers. Groups of 3 replicates are true replicates from the same chemical stock. Key: +, positive; ++, strong positive (GreenScreen HC assay only); -, negative; LEC,
lowest effective concentration (µM); AC50, active concentration for 50% effect (minimum, µM); CFLID, project's chemical ID code; GS, GreenScreen HC; CC, CellCiphr p53;
geno, genotoxicity; cyto, cytotoxicity.

Chemical name                    CAS number  Rep.  CFLID     Source      Purity  GS geno  GS LEC  CC geno  CC AC50  GS cyto  GS LEC  CC cyto  CC AC50
3-Iodo-2-propynylbutylcarbamate  55406-53-6  1     TV000274  Crescent    97      ++       12.5    +        8.5      ++       12.5    +        8.6
                                             2     TV000023  Sigma       97.1    ++       12.5    +        81.0     ++       12.5    +        3.2
Bensulide                        741-58-2    1     TV000097  Sigma       99.5    -                -                 ++       25      +        30.5
                                             2     TV000334  Sigma       99.5    -                -                 ++       25      +        34.4
                                             3     TV000318  Sigma       99.5    -                -                 ++       25      +        40.4
Chlorsulfuron                    64902-72-3  1     TV000060  Crescent    99      -                -                 -                -
                                             2     TV000264  Sigma       99.9    -                -                 -                -
Dibutyl phthalate                84-74-2     1     TV000026  Sigma       99.63   -                -                 ++       50      +        199
                                             2     TV000290  Alfa_Aesar  98.6    -                -                 +        200     +        186
Diclofop-methyl                  51338-27-3  1     TV000369  Sigma       99.2    -                -                 ++       50      -
                                             2     TV000333  Sigma       99.2    -                -                 ++       50      +        191
                                             3     TV000103  Sigma       99.2    -                -                 ++       50      +        198
EPTC                             759-94-4    1     TV000178  Sigma       98.7    -                -                 -                -
                                             2     TV000259  Crescent    97      -                -                 -                -
Fenoxaprop-ethyl                 66441-23-4  1     TV000266  Crescent    98      -                -                 ++       25      +        120
                                             2     TV000374  Sigma       98.4    -                -                 ++       50      +        169
Prosulfuron                      94125-34-5  1     TV000138  Sigma       98.4    -                -                 -                -
                                             2     TV000311  Sigma       98.4    -                -                 +        200     -
                                             3     TV000303  Sigma       98.4    -                -                 -                -
Agreement in all replicates                                                      8/8              8/8               7/8              7/8

-------
Table 2
Summary of the genotoxicity screening hit rates for positive results in the
genotoxicity HTS assays.

Assay             Genotoxicity        Cytotoxicity
                  Number    %         Number    %
GreenScreen HC    32        10.4      231       74.8
CellCiphr p53     27        8.7       171       55.3
CellSensor p53    36        11.7

comparative analysis, representing evidence of a risk for a positive
result in that particular assay. Table 2 summarizes the screening
hit rates for the 3 HTS assays  for the compound collection. The
number of positive results for  genotoxicity was similar between
the 3 assays, averaging just over 10%. The hit rate of 10.4% for
GreenScreen HC was only marginally higher than the 7.3% demon-
strated previously in a larger collection of 1266  compounds in a li-
brary of pharmacologically active compounds, LOPAC (Knight et al.,
2009). This may in part reflect testing from 200 µM in this study
rather than 100 µM in the larger collection. Nine percent of the
GreenScreen HC positive results were only positive at 200 µM,
while the most common lowest effective concentration (LEC) was
100 µM (34% of positive results), comfortably within the test
concentration range.
   While the GreenScreen  HC,  CellCiphr p53 and CellSensor p53
assays  produced similar numbers of positive results (32, 27 and
36, respectively) the overlap between data sets was  relatively
small. The  number  of positive results which  were common  to
CellCiphr p53 and GreenScreen HC was 6. The number of positive
results which were  common to GreenScreen HC and CellSensor
p53 was 9. Eleven positive compounds were detected in both the
CellCiphr p53 and CellSensor p53 sets. Seventy-one compounds
(23.0%) were positive in at least one of the three HTS assays. The
overlap in positive results between assays is shown in Fig.  1.
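
The hit rates and overlaps quoted above can be cross-checked with simple arithmetic. The sketch below uses only the counts given in the text; the variable names are illustrative, and the number of compounds positive in all three assays is not stated in the text but is implied by the inclusion-exclusion principle.

```python
TOTAL_COMPOUNDS = 309

# Positive genotoxicity results per assay (from the text and Table 2)
positives = {"GreenScreen HC": 32, "CellCiphr p53": 27, "CellSensor p53": 36}

# Compounds positive in both assays of each pair (from the text)
pairwise_overlap = {
    ("CellCiphr p53", "GreenScreen HC"): 6,
    ("GreenScreen HC", "CellSensor p53"): 9,
    ("CellCiphr p53", "CellSensor p53"): 11,
}

# Compounds positive in at least one of the three assays (from the text)
positive_in_any = 71

# Hit rate per assay as a percentage of the 309 unique compounds
hit_rates = {assay: round(100.0 * n / TOTAL_COMPOUNDS, 1) for assay, n in positives.items()}

# Inclusion-exclusion: |A or B or C| = sum of singles - sum of pairs + |A and B and C|
triple_overlap = positive_in_any - sum(positives.values()) + sum(pairwise_overlap.values())

print(hit_rates)
print(triple_overlap)   # 2
```

Under these counts, 2 compounds would be positive in all three assays.
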
   The number of positive results for cytotoxicity was higher than
for genotoxicity in the CellCiphr p53 and the GreenScreen  HC as-
says. The highest frequency of positive cytotoxicity results was
demonstrated in the GreenScreen HC assay (74.8%). This is signifi-
cantly higher than the 33% rate seen previously in the LOPAC col-
lection, for example  (Knight et  al., 2009). This is likely due to the
difference in the nature of the  chemical space, i.e., pesticides are
designed to reduce the viability of microorganisms, plants and animals,
rather than providing therapeutic benefit. It should be noted
that this discrepancy is not a result of the higher dose used in this
study compared to the evaluation of the LOPAC compounds, as
the most frequent LEC reported for the positive cytotoxicity results
(53.1% of compounds) was 50 µM. The CellCiphr p53 assay
reported a similarly high number of positive cytotoxicity results
(55.3%).
   [Figure: overlap diagram of the positive sets for GreenScreen HC, CellCiphr p53 and CellSensor p53]

   Fig. 1. Overlap of positive genotoxicity results predicted by the HTS assays.
                        4.5. Prediction of mutagenicity and tumorigenicity data

                           The principal aim of conducting in vitro genotoxicity assays is to
                        predict the potential for genotoxicity or carcinogenicity in vivo. The
                        predictive ability and accuracy of the HTS assays were determined
                        by comparison to published Ames test data and the chronic toxic-
                        ity  results (tumorigenicity) from the US EPA ToxRefDB (Martin
                        et al., 2009). Commonly used terms for comparative analysis and
                        their definitions are taken from Cooper and co-workers and are
                        paraphrased here (Cooper et al., 1979). Also included is the term
                        'relative predictivity',  introduced by Kirkland, which is the ratio
                        of real to false results  (Kirkland et al., 2005).
                           A compound tested in any particular assay (e.g.,  Test A) can
                        have a positive outcome for which there may be either a positive
                        result (a), or a negative result (b), from a second,  comparative
                        in vivo test for carcinogenicity. Thus the total number of positives
                        for Test A is (a + b). Similarly, Test A can have a negative outcome
                        for which there might be either a positive (c), or a negative (d), re-
                        sult from the comparative test. Thus, the total number of negatives
                        for Test A is (c + d). It follows that the total number of positive re-
                        sults from the second comparative test is (a + c) and the total num-
                        ber of negative results from the second comparative test is (b + d).
                        Similarly, the total number of compounds for which there are data
                        for  both  tests, represented by N, is (a + b + c + d). The following
                        terms were calculated from these basic figures:

                        • Sensitivity, the percentage of correctly identified positives = [a/
                          (a + c)] x 100.
                        • Specificity, the percentage of correctly identified negatives = [d/
                          (b + d)] x 100.
                        • Concordance, the percentage of correctly identified results, both
                          positive and negative = [(a + d)/N] x 100.
                        • Balanced accuracy = mean of the  sensitivity and specificity.
                        • Prevalence, percentage of positive calls in the second, compara-
                          tive test results = [(a + c)/N] x 100.
                        • Positive predictivity = (a/(a + b)).
                        • Negative predictivity = (d/(c + d)).
                        • Relative predictivity of a positive result for carcinogenicity is the
                          fraction of carcinogens  giving a positive genotoxicity result,
                          divided by the  fraction of non-carcinogens giving a positive
                          genotoxicity result = [(a/(a + c))/(b/(b + d))].
                        • Relative predictivity of a negative result for non-carcinogenicity
                          is the fraction of non-carcinogens giving a negative genotoxicity
                          result, divided by the fraction of carcinogens giving a negative
                          genotoxicity result = [(d/(b + d))/(c/(a + c))].
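
The definitions above translate directly into code. The following is a minimal sketch, with a hypothetical function name, that computes each term from the four cell counts a, b, c and d; it assumes none of the marginal totals is zero. Using the GreenScreen versus Ames counts reported in Table 3A (a = 6, b = 4, c = 40, d = 58), it returns a sensitivity of 13.0%, a specificity of 93.5% and a relative predictivity for positives of 2.02, in line with the tabulated values.

```python
def performance(a, b, c, d):
    """Comparative statistics from a 2 x 2 table.

    a: test +, comparator +    b: test +, comparator -
    c: test -, comparator +    d: test -, comparator -
    Assumes no marginal total is zero.
    """
    n = a + b + c + d
    sensitivity = 100.0 * a / (a + c)
    specificity = 100.0 * d / (b + d)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "concordance": 100.0 * (a + d) / n,
        "balanced_accuracy": (sensitivity + specificity) / 2.0,
        "prevalence": 100.0 * (a + c) / n,
        "positive_predictivity": a / (a + b),
        "negative_predictivity": d / (c + d),
        "relative_predictivity_pos": (a / (a + c)) / (b / (b + d)),
        "relative_predictivity_neg": (d / (b + d)) / (c / (a + c)),
    }


# GreenScreen HC versus Ames counts from Table 3A
m = performance(a=6, b=4, c=40, d=58)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
```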

                           The higher  the relative  predictivity value obtained, the  more
                        predictive the testing strategy will be and values above 2 are con-
                        sidered significant (Kirkland et al., 2005). The concordance gives a
                        measure  of overall accuracy since it combines the percentage of
                        correctly identified results both positive and negative. Although
                        the balanced accuracy is a simple mean of the sensitivity and  spec-
                        ificity results, it allows a better measure of performance if it is the
                        case that an assay has mainly positive or mainly negative results,
                        since sensitivity may be very high with correspondingly and dis-
                        proportionately low specificity, or vice versa.
                           Table 3A-E details  the comparison of the HTS assay results to
                        the corresponding Ames mutagenicity  test results  (available for
                        108 compounds) and rodent tumorigenicity results (available from
                        ToxRefDB for 273 of the compounds in the ToxCast set).
                           Table  3A compares the  results of the HTS assays  to the pub-
                        lished Ames test results. While the terms 'sensitivity' and 'specific-
                        ity' are generally considered only for comparison with carcinogenic
                        endpoints, in this table the authors  have used the same terms for
                        comparison to Ames results in order to provide the reader with a
                        measure that is more readily comparable to subsequent analyses
-------
Table 3
Comparison of the performance of the HTS assays in the prediction of the Salmonella mutagenicity (Ames) test and rodent tumorigenicity.
GreenScreen GreenScreen
A. Prediction of Ames data
Ames + 6
Ames - 4
A (continued). HTS assay results compared with Ames results
Counts: assay + / assay -. Statistics: value (95% conf. lim.); p-Value.
                                    GreenScreen                CellSensor p53             CellCiphr p53
Ames +                              6 / 40                     5 / 41                     4 / 42
Ames -                              4 / 58                     6 / 56                     10 / 52
Number of comparisons               108                        108                        108
Sensitivity (% correct positives)   13.0 (0, 27); <0.001       10.9 (0, 25); <0.001       8.7 (0, 23); <0.001
Specificity (% correct negatives)   93.5 (81, 100); <0.001     90.3 (78, 100); <0.001     83.9 (72, 96); <0.001
Concordance                         59.3                       56.5                       51.9
Balanced accuracy                   53.3                       50.6                       46.3
Positive predictive value           0.60 (0.5, 0.7); 0.004     0.45 (0.3, 0.6); 0.603     0.29 (0.2, 0.4); 0.013
Negative predictive value           0.59 (0.5, 0.7); 0.673     0.58 (0.5, 0.7); 0.959     0.55 (0.5, 0.6); 0.583
Relative predictivity (positives)   2.02 (1.6, 2.6); 0.002     1.12 (0.7, 1.6); 0.622     0.54 (0.1, 1.1); 0.034
Relative predictivity (negatives)   1.08 (0.8, 1.5); 0.795     1.01 (0.7, 1.4); 0.988     0.92 (0.6, 1.3); 0.578

B. Prediction of rodent tumorigenicity
Counts: assay + / assay -. Statistics: value (95% conf. lim.); p-Value.
                                    GreenScreen                CellSensor p53             CellCiphr p53              Ames
Rodent +                            22 / 126                   21 / 127                   17 / 131                   23 / 29
Rodent -                            7 / 118                    12 / 113                   9 / 116                    13 / 25
Number of comparisons               273                        273                        273                        90
Sensitivity (% correct positives)   14.9 (7, 23); <0.001       14.2 (6, 22); <0.001       11.5 (3, 19); <0.001       44.2 (31, 58); 0.032
Specificity (% correct negatives)   94.4 (85, 100); <0.001     90.4 (82, 99); <0.001      92.8 (85, 100); <0.001     65.8 (50, 82); 0.001
Concordance                         51.3                       49.1                       48.7                       53.3
Balanced accuracy                   54.6                       52.3                       52.1                       55.0
Positive predictive value           0.76 (0.7, 0.8); <0.001    0.64 (0.6, 0.7); 0.002     0.65 (0.6, 0.7); <0.001    0.64 (0.5, 0.7); 0.164
Negative predictive value           0.48 (0.4, 0.6); 0.409     0.47 (0.4, 0.5); 0.702     0.47 (0.4, 0.5); 0.742     0.46 (0.3, 0.6); 0.497
Relative predictivity (positives)   2.65 (2.5, 2.9); <0.001    1.48 (1.3, 1.7); 0.001     1.60 (1.4, 1.8); <0.001    1.29 (1.0, 1.7); 0.136
Relative predictivity (negatives)   1.11 (0.9, 1.4); 0.459     1.05 (0.8, 1.4); 0.762     1.05 (0.8, 1.3); 0.757     1.18 (0.7, 1.8); 0.532

C. Prediction of most potent tumorigens ≤15 mg/kg/day
Counts: assay + / assay -. Statistics: value (95% conf. lim.); p-Value.
                                    GreenScreen                CellSensor p53             CellCiphr p53              Ames
Rodent +                            7 / 39                     7 / 39                     4 / 42                     14 / 11
Rodent -                            22 / 205                   26 / 201                   22 / 205                   22 / 43
Number of comparisons               273                        273                        273                        90
Sensitivity (% correct positives)   15.2 (5, 27); 0.687        15.2 (5, 27); 0.705        8.7 (0, 20); 0.123         56.0 (40, 72); 0.001
Specificity (% correct negatives)   90.3 (85, 95); 0.003       88.5 (83, 94); 0.025       90.3 (85, 95); 0.003       66.2 (56, 77); 0.271
Concordance                         77.7                       76.2                       76.6                       63.3
Balanced accuracy                   52.8                       51.9                       49.5                       61.1
Positive predictive value           0.24 (0.1, 0.3); 0.145     0.21 (0.1, 0.3); 0.390     0.15 (0.1, 0.3); 0.755     0.39 (0.2, 0.6); 0.141
Negative predictive value           0.84 (0.82, 0.86); 0.387   0.84 (0.82, 0.86); 0.562   0.83 (0.81, 0.85); 0.880   0.80 (0.74, 0.9); 0.009
Relative predictivity (positives)   1.57 (1.0, 2.4); 0.140     1.33 (0.7, 2.2); 0.392     0.90 (0.3, 1.8); 0.727     1.65 (1.0, 2.6); 0.109
Relative predictivity (negatives)   1.07 (0.9, 1.2); 0.427     1.04 (0.9, 1.2); 0.577     0.99 (0.9, 1.2); 0.853     1.50 (1.2, 1.9); 0.008

D. Prediction of multiple species rodent tumorigens
Counts: assay + / assay -. Statistics: value (95% conf. lim.); p-Value.
                                    GreenScreen                CellSensor p53             CellCiphr p53              Ames
Rodent +                            5 / 38                     5 / 38                     3 / 40                     9 / 9
Rodent -                            17 / 152                   18 / 151                   13 / 156                   16 / 28
Number of comparisons               212                        212                        212                        62
Sensitivity (% correct positives)   11.6 (0, 24); 0.130        11.6 (1, 24); 0.116        7.0 (0, 19); 0.026         50.0 (33, 71); 0.034
Specificity (% correct negatives)   89.9 (84, 96); 0.001       89.3 (83, 95); <0.001      92.3 (86, 98); <0.001      63.6 (50, 77); 0.218
Concordance                         74.1                       73.6                       75.0                       59.7
Balanced accuracy                   50.8                       50.5                       49.6                       56.8
Positive predictive value           0.23 (0.1, 0.3); 0.635     0.22 (0.1, 0.3); 0.776     0.19 (0.1, 0.3); 0.782     0.36 (0.2, 0.5); 0.440
Negative predictive value           0.80 (0.77, 0.83); 0.800   0.80 (0.77, 0.83); 0.873   0.80 (0.77, 0.82); 0.940   0.76 (0.7, 0.8); 0.192
Relative predictivity (positives)   1.16 (0.6, 2.0); 0.700     1.09 (0.5, 1.9); 0.837     0.91 (0.3, 1.7); 0.726     1.38 (0.6, 2.6); 0.471
Relative predictivity (negatives)   1.02 (0.9, 1.2); 0.861     1.01 (0.9, 1.2); 0.928     0.99 (0.8, 1.2); 0.906     1.27 (1.0, 1.8); 0.163

                                  A.W. Knight et al./Regulatory Toxicology and Pharmacology xxx (2009) xxx-xxx
Table 3 (continued)

E. Prediction of multiple site rodent tumorigens
Counts: assay + / assay -. Statistics: value (95% conf. lim.); p-Value.
                                    GreenScreen                CellSensor p53             CellCiphr p53              Ames
Rodent +                            13 / 45                    10 / 48                    8 / 50                     14 / 14
Rodent -                            16 / 199                   23 / 192                   18 / 197                   22 / 40
Number of comparisons               273                        273                        273                        90
Sensitivity (% correct positives)   22.4 (12, 34); 0.750       17.2 (6, 29); 0.409        13.8 (4, 25); 0.147        50.0 (33, 65); 0.019
Specificity (% correct negatives)   92.6 (87, 98); <0.001      89.3 (84, 95); <0.001      91.6 (86, 97); <0.001      64.5 (51, 76); 0.425
Concordance                         77.7                       74.0                       75.1                       60.0
Balanced accuracy                   57.5                       53.3                       52.7                       57.3
Positive predictive value           0.45 (0.4, 0.6); <0.001    0.30 (0.2, 0.4); 0.059     0.31 (0.2, 0.4); 0.047     0.39 (0.2, 0.5); 0.305
Negative predictive value           0.82 (0.79, 0.84); 0.033   0.80 (0.77, 0.83); 0.327   0.80 (0.77, 0.82); 0.416   0.74 (0.67, 0.81); 0.127
Relative predictivity (positives)   3.01 (2.5, 3.7); <0.001    1.61 (1.1, 2.3); 0.050     1.65 (1.2, 2.3); 0.026     1.41 (0.8, 2.3); 0.302
Relative predictivity (negatives)   1.19 (1.1, 1.4); 0.019     1.08 (0.9, 1.3); 0.332     1.06 (0.9, 1.2); 0.453     1.29 (1.0, 1.7); 0.086

in Table 3B-E. Due to the limit on the number of available Ames
data, this assessment has the least number of chemical compari-
sons (N = 108). The prevalence of positive results in the Ames assay
(42.6%) was much higher than that observed for the HTS assays
(see Table 2) and this bias must be borne in mind when considering
these results. The higher positive prevalence in the Ames data most
likely has two causes. First, Ames testing uses considerably higher
concentrations, as defined in ICH guidelines (US Federal Register,
2008): it is often carried out at up to 5 mg/plate or 10 mM if
solubility allows, approximately 50-fold greater than the highest
concentration used in the HTS assays. Second, none of the HTS assays
incorporated metabolic activation, which is routinely included in
Ames testing via rat liver homogenates (S9 fraction). All three HTS assays showed
similar correlation with the Ames data. Prediction of positive Ames
test results (sensitivity) was  low for all assays, between 8.7% and
13.0%.  Prediction of  negative Ames test results (specificity)  was
high for all HTS assays  examined and highest for GreenScreen HC
(93.5%). We caution, however, that  given the high percentage of
negative HTS results the assay data are biased towards negative
predictions and hence we expect better performance in predicting
negative results in the regulatory assays. The overall concordance
was similar for all assays, averaging 55.9%. Using the relative  pre-
dictivity for positives as a comparison metric for Ames (Table 3A),
GreenScreen HC was significantly better than the CellSensor p53
and CellCiphr p53 assays (p-values 0.002, 0.622 and 0.034, respec-
tively). Confidence intervals and p-values were calculated using a
parametric bootstrapping technique that compares predictions
generated by any given model (GreenScreen HC, CellCiphr p53
and CellSensor p53) with predictions obtained from a null hypothesis
model. The null model draws random samples from a binomial
distribution in which the probability of an active result equals the
proportion of actives observed in the actual experimental results.
Statistics were obtained by comparing the results for the model
of interest with two thousand samplings of the null model. A p-value
of 0.002, for example, therefore indicates that the probability of
observing a value greater than the  reported statistic by random
chance is only 0.2%. Low p-values (<0.05) are obtained for models
that perform significantly better than the null model for the asso-
ciated  statistic. High  p-values denote results for which the model
prediction is no better than random  chance for that particular sta-
tistical measure.
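
The null-model comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the fixed random seed, the choice of positive predictive value as the example statistic and the one-sided "at least as large" tail are all assumptions. The example vectors are rebuilt from the GreenScreen counts for multiple site tumorigens in Table 3E (13 and 45 for rodent-positive compounds, 16 and 199 for rodent-negative compounds).

```python
import random


def positive_predictive_value(truth, calls):
    """Fraction of assay-positive compounds that are truly positive."""
    true_pos = sum(1 for t, c in zip(truth, calls) if t and c)
    called_pos = sum(1 for c in calls if c)
    return true_pos / called_pos if called_pos else 0.0


def null_model_p_value(truth, calls, statistic, n_samples=2000, seed=0):
    """Compare an assay with a random null model (hypothetical helper).

    Each null sample calls every compound active with probability equal to
    the assay's observed proportion of actives (independent binomial draws).
    The returned p-value is the fraction of null samples whose statistic is
    at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    p_active = sum(1 for c in calls if c) / len(calls)
    observed = statistic(truth, calls)
    at_least_as_large = 0
    for _ in range(n_samples):
        null_calls = [rng.random() < p_active for _ in truth]
        if statistic(truth, null_calls) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_samples


# Compound-level vectors rebuilt from the Table 3E GreenScreen counts.
truth = [True] * 58 + [False] * 215
calls = [True] * 13 + [False] * 45 + [True] * 16 + [False] * 199

observed_ppv, p_value = null_model_p_value(truth, calls, positive_predictive_value)
print(f"PPV = {observed_ppv:.2f}, null-model p-value = {p_value:.4f}")
```

Any other statistic with the same `(truth, calls)` signature can be passed in place of the positive predictive value. The exact p-value depends on the seed and on the tail convention, so it need not match the tabulated figure.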
   Table 3B is derived from comparing assay results to rodent
tumorigenicity from  rodent chronic bioassays in ToxRefDB, con-
sidering any significant tumor formation,  irrespective of species
or site. The number of chemical  comparisons was 273,  reflecting
a larger amount of available data and comprehensive overlap of
the ToxRefDB and ToxCast databases. Sensitivity of the HTS assays
was again low in this set, averaging 13.5%, potentially due to the
aforementioned lack of metabolic activation in the HTS assays.
Consequently, the Ames test results demonstrated a higher sensitivity
of 44.2%. However, sensitivity will also be low in this comparison
because in vitro genotoxicity assays are not expected to readily
detect non-genotoxic carcinogens. Overall concordance was very
similar for all in vitro assays including the regulatory Ames data,
averaging 49.7% for the three HTS assays (vs. 53.3% for Ames). It is
equally apparent that, whereas the Ames test had a much higher
sensitivity than the 3 HTS assays (44.2% vs. 11.5-14.9%), the HTS
assays had much better specificity, again due in part to the
imbalanced nature of the HTS datasets, i.e., a high prevalence of
negative results. Using the relative predictivity for positives as
the metric for predicting rodent tumorigenicity, GreenScreen HC
demonstrated the highest value (2.65), although all the HTS assays
showed significant predictivity with low p-values (≤0.001; cf. a
p-value of 0.136 for Ames). These latter data are mirrored in the
positive predictive values for the HTS and Ames assays.
   Table 3C compares assay results with only the most potent
tumorigens, based on oral exposure values, where positive results
were given only for compounds with a potency ≤15 mg/kg/day.
Compared to the prediction of all tumorigens (Table 3B), a significant
improvement in concordance figures for all assays was noted
when potency was factored into activity, averaging 76.8% for the HTS
assays. Once again the sensitivity of the HTS assays was lower than
that of the Ames test, whereas specificity was higher. Relative
predictivity for a positive result was, on the whole, lower than in
the previous comparison, reflecting the lower number of potent
tumorigens: only 46 of the 148 rodent tumorigens had a potency
≤15 mg/kg/day. Broadening the analysis to potencies ≤50 mg/kg/day
increased the number of potent tumorigens to 76; however, the
performance figures from the different assays changed only
marginally for all parameters examined (data not shown).
   Table 3D compares assay results to multiple species rodent
carcinogens, i.e., compounds that have been shown to produce tumors
in both rats and mice. In this comparison the accuracy of prediction
(concordance) was improved for each of the cellular HTS assays,
compared with the analysis of all tumorigens, averaging 74.2%,
and was significantly higher than Ames (59.7%). The number of
available comparisons was lower (212 compounds) due to the
requirement for available data in both rats and mice. In comparisons
using the small sets of the most potent tumorigens
(Table 3C) and multiple species rodent tumorigens (Table 3D), the
positive predictive value and relative predictivity of all the assays
considered were low and associated with high p-values, demonstrating
lower confidence in the results.
   Table 3E compares assay results to multiple site rodent carcinogens,
i.e., compounds which have been shown to produce tumors
in multiple sites in rats and/or mice, such as liver, kidney, thyroid,
testes, spleen or lung. In  this  comparison the data set is more
complete with 273 comparisons possible. In this case, for all the
HTS assays, concordance was further improved, from an average
of 50%  for all  rodent tumorigens to 74-78% for multiple  site
tumorigens. Relative predictivity for a positive result was signifi-
cantly high for  GreenScreen HC (3.01, p-value <0.001) compared
to the p53 assays and this assay also demonstrated a correspondingly
high specificity of 92.6%. Again these data are reflected in
the positive and negative predictive values. Most noteworthy,
however, was the  increase in sensitivity of the GreenScreen HC
assay (nearly doubling to  22.4%) over Table 3D where compari-
sons were made on the basis of multiple species carcinogens. In
this case, a compound that produces tumors at multiple sites irre-
spective of species  or gender can be viewed as a particularly clear
carcinogenic response, most likely due to a genotoxic mechanism
of action, a factor that could lead to the improved sensitivity in
this assay.
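
The statistics reported in Table 3 follow from each assay's 2 x 2 table of counts. As an illustrative sketch (the function and key names are hypothetical, and the two likelihood-ratio style formulas used for relative predictivity are an inference from the tabulated values rather than a definition quoted here), the GreenScreen entries of Table 3E can be recomputed as follows:

```python
def screening_statistics(tp, fn, fp, tn):
    """Summary statistics for one assay from its 2 x 2 table of counts.

    tp: tumorigens called positive       fn: tumorigens called negative
    fp: non-tumorigens called positive   tn: non-tumorigens called negative
    Assumes every row and column of the table is non-empty.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity_pct": 100 * sensitivity,
        "specificity_pct": 100 * specificity,
        "concordance_pct": 100 * (tp + tn) / (tp + fn + fp + tn),
        "balanced_accuracy_pct": 100 * (sensitivity + specificity) / 2,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # Assumed definitions: they reproduce the tabulated values.
        "relative_predictivity_pos": sensitivity / (1 - specificity),
        "relative_predictivity_neg": specificity / (1 - sensitivity),
    }


# GreenScreen versus multiple site rodent tumorigens (Table 3E counts).
stats = screening_statistics(tp=13, fn=45, fp=16, tn=199)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Rounded to the precision used in the table, these values agree with the reported 22.4, 92.6, 77.7, 57.5, 0.45, 0.82, 3.01 and 1.19.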

4.6. Nature of the compounds and mechanisms of action

   The ToxCast  compound  set was highly diverse in terms of both
MOAs and chemical structure (i.e., a large number of chemical clas-
ses). The majority of classes of pesticide had fewer than 3 chemical
representatives  in the set (<1%) and therefore although this enables
a more accurate assessment of assay performance in a wide chem-
ical space, the response of individual assays to particular chemical
classes and features  is more difficult  to model and will require
more detailed investigation. This is a long term objective of future
studies, incorporating data  from  structure-activity relationship
(SAR), other regulatory genetic toxicology and MOA studies as well
as other relevant endpoints.

5. Discussion

   The HTS assays studied represent two gene targets in their end-
points, p53 and GADD45a in a p53 competent cell line. The tumor
suppressor p53 protein appears to sense multiple types of DNA
damage and regulates many parts of the cell's DNA damage re-
sponse, activating DNA repair, cell cycle arrest and  induction of
apoptosis. p53 activity is maintained at low levels in  healthy cells
but is rapidly induced by ionizing radiation and several genotoxic
and chemotherapeutic compounds. Hence p53 is an obvious candi-
date for a marker of genotoxic effects. However, p53 is a multifunc-
tional  protein  with potential for  diverse modifications and
biochemical properties. Its functionality is only partially under-
stood in terms of how p53 coordinates the DNA damage response
in specific cell types, the nature of the interactions between p53
and  DNA repair proteins and the mechanism of p53-dependent
apoptosis (Liu and Kulesz-Martin, 2001; Meek, 2004). Induction
of p53 can also occur in response to a range of non-genotoxic stres-
ses leading to growth arrest or apoptosis.  Ohno et al. (2008) re-
cently  conducted  an  80  compound  validation  study of  a
genotoxicity test system based on p53R2 gene expression with a
luciferase reporter. Although reporting  that this system could be
used for the rapid screening of genotoxic potential, the ability of
the assay to detect genotoxic effects was unclear. A subsequent fo-
cused study of 27 compounds with diverse  genotoxic mechanisms
revealed that the test compound's potency in the assay was related
to MOA and that well-known genotoxic antimetabolites (e.g., pur-
ine synthesis and metabolism inhibitors such as 6-mercaptopur-
ine)  and HDAC (histone deacetylase) inhibitors (e.g., trichostatin
A) were not detected.
   The transcriptional regulation of GADD45a is also complex and
includes p53-dependent induction and regulation by BRCA1,
c-MYC and NF-κB. GADD45a induction has been observed in response
to a wide range of genotoxic and growth arrest stresses
and upon exposure of cells to UV and a wide range of genotoxic
compounds with diverse MOAs. The protein is thought to play a
role in DNA repair, cell cycle checkpoints, apoptosis in response
to genotoxic stress and antitumorigenesis as well as in the
maintenance of genomic stability (Hastwell et al., 2006).
   In the context of the complexity of biological pathways and
processes of mutagenesis and carcinogenesis and with the differing
nature of the cell lines and endpoints used in the three HTS assays,
it is not surprising that there is only limited overlap between the
positive results of these HTS assays (see Fig. 1) and Ames positive
results. This gives some weight to the argument for the use of a
battery of related screens probing different aspects of the cellular
response from which data can be pooled. Furthermore, the
non-overlapping features of the different HTS assays considered in the
present study also may reflect variable sensitivities of each assay to
different chemical classes and, hence, provide support for the
broader ToxCast approach in which HTS assays are to be chosen,
combined and applied mindful of the chemical space being predicted.
Clearly, however, these are longer term objectives beyond
the scope of the present study.
   Two compounds, both fungicides, were positive in all three HTS
assays: benomyl and fluoxastrobin. In many studies benomyl has
produced negative results for gene mutations, structural chromosome
aberrations and DNA damage and does not react directly
with DNA. Benomyl inhibits fungal growth by binding to tubulin
and disrupting microtubule assembly. A similar genotoxic MOA
in mammalian cells results in aneuploidy in vitro and in vivo
(Bentley et al., 2000). Based on a MOA analysis, there is plausible
evidence that benomyl acts through an aneuploidy mechanism
(McCarroll et al., 2002). Benomyl has also previously been reported
as a positive in the GreenScreen HC assay with an LEC of
12 µM (Hastwell et al., 2006) compared to an LEC of
50 µM for this study. Fluoxastrobin is a strobilurin fungicide that
acts through inhibition of mitochondrial respiration, which can
result in the production of free electrons that react with oxygen to
form superoxide, known to cause oxidative stress and DNA damage
(Bartlett et al., 2002). Of 5 other strobilurin fungicides in the data
set, all 5 were positive in the CellSensor p53 assay and 4 of these
were positive in the GreenScreen HC assay, yet none were positive
in the CellCiphr p53 assay. A strobilurin present in the ToxCast
compound collection, azoxystrobin, has been shown to produce a
weak clastogenic response in mammalian cells in vitro at cytotoxic
doses. However, in animals azoxystrobin was negative in assays for
chromosomal damage and general DNA damage at high dose levels.
Hence, strobilurins are considered to have low toxicity and
consequently low risk to non-target organisms (EPA, 1997). This
class of compounds highlights the complications of extrapolating
in vitro test results to human health risk assessments.
   It is unusual in a comparative study such as this to have a
wealth of tumorigenicity data to reference for the majority of
compounds. In pharmaceutical R&D, a positive Ames result will often
remove a compound from the development process and follow-up
in vivo studies will only be performed for candidates that are
developed for life-threatening conditions or have other highly
favorable therapeutic indications, although exceptions include
compounds from oncology and anti-viral campaigns. Overall the
HTS assays have demonstrated low sensitivity for compounds that
have been shown to be active tumorigens, and similarly low positive
predictivity for Ames results. Screening hit rates for positive
results were consistent between assays, averaging 10%. This is
likely explained by the limitation of concentrations achievable in
screening, the lack of exogenous metabolic activation, and the
probability that many tumorigens may be acting via non-mutagenic
MOAs. In this small set of three assays, screening only picked
up the most potent genotoxic tumorigens, and thus low sensitivity
was a trade-off for the advantage of screening of large numbers of
compounds with low sample  concentration requirements.  Con-
versely, the HTS assays demonstrated high specificity, consistently
over 88%, and hence produce a low number of what may be inter-
preted as false positive results when comparing to rodent tumori-
genicity data. In the context of the predicted dose, high specificity
in screening for hazard assessment is desired, otherwise concerns
over product safety are falsely raised, triggering more detailed,
lengthy and expensive mechanistic investigations, often in vivo,
and delaying or halting development or deployment of the com-
pound. However, low assay sensitivity is also of concern and less
desirable for hazard identification and screening in the context of
environmental risk assessment, since public safety is the primary
objective. Hence, limitations  in the sensitivity of individual assays
to predict the in vivo  endpoint may need to be compensated by
additional tests or assays with higher sensitivity and different per-
formance characteristics. The measure of concordance is biased by
the low number of positive results and hence low sensitivity and
high specificity of the HTS assays and is therefore a less useful met-
ric in this study. Overall concordance with tumorigenicity data was
low, around 50% for all tumorigens, but increased to 74-78% for the
HTS assays (vs. 60% for Ames) for compounds producing tumors in
rodents at multiple sites.
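
The bias in concordance noted above can be made concrete with a small calculation. The sketch below (hypothetical function names; the counts are those of the GreenScreen comparison with multiple site tumorigens in Table 3E) contrasts the assay with a trivial rule that calls every compound negative. Because only 58 of the 273 compounds are multiple site tumorigens, the trivial rule reaches a concordance at least as high as the assay while detecting nothing, which balanced accuracy exposes.

```python
def concordance(tp, fn, fp, tn):
    """Percentage of compounds whose assay call matches the rodent result."""
    return 100 * (tp + tn) / (tp + fn + fp + tn)


def balanced_accuracy(tp, fn, fp, tn):
    """Mean of sensitivity and specificity, as a percentage."""
    return 100 * (tp / (tp + fn) + tn / (tn + fp)) / 2


# GreenScreen versus multiple site tumorigens (Table 3E counts).
assay = dict(tp=13, fn=45, fp=16, tn=199)
# Hypothetical rule that calls all 273 compounds negative.
always_negative = dict(tp=0, fn=58, fp=0, tn=215)

for label, counts in (("assay", assay), ("always negative", always_negative)):
    print(f"{label}: concordance {concordance(**counts):.1f}%, "
          f"balanced accuracy {balanced_accuracy(**counts):.1f}%")
```

The always-negative rule scores 78.8% concordance against 77.7% for the assay, but only 50.0% balanced accuracy against 57.5%.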
   Through numerous validation studies it has become clear that
carcinogens partition into two main groups: those with genotoxic and
those with non-genotoxic MOAs. Many mechanisms are a part of the latter group,
including inhibition of apoptosis, down-regulation of gap junction
intercellular communication, enhancement of cellular prolifera-
tion, and peroxisome proliferation (Yamasaki et al., 1996; Combes,
2000; Vinken et al., 2008). The concordance with tumorigenicity
data was highest when only the most potent tumorigens and those
compounds that produced tumors at multiple sites and in multiple
species were considered; such compounds are more likely to have a
genotoxic mode of action (EPA, 2007). This is in agreement with
Ashby and Tennant's  observations that the use of genotoxicity
screening  assays will  enable the detection of trans-species  and
multiple site  rodent carcinogens, while the detection of tissue,
sex or species-specific carcinogens will only  be  achieved by con-
ducting lifetime carcinogenicity  bioassays (Ashby and Tennant,
1988). Compounds that produce tumors in different rodent species
or at more than one site are more likely to be acting by a genotoxic
mechanism, i.e., directly reacting with DNA or interfering with its
production, maintenance or translocation, and thus are more likely
to be detected by in vitro genotoxicity tests (EPA, 2007). This  dis-
tinction and approach has been taken since no other attempt has
been made to classify the carcinogens in the ToxCast compound
set as either genotoxic or non-genotoxic carcinogens. Indeed  this
is not a trivial exercise and even where data exist there is often
an incomplete picture and differences in  'expert opinion' as to
the exact  mechanism of carcinogenicity. Furthermore for some
compounds, both genotoxic and  non-genotoxic mechanisms may
play  a role (Yamasaki et al., 1996).
   Many environmental compounds are tested at maximum toler-
ated doses in vivo in rodent chronic bioassays, and at up to 10 mM
in regulatory  genotoxicity tests, often at doses or concentrations
that  could  not be attained in actual human exposure (Kirkland
et al., 2007). Whereas many compounds may produce quantifiable
genotoxic  effects  in this  high-dose testing, the concentration at
which this is achieved is an important consideration in risk assess-
ment. In the HTS screening assays utilized  in the present study,
testing was carried out  at lower  concentrations than those at
which some regulatory tests are performed; hence it might be ex-
pected to detect only a subset of the more potent genotoxic spe-
cies,  i.e., show low comparative sensitivity. ToxRefDB includes
potency figures in the form of a lowest effect level in mg/kg body
weight/day over the two years of exposure at which the chronic tumor
endpoint was observed.
   The authors stress that none of the HTS assays being considered
here have been developed with the aim of replacing or accurately
predicting the results of the Ames test. Miniaturized tests such as
Ames II (Flückiger-Isler et al., 2004) and other bacterial assays do
this adequately. Rather, HTS tests for genotoxicity were developed
as early screening tools with the aim of increasing testing
efficiency and concordance with carcinogenicity endpoints as
compared to the low-throughput, standard regulatory tests (Custer
and Sweder, 2008). Furthermore, these HTS genotoxicity assays
probe cellular pathways that likely play a role in mutagenic and
carcinogenic processes for many chemicals.
   The lack of capacity for metabolic activation is a significant
limitation of the cell lines currently used in the HTS assays examined
here. It has long been known that cytochrome P450 enzymes
facilitate principally phase I metabolism of xenobiotics and are highly
conserved in virtually all eukaryotic and many prokaryotic cells.
Also, it is well-known that the metabolic activation of progenotoxic
or procarcinogenic compounds often leads to the formation of
reactive electrophilic intermediates (Nebert and Dalton, 2006).
P450 enzymes are predominantly localized in the liver. Hence liver
homogenates (S9 fraction) from rats dosed with P450 inducing
compounds are commonly added to the assay medium of in vitro
assays to provide a level of metabolic competency and enable
detection of progenotoxic compounds. This lack of metabolic
competency is being addressed in recent modifications of the
GreenScreen HC assay that incorporate co-exposure of the test
compound and S9 fraction (Jagger and Tate et al., 2009). Likewise,
the CellCiphr and CellSensor assays are also being modified to be
run in primary cell systems with more complete metabolic
capacity.
   The 2-year rodent bioassay is generally considered the gold
standard for assessing potential risk of carcinogenicity to humans.
However, such assays are expensive, time-consuming and
low-throughput. In the context of assessing risk it should be noted that,
while rodent bioassays are currently the best approach we have to
assess the potential for human carcinogenicity, they may overestimate
the risk of mutagenesis and carcinogenesis. The European
Medicines Agency states in its guidance for carcinogenicity testing
for pharmaceuticals that "Since the early 1970s, many investigations
have shown that it is possible to provoke a carcinogenic
response in rodents by a diversity of experimental procedures,
some of which are now considered to have little or no relevance
for human risk assessment" (EMA, 2008). Rodent bioassays with
environmental chemicals are conducted up to high doses
approaching what is maximally tolerated by the animal, and the
subsequent chronic cytotoxicity can lead to increased cell division
to replace damaged cells, and increased DNA replication with a
greater potential to produce mutations. This likely contributes to
the high positive rate in rodent tumorigenicity studies, as was
the case in ToxRefDB with just under half the compounds (148)
identified as rodent tumorigens. The positive predictivity of the
HTS data was lowest when attempting to correlate with multiple
species tumorigens. Although both species considered here are
rodents, there are frequently significant differences in response
between rats and mice even of the same sex. For example, it has
been reported that carcinogenicity in mice predicts carcinogenicity
in rats with an accuracy of only 70% (Lave et al., 1988) and there
are obvious cautionary implications in extrapolating the results
further for human risk assessment (Knight et al., 2006). However,
hazard information from rodent bioassays alone should not be
construed as a complete indicator of human risk; EPA considers many
factors in the process of interpreting such information relative to
human exposure and risk. The vast majority of these pesticidal
compounds have been evaluated by EPA as having limited carcinogenic
potential in humans (EPA OPP, 2009b), and to be safe when
used according to EPA registration(s). Data from the Ames  assay,
for example, are only a part of the regulatory genetic toxicology
battery results that contributes to the weight of the evidence ap-
proach taken by EPA in consideration of carcinogenesis.
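
The dependence of positive predictivity on the proportion of positives in
the reference data can be illustrated with a short calculation. The
following is a minimal sketch in Python. The function name and the
sensitivity, specificity and prevalence values are hypothetical and are
not results from this study.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """Return PPV = TP / (TP + FP) for a screening assay.

    sensitivity: fraction of true positives the assay detects
    specificity: fraction of true negatives the assay clears
    prevalence:  fraction of positives in the reference data set
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0.0:
        raise ValueError("assay returns no positive calls; PPV is undefined")
    return true_pos / (true_pos + false_pos)


# Hypothetical assay performance, shown at three reference-set prevalences.
for prevalence in (0.1, 0.3, 0.5):
    ppv = positive_predictive_value(0.8, 0.9, prevalence)
    print(f"prevalence = {prevalence:.1f}, PPV = {ppv:.2f}")
```

With sensitivity and specificity held fixed, the positive predictivity
falls as the proportion of positives in the reference set falls, which is
one reason comparisons against different tumorigen classifications can
give different predictivity values for the same assay data.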
   Several physical characteristics are required for an effective and
useful HTS assay: high sample throughput, high reproducibility, low
compound requirement, and low cost of consumables and equipment. The
CellSensor p53 assay had the shortest protocol, taking 24 h to
complete; GreenScreen HC takes 48 h, while CellCiphr p53 takes 24-72 h
depending on the exposure duration. The CellSensor p53 assay can also
achieve the highest throughput, 30,000 to 100,000 compounds per week
in a fully automated laboratory. GreenScreen HC and CellCiphr have a
lower but still high throughput of up to 500-700 compounds per week,
yet can be readily set up manually using auto-dispensing handheld
pipettes. All three HTS assays used proprietary assay reagents and
cell lines available from Gentronix Ltd., Millipore or Invitrogen.
Reproducibility was specifically tested by the random distribution of
replicates of test compounds within the ToxCast compound set. The
GreenScreen HC and CellCiphr p53 assays demonstrated a very high
degree of reproducibility in both qualitative and quantitative
results, for both genotoxicity and cytotoxicity (see Section 4.2).
Genotoxicity results for CellSensor were negative in all cases, so a
fair analysis of reproducibility could not be made for this assay.
   The next stage of this work will entail combining the HTS assay
results presented  here with a much larger set of HTS assay results
from the ToxCast project, along with data from the full EPA regula-
tory test battery genetic toxicology, MOA analysis and structure-
activity relationship (SAR) approaches to data analysis, to derive
predictive signatures for in vivo endpoints. We fully expect differ-
ent types of HTS assays, conjoined with chemical class characteris-
tics, will contribute to prediction of each in vivo toxicity endpoint.
These initial models built  from the ToxCast Phase 1 chemicals and
data will be validated on larger sets of compounds, and ultimately
the most predictive models could be employed for screening and
prioritization of large lists of environmental chemicals,  in  align-
ment with the needs of risk assessment. Thus, the HTS genotoxicity
indicator assays presented here are likely to find greatest  value
when deployed in combination with other types of pathway indi-
cators or less related endpoints, such as cytotoxic or SAR measures.
   In conclusion, whereas comparisons to Ames and tumorigenic-
ity  data have been useful in this particular project in order  to as-
sess the relevance and quality of the results  produced,  positive
data from these  high-throughput genotoxicity assays alone are
not  suitable surrogates for standard regulatory genotoxicity as-
says; neither can these HTS results be used in  isolation  as  direct
predictors of animal tumorigenicity. However, data  from  these
HTS assays can reasonably be  applied in four areas: (i)  screening
for compounds capable of causing genetic damage in vitro and thus
used for 'hazard identification'; i.e., the  rapid and  accurate identi-
fication of compounds with the potential to induce carcinogenic-
ity; (ii) as one part of a weight of evidence assessment to inform
on the likelihood  of a compound's adverse effect for humans; (iii)
to aid the determination of the mode of action for carcinogenicity;
and (iv) to aid prioritization of a compound for further follow-up
testing in in vitro  assays and rodent bioassays (EPA, 2007).
Conflict of interest statement

   Andrew Knight,  Louise  Birrell and  Richard  Walmsley  are
employees of Gentronix Ltd., who developed and manufacture
the GreenScreen HC assay. All other authors have no conflicts of
interest to declare.
                            Disclaimer

                               The United States Environmental Protection Agency through its
                            Office of Research and Development partially funded and collabo-
                            rated in the research described here. It has been  subjected to
                            Agency review and approved for submission and peer review. Ref-
                            erence to any specific commercial products, process, or service by
                            trade name, trademark,  manufacturer, or otherwise, does not nec-
                            essarily constitute or imply its endorsement, recommendation, or
                            favoring by the United States Government.


                            Acknowledgements

                               Chihae Yang wishes to thank Leadscope and US FDA Center for
                            Food Safety and Applied Nutrition for the use of their databases to
                            enable the statistical analysis.


                            Appendices A and B  Supplementary data

                               Supplementary data associated with this article can be found, in
                            the online version, at doi:10.1016/j.yrtph.2009.07.004.

                            References

                            Ames, B.N., Lee, F.D., Durston, W.E., 1973a. An improved bacterial test system for the
                               detection and classification of mutagens and carcinogens. Proc. Nat. Acad. Sci.
                               USA 70, 782-786.
                            Ames, B.N., Durston, W.E., Yamasaki, E., Lee, F.D., 1973b. Carcinogens are mutagens:
                               a simple test system combining liver homogenates for activation and bacteria
                               for detection. Proc. Nat. Acad. Sci. USA 70, 2281-2285.
                            Ashby, J., Tennant, R.W., 1988.  Chemical structure, Salmonella mutagenicity and
                               extent of carcinogenicity as indicators of genotoxic carcinogenesis among 222
                               chemicals tested in rodents by the US NCI/NTP. Mutat. Res. 204, 17-115.
                            Bartlett, D.W., Clough, J.R., Godwin, J.R., Hall, A.A., Hamer, M., Parr-Dobrzanski, B.,
                               2002. The strobilurin fungicides. Pest. Man. Sci. 58, 649-662.
                            Bentley, K.S., Kirkland, D., Murphy, M., Marshall, R., 2000. Evaluation of thresholds
                               for  benomyl- and  carbendazim-induced  aneuploidy in cultured human
                               lymphocytes using fluorescence in situ hybridization. Mutat. Res. 464, 41-51.
                            Brambilla, G., Martelli, A., 2009. Update on genotoxicity and carcinogenicity testing
                               of 472 marketed pharmaceuticals. Mutat. Res. 681, 209-229.
                            Brattain,  M.G.,  Fine, W.D.,  Khaled, F.M., Thompson, J.,  Brattain,  D.E.,  1981.
                               Heterogeneity of malignant cells from a human colonic carcinoma. Cancer
                               Res. 41, 1751-1756.
                            Bryce, S.M., Avlasevich, S.L., Bemis, J.C., Lukamowicz, M., Elhajouji, A., Van
                               Goethem, F., De Boeck, M., Beerens, D., Aerts, H., Van Gompel, J., Collins, J.E.,
                               Ellis, P.C., White, A.T., Lynch, A.M., Dertinger, S.D., 2008. Interlaboratory
                               evaluation of a flow cytometric, high content in vitro micronucleus assay.
                               Mutat. Res. 650, 181-195.
                            CCRIS, 2009. US National Library of Medicine's Toxicology Data Network (TOXNET)
                               Chemical Carcinogenesis Research Information System. Available from: . (accessed January 2009).
                            Chen, T., Moore, M.M.,  2004. Screening for chemical mutagens using the mouse
                               lymphoma assay. In: Yan, Z., Caldwell, G.W. (Eds.), Methods in Pharmacology
                               and  Toxicology: Optimization in Drug Discovery: In Vitro Methods. Humana
                               Press, pp. 337-352.
                            Cimino, M.C., 2006. Comparative overview of current international strategies and
                               guidelines for genetic toxicology testing for regulatory purposes. Environ. Mol.
                               Mutagen. 47, 362-390.
                            Combes, R.D., 2000. The use of structure-activity relationships and markers of cell
                               toxicity to detect non-genotoxic carcinogens. Toxicol. in Vitro 14, 387-399.
                            Cooper,  J.A., Saracci, R.,  Cole,  P., 1979. Describing the validity of carcinogen
                               screening tests. Br. J. Cancer 39, 87-89.
                            CPDB, 2009. Berkeley Carcinogenic Potency Database. Available from: . (accessed January 2009).
                            Custer, L.L., Sweder, K.S., 2008. The role of genetic toxicology in drug discovery and
                               optimization. Curr. Drug. Met. 9, 978-985.
                            Dearfield, K.L., Auletta, A.E., Cimino, M.C., Moore, M.M., 1991. Considerations in the
                               US Environmental  Protection Agency's testing  approach  for mutagenicity.
                               Mutat. Res. 258, 259-283.
                            Dix, D.J., Houck, K.A., Martin, M.T., Richard, A.M., Setzer, R.W., Kavlock, R.J., 2007.
                               The  ToxCast  program for  prioritizing toxicity  testing of environmental
                               chemicals. Toxicol. Sci. 95, 5-12.
                            EMA, 2008. European Medicines Agency, ICH Topic S1B carcinogenicity: testing for
                               carcinogenicity of pharmaceuticals, Step 5, Note for guidance on
                               carcinogenicity: testing for carcinogenicity of pharmaceuticals (CPMP/ICH/
                               299/95). March 2008. Available from:   (accessed January 2009).
EPA, 1997. Pesticide tolerance petition filing for azoxystrobin. Fed. Reg. Doc. 97-
    5683. Tuesday, March 11, 1997.
EPA,  2007.  Framework  for  determining  mutagenic  mode  of  action   for
    carcinogenicity.  Using  EPA's  2005  Cancer Guidelines and  Supplemental
    Guidance   for   Assessing   Susceptibility  from   Early-Life   Exposure  to
    Carcinogens. EPA 120/R-07/002-A.  External Peer  Review Draft September
    2007.  Available  from:   .
    (accessed January 2009).
EPA CFR, 2009. US Environmental Protection Agency's Code of Federal Regulations
    Title 40   Part  158.  Available from:  . (accessed January 2009).
EPA DSSTox,  2009.  US Environmental Protection Agency's Distributed Structure-
    Searchable Toxicity Data Network. Available from:  (accessed January 2009).
EPA FIFRA, 2009.  US Environmental Protection  Agency's  Federal  Insecticide,
    Fungicide,  and  Rodenticide  Act.  Available  from:  . (accessed January 2009).
EPA OPP, 2009a. US Environmental Protection Agency's Office of Pesticide Programs.
    Available   from:   .   (accessed   January
    2009).
EPA OPP,  2009b.   US  Environmental Protection  Agency's  Office  of  Pesticide
    Programs. Available from: . (accessed
    January 2009).
EPA OPPTS, 2009.  US  Environmental Protection Agency's Office of Prevention,
    Pesticides and Toxic Substances Harmonized Test Guidelines, Series 870 Health
    Effects   Volume   I-III.   Available  from:   .
    (accessed January 2009).
EPA ToxCast,  2009. US Environmental  Protection  Agency's  National Center  for
    Computational   Toxicology  ToxCast  Project.  Available   from:   . (accessed January 2009).
EPA ToxRefDB, 2009. US Environmental  Protection Agency's National Center  for
    Computational  Toxicology  ToxRefDB  (Toxicology   Reference  Database).
    Available from:  . (accessed January 2009).
Flückiger-Isler, S., Baumeister, M., Braun, K., Gervais, V., Hasler-Nguyen, N.,
    Reimann, R., Van Gompel, J., Wunderlich, H.-G., Engelhard, G., 2004.
    Assessment of the performance of the Ames II™ assay: a collaborative study
    with 19 coded compounds. Mutat. Res. 558, 181-197.
Hanahan, D., Weinberg, R.A., 2000. The hallmarks of cancer. Cell 100, 57-70.
Harper, J.W., Elledge, S.J., 2007. The DNA damage response:  ten years after. Mol. Cell
    28, 739-745.
Hastwell, P.W., Chai, L.-L., Roberts, K.J., Webster, T.W., Harvey, J.S., Rees, R.W.,
    Walmsley, R.M., 2006. High-specificity and high sensitivity genotoxicity
    assessment in a human cell line: validation of the GreenScreen GADD45a-GFP
    genotoxicity assay. Mutat. Res. 607, 160-175.
Houck, K.A., Kavlock, R.J., 2008.  Understanding mechanisms of toxicity: Insights
    from drug discovery research. Toxicol. Appl. Pharmacol. 277, 163-178.
Invitrogen, 2009.  CellSensor™  p53RE-bla HCT-116  Cell-based  Assay  Protocol.
    Available  from: . (accessed January 2009).
Jagger, C., Tate, M., Cahill, P.A., Hughes, C., Knight, A.W., Billinton, N., Walmsley,
    R.M., 2009. Assessment of the genotoxicity of S9-generated metabolites using
    the GreenScreen HC GADD45a-GFP assay. Mutagenesis 24, 35-50.
Kirkland, D., Aardema, M., Henderson, L., Müller, L., 2005. Evaluation of the ability of
    a battery of three in vitro genotoxicity tests to discriminate rodent carcinogens
    and non-carcinogens I. Sensitivity, specificity and relative predictivity. Mutat.
    Res. 584, 1-256.
Kirkland, D., Pfuhler, S., Tweats, D., Aardema, M., Corvi, R., Darroudi, F., Elhajouji, A.,
    Glatt, H., Hastwell, P., Hayashi, M.,  Kasper, P., Kirchner, S., Lynch, A., Marzin, D.,
    Maurici, D., Meunier, J.R., Müller, L., Nohynek, G., Parry, J., Parry, E., Thybaud, V.,
    Tice, R., van Benthem, J., Vanparys, P., White,  P., 2007. How to reduce false
    positive results  when undertaking in vitro genotoxicity testing and thus avoid
    unnecessary follow-up animal tests: report of an ECVAM workshop. Mutat. Res.
    628, 31-55.
Kirkland, D.J., 1992.  Chromosomal aberration tests in vitro: problems with protocol
    design  and interpretation of results. Mutagenesis 7, 95-106.
Knight, A.,  Bailey, J., Balcombe, J.,  2006.  Animal carcinogenicity  studies: 1. Poor
    human predictivity. ATLA 34, 19-27.
Knight, A.W., Birrell, L., Walmsley, R.M., 2009. Development and validation of a
    higher  throughput screening  approach  to  genotoxicity testing using the
    GADD45a-GFP GreenScreen HC assay. J. Biomol. Scr. 14, 16-30.
Knight,  A.W., Goddard,  N.J., Billinton, N., Cahill, P.A.,  Walmsley,  R.M.,  2002.
    Fluorescence  polarisation  discriminates  green   fluorescent protein   from
    interfering autofluorescence  in  a  microplate  assay  for  genotoxicity.  J.
    Biochem. Biophys. Meth. 51, 165-177.
Lane, D.P., 1992. p53, guardian of the genome. Nature 358, 15-16.
Lave, L.B., Ennever, F.K., Rosenkranz, H.S., Omenn, G.S., 1988. Information value of
    the rodent bioassay. Nature 336, 631-633.
Liu, Y., Kulesz-Martin, M., 2001. p53 protein  at the hub of cellular DNA damage
    response  pathways through sequence-specific and non-sequence specific DNA
    binding. Carcinogenesis 22, 851-860.
                             Martin, M.T., Judson, R.S., Reif, D.M., Kavlock, R.J., Dix, D.J., 2009. Profiling chemicals
                                 based on chronic toxicity results from the US EPA ToxRef database. Environ.
                                 Health Perspect. 117, 392-399.
                             Matthews, E.J., Kruhlak, N.L., Cimino, M.C., Benz, R.D., Contrera, J.F., 2006. An
                                 analysis of genetic toxicity, reproductive and developmental toxicity, and
                                 carcinogenicity data: I. Identification of carcinogens using surrogate endpoints.
                                 Regul. Toxicol. Pharm. 44, 83-96.
                             McCarroll, N.E., Protzel, A., Ioannou, Y., Frank Stack, H.F., Jackson, M.A., Waters, M.D.,
                                 Dearfield, K.L., 2002. A survey of EPA/OPP and open literature on selected
                                 pesticide chemicals III. Mutagenicity and carcinogenicity of benomyl and
                                 carbendazim. Mutat. Res. 512, 1-35.
                             Meek, D.W., 2004. The p53 response to DNA damage. DNA Repair 3, 1049-1056.
                             Müller, L., Kikuchi, Y., Probst, G., Schechtman, L., Shimada, H., Sofuni, T., Tweats, D.,
                                 1999. ICH-Harmonised guidances on genotoxicity testing of pharmaceuticals:
                                 evolution,  reasoning and impact. Mutat. Res. 436, 195-225.
                             Muto, S.,  Baba, H., Uno,  Y., 2003.  Evaluation of the Vitotox™ test as  a high-
                                 throughput genotoxicity assay. Environ. Mut. Res. Commun. 25, 69-75.
                             Nebert,  D.W., Dalton,  T.P.,  2006.  The  role  of  cytochrome  P450  enzymes  in
                                 endogenous signaling pathways and environmental carcinogenesis. Nat. Rev.
                                 Cancer 6, 947-960.
                             Norbury, C.J., Hickson, I.D., 2001. Cellular response to DNA damage. Annu. Rev.
                                 Pharmacol. Toxicol. 41, 367-401.
                             NTP,  2009. National Institutes  of Environmental  Health  Sciences  National
                                 Toxicology Program.  Available  from:  . (accessed
                                 January 2009).
                             Ohno, K., Ishihata, K., Tanaka-Azuma, Y., Yamada, T., 2008. A genotoxicity test
                                 system based on p53R2 gene expression in human cells: Assessment of its
                                 reactivity to various classes of genotoxic chemicals. Mutat. Res. 656, 27-35.
                             Reifferscheid, G., Heil, J., 1996. Validation of the SOS/umu test using test results of
                                 486 chemicals  and comparison with the Ames test and carcinogenicity data.
                                 Mutat. Res. 369, 129-145.
                             Reifferscheid, G., Heil, J., Oda, Y., Zahn, R.K., 1991. A microplate version of the SOS/
                                 umu-test  for  rapid  detection  of genotoxins  and genotoxic  potentials  of
                                 environmental  samples. Mutat. Res. 253, 215-222.
                             RTECS, 2009. US National Institute for Occupational Safety and Health's Registry of
                                 Toxic  Effects of Chemical Substances. Available from: . (accessed January 2009).
                             TMIPH, 2009. Tokyo Metropolitan Institute of Public Health Mutagenicity of Food
                                 Additives.  Available from: . (accessed
                                 January 2009).
                             US  Federal  Register,  2008.  International Conference on Harmonization: draft
                                 guidance   on   S2(R1)   genotoxicity   testing  and  data  interpretation  for
                                 pharmaceuticals intended for human use; availability. Available from:
                                 .
                                 (accessed June 2009).
                             Vernetti, L., Irwin, W., Giuliano, K.A., Gough, A., Johnston, P., Taylor, D.L., 2009.
                                 Cellular systems biology applied to pre-clinical safety testing: a case study of
                                 CellCiphr cytotoxicity profiling. In: Xu, J.J., Ekins, S. (Eds.), Drug Efficacy, Safety
                                 and Biologics Discovery: Emerging Technologies and Tools. Wiley and Sons.
                             Vinken, M., Doktorova, T., Ellinger-Ziegelbauer, H., Ahr, H.-J., Lock, E., Carmichael, P.,
                                 Roggen, E., van Delft, J., Kleinjans, J., Castell, J., Bort, R., Donato,  T., Ryan, M.,
                                 Corvi, R., Keun, H., Ebbels,  T., Athersuch, T.,  Sansone, S.-A.,  Rocca-Serra,  P.,
                                 Stierum, R., Jennings, P., Pfaller,  W., Gmuender, H., Vanhaecke, T., Rogiers, V.,
                                 2008. The carcinoGENOMICS project: Critical selection of model compounds for
                                 the  development  of omics-based in  vitro carcinogenicity  screening  assays.
                                 Mutat. Res. 659, 202-210.
                             Walmsley, R.M., 2005. Genotoxicity  screening: the slow march to  the future. Exp.
                                 Opin. Drug Metabol. Toxicol. 1, 261-268.
                             Watanabe, T., Kataoka, T., Mizuta, S., Kobayashi, M., Uchida, T., Imai, K., Wada, H.,
                                 Kinoshita,  T., Murate, T., Mizutani, S.,  Saito, H., Hotta, T., 1995. Establishment
                                 and characterization of a novel cell line, TK-6, derived from T cell blast crisis of
                                 chronic myelogenous leukemia, with  the secretion of parathyroid hormone-
                                 related protein. Leukemia 9, 1926-1934.
                             Waters, M.D., Stack, H.F., Jackson, M.A., Bridges, B.A., 1993. Hazard identification:
                                 efficiency of short-term tests in identifying germ  cell mutagens and putative
                                 nongenotoxic carcinogens. Environ. Health Perspect. 101, 61-72.
                             Waters, M.D.,  Stack, H.F.,  Rabinowitz, J.R., Garrett, N.E., 1988. Genetic activity
                                 profiles and pattern recognition in test battery selection. Mutat. Res. 205,
                                 119-138.
                             Yamasaki, H.,  Ashby, J., Bignami,  M., Jongen, W.,  Linnainmaa,  K., Newbold, R.F.,
                                 Nguyen-Ba, G.,  Parodi, S., Rivedal, E., Schiffmann, D., Simons, J.W.I.M., Vasseur,
                                 P., 1996. Nongenotoxic carcinogens: development  of detection methods based
                                 on mechanisms: a European project. Mutat. Res. 353, 47-63.
                             Yang, C., Hasselgren, C.H., Boyer, S., Arvidson, K., Aveston, S., Dierkes, P., Benigni, R.,
                                 Benz, R.D., Contrera, J., Kruhlak, N.L., Matthews, E.J., Han, X., Jaworska, J.,
                                 Kemper, R.A., Rathman, J.F., Richard, A.M., 2008. Understanding genetic toxicity
                                 through data mining: the process of building knowledge by integrating multiple
                                 genetic toxicity databases. Toxicol. Mech. Meth. 18, 277-295.
                             Zlokarnik, G., Negulescu, P.A., Knapp, T.E., Mere, L., Burres, N., Feng, L., Whitney, M.,
                                 Roemer, K., Tsien, R.Y.,  1998. Quantitation of transcription and clonal selection
                                 of single living cells with beta-lactamase as reporter. Science 279, 84-88.

-------
           Submitted to Chemical Research in Toxicology

Molecular Modeling for Screening Environmental Chemicals
for Estrogenicity: Use of the Toxicant-Target Approach

Journal: Chemical Research in Toxicology
Manuscript ID: tx-2009-00135x.R2
Manuscript Type: Article
Date Submitted by the Author:
Complete List of Authors:
    Rabinowitz, James; US EPA, NCCT
    Little, Stephen; US EPA, NCCT
    Laws, Susan; US EPA, NHEERL
    Goldsmith, Michael-Rock; US EPA, NERL

ACS Paragon Plus Environment
-------
Molecular Modeling for Screening Environmental Chemicals for Estrogenicity:
Use of the Toxicant-Target Approach

James R. Rabinowitz*, Stephen B. Little, Susan C. Laws (1) and Michael-Rock Goldsmith (2)

National Center for Computational Toxicology, U.S. Environmental Protection Agency,
Research Triangle Park, NC

RECEIVED DATE (to be automatically inserted after your manuscript is accepted if required
according to the journal that you are submitting your paper to)

TITLE RUNNING HEAD  Molecular Modeling - Screening for Estrogenicity

AUTHOR FOOTNOTE *rabinowitz.james@epa.gov, corresponding author

(1) S. Laws is a member of the Reproductive Toxicology Division, NHEERL, U.S. EPA

(2) During this study M-R. Goldsmith was a cross Office of Research and Development
postdoctoral fellow working in the NCCT. He is presently in the National Exposure
Research Laboratory, U.S. EPA, Research Triangle Park, NC

-------
Table of Contents Graphic

-------
ABSTRACT

There is a paucity of relevant experimental information available for the evaluation of the
potential health and environmental effects of many man-made chemicals. Knowledge of the
potential pathways for activity provides a rational basis for the extrapolations inherent in
the preliminary evaluation of risk and the establishment of priorities for obtaining missing
data for environmental chemicals. The differential step in many mechanisms of toxicity may be
generalized as the interaction between a small molecule (a potential toxicant) and one or more
macromolecular targets. An approach based on computation of the interaction between a
potential molecular toxicant and a library of macromolecular targets of toxicity has been
proposed for preliminary chemical screening. In the current study, the interaction between a
series of environmentally relevant chemicals and models of the rat estrogen receptors (ER) was
computed and the results compared to an experimental data set of their relative binding
affinities. The experimental data set consists of 281 chemicals, selected from the U.S. EPA's
Toxic Substances Control Act (TSCA) inventory, that were initially screened using the rat
uterine cytosolic ER-competitive binding assay. Secondary analysis, using Lineweaver-Burk
plots and slope replots, was applied to confirm that only fifteen of these test chemicals were
true competitive inhibitors of ER binding with experimental inhibition constants (Ki) less
than 100 µM. Two different rapid computational "docking" methods have been applied. Each
provides a score that is a surrogate for the strength of the interaction between each
ligand-receptor pair. Using the score that indicates the strongest interaction for each pair,
without consideration of the geometry of binding between the toxicant and the target, all of
the active molecules were discovered in the first 16% of the chemicals. When a filter is
applied based on the geometry of a simplified pharmacophore for binding to the ER, the results
are improved and all of the active molecules were discovered in the first 8% of the chemicals.
In the model that includes the pharmacophore filter, obtaining no false negatives requires
accepting only 8 false positives. These results indicate that molecular "docking" algorithms
that were designed to find the chemicals that act most strongly at a receptor (and therefore
are potential pharmaceuticals) can efficiently separate weakly active chemicals from a library
of primarily inactive chemicals. The advantage of using a pharmacophore filter suggests that
the development of filters of this type for other receptors will prove valuable.
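
The ranking step described in the abstract can be sketched as follows. This is a minimal
illustration in Python, not the authors' implementation. It assumes that each chemical has
one or more scores from a single docking method and that lower scores indicate stronger
predicted interactions. The chemical names, score values and function names are invented
for the example.

```python
from typing import Dict, Iterable, Set


def best_score(scores: Iterable[float]) -> float:
    # Assumption: lower (more negative) score = stronger predicted interaction.
    return min(scores)


def fraction_screened_to_find_all(
    docking_scores: Dict[str, Iterable[float]], actives: Set[str]
) -> float:
    """Rank chemicals by best score and return the fraction of the library
    that must be examined, from the top, before every active is found."""
    if not actives:
        raise ValueError("at least one active chemical is required")
    missing = actives - set(docking_scores)
    if missing:
        raise ValueError(f"actives without scores: {sorted(missing)}")
    ranked = sorted(docking_scores, key=lambda name: best_score(docking_scores[name]))
    remaining = set(actives)
    for rank, name in enumerate(ranked, start=1):
        remaining.discard(name)
        if not remaining:
            return rank / len(ranked)
    raise AssertionError("unreachable: every active has a score")


# Hypothetical library of five chemicals, two of which are experimentally active.
toy_scores = {
    "chem_a": [-9.1, -8.4],
    "chem_b": [-5.0],
    "chem_c": [-7.7, -6.2],
    "chem_d": [-4.3],
    "chem_e": [-6.9],
}
toy_actives = {"chem_a", "chem_c"}
fraction = fraction_screened_to_find_all(toy_scores, toy_actives)
print(f"all actives found in the first {fraction:.0%} of the ranked library")
```

A pharmacophore filter of the kind described in the abstract would correspond, in this
sketch, to discarding poses that fail a geometric test before the best score is taken.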

KEYWORDS Computational molecular docking, Toxicant-Target interactions, Estrogen receptor,
Endocrine disruption

Abbreviations: (RBA) Relative binding affinity, (ER) estrogen receptor, (KIERBL) The label in
DSSTox (15) for the primary experimental data set used in this study (14), (IC50) The
concentration of a test chemical that inhibits the maximal specific binding of 0.33 nM
radiolabeled (3H) 17-β-estradiol (E2) to rat uterine cytosolic ER by 50%, (Ki) Inhibition
constant for the test chemical, i.e., micromolar concentration that will bind to half the ER
at equilibrium in the absence of radioligand or other competitors, (TPR) True positive ratio,
the ratio of the positives discovered divided by the total number of positives, (FPR) False
positive ratio, the ratio of the number of false positives divided by the number of negatives,
(HPTE) 2,2-Bis-(p-Hydroxyphenyl)-1,1,1-Trichloroethane.
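
The TPR and FPR definitions can be checked against the counts reported in the abstract. The
sketch below is illustrative only: the function name is hypothetical, and it assumes that the
ranked library is the full set of 281 chemicals with 15 actives.

```python
def screening_rates(n_total, n_actives, n_false_positives, n_false_negatives=0):
    """Return (TPR, FPR, fraction of library selected) from screening counts."""
    n_negatives = n_total - n_actives
    true_positives = n_actives - n_false_negatives
    tpr = true_positives / n_actives            # positives found / all positives
    fpr = n_false_positives / n_negatives       # false positives / all negatives
    fraction_selected = (true_positives + n_false_positives) / n_total
    return tpr, fpr, fraction_selected


# Counts from the abstract: 281 chemicals, 15 actives, 8 false positives,
# and no false negatives when the pharmacophore filter is applied.
tpr, fpr, fraction_selected = screening_rates(281, 15, 8)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.3f}, fraction selected = {fraction_selected:.3f}")
```

Under these assumptions the 15 actives plus 8 false positives make 23 selected chemicals,
about 8% of 281, which is consistent with the abstract's statement that all actives are
found in the first 8% of the chemicals.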

-------
Page 6 of 35                        Submitted to Chemical Research in Toxicology


INTRODUCTION

The potential risks due to exposure to man-made chemicals in the environment must be evaluated in
order to protect human health and the environment. Little or no experimental information is available
about the potential biological effects of as many as 75% of these chemicals (1). In addition, the data
that are available are often insufficient to adequately evaluate the potential hazards of each of these
chemicals. Therefore, there is a compelling need to develop information that would enable the
screening of the potential health and environmental effects of large numbers of man-made chemicals (2).
While animal studies have historically been preferred for risk assessment, such data sets are often
difficult and expensive to obtain and yet still require significant extrapolation to be applied to this
task.

The National Academy of Sciences report, "Toxicity Testing in the Twenty-First Century", presents a
vision of a new paradigm for toxicity testing (3). This vision is based on the identification and
characterization of toxicity pathways. The capacity of specific chemicals to participate in these
pathways would then be interrogated using rapid (in vitro or perhaps computational) test methods.
Integration of the data using informatics approaches that consider the causal nature of these pathways
could then be applied for the evaluation of the risks posed by a specific chemical. In support of this
vision, efforts are underway to devise approaches that combine more rapidly obtained experimental data
and data computed from chemical structure with informatics approaches to perform toxicity evaluations
(4). In the initial stage, these approaches may be seen as extrapolations from more readily obtainable
data through the more traditionally applied animal data to the evaluation of potential health and
environmental effects. Until more experience is obtained and success is demonstrated, these approaches
should be considered as methods to screen chemicals and develop testing priorities. After sufficient
experience, the support of this approach through the more traditionally accepted animal data may become
unnecessary, and this type of evidence may be used directly to evaluate chemical toxicity.


-------

Knowledge of the potential mechanisms of toxicity provides a rational basis for extrapolation from
chemicals for which there is a great deal of toxicity data to chemicals for which little data exists.
The differential step in many mechanisms of chemical toxicity may be generalized as the interaction
between a small molecule (a potential toxicant) and one or more macromolecular targets. The small
molecule may be the chemical itself or one of its descendants. Modeling the potential of a molecule for
specific interactions of this type is a source of insight into its potential toxicity. In a previous
paper, the application of an approach based on computation of the interaction between a potential
molecular toxicant and a library of macromolecular targets of toxicity was proposed for chemical
screening (5). In order to use a library of this type to assess the potential for untested chemicals to
be toxic and to determine the most likely pathways for toxicity, a rapid method to evaluate interactions
between the molecule and the target is needed. Molecular "docking" (6-8) has been developed to screen
large libraries of chemicals for molecules that interact with specific sites on proteins and therefore
are potential pharmaceutical agents (9-11). It also has been used to identify xenoestrogens (12) but has
infrequently been applied to investigate the potential toxicity of weaker agents (5, 13). It is a rapid
method that depends on a surrogate for the interaction energy (a scoring function) that is determined
heuristically from data on many interactions between small molecules and proteins. In the current study
this computational approach is applied to a set of environmentally relevant chemicals interacting with
the estrogen receptor (ER), and the computational results are compared to results from a recent
experimental study that determined the relative binding affinities (RBA) of these molecules to the rat
ER in a complex preparation (14-16).

In the experimental study, the RBAs were determined from the ability of the natural ligand to bind to
the estrogen receptors in a rat tissue preparation in the presence of increasing amounts of the test
chemical. This assay is currently being evaluated for inclusion in a test battery to be used for
determining the potential of environmental chemicals to disrupt endocrine function (17, 18). It
simulates the initiating step of a process that results in estrogen receptor mediated physiological
responses. Chemicals that are active in this assay may have tissue dependent agonist and/or antagonist
properties. Measuring the response as a function of the concentration of the test chemical allows true
binders to be separated from chemicals that interfere with the binding of the natural ligand through
other mechanisms. These results are analogous to what is modeled in molecular "docking" computer
experiments and as such form an excellent system for exploring the use of computational "docking" as a
tool in a toxicity screen. The data set has the additional advantage that most of the chemicals that
bind are found to bind weakly, so "docking" methods can be tested for their capacity to find chemicals
that bind weakly to a receptor and are not drug-like.

METHODS

Molecular modeling software tools, which have been developed for the discovery of novel pharmaceutical
agents, were applied to identify chemicals that bind weakly to the rat estrogen receptor. These tools
attempt to order a set of chemicals by their capacity to bind to a specific site in a protein (7). This
is accomplished by computationally "docking" each chemical into the ligand binding pocket of the
receptor. In molecular "docking" the most favorable interaction for each chemical-binding pocket pair
(toxicant-target in this case) is identified and quantified.

The protein targets used in this study were derived from crystallographic structures of the estrogen
receptors, each with a ligand bound. The atoms of the ligands were computationally removed from the
crystal structures to create computational targets for "docking". In order to compare the results from
this computational study to the experimental results from the cytosolic preparation (14, 16), four
targets were created. Multiple targets were required because the rat uterine cytosol used in the
experimental study contained both ERα and ERβ and the experimental method does not distinguish between
agonist-like and antagonist-like binding. Crystal structures were not available from the Protein Data
Bank (PDB) (19) for the rat ERα. Therefore the rat ERα targets were derived from human ERα (20, 21) by
homology modeling. There is 96.6% homology between the ligand binding domain of the rat ERα and the
ligand binding domain of the human ERα. The homology models were developed in Molecular Operating
Environment by substituting the ligand binding domain of the rat receptor sequence for the similar human
receptor sequence (22-24). The model targets derived in this way were subsequently optimized by
molecular mechanics using the AMBER94 force-field (25). Crystal structures of the rat ERβ are available
directly from the PDB for both agonist (26) and antagonist (27) co-crystallized conformations. The
AMBER94 force-field was also used to optimize the structures of these targets. In this manner four
targets were created: ERα (agonist), ERα (antagonist), ERβ (agonist) and ERβ (antagonist).

A structural data file containing each of the 281 chemicals included in the experimental study (14) was
obtained from DSSTox (16). The label for this data set in DSSTox is KIERBL. The structures were imported
into the Molecular Operating Environment software (22) and the molecular geometries were optimized using
the MMFFx force field (28). For the purposes of this study a chemical was considered active (i.e., a
true competitive binder for the ER) if it had an experimentally determined Ki and an IC50 less than
100 µM in the experimental study (14). Thus, of the 281 chemicals, 15 were classified as active ER
binders and 266 were classified as inactive for this study. Of those classified as inactive, the
experimental study left some uncertainty about 11 chemicals. Of those 11 chemicals, 9 had limited
competitive binding curves because of poor solubility and had IC50 values that were uncertain in the
experimental study but were greater than 100 µM, while the other 2 chemicals had partial competitive
binding curves that were deemed sufficient to obtain IC50 values, but those values were greater than
100 µM, more than 5 orders of magnitude greater than that of the natural ligand. All chemicals used in
this study are reported in the U.S. EPA's Distributed Structure-Searchable Toxicity (DSSTox) Public
Database Network, where all experimental results and a Structure Data File with chemical identifiers are
available for public access (15, 16).
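
The activity criterion described above can be written compactly. The following is a minimal Python sketch and not the authors' code: the record type `BindingResult`, its field names, and the example values are hypothetical. Only the 100 µM cutoff and the requirement that a Ki was determined come from the text.

```python
from dataclasses import dataclass
from typing import Optional

IC50_CUTOFF_UM = 100.0  # cutoff stated in the text (micromolar)


@dataclass
class BindingResult:
    """Hypothetical record for one chemical in the experimental study."""
    name: str
    ki_um: Optional[float]    # None when no Ki could be determined
    ic50_um: Optional[float]  # None when no IC50 could be determined


def is_active(result: BindingResult) -> bool:
    """Active = a Ki was determined AND the IC50 is below the cutoff."""
    if result.ki_um is None or result.ic50_um is None:
        return False
    return result.ic50_um < IC50_CUTOFF_UM


# Illustrative values only; they are not taken from the study.
examples = [
    BindingResult("weak binder", ki_um=12.0, ic50_um=40.0),
    BindingResult("poorly soluble", ki_um=None, ic50_um=250.0),
    BindingResult("non-binder", ki_um=None, ic50_um=None),
]
labels = [is_active(r) for r in examples]
```

Under this rule a chemical with an uncertain or partial binding curve and an IC50 above 100 µM is labeled inactive, which matches the treatment of the 11 uncertain chemicals described above.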

-------

Two different molecular "docking" approaches, using two different software packages, were employed. The
first package, eHiTS (29, 30), decomposes the optimized structure of each potential ligand into
fragments, docks each fragment individually into the binding site to obtain multiple fragment poses, and
uses a graph matching scheme to reconstruct the structure of the potential ligand based on connectivity
tables. The reconstructed molecular poses are subsequently locally optimized and a score (a surrogate
for the binding energy) for each pose is obtained from a heuristic scoring function (29). The second
algorithm, FRED (31, 32), applies a systematic exploration of the rotational and translational space of
each putative ligand in the target space to discover the configurations for the putative ligand-target
interaction with the best score. It has been shown to be sufficiently accurate and rapid for virtual
screening for leads for potential pharmaceuticals (33, 34). Both of the "docking" methods used in this
study have been shown to be successful at finding drug-like estrogenic molecules, that is, molecules
that bind strongly to the receptor (29, 33). Evaluating their capacity to separate weak binders from
non-binders is a primary goal of this computational study.

RESULTS

The experimental results of Laws and co-workers (KIERBL) (14) provide an excellent data set for
investigating the application of the toxicant-target approach for the prioritization of chemicals for
potential toxicity. The advantages of comparing computational molecular "docking" results to these
experimental results are: first, there are a number of excellent crystal structures of both α and β
estrogen receptors available in the Protein Data Bank (20, 21, 26, 27 and others) that can be used to
synthesize macromolecular targets for computer "docking"; second, a library of 281 chemicals was tested
in the same laboratory with the same protocol for their capacity to compete with radiolabeled
17-β-estradiol for binding to the rat ER; and third, the experiments yield a relatively direct
measurement of what is modeled in computational "docking", the energy of interaction between the test
chemicals and the receptor compared to the energy of interaction of the receptor with 17-β-estradiol.

-------

In the current study each of the 281 chemicals was computationally "docked" into 4 rat ER targets. The
large structural differences between targets obtained from co-crystallization with an agonist or an
antagonist chemical were verified by observation (results not shown). The targets created from the
agonist crystal structures are smaller and enclosed, while the targets created from the antagonist
crystal structures are larger, more open and will accommodate larger molecules.

The best scores for each molecule interacting with each target are reported for each computational
approach and used for further analysis in this study. The chemical names, the experimental determination
of binding from the criteria described in the methods section, and the measured values, along with the
"docking" scores for the interaction of each chemical with each of the four targets using both methods,
are in Table 1 of the supporting material. For this data set each chemical that binds (with the
exception of 17-β-estradiol) has an experimental Ki that is 3 to more than 5 orders of magnitude larger
than that of 17-β-estradiol. The rat uterine cytosol used as the source of receptors in the binding
assays contained both the alpha and beta subtypes, with the predominant type being ERα (35). Models that
attempted to realistically combine ERα and ERβ scores were used, but they did not significantly improve
the results when compared to those from consideration of ERα alone (data not shown). Thus, in the
remainder of this analysis only the scores for ERα with both molecular docking methods are used. Since
the experimental results do not distinguish between molecules that bind like an agonist and molecules
that bind like an antagonist, scores for both targets have been combined to give a composite score for
each chemical. The composite score for each chemical is chosen as the score that indicates the most
stable interaction for that chemical. Comparison of the results for agonist and antagonist targets might
indicate whether the chemical was likely to be an agonist or an antagonist. The composite score for each
chemical was considered along with the separate scores for the agonist and antagonist targets.
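
The composite score described above amounts to taking the better of the two target scores. A minimal Python sketch follows. It assumes that a lower (more negative) score indicates a more stable interaction, which depends on the scoring function and is exposed here as a parameter; the function names and example values are hypothetical.

```python
from typing import Optional


def composite_score(agonist: Optional[float],
                    antagonist: Optional[float],
                    lower_is_better: bool = True) -> Optional[float]:
    """Return the score indicating the most stable interaction.

    A score of None means the chemical could not be docked into that target.
    """
    docked = [s for s in (agonist, antagonist) if s is not None]
    if not docked:
        return None
    return min(docked) if lower_is_better else max(docked)


def preferred_target(agonist: Optional[float],
                     antagonist: Optional[float],
                     lower_is_better: bool = True) -> Optional[str]:
    """Name the target that supplied the composite score (ties go to agonist)."""
    best = composite_score(agonist, antagonist, lower_is_better)
    if best is None:
        return None
    return "agonist" if best == agonist else "antagonist"


# Illustrative scores only; they are not taken from the supporting material.
example = composite_score(-6.2, -4.1)
```

Comparing `preferred_target` across chemicals is one simple way to read off whether a chemical docks more favorably in the agonist or the antagonist conformation.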

-------


The results from computational "docking" are scores (a surrogate for the interaction energy) for each
potential ligand-protein pair. The scores are determined by the internal scoring functions of the
computational method used. The parameters in the scoring functions have been determined from multiple
experimental measurements of protein-small molecule structures (30, 32). They are not adjusted to
optimize the results for comparison to these specific receptors or experimental results. For the purpose
of predicting potential activity, a demarcation between likely active and likely inactive chemicals must
be chosen. An example of this demarcation is shown in Figure 1. The only adjustable parameter in this
approach is the choice of demarcation between the predicted likely active chemicals and the predicted
likely inactive chemicals (line Q-R in Figure 1). In Figure 2 the predicted true positive ratio (TPR) is
plotted as a function of the predicted false positive ratio (FPR) as the position of the demarcation
line between predicted positive and negative (Q-R) is varied. Table 1 shows a synopsis of these results
for ERα targets for both "docking" methods. Using the eHiTS method, the results for the agonist target
are superior to the results for the antagonist target and the composite score. Using the FRED method,
the results for the agonist target are superior to the results for the antagonist target and composite
score for identifying 14 of the 15 active chemicals, but finding one of the active chemicals
(4,4'-sulfonyldiphenol) is more difficult; that chemical is best found by the antagonist target. All the
active chemicals appear in the first 16% of the data set when the eHiTS fragment based approach is used
with the agonist target and in the first 39% (27% when one positive chemical is excluded) when FRED is
used.

46
47              For the preceding  results the best score for each chemical was chosen without consideration of
48
49       the  geometry of binding between the toxicant and the  target.  Many experimental studies of estrogen
50

52       receptor binding have shown that specific interactions between the atoms in the ligand and atoms in the

53
54       receptor play a key role  in molecular recognition.  As a result, a pharmacophore for  binding to the
55
56       estrogen receptor has been developed  (36).  In theory, "docking" algorithms should insure the proper
o /
co
eg       orientation of the ligand  and  the target so that the requirements of this pharmacophore are satisfied

60

                                         ACS Paragon  Plus Environment
                                     Previous  I    TOC

-------
                                   Submitted to Chemical Research in Toxicology                       Page 13 of 35

         because the scoring functions should include the interactions that favor the pharmacophore recognition

2
3        process. However, the development of scoring functions is a continuing process (37, 38) and it has been
4
5        found that including pharmacophores, in the form of constraints on the allowed docking poses, improves
6
         results in some cases (39, 12 and 40).  In the docking program FRED, constraints of this type may be
8
9
10       applied to the geometry of the interacting molecules.  The set of chemicals was docked again under the
11
12       constrained condition that only geometries that produced two appropriate hydrogen bonds between the
13
         toxicant and the target (equivalent of the hydrogen bonds made by the hydroxyl group of the A-ring of
 I O
16
17       17-p-estradiol to an arginine and glutamate in the binding pocket of the receptor) were permitted. The
18
19       resulting scores using this approach in FRED are shown in Table 1 of the supporting material. The plots

21
22       of TPR as a function of FPR for these  results are shown in Figure 3. Table 2 shows these results for
23
24       various choices of demarcation. All positive chemicals are now found in the first 14% (8% if the same
25
2®       difficult chemical is omitted) of the molecules and while the results for finding all 15 chemicals is better

28
29       for the antagonist target than the agonist target, they are quite similar and the agonist target finds the first
30
31       14 more rapidly.  The eHiTS program does not yet contain the option of applying externally determined
32
o o
^       pharmacophore driven constraints directly but it  is possible to use the insight from the results generated
O^
35
36       by FRED to enhance the results obtained from eHiTS.  In the library of possible ligand-receptor poses
37
38       generated by eHiTS, all poses that do not satisfy  the constraints have been eliminated.  Those results are
39
 ®       also shown in Table 1 of the supporting material and the plots of TPR as a function of FPR for these

42
43       results are shown in Figure 4.   Table  2 shows the summary of these  results for various choices of
44
45       demarcation (Q-R).  All positive chemicals appear  in the first 8% of the molecules.  For the  eHiTs
46
         method with the constraints the agonist target consistently yields  the best results.  The addition of a

49
50       simplified pharmacophore constraint significantly decreases the number of false positives that would
51
52       result from a demarcation  scheme  that minimizes the  false negative rate for all  approaches and
53
         eliminates no positive chemicals in this data set. For all the "docking" results in this study, molecules
56
57       that could not be "docked" were inactive in the experimental study.
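
The post hoc filtering applied to the eHiTS pose library can be sketched as follows. This is an illustration only and not the procedure implemented in either docking package: the `Pose` record and its boolean fields are hypothetical stand-ins for a geometric hydrogen-bond test, and lower scores are assumed to be more favorable.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Pose:
    """Hypothetical summary of one docked pose."""
    score: float
    hbond_to_glu: bool  # pose forms the hydrogen bond to the glutamate
    hbond_to_arg: bool  # pose forms the hydrogen bond to the arginine


def satisfies_filter(pose: Pose) -> bool:
    """Simplified pharmacophore: both hydrogen bonds must be present."""
    return pose.hbond_to_glu and pose.hbond_to_arg


def best_filtered_score(poses: Iterable[Pose],
                        lower_is_better: bool = True) -> Optional[float]:
    """Best score among poses that pass the filter, or None if none pass."""
    kept = [p.score for p in poses if satisfies_filter(p)]
    if not kept:
        return None  # treated like a chemical that could not be docked
    return min(kept) if lower_is_better else max(kept)


# Illustrative poses only: the best-scoring pose fails the filter.
toy_poses = [
    Pose(score=-8.0, hbond_to_glu=True, hbond_to_arg=False),
    Pose(score=-6.5, hbond_to_glu=True, hbond_to_arg=True),
    Pose(score=-5.0, hbond_to_glu=True, hbond_to_arg=True),
]
filtered_best = best_filtered_score(toy_poses)
```

The example shows why the filter can change rankings: the unconstrained best score (-8.0) is discarded, and the chemical is ranked by its best pose that satisfies the constraints (-6.5).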

-------


With the exception of 17-β-estradiol, all of the chemicals in this data set that displace the natural
ligand bind only weakly to the estrogen receptor. The computational methods used in this study were
designed to increase the discovery rate of molecules that bind efficaciously to specific protein targets
and have been shown to enrich databases for potential pharmaceutical agents by identifying chemicals
that bind strongly to receptor targets, including the estrogen receptor (33). Applying the same set of
computational tools, an approach has been described that is capable of separating chemicals that bind
weakly to the estrogen receptor from those chemicals that do not bind to the receptor at all, at least
for the example of this data set. In order to explore the activity domain of the approach used in this
study, a set of chemicals that are primarily potent estrogen receptor binders was docked into targets
constructed from the human estrogen receptor (20, 21). The activities of these chemicals at the estrogen
receptors have been measured in a similar manner to the measurements in the KIERBL data set in a single
study (41). Chemical information and the experimental determination of RBA (from (41)) along with the
"docking" scores for the ERα receptor are shown in Table 2 of the supporting material. The demarcation
lines (Q-R), established previously for the KIERBL data set with each docking method, were then used to
separate the active chemicals from the inactive chemicals in this additional data set of primarily
strong binders. With the eHiTS approach all chemicals that compete with E2 for the estrogen receptor are
identified by the agonist target with the exception of coumestrol, methoxychlor (a very
-------

In a previous paper (5) the long term goal of developing a library of molecular targets for chemical
toxicity was introduced. In order to use a library of this type to assess the potential for untested
chemicals to be toxic and to determine the most likely pathways for toxicity, a rapid method to evaluate
interactions between the molecule and the (macromolecular) targets is needed. Molecular "docking" has
been used to screen large libraries of chemicals for molecules that interact strongly with specific
sites on proteins and therefore are potential pharmaceutical agents. In this study we evaluate the
capacity of two "docking" methods to discover chemicals that interact weakly (3-5 orders of magnitude
less than the natural ligand) with the estrogen receptor.

In the KIERBL data set of weak binders used in this study, only 5% of the chemicals have any measured
capacity to compete with 17-β-estradiol for the ER in the tissue preparation. It is difficult to
estimate how many of the untested industrial chemicals are likely to interact with particular biological
receptors. However, it is likely that the number for each receptor is small and their interactions with
the receptors weak when compared to the natural ligand. (Chemicals that interact strongly should be more
easily discovered.) For this reason, the experimental data set used in this study is more relevant to
environmental circumstances than a data set composed of pharmaceuticals or other chemicals that have
been designed to exhibit pharmaceutical-like biological activity.

The choice of the ERs as a model target for this exploration of the application of "docking" as a tool
for exploring potential toxicity has added advantages. First, the target is of environmental interest
and has specific relevance for endocrine disrupter screening. Second, there are a relatively large
number of crystal structures of the ERs, co-crystallized with various agonist and antagonist ligands, in
the literature. The availability of many similar crystal structures supplies leverage to counteract a
shortcoming of the docking methods used in this study, that is, the protein targets are not considered
flexible. The computational targets do not respond to the potential ligand and do not relax to provide a
better fit for each ligand. While there are methods that include protein flexibility (42, 43), they are
currently too computationally intensive to be considered for this type of application. In examples like
the estrogen receptor, where there are a number of crystal structures available, flexibility may be
included by constructing multiple targets for the same receptor and accepting the "best" score for each
toxicant-target pair. In this study we have utilized targets for both the agonist and antagonist
configuration of the receptor but only a single target for each of these major receptor configurations.

Two different "docking" packages with two different approaches and different scoring functions have been
used. The most useful results were obtained when a simplified pharmacophore filter was applied to
constrain the allowed "docking" results. By eliminating toxicant-target poses that do not satisfy a
predetermined pharmacophore, the filter significantly reduces the number of chemicals that would be
classified as false positives for each approach used, yet its application had no effect on the discovery
of true positives in this data set, which are up to 5 orders of magnitude weaker than 17-β-estradiol.
The constraints of this pharmacophore based filter could be satisfied by poses generated with 102 of the
molecules considered in this study.

                The pharmacophore filter was derived from decades of experience with the estrogen receptors
oO
36
37       (36) and is used  in this study in a simplified form. It is composed of only two of the three hydrogen
38
39       bonds found in the interaction between strong binders and the ER.  More  complex pharmacophores have
40

 ._       been proposed (44, 45). There is a balance between minimizing the false discovery rate and discovering

43
44       all potential positive chemicals that must be carefully  considered when increasing the complexity of the
45
46       filter.   The same pharmacophore filter used to discover chemicals that bind strongly to a receptor may
47

4Q       not be appropriate in a  chemical  screen designed to also find  chemicals  that  bind weakly.   The
50
51       pharmacophore filter used in  this   study contains  the geometry  for a single  hydrogen bond  donor
52
53       interacting with a glutamate and a single hydrogen bond acceptor interacting with an arginine. For other
54

56       potential receptor targets the available literature may not be as extensive and a similar approach for the
57
58       development  of a pharmacophore filter may not be feasible.   An  alternative approach for the
59
60

                                          ACS Paragon Plus Environment
                                     Previous  I     TOC

-------
                                   Submitted to Chemical Research in Toxicology                       Page 17 of 35

         development of pharmacophore filters is to use computational methods to identify a pharmacophore

2
3        from crystal structures of the target protein co-crystallized with an array of ligands (46, 47).
4
5

The pesticide methoxychlor could not fully satisfy even this simple pharmacophore. It contains the necessary hydrogen bond acceptor but does not contain a potential hydrogen bond donor. Experimental evidence suggests it interacts weakly and in a complex manner with the estrogen receptors (48-50). HPTE (2,2-bis-(p-hydroxyphenyl)-1,1,1-trichloroethane), a metabolite of methoxychlor, does satisfy the filter and is a more active estrogen receptor binder than its parent (48, 49) (see Table 2 of the supplemental material for the docking results for HPTE). An even simpler pharmacophore may provide the best approach for including all potential weakly estrogenic molecules.
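
The feature-level part of this argument can be illustrated with a minimal sketch. It assumes that hydrogen bond donor and acceptor counts have already been computed for each molecule. The `Ligand` class, the function name and the counts shown are illustrative assumptions, not code or values from this study, and the check deliberately ignores the geometric part of the filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ligand:
    """Hypothetical container for precomputed feature counts."""
    name: str
    n_hbond_donors: int
    n_hbond_acceptors: int


def can_satisfy_filter(ligand, min_donors=1, min_acceptors=1):
    """Return True if the molecule has the features the filter needs.

    This only tests whether one donor and one acceptor are present.
    Whether a docked pose places them correctly relative to the
    glutamate and arginine is a separate, geometric test.
    """
    return (ligand.n_hbond_donors >= min_donors
            and ligand.n_hbond_acceptors >= min_acceptors)


# Illustrative counts: methoxychlor has ether oxygens (acceptors) but no
# hydroxyl groups; HPTE has two phenolic hydroxyl groups.
methoxychlor = Ligand("methoxychlor", n_hbond_donors=0, n_hbond_acceptors=2)
hpte = Ligand("HPTE", n_hbond_donors=2, n_hbond_acceptors=2)

for ligand in (methoxychlor, hpte):
    print(ligand.name, can_satisfy_filter(ligand))
```

Relaxing `min_donors` to 0 would correspond to the "even simpler pharmacophore" mentioned above, at the cost of admitting more inactive molecules.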

The degree of improvement in the results of this study through application of the pharmacophore filter was surprising because the specific interactions characterized by the filter should already be in the scoring functions used in "docking". This result suggests that docking methods could be improved (at least relative to their capacity to discover weak binders) by improving scoring functions. Molecular docking with a pharmacophore filter might be incorporated into a tiered or other structured approach for chemical screening (see for instance 51). The application of a pharmacophore filter could be incorporated into one of the initial tiers and all molecules that could not satisfy the filter in any docking pose would be eliminated before docking. Each individual pose would still be required to satisfy the filter for scoring during the "docking" phase, which would be part of a later tier.
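
The two-tier arrangement described above might be sketched as follows. All names here (`tiered_screen`, the callables passed to it, and the toy data) are hypothetical, and the sketch assumes that a lower score is a better score, which is a convention and not a property of any particular docking program.

```python
def tiered_screen(candidates, can_satisfy, dock, pose_ok):
    """Rank molecules with a two-tier pharmacophore-filtered screen.

    candidates  : iterable of molecule identifiers
    can_satisfy : mol -> bool, tier 1 (could any pose satisfy the filter?)
    dock        : mol -> list of (pose, score) pairs, tier 2
    pose_ok     : (mol, pose) -> bool, per-pose filter test in tier 2
    Returns (mol, best_score) pairs, best (lowest) score first.
    """
    ranked = []
    for mol in candidates:
        if not can_satisfy(mol):
            continue  # eliminated before docking
        passing = [score for pose, score in dock(mol) if pose_ok(mol, pose)]
        if passing:
            ranked.append((mol, min(passing)))
    ranked.sort(key=lambda pair: pair[1])
    return ranked


# Toy data standing in for real docking output.
toy_poses = {"A": [("p1", -9.0), ("p2", -7.5)],
             "B": [("p1", -8.0)],
             "C": [("p1", -10.0)]}
has_features = {"A": True, "B": True, "C": False}
pose_passes = {("A", "p1"): False, ("A", "p2"): True,
               ("B", "p1"): True, ("C", "p1"): True}

result = tiered_screen(["A", "B", "C"],
                       has_features.get,
                       toy_poses.get,
                       lambda mol, pose: pose_passes[(mol, pose)])
print(result)
```

In this toy case molecule C is never docked, and molecule A is ranked by its second pose because its best-scoring pose does not satisfy the filter.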

The "docking" results obtained from the ERα agonist target are consistently better than the docking results for all other individual targets and the composite scores for the KIERBL data set of weak binders. Part of the explanation is that the tissue preparations in the experimental study contain much more ERα than ERβ (35). All of the active chemicals are discovered most rapidly by using only the agonist target, and the addition of results for the antagonist target by creating composite scores increases the false discovery rate. This may result from something specific about the KIERBL data set (perhaps that it does not contain antagonists). The decrease in the discovery rate when composite scores (constructed from agonist and antagonist results) are used may result from the crystal structures used to create the targets. As a result of more favorable crystal structure interactions in the antagonist target, there might be a constant favorable offset. This would introduce false positives through the antagonist target when compared to the agonist target. The current results suggest that this is the case and that an adjustment of the scores when comparing results from different targets is needed. Clearly, the discovery rates of this study (with the KIERBL data set) would not be changed by including docking at an antagonist target and combining it with the results from the agonist target with an offset, but for data sets that contained (bulky) antagonists the results would be improved. Most of the molecular identification features are the same for the agonist and antagonist targets but the antagonist target can accommodate larger molecules. The crystal structure of ERα co-crystallized with hydroxy-tamoxifen was used to create the antagonist target in this study. Hydroxy-tamoxifen is too bulky to dock well into the agonist target. Similar bulky known antagonist molecules were not discovered with only an agonist target when "docking" the series of pharmaceutical-like molecules. It is likely that a general screen would need both targets. As an element in a tiered or another type of ordered approach for determining chemicals that bind to the estrogen receptor, agonist and antagonist targets might be used separately after prefiltering steps based on various molecular descriptors for each target.
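
One way to picture the suggested adjustment is a composite score that takes the better of the two target scores after a constant correction is applied to the antagonist score. This is a minimal sketch under stated assumptions: the function name and the numerical values are hypothetical, lower scores are taken to be better, and no offset was fitted in this study.

```python
def composite_score(agonist_score, antagonist_score, antagonist_offset=0.0):
    """Best of the agonist score and the offset-corrected antagonist score.

    antagonist_offset is added to the antagonist score to cancel a
    constant favorable bias of that target (lower score = better).
    Returns the composite score and the target it came from.
    """
    adjusted = antagonist_score + antagonist_offset
    if agonist_score <= adjusted:
        return agonist_score, "agonist"
    return adjusted, "antagonist"


# Without a correction the antagonist target wins for this molecule.
uncorrected = composite_score(-6.0, -7.0)
# With a hypothetical offset of 1.5 score units the agonist result is kept.
corrected = composite_score(-6.0, -7.0, antagonist_offset=1.5)
print(uncorrected, corrected)
```

A molecule too bulky for the agonist target would still be found through the antagonist target, because its antagonist score would remain the better of the two after the correction.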

The result of the application of a "docking" tool to the toxicant-target paradigm is a list of chemicals ordered by their score. The score for a molecule is a surrogate for the interaction energy between the potential ligand and the target. These scores are derived from scoring functions that are obtained from consideration of the general form of the interaction between a protein and a small molecule and experimental data describing many protein-small molecule interactions. The scoring functions do not depend on the specific chemicals or the specific protein being considered. The score for a specific protein-ligand interaction depends on the geometry of the interaction and the description of the general scoring function. The development of the ordered list is independent of and does not require any data on other chemicals interacting with this specific target. This is different from many other computational methods for evaluating potential toxicity, where data for a training set of chemicals interacting with the specific receptor are required. The use of "docking" tools requires only a crystal structure of the target or of a similar protein if homology models are used.

The variable line Q-R (see Figure 1) used to dichotomize the set of molecules is the only adjustable parameter in this approach. Optimizing the placement of Q-R allows the determination of the best separation between active and inactive chemicals in the data set and facilitates identification of all the positive chemicals in the smallest subset of molecules. This is a reasonable approach where the necessary relevant data exist, but in cases where those data are not available or in cases where there are other rationales (perhaps resource driven) for choosing the number of chemicals to be tested, the ordered list provides a justification for the order of chemicals to be tested.
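
The effect of moving Q-R along the ordered list can be sketched as a count of how many chemicals must be called positive before a chosen number of true positives has been captured. The function name and the toy scores below are assumptions made for illustration, and a lower score is again assumed to be better.

```python
def called_positive_for_k_true(scores, active, k):
    """Number of top-ranked chemicals needed to include k actives.

    scores : docking scores, one per chemical (lower = better)
    active : booleans, True where the chemical is experimentally active
    k      : number of true positives Q-R should capture
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    found = 0
    for n_called, i in enumerate(order, start=1):
        if active[i]:
            found += 1
            if found == k:
                return n_called
    raise ValueError("fewer than k active chemicals in the data set")


toy_scores = [-9.1, -8.7, -8.2, -7.9, -7.0, -6.5]
toy_active = [True, False, True, False, False, True]

for k in (1, 2, 3):
    print(k, called_positive_for_k_true(toy_scores, toy_active, k))
```

Entries of this kind, computed at fixed numbers of true positives, are what Tables 1 and 2 report for each target and method.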

When chemicals in the initial data set are compared to a large set of known estrogen receptor binders (52) and three physicochemical properties that are relevant for ER binding (but probably not a complete set of such properties) are computed (53, 54) and employed for comparison, the KIERBL data set is found to be more diverse than the set of known estrogen binders (see Figure 6). To a major extent this is the case because most of the chemicals are inactive. When only the active chemicals are considered, 4 of the 15 active chemicals are found to be within the physicochemical space of the external strong binding chemicals, 8 are adjacent to that space and 3 are outside that space. About 10% of the nonbinding chemicals are contained in the space of external strong binding chemicals.

CONCLUSIONS

In the primary experimental library considered in this study, only 14 of the 280 chemicals (excluding 17β-estradiol) have measured Ki values and an IC50 less than 100 µM. Computational molecular "docking" methods are able to identify all of these chemicals using targets constructed from ER crystal structures. For the best approach applied in this study, the 14 positive chemicals are identified with only 8 false positive assignments. All of the computational molecular docking approaches and comparisons used in this study give reasonable results. All chemicals from the experimental library that could not be "docked" in the biomolecular targets were inactive.

The molecular docking algorithms applied in this study were developed to aid in pharmaceutical discovery. As such they were designed to find the molecules with the greatest activity. However, the active chemicals in this study (with the exception of 17β-estradiol itself) all have activities 3-5 orders of magnitude smaller than 17β-estradiol and are discovered by these "docking" algorithms. 95% of the chemicals in the library considered in this study are not active. This is not typical of computational studies of environmental chemicals, where the chemical library often contains more positive chemicals and attempts are made to balance the database. This data set is most likely more similar to the problem seen in environmental circumstances, where for any biomolecular target the majority of the chemicals will be negative. The relevant problem is to identify the few chemicals that are positive.

This approach is based on modeling the forces that determine the interaction between a small molecule and a biomolecular target. A score (a surrogate for that interaction energy) is obtained for each molecule. The actual determination of likely positive and negative chemicals depends on the choice of a line of demarcation (called Q-R in Figure 1) imposed on the continuous spectrum of scores for the model interaction. This value may be adjusted to provide a more or less conservative evaluation. In order to use this approach, there must be high quality structural descriptions of the biomolecular targets or similar macromolecules. Previous data on the activity of other chemicals acting at this target are not required.

In this approach a single step in a complex path for toxicity is modeled. It is an indicator of suitability of the potential toxicant to take part in that single facet of more general mechanisms for toxicity. There are undoubtedly other pathways and perhaps other steps in a single pathway for toxicity that may be influenced by a potential chemical toxicant. As a library of targets is developed, interactions with many targets may be combined to provide a more complete picture of potential chemical toxicity. The results generated by the toxicant-target approach may be integrated with other types of data, chemical information, pharmacophores and bioassay data. These different types of data may be best combined in a tiered approach that recognizes the strengths of each type of data and the requirements for obtaining it.

ACKNOWLEDGMENT: MRG was supported during a portion of this work by NHEERL-DESE Training Agreement, EPA CT829471. This work was reviewed by EPA and approved for publication but does not necessarily reflect official Agency policy. We thank Drs. Chang, Dix and Kavlock for reading and commenting on this manuscript and Dr. Richard for helpful advice.

SUPPORTING INFORMATION PARAGRAPH Table 1 of the supporting material contains the name, CAS number, SMILES code, measured IC50 and the activity calls based on the criteria described in this study for each chemical. It also contains all the docking data (scores) used in this study: four targets (ERα and ERβ, from both agonist and antagonist crystal structures) and four methods (FRED without constraints, FRED with 2 constraints, eHiTS without constraints and eHiTS results with the constraints from FRED superimposed on the scores).

SUPPORTING INFORMATION PARAGRAPH Table 2 of the supporting material contains the name, CAS number, SMILES code and RBAs for ERα from reference (41). It also contains the docking data (scores) for both the ERα agonist and antagonist targets using both FRED and eHiTS without constraints.

REFERENCES

-------

(1) Judson, R., Richard, A., Dix, D., Houck, K., Martin, M., Kavlock, R., Dellarco, V., Henry, T., Holderman, T., Sayre, P., Tan, S., Carpenter, T., and Smith, E. (2009) The Toxicity Data Landscape for Environmental Chemicals. Environmental Health Perspectives 117(5), 685-695.

(2) Richard, A.M. (2006) Future of Toxicology - Predictive Toxicology: An Expanded View of "Chemical Toxicity". Chemical Research in Toxicology 19(10), 1257-1262.

(3) National Research Council (NRC) (2007) Toxicity Testing in the 21st Century: A Vision and a Strategy. National Academy Press, Washington, DC. Available at: http://www.nas.edu.

(4) Dix, D. J., Houck, K. A., Martin, M. T., Richard, A. M., Setzer, R. W., and Kavlock, R. J. (2007) The ToxCast Program for Prioritizing Toxicity Testing of Environmental Chemicals. Tox. Sci. 95, 9-12.

(5) Rabinowitz, J. R., Goldsmith, M-R., Little, S. B., and Pasquinelli, M. A. (2008) Computational Molecular Modeling for Evaluating the Toxicity of Environmental Chemicals: Prioritizing Bioassay Requirements. Environmental Health Perspectives 116, 573-577.

(6) Kuntz, I.D. (1992) Structure-based strategies for drug design and discovery. Science 257(5073):1078-1082.

(7) Halperin, I., Ma, B., Wolfson, H., Nussinov, R. (2002) Principles of docking: an overview of search algorithms and a guide to scoring functions. Proteins 47(4):409-443.

(8) Sousa, S.F., Fernandes, P.A., Ramos, M.J. (2006) Protein-ligand docking: current status and future challenges. Proteins Struct Funct Bioinform 65:15-26.

(9) Abagyan, R., Totrov, M. (2001) High-throughput docking for lead generation. Curr Opin Chem Biol 5(4):375-382.

(10) Bissantz, C., Folkers, G., and Rognan, D. (2000) Protein-Based Virtual Screening of Chemical Databases. 1. Evaluation of Different Docking/Scoring Combinations. J. Med. Chem. 43:4759-4767.

(11) Cozzini, P., and Dottorini, T. (2004) Is it possible docking and scoring new ligands with few experimental data? Preliminary results on estrogen receptor as a case study. European J. Med. Chem. 31:601-609.

(12) Amadasi, A., Mozzarelli, A., Meda, C., Maggi, A., and Cozzini, P. (2009) Identification of Xenoestrogens in Food Additives by an Integrated In Silico and In Vitro Approach. Chem. Res. Toxicol. 22(1):52-63.

(13) Ekins, S. (2004) Predicting undesirable drug interactions with promiscuous proteins in silico. Drug Discov Today 9(6):276-285.

(14) Laws, S. C., Yavanhxay, S., Cooper, R. L., and Eldridge, J. C. (2006) Nature of the Binding Interaction for 50 Structurally Diverse Chemicals with Rat Estrogen Receptors. Tox. Sci. 94, 46-56.

(15) Richard, A. M., and Williams, C. R. (2002) Distributed Structure-Searchable Toxicity (DSSTox) Public Database Network: A Proposal. Mutation Research 499, 27-52.

(16) Laws, S., Kariya, J., Wolf, M., and Richard, A.M. (2009) DSSTox EPA Estrogen Receptor Ki Binding Study (Laws et al.) Database (KIERBL): SDF file and documentation, Launch version: KIERBL_v1a_278_17Feb2009, www.epa.gov/ncct/dsstox/sdf_kierbl.html

(17) http://epa.gov/endo/pubs/edspoverview/finalrpt.htm (accessed 3/24/09)

-------
(18) NIH NTP/ICCVAM Report (2002) Current Status of Test Methods for Detecting Endocrine Disrupters: In Vitro Estrogen Receptor Binding Assays, NIEHS, RTP, NC. http://iccvam.niehs.nih.gov/docs/endo_docs/final1002/erbndbrd/ERBd034504.pdf

(19) Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, G.H., Shindyalov, I.N., Bourne, P.E. (2000) The Protein Data Bank. Nucleic Acids Research 28, 235-242.

(20) PDB ID: 1GWR. Warnmark, A., Treuter, E., Gustafsson, J.A., Hubbard, R.E., Brzozowski, A.M., Pike, A.C. (2002) Interaction of transcriptional intermediary factor 2 nuclear receptor box peptides with the coactivator binding site of estrogen receptor alpha. J Biol Chem. 277(24):21862-8.

(21) PDB ID: 3ERT. Shiau, A.K., Barstad, D., Loria, P.M., Cheng, L., Kushner, P.J., Agard, D.A., Greene, G.L. (1998) The structural basis of estrogen receptor/coactivator recognition and the antagonism of this interaction by tamoxifen. Cell 95(7):927-37.

(22) MOE; Chemical Computing Group, Inc.: Montreal, Quebec, Canada, 2008.

(23) Levitt, M. (1992) Accurate Modelling of Protein Conformation by Automatic Segment Matching. J. Mol. Biol. 226, 507-533.

(24) Fechteler, T., Dengler, U., and Schomburg, D. (1995) Prediction of Protein Three-Dimensional Structures in Insertion and Deletion Regions: A Procedure for Searching Data Bases of Representative Protein Fragments Using Geometric Criteria. J. Mol. Biol. 253, 114-131.

(25) Cornell, W.D., Cieplak, P., Bayly, C.I., Gould, I.R., Merz, Jr., K.M., Ferguson, D.M., Spellmeyer, D.C., Fox, T., Caldwell, J.W., and Kollman, P.A. (1995) A Second Generation Force Field for the Simulation of Proteins, Nucleic Acids, and Organic Molecules. J. Am. Chem. Soc. 117, 5179-5197.

(26) PDB ID: 1HJ1. Pike, A.C., Brzozowski, A.M., Walton, J., Hubbard, R.E., Thorsell, A.G., Li, Y.L., Gustafsson, J.A., Carlquist, M. (2001) Structural insights into the mode of action of a pure antiestrogen. Structure 9(2):145-53.

(27) PDB ID: 2J7X. Pike, A.C.W., Brzozowski, A.M., Hubbard, R.E., Walton, J., Bonn, T., Thorsell, A.-G., Engstrom, O., Ljunggren, J., Gustafsson, J.-A., Carlquist, M. Structure of Agonist-Bound Estrogen Receptor Beta LBD in Complex with LXXLL Motif from NCOA5, to be published.

(28) Halgren, T. (1996) Merck molecular force field. I. Basis, form, scope, parameterization and performance of MMFF94. J. Comput. Chem. 17, 490-519.

(29) http://www.simbiosys.ca/ehits/index.html (accessed 2/22/09)

(30) Zsoldos, Z., Reid, D., Simon, A., Sadjad, B.S., Johnson, A.P. (2006) eHiTS: An Innovative Approach to the Docking and Scoring Function Problems. Current Protein and Peptide Science 7, 421-435.

(31) http://www.eyesopen.com/products/applications/FRED.html (accessed 2/22/09)

(32) McGann, M.R., Almond, H.R., Nicholls, A., Grant, J.A., and Brown, F.K. (2003) Gaussian Docking Functions. Biopolymers 68, 76-90.

(33) Schultz-Gasch, T., and Stahl, M. (2003) Binding Characteristics in structure-based virtual screening: evaluation of current docking tools. J Mol Model 9, 47-57.

-------
(34) Kellenberger, E., Rodrigo, J., Muller, P., and Rognan, D. (2004) Comparative evaluation of eight docking tools for docking and virtual screening accuracy. Proteins: Structure, Function, and Bioinformatics 57, 225-242.

(35) Shughrue, P.J., Lane, M.V., Scrimo, P.J., and Merchenthaler, I. (1998) Comparative distribution of estrogen receptor-α (ER-α) and β (ER-β) mRNA in the rat pituitary, gonad, and reproductive tract. Steroids 63(10):498-504.

(36) Anstead, G.M., Carlson, K.E., and Katzenellenbogen, J.A. (1997) The estradiol pharmacophore: Ligand structure-estrogen receptor binding affinity relationships and a model for the receptor binding site. Steroids 62(3), 268-303.

(37) Stahl, M., Rarey, M. (2001) Detailed analysis of scoring functions for virtual screening. J Med Chem 44(7):1035-42.

(38) Jain, A. N. (2006) Scoring functions for protein-ligand docking. Curr Protein Pept Sci 7(5):407-20.

(39) Muthas, D., Sabnis, Y.A., Lundborg, M., Karlen, A. (2008) Is it possible to increase hit rates in structure-based virtual screening by pharmacophore filtering? An investigation of the advantages and pitfalls of post-filtering. Journal of Molecular Graphics and Modelling 26(8), 1237-1251.

(40) Peach, M.L., Nicklaus, M.C. (2009) Combining docking with pharmacophore filtering for improved virtual screening. J. Cheminformatics http://www.jcheminf.com/content/1/1/6

(41) Kuiper, G.G.J.M., Carlsson, B., Grandien, K., Enmark, E., Hagblad, J., Nilsson, S., Gustafsson, J.-A. (1997) Comparison of Ligand Binding Specificity and Transcript Tissue Distribution of Estrogen Receptors α and β. Endocrinology 138(3):863-879.

(42) Carlson, H.A. (2002) Protein flexibility and drug design: how to hit a moving target. Curr Opin Chem Biol 6(4):447-452.

(43) Sherman, W., Day, T., Jacobson, M.P., Friesner, R.A., Farid, R. (2006) Novel procedure for modeling ligand/receptor induced fit effects. J Med Chem 49:534-553.

(44) Mukherjee, S., Nagar, S., Mullick, S., Mukherjee, A., Saha, A. (2008) Pharmacophore mapping of arylbenzothiophene derivatives for MCF cell inhibition using classical and 3D space modeling approaches. Journal of Molecular Graphics and Modelling 26(6), 884-992.

(45) Yang, J.-M., Shen, T.-W. (2005) A Pharmacophore-Based Evolutionary Approach for Screening Selective Estrogen Receptor Modulators. Proteins: Structure, Function and Bioinformatics 59, 205-220.

(46) Ekins, S., Erickson, J.A. (2002) A Pharmacophore for Human Pregnane X Receptor Ligands. Drug Metabolism and Disposition 30(1), 96-99.

(47) Podolyan, Y., Karypis, G. (2009) Common Pharmacophore Identification Using Frequent Clique Detection Algorithm. J. Chem. Inf. Model. 49(1), 13-21.

(48) Gaido, K.W., Leonard, L.S., Maness, S.C., Hall, J.M., McDonnell, D.P., Saville, B., Safe, S. (1999) Differential Interaction of the Methoxychlor Metabolite 2,2-Bis-(p-Hydroxyphenyl)-1,1,1-Trichloroethane with Estrogen Receptors α and β. Endocrinology 140(12), 5746-5753.

(49) Blair, R.M., Fang, H., Branham, W.S., Hass, B.S., Dial, S.L., Moland, C.L., Tong, W., Shi, L., Perkins, R., Sheehan, D.M. (2000) The Estrogen Receptor Relative Binding Affinities of 188 Natural and Xenochemicals: Structural Diversity of Ligands. Toxicological Sciences 54, 138-153.

-------
(50) Laws, S.C., Carey, S.A., Ferrell, J.M., Bodman, G.J., Cooper, R.L. (2000) Estrogenic Activity of Octylphenol, Nonylphenol, Bisphenol A and Methoxychlor in Rats. Toxicological Sciences 54, 154-167.

(51) Fang, H., Tong, W., Welsh, W.J., Sheehan, D.M. (2003) QSAR Models in Receptor-Mediated Effects: the Nuclear Receptor Superfamily. Theo Chem 622, 113-125.

(52) Liu, T., Lin, Y., Wen, X., Jorrisen, R.N., Gilson, M.K. (2007) BindingDB: A Web-Accessible Database of Experimentally Determined Protein-Ligand Binding Affinities. Nucleic Acids Research 35, D198-D201.

(53) Ertl, P., Rohde, B., Selzer, P. (2000) Fast Calculation of Molecular Polar Surface Area as a Sum of Fragment-Based Contributions and Its Application to the Prediction of Drug Transport Properties. J. Med. Chem. 43, 3714-3717 (as implemented in 22).

(54) Wildman, S.A., Crippen, G.M. (1999) Prediction of Physiochemical Parameters by Atomic Contributions. J. Chem. Inf. Comput. Sci. 39(5), 868-873 (as implemented in 22).

FIGURE CAPTIONS

Figure 1. An Idealized Example of the Number of Chemicals as a Function of Docking Score. In this example the scoring function imperfectly separates the positive chemicals from the negative chemicals. A line Q-R is drawn and used to make a prediction of the activity of each chemical. The predictions made based on the docking scores are a function of the position of Q-R as it is moved horizontally.

Figure 2a. Docking results using the method FRED with no constraints. The True Positive Rate (TPR) is shown as a function of the False Positive Rate (FPR) for the agonist and antagonist targets and for a composite score constructed by selecting the best score of the two for each chemical. They are compared to random selection.

Figure 2b. Docking results using the method eHiTS with no constraints. The TPR is shown as a function of the FPR for the agonist and antagonist targets and for a composite score constructed by selecting the best score of the two for each chemical. They are compared to random selection.

-------

Figure 3. Docking results using the method FRED with 2 constraints. The TPR is shown as a function of the FPR for the agonist and antagonist targets and for a composite score constructed by selecting the best score of the two for each chemical. They are compared to random selection.

Figure 4. Docking results using the method eHiTS with 2 constraints. The TPR is shown as a function of the FPR for the agonist and antagonist targets and for a composite score constructed by selecting the best score of the two for each chemical. They are compared to random selection.

Figure 5. Comparison of docking scores for the KIERBL data set and a set of known strong estrogen agonists (see reference 41). See Tables 1 and 2 of the supplemental material for scores for all chemicals considered. Only a single active chemical has a score that is above the demarcation determined for the weak binders.

Figure 6. Comparison of the KIERBL data set of weak binding chemicals to a larger set of known ER binding chemicals. It is shown in the three dimensions of relevant (but probably not complete) computed (22) physical properties: log P (54), total polar surface area (TPSA) (53), and molecular weight.

-------
Table 1. A Summary of the results for each approach without constraints applied.

                      Number of True Positives Identified
Approach              5       10      14          15
FRED agonist          16      51      76 (27)     153 (54)
FRED antagonist       20      57      105         109 (39)
FRED composite        23      59      104         119 (42)
eHiTS agonist         8       18      36          46 (16)
eHiTS antagonist      18      50      95          97 (35)
eHiTS composite       18      51      68          74 (26)

Each entry is the number of chemicals that would be called positive when Q-R is adjusted to yield the number of true positives heading that column. For example, using the eHiTS method and the rat ERα agonist target, 8 chemicals are called positive to obtain the first 5 true positives. In that case there are 3 false positives, 10 false negatives and 263 true negatives. The numbers in parentheses are the percent of the database. For the last column there are no false negatives.
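
The worked example in the caption can be reproduced with simple bookkeeping. In this sketch the library size of 281 is an inference from the caption's own counts (5 + 3 + 10 + 263) and the 15 actives are those heading the last column. The function name is hypothetical.

```python
def confusion_counts(n_total, n_active, n_called, n_true_found):
    """Confusion-matrix counts for one placement of Q-R.

    n_total      : chemicals in the library
    n_active     : experimentally active chemicals
    n_called     : chemicals called positive (a table entry)
    n_true_found : true positives among them (the column heading)
    """
    fp = n_called - n_true_found
    fn = n_active - n_true_found
    tn = n_total - n_active - fp
    return {"TP": n_true_found, "FP": fp, "FN": fn, "TN": tn}


N_TOTAL = 281   # assumption: inferred from the caption's example
N_ACTIVE = 15

# eHiTS, agonist target, first 5 true positives: 8 chemicals called positive.
counts = confusion_counts(N_TOTAL, N_ACTIVE, n_called=8, n_true_found=5)
tpr = counts["TP"] / (counts["TP"] + counts["FN"])   # true positive rate
fpr = counts["FP"] / (counts["FP"] + counts["TN"])   # false positive rate

# Percent of the database for the eHiTS agonist entry in the last column.
percent_called = round(100 * 46 / N_TOTAL)

print(counts, round(tpr, 3), round(fpr, 4), percent_called)
```

The same arithmetic applied to each table entry gives one (FPR, TPR) point of the kind plotted in Figures 2 through 4.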

-------
      Table 2. A Summary of the results for each approach when 2
      constraints from the pharmacophore are applied.
Number of True
Positives
Identified
               10
               14
   15
Fred agonist

Fred-
antagonist
Fred-
composite


eHiTS agonist

eHiTS
antagonist
10
14

18

15


14

19
39 (14)

37 (13)

48 (17)


 23(8)

42 (15)
eHiTS
composite
10
20
32 (11)
Table 2. A Summary of the results for each approach when 2 constraints from the pharmacophore are applied. Each entry is the number of chemicals that would be called positive when Q-R is adjusted to yield the number of true positives heading that column. For example, using the eHiTS method and the rat ERα agonist target, 7 chemicals are called positive to obtain the first 5 true positives. In that case there are 2 false positives, 10 false negatives and 264 true negatives. The numbers in parentheses are the percent of the database that number represents. For the last column there are no false negatives.

-------
[Figure: plot with axis label "Score" and annotations "Q", "R", and "Predicted"; other labels are not recoverable.]
                                                                    
-------
[Figure: "Results with FRED no Constraints". x-axis: False Positive Rate (0.0 to 1.0); y-axis: 0.0 to 1.0; curves: Agonist, Antagonist, Composite, Random.]

-------
[Figure: "Results with eHiTS no Constraints". x-axis: False Positive Rate (0.0 to 1.0); curves: Agonist, Antagonist, Composite, Random.]

-------
[Figure: "Results with FRED 2 Constraints". x-axis: False Positive Rate (0.0 to 1.0); y-axis: 0.0 to 1.0; curves: Agonist, Antagonist, Composite, Random.]

-------
[Figure: "Results with eHiTS 2 Constraints". x-axis: False Positive Rate (0.0 to 1.0); y-axis: 0.0 to 1.0; curves: Agonist, Antagonist, Composite, Random.]

-------
[Figure: plot with axes "eHiTS (Log K, subscript not legible)" and "FRED (score/10, exponent not legible)"; legend: Strong binder; Weak binder (KIERBL).]

-------
[Figure: plot with axis label "TPSA/20 (Å²)"; legend: KIERBL non-binding; KIERBL weak-binding; ER strong binding (bindingdb.org).]

-------
Bioactivity Profiling Using BioMAP Cell Systems
                              Houck et al
In Press
Profiling Bioactivity of the ToxCast Chemical Library Using BioMAP
Primary Human Cell Systems
Keith A. Houck,a David J. Dix,a Richard S. Judson,a Robert J. Kavlock,a Jian
Yang,b Ellen L. Bergb

aNational Center for Computational Toxicology
Office of Research and Development
United States Environmental Protection Agency
Research Triangle Park, NC 27711

bBioSeek, Inc.
310 Utah, Suite 100
South San Francisco, CA 94080
Other contact information:
dix.david@epa.gov          919-541-2701
judson.richard@epa.gov     919-541-3085
kavlock.robert@epa.gov     919-541-2326
eberg@bioseekinc.com       650-552-0721
Address correspondence to:
Keith Houck, Ph.D.
US EPA
109 T.W. Alexander Dr.
D343-03
Research Triangle Park, NC 27711
Tel: 919-541-5519
Fax:  919-685-3371
Email: houck.keith@epa.gov
Disclaimer: The United States Environmental Protection Agency through its Office of
Research and Development funded and managed the research described here. It has
been subjected to Agency administrative review and approved for submission and peer
review.

Word Count: 5992
Short Title: Bioactivity Profiling Using BioMAP Cell Systems
03JUNE2009 Version                     1/27                       Revised/Submitted

-------

ABSTRACT
       The complexity of human biology has made prediction of health effects as a
consequence of exposure to environmental chemicals especially challenging. Complex
cell systems, such as the Biologically Multiplexed Activity Profiling (BioMAP) primary
human cell-based disease models, leverage cellular regulatory networks to detect and
distinguish chemicals with a broad range of target mechanisms and biological processes
relevant to human toxicity. Here we utilize the BioMAP human cell systems to
characterize effects relevant to human tissue and inflammatory disease biology following
exposure to the 320 environmental chemicals in the Environmental Protection Agency's
(EPA) ToxCast phase I library. The ToxCast chemicals were assayed at four
concentrations in eight BioMAP cell systems, with a total of 87 assay endpoints, resulting
in over 100,000 data points. Within the context of the BioMAP database, ToxCast
compounds could be classified based on their ability to cause overt cytotoxicity in
primary human cell types, or according to toxicity mechanism class derived from
comparisons to activity profiles of BioMAP reference compounds. ToxCast chemicals
with similarity to inducers of mitochondrial dysfunction, cAMP elevators, inhibitors of
tubulin function, inducers of endoplasmic reticulum stress, or NFκB pathway inhibitors
were identified based on this BioMAP analysis. This dataset is being combined with
additional ToxCast datasets for development of predictive toxicity models at the EPA.

Key Words: Toxicology, primary human cells, bioactivity profiling, chemical genetics
-------
INTRODUCTION

       Alternatives to whole-animal testing of environmental and industrial chemicals
are needed for understanding the toxicity potential of the many thousands of chemicals
and materials in commercial use. Cost, animal welfare concerns, and relevance to
human risk are the major issues driving this need. The U.S. EPA has initiated a
large-scale effort, ToxCast, which is investigating high-throughput, in vitro assays as a
means to develop predictive toxicology models. The goal of the project is to compose
broad bioactivity profiles characterizing the in vitro biological activity of a reference set
of chemicals.1 These profiles will be correlated with in vivo toxicity endpoints culled from
extensive in vivo animal studies, including those used for pesticide registration
submissions to the EPA (e.g., chronic toxicity endpoints).2 Generation of bioactivity
profiles for chemicals lacking toxicity information and analysis of similarity to reference
chemicals would then be applied to predict potential for toxicity. The ultimate goal is to
provide an efficient means to prioritize for detailed study the approximately 10,000
industrial and environmental chemicals of potential concern for which minimal toxicity
data currently exist.3

       One important focus of the ToxCast project is measuring the chemical
perturbation of critical cellular signaling pathways that may represent potential modes of
chemical toxicity. This is being accomplished using in vitro bioactivity profiles derived
from screening the ToxCast chemical library against specific molecular targets using
high-throughput biochemical screening assays, as well as testing in a large suite of
cellular assays. One such cellular system that characterizes pharmaceutical drug
function is based on statistical analysis of protein expression in a panel of assays using
primary human cells stimulated in complex environments. This approach, termed
Biologically Multiplexed Activity Profiling (BioMAP), provides characterization of drug
function across a broad range of tissue and disease biology. The pattern of activity in
these systems allows classification of molecules according to mechanism of action,
possibly providing insights into clinical phenomena.4-6 The cell systems contain primary
human cells in different environments relevant to vascular inflammation and immune
activation. BioMAP profiling has been shown to detect and discriminate multiple
functional drug classes including glucocorticoids; TNF-α antagonists; and inhibitors of
calcineurin, HMG-CoA reductase, heat shock protein 90, inosine monophosphate
dehydrogenase, phosphodiesterase 4, phosphoinositide 3-kinase, and p38
mitogen-activated kinase, among others.4-6

       BioMAP systems are designed to model complex human disease and tissue
biology by stimulating human primary cells, single cell type or defined mixtures of cell
types, such that multiple disease- and tissue-relevant signaling pathways are
simultaneously active. The choice of cell types and stimulations is guided by knowledge
of relevant disease biology and mechanisms. Chemical effects are then recorded by
measuring biologically meaningful protein readouts relevant to significant biological
responses (e.g., inflammation, tissue remodeling).4-7 Activity profiling in BioMAP
systems, in conjunction with BioSeek's database that contains profiles of hundreds of
experimental pharmaceutical compounds and approved therapeutics, provided one
interpretation of potential modes of toxicity. The primary goal of the present study was to
generate BioMAP profiles that could then be used as part of the entire ToxCast in vitro
dataset for classifying environmental chemicals in the ToxCast database based on
predictive signatures and putative toxicity pathways.
MATERIALS AND METHODS

Cell culture

       BioMAP systems employed are shown in Table 1. Human umbilical vein
endothelial cells (HUVEC) were pooled from multiple donors, cultured according to
standard methods, and plated into microtiter plates at passage four. Human neonatal
foreskin fibroblasts (HDFn) from three donors were pooled and cultured according to
standard methods. HDFn were plated in low serum conditions 24 hr before stimulation
with cytokines. Primary human bronchial epithelial cells, arterial smooth muscle cells
and keratinocytes were cultured according to standard methods. Peripheral blood
mononuclear cells (PBMC) were prepared from buffy coats from normal human donors
according to standard methods. Monocyte-derived macrophages were differentiated in
the presence of M-CSF according to standard procedures. Concentrations/amounts of
agents added to confluent microtiter plates to build each system were: cytokines (IL-1β,
1 ng/ml; TNF-α, 5 ng/ml; IFN-γ, 20 ng/ml; IL-4, 5 ng/ml), activators (histamine, 10 µM;
SAg, 20 ng/ml or LPS, 2 ng/ml), growth factors (TGF-β, 5 ng/ml; EGF, bFGF, and
PDGF-BB, 10 ng/ml), PBMC (7.5 x 10^4 cells/well) or macrophages (3.5 x 10^4
cells/well). All primary human cells utilized in this work were obtained via commercially
available sources.

Compounds

      Compounds were tested at 40, 13.3, 4.4 and 1.48 µM for the study, in a single
well per readout parameter. Compounds were prepared in DMSO from 20 mM stock
solutions, added 1 hr before stimulation of the cells, and were present during the whole
24 hr stimulation period. Final DMSO concentration was 0.2%. Colchicine, 3.3 µM, was
included as a positive control. Compounds were tested in a blinded fashion and included
three sets of triplicate samples and five sets of duplicates for quality control purposes.
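
The stated concentrations and DMSO level are mutually consistent, which a short sketch
can confirm. It assumes a three-fold serial dilution (inferred from the four listed
concentrations, not stated in the text) and that the 20 mM stock is diluted directly into
the well; the function names are illustrative.

```python
def dilution_series(top_um=40.0, fold=3.0, n_points=4):
    """Concentrations (µM) for a serial dilution starting at top_um.

    The three-fold step is an assumption inferred from 40, 13.3, 4.4, 1.48 µM.
    """
    return [top_um / fold ** i for i in range(n_points)]


def final_dmso_percent(stock_mm=20.0, top_um=40.0):
    """Percent DMSO (v/v) when a DMSO stock is diluted to the top concentration."""
    dilution_factor = (stock_mm * 1000.0) / top_um   # 20,000 µM / 40 µM = 500
    return 100.0 / dilution_factor


series = dilution_series()
print([round(c, 2) for c in series])   # rounded concentrations in µM
print(final_dmso_percent())            # percent DMSO at the top concentration
```

A 500-fold dilution of the stock gives 0.2% DMSO, matching the final concentration
reported above.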

      Plate formats. Templates were prepared with seven compounds (four
concentrations) per 96-well plate. One positive control (colchicine) and eight negative
control wells (0.2% DMSO) were employed on each plate. The leftmost and rightmost
columns (A1-H1, A12-H12) were not employed for EPA compounds.
ELISA
       The levels of readout parameters were measured by ELISA as described.9
Briefly, microtiter plates were treated, blocked, and then incubated with primary
antibodies or isotype control antibodies (0.01-0.5 µg/ml) for 1 hr. After washing, plates
were incubated with a peroxidase-conjugated anti-mouse IgG secondary antibody or a
biotin-conjugated anti-mouse IgG antibody for 1 hr followed by streptavidin-HRP for
30 min. Plates were washed and developed with TMB substrate and the absorbance (OD)
was read at 450 nm (subtracting the background absorbance at 650 nm). Quantitation of
TNF-α and PGE2 in the LPS system was done using commercially available kits
according to the manufacturer's directions. Proliferation of PBMC (T cells) was
quantified by Alamar blue reduction and proliferation of adherent cell types was
quantified by SRB staining.10
-------

Other Assessments

       Overtly adverse effects of compounds on cells were determined by 1) measuring
alterations in total protein (SRB assay); 2) measuring the viability of peripheral blood
mononuclear cells; and 3) microscopic visualization. SRB was performed by staining
cells with 0.1% sulforhodamine B after fixation with 10% TCA, and reading wells at 560
nm. PBMC viability was assessed by adding Alamar blue to PBMC that had been
cultured for 24 hours in the presence of activators and compounds and measuring its
reduction after 8 hr. Samples were assessed visually according to the following scheme:
2.0=cobblestone (unactivated phenotype); 1.0=activated (normal phenotype); 0.5=lacy
or sparse; 0.375=rounded; 0.25=sparse and granular; 0.1=no cells in well. During this
procedure, cells were also assessed for the presence of compound precipitates, and
samples were flagged if precipitates were observed.

Data analysis

       Measurement values for each parameter in a treated sample were divided by the
mean value from six DMSO control samples (from the same plate) to generate a ratio.
All ratios were then log10 transformed. Visual categorical scores (see above) were
similarly converted (log10 ratios of 0.3, 0.0, -0.3, -0.4, -0.6, and -1.0). Significance
prediction envelopes were calculated for historical controls (99% and 95%). Hit
prediction envelopes (99% and 95%) were also calculated from historical controls. Hit
prediction envelopes differ from significance envelopes in that they are calculated for the
entire profile, not for individual readout parameters. Thus a 95% hit envelope will contain
95% of control profiles, and therefore will depend on the specific systems and readouts
selected for analysis.8,31 Overtly cytotoxic compounds were identified as those generating
profiles with one or more of the following readouts below the indicated thresholds: SRB
< -0.3, PI or PBMC cytotoxicity < -0.1, or Visual score < -0.6 in one or more systems. The
complete set of results for the 320 chemicals for each of the 87 endpoints can be
accessed at: http://www.epa.gov/ncct/toxcast/.
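
The normalization and cytotoxicity flags described above can be sketched in a few lines.
This is a minimal illustration rather than the authors' pipeline: the function and
argument names are hypothetical, and the PI and PBMC cytotoxicity readouts are folded
into a single argument for brevity.

```python
import math

# Visual categories from "Other Assessments"; 1.0 is the normal phenotype.
VISUAL_SCORES = (2.0, 1.0, 0.5, 0.375, 0.25, 0.1)


def log10_ratio(value, dmso_controls):
    """log10 of a treated-well value over the mean of same-plate DMSO controls."""
    mean_control = sum(dmso_controls) / len(dmso_controls)
    return math.log10(value / mean_control)


def is_overtly_cytotoxic(srb=None, pbmc_cytotox=None, visual=None):
    """Apply the stated thresholds to log10 ratios; None means not measured."""
    return bool(
        (srb is not None and srb < -0.3)
        or (pbmc_cytotox is not None and pbmc_cytotox < -0.1)
        or (visual is not None and visual < -0.6)
    )


# The visual scores map onto the log10 ratios quoted in the text.
visual_log = [round(math.log10(s), 1) for s in VISUAL_SCORES]
print(visual_log)

# A well with half the total protein of its six DMSO controls.
half_protein = log10_ratio(0.5, [1.0] * 6)
print(round(half_protein, 3), is_overtly_cytotoxic(srb=half_protein))
```

Because log10(0.5) is approximately -0.301, the SRB threshold of -0.3 corresponds to a
reduction in total protein of roughly 50%.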

       For analysis of profile similarities, overtly cytotoxic compound profiles were
removed. The correlation metric was a combination of similarity metrics in addition to
Pearson's correlation (J. Yang, E. Berg, personal communication). This approach was
found to improve the accuracy of mechanism classification with test data sets (not
shown), due to the diversity of BioMAP profile characteristics (wide variation in the
number of active readouts, the number of active systems, and in the amplitude of
biomarker readout changes). Thus, the similarity metrics used for the analyses of
profiles included Pearson's correlation, a real value Tanimoto metric,
$T(A,B) = \frac{A \cdot B}{\|A\|^2 + \|B\|^2 - A \cdot B}$, where $A$ and $B$ are the
two profile vectors, and a system weighted-averaged real value Tanimoto metric,
$T_w = \frac{\sum_i W_i T_i}{\sum_i W_i}$, where the sum runs over systems, $T_i$ is the
real value Tanimoto score for the $i$th system, and $W_i$ is the weight for the $i$th
system, $W_i = \frac{n_i}{1 + \exp(-100\,(r_i - 0.09))}$, with $n_i$ the number of
markers in the system and $r_i$ the maximum ratio of the two profiles in that system. The
real value Tanimoto metric was employed as a scaled version for filtering profile
similarities. The scaled version is calculated by first normalizing each profile to the
unit vector (e.g., $A = A/\|A\|$), then applying the formula given above. Similar
profiles were identified as those having Pearson correlations > 0.7 and Tanimoto scores
> 0.5 (or as otherwise indicated). Thresholds for these metrics were set using specific
training and test sets of profile data (not shown). The function similarity map uses the
results of pairwise correlation analysis to project the "proximity" of related profiles
from multi-dimensional space to two dimensions. The two-dimensional projection
coordinates were generated by applying a modified nonlinear mapping technique, using a
modified stress function by Clark. A gradient descent minimization method was used to
minimize the modified stress function, starting from a set of initial positions (e.g.,
from principal components analysis). Distances between compounds are representative of
their similarities, and lines are drawn between compounds whose profiles are sufficiently
similar, with metrics that are above all selected thresholds (passing all filters).
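
The similarity filters described above can be sketched as follows. This is an
illustration under stated assumptions, not the BioSeek implementation: the Tanimoto
denominator uses squared norms (the usual real-valued form), the "max. ratio" in the
system weight is taken as the largest absolute log10 ratio across both profiles in that
system, and all function names are hypothetical. Profiles are assumed to be non-zero and
non-constant, since the metrics are undefined otherwise.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pearson(a, b):
    """Pearson correlation of two equal-length profiles."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [x - ma for x in a]
    db = [y - mb for y in b]
    return dot(da, db) / math.sqrt(dot(da, da) * dot(db, db))


def tanimoto(a, b):
    """Real-valued Tanimoto: A.B / (|A|^2 + |B|^2 - A.B)."""
    ab = dot(a, b)
    return ab / (dot(a, a) + dot(b, b) - ab)


def scaled_tanimoto(a, b):
    """Tanimoto after normalizing each profile to unit length."""
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return tanimoto([x / na for x in a], [y / nb for y in b])


def system_weight(a_sys, b_sys):
    """W_i = n_i / (1 + exp(-100 * (r_i - 0.09))); r_i as assumed above."""
    r = max(abs(v) for v in list(a_sys) + list(b_sys))
    return len(a_sys) / (1.0 + math.exp(-(r - 0.09) * 100.0))


def weighted_tanimoto(a_by_system, b_by_system):
    """System-weighted average of per-system Tanimoto scores."""
    num = den = 0.0
    for name, a_sys in a_by_system.items():
        b_sys = b_by_system[name]
        w = system_weight(a_sys, b_sys)
        num += w * tanimoto(a_sys, b_sys)
        den += w
    return num / den


def is_similar(a, b, r_min=0.7, t_min=0.5):
    """Apply the Pearson > 0.7 and scaled Tanimoto > 0.5 filters."""
    return pearson(a, b) > r_min and scaled_tanimoto(a, b) > t_min


profile_a = [-0.2, 0.1, -0.5, 0.0]
profile_b = [-0.25, 0.05, -0.45, 0.02]
print(is_similar(profile_a, profile_b))
```

The logistic weight suppresses systems in which neither profile moves more than about
0.09 log10 units, so inactive systems contribute little to the weighted score.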
RESULTS

Overview of ToxCast compound activities in BioMAP systems.

       The 320 ToxCast proof-of-concept chemicals (list available at:
http://www.epa.gov/ncct/toxcast/) were tested in eight BioMAP model systems (Table I)
at four concentrations, from 1.3-40 µM. In each assay system, 7-14 biomarker
readouts were measured, for a total of 87 readout measurements per compound per
concentration tested. Active compounds were identified as those compounds for which
biomarker readout changes at one or more concentrations resulted in the overall
compound profile appearing outside of a 95% hit prediction envelope (developed from
replicate negative control samples, see Materials and Methods). Of the 320
compounds tested, 219 (68%) were active based on these criteria. The complete
BioMAP data set for the ToxCast chemicals can be accessed through the internet at:
http://www.epa.gov/ncct/toxcast/.

Cytotoxicity of ToxCast compounds in BioMAP Systems.

       Each BioMAP system (Table I) contains one or more assay endpoints that
correlate with overt cytotoxicity. Such cytotoxicity may confound interpretation of
mechanistic information by causing changes in biomarker levels through effects
secondary to cell death. Thus concentrations of chemicals inducing significant
cytotoxicity were excluded from further analysis. These cytotoxicity endpoints include
sulforhodamine B (SRB) staining for total protein, Alamar blue assessment of peripheral
blood mononuclear cell metabolic activity (PBMC Cytotox.), and a morphologic score
(Vis). The Vis score classifies cell shape as defined in Materials and Methods.
Cytotoxicity of compounds can depend on cell type as well as the cellular environment.
Compound exposures that resulted in a >50% reduction in total protein levels (log10
ratio of SRB < -0.3) were considered overtly cytotoxic, and by this criterion, 70 ToxCast
compounds were cytotoxic to at least one cell type at one or more concentrations.
Figure 1 shows the distribution of cytotoxicities (indicated by red) by cell type (system)
and concentration. Cell-type selective cytotoxicities were observed with a number of
compounds; for example, metiram-zinc was cytotoxic to PBMC and fibroblast-containing
systems, but not endothelial cells, and perfluorooctane sulfonic acid was selectively
cytotoxic to epithelial cells. Compounds exhibiting overt cytotoxicity were most
frequently cytotoxic to endothelial cells (3C system) and least frequently cytotoxic
to smooth muscle cells (SM3C). Although these two systems contain different primary
cell types (endothelial cells versus smooth muscle cells), they are stimulated with the
same combination of stimuli (IL-1β, TNFα, and IFNγ), suggesting a cell type-specific
difference in toxicity mechanisms. As shown in Fig. 1, many of the cytotoxicities
observed show a sharp concentration-response relationship. In addition, a few
compounds show a reverse concentration effect; they are less cytotoxic at higher
concentrations. Compounds showing this effect include fentin,
(2-benzothiazolylthio)methyl thiocyanate (TCMTB), captafol, and captan. This reverse
concentration effect can be attributed to poor solubility at higher concentrations, or
possibly to biological effects such as induction of a protective response like the phase II
enzyme glutathione S-transferase. Solubility was assessed visually for all compounds,
and noted precipitations included captafol on initial cell treatment and captan following
24-hr incubation.

Reproducibility of BioMAP profiles.

       In order to assess the reproducibility of compound activities in these assays, a
positive control, colchicine (a tubulin-binding mitotic poison), was included on every
assay plate. Figure 2 shows the overlay of the replicate control data in a profile plot.
While the amplitude of biomarker changes for each readout shows variability between
replicates (i.e., from plate to plate), the overall profile shape is very consistent and can
be used as a measure of reproducibility. Among replicate samples of colchicine, the
Pearson correlation coefficients between any two replicates were high, ranging from
0.82 to 0.97. Included in the ToxCast chemical library were three chemicals each present
as three independent samples. Correlation coefficients for the 40 µM testing
concentration between any two replicates had ranges of 0.77-0.89 for bensulide,
0.61-0.80 for diclofop-methyl, and 0.20-0.43 for prosulfuron. Note that prosulfuron had
very little significant activity, so its correlations were based on data that largely did
not fall outside the 95% significance envelope defined by the solvent controls. Across all
assays, the average coefficient of variation of the fold-change for the triplicates was
7.2%.
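
A sketch of the two reproducibility summaries used here, the range of pairwise Pearson
correlations among replicate profiles and the percent coefficient of variation, is given
below. The replicate values are made-up illustrative numbers, the function names are
hypothetical, and the use of the sample standard deviation is an assumption, since the
text does not say which estimator was used.

```python
import math
from itertools import combinations
from statistics import mean, stdev


def pearson(a, b):
    """Pearson correlation of two equal-length profiles."""
    ma, mb = mean(a), mean(b)
    da = [x - ma for x in a]
    db = [y - mb for y in b]
    num = sum(x * y for x, y in zip(da, db))
    return num / math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))


def pairwise_correlation_range(replicates):
    """(min, max) Pearson r over all pairs of replicate profiles."""
    rs = [pearson(a, b) for a, b in combinations(replicates, 2)]
    return min(rs), max(rs)


def percent_cv(values):
    """Coefficient of variation in percent (sample standard deviation)."""
    return 100.0 * stdev(values) / mean(values)


# Illustrative log10-ratio profiles for three replicates of one compound.
replicates = [
    [-0.30, 0.05, -0.60, 0.10],
    [-0.25, 0.00, -0.55, 0.15],
    [-0.35, 0.10, -0.50, 0.05],
]
low, high = pairwise_correlation_range(replicates)
print(round(low, 2), round(high, 2))
print(percent_cv([9.0, 10.0, 11.0]))
```

Correlation captures agreement in profile shape and is insensitive to uniform scaling,
which is why it stays high even when amplitudes vary from plate to plate; the coefficient
of variation addresses that amplitude variability separately.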

Correlation analysis of ToxCast compound profiles.

       The BioMAP profiles of ToxCast compounds were compared to one another for
similarity by pairwise correlation analysis (see Materials and Methods). Figure 3 shows
a visualization of these relationships in a function homology map generated by
comparing the resulting correlations and displaying them by non-linear projection in two
dimensions (Sammon mapping, a method of multidimensional scaling that preserves
distance and topology of data).8 Compound similarities that are significant
(above a selected threshold) are shown as connecting lines. Compound profiles
displaying overt cytotoxicity (Fig. 1) were excluded from the map shown.

-------
Analysis of selected ToxCast compounds.

       The pair-wise correlation analysis suggested that ToxCast compounds could be
classified into functional groups (clusters in Fig.  3) by  BioMAP  profiling.  Compounds
were then further analyzed  by comparison to reference  BioMAP profiles to evaluate
similarity of ToxCast compounds to agents with known mechanisms of action.  Examples
of similarities and mechanism classes that were discovered by this analysis are listed in
Table II.

Mitochondrial dysfunction.

       Compounds with potential to induce mitochondrial dysfunction were identified by
their similarity to reference compounds. Reference compounds included oligomycin A,
an inhibitor of mitochondrial ATPase; shaoguamycin (complex I inhibitor); myxothiazol
(complex II inhibitor); and antimycin A (complex III inhibitor). Oligomycin A was tested in
the study along with ToxCast compounds, while the profiles for other reference
compounds were from a reference database, as described.11 Figure 4 shows an overlay
of the BioMAP profiles for the readouts measured in common for two ToxCast
compounds representative of this class, pyraclostrobin and trifloxystrobin, along with the
profile of the reference, myxothiazol. Pyraclostrobin and trifloxystrobin are strobilurin
fungicides known to inhibit mitochondrial function as a mode of their fungicidal action, as
do a number of other compounds listed in this class (Table II).12 Additional compounds
with profiles that significantly match this class include the conazole fungicides and others
that inhibit sterol biosynthesis as a mode of action, suggesting a connection between
sterol biosynthesis and mitochondrial function. Sterol biosynthesis inhibitors in this
mechanism class include prochloraz, triadimenol, difenoconazole, fenarimol, and
flusilazole. A possible mechanism for this activity may be seen in Saccharomyces
cerevisiae, where many of the genes that are required for efficient uptake and/or
transport of sterols are also required for mitochondrial functions.13 Reduced mitochondrial
membrane sterol content affects the adenine nucleotide transporter in the inner
mitochondrial membrane, leading to decreased intra-mitochondrial ATP levels.14
Vertebrate mitochondria contain P450 enzymes involved in sterol metabolism and may
suffer the same fate induced by interference with sterol biosynthesis.

-------
Compounds with potential to induce endoplasmic reticulum stress.

       Compounds with the potential to induce endoplasmic reticulum stress were
identified by their similarity to a group of compounds (A23187, thapsigargin,
chlorambucil, sodium azide, cycloheximide and rotenone) previously identified as
having this potential.9 These compounds are known to interact with diverse targets. For
example, chlorambucil is a nitrogen mustard alkylating agent that is used in
chemotherapy and covalently modifies many cellular targets in addition to DNA; A23187
is a calcium ionophore; thapsigargin inhibits the SERCA ATPase; cycloheximide inhibits
acyl transferase II and protein synthesis; sodium azide inhibits mitochondrial Complex IV;
and rotenone inhibits Complex I. The ability of these compounds to induce endoplasmic
reticulum stress may be through shared secondary targets or common downstream
mechanisms involving the generation of reactive species.11 Several of the chemicals in
this group are inhibitors of mitochondrial function, such as propargite and rotenone, or
sodium channel modulators, such as the pyrethroids resmethrin and tefluthrin.
Interestingly, the pyrethroids, like rotenone, have been associated with effects on
dopaminergic nerve pathways and could play a role in neurological disorders, including
Parkinson's Disease.15-16 As shown in Fig. 5A, the compounds in this class are somewhat
diverse and many show strong concentration-dependent differences in their profiles,
consistent with compounds affecting multiple targets. Many of the compounds in this
class become overtly cytotoxic to cells at higher concentrations.

NFκB inhibitors.

       Compounds  with the  ability to  inhibit the  NFicB pathway were identified by
similarity to  reference  NFicB inhibitors including  Ro106-9920   (kBa  ubiquitination
inhibitor), BAY 11-7085 (kBa phosphorylation inhibitor),  SC-514 and IKK-2 inhibitor IV
(IKK-2 inhibitors), or dimethyl fumarate.11 As previously  described, many chemicals in
this group act through covalent modification of  NFicB pathway components resulting in
inactivation of the NFkB transcription factor.8'17"18 The dithiocarbamate pesticides such
as dazomet, maneb and metam-sodium appear to have such activity. Dithiocarbamates
have also been described as potent inhibitors  of NFKB  activation in cellular assays.19
Fig. 5B shows a BioMAP overlay  of dazomet with dimethyl fumarate.

Elevators of cAMP.

       ToxCast compounds that up-regulate cAMP levels were identified by similarity to
cAMP-elevating reference compounds, including the phosphodiesterase IV inhibitors ICI-63197
and rolipram. Fig. 5C shows an overlay of the BioMAP profiles of atrazine, cyanazine
and simazine, herbicides of the chlorotriazine class with highly related structures. One
notable feature of these profiles is the strong inhibition of TNFα in the BioMAP LPS
system, and inhibition of PAI-1 levels in the HDF3CGF system, hallmark activities of
phosphodiesterase inhibitors that differentiate them from other mechanisms. Atrazine
has been demonstrated to inhibit phosphodiesterases and to activate a cyclic-AMP
response element reporter gene.20,21 Elevated cAMP can induce aromatase expression,
causing increased androgen conversion to estrogens in a wide range of tissues.22 In
addition, atrazine was shown to increase activity for an aromatase reporter gene.21
These activities could explain, in part, the reported endocrine disruptor activity of
atrazine: amplification of cAMP, possibly an increase in the cAMP-responsive cellular
kinase SGK-1, and phosphorylation of the nuclear receptor NR5A (SF-1), a
regulator of aromatase gene expression.21

Microtubule function and estrogen receptor signaling.

       We have previously described similarity between BioMAP profiles of the estrogen
receptor agonist 17β-estradiol, its metabolite 2-methoxyestradiol, and microtubule
destabilizers.8 Compounds in this class were identified by their similarity to
colchicine and vinblastine (microtubule destabilizers), 17β-estradiol, and/or
2-methoxyestradiol. Fig. 5D shows an overlay of benomyl and fludioxonil with paclitaxel.
Benomyl, an antifungal compound, inhibits fungal cell mitosis by binding to microtubules
and deforming their structure. It also has been demonstrated to disrupt mammalian cell
microtubules and inhibit proliferation and mitosis in HeLa cells with an IC50 of around
5 µM, less potent than its activity in fungal cells.23 This may be responsible for the
teratogenic activity of benomyl.24 The  phenylpyrrole fungicide fludioxonil  is a natural
product analog with an unknown mode of action.25  However, fludioxonil has  been shown
to be  a clastogen and  inducer of polyploidy in CHO cells,  activities that are consistent
with effects on microtubule function.26 Teratogenic activity of fludioxonil has not been
demonstrated, although whether the parent molecule  reaches the  developing  fetus
during developmental toxicity testing is not known.

DISCUSSION

       The goal of the ToxCast program is to develop algorithms that predict potential
for  chemical toxicity  using data generated  by  in  vitro  bioactivity profiling  assays
combined with  physicochemical properties. Given this ambitious goal of developing a
comprehensive approach that will cover a large  number of toxicity  mechanisms, the
profiling assays should encompass a wide range of potential toxicity targets. In part, this
requirement is  satisfied by high-throughput, biochemical screening  assays against a
large  number of specific molecular targets.1 However,  having  assays for all potential
individual toxicity targets is not practical. First, knowledge of specific molecular targets of
toxicity  is incomplete  and the ability  to  assay  all  proteins for  effects  of chemical
perturbation is  not presently  possible. Second,  targets of  chemical  toxicity may  be a
function of emergent  properties of more  complex systems such as  the cell or intact
organism that are not present at the biochemical  level.27 For these  reasons, we  have
included complex cellular assay systems as part of the ToxCast program. The BioSeek
cell system models not only provide such cellular assays but also use primary human
cells, and thus may provide a more direct link to human health effects
than would the transformed cell lines typically used in cell-based screening.28

       Eight BioMAP systems were used in the present study yielding 7-14 readouts per
system  for  a total  of  87  readouts  per compound. The different readouts were not
selected for known relevance to toxicity. The selection of readouts and the design of the
assay systems were directed towards optimal detection and discrimination of diverse
target and pathway mechanisms.4-6 We did not assume a priori that individual readouts
were related to toxicity, but rather our hypothesis was that if the patterns of readouts in
these assays  can  be  correlated with  diverse mechanisms, some  of  these  may  be
relevant to toxicity.  Despite the large amount of data generated, only a relatively narrow
breadth of specific  pathophysiology was  covered, chiefly vascular  and inflammatory
biology. However,  the ubiquitous nature  of critical  cell signaling pathways makes it
possible that a relatively higher percentage of important toxicological targets are  present
in these systems. Indeed,  past experience with the BioMAP systems demonstrated the
ability to discern signatures of activity for disease modulators not directly related to the
biology of the specific BioMAP system, for example detection of oncology and
cardiovascular drug activities in systems designed for inflammatory mechanisms.4 The
key to success under this assumption  is a reference library  of bioactivity profiles
covering  important mechanisms  of toxicology both  to validate the assays and  to
distinguish mechanisms  of action  of  unknown chemicals. Correlation  of  bioactivity
profiles of a library of pharmacological  probes and drugs with signatures  generated by
screening the ToxCast chemical library provided evidence for mechanisms of potential
toxicity for a subset of the library.

       Using statistical correlation methods to classify ToxCast chemicals with respect
to potential mechanisms of bioactivity revealed at least five reasonable matches with
profiles in the BioSeek database of known compounds. These mechanisms (NFκB
inhibition, cAMP elevation, induction of mitochondrial dysfunction, endoplasmic reticulum
stress and microtubule inhibition) were associated with 76 of the 309 unique chemical
structures in the library. The types of correlated activities were sometimes related to the
intended mechanism of action of the chemical, e.g. pesticides that act through inhibition
of mitochondrial respiratory chain function in insects or that destabilize microtubule
function in fungi. In other cases, these appear not to be related to the intended mechanism
of action, e.g. inhibition of NFκB activation, cAMP elevation or TNF-α secretion. Such
effects may or may  not reflect toxicity mechanisms. However, they do shed  light on
mechanisms of bioactivity and  potential molecular targets which may prove useful  in
evaluating the potential toxicity of new chemicals. As we gain greater understanding  of
the importance of the many different  signaling pathways, we will better be able  to
describe  what  constitutes  a  "toxicity pathway"  and  assess the significance  of  a
chemical's perturbation of it. An important component of this  effort is the profiling  of
compounds of known pharmacological  and toxicological effects to measure their ability
to modulate these pathways. This will provide validation of the  toxicity link of the
pathway  as well as reference profiles to use for assessing the effects of new chemicals.
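
The profile-matching idea in this paragraph can be illustrated with a minimal sketch. It assumes each BioMAP profile is a list of log10 expression ratios in a fixed readout order, and it uses a plain Pearson correlation with a fixed cutoff. The function names, the example values and the 0.7 cutoff are hypothetical; the significance analysis actually applied in the study (see Materials and Methods) is not reproduced here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length readout profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no pattern to correlate
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(sxx * syy)


def best_reference_match(profile, references, min_r=0.7):
    """Return (name, r) for the most similar reference profile,
    or None if no reference reaches the (hypothetical) cutoff min_r."""
    best = None
    for name, ref in references.items():
        r = pearson(profile, ref)
        if best is None or r > best[1]:
            best = (name, r)
    if best is None or best[1] < min_r:
        return None
    return best


# Hypothetical three-readout profiles, for illustration only.
references = {
    "ref_a": [0.1, -0.5, -0.2],
    "ref_b": [-0.3, 0.2, 0.4],
}
match = best_reference_match([0.2, -1.0, -0.4], references)
```

Because correlation compares the shape of a profile rather than its magnitude, a weak test-compound signature can still match a strong reference signature, which is relevant to the weak signatures discussed below.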

       It was somewhat surprising that, for the most part, chemicals in the ToxCast
library generated weak signatures relative to the profiles of the reference
pharmacological probes  and drugs. All  of the compounds  in  the ToxCast library,
consisting largely of pesticide active ingredients, are bioactive and include many with
significant toxicities when tested at high doses in animals.2 Since bioactivity is a function
of dose, it is important to consider the in vitro screening concentration relative to the
high-dose, in vivo animal toxicology. It is not clear whether the concentrations used in
this profiling effort, i.e. up to a 40 µM top concentration, are a reasonable representation
of the effective tissue concentrations associated with lowest effect levels in the in vivo toxicity
testing.2 Toxicity testing in rodents  is generally done up to the maximally tolerated dose,
and without tissue dosimetry for comparison to concentration response in cell-based, in
vitro assays,  interpretation remains a challenge. As part of their development,  many of
these pesticides were designed to be selectively toxic  towards pest species  and the
weak signatures detected in the BioMAP profiles may reflect this built-in safety margin.

       Not all compounds produced statistically significant profiles that closely matched
reference compounds in the BioSeek database. There are several possible explanations
for this.  Compound  signatures may reflect poly-pharmacology,  i.e.  activity against
multiple pathways, resulting in confounding signatures of  activity that change with testing
concentration. Many of the probes or drugs used to generate  the  reference bioactivity
profiles are very selective for specific pathways as an inherent function of their utility as
pharmacological probes or drugs. Activities against targets other than the intended one may
only occur at much higher concentrations due to medicinal chemistry efforts and
selective screening. The environmental chemicals,  on the other hand, may  never  have
been subject to such engineering and thus could display multiple overlapping profiles at
the concentrations tested without one predominant behavior. Alternatively, there may be
no reference compounds in the database for accurate comparison. For these cases, use
of knowledge about  the ToxCast chemicals may provide useful information for future
screening using the BioMAP systems. Extensive in vivo toxicity data for the compounds
are available  in a relational  database2 and  a  variety of research data exists in the
scientific literature describing modes of action and toxicities for many of the compounds.
Further analysis and predictive modeling of the BioMAP profiles of the ToxCast
compounds may yield classes of chemicals that can be correlated with toxicity endpoints
or mechanisms. This  would serve  to guide interpretation of future  testing of chemicals
with little or no toxicity information available.

       In addition to the methods of analysis described here,  individual endpoints will be
included  in a more comprehensive collection of data that make up the EPA's  ToxCast
project. Data from 8-10 in vitro assay sources (both cell-free and cell-based), plus
physicochemical properties and chemical structure information, are being used to develop
predictive "signatures" of in vivo toxicity as captured in ToxRefDB. A signature in this
context is a  rule or algorithm against which the in vitro data for a chemical is tested. If
the chemical matches the signature, then it is predicted to be toxic for the endpoint being
evaluated. We are using a variety of statistical and machine learning methods to  mine
the ToxCast data.29 The primary inputs from the BioMAP data described here are Lowest
Effective Concentrations (LEC), which are the lowest tested concentrations at which a
given chemical shows a BioMAP response that is statistically significantly different from
background. For each BioMAP system, we exclude components where the LEC is
greater than or equal to the LEC at which cytotoxicity is seen for the chemical in the
particular BioMAP system.
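
The LEC definition and the cytotoxicity exclusion rule above can be sketched as follows. This is a minimal illustration under stated assumptions: the per-concentration significance calls are taken as given booleans, and the function names, readout names and concentration values are hypothetical rather than taken from the study's data pipeline.

```python
def lowest_effective_concentration(significant_by_conc):
    """Return the lowest tested concentration flagged as significantly
    different from background, or None if no concentration is flagged.

    significant_by_conc maps concentration (uM) -> bool.
    """
    hits = [conc for conc, significant in significant_by_conc.items() if significant]
    return min(hits) if hits else None


def usable_lecs(readout_calls, cytotox_calls):
    """Keep readout LECs for one BioMAP system, excluding any readout whose
    LEC is greater than or equal to the cytotoxicity LEC in that system."""
    cytotox_lec = lowest_effective_concentration(cytotox_calls)
    kept = {}
    for readout, calls in readout_calls.items():
        lec = lowest_effective_concentration(calls)
        if lec is None:
            continue  # no significant response at any tested concentration
        if cytotox_lec is not None and lec >= cytotox_lec:
            continue  # response only seen at cytotoxic concentrations
        kept[readout] = lec
    return kept


# Hypothetical calls for one chemical in one system (concentrations in uM).
readout_calls = {
    "VCAM-1": {1.5: False, 4.4: True, 13.3: True, 40.0: True},
    "MCP-1": {1.5: False, 4.4: False, 13.3: True, 40.0: True},
    "IL-8": {1.5: False, 4.4: False, 13.3: False, 40.0: False},
}
cytotox_calls = {1.5: False, 4.4: False, 13.3: True, 40.0: True}
result = usable_lecs(readout_calls, cytotox_calls)
```

In this example only VCAM-1 is retained: MCP-1 first responds at the same concentration at which cytotoxicity appears, and IL-8 never responds.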

       The future of toxicity testing requires  the ability to  determine the effects of
chemicals on  important cellular toxicity  pathways  in human  cells and  interpret the
significance  of  any perturbation.30 Bioactivity profiling of compounds  in  the  primary,
human BioMAP cell systems facilitates this for many  pathways involved in vascular and
inflammatory biology. Coupled with a  reference database  of  many compounds with
known toxicities, this approach provides a means to interpret bioactivity in the context of
effects on human or rodent pathophysiology. Much work remains to be done in terms of
expanding the diversity of toxicological probes for the  reference database, expanding the
pathophysiology covered by the complex cellular assays, and relating in vitro screening
concentrations to actual human exposure  levels. In  addition, environmental toxicology
typically  involves effects resulting  from exposure to complex  mixtures of chemicals.
While not addressed in this study, the pathway-based approach may make it possible to
discern unexpected effects of exposure to mixtures that would only be apparent in
complex  cell biology models. Overall,  this approach represents an important first step in
the ability to  use  primary  human cells  in vitro as model  systems  for  evaluating
environmental chemicals for risks to human health.

ACKNOWLEDGEMENTS

      The data presented in this paper were generated under EPA contract EP-W-07-
039 with BioSeek Inc., as part of the ToxCast research program.  The authors thank
John Nanartowicz and John Southerland for the procurement and management of this
contract.

REFERENCES

   1.  Dix, DJ, Houck, KA, Martin, MT, Richard, AM, Setzer, W,  and Kavlock, RJ: The
       ToxCast program  for prioritizing  toxicity  testing  of environmental chemicals.
       Toxicol Sci 2007;95:5-12.
   2.  Martin M, Judson  RS, Reif DM, Kavlock RJ and Dix  DJ:  Profiling Chemicals
       Based on Chronic Toxicity Results from the U.S. EPA ToxRef Database. Environ
       Health Perspect, in press.
   3.  Judson R, Richard A,  Dix DJ,  Houck K,  Martin M,  Kavlock R, Dellarco V,  Henry
       T,  Holderman T, Sayre P, Tan  S, Carpenter T, Smith  E:  The Toxicity Data
       Landscape for Environmental Chemicals. Environ Health Perspect, in press.
   4.  Kunkel, EJ, Plavec, I, Nguyen, D, Melrose, J, Rosler, ES, Kao, LT, Wang, Y,
       Hytopoulos, E, Bishop, AC, Bateman, R, Shokat, KM, Butcher, EC, & Berg, EL:
       Rapid Structure-Activity and Selectivity Analysis of Kinase Inhibitors by BioMAP
       Analysis in Complex Human Primary Cell-Based Models. Assay Drug Dev
       Technol 2004a;2:431-41.
   5.  Kunkel, EJ,  Dea, M, Ebens, A, Hytopoulos, E, Melrose, J, Nguyen, D, Ota,  KS,
       Plavec, I, Wang, Y, Watson, SR, Butcher, EC, Berg, EL: An Integrative Biology
       Approach for Analysis  of Drug Action in Models of Human Vascular Inflammation.
       FASEB Journal 2004b; 18:1279-81.
   6.  Berg EL,  Kunkel EJ,  Hytopoulos E,  Plavec I: Characterization of compound
       mechanisms and secondary activities by BioMAP analysis. J Pharmacol Toxicol
       Methods 2006; 53:67-74.
   7.  Berg EL, Kunkel EJ, Hytopoulos  E: Biological complexity and drug discovery: a
       practical systems biology approach. Syst Biol (Stevenage) 2005;152:201-6.
   8.  Plavec, I, Sirenko,  O, Privat, S,  Wang, Y,  Dajee, M, Melrose, J, Nakao,  B,
       Hytopoulos, E,  Berg,  EL,  &  Butcher,  EC:  Method  for Analyzing  Signaling
       Networks   in  Complex  Cellular  Systems.  Proc  Natl  Acad  Sci  USA
       2004;101:1223-28.
   9.  Melrose J, Tsurushita N, Liu G, Berg EL:  IFN-gamma inhibits  activation-induced
       expression of E- and P-selectin on endothelial cells. J Immunol
       1998;161:2457-2464.
   10. Ahmed SA, Gogal, RM Jr, Walsh, JE:  A new rapid and simple non-radioactive
      assay to monitor and determine the proliferation of lymphocytes: an alternative to
      [3H]thymidine incorporation assay. J Immunol Methods 1994;170:211-24.
   11. Berg  EL et al., in preparation.
   12. Bartlett DW, Clough JM, Godwin JR, Hall AA, Hamer M, Parr-Dobrzanski B: The
      strobilurin fungicides. Pest Manag Sci 2002;58:649-62.
   13. Reiner S, Micolod D, Zellnig G, Schneiter R:  A genomewide screen reveals a
      role of mitochondria in anaerobic uptake of  sterols in yeast.  Mol Biol Cell
      2006;17:90-103.
   14. Astin  AM and Haslam JM: The effects of altered membrane sterol composition on
      oxidative phosphorylation in a  haem  mutant of Saccharomyces  cerevisiae.
      Biochem J 1977; 166:287-98.
   15. Nasuti C, Gabbianelli R,  Falcioni ML, Di Stefano A,  Sozio P, Cantalamessa  F:
      Dopaminergic system modulation, behavioral changes, and oxidative stress after
      neonatal administration of pyrethroids. Toxicology 2007; 229:194-205.
   16. Ryu  EJ, Harding  HP,  Angelastro  JM,  Vitolo  OV,  Ron   D,  Greene LA:
      Endoplasmic reticulum stress  and the  unfolded  protein  response in  cellular
      models of Parkinson's disease.  J Neurosci. 2002;22:10690-8.
   17. Toledano MB and Leonard WJ: Modulation of transcription factor NF-kappa B
      binding  activity by  oxidation-reduction  in vitro. Proc Natl Acad Sci  USA
      1991;88: 4328-32.
   18. Chen, L, Fischle, W, Verdin, E,  Greene, WC: Duration of nuclear NF-kappaB
      action regulated by reversible acetylation. Science 2001;293:1653-7.
   19. Schreck R, Meier B, Männel DN, Dröge W, Baeuerle PA: Dithiocarbamates as
      potent inhibitors of nuclear factor kappa B activation in intact cells. J Exp Med
      1992; 175:1181-94.
   20. Roberge  M,  Hakk  H,  Larsen  G:  Atrazine is a competitive  inhibitor  of
      phosphodiesterase  but  does not affect the  estrogen  receptor. Toxicol Lett
      2004;154:61-68.
   21. Suzawa M, Ingraham HA: The herbicide atrazine activates endocrine gene
      networks via non-steroidal NR5A nuclear receptors in fish and mammalian cells.
      PLoS ONE 2008;3:e2117.
   22. Fan W,  Yanase T, Morinaga H, Gondo S,  Okabe T, Nomura M,  Komatsu  T,
      Morohashi K, Hayes TB, Takayanagi R, Nawata H: Atrazine-induced aromatase
      expression is SF-1 dependent: implications for endocrine disruption in wildlife
       and reproductive cancers in humans.  Environ Health Perspect 2007; 115:720-7.
   23. Gupta K, Bishop  J,  Peck A, Brown J,  Wilson L,  and Panda D:  Antimitotic
       Antifungal Compound  Benomyl Inhibits Brain  Microtubule Polymerization and
       Dynamics and Cancer Cell Proliferation at Mitosis, by Binding to a Novel Site in
       Tubulin. Biochemistry 2004;43:6645-6655.
   24. Kavlock, RJ, Chernoff, N, Gray, LE Jr, Gray, JA, Whitehouse, D: Teratogenic
       effects of benomyl in the Wistar rat and CD-1 mouse, with emphasis on the route
       of administration.  Toxicol. Appl. Pharmacol 1982;62:44-54.
   25. Rosslenbroich H-J, Stuebler D: Botrytis cinerea—history of chemical control and
       novel fungicides for its management. Crop Protection 2000; 19:557-561.
   26. U.S.  Environmental  Protection Agency: Fludioxonil; Pesticide Tolerance. Fed
       Regist 1998;63:53820-53826.
   27. Bhalla US, Iyengar R: Emergent Properties of Networks of Biological Signaling
        Pathways. Science 1999;283:381-387.
   28. Horrocks C,  Halse R,  Suzuki R,  Shepherd PR: Human cell systems  for drug
       discovery. Curr Opin Drug Discov Devel 2003;6:570-5.
   29. Judson,  R, Elloumi,  F, Setzer, RW,  Li, Z, Shah, I: A Comparison  of  Machine
       Learning Algorithms for Chemical Toxicity Classification Using a Simulated Multi-
        Scale Data Model. BMC Bioinformatics 2008;9:241.
    30. National Research Council: Toxicity Testing in the 21st Century: A Vision and a
        Strategy. Washington, D.C.: National Academies Press; 2007.
   31. Storey, JD, & Tibshirani, R: Statistical significance for genomewide studies. Proc
        Natl Acad Sci USA 2003;100:9440-5.

FIGURES AND TABLES

Table I. Eight BioMAP Systems utilized in this study. BioMAP Systems listed according to their
short  names  are  comprised  of the cell types shown cultured and  stimulated  with  the
environmental factors (added along with test compounds) for 24 hours.  For each System, the
biomarker readouts listed (number of readouts is shown in parentheses) are measured at 24 or
72 hours as described in Materials and Methods.
System   Cell Types                     Environment                         Readouts
3C       Endothelial cells              IL-1β+TNF-α+IFN-γ                   MCP-1, VCAM-1, ICAM-1, Thrombomodulin, Tissue Factor,
                                                                            E-selectin, uPAR, IL-6, MIG, HLA-DR, Prolif., Vis., SRB (13)
4H       Endothelial cells              IL-4+histamine                      VEGFRII, P-selectin, VCAM-1, uPAR, Eotaxin-3, MCP-1, SRB (7)
LPS      Peripheral Blood Mononuclear   TLR4                                CD40, VCAM-1, Tissue Factor, MCP-1, E-selectin, IL-1α,
         Cells + Endothelial cells                                          IL-6, M-CSF, TNF-α, PGE2, SRB (11)
SAg      Peripheral Blood Mononuclear   TCR                                 MCP-1, CD38, CD40, CD69, E-selectin, IL-8, MIG,
         Cells + Endothelial cells                                          PBMC Cytotox, SRB, Proliferation (10)
BE3C     Bronchial epithelial cells     IL-1β+TNF-α+IFN-γ                   uPAR, IP-10, MIG, HLA-DR, IL-1α, MMP-1, PAI-1, SRB,
                                                                            TGF-β1, tPA, uPA (11)
HDF3CGF  Fibroblasts                    IL-1β+TNF-α+IFN-γ+bFGF+EGF+PDGF-BB  VCAM-1, IP-10, IL-8, MIG, Collagen III, M-CSF, MMP-1,
                                                                            PAI-1, Proliferation, TIMP-1, EGFR, SRB (12)
KF3CT    Keratinocytes + Fibroblasts    IL-1β+TNF-α+IFN-γ+TGF-β             MCP-1, ICAM-1, IP-10, IL-1α, MMP-9, TGF-β1, TIMP-2,
                                                                            uPA, SRB (9)
SM3C     Vascular smooth muscle cells   IL-1β+TNF-α+IFN-γ                   MCP-1, VCAM-1, Thrombomodulin, Tissue Factor, IL-6,
                                                                            LDLR, SAA, uPAR, IL-8, MIG, HLA-DR, M-CSF, Prolif., SRB (14)

Table  II. Table of ToxCast compounds  with assigned  mechanisms identified  by similarity of
BioMAP profiles to reference compounds  (see Results).  Reference compounds that were tested
in the study along with ToxCast compounds are shown in  italics.
Mitochondrial Dysfunction
      (Z,E)-Fenpyroximate; Azoxystrobin; Bromoxynil; Butralin; Cyazofamid; D-cis,trans-Allethrin;
      Difenoconazole; Difenzoquat Metilsulfate; Dimethomorph; Ethofumesate; Famoxadone;
      Fenamidone; Fenarimol; Fenitrothion; Fenoxycarb; Fipronil; Fluoxastrobin; Flusilazole;
      Indoxacarb; Methoxychlor; Methyl Isothiocyanate; MGK; Norflurazon; Novaluron;
      Oligomycin A; Oxadiazon; Oxyfluorfen; Paclobutrazol; Prochloraz; Propargite;
      Pyraclostrobin; Pyridaben; Tebufenpyrad; Thiazopyr; Triadimenol; Trifloxystrobin

Microtubule Inhibitors
      17β-Estradiol; 3-Iodo-2-Propynylbutylcarbamate; Abamectin; Benomyl; Chlorpyrifos Oxon;
      Cyprodinil; Diniconazole; Emamectin Benzoate; Endosulfan; Fludioxonil;
      Hexythiazox; Niclosamide; Parathion-Methyl; Prodiamine; Pyridaben (high dose)

ER Stress Inducers
      A23187; Chlorambucil; Cycloheximide; Dithiopyr; Flumiclorac-Pentyl; Propargite;
      Resmethrin; Rotenone; Tefluthrin; Zoxamide

NFκB Inhibitors
      3-Iodo-2-Propynylbutylcarbamate; Acetochlor; Bendiocarb; Benfluralin; Captan; Dazomet;
      Dimethyl fumarate; Formetanate Hydrochloride; IKK-2 Inhibitor IV; Maneb;
      Metam-Sodium Hydrate

cAMP Elevation
      Atrazine; Cyanazine; Propazine; Simazine

Fig. 1. Distribution of cytotoxicities observed for 70 ToxCast compounds. Cytotoxicity
is indicated by SRB measurements or Alamar blue (SAg PBMC). The threshold for
cytotoxicity (High) is indicated by measurements of <-0.3 (Red). Orange and yellow
indicate measurements of -0.2 to -0.3 (Mod) and -0.1 to -0.2 (Low), respectively. No
significant effect on SRB measurements is indicated by white (None). Table columns
are organized by system and concentration (highest concentration on the left in each
system).
[Heat map not reproduced. Column groups (System, Concentration): BE3C, HDF3CGF, KF3CT, LPS, SAg (PBMC), 3C, 4H, SM3C.]
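
The color bands in the Fig. 1 caption amount to a simple binning rule, sketched below. The sketch assumes the plotted value is a log10 ratio relative to solvent controls (as in Fig. 2) and that each band includes its less negative edge; the caption does not state how values exactly at a boundary were treated, so the edge handling and the function name are assumptions.

```python
def cytotoxicity_class(log_ratio):
    """Bin a cytotoxicity readout into the Fig. 1 categories.

    Thresholds follow the caption: < -0.3 High, -0.2 to -0.3 Mod,
    -0.1 to -0.2 Low, otherwise None. Edge handling is an assumption.
    """
    if log_ratio < -0.3:
        return "High"
    if log_ratio <= -0.2:
        return "Mod"
    if log_ratio <= -0.1:
        return "Low"
    return "None"


# Hypothetical readout values for illustration.
labels = [cytotoxicity_class(v) for v in (-0.35, -0.25, -0.15, -0.05)]
```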

Fig. 2. Overlay of BioMAP profiles of positive control compound (colchicine) replicates.
Colchicine was tested in the 8 BioMAP Systems (Figure 1) at 3.3 µM and included as a positive
control on every plate (template) used in the study. All replicates (T1-T16) are shown. The
biomarker readouts measured (see Methods) are indicated along the x-axis. The y-axis shows
the log10 expression ratios of the readout level measurements relative to solvent (DMSO buffer)
controls. Each data point represents a single well. The grey area above and below the dashed
line indicates the 95% significance envelope of DMSO negative controls.
[Plot not reproduced. x-axis: Readout Parameters (Biomarkers), grouped by BioMAP System, with cytotoxicity readouts marked.]
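
The y-axis quantity in Fig. 2 can be written out as a short sketch. It assumes a readout is compared with the mean of the solvent control wells and that the envelope bounds are supplied by the caller; how the 95% envelope was derived from the DMSO controls is not described in the caption, so that step is deliberately left out. Function names and values are hypothetical.

```python
from math import log10


def log10_ratio(readout_level, control_mean):
    """log10 expression ratio of a readout relative to solvent controls."""
    if readout_level <= 0 or control_mean <= 0:
        raise ValueError("levels must be positive to take a log ratio")
    return log10(readout_level / control_mean)


def outside_envelope(value, lower, upper):
    """True if a log ratio falls outside a supplied significance envelope."""
    return value < lower or value > upper


# Hypothetical example: a readout at half the control level.
ratio = log10_ratio(50.0, 100.0)
flagged = outside_envelope(ratio, lower=-0.1, upper=0.1)
```

A readout at half the control level gives a log10 ratio of about -0.30, so on this scale equal fold increases and decreases sit symmetrically around zero.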

Fig. 3. Function Similarity Map for EPA ToxCast Compounds. Compound profiles in 8 BioMAP
systems were compared by pairwise correlation and correlations analyzed for significance and
subjected to non-linear projection (see Materials and Methods). Compounds at concentrations
resulting in overt toxicity to cells are not included. Compound clusters discussed in text include:
(1) mitochondrial inhibitors; (2) NFκB inhibitors; (3) cAMP elevators; and (4) inducers of
endoplasmic reticulum stress.

Fig. 4. BioMAP profiles of pyraclostrobin and trifloxystrobin compared to reference compound
myxothiazol, an inhibitor of mitochondrial electron transport chain complex III. Compounds
were tested at the indicated concentrations as described in Materials and Methods.
[Plot not reproduced. Legend: myxothiazol, 1.111 µM; pyraclostrobin, 13.333 µM; trifloxystrobin, 13.333 µM.]

Fig. 5. BioMAP profiles of selected test chemicals and reference compounds showing profiles
characteristic of A) endoplasmic reticulum stress, B) NFκB inhibition, C) cAMP elevation, and D)
inhibition of microtubule function. Compounds were tested at the indicated concentrations as
described in Materials and Methods.
[Panels A-D not reproduced.]

-------
Page 1 of 14                    ToxSci Advance Access published July 14, 2009


Biologically-Relevant Exposure Science for 21st Century Toxicity Testing

Elaine A. Cohen Hubal*

National Center for Computational Toxicology, U.S. EPA, Research Triangle Park, NC 27711

* Author to whom correspondence should be addressed
U.S. Environmental Protection Agency
National Center for Computational Toxicology
Mail Drop B205-01
Research Triangle Park, NC 27711
Telephone: (919) 541-4077
Internet: hubal.elaine@epa.gov

Send express mail to:
U.S. Environmental Protection Agency
Mail Drop E205-01
4930 Old Page Road
Research Triangle Park, NC 27709

Running Title: EXPOSURE AND THE NRC VISION

Keywords: Exposure, biomarkers, exposure ontology, knowledge bases, and network models

Published by Oxford University Press 2009.
                                  Previous  I    TOC

-------
                                               Toxicological Sciences                                    Page 2 of 14


ABSTRACT

High-visibility efforts in toxicity testing and computational toxicology, including the
recent NRC report, Toxicity Testing in the 21st Century: a Vision and Strategy (NRC,
2007), raise important research questions and opportunities for the field of exposure
science. The authors of the National Academies report (NRC, 2007) emphasize that
population-based data and human exposure information are required at each step of their
vision for toxicity testing, and that these data will continue to play a critical role in
guiding both development and use of the toxicity information. In fact, state-of-the-art
exposure science is essential for translation of toxicity data to assess potential for risk to
individuals and populations and to inform public health decisions. As we move forward
to implement the NRC vision, a transformational change in exposure science is required.
Application of a fresh perspective and novel techniques to capture critical determinants at
biologically-motivated resolution for translation from controlled in vitro systems to the
open, multifactorial system of real-world human-environment interaction will be critical.
Development of an exposure ontology and knowledgebase will facilitate extension of
network analysis to the individual and population for translating toxicity information and
assessing health risk. Such a sea change in exposure science is required to incorporate
consideration of lifestage, genetic susceptibility, and interaction of non-chemical
stressors for holistic assessment of risk factors associated with complex environmental
disease. A new generation of scientific tools has emerged to rapidly measure signals
from cells, tissues, and organisms following exposure to chemicals. Investment in 21st
century exposure science is now required to fully realize the potential of the NRC vision
for toxicity testing.


-------
INTRODUCTION

The NRC report, Toxicity Testing in the 21st Century: a Vision and Strategy (NRC, 2007),
articulates a long-range vision and promotes a transformation in toxicity testing based on
a rapidly evolving understanding of molecular pathways and of their role in normal cell
function and in toxicity. The key aspect of the NRC vision and proposed paradigm shift
in toxicity testing is that new tools are available to examine toxicity pathways in a depth
and breadth that has not been possible before. In response to the NRC report, efforts
underway to apply high-throughput screening (HTS) approaches for chemical
prioritization and toxicity testing have been accelerated (Collins et al., 2008; Dix et al.,
2007). As a result, an explosion of HTS data for in vitro toxicity assays will become
available over the next few years. How will this new toxicity information be translated
to assess potential for risks to individuals and populations from environmental exposures
and to improve public health?

The authors of the National Academies report (NRC, 2007) emphasize that
population-based data and human exposure information are required at each step of their
vision for toxicity testing. Exposure needs highlighted in the NRC report include (1) human
exposure data to select doses for toxicity testing and facilitate development of
environmentally-relevant hazard information; (2) biomonitoring data relating real-world
human exposures with concentrations that perturb toxicity pathways to identify
biologically-relevant exposures; and (3) information on host susceptibility and
background exposures to interpret and extrapolate (i.e., translate) in vitro test results for
risk assessment. While the importance of exposure information for design and
interpretation of toxicity testing under the NRC vision was clearly identified, it was
beyond the charge of the committee to address the required science and resources to meet
this need. As a result, current discrepancies in the scientific foundation for hazard and
exposure characterization are rapidly increasing.

Others, however, have recognized that just as interpretation of toxicogenomic hazard data
requires anchoring to apical endpoints for contextual relevance, understanding relevant
perturbations leading to these toxicogenomic endpoints requires anchoring stressors to
real-world human exposure (biologically-relevant exposure metrics). New approaches
for toxicity testing and risk assessment require systems-based consideration of
interactions between exposure and effect as well as the science to predict exposures down
to the molecular level (Edwards and Preston, 2008; Cohen Hubal et al., 2008; Sheldon
and Cohen Hubal, 2009). Wild (2005) has proposed the need for a "step change" in
exposure assessment and has articulated a vision for exposure measurement
commensurate with that of the NRC vision for toxicity testing. Wild has called for an
"exposome," or measurement of the life-course of environmental exposures to provide
the evidence base for public health decisions to address environmental health. Wild and
others (e.g., Weis et al., 2005) discuss the potential of emerging technologies to provide
this new generation of exposure information. Finally, Lange et al. (2007) have discussed
the need to integrate heterogeneous ontologies into interdisciplinary knowledge systems
to unify scientific fields and harness the full potential of exposure and health outcome


-------
data. Lange et al. illustrate a framework and call for development of a knowledge system
to seamlessly compute relationships across the source-to-outcome continuum.

In this paper, the case is presented for a transformation in exposure science
commensurate with the transformation in toxicology presented by the NRC. A new
generation of tools to rapidly characterize biologically-relevant exposures and link to
environmentally-relevant hazard is required to employ toxicity data for holistic risk
assessment and to inform public health decisions. Research initiatives required to
develop this exposure science include: (1) application of systems biology network
modeling to identify exposure metrics and models for characterizing key stressors at
biologically-motivated resolution; (2) development and application of advanced
technologies to measure key exposure metrics (e.g., biomarkers to measure internal
exposure, sensors to measure personal exposure); and (3) development of an
exposure-hazard knowledge system to facilitate risk assessment. The imperative for this
transformation and an outline of the required initiatives follow.

THE CASE FOR EXPOSURE

Exposure characterization is the risk analysis step in which human interaction with an
environmental agent of concern is evaluated. Exposure is defined as the contact between
an agent and a target (WHO, 2004). Although the primary application of this definition
for risk assessment has been to the individual or human population as a target of exposure
and a chemical as an agent of exposure, the target of exposure can be an organ, tissue or
cell, and the agent of exposure can be a biological, physical, or psychosocial stressor or
the byproduct of a given exposure agent (Figure 1). Exposure assessment is defined as
evaluation of exposure of a system, organism, or (sub)population to an agent (and its
derivatives). The process may include estimating the magnitude, frequency, and duration
of an exposure, along with characteristics of the exposed individual or population (WHO,
2004). In limited cases, exposure can be measured directly, but more often, due to current
scientific limitations, exposure must be estimated (Cohen Hubal et al., 2008; Paustenbach,
2000).

Low-level and prevalent environmental exposures may contribute substantially to the
burden of common complex disease (Hemminki et al., 2006; Gibson, 2008).
Understanding the relationships between environmental exposures and health outcomes
requires integration of a wide range of factors, extrinsic (e.g., environmental), intrinsic
(e.g., genotypic), and mechanistic (e.g., toxicological), to support health studies and
characterize risk. Assessing complex human-health risks associated with exposures to
chemicals requires that hazard, susceptibility, and exposure are all reliably characterized.

Characterization of susceptibility is rapidly advancing through application of microarray
technology for genotyping and investment in large genome-wide association (GWA)
studies (McCarthy et al., 2008). Epidemiologists are now facing the challenges associated
with interpreting this massive amount of genomic variation data for understanding
etiology of complex environmental disease. Calls for tools to characterize and unravel

-------
interacting genetic and environmental factors have begun (Collins, 2006; Manolio and
Collins, 2007). Similarly, the NRC has presented a vision for advancing characterization
of hazard through application of high-throughput screening (HTS) methods and systems
biology approaches to elucidate toxicity pathways. As a result, toxicologists are facing
the challenges of translating the high-content toxicity data that are now being generated
(Dix et al., 2007) to inform risk assessment. The high-priority need for research to
interpret these hazard data in the context of real-world exposure has been identified by
risk assessors. At the same time, however, characterization of exposure remains
primitive by comparison (e.g., scenario-based assessment of exposures of sentinel
products in non-standardized scenarios versus measurement of real-time personal
exposure) and resources to improve the scientific basis of exposure assessment are
limited or nonexistent. This imbalance in efforts to improve the measurement of hazard and
exposure hinders advancement in risk assessment.

Just as Wild (2005) questions whether or not fundamental knowledge about genetics will
improve understanding of disease etiology at the population level, we should question
whether or not fundamental knowledge of toxicity pathways will improve understanding
of real-world human-health risk. Accurate assessment of chemical exposures remains an
outstanding and largely unmet challenge in toxicology and risk assessment. One side of
the hazard-exposure equation continues to be refined while the other remains subject to
crude characterization based largely on indirect estimates and default assumptions. Due
to the complex nature of the human system, predictions of potential health risks
associated with chemical exposures will be limited by the least resolved or least
understood component of the system. By focusing resources exclusively on improving
hazard characterization we compromise the ability to fully realize benefits of the NRC
vision. Just as a new generation of scientific tools is being applied to rapidly assess toxic
response resulting from chemical exposures, there is a critical need to develop methods
for characterizing environmental exposures at biologically-relevant resolution to translate
HTS toxicity results for human health risk assessment.

REQUIRED INITIATIVES

What does the real world look like and how can we capture a picture (i.e., model) of the
real world that will facilitate risk assessment and allow us to make important
environmental health decisions? Understanding relationships among multiple
environmental factors and complex disease, as well as characterizing environmental
health risk factors, requires collection and analysis of a wide range of data. Information
on the characteristics of multiple stressors (chemical, physical, biological and
psychosocial), the characteristics of the human receptor (genetics, health status, life stage,
behaviors, social factors, etc.) at multiple levels of organization (individual, community,
population), and the temporal and spatial patterns of exposures and outcomes must be
considered. Strategic research is required to identify key determinants of exposure to
capture the essence of this multifactorial reality. What are the critical elements of
exposure in a given context? What are the key metrics for characterizing these exposure
elements in that context? What is the required resolution for measuring key metrics and

-------
modeling exposures so that these are relevant for developing and interpreting hazard
information to assess health risks? Finally, can new scientific understanding and tools in
biological, computational, and information sciences be leveraged to develop rapid,
inexpensive approaches for characterizing biologically-relevant exposure?

Under the NAS vision, tools developed by the pharmaceutical industry are being applied
to transform toxicity testing. Similarly, exposure scientists must leverage advanced
measurement and computational tools from disparate but related fields to transform
exposure science. New technologies must be applied to move from our current crude
indirect estimates of exposure to the biologically-based metrics required to interpret
emerging toxicity data and advance human-health risk assessment. In addition, the
complexity of the multifactorial systems under study and the resulting multidimensional
data produced using emerging technologies require application of environmental
informatics capabilities and advanced computational tools to model and link exposures to
health outcomes. A combination of discovery and engineering (mechanistic)-based
modeling approaches for hypothesis development and testing is required. Statistical
data-mining and machine-learning approaches are required to extract information from
extant data on critical exposure determinants, link exposure information with toxicity
data, and identify limitations and gaps in exposure data. Engineering or mechanistic
approaches are required to model the human-environment system and to test our
understanding of this system.

Systems biology: exposure at all levels of biological organization

In the NRC vision, the authors propose systems-biology evaluation of signaling networks
to characterize perturbations of toxicity pathways and as the basis of a new
toxicity-testing paradigm. Environmental stressors (i.e., exposure) leading to perturbations of
toxicity pathways are simplified and treated as unidirectional and one dimensional.
Fortunately, systems theory also provides the required conceptual framework for linking
exposure science and toxicology to study, characterize, and make predictions about the
complex interactions between humans and environmental chemicals and associated
feedback across levels of biological organization (Figure 1). A systems-biology approach
for holistic study of environmental disease and risk assessment considers coupled
networks that span multiple levels of biological organization. These networks describe
the overall connectivity of the system. Mechanistic understanding is derived by
characterization of these networks and impacts of perturbations due to behavioral and
environmental influences. Edwards and Preston (2008) present the conceptual basis for
extending network analysis to inform risk assessment. Networks at different levels of the
system can be used to merge molecular-level changes with measured events at the
individual or population level. Molecular networks are developed based on data from
'omic measurements. Key event networks, where each node ideally represents a toxicity
pathway, are abstracted from the molecular network based on biological interpretation
and targeted experimentation (both in vitro and in vivo). Adverse outcomes are driven by
the impact of an individual's genetics, epigenetics and exposure profile. Connectivity at
the population level is driven by common genetics, lifestyle, and environment. An

-------
example of the type of approach described by Edwards and Preston has been partially
demonstrated for an ecological model of endocrine disruption (Ankley et al., 2009).

Gohlke et al. (2009) present an example of how this approach can be applied using
gene-centered databases to develop linked networks to explore interplay between genetic and
environmental factors for metabolic syndrome and neuropsychiatric disorders. The
analysis presented by Gohlke and coauthors highlights significant gaps in exposure
information required to extend this approach to assess and mitigate human health risks.
The Comparative Toxicogenomics Database (CTD; Davis et al., 2008) used to compile
environmental factor-gene/protein relationships does indeed provide an important model
for how exposure information can and should be made accessible to facilitate
investigation of gene-environment-disease relationships. However, because the CTD is
limited to curated information on direct chemical-gene interactions and direct
gene-disease associations, chemical-disease relationships must be inferred. Here again,
real-world exposure information is required to translate molecular insights to assess risks to
individuals and populations.
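
The kind of inference described above, linking chemicals to diseases through shared
genes, can be illustrated with a minimal sketch. The records and the function name
infer_chemical_disease below are hypothetical placeholders chosen for illustration; they
are not CTD content and do not represent the CTD's own inference procedure.

```python
from collections import defaultdict

# Hypothetical, illustrative records; these are NOT actual CTD entries.
CHEMICAL_GENE = [
    ("chemical_A", "GENE1"),
    ("chemical_A", "GENE2"),
    ("chemical_B", "GENE2"),
]
GENE_DISEASE = [
    ("GENE1", "disease_X"),
    ("GENE2", "disease_X"),
    ("GENE2", "disease_Y"),
]


def infer_chemical_disease(chemical_gene, gene_disease):
    """Map each (chemical, disease) pair to the set of genes linking them."""
    # Index the curated gene-disease associations by gene.
    diseases_by_gene = defaultdict(set)
    for gene, disease in gene_disease:
        diseases_by_gene[gene].add(disease)

    # A chemical-disease link is inferred whenever a gene that the chemical
    # interacts with is also associated with the disease.
    inferred = defaultdict(set)
    for chemical, gene in chemical_gene:
        for disease in diseases_by_gene.get(gene, ()):
            inferred[(chemical, disease)].add(gene)
    return dict(inferred)


if __name__ == "__main__":
    links = infer_chemical_disease(CHEMICAL_GENE, GENE_DISEASE)
    for (chemical, disease), genes in sorted(links.items()):
        print(chemical, disease, sorted(genes))
```

Links inferred in this way carry no information about whether, or at what level, people
actually encounter the chemical, which is the gap in exposure information noted above.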

Just as key cellular processes may be associated with multiple complex outcomes, it is
likely that exposures to multiple xenobiotic compounds may elicit perturbation of the
same key toxicity pathways. Understanding the critical determinants of multifactorial
perturbations and feedback in the human-environment system is required to interpret
toxicity data for risk assessment. A systems approach for assessing risk provides a
holistic view of interactions between a chemical stressor and biological entity from the
molecular level through to the level of the organism and/or population (Figure 1). Such a
holistic systems approach demands exposure metrics and models to characterize key
stressors at a level of resolution commensurate with that of the response or effects
(biologically-motivated resolution).

Biologically-relevant exposure metrics

The major challenge to realizing the full potential of the NRC vision is the limited
availability of efficient and affordable methods for measuring biologically-relevant
exposures. Biologically-relevant exposure metrics are those that can be directly
associated with key events in a disease process and with an individual's exposure profile.
Based on this need to characterize biologically-relevant environmental exposures, Wild
(2005) has proposed investment in development of exposure biomarkers to improve our
ability to understand and mitigate environmental impacts on human health. In fact, we
are faced with the same critical need for advanced exposure science if we are to realize
the NRC vision. Though Wild's vision for an "exposome" has been articulated in the
context of characterizing the environmental contribution to etiology of common complex
disease, the basic principles behind this call are germane for human health risk
assessment. As early as 1999, Groopman and Kensler highlighted the challenges
associated with developing biomarkers and interpreting biomarker data to sort out the
interactions of multiple chemicals, multiple exposures and relation of these to health
outcomes. With appropriate investment, a new generation of technologies may provide
the tools to address some of these challenges. Limited examples follow, but

-------
opportunities to adapt a wide range of sensors and biomarkers to measure chemical
stressors and/or derivatives of these at all levels of biological organization should be
considered.

Currently, as advocated by the NAS vision, investment in 'omic technologies is focused
on understanding and characterizing toxicity or hazard. Yet, these tools may also yield a
new generation of exposure metrics. In a second report (NRC, 2007a), the NAS has
called for further development of toxicogenomic technologies to increase capabilities in
exposure assessment. There are early indications that with appropriate investment, this
area of research could provide important approaches for assessing real-world exposures.
Wild (2009) considers application of transcriptomics for development of exposure
biomarkers to improve exposure assessment in epidemiology. Fry et al. (2007) present
an exciting example of the potential to link environmental exposure and altered gene
expression. In a study conducted in Thailand, Fry and coworkers identified expression
signatures from babies born to arsenic-unexposed and -exposed mothers that were highly
predictive of prenatal arsenic exposure in a subsequent test population. Resulting
signatures, based on a very small number of genes, show promise as biomarkers of
arsenic exposure. Other studies have investigated altered global gene expression
associated with exposure to cigarette smoke, benzene, metal fumes and air pollution.
While the limited research conducted to date suggests that environmental exposures elicit
changes in gene expression specific to the type of exposure, significant scientific hurdles
remain. However, careful targeted investment should be directed to develop 'omic tools
to link real-world exposure and in vitro hazard information.

Biomarkers alone will not provide the full range of information required to characterize
exposure and make critical links to hazard for risk assessment. Direct, noninvasive, and
sensitive detection of chemical stressors in relevant environmental and biological media
could prove to be the most effective means of assessing exposure. Recently, Schwartz
and Collins (2007) identified the need for better environmental biosensors to study
gene-environment interactions associated with complex disease. Research in this area is also
required to apply toxicity testing results to assess human health risk. Advances in
nanotechnology and related development of small-scale sensors promise to facilitate
comprehensive monitoring of exposure, dose, and associated indicators of early effect.
Nanotechnologies offer the potential to improve exposure and risk assessment by
facilitating collection of large numbers of measurements on very small numbers of
molecules at a low cost. It is currently possible to develop micro- and nano-scale sensor
arrays that can detect specific sets of harmful agents in the environment (Andreescu et al.,
2009). Provided adequate informatics support, these sensors can be used to monitor
multiple agents in real time and the resulting data can be accessed remotely. The
potential also exists to extend these small-scale monitoring systems to the individual level
to detect personal and in vivo distributions of toxicants (Barry et al., 2009; Weis et al.,
2005).

Together, application and development of exposure assessment tools such as advanced
molecular indicators of exposure (Sen et al., 2007; Wild, 2009) and nanotechnology-based
sensors (Barry et al., 2009; Andreescu et al., 2009) present the opportunity for simultaneous


-------
measurement of biologically-relevant exposures to multiple real-world stressors as well
as the potential to mechanistically link traditional exposure metrics and endpoints
measured in HTP in vitro assays.

Exposure-hazard knowledge system

The NRC vision emphasized the importance of extrapolation modeling to: (1) provide
quantitative, mechanistic understanding of dose-response relationship for perturbations
by environmental agents; (2) predict human exposures leading to tissue concentrations
comparable to concentrations causing perturbations in vitro; and (3) provide a basis for
addressing background chemical exposures, background disease processes, and host
susceptibility (NRC, 2007). Development and use of models that can efficiently address
these critical components for translation and risk assessment requires capabilities to
collect, organize, retrieve, and link large amounts of disparate, multidimensional
exposure and hazard information.

Significant energy and resources have been committed to collate and improve access to
genomic, toxicology, and health data (Richard et al., 2008; Judson et al., 2008; Davis et al.,
2008). Lacking from these information resources is the real-world exposure data
required to translate molecular insights for assessing risks at the individual and
population level. Knowledge-discovery based tools are new to the exposure science
community. Yet, these tools are absolutely critical because they provide the opportunity to
efficiently leverage exposure information for extrapolation modeling and translation of in
vitro HTS toxicity data for risk assessment.

Translation of the hazard information developed under the NRC vision will require a
holistic risk assessment knowledge system that includes ontologies and databases to
facilitate computerized collection, organization, and retrieval of exposure, hazard, and
susceptibility information. In addition, this system must be compatible and linkable to
the larger environmental health universe of information to facilitate risk assessment for
improved public health decisions (Richard 2006, 2008; Judson et al., 2008; Davis et al., 2008).
An exposure ontology consistent with those being used in toxicology and other health
sciences is required to formally represent exposure concepts, the relationships between
these concepts and, most important, the relationships between exposure, susceptibility, and
toxicology domains.
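
As a rough illustration of what formally representing exposure concepts and their
relationships could look like, the sketch below stores an ontology fragment as
subject-relation-object triples and performs a simple form of automated reasoning over
"is_a" links. Every concept name, relation name, and function here is a hypothetical
placeholder; none is drawn from an existing exposure ontology.

```python
# Hypothetical concepts and relations, for illustration only.
TRIPLES = {
    ("dermal_contact", "is_a", "exposure_route"),
    ("exposure_route", "is_a", "exposure_concept"),
    ("biomarker_of_exposure", "is_a", "exposure_metric"),
    ("exposure_metric", "is_a", "exposure_concept"),
    # Cross-domain links between exposure, toxicology, and susceptibility.
    ("biomarker_of_exposure", "informs", "toxicity_pathway_perturbation"),
    ("life_stage", "modifies", "biomarker_of_exposure"),
}


def ancestors(concept, triples):
    """Return every concept reachable from `concept` by following is_a links."""
    found = set()
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for subject, relation, obj in triples:
            if subject == current and relation == "is_a" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def related(concept, relation, triples):
    """Return the objects linked to `concept` by the given relation."""
    return {o for s, r, o in triples if s == concept and r == relation}


if __name__ == "__main__":
    print(sorted(ancestors("dermal_contact", TRIPLES)))
    print(sorted(related("biomarker_of_exposure", "informs", TRIPLES)))
```

Standardized relation names are what make such queries composable across exposure,
susceptibility, and toxicology sources; with ad hoc labels the traversal above would
have nothing consistent to follow.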

Lange et al. (2007) illustrate a framework for an interdisciplinary knowledge system to
link agriculture, food science, nutrition, and health. Although this vision is presented in a
slightly different context, the concept as outlined is directly relevant for 21st century
human health risk assessment. Just as we are experiencing a transformation in
toxicology, the food science community has seen a shift from discovery of essential
nutrients for human survival to characterization of complex, multifactorial interactions
among food, diet, and health. The authors argue for standardized ontologies to define
relationships, allow for automated reasoning, and facilitate meta-analyses. This same
capability is clearly required to develop biologically-relevant exposure metrics, design in
vitro toxicity tests to measure environmentally-relevant hazard, and to incorporate

-------
                                             Toxicological Sciences                                  Page 10 of 14


1
2
3
               information on susceptibility and background exposures for interpretation of these data to
5              assess real-world risks to individuals and populations.
6
7
EXPOSURE SCIENCE FOR THE 21ST CENTURY

The need for radical improvement in exposure science is not academic. Characterization
of biologically-relevant exposure is required to translate advances and findings in
computational toxicology to information that can be directly used to support risk
assessment for decision making and improved public health. New technologies must be
applied to both toxicology and exposure science if the ultimate goal of evaluating risk to
humans is to be achieved. Just as the authors of the NRC report (2007) recognize the need
for broad-based support to achieve their vision for toxicity testing, realization of
objectives for 21st century risk assessment will require significant investment in exposure
science and development of capacity across both the public and private sectors.
Ultimately, this additional investment will maximize contributions of emerging toxicity
testing approaches toward improved understanding of relationships between
environmental factors and human health outcomes.

Recognition that improvement in exposure science is required to characterize and manage
risks associated with environmental stressors is broad based. A National Academies
workshop conducted in June 2009, Exposure Science in the 21st Century, focused on the
role of exposure science in health studies, risk assessment, and risk prevention (NRC,
2009). At this workshop, the director of US EPA's National Exposure Research
Laboratory announced plans for a new National Academies committee on exposure
science in the 21st century (Reiter, 2009). Formation of this committee would be a
critical step toward building the scientific basis for exposure characterization to protect
environmental and public health. At the same time, the risk assessment community
cannot wait to initiate important research in exposure science to meet rapid advances in
toxicity testing and critical needs for translating emerging HTP hazard data if the NRC
vision (2007) is to be realized.


Acknowledgements
I would like to thank Drs. Tina Bahadori and Steve Edwards for helpful discussions. I
would also like to thank Dr. Christopher Wild for so vividly highlighting the critical need
for reliable exposure assessment tools.

Disclaimer
This manuscript has been subjected to Agency review through the Office of Research
and Development and has been cleared for publication by the US Environmental Protection
Agency.

Figure Captions

Figure 1. Cascade of exposure-response processes for integrating exposure science and
toxicogenomic mode-of-action information. (Cohen Hubal et al., 2008, JESEE)

-------
REFERENCES

Andreescu, S, J Njagi, C Ispas, MT Ravalli. (2009) JEM Spotlight: Applications of
advanced nanomaterials for environmental monitoring. J Environ Monit. 11(1):27-40.

Ankley GT, Bencic DC, Breen MS, Collette TW, Conolly RB, Denslow ND, Edwards
SW, Ekman DR, Garcia-Reyero N, Jensen KM, Lazorchak JM, Martinovic D, Miller DH,
Perkins EJ, Orlando EF, Villeneuve DL, Wang RL, Watanabe KH. (2009) Endocrine
disrupting chemicals in fish: Developing exposure indicators and predictive models of
effects based on mechanism of action. Aquatic Toxicol 92(3):168-178.

Barry RC, Y Lin, J Wang, G Liu, CA Timchalk. (2009) Nanotechnology-based
electrochemical sensors for biomonitoring chemical exposures. J Expo Sci Environ
Epidemiol. 19(1):1-18.

Cohen Hubal, EA, J Moya, SG Selevan. (2008) A lifestage approach to assessing
children's exposure. Birth Defects Res (Part B) 83:522-529.

Cohen Hubal EA, Richard AM, Imran S, Gallagher J, Kavlock R, Blancato J, Edwards S.
(2008) Exposure science and the US EPA National Center for Computational
Toxicology. J Expo Sci Environ Epidemiol. doi: 10.1038/jes.2008.70 [Online: Nov 5
2008]

Collins FS, Gray GM, Bucher JR. (2008) Transforming environmental health protection.
Science 319(5865):906-7.

Collins FS. (2006) The case for a US prospective cohort study of genes and
environment. Nature 429:475-77.

Davis, AP, CG Murphy, MC Rosenstein, TC Wiegers, CJ Mattingly. (2008) The
Comparative Toxicogenomics Database facilitates identification and understanding of
chemical-gene-disease associations: arsenic as a case study. BMC Medical Genomics
1:48.

Dix DJ, Houck KA, Martin MT, Richard AM, Setzer RW, Kavlock RJ. (2007) The
ToxCast program for prioritizing toxicity testing of environmental chemicals. Toxicol Sci.
95(1):5-12.

Edwards, SW, Preston, RJ. (2008) Systems biology and mode of action based risk
assessment. Toxicol Sci. 106(2):312-318.

Fry RC, Navasumrit P, Valiathan C, Svensson JP, Hogan BJ, Luo M, Bhattacharya S,
Kandjanapa K, Soontararuks S, Nookabkaew S, Mahidol C, Ruchirawat M, Samson LD.
(2007) Activation of inflammation/NF-kappaB signaling in infants born to arsenic-
exposed mothers. PLoS Genet. 3(11):e207.

-------
Gibson, G. (2008) The environmental contribution to gene expression profiles. Nat Rev
Genet 9:575-81.

Gohlke, JM, R Thomas, Y Zhang, MC Rosenstein, AP Davis, C Murphy, KG Becker, CJ
Mattingly and CJ Portier. (2009) Genetic and environmental pathways to complex
diseases. BMC Systems Biology 3:46.

Groopman, JD, and TW Kensler (1999) The light at the end of the tunnel for chemical-
specific biomarkers: daylight or headlight? Carcinogenesis 20(1):1-11.

Hemminki, K, L Bermejo, JL Bermejo, A Forsti. (2006) The balance between heritable
and environmental aetiology of human disease. Nat Rev Genet 7(12):958-65.

Judson R, A Richard, D Dix, K Houck, F Elloumi, M Martin, T Cathey, TR Transue, R
Spencer, M Wolf. (2008) ACToR - Aggregated Computational Toxicology
Resource. Toxicol Applied Pharmacol. 233(1):7-13.

Lange, MC, DG Lemay, JB German. (2007) A multi-ontology framework to guide
agriculture and food towards diet and health. J Sci Food Agric 87:1427-1434.

McCarthy MI, GR Abecasis, LR Cardon, DB Goldstein, J Little, JPA Ioannidis, JN
Hirschhorn. (2008) Genome-wide association studies for complex traits: consensus,
uncertainty and challenges. Nat Rev Genet 9:356-59.

Manolio, TA, FS Collins. (2007) Genes, environment, health, and disease: facing up to
complexity. Hum Hered 63(2):63-66.

National Research Council of the National Academies (NRC). (2007) Toxicity Testing in
the 21st Century: A Vision and a Strategy. The National Academies Press. Washington,
DC.

National Research Council of the National Academies (NRC). (2007a) Applications of
Toxicogenomic Technologies to Predictive Toxicology and Risk Assessment. The
National Academies Press. Washington, DC.

National Research Council of the National Academies (NRC). (2009) Eighth Workshop
of the Standing Committee on Risk Analysis Issues and Reviews. Exposure Science In
The 21st Century. June 18-19, 2009. National Academy of Sciences, Washington, DC.
http://dels.nas.edu/best/risk_analysis/ExposureBackground.shtml (accessed June 29,
2009).

National Research Council of the National Academies (NRC). (2009b) Standing
Committee on Use of Emerging Science for Environmental Health.
http://dels.nas.edu/envirohealth/ (accessed July 1, 2009).

-------
Paustenbach DJ. 2000. The practice of exposure assessment: A state-of-the-art review
(Reprinted from Principles and Methods of Toxicology, 4th edition, 2001). J Toxicol
Environ Health B Crit Rev 3(3):179-291.

Richard AM, Gold LS, and Nicklaus MC (2006) Chemical structure indexing of toxicity
data on the Internet: Moving toward a flat world. Current Opinion in Drug Discovery &
Development (CODDD), 9(3):314-325.

Richard, A, C Yang, R Judson (2008) Toxicity data informatics: Supporting a new
paradigm for toxicity prediction. Tox. Mech. Meth. 18:103-118.

Reiter, L. (2009) EPA's Perspective on Exposure Science and Goals for the Workshop.
National Research Council, Eighth Workshop of the Standing Committee on Risk
Analysis Issues and Reviews. Exposure Science In The 21st Century. June 18-19, 2009.
National Academy of Sciences, Washington, DC.

Schwartz D and Collins FS. (2007) Environmental biology and human disease. Science
316(5825):695-696.

Sen, B, Mahadevan B and DeMarini DM (2007) Transcriptional responses to complex
mixtures - A review. Mutation Research 636 (2007) 144-177.

Sheldon, LS, and EA Cohen Hubal. (2009) Exposure as part of a systems approach for
assessing risk. Environ Health Perspect doi:10.1289/ehp.0800407 [Online 8 April 2009]

Weis, BK, D Balshaw, et al. (2005). Personalized exposure assessment: Promising
approaches for human environmental health research. Environ Health Perspect 113(7):
840-848.

WHO. (2004). IPCS Risk Assessment Terminology. Harmonization Project Document
No. 1, World Health Organization, Geneva. Accessed online on 06/29/09 at
http://www.who.int/ipcs/methods/harmonization/areas/terminology/en/index.html

Wild, CP. (2005) Complementing the Genome with an "Exposome": The Outstanding
challenge of environmental exposure measurement in molecular epidemiology. Cancer
Epidemiol Biomarkers Prev 14(8):1847-1850.

Wild, CP. (2009) Environmental exposure measurement in cancer epidemiology.
Mutagenesis. 24(2):117-125.

-------
[Figure 1. Cascade of exposure-response processes. Column labels: Stressor, Biological
Receptor, Perturbation, Outcome. Stressor levels: Environmental Source, Ambient Exposure,
Personal Exposure, Internal Exposure (Tissue Dose), Dose to Cell, Dose of Stressor
Molecules. Biological receptor levels: Population, Individual, Tissue, Cell, Biological
Molecules. Outcome levels: Disease Incidence/Prevalence; Disease State (Changes to Health
Status); Dynamic Tissue Changes (Tissue Injury); Dynamic Cell Changes (Alteration in Cell
Division, Cell Death); Dynamic Changes in Intracellular Processes.]

-------
TOXICOLOGICAL SCIENCES 95(1), 5-12 (2007)
doi: 10.1093/toxsci/kfl103
Advance Access publication September 8, 2006
                                                    FORUM

             The ToxCast Program for Prioritizing Toxicity Testing  of
                                      Environmental Chemicals

      David J. Dix,1 Keith A. Houck, Matthew T. Martin, Ann M. Richard, R. Woodrow Setzer, and Robert J. Kavlock
        National Center for Computational Toxicology (D343-03), Office of Research and Development, U.S. Environmental Protection Agency,
                                         Research Triangle Park, North Carolina 27711

                                        Received May 24, 2006; accepted August 30, 2006
  The U.S. Environmental Protection Agency (EPA) is developing methods for utilizing
computational chemistry, high-throughput screening (HTS), and various toxicogenomic
technologies to predict potential for toxicity and prioritize limited testing resources toward
chemicals that likely represent the greatest hazard to human health and the environment.
This chemical prioritization research program, entitled "ToxCast," is being initiated with
the purpose of developing the ability to forecast toxicity based on bioactivity profiling. The
proof-of-concept phase of ToxCast will focus upon chemicals with an existing, rich
toxicological database in order to provide an interpretive context for the ToxCast data. This
set of several hundred reference chemicals will represent numerous structural classes and
phenotypic outcomes, including tumorigens, developmental and reproductive toxicants,
neurotoxicants, and immunotoxicants. The ToxCast program will evaluate chemical
properties and bioactivity profiles across a broad spectrum of data domains:
physical-chemical, predicted biological activities based on existing structure-activity
models, biochemical properties based on HTS assays, cell-based phenotypic assays, and
genomic and metabolomic analyses of cells. These data will be generated through a series of
external contracts, along with collaborations across EPA, with the National Toxicology
Program, and with the National Institutes of Health Chemical Genomics Center. The
resulting multidimensional data set provides an informatics challenge requiring appropriate
computational methods for integrating various chemical, biological, and toxicological data
into profiles and models predicting toxicity.
  Key Words: high-throughput screening; toxicogenomics; chemoinformatics; bioinformatics.
  Disclaimer: This work was reviewed by EPA and approved for publication but does not
necessarily reflect official Agency policy.
  1 To whom correspondence should be addressed. Fax: (919) 541-1194.
E-mail: dix.david@epa.gov.

Published by Oxford University Press 2006.

  Across several U.S. Environmental Protection Agency (EPA) programs, there is a clear
need to develop methods for evaluating large numbers of environmental chemicals for
potential toxicity and to use the resulting information to prioritize the use of testing
resources toward those chemicals and endpoints that present the greatest likelihood of risk
to human health and the environment. This need can be addressed through the experience of
the pharmaceutical industry in the use of state-of-the-art high-throughput screening (HTS),
toxicogenomics, and computational chemistry tools for the discovery of new drugs (Table 1),
with appropriate adjustments to the needs of environmental toxicology. Thus, a research
program entitled "ToxCast" has been initiated within EPA to develop an ability to forecast
toxicity based on bioactivity profiling. Ultimately, ToxCast's purpose is to develop methods
of prioritizing chemicals for further screening and testing to assist EPA programs in the
management and regulation of environmental contaminants.
  Over the past decade, HTS has developed into a primary tool for drug discovery based
upon bioactivity screening of the druggable proteome (Fliri et al., 2005b; Janzen and Hodge,
2006). On a more limited scale, HTS has also been adapted to agrochemical discovery for the
analysis of target species and model organisms (Smith et al., 2005; Tietjen et al., 2005).
Recently, HTS applications to toxicology have been expanding as a useful complement to
traditional toxicology (Bhogal et al., 2005; Fliri et al., 2005a; Kikkawa et al., 2006). In
the federal sector, the National Institutes of Health Chemical Genomics Center (NCGC) has
been established (http://www.ncgc.nih.gov/). The NCGC is using industrial-scale HTS
technologies to collect data that are useful for developing small-molecule chemical probes
for basic biological research (Austin et al., 2004).
  Traditional toxicology testing involves screening compounds through in vivo and in vitro
tests focused on defined endpoints (e.g., neurotoxicity, developmental toxicity) or
mechanisms of action (e.g., mutagenicity, cytotoxicity, regenerative hyperplasia). However,
EPA is confronted with a large number of compounds to evaluate and faced with the difficulty
of prioritizing scarce resources. Thus, environmental toxicology is challenged by (1) too
many compounds to evaluate through endpoint-based in vivo testing and (2) inadequate models
or knowledge of mechanism for many types of toxicity to design suitable in vitro testing.
These challenges are also faced by other organizations including the U.S. Food and Drug
Administration, European Union member countries, the Organization for Economic Cooperation
and Development, and the regulated community (i.e., the pharmaceutical, agrochemical, and
consumer products industries). There is an important need to distinguish compounds that
present little or no concern from those with the greatest likelihood of causing an adverse
effect in the target species. High-throughput, high-content, and toxicogenomic screening
methods applied to predictive toxicology provide opportunities for addressing these
challenges.

                                                          TABLE 1
            Selected Examples of Biological Screening and Chemoinformatics Relevant to the Development of ToxCast

Approach                    Citation                      Description
HTS                         Austin et al. (2004)          NIH Molecular Libraries Initiative expanding use of small-molecule chemical probes for biological research
                            Berg et al. (2006)            BioMAP profiling based on activity of ~100 drugs in cell-based assays designed to incorporate biological complexity
                            Fliri et al. (2005b)          Biological activity spectra for 1567 compounds (primarily drugs) based on interactions with 92 ligand-binding assays
                            Fliri et al. (2005a)          Utility of biological activity spectra for predicting drug-induced adverse effects
                            Melnick et al. (2006)         Effects of 1400 kinase inhibitors on panel of 35 tyrosine kinase-dependent cellular assays in dose-response format
                            O'Brien et al. (2006)         High-content screening of > 600 compounds in HepG2 cells demonstrated human toxicity potential with 80% sensitivity and 90% specificity
                            Scherf et al. (2000)          Correlated gene expression changes with drug activity patterns in 60 human cell lines; one of first to integrate large amounts of genomic and pharmacology data
                            Smith et al. (2005)           Application of HTS to agrochemical discovery
                            Tietjen et al. (2005)         HTS assays for development of new herbicides, insecticides, and fungicides included use of technologies capable of evaluating > 200,000 chemicals per year
                            Walum et al. (2005)           HTS in combination with basic biokinetic information to improve identification of toxic compounds
Chemoinformatic surveys     Richard et al. (2006)         Public initiatives accelerating integration of diverse biological information with standardized chemical structure annotation
                            Yang et al. (2006b)           Strategy for mining structure-integrated toxicity databases to link chemical structure to biological endpoint
Structure-activity studies  Ekins et al. (2003)           Public data on 1750 molecules to train computer models that predict inhibition of CYP2D6 and CYP3A4
                            O'Brien and DeGroot (2005)    Data for 58,963 compounds on human ether-a-go-go related gene channel and 2410 compounds on inhibition of CYP2D6 combined to create predictive model of toxicity
                            Poroikov et al. (2003)        Prediction of activity spectra for substances for total of 565 different outcomes, resulted in 64 million predictions on ~25,000 chemicals in National Cancer Institute Open Database, used to select compounds for further testing

  The underlying hypothesis for ToxCast is that toxicological response is driven by
interactions between chemicals and biomolecular targets. In most cases, these targets are
part of the cellular proteome (e.g., receptors, ion channels, kinases). However, for most
environmental chemicals the protein targets and biological effects underlying potential
adverse effects have yet to be defined or characterized. Because suitable assays to query
these have remained elusive, a more global approach of bioactivity profiling is a critical
goal in environmental toxicology. This goal is embodied in the ToxCast program, which will
focus on a multiple target matrix approach rather than a single target, directed vector
approach. The matrix contains an expanded number of potential targets whose chemical
interactions may be characterized by in silico models, biochemical assays, cell-based in
vitro assays, and nonmammalian animal models.
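
The multiple target matrix idea can be pictured as a table of chemicals against targets.
The Python fragment below is a minimal sketch under stated assumptions: the chemical names,
target names, and hit calls are all hypothetical, and ranking by a simple count of active
targets is only a stand-in for whatever prioritization model the program eventually
adopts.

```python
# Hypothetical targets and hit calls (1 = active, 0 = inactive), for illustration only.
TARGETS = ["nuclear_receptor", "ion_channel", "kinase", "cytotoxicity"]
PROFILES = {
    "chemical_A": [1, 0, 1, 1],
    "chemical_B": [0, 0, 0, 0],
    "chemical_C": [0, 1, 0, 0],
}


def rank_by_hits(profiles):
    """Order chemicals by number of active targets, most active first.

    Ties are broken alphabetically so the ordering is deterministic.
    """
    return sorted(profiles, key=lambda name: (-sum(profiles[name]), name))


def active_targets(profiles, targets, name):
    """List the targets on which the named chemical was called active."""
    return [target for target, hit in zip(targets, profiles[name]) if hit]


RANKED = rank_by_hits(PROFILES)
HITS_A = active_targets(PROFILES, TARGETS, "chemical_A")
```

In this toy matrix, chemical_A ranks first because it is active against three of the four
targets, while chemical_B, with no hits, ranks last.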
                         ENABLING HTS AND TOXICOGENOMICS TECHNOLOGIES

  Modern computational chemistry and molecular and cellular biology tools allow researchers
to characterize a broad spectrum of physical and biological properties for large numbers of
chemicals (Bredel and Jacoby, 2004; Table 1). Genomics, transcriptomics, proteomics, and
metabolomics technologies are components of this modern molecular biology toolkit. However,
though omics technologies produce large amounts of data per sample, they are not truly high
throughput, and the per chemical cost can be significant. Thus, the primary driver
transforming drug discovery has been HTS technologies (Macarron, 2006). HTS is comprised of
assays in miniaturized format that can be either target or phenotype based. Target-based
assays usually measure either binding or function of proteins cell free or in engineered
cells. Phenotype-based assays monitor more complex endpoints in cells or whole organisms.
These assays utilize small quantities of reagents and test chemicals and can be quite cost
and time effective for analyzing larger numbers of chemicals.

                         TABLE 2
    Differences in the Application of HTS to Pharmaceutical
          Research versus Environmental Toxicology

                                     Pharmaceutical research          Environmental toxicology
Chemical space                       Narrow                           Broad
Chemical numbers                     10^4-10^6                        10^2-10^4
Intended mechanism of action (MOA)   Generally known and specific     May not exist
Target potency                       High                             Generally low
Off-target effects                   Often understood                 May or may not be due to intended MOA
Error rate                           False positives not acceptable   False negatives not acceptable
Parent activity                      Chemical design factor           Usually unknown

  The ability to generate broad-based bioactivity profiles for large libraries of compounds
in coordinated portfolios of biochemical and cellular assays has become the norm in the
pharmaceutical sciences for drug discovery. As bioactivity profiles for compound libraries
have grown, the potential of these profiles for identifying off-target mechanisms and
potential liabilities has begun to emerge (Bhogal et al., 2005; Fliri et al., 2005b,c;
Klekota et al., 2006; Melnick et al., 2006). HTS technology optimized for drug discovery is
now being refocused to applications in toxicological screening. It is important to
appreciate, however, the significant and substantial differences between the application of
HTS to pharmaceutical research versus environmental toxicology (Table 2). The chemical space
and numbers, the targeting and potency, and most importantly the intolerance for false
negatives are all key differences that will impact assay selection and study design for
ToxCast. The aim of drug discovery HTS is to find a small number of active compounds
amenable to subsequent optimization for drug development, and in this pursuit, false
negatives are generally not a major concern. HTS for toxicology must determine the activity
of all compounds tested, and false negatives are of greater concern from a public safety
standpoint.
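
To make the false-negative concern concrete, the Python fragment below is a minimal sketch
of how sensitivity, specificity, and the false-negative rate follow from the four outcome
counts of a screen. The counts are hypothetical and were chosen only so that the result
matches the 80% sensitivity and 90% specificity figures quoted in Table 1; the function
name is likewise an illustration, not part of any ToxCast software.

```python
def screen_metrics(true_pos, false_neg, true_neg, false_pos):
    """Summarize screen performance from confusion-matrix counts."""
    toxic_total = true_pos + false_neg        # chemicals that are truly toxic
    nontoxic_total = true_neg + false_pos     # chemicals that are truly non-toxic
    return {
        "sensitivity": true_pos / toxic_total,
        "specificity": true_neg / nontoxic_total,
        "false_negative_rate": false_neg / toxic_total,
    }


# Hypothetical screen: 100 toxic and 100 non-toxic reference chemicals.
METRICS = screen_metrics(true_pos=80, false_neg=20, true_neg=90, false_pos=10)
```

With these assumed counts, 20 of the 100 toxic chemicals would pass the screen undetected,
which is the outcome a toxicology screen most needs to minimize.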


              ENVIRONMENTAL CHEMICALS RELEVANT TO TOXCAST

  There are potentially 10,000 or more environmental chemicals from several EPA programs in
need of prioritization for further testing. Antimicrobials, pesticidal inerts, high
production volume (HPV: > 1 million lbs/year) chemicals, inventory update rule (> 10,000
lbs/year, < 1 million lbs/year) chemicals, and drinking water contaminant candidate list
chemicals (Fig. 1) generally have limited toxicological data available for hazard and risk
assessments. As ToxCast moves beyond initial proof-of-concept, thousands of environmental
chemicals from various EPA domains can be considered for the ToxCast program. Looking beyond
U.S. borders, there may be utility for a program like ToxCast in Europe's Registration,
Evaluation, and Authorization of Chemicals (REACH) program. In 2003, the European Commission
adopted the REACH proposal as a new regulatory framework for chemicals manufactured or
imported at more than 1 ton per year. Final adoption of the REACH legislation is expected by
the end of 2006, and the legislation is likely to be in force by mid-2007
(http://ec.europa.eu/environment/chemicals/reach/reach_intro.htm).

[Figure 1 labels: H2O CCL, 7324; HPV Chemicals, 3300; Pesticidal Inerts, 3310; IUR
Chemicals, 5400.]
  FIG. 1. Environmental chemicals present challenges to regulatory agencies in prioritizing
for further screening and testing. Chemical pesticide actives are relatively modest in
number (826) and have robust toxicological test data to inform hazard characterization.
However, toxicity data on antimicrobials and pesticidal inerts
(http://www.epa.gov/pesticides/regulating/index.htm), HPV
(http://www.epa.gov/opptintr/chemrtk/hpvchmlt.htm), inventory update rule
(http://www.epa.gov/oppt/iur/index.htm), and drinking water contaminant candidate list
(http://www.epa.gov/safewater/mcl.html#mcls) chemicals are less complete, making
prioritization more difficult.
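
The production-volume thresholds quoted in this section for HPV and inventory update rule
(IUR) chemicals can be written as a simple classification rule. The Python fragment below
is a minimal sketch: the function name and category labels are hypothetical, and because
the text does not say how a chemical at exactly 1 million lbs/year is treated, assigning it
to the IUR range here is an assumption.

```python
HPV_THRESHOLD_LBS = 1_000_000   # HPV: > 1 million lbs/year (from the text)
IUR_THRESHOLD_LBS = 10_000      # IUR: > 10,000 lbs/year (from the text)


def volume_category(lbs_per_year):
    """Assign a production-volume category from annual volume in pounds."""
    if lbs_per_year > HPV_THRESHOLD_LBS:
        return "HPV"
    if lbs_per_year > IUR_THRESHOLD_LBS:
        # Assumption: exactly 1,000,000 lbs/year falls here, not in HPV.
        return "IUR"
    return "below IUR threshold"


CATEGORIES = [volume_category(v) for v in (5_000, 250_000, 3_000_000)]
```

A rule like this only sorts chemicals into inventories; it says nothing about hazard, which
is why the text turns to bioactivity profiling for prioritization within each inventory.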
  For the ToxCast proof-of-concept, conventional chemical pesticide actives are an ideal set
of compounds for a number of reasons. Currently, registered pesticide actives are relatively
modest in number (about 800), yet these actives represent a fairly diverse set of structural
classes (Table 3). Furthermore, these chemicals were all designed to have biological
activity targeted against a pest species. This biological activity promises to provide a
diverse range of positive results in the biochemical and cellular assays of ToxCast. Most
importantly for the purposes of ToxCast, the pesticide actives have a wealth of uniform
toxicological test data to inform hazard characterization. This existing information, and
EPA's evaluation and interpretation of these data in current risk assessments, will provide
the critical and necessary context for interpreting ToxCast data.

                           TABLE 3
         Chemical Pesticide Actives Use Categories and
                       Chemical Classes(a)

Pesticide category            Number of unique chemicals    Number of chemical classes
Total                         826                           163
Pesticide classification
  Classified                  559                           112
  Fungicide                   134                           58
  Herbicide                   173                           66
  Insecticide                 192                           62
  Unclassified                267                           102
Pesticide-use categories
  Food use                    270                           100
  Antimicrobial               219                           76

  (a) Categories and classes derived from EPA's OPP Information Network.

                              TABLE 4
    Examples of Key Targets, Mechanisms, and Toxicities for the
              ToxCast Chemical Prioritization Program

Molecular targets              Biological mechanisms      Toxicities
                               or pathways
Kinases, phosphatases,         Signal transduction        Teratogenicity
  proteases                      pathways
G-protein-coupled receptors,   Receptor-mediated          Reproductive
  steroid, and nonsteroidal      pathways
  nuclear receptors
Gamma-amino-butyric acid       Apoptosis                  (Developmental)
  receptors, ion channels                                   neurotoxicity
DNA, topoisomerases,           Oxidative stress           (Developmental)
  ligases, and helicases                                    immunotoxicity
Caspases, cyclins              DNA recombination          Genotoxicity
  (dependent kinases)            and repair
                               Cell cycle                 Carcinogenicity

    DESIGN OF THE TOXCAST RESEARCH PROGRAM

   ToxCast is designed to populate multiple data domains of
increasing  biological  relevance and experimental cost, from
in silico to in vitro, and  perhaps even in vivo with nonmam-
malian model organisms (Fig. 2).  Associations between data
domains and across chemicals can be made in order to generate
bioactivity fingerprints and to group or bin chemicals. It is these
larger patterns gleaned from bioactivity profiling across a broad
range of assays that can be associated  with either chemical
structure (Fliri et al., 2005b,c; Melnick et al., 2006), or with
known toxicity of reference chemicals in a proof-of-concept
  [Figure 2 panel labels: Physical-Chemical Indicators; Bio-Computational
  Indicators; Biochemical Based Indicators; Cellular Based Indicators]
  FIG. 2.  The multiple data domains that will comprise the ToxCast research
program increase in both biological relevance and cost, along the continuum
from in silico to in vitro to in vivo models. Associations between data domains,
and across chemicals, will be used to  bin or group chemicals with similar
bioactivity profiles.
   study. It is from these associations or correlations between chem-
   ical structure, bioactivity profile, and toxicity outcome that the
   predictive power of ToxCast will be derived. The chemical and
   biological  diversity in ToxCast will afford  an opportunity  to
   establish  qualitative  connections  and  quantifiable  linkages
   between chemical structure, biological activity, and known  or
   predicted toxicity.
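
The grouping of chemicals by bioactivity fingerprint described above can be illustrated with a minimal sketch. It assumes each chemical is summarized as the set of assays in which it was scored active, that similarity is measured with the Tanimoto (Jaccard) coefficient, and that chemicals are binned by single-linkage grouping at a fixed threshold. The chemical names, assay names, and threshold are hypothetical, and the source does not specify which similarity measure or clustering method ToxCast will use.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of active assay names."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def bin_chemicals(fingerprints, threshold=0.5):
    """Group chemicals by single linkage: two chemicals share a bin when a
    chain of pairwise similarities >= threshold connects them."""
    parent = {name: name for name in fingerprints}

    def find(x):
        # Follow parent links to the representative of x's bin.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(fingerprints, 2):
        if tanimoto(fingerprints[a], fingerprints[b]) >= threshold:
            parent[find(a)] = find(b)

    bins = {}
    for name in fingerprints:
        bins.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in bins.values())


# Hypothetical hit calls: chemical -> set of assays scored as active.
fingerprints = {
    "chem_A": {"ER_binding", "AR_binding", "caspase3"},
    "chem_B": {"ER_binding", "AR_binding"},
    "chem_C": {"GABA_receptor", "ion_channel"},
    "chem_D": {"GABA_receptor", "ion_channel", "caspase3"},
}
bins = bin_chemicals(fingerprints)
print(bins)
```

A set-based fingerprint ignores potency; a concentration-response summary per assay could replace the binary hit call without changing the overall structure of the sketch.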
     In the course of identifying screening targets (Table 4)  or
   assays suitable for ToxCast (Table 5), two key considerations
   are the technical  and economic feasibility  of pursuing  that
   target or assay for thousands of chemicals. Rather than just the
                              TABLE 5
         Assay and Chemical Selection Criteria for ToxCast

Assays                                   Chemicals (for proof-of-concept)
Capacity for thousands of chemicals      Completeness, quality, and relevancy
                                           of toxicological data
Broad spectrum of genes, proteins,       Use in other toxicity testing
  and metabolites                          programs (e.g., NTP HTS, HPV)
Optimized for HTS or 'omics              MW, solubility, ALogP (suitability
                                           for HTS)
Linkage to known toxicological           Range of toxicities, modes of action
  MOA
If in vivo, nonmammalian models          Structural chemical classes, and
                                           enough members per class for
                                           grouped analysis
Ability to test concentration-response   Chemical use classes
Metabolic capability                     Role of metabolism in toxicity
Minimal false negatives                  Suitable purity, enantiomeric forms
                                           available
Currently available

     Note. MW, molecular weight.

-------
                                          TOXCAST CHEMICAL PRIORITIZATION
druggable proteome, ToxCast sets out to survey a broad spectrum
of genes, proteins, and metabolites that comprise the cellular
"interactome." Pathway-based analyses may also identify
effects on higher level signaling, in addition to discrete targets
within the cellular interactome. These pathways could serve as
a good middle ground between biochemical or other target-
focused assays and more phenomenological, phenotypic, or high-
content assays. Thus, the range  of potential targets  and  assays
is very broad, and increasing biological relevance will have to
be balanced against increasing  cost for various data domains
(Fig. 2). Two abiding requirements for  ToxCast assays will be
the ability to minimize false negatives relating to  hazards to
human health and the current availability of these assays from
reliable sources.
  The majority of ToxCast data will come from a diverse series
of assay  types that collectively evaluate  a broad spectrum of
bioactivities (Fig. 3). Like prior examples in the literature (Fliri
et al., 2005c; Janzen and Hodge, 2006; Melnick et al., 2006),
ToxCast  data will include  numerous HTS assays delineating
biological effects. Eventually, ToxCast  is  designed to research
thousands of chemicals, requiring a managed library of envi-
ronmental chemicals and sophisticated chemical and biological
informatics  to identify meaningful  data associations. HTS
biochemical  assays will  be supplemented by cellular  assays
for more complex biological effects and toxicities (Table 4,
Fig. 3). The potential of high-content screening (Borchert et al.,
2005; O'Brien et al., 2006)  and  microelectronic monitoring
(Xing et  al.,  2005) of cells for detection  of specific toxicities
will be researched.  In addition,  the  use  of  the nematode
Caenorhabditis elegans  (in collaboration with the  National
Institute of Environmental Health Sciences [NIEHS]; Schwartz
et al., 2004), and  the zebrafish Danio rerio will be explored as
models of mammalian toxicity. Biological samples  from these
various in vitro and in vivo  assays will also be utilized for
supplemental genomics and metabolomics.




  FIG. 3.  Data generation for the ToxCast program will begin with a man-
aged chemical library, then flow from seven integrated types of analyses
evaluating a broad spectrum of bioactivities. These data will be interpretively
linked within the ToxCast database and a structured strategy developed to
predict toxicity.
        While much of the ToxCast data are likely to come from
      HTS enzyme and receptor assays, an important complement to
      these data will be derived from assays using complex formats
      of human, nonhuman primate, or rodent cells for detecting bio-
      transformation and complex toxicities. These are capable  of
      detecting  secondary effects (e.g., altered membrane perme-
      ability) resulting from chemically induced perturbations of the
      interactome. For example, in vitro primary hepatocyte models
      of the  liver are commonly used to screen for metabolism and
      toxicity of xenobiotics, but primary hepatocytes rapidly lose
      liver-specific functions under standard cell culture conditions.
      Advances in tissue engineering and in silico modeling are
      enabling development of novel engineered  approaches (Allen
      et al., 2005;  Sivaraman  et al., 2005)  that could  improve
      chemical hazard  testing by recreating the  three-dimensional
      microscale of the liver. Such tissue engineering raises  new
      possibilities for the study  of complex toxicological processes
      in vitro (Griffith  and Swartz, 2006), and the convergence  of
      HTS and toxicogenomic data with systems biology is creating
      opportunities for developing bioengineered  and computational
      models that more realistically replicate hepatic architecture and
      function. Toward this end, ToxCast could provide critical data
      for defining processes such as nuclear receptor-mediated reg-
      ulation of xenobiotic-metabolizing enzymes, useful to systems
      biology models of toxicity. It is through systems biology that
      the issue of metabolism and biotransformation  may be best
      addressed within ToxCast. A recent workshop organized by the
      European Center for  the  Validation of Alternative  Methods
      (Coecke et al., 2006) emphasized the need to account for
      biotransformation with  appropriate methods and to  consider
      how such  information  can be incorporated  into computer
      models for hazard identification.
        Toxicogenomic assays, specifically the highly parallel profiling
      of gene expression and cellular metabolites in ToxCast
      biological samples, can be an important adjunct or alternative to
      biochemical HTS profiling. For example, nuclear receptor bind-
      ing and activity could be assessed by monitoring expression  of
      suites of genes that are the transcriptional targets for specific
      nuclear receptors of interest. The appropriate target genes can
      be identified by  a complementary  suite of positive internal
      control ligands (e.g., testosterone for the androgen receptor, ri-
      fampicin  for the human pregnane X receptor) utilized  in
      ToxCast cellular assays. Receptor activities could then be as-
      sessed based on expression of receptor-modulated genes and
      utilized as  an  efficient toxicogenomics in vitro assay for
      characterization  of environmental  chemicals (Yang et al.,
      2006a).
          SELECTION OF PROOF-OF-CONCEPT CHEMICALS

        The essential first step for the ToxCast program is to conduct
      a demonstration phase using  reference chemicals  that have
      an existing, rich toxicological database (i.e., registered chemical

-------
pesticide actives). Several hundred reference chemicals repre-
sentative  of  differing  structural  classes  and  phenotypic
outcomes (e.g., carcinogens, reproductive toxicants, neuro-
toxicants) will need to be evaluated in ToxCast's wide net of
assays and endpoints for this proof-of-concept. As the program
matures, the assays and endpoints may be narrowed or modified
based on predictive value, derived from associations between
various data domains and the known toxicological properties of
the reference chemicals. From this proof-of-concept, a broader
strategy for identifying toxicity potential, minimizing false
negatives, and prioritizing subsequent testing can be developed
for the larger number of environmental chemicals having limited
toxicological  data. This proof-of-concept will be especially
important because of the challenges of ToxCast, as compared to
conventional drug discovery, attributable in part to the diversity
of environmental  chemicals  and issues relating to solubility,
volatility,  or confounding cytotoxicity.
  Working from EPA databases,  826 conventional chemical
pesticide  actives that are currently  registered or undergoing
registration were identified. Of these  826, at least 270 are food-
use pesticides  that have the most extensive  testing require-
ments. Table  3 presents EPA's  Office of Pesticide  Programs
(OPP) use categories and chemical classes for the majority of
these pesticides. Table 5 lists the general selection criteria that
were used for ranking chemical pesticide actives as candidates
for the ToxCast proof-of-concept. Structural annotation  was
added  to these pesticide actives, and further chemoinformatic
analysis was conducted using LeadScope Enterprise (http://
www.leadscope.com; Table  6). These 785 chemicals  were
characterized  into 101  structural classes,  with 28  of these
classes being  singletons. For proof-of-concept,  the chemicals
were prioritized based  on several criteria. High priority  was
generally  given to those chemicals in common  with the 1408
chemicals that the National  Toxicology Program (NTP) has
provided NCGC in early 2006  for  HTS. Compatibility with
standard HTS  assays was also considered; thus, low prior-
ity was given to inorganics, organometallics, high ALogP
                         TABLE 6
  Chemical Pesticide Actives Properties and Structural Classes
         Relative to use in ToxCast Proof-of-Concept
Physical-chemical properties"
# Unique chemicals
Common to NTP-1408"
MW > 150, solubility > 2, ALogP < 5
100  6
Total
       92
      328
      219

      146
      785
  Note. MW, molecular weight.
  "Physical-chemical  properties and  structural  classes calculated using
Leadscope Enterprise v2.3.6-2 (Leadscope Inc., Columbus, OH).
  *NTP provided 1408 chemicals to NCGC for HTS.
                  (octanol/water partitioning), and molecular weights < 150. The
                  328 prioritized chemicals were secondarily ranked in descend-
                  ing order of representation in other toxicological databases
                  annotated in the EPA DSSTox Structure Data File collection
                  (http://www.epa.gov/ncct/dsstox/index.html), or in other EPA
                  programs (e.g., industrial  HPV chemicals)  that correlate in
                  some fashion to ToxCast. A small minority of inorganics and
                  organometallics are included in this set of  328  chemicals
                  because of their relevance to other toxicological programs. The
                  remaining  chemicals included  an additional 219  chemical
                  pesticides that might be suited to HTS.
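
The ranking logic described in this section can be sketched as a sort over candidate chemicals. The sketch below is only an illustration under stated assumptions: the cutoffs (MW > 150, ALogP < 5) are taken from Table 6, the solubility criterion is omitted because its units are not given there, and the way the criteria are combined into a single sort key is hypothetical. The record fields and example chemicals are invented for illustration and are not real pesticide data.

```python
from dataclasses import dataclass


@dataclass
class Chemical:
    name: str
    mol_weight: float
    alogp: float
    inorganic_or_organometallic: bool
    in_ntp_1408: bool          # overlaps the NTP set provided to NCGC
    n_other_databases: int     # representation in other toxicity databases


def hts_compatible(c):
    """Assumed compatibility rule: organic, MW > 150, ALogP < 5."""
    return (not c.inorganic_or_organometallic
            and c.mol_weight > 150
            and c.alogp < 5)


def prioritize(chemicals):
    """Rank NTP-overlapping chemicals first, then HTS-compatible ones, then
    by descending representation in other databases (name breaks ties)."""
    def key(c):
        return (not c.in_ntp_1408, not hts_compatible(c),
                -c.n_other_databases, c.name)
    return sorted(chemicals, key=key)


# Hypothetical candidates, not real pesticide records.
candidates = [
    Chemical("pesticide_1", 320.0, 3.1, False, True, 4),
    Chemical("pesticide_2", 120.0, 1.0, False, False, 6),
    Chemical("pesticide_3", 410.5, 6.2, False, True, 2),
    Chemical("pesticide_4", 250.0, 2.4, False, False, 5),
    Chemical("pesticide_5", 280.0, 4.0, False, True, 7),
]
ranked = [c.name for c in prioritize(candidates)]
print(ranked)
```

A tuple sort key keeps each criterion explicit and makes it easy to reorder or drop criteria as the selection rules evolve.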
                   INTEGRATING CHEMICAL AND BIOLOGICAL DATA TO
                                    FORECAST TOXICITY

                    Within ToxCast, data will be generated on an environmental
                  chemical library using  numerous types of assays  evaluating
                  a broad spectrum of bioactivities (Fig. 3). These data will need
                  to be relationally linked within the ToxCast database to other
                  physical-chemical, toxicological, and in silico information, and
                  a structured strategy developed to predict toxicity based on this
                  entire data set. This structured strategy will be forged upon the
                  known toxicities of the proof-of-concept chemical pesticides.
                    We are currently in the process of collecting toxicological
                  data on pesticides and working with the OPP on how to ac-
                  curately and precisely capture this information into a relational
                  database. The OPP evaluates submitted toxicological studies in
                  a standardized  review process, which  is  captured in a Data
                  Evaluation  Record (DER). Information is being culled from
                  DERs on endpoints, dose-response, and critical effects in mam-
                  malian test species for approximately 400 chemical pesticides.
                    [Figure 4 panel labels: HIGH; MEDIUM; Further Screening and Testing]
                    FIG. 4. The application of ToxCast data to the process of prioritizing
                  environmental chemicals based on hazard prediction: chemicals given a low
                  priority may enter into no further testing, medium priorities may be recycled
                  into ToxCast for further evaluation, and high priorities recommended for
                  further screening and testing.

-------
The DERs  being  used  are  primarily  from neurotoxicity,
developmental, reproductive, subchronic, chronic, and cancer
guideline toxicology tests. The OPP conventional toxicology
for the proof-of-concept pesticides  will  complement  the
chemoinformatic,  HTS, and  toxicogenomic  information in
the ToxCast database,  allowing  us to develop and  validate
ToxCast's predictive  power. In  addition,  toxicological data
from other EPA Programs (e.g., HPV Challenge) and the NTP
will also be helpful in developing ToxCast. Throughout the
course of methods and data development, our goal is to keep
ToxCast a public and transparent enterprise.
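
One way to picture the relational capture of DER information (endpoints, dose-response, and critical effects by test species) is a small three-table schema. The following is a sketch only, using Python's built-in sqlite3 module; the table names, column names, and example rows are hypothetical and do not describe the actual ToxCast database design.

```python
import sqlite3

# In-memory database with a hypothetical chemical -> study -> effect layout.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE chemical (
    chemical_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
CREATE TABLE study (
    study_id    INTEGER PRIMARY KEY,
    chemical_id INTEGER NOT NULL REFERENCES chemical(chemical_id),
    study_type  TEXT NOT NULL,   -- e.g., chronic, developmental
    species     TEXT NOT NULL
);
CREATE TABLE effect (
    effect_id      INTEGER PRIMARY KEY,
    study_id       INTEGER NOT NULL REFERENCES study(study_id),
    endpoint       TEXT NOT NULL,
    dose_mg_kg_day REAL NOT NULL,
    is_critical    INTEGER NOT NULL DEFAULT 0
);
""")

# Placeholder rows for illustration; these are not real study results.
conn.execute("INSERT INTO chemical VALUES (1, 'example_pesticide')")
conn.executemany(
    "INSERT INTO study VALUES (?, ?, ?, ?)",
    [(1, 1, "chronic", "rat"), (2, 1, "developmental", "rabbit")],
)
conn.executemany(
    "INSERT INTO effect (study_id, endpoint, dose_mg_kg_day, is_critical) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, "liver weight increase", 50.0, 1),
        (1, "body weight decrease", 200.0, 0),
        (2, "skeletal variation", 25.0, 1),
    ],
)

# Lowest dose with a critical effect, per chemical and study type.
rows = conn.execute("""
    SELECT c.name, s.study_type, MIN(e.dose_mg_kg_day)
    FROM effect e
    JOIN study s ON s.study_id = e.study_id
    JOIN chemical c ON c.chemical_id = s.chemical_id
    WHERE e.is_critical = 1
    GROUP BY c.name, s.study_type
    ORDER BY s.study_type
""").fetchall()
print(rows)
```

Keeping effects in their own table lets one study contribute several endpoints at several doses, which is the shape of information a dose-response record needs.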
  Another ongoing informatics effort  is aimed at generating,
collating, reviewing, and  organizing unambiguous definitions
of chemical identity and structure for the various environmen-
tal  chemical domains  relevant to ToxCast  and EPA. To ac-
complish this, we are building on the DSSTox project. This will
also aid in the identification of other potentially useful sources
of data relative to the ToxCast candidate chemical list, as well
as  help identify  structurally similar chemicals for which
toxicity or bioassay data might be available.
  Figure 4 presents a flowchart for applying ToxCast data and
predictions  to the process of prioritizing chemicals.  Hazard
prediction represents both the primary goal and the key bio-
informatics  and chemoinformatics challenge of this approach,
and the value of such an approach is self-evident so long as false
negatives are minimized. Over the past several years, a number of
studies have been published presenting  alternative, in vitro,  and
in some cases HTS methods for integrated testing of chemicals
for bioactivity and associations with toxicity or side effects. One
example relevant to ToxCast was an integrated, tiered approach
using  computational and experimental  in vitro data for hazard
assessment, although limited to only 10 environmental chemicals
(Gubbels-van Hal et al., 2005). The hazard assessment for these
10 substances was performed on the basis of available nonanimal
data, quantitative structure-activity relationships, physiologically
based pharmacokinetic modeling,  and  additional  new in vitro
testing. Based on these data, predictions of various  toxicities
were  made  and then compared  with prior  in vivo testing to
demonstrate at least a partial success. However,  the  limited
number of chemicals included in the study of Gubbels-van Hal
et al. did not allow conclusions to be drawn for the thousands of
chemicals subject to REACH.  It  is  apparent that methods
compatible with larger numbers of chemicals, which do not lead
to substantially higher costs for industry, need to be developed.
We suggest that HTS technologies, larger chemical libraries, and
expanded  data  analysis  techniques  may   accomplish these
broader goals within ToxCast.
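
The prioritization flow of Figure 4, together with the requirement that false negatives be minimized, can be sketched as a simple triage rule plus a check of the false-negative rate on reference chemicals. This is a hypothetical illustration: the source does not define a numeric hazard score or any cutoffs, so the 0-to-1 score, both cutoff values, and the reference chemicals below are assumptions.

```python
def triage(hazard_score, low_cutoff=0.3, high_cutoff=0.7):
    """Map a predicted hazard score in [0, 1] to a priority and an action,
    following the three outcomes described for Figure 4."""
    if not 0.0 <= hazard_score <= 1.0:
        raise ValueError("hazard_score must be between 0 and 1")
    if hazard_score >= high_cutoff:
        return "high", "further screening and testing"
    if hazard_score >= low_cutoff:
        return "medium", "recycle into ToxCast for further evaluation"
    return "low", "no further testing"


def false_negative_rate(scores, known_toxic, low_cutoff):
    """Fraction of known-toxic reference chemicals that would be triaged
    as low priority at the given cutoff."""
    toxic = [c for c in scores if c in known_toxic]
    if not toxic:
        raise ValueError("no known-toxic reference chemicals were scored")
    missed = [c for c in toxic if scores[c] < low_cutoff]
    return len(missed) / len(toxic)


# Hypothetical predicted scores for four reference chemicals.
scores = {"ref_1": 0.9, "ref_2": 0.45, "ref_3": 0.2, "ref_4": 0.1}
known_toxic = {"ref_1", "ref_2", "ref_3"}

fnr_default = false_negative_rate(scores, known_toxic, low_cutoff=0.3)
fnr_conservative = false_negative_rate(scores, known_toxic, low_cutoff=0.15)
print(triage(scores["ref_2"]), fnr_default, fnr_conservative)
```

Lowering the low-priority cutoff reduces false negatives at the cost of sending more chemicals back for evaluation, which is the trade-off the proof-of-concept chemicals are meant to quantify.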
                     CONCLUSIONS

  The strategy of ToxCast encompasses a diverse range of data
types. No single assay or endpoint will have a large impact on
interpretation of the fingerprint or bioactivity profile. It will be
      the overall pattern across many assays and data types that will be
      the predictor of toxicity used for prioritizing chemicals. This
      will be the main goal of ToxCast, taking advantage of HTS and
      toxicogenomic technologies  for bioactivity profiling  of envi-
      ronmental chemicals related in  structure or mechanism of
      action. Although the primary purpose is not to identify mech-
      anisms of action of environmental toxicants per se, this might be
      a future benefit of the program. The availability of a biologically
      and chemically  based system to categorize chemicals of like
      properties and activities will provide EPA Program Offices with
      a valuable tool that heretofore has been seriously lacking.
        In  late 2005, EPA  organized  the Chemical  Prioritization
      Community of Practice (CPCP) to provide a forum for  dis-
      cussing  the  utility  of  computational  chemistry, HTS,  and
      various toxicogenomic technologies for chemical prioritization
      and Agency use. The CPCP  has brought together experts and
      interested parties to discuss  chemical prioritization research.
      This has afforded various groups the opportunity to consider
      the ToxCast concept, from EPA Program Offices, to  external
      stakeholders such as the American Chemistry Council, the Cen-
      ter for Alternatives to Animal Testing, CropLife America and
      Environmental  Defense.  In   addition,  the CPCP  has been
      helpful in building partnerships and communicating with the
      NTP,  the NIEHS, and the NCGC.
        Many  hurdles remain to be cleared by ToxCast as it transits
      from  concept to proof-of-concept and  ultimately to  a useful
      prioritization tool, including  (1) accessing a chemical library
      providing coverage of sufficient chemical space, (2) identifying
      an upper limit on the per chemical cost  of obtaining screening
      level  data, (3) selecting assays within available resources  that
      produce  predictive bioactivity profiles,  (4) evaluating the im-
      pact of metabolism on  the efficiency and accuracy of assays,
      (5)  developing a bioinformatic approach to mining ToxCast
      data and identifying predictive signatures, and (6) carrying out
      a prospective prioritization for chemicals currently  entering
      a traditional testing process, in such a way that minimizes false
      negatives. These hurdles will be the  focus of the ToxCast
      program over the next few years.


                      SUPPLEMENTARY DATA

        Supplementary data are available online at
      http://toxsci.oxfordjournals.org/.


                       ACKNOWLEDGMENTS

        We thank the many EPA, NTP, and NCGC colleagues who have supported
      our initial and ongoing efforts to develop ToxCast. Particular thanks are due to
      Maritja Wolf (Lockheed Martin, contractor to the EPA) for chemoinformatics
      assistance; Raymond Tice, Cynthia Smith,  and Kristine Witt for coordination
      with the NTP HTS  program; Chris  Austin and Jim Inglese for coordination
      with the NCGC; Tina Levine, Elizabeth Mendez, Elissa Reaves, and Jess
      Rowland,  for assistance  with EPA/OPP  toxicological data;  and to Vicki
      Dellarco (EPA/OPP) and Phil Sayre (EPA/OPPT) for guidance and helpful
      comments  in the review of this article.

-------
                           REFERENCES

Allen, J. W., Khetani, S. R., and Bhatia, S. N. (2005). In vitro zonation and
  toxicity in a hepatocyte bioreactor. Toxicol. Sci. 84, 110-119.
Austin, C. P., Brady, L. S., Insel, T. R., and Collins, F. S. (2004). NIH Molecular
  Libraries Initiative. Science 306,  1138-1139.
Berg, E. L., Kunkel, E.  J., Hytopoulos, E., and Plavec, I. (2006). Character-
  ization of  compound mechanisms and  secondary activities by BioMAP
  analysis. J. Pharmacol. Toxicol. Methods 53, 67-74.
Bhogal,  N., Grindon,  C.,  Combes, R., and  Balls, M. (2005). Toxicity testing:
  Creating a revolution based on new technologies. Trends Biotechnol. 23, 299-307.
Borchert, K.  M., Galvin, R. J., Frolik, C.  A.,  Hale, L. V., Halladay, D. L.,
  Gonyier, R. J., Trask, O. J., Nickischer,  D. R., and  Houck, K. A. (2005).
  High-content screening assay for activators of the Wnt/Fzd pathway  in
  primary human cells. Assay Drug Dev. Technol. 3, 133-141.
Bredel, M., and Jacoby, E. (2004). Chemogenomics: An emerging strategy for
  rapid target and drug discovery. Nat. Rev. Genet. 5, 262-275.
Coecke, S., Ahr, H., Blaauboer, B. J., Bremer, S., Casati, S., Castell, J., Combes, R.,
  Corvi, R., Crespi, C. L., Cunningham, M. L., et al. (2006). Metabolism: A
  bottleneck  in in vitro toxicological test development. The report and
  recommendations of ECVAM workshop 54. Altern. Lab Anim. 34, 49-84.
Ekins, S., Berbaum, J., and Harrison, R. K. (2003). Generation and validation of
  rapid computational filters for cyp2d6 and cyp3a4. Drug Metab. Dispos. 31,
  1077-1080.
Fliri, A. F., Loging, W. T., Thadeio, P. F., and Volkmann, R. A. (2005a).
  Analysis of drug-induced effect patterns to link structure and side effects of
  medicines. Nat. Chem. Biol. 1, 389-397.
Fliri, A. F., Loging, W. T., Thadeio, P. F., and Volkmann, R. A. (2005b).
  Biological spectra analysis: Linking biological activity profiles to molecular
  structure. Proc. Natl. Acad. Sci. USA. 102, 261-266.
Fliri, A. F., Loging, W. T., Thadeio, P. F., and Volkmann, R. A. (2005c).
  Biospectra analysis: Model proteome characterizations for linking molecular
  structures and biological response. J. Med. Chem. 48, 6918-6925.
Griffith, L. G., and Swartz, M. A. (2006). Capturing complex 3D tissue
  physiology in vitro. Nat. Rev. Mol. Cell Biol. 7, 211-224.
Gubbels-van Hal, W. M., Blaauboer, B. J., Barentsen, H. M., Hoitink, M. A.,
  Meerts, I. A., and van der Hoeven, J. C. (2005). An alternative approach for
  the safety evaluation of new and existing chemicals, an exercise in integrated
  testing. Regul. Toxicol. Pharmacol. 42, 284-295.
Janzen,  W. P., and Hodge, C.  N. (2006). A chemogenomic approach  to
  discovering target-selective drugs. Chem. Biol. Drug  Des. 67, 85-86.
Kikkawa, R., Fujikawa, M., Yamamoto, T., Hamada, Y., Yamada, H., and
  Horii, I. (2006). In vivo hepatotoxicity  study of rats in comparison with
  in vitro hepatotoxicity screening  system.  J. Toxicol. Sci. 31, 23-34.
Klekota, J., Brauner, E., Roth, F. P., and Schreiber,  S. L. (2006). Using high-
  throughput screening data to discriminate compounds with single-target
  effects from those with side effects. J. Chem. Inf. Model. 46, 1549-1562.
Macarron, R. (2006). Critical review of the role of HTS in drug discovery. Drug
  Discov. Today 11, 277-279.
   Melnick, J. S., Janes, J., Kim, S., Chang, J. Y., Sipes, D. G., Gunderson, D.,
     Jarnes, L., Matzen, J. T., Garcia, M. E., Hood, T. L., et al. (2006). An efficient
     rapid system for profiling the cellular activities of molecular libraries. Proc.
     Natl. Acad. Sci.  USA.  103, 3153-3158.
   O'Brien, S. E.,  and de Groot, M.  J. (2005).  Greater  than the  sum  of its
     parts: Combining models for useful ADMET prediction. J. Med. Chem. 48,
     1287-1291.
   O'Brien, P. J.,  Irwin,  W., Diaz, D., Howard-Cofield, E.,  Krejsa, C. M.,
     Slaughter, M.  R., Gao, B., Kaludercic, N., Angeline, A., Bernardi, P., et al.
     (2006). High concordance of drug-induced human hepatotoxicity with in
     vitro cytotoxicity measured in a novel cell-based model using high content
     screening. Arch. Toxicol. 80, 580-604.
   Poroikov, V. V., Filimonov, D. A., Ihlenfeldt, W. D., Gloriozova, T. A., Lagunin,
     A. A., Borodina, Y. V., Stepanchikova, A. V., and Nicklaus, M. C. (2003).
     PASS biological activity spectrum predictions in the enhanced open NCI
     database browser. J. Chem. Inf. Comput. Sci.  43, 228-236.
   Richard, A. M., Gold, L. S., and Nicklaus, M. C. (2006). Chemical structure
     indexing of toxicity data on the internet: moving towards a flat world. Curr.
     Opin. Drug Discov. Dev. 9, 314-325.
   Scherf, U., Ross, D. T., Waltham, M., Smith, L. H., Lee, J. K., Tanabe, L., Kohn,
     K. W., Reinhold, W. C., Myers, T. G., Andrews, D. T., et al. (2000). A gene
     expression database for the molecular pharmacology of cancer. Nat. Genet.
     24, 236-244.
   Schwartz, D. A., Freedman, J. H., and Linney, E. A.  (2004). Environmental
     genomics: A key to understanding biology,  pathophysiology and disease.
     Hum. Mol. Genet. 13, R217-224.
   Sivaraman, A., Leach, J. K., Townsend, S., Iida, T., Hogan, B. J., Stolz, D. B.,
     Fry, R., Samson, L. D., Tannenbaum, S. R.,  and Griffith, L. G. (2005). A
     microscale in vitro physiological model of the liver: Predictive  screens for
     drug metabolism and enzyme induction. Curr. Drug Metab. 6, 569-591.
   Smith, S. C., Delaney, J.  S., Robinson, M. P., and Rice, M. J. (2005). Targeting
     inputs and optimising HTS for agrochemical  discovery. Comb. Chem. High
     Throughput Screen. 8, 577-587.
   Tietjen, K., Drewes, M., and Stenzel,  K. (2005). High throughput screening
     in agrochemical research. Comb.  Chem. High Throughput Screen. 8, 589-
     594.
   Walum, E., Hedander, J., and Garberg, P. (2005). Research perspectives for pre-
     screening alternatives  to animal experimentation—On  the relevance  of
     cytotoxicity  measurements,  barrier  passage  determinations  and high
     throughput screening in vitro to select potentially hazardous compounds in
     large sets of chemicals. Toxicol. Appl.  Pharmacol. 207, 393-397.
   Xing, J. Z., Zhu, L., Jackson, J. A., Gabos,  S., Sun, X.  J., Wang, X. B., and
     Xu,  X. (2005).  Dynamic monitoring of cytotoxicity on  microelectronic
     sensors. Chem. Res. Toxicol. 18, 154-161.
   Yang, Y., Abel, S. J., Ciurlionis, R., and Waring, J. F. (2006a). Development of
     a  toxicogenomics in  vitro assay  for the  efficient  characterization  of
     compounds. Pharmacogenomics 7, 177-186.
   Yang, C., Richard,  A. M., and Cross, K. P. (2006b). The art of data mining the
     minefields of toxicity databases to link chemistry to biology. Curr. Comput.
     Aided Drug Des. 2,  1-19.

-------
  A  Balanced Accuracy  Fitness Function Leads to Robust
 Analysis  using  Grammatical  Evolution Neural Networks  in
                           the Case  of Class  Imbalance
Nicholas E. Hardison
Bioinformatics Research Ctr., Department of Statistics
North Carolina State University, Raleigh, NC 27606
nhardis@ncsu.edu

Theresa J. Fanelli
Ctr. for Human Genetics Research, Department of Molecular Physiology &
Biophysics; Vanderbilt University, Nashville, TN 37232
tjf5004@psu.edu

Scott M. Dudek
Ctr. for Human Genetics Research, Department of Molecular Physiology &
Biophysics; Vanderbilt University, Nashville, TN 37232
dudek@chgr.mc.vanderbilt.edu

David M. Reif
National Ctr. for Computational Toxicology; U.S. Environmental
Protection Agency, RTP, NC 27711
reif.david@epa.gov

Marylyn D. Ritchie
Ctr. for Human Genetics Research, Department of Molecular Physiology &
Biophysics; Vanderbilt University, Nashville, TN 37232
ritchie@chgr.mc.vanderbilt.edu

Alison A. Motsinger-Reif
Bioinformatics Research Ctr., Department of Statistics
North Carolina State University, Raleigh, NC 27606
motsinger@stat.ncsu.edu

ABSTRACT
Grammatical  Evolution  Neural   Networks  (GENN)   is  a
computational method designed to detect gene-gene interactions
in genetic epidemiology, but has so far only been evaluated in
situations with balanced numbers of cases and controls. Real data,
however, rarely has such perfectly balanced classes. In the current
study, we test the power of GENN to detect interactions in data
with a  range of class imbalance using two  fitness  functions
(classification error and balanced  error), as well as data re-
sampling. We show that when using classification error, class
imbalance greatly decreases the power of GENN. Re-sampling
methods demonstrated improved  power, but using  balanced
accuracy resulted in the highest power. Based on the results of this
study,  balanced  error has replaced classification error  in the
GENN algorithm.

Categories and Subject Descriptors

Genetics-Based  Machine  Learning  and  Learning   Classifier
Systems.

General Terms
Algorithms

 Permission to make digital or hard copies of all or part of this work for
 personal or classroom use is granted without fee provided that copies are
 not made or distributed for profit or commercial advantage and that
 copies bear this notice and the full citation on the first page. To copy
 otherwise, or republish, to post on servers or to redistribute to lists,
 requires prior specific permission and/or a fee.
 GECCO '08, July 12-16, 2008, Atlanta, Georgia, USA.
 Copyright 2008 ACM 1-58113-000-0/00/0004...$5.00.

1. INTRODUCTION
Grammatical Evolution Neural Networks (GENN) uses
grammatical evolution to evolve neural networks to detect
gene-gene interactions in studies of complex human diseases [1].
GENN has shown initial successes in both real and simulated
data, and while these results are encouraging, previous simulation
studies have used datasets with balanced numbers of cases and
controls. Unfortunately, when using standard classification error
as the fitness function, many machine learning methods are not
robust to class imbalance.
To address this problem, investigators have tried techniques
such as re-sampling [2] or altering the fitness metric. One metric
that has been shown to be highly successful is balanced
error/accuracy [3]. This metric has been shown to solve the class
imbalance problem for another approach designed to detect
epistasis, Multifactor Dimensionality Reduction (MDR) [4].
We assessed the performance of GENN on data with varying
levels of class imbalance and show that the power of GENN using
classification error decreases as the control:case ratio departs from
unity. We compared three methods for addressing this concern:
two re-sampling methods (over- and under-sampling) and balanced
accuracy as a fitness function.

                        2.  METHODS
                        2.1  Grammatical Evolution Neural Networks
The steps of GENN have been previously described in detail [1].
For the purposes of the current study, an option was added to the
configuration file to specify the fitness function used:
classification error (CE) or balanced error (BE). BE is the
complement of balanced accuracy (BE = 1 - balanced accuracy),
where balanced accuracy is defined as the mean of sensitivity and
specificity [3]:

$$\text{Balanced Accuracy} = \frac{\text{sensitivity} + \text{specificity}}{2} = \frac{1}{2}\left[\frac{TP}{TP+FN} + \frac{TN}{TN+FP}\right]$$

where TP represents true positives, TN represents true negatives,
FP represents false positives, and FN represents false negatives.
This formula equally weights the errors within each class. In the
case of balanced data, BE is equivalent to standard CE.
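
As an illustration of the two fitness functions, the following is a minimal Python sketch, not the GENN implementation. The function names and the toy labels are assumptions made for this example; cases are coded 1 and controls 0.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = case, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_error(y_true, y_pred):
    """CE: fraction of all individuals that are misclassified."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (fp + fn) / (tp + tn + fp + fn)


def balanced_error(y_true, y_pred):
    """BE: one minus the mean of sensitivity and specificity."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return 1.0 - (sensitivity + specificity) / 2.0


# Hypothetical 4:1 control:case data and a trivial model that
# labels every individual as a control.
y_true = [1] * 20 + [0] * 80
y_pred = [0] * 100
print(classification_error(y_true, y_pred))  # 0.2
print(balanced_error(y_true, y_pred))        # 0.5
```

In this toy example the trivial model looks good under CE (0.2) but no better than chance under BE (0.5), which is the behavior that motivates using BE on imbalanced data.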




-------
2.2 Data Simulation
The intention of the data simulations for this power study was to
mimic gene-gene interaction, or epistasis, in case-control genetic
data to evaluate GENN using penetrance functions. Penetrance
defines the probability of disease given  a particular  genotype
combination by  modeling  the relationship  between genetic
variations and disease risk. We used two well-described purely
epistatic models, where the heritability (the proportion of trait
variance due to genetics) was ~5%. The first is referred to as the XOR
model, and the second is referred to as the ZZ model [5]. Both are
nonlinear models with no marginal main effects. Software
described by Moore et al. [5] was used to simulate the data.
For both models, we simulated data with a range of control:case
ratios and sample sizes. For the first set of simulations, the total
number of individuals in the dataset was held constant, at two
different  total sample sizes:  600 and 1200. For each sample size,
three control:case ratios were simulated:  1:1, 2:1, and 4:1. To
ensure the results  seen were due  to class imbalance instead of
decreasing numbers of cases,  a second set of simulations was
done, holding  the  number  of cases  constant at 300  and 600.
Again,  for each number of cases,  three control:case ratios were
simulated. For each  set of parameters,  100  replicates were
simulated. Each dataset had  a total of 100 SNPs, two  of which
were  functional in predicting  disease. For  the models  with
imbalanced control:case ratios, re-sampling was performed. In the
case of under-sampling (US), controls  were randomly removed
until a ratio of 1:1 was achieved. In the case of over-sampling
(OS), cases  were  randomly re-sampled until a 1:1 ratio was
achieved.
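
The class sizes and the two re-sampling schemes described above can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, individuals are represented by placeholder tuples, and drawing over-sampled cases with replacement is an assumption, since the text does not say how cases were re-drawn.

```python
import random


def split_counts(total, controls_per_case):
    """Cases and controls for a fixed total at a control:case ratio of k:1."""
    cases = total // (controls_per_case + 1)
    return cases, total - cases


def under_sample(cases, controls, rng):
    """Randomly remove controls until the control:case ratio is 1:1."""
    return list(cases), rng.sample(controls, len(cases))


def over_sample(cases, controls, rng):
    """Randomly re-draw cases (with replacement, an assumption) until 1:1."""
    extra = [rng.choice(cases) for _ in range(len(controls) - len(cases))]
    return list(cases) + extra, list(controls)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_cases, n_controls = split_counts(600, 4)  # 120 cases, 480 controls
cases = [("case", i) for i in range(n_cases)]
controls = [("control", i) for i in range(n_controls)]

us_cases, us_controls = under_sample(cases, controls, rng)
os_cases, os_controls = over_sample(cases, controls, rng)
print(len(us_cases), len(us_controls))  # 120 120
print(len(os_cases), len(os_controls))  # 480 480
```

Under-sampling discards 360 of the 480 controls in this example, while over-sampling keeps every individual but repeats cases, which is one plausible reason the two schemes behave differently in small datasets.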

2.3 Data Analysis
GENN was used to analyze all epistasis models with classification
error, balanced error, or classification error in combination with
data re-sampling. Parameter settings remained identical between
the analyses and included: 4 demes, migration every 25
generations, population size of 200 per deme, 400 generations,
crossover rate of 0.9, and a reproduction rate of 0.1. Power for
all analyses is reported as the number of times GENN
identified the correct loci with no false positives over 100
datasets.
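
The power definition above can be sketched as a short calculation. This is an illustrative reading of the definition rather than the authors' code: the function name, the SNP labels, and the example selections are hypothetical, and a dataset counts as a success only when the selected loci match the functional loci exactly.

```python
def power_percent(selected_per_dataset, functional_loci):
    """Percent of datasets whose final model contains exactly the functional loci."""
    target = frozenset(functional_loci)
    hits = sum(1 for sel in selected_per_dataset if frozenset(sel) == target)
    return 100.0 * hits / len(selected_per_dataset)


# Hypothetical outcome over 100 replicate datasets with functional SNPs 1 and 2.
functional = {"SNP1", "SNP2"}
results = (
    [{"SNP1", "SNP2"}] * 74             # correct loci, no false positives
    + [{"SNP1", "SNP2", "SNP57"}] * 20  # correct loci plus a false positive
    + [{"SNP1"}] * 6                    # one functional locus missed
)
print(power_percent(results, functional))  # 74.0
```

Models that include both functional loci plus an extra SNP are counted as failures here, which matches the "no false positives" condition in the text.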

3.  RESULTS
Tables 1 and 2 show the results for all analyses, with several
apparent trends. Using classification error (CE), increasingly
imbalanced ratios greatly decrease the power of GENN. The
power of GENN greatly improves when OS is used. With US, a
marked decrease in power is seen in smaller datasets with large
class imbalance. This trend is ameliorated somewhat in larger
datasets, as well as in the datasets with fixed numbers of cases.
Most significantly, for all models analyzed, power recovers
completely when using balanced error (BE).

   Table 1. Results for constant sample size simulations for
            different control:case ratios (CCR).

Total             XOR Power (%)           ZZ Power (%)
Samples  CCR    CE    BE    US    OS    CE    BE    US    OS
600      1:1   100   100   100   100   100   100   100   100
600      2:1    74   100    87    98   100   100    96   100
600      4:1     3   100    62    97    63    99    74    99
1200     1:1    99   100   100   100   100   100   100   100
1200     2:1    88   100    98    99   100   100    99   100
1200     4:1     6   100    85    99    59   100    94   100

   Table 2. Results for constant case number simulations.

Case              XOR Power (%)           ZZ Power (%)
Count    CCR    CE    BE    US    OS    CE    BE    US    OS
300      1:1   100   100   100   100   100   100    99   100
300      2:1    80   100    91    96   100   100    99    99
300      4:1     2   100    86    95    49   100    90    96
600      1:1    99   100   100   100   100   100   100   100
600      2:1    83    99    93    99   100   100    98    99
600      4:1     3   100    71    98    36   100    92    97

4.  DISCUSSION
From these results, we conclude that balanced error should be
used as the fitness metric in GENN instead of classification error,
as it outperforms standard classification error and re-sampling
methods. Additionally, since balanced error and classification
error are mathematically equivalent when the data are balanced,
there is no disadvantage to using balanced error in balanced data.
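
The equivalence for balanced data follows in a few lines from the definition of balanced accuracy. The symbols $n$ (the common number of cases and of controls) and $N = 2n$ (the total sample size) are introduced here only for this derivation. With $TP + FN = n$ and $TN + FP = n$:

$$\text{BE} = 1 - \frac{1}{2}\left[\frac{TP}{n} + \frac{TN}{n}\right] = \frac{2n - TP - TN}{2n} = \frac{FN + FP}{N} = \text{CE}$$

When the class sizes differ, the two denominators no longer match and the last step fails, which is why the two metrics diverge only under class imbalance.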

5.  ACKNOWLEDGMENTS
This work was supported by National Institutes of Health grants
HL65962, GM62758, and AG20135. We would also like to thank
Jason  H. Moore  and Digna R. Velez for helpful  discussions on
class imbalance. This paper has been reviewed and approved for
publication according to US EPA policy but does not necessarily
represent the views of the Agency.

6.  REFERENCES
[1]  Motsinger-Reif A.A., Dudek S.M., Hahn L.W., Ritchie M.D.
    Comparison   of  Approaches  for   Machine   Learning
    Optimization of Neural Networks for Detecting Gene-Gene
    Interactions  in Genetic Epidemiology. Genet. Epidemiol.,
    Epub ahead of print. (2008)
[2]  Japkowicz N., Stephen S. The Class Imbalance Problem: A
    Systematic Study. Intelligent Data Analysis Journal, 6
    (2002) 429-450.
[3]  Powers R., Goldszmidt M., Cohen I. Short term performance
    forecasting    in   enterprise   systems.   Hewlett-Packard
    Development   Company  Technical  Reports,  Computer
    Science  Department,  Stanford University,  Stanford, CA.
    (2005)
[4]  Velez D., White B.W., Motsinger A.A., Bush W.S.,  Ritchie
    M.D., Moore J.H. A Balanced Accuracy Metric for Epistasis
    Modeling  in  Imbalanced   Datasets  using  Multifactor
    Dimensionality Reduction. Genet. Epidemiol. 4 (2007) 306-
    15.
[5]  Moore J., Hahn L., Ritchie M., Thornton T., White B.
    Application of genetic algorithms to the discovery of
    complex models for simulation studies in human genetics. In
    Proceedings of the Genetic and Evolutionary Computation
    Conference. (New York, USA, July 9-13, 2002). Morgan
    Kaufmann, San Francisco, CA, 2002, 1150-1155.




-------
                 FY2004 "New Start" Award Bibliography

Project Title: Linkage of Exposure and Effects Using Genomics, Proteomics, and
Metabolomics in Small Fish Models

Peer Reviewed Publications:
Ankley, G.T., K.M. Jensen, E.J. Durhan, E.A. Makynen, B.C. Butterworth, M.D. Kahl,
D.L. Villeneuve, A. Linnum, L.E. Gray, M. Cardon, V.S. Wilson. 2005.  Effects of two
fungicides with multiple modes of action on reproductive endocrine function in the
fathead minnow (Pimephales promelas).  Toxicol. Sci. 86, 300-308.

Ankley, G.T., K.M. Jensen,  M.D. Kahl, E.A. Makynen, L.S. Blake, K.J. Greene, R.D.
Johnson and D.L. Villeneuve. 2007. Ketoconazole in the fathead minnow (Pimephales
promelas): reproductive toxicity and biological compensation. Environ. Toxicol. Chem.
26, 1214-1223.

Ankley, G.T., D.H. Miller, K.M. Jensen, D.L. Villeneuve and D. Martinovic. 2008.
Relationship of plasma sex steroid concentrations in female fathead minnows to
reproductive success and population status. Aquat. Toxicol. 88, 69-74.

Ankley, G.T., D. Bencic, M. Breen, T.W. Collette, R. Conolly, N.D. Denslow,
S. Edwards, D.R. Ekman, K.M. Jensen, J. Lazorchak, D. Martinovic, D.H. Miller, E.J.
Perkins, E.F. Orlando, N. Garcia-Reyero, D.L. Villeneuve, R.-L. Wang, and K.
Watanabe.  2009.  Endocrine disrupting chemicals in fish: Developing exposure
indicators and predictive models of effects based on mechanisms of action. Aquat.
Toxicol. 92, 168-178.

Breen, M.S., D.L. Villeneuve, M. Breen, G.T. Ankley and  R.B. Conolly.  2007.
Mechanistic computational model of ovarian steroidogenesis to predict biochemical
responses to endocrine active compounds. Ann. Biomed. Engin. 35,  970-981.

Ekman, D.R., Q. Teng, K.M. Jensen, D.  Martinovic, D.L. Villeneuve, G.T. Ankley and T.
W. Collette. 2007.  NMR analysis of fathead minnow urinary metabolites: a potential
approach for studying impacts of chemical exposures. Aquat. Toxicol.  85, 104-112.

Ekman, D.R., Q. Teng, D.L. Villeneuve, M.D. Kahl, K.M. Jensen, E.J. Durhan, G.T.
Ankley and T.W. Collette.  2008.  Investigating compensation and recovery of fathead
minnow (Pimephales promelas) exposed to 17α-ethynylestradiol with metabolite
profiling. Environ. Sci. Technol. 42, 4188-4195.

Ekman, D.R., Q. Teng, D.L. Villeneuve, M.D. Kahl, K.M. Jensen, E.J. Durhan, G.T.
Ankley and T.W. Collette.  2009. Profiling lipid metabolites yields unique information on
gender- and time-dependent responses of fathead minnows (Pimephales promelas)
exposed to 17α-ethynylestradiol. Metabolomics 5, 22-32.

-------
Garcia-Reyero, N., D.L. Villeneuve, K.J. Kroll, L. Liu, E.F. Orlando, K.H. Watanabe,
M.S. Sepulveda, G.T. Ankley and N.D.  Denslow.  2009. Expression signatures for a
model androgen and antiandrogen in the fathead minnow ovary.  Environ. Sci. Technol.
43, 2614-2619.

Garcia-Reyero, N., K.J. Kroll, L. Liu, E.F. Orlando, K.H. Watanabe, M.S. Sepulveda,
D.L. Villeneuve, E.J. Perkins, G.T. Ankley and N.D. Denslow. 2009. Gene expression
responses in male fathead minnows exposed to binary mixtures of an estrogen and
antiestrogen. BMC Genomics, In Press.

Johns, S.M., M.D. Kane, N.D. Denslow, K.H. Watanabe, E.F. Orlando, D.L. Villeneuve,
G.T. Ankley and M.S. Sepulveda. 2009. Characterization of ontogenetic changes in
gene expression in the fathead minnow (Pimephales promelas). Environ. Toxicol.
Chem. 28, 873-880.

Martyniuk, C.J., S. Alvarez, S. McClung, D.L. Villeneuve, G.T. Ankley and N.D.
Denslow. 2009.  Quantitative proteomic profiles of androgen receptor signaling in the
liver of fathead minnows (Pimephales promelas). J. Proteome Res. In Press.

Martinovic, D., L.S. Blake, E.J. Durhan, K.J. Greene, M.D.  Kahl, K.M. Jensen,  E.A.
Makynen, D.L. Villeneuve and G.T. Ankley.  2008. Characterization of reproductive
toxicity of vinclozolin in the fathead minnow and co-treatment with an androgen to
confirm an anti-androgenic mode of action.  Environ. Toxicol. Chem. 27, 478-488.

Miller, D.H., K.M. Jensen, D.L. Villeneuve, M.D. Kahl, E.A. Makynen, E.J. Durhan and
G.T. Ankley. 2007.  Linkage  of biochemical responses to population-level effects: a
case study with vitellogenin in the fathead minnow (Pimephales promelas). Environ.
Toxicol. Chem. 26, 521-527.

Perkins, E.J.,  N. Garcia-Reyero,  D.L. Villeneuve,  D. Martinovic, S.M. Brasfield, L.S.
Blake, J.D. Brodin, N.D. Denslow and G.T. Ankley.  2008.  Perturbation of gene
expression and steroidogenesis with  in vitro exposure of fathead  minnow ovaries to
ketoconazole. Mar. Environ. Res. 66, 113-115.

Villeneuve, D.L., P. Larkin, I.  Knoebl, A.L. Miracle, M.D. Kahl, K.M. Jensen, E.A.
Makynen, E.J. Durhan, B.J. Carter, N.D. Denslow and G.T. Ankley.  2007. A graphical
systems model to facilitate hypothesis-driven ecotoxicogenomics research on the brain-
pituitary-gonadal axis.  Environ. Sci. Technol. 40, 321-330.

Villeneuve, D., L. Blake, J. Brodin, K. Greene, I. Knoebl, A. Miracle, D. Martinovic and
G.T. Ankley. 2007.  Transcription of key genes regulating gonadal steroidogenesis in
control and ketoconazole- or  vinclozolin-exposed fathead minnows.  Toxicol. Sci. 98,
395-407.

Villeneuve, D.L., L.S. Blake, J.D. Brodin, J.E. Cavallin, E.J. Durhan, K.M. Jensen, M.D.
Kahl, E.A. Makynen, D. Martinovic, N.D. Mueller and G.T. Ankley. 2008. Effects of a

-------
3β-hydroxysteroid dehydrogenase inhibitor, trilostane, on the fathead minnow
reproductive axis. Toxicol. Sci. 104, 113-123.

Villeneuve, D.L.,  N.D. Mueller, D. Martinovic, E.A. Makynen, M.D. Kahl, K.M. Jensen,
E.J. Durhan, J.E. Cavallin, D. Bencic and G.T. Ankley. 2009.  Direct effects,
compensation and recovery in female fathead  minnows exposed to a model aromatase
inhibitor. Environ. Health Perspect. 117, 624-631.

Villeneuve, D.L.,  R.-L. Wang, D.C. Bencic, A.D. Biales, D. Martinovic, J.M.
Lazorchak, G. Toth and G.T. Ankley.  2009.  Altered gene expression in the brain and
ovaries of zebrafish exposed to the aromatase inhibitor fadrozole: microarray analysis
and hypothesis generation. Environ. Toxicol. Chem.  In Press.

Wang, R.-L., A. Biales, D. Bencic, D. Lattier, M. Kostich,  D. Villeneuve, G.T. Ankley, J.
Lazorchak and G. Toth. 2008a. DNA microarray application in ecotoxicology:
experimental design, microarray scanning, and factors impacting transcriptional profiles
in a small fish species. Environ. Toxicol. Chem. 27, 652-663.

Wang, R.-L., D. Bencic, A. Biales, D. Lattier, M. Kostich,  D. Villeneuve, G.T. Ankley, J.
Lazorchak and G. Toth. 2008b. DNA microarray-based ecotoxicological discovery in a
small fish species. Environ. Toxicol. Chem. 27, 664-675.

Watanabe, K.H.,  K.M. Jensen, E.F. Orlando and G.T. Ankley. 2007. What is
 normal? A characterization of the values and variability in reproductive endpoints of the
fathead minnow,  Pimephales promelas. Comp. Biochem. Physiol. 146, 348-356.

Watanabe, K.H.,  Z. Li,  K. Kroll, D.L. Villeneuve, N.J. Szabo, E.F. Orlando, M.S.
Sepulveda, T.W.  Collette, D.R. Ekman, G.T. Ankley and  N.D. Denslow. 2009. A
physiologically-based model of endocrine-mediated responses of male fathead
minnows to 17α-ethinylestradiol. Toxicol. Sci. In Press.

Project Title: Simulating Metabolism of Xenobiotic Chemicals as a Predictor of
Toxicity

Peer Reviewed Publications:
Mazur, C.  S.; Kenneke, J. F. 2008. Cross-species comparison of conazole fungicide
metabolites using rat and rainbow trout (Oncorhynchus mykiss) hepatic microsomes
and purified human CYP 3A4.  Environmental Science and Technology, 42:947-954.

Mazur, C. S.; Kenneke, J. F.; Tebes-Stevens, C.; Okino, M. S.; Lipscomb, J. C. 2007.  In
vitro metabolism  of the fungicide and environmental contaminant trans-bromuconazole
and implications for risk assessment.  Journal of Toxicology and Environmental Health,
Part A, 70:1241-1250.

Kenneke, J. F. 2006.  Environmental fate and ecological risk assessment for the
reregistration of antimycin A (PC Code 006314), Appendix D: In vitro mammalian

-------
metabolism and Appendix G: Summary of antimycin hydrolysis research, U.S. EPA,
Office of Pesticide Programs, Reregistration Eligibility Decision (RED) on Antimycin A

Project Title: Risk Assessment of the Inflammogenic and Mutagenic Effects of
Diesel Exhaust Particulates: A Systems Biology Approach

Peer Reviewed Publications:
Cao, D., Bromberg, PA and Samet, JM (2007). Diesel-induced Cox-2 expression
involves chromatin  modification via degradation of HDAC1  and recruitment of p300. Am.
J. Respir. Cell. Mol. Biol. 37:232-239.

Cao, D., Tal, T., Graves, L, Gilmour, I., Linak, W., Reed, W., Bromberg, P., and Samet,
J., Diesel Exhaust Particulate (DEP)-Induced Activation of Stat3 Requires Activities of
EGFR and SRC in Airway Epithelial Cells, American Journal of Physiology: Lung, Cell,
& Molecular Physiology, 292, L422-L429 (2007).

Cho, S.-H., Yoo, J.-I., Turley, A.T., Miller, C.A., Linak, W.P., Wendt, J.O.L., Huggins,
F.E., and Gilmour, M.I., Relationships between Composition and Pulmonary Toxicity of
Prototype Particles from Coal Combustion and Pyrolysis, Proceedings of the
Combustion Institute, 32, in press (2008).

Ciencewicki, J., Gowdy, K., Krantz, Q.T., Linak, W.P., Brighton, L., Gilmour, M.I., and
Jaspers, I.,  Diesel Exhaust Enhanced Susceptibility to Influenza  Infection is Associated
with Decreased Surfactant Protein Expression, Inhalation Toxicology, 19, 1121-1133
(2007).

DeMarini, D.M., Brooks, L.R., Warren, S.H., Kobayashi, T., Gilmour, M.I., and Singh P.,
Bioassay-Directed Fractionation and Salmonella Mutagenicity of Automobile and Forklift
Diesel Exhaust Particles, Environmental Health Perspectives,  112, 814-819 (2004).

Gottipolu, R.R., Wallenborn, J.G., Karoly, E.D., Schladweiler, M.C., Ledbetter, A.D.,
Krantz, Q.T., Linak, W.P., Nyska, A., Johnson, J.A., Thomas, R., Richards, J.E., Jaskot,
R.H., and Kodavanti, U.P. (2009). One-month Diesel Exhaust Inhalation Produces
Hypertensive Gene Expression Pattern in Healthy Rats, Environmental Health
Perspectives. 117:38-46.

Gowdy, K.,  Krantz,  Q.T., Daniels,  M., Linak, W.P., Jaspers,  I.,  and Gilmour, M.I.,
Modulation  of Pulmonary Inflammatory Responses and Anti-microbial Defenses in Mice
Exposed to Diesel Exhaust, Toxicology & Applied Pharmacology, 229, 310-319 (2008).

Linak, W.P., Yoo, J.I., Wasson, S.J., Zhu, W., Wendt, J.O.L, Huggins, F.E., Chen, Y.,
Shah, N., Huffman, G.P., and Gilmour, M.I., Ultrafine Ash Aerosols from Coal
Combustion: Characterization and Health Effects, Proceedings of the Combustion
Institute, 31, 1929-1937 (2007).

-------
Reed, W., Gilmour, I., DeMarini, D., Linak, W. and Samet, J. (2008). Gene Expression
Profiles of Human Airway Epithelial Cells Exposed to Diesel Exhaust Particles of
Varying Composition. In Preparation.

Saxena, RK, Williams, W & Gilmour, MI. (2007) Suppression of basal and cytokine
induced expression of MHC, ICAM 1 and B7 markers on mouse lung epithelial cells
exposed to diesel exhaust particles. Am J Biochem Biotech. 3(4). 187-192.

Saxena, RK., Gilmour, MI., & MD Hayes. Uptake of diesel exhaust particles by lung
epithelial  cells and alveolar macrophages. Biotechnology,  2007, 3(4). 187-192

Singh,  P., DeMarini, D.M., Dick, C.A.J., Tabor, D., Ryan, J., Linak, W.P., Kobayashi, T.,
and Gilmour, M.I., Bioassay-Directed Fractionation, Physiochemical Characterization,
and Pulmonary Toxicity of Automobile and Forklift Diesel Exhaust Particles in Mice,
Environmental Health Perspectives, 112(8) 820-825 (2004).

Stevens,  T.,  Krantz, Q.T., Linak, W.P., Hester, S., and Gilmour, M.I., Increased
Transcription of Immune and Metabolic Pathways in Naive and Allergic Mice Exposed
to Diesel  Exhaust, Toxicological Sciences, 102(2), 359-370 (2008).

Stevens, T., Linak, WP., Gilmour, MI. Differential potentiation of allergic lung disease in
mice exposed to chemically distinct diesel samples.  Tox Sci. 107(2), 522-534.

Stevens, T, Hester, S, & Gilmour MI. Differential transcriptional changes in mice
exposed to chemically distinct diesel samples. Submitted.

Tal, T., Bromberg, P.A., Kim, Y. and Samet, J.M. (2008). Tyrosine phosphatase
inhibition induces epidermal growth factor receptor activation in human airway epithelial
cells exposed to diesel exhaust. Toxicol. Appl. Pharmacol. 233:382-388.

Project Title: Development of Microbial Metagenomic Markers for Environmental
Monitoring and Risk Assessment

Peer Reviewed Publications:
Lamendella R, Santo Domingo JW, Yannarell AC, Ghosh S, Di Giovanni G, Mackie RI,
Oerther DB.  Evaluation of swine-specific PCR assays used for fecal source tracking and
analysis of molecular diversity of Bacteroidales-swine specific populations. Appl Environ
Microbiol. 2009 Jul 24. [Epub ahead of print]

Lu J, Santo Domingo JW, Hill S, Edge TA. Microbial Diversity and Host-specific
Sequences of Canadian Goose Feces. Appl Environ Microbiol. 2009 Jul 24.

Santo Domingo, J.W. and T.A.  Edge. 2009. Identification of primary sources of faecal
pollution.  In Safe Management of Shellfish and Harvest Waters. G.  Rees., K. Pond, D.
Kay and J. Santo Domingo. IWA Publishing,  London, UK.

-------
Lee, Y.-J., M. Molina, and J.W. Santo Domingo, J.D. Willis, M. Cyterski, D.M. Endale,
and O.C. Shanks.  2008. A temporal assessment of cattle fecal pollution in two
watersheds using 16S rRNA gene-based and metagenome-based assays. Appl.
Environ. Microbiol. 74:6839-6847.

Lu, J. and J.W. Santo Domingo. 2008. Turkey fecal microbial community structure and
functional gene diversity revealed by 16S rRNA gene and metagenomic sequences. J.
Microbiol. 46:469-477.

Lu, J., J.W. Santo Domingo, R. Lamendella, T. Edge, and S. Hill. 2008. Phylogenetic
diversity and molecular detection of gull feces. Appl. Environ. Microbiol. 74: 3969-3976.

Lamendella, R.,  Santo Domingo J.W., Kelty C, and Oerther DB. 2008. Occurrence of
bifidobacteria in  feces and environmental waters. Appl. Environ. Microbiol. 74:575-584.

Santo Domingo, J.W., D.G. Bambic, T.A. Edge, and S. Wuertz. 2007. Quo vadis source
tracking? Towards a strategic framework for environmental monitoring of fecal pollution.
Water Res. 41:3539-3552.

Lu, J., J.W. Santo Domingo,  and O.C. Shanks. 2007.  Identification of chicken-specific
fecal microbial sequences using a metagenomic approach. Water Res. 41:3561-3574.

Shanks, O., J.W. Santo Domingo, J. Lu, C.A. Kelty, and J. Graham. 2007. PCR Assays
for the identification of human fecal pollution in water. Appl. Environ.  Microbiol. 73:
2416-2422.

Vogel, J.R., D.M. Stoeckel, R. Lamendella, R.B. Zelt, J.W.  Santo  Domingo, S.R.
Walker, and D.B. Oerther. 2007. Identifying fecal sources  in a selected catchment
reach using multiple source-tracking Tools. J.  Environ.  Qual. 36:718-729.

Lamendella, R., J. W. Santo Domingo, D. Oerther, J. Vogel, and D. Stoeckel. 2007.
Assessment of fecal pollution sources in a small northern-plains watershed using PCR
and phylogenetic analyses of Bacteroidetes 16S rDNA. FEMS Microbiol. Ecol. 59:651-
660.

Santo Domingo, J.W., Lu, J., Shanks, O., Lamendella, R., Kelty, C. A., and Oerther,
D.B. "Development of host-specific markers for source tracking using a novel
metagenomic approach," Water Environment Federation, Proceedings of Disinfection
2007, Pittsburgh, PA, February 4-7, 2007.

Shanks, O., J. W. Santo Domingo, R. Lamendella, C.A. Kelty, and J. Graham. 2006.
Competitive metagenomic DNA hybridization identifies host-specific genetic markers in
cattle fecal samples. Appl. Environ. Microbiol. 72:4054-4060.

-------
Shanks, O., J. W. Santo Domingo, and J. Graham. 2006. Use of competitive DNA
hybridization to identify differences in the genomes of two closely related fecal indicator
bacteria. J. Microbiol. Methods. 66:321-330.

Project Title: A Systems Approach to Characterizing and Predicting Thyroid
Toxicity Using an Amphibian Model

Peer Reviewed Publications:
Sternberg, Thoemke, Hornung, Tietge, and Degitz. Regulation of thyroid-stimulating
hormone release from pituitary by t4 during metamorphosis in Xenopus laevis  (In
review)

Serrano, Higgins Witthuhn,  Korte, Hornung, Tietge, and Degitz In vivo assessment and
potential diagnosis of xenobiotics that perturb the thyroid pathway: Part I. Differential
protein profiling of Xenopus laevis brain tissues by two-dimensional polyacrylamide gel
electrophoresis and peptide-labeling with isobaric tags for relative and absolute
quantification (iTRAQ) following exposure to model T4 inhibitors.  (In review)

Conners, K., Korte, J.J., Anderson G., Degitz SJ. Characterization of thyroid hormone
transporting  protein expression during tissue-specific metamorphosis in Xenopus
tropicalis (In review)

Hornung, M.W., Degitz, S.J., Korte, L.M., Olson, J., Kosian, P.A., Linnum, A.L., Tietge,
J.E. Inhibition of thyroid hormone release from cultured amphibian thyroid glands by
methimazole, 6-propylthiouracil, and perchlorate. (Completed, NHEERL In-House
review. To be submitted with the following  Hornung et al. paper).

Degitz, S.J.,  Hornung, M.W., Korte, J.J, Holcombe, G.W, Kosian, P.A., Thoemke, K.R.,
Helbing, C., Tietge, J.E.  In vivo and in vitro regulation of genes in the thyroid gland
following exposure to the model T4 synthesis inhibitors methimazole, 6-propylthiouracil,
and perchlorate. (In preparation. To be submitted with above paper).

Hornung, M., Burgess, E.  Tandem in vitro and ex vivo thyroid gland assays to screen
xenobiotic chemicals for thyroid hormone synthesis inhibition. (In preparation).

Nichols systems model paper (In preparation).

Tietge Butterworth Kosian Hammermeister Hornung Haselman Degitz Analysis of
Thyroid Hormone and Related Iodo-Compounds in Complex Samples by Inductively
Coupled Plasma Emission/Mass Spectrometry. (In preparation)

Tietge, Butterworth, Haselman, Holcombe, Korte, Kosian, Wolfe,  Degitz. Early Temporal
Effects of Three Thyroid Hormone Synthesis Inhibitors in Xenopus laevis. (In
preparation)

-------
Project Title: Mechanistic Indicators Of Childhood Asthma (Mica): A Systems
Biology Approach For The Integration Of Multifactorial Environmental Health Data

Peer Reviewed Publications:
Kim SJ, Dix DJ, Thompson KE, Murrell RN, Schmid JE, Gallagher JE, Rockett JC.
Effects of storage,  RNA extraction, genechip type, and donor sex on gene expression
profiling of human whole blood. Clin Chem. Jun;53(6):1038-45. (2007)

Vesper, S., McKinstry, C., Haugland, R., Neas, L., Hudgens, E., Heidenfelder, B., and
Gallagher, J. Environmental Relative Moldiness Index (ERMIsm) as a Tool to Identify
Mold Related Risk Factors for Childhood Asthma. Sci Total Environ. May 1;394(1):192-6
(2008)

Johnson M, Hudgens E, Williams R, Andrews G,  Neas L, Gallagher J, Ozkaynak H. "A
Participant-Based Approach to Indoor/Outdoor Air Monitoring in Community Health
Studies" Journal of Exposure Science and Environmental Epidemiology. (2008), 1-10
(2008).

Cohen Hubal E, Richards A, Shah I, Edwards S, Gallagher J, Kavlock R, Blancato J.
Exposure Science and the US EPA National Center for Computational Toxicology. J
Expo Sci Environ Epidemiol. November (2008).

Heidenfelder B, Reif D, Harkema JR, Cohen Hubal E, Hudgens E, Bramble L,
Wagner G, Harkema JR, Morishita M, Keeler G, Edwards SW and Gallagher J.
Comparative Microarray Analysis and Pulmonary Changes in Brown Norway Rats
Exposed to Ovalbumin and Concentrated Air Particulates. Tox Sci. volume 108, 2009
March 2 (2009)

Heidenfelder B, Johnson M,  Hudgens E, Inmon J, Hamilton R,   Neas L, and Gallagher
J, Increased plasma reactive oxidant levels and their relationship to blood cells, total
IgE, and allergen-specific IgE in asthmatic children Journal of Asthma accepted (2009)

Williams AH, Gallagher JE, Hudgens E, Johnson MM, Mukerjee S, Ozkaynak H, Neas
LMN.  EPA Observational studies of children's respiratory health  in Detroit and
Dearborn, Michigan.  Proceedings of AWMA 102nd, June 16-19; Detroit, Michigan. (2009)

J.E. Gallagher, E.A. Cohen Hubal, S.W. Edwards. Invited book chapter "Biomarkers of
Environmental Exposure." In "Biomarkers of Toxicity: A New Era in Medicine." Editors: Vishal
S. Vaidya and Joseph V. Bonventre. Publisher: John Wiley and Sons, Inc. October 1,
(2009)

Markey M. Johnson, Ron Williams, Zhihua Fan, Lin, Edward Hudgens, Jane Gallagher,
Alan Vette, Lucas Neas, Haluk Ozkaynak Indoor and outdoor concentrations of nitrogen
dioxide, volatile organic compounds, and polycyclic aromatic hydrocarbons among
MICA-Air households in Detroit, Michigan submitted AWMA (2009)

-------
Gallagher, J; Reif, D; Heidenfelder, B; Neas, L; Hudgens, E; Williams, A; Inmon, J;
Rhoney, S; Andrews, G; Johnson, M; Ozkaynak, H; Edwards, S; Cohen-Hubal, E.
Mechanistic Indicators of Childhood Asthma (MICA): A systems biology approach for
the integration of multifactorial environmental health data. Submitted: Journal of
Exposure Science and Environmental Epidemiology (2009)

In preparation

David M. Reif, Jane E. Gallagher, Brooke L. Heidenfelder, Ed E. Hudgens, Wendell
Jones, ClarLynda Williams-DeVane, Lucas M. Neas, Elaine A. Cohen Hubal, Stephen
W. Edwards Elucidating Asthma Phenotypes via Integrated Analysis of Blood Gene
Expression Data with Demographic and Clinical Information (Nature Genetics) 2009

David M. Reif*, ClarLynda Williams-DeVane*, Elaine A. Cohen Hubal, Wendell Jones,
Ed E. Hudgens, Brooke L. Heidenfelder,  Lucas M. Neas, Jane E. Gallagher, Stephen
W. Edwards
*Authors contributed equally. Systems Modeling of Gene Expression, Demographic and
Clinical Data to Determine Disease Endotypes PLOS Comp Bio 2009

-------
                July 20, 2005

                Mr. E. Timothy Oppelt
                Acting Assistant Administrator
                Office of Research and Development
                U.S. Environmental Protection Agency
                Washington, DC 20460
                                     Dr. Robert Kavlock
                                     Director
                                     National Center for Computational Toxicology
                                     U.S. Environmental Protection Agency
                                     Research Triangle Park, NC 27711
Re: National Center for Computational Toxicology Review

Dear Mr. Oppelt and Dr. Kavlock:

This is a letter report from the Board of Scientific Counselors (BOSC)
reviewing the progress of the new National Center for Computational
Toxicology (NCCT).  Dr. Kavlock and his staff at the NCCT presented an
overview of the Center's structure, activities, goals,  and progress on April 25-
26, 2005, to a Subcommittee of the BOSC. The Subcommittee consists of
Drs. George Daston (Chair), James Clark, Richard Di Giulio, Michael Clegg,
and Ken Ramos. Dr. Clegg was unable to attend the briefing and Dr. Ramos
recused himself because of a potential conflict of interest.

Because the NCCT is so new, becoming operational in February 2005,  this
report is a prospective one, and is intended to be the first of several
consultative reviews of the Center's progress.  In particular, we concentrate on
NCCT's strategic goals; its collaborations, and connectedness to the rest of the
Agency and to  outside scientists; its staffing plan; and its thematic choices.
We addressed a number of charge questions intended to focus on each of these
areas. Those charge questions and the Subcommittee's responses are listed
below, following some general comments about the Center.

The Subcommittee was extremely impressed with the progress NCCT has
made in the few short months of its existence. NCCT's mission is to serve as
a focal point for the U.S. Environmental Protection Agency (EPA) in the
application of mathematical and computational tools to all facets of the risk
assessment process. To be successful at this, the NCCT must:  (1) provide a
critical mass of expertise in computational, mathematical, and statistical
modeling; (2) develop research collaborations and partnerships with a large
number of groups within and outside the Agency; and (3) have  a clear
understanding of and regular interactions with its customers in the rest of the
Office of Research and Development (ORD), the program offices, and the
regions. The Center already has made considerable progress on all three
fronts.

Because its staffing is limited, NCCT has made the appropriate choice of
concentrating on gathering staff with biological, chemical, and  statistical
modeling expertise rather than on a particular biological or chemistry

-------
                                           Computational Toxicology Subcommittee Letter Report
                                                                            Page 2 of 6

specialty.  This is an appropriate choice, as the staff is strongly aligned to the mission of
the Center. The composition of the staff is impressive; it includes some of EPA's most
accomplished biological modelers, chemists, and statisticians. Most of these individuals
have strong track records of collaboration with multiple laboratories and already are
sought after as research partners.  This choice of personnel automatically leverages
NCCT's potential well beyond what would normally be expected of a group of 19.
NCCT has created a virtual organization that brings these people together in a way that
allows them to synergize and form ad hoc groups to make progress on multiple fronts
simultaneously.

The Center already has collaborations and programmatic augmentations via internal and
Science To Achieve Results (STAR) grants.  These partnerships cover a large number of
areas of modern biology and chemistry that require high-powered computational and/or
modeling expertise, such as genomics, proteomics, and metabonomics, with coverage of
mammalian toxicology, ecotoxicology, microbiology, exposure assessment, and
quantitative risk assessment.  NCCT has a steering committee—the Computational
Toxicology Implementation and Steering Committee (CTISC)—that represents ORD
laboratories/centers, program offices,  and regions.  The role of the CTISC is still evolving
and it will be an important avenue for communication and for identifying possible
partnerships.

NCCT's strategic plan includes deliverables with short- and longer-term time horizons.
Emphasis  on Information Technologies, Prioritization Tools, Biological Models, and
Cumulative Risk will take advantage of the Center's strengths and provide much of the
Agency with technology that will improve its ability to fulfill its mission. It is clear that
the Information Technologies (especially DSSTox) and Prioritization Tools (especially
ToxCast) have the potential to address significant issues in toxicology data management
and prioritization for testing.

Charge Questions and Responses

Questions for the Center as a whole:

1.  Success of the NCCT will depend upon establishing effective collaborations with the
   other ORD laboratories and centers.  What advice can you provide to ensure that
   operations remain integrated with the other laboratories and centers within ORD?

   The Subcommittee members believe that NCCT has been set up in an optimal way to
   maximize interactions, by concentrating expertise in modeling within the Center,
   rather than the toxicologists, risk assessment specialists, etc., who populate the other
   laboratories and centers.  This provides a natural focus area on which the other
   laboratories and centers will seek collaboration. Furthermore, most of the staff  at the
   Center are highly experienced and have a long history of successful collaborations,
   including a number of active collaborations that they bring with them.  The staff is  a
   natural magnet for collaborations.


-------

   One challenge will be to transition the Center from a collection of experts in various
   fields to a center of excellence in applying the broad tools of computational
   toxicology to address the human health and environmental health issues under the
   purview of EPA.  Experts will need to develop procedures to capture the essence of
   thought processes and computational tools that can be applied to the diversity of
   challenges the Agency addresses.  Many of the Center staff will be required to shift
   their focus from finding computational approaches to address a set of specific issues
   to developing robust tools and procedures that provide computational frameworks
   that support ORD and Agency programs.

   Not all the modeling expertise within EPA resides within NCCT, let alone the
   disciplines that rely on computational toxicology. The Center should consider
   forming an informal  "community of practice" within EPA that can serve a networking
   function for interested scientists. This community of practice would not be an
   administrative unit, but a virtual professional society within the Agency. Most of its
   business can be conducted via electronic media, with occasional meetings. The
   Subcommittee endorses the Center's concept of trying to develop various personnel
   alignments and management tools (e.g., appointing agency/federal/academic
   scientists as adjunct or associate faculty of the Center) to help recruit or gain input
   from a broader number of scientists.  Those individuals with technical expertise
   aligned with the Center's activities can be encouraged to  contribute to NCCT
   activities while being housed in other organizations within ORD, EPA, or outside of
   the Agency; they will form the nucleus of the community of practice.

   The CTISC should be explicitly tasked with identifying possible partnerships and
   collaborations (and of prioritizing them, if necessary). ORD should continue to hold
   regular meetings of its Laboratory and Center Directors, at which partnerships among
   centers, including NCCT, can be explored.

   The internal grant program that supports many of the NCCT collaborations is
   important and likely  to be highly successful. Future grant programs should provide a
   preference for projects that collaborate with the Center.

   Finally, NCCT should develop a communications plan to share its accomplishments
   and capabilities with the rest of EPA and those external to the Agency.

2.  In terms of anticipated staffing, are there particular areas that should receive greater
   or lesser attention?

   NCCT may wish  to consider adding one or two staff who have expertise in
   bioinformatics. The  planned grant for an external bioinformatics center will cover
   most of the Center's  needs in this area, but having some internal expertise would
   complement the external bioinformatics efforts and provide a natural point of contact
   between the external group and NCCT. The Center also should consider whether
   there are social science applications to computational toxicology, and if so, whether
   there is a social science expertise that should be represented on the staff.


-------
   NCCT also may wish to consider hiring one or two leading scientists in the field of
   ecological modeling.  Of the many competencies that could be targeted, fields such as
   modeling large-scale ecological processes, population and community dynamics,
   tissue dynamics in ecological receptors (PB/PK, bioaccumulation processes, and
   lethal/adverse effects of body burdens), and environmental fate and effects of
   chemicals (including microbial biodegradation and bioavailability) would be
   particularly useful.  Although it is unlikely that one individual will have expertise
   spanning all these areas, having an individual with modeling expertise will serve as a
   focal point for collaborations with EPA scientists outside the Center who have
   complementary expertise.  During the review, the Center staff demonstrated the
   importance of obtaining insight from social scientists in developing technically sound
   and meaningful studies. NCCT should consider including this area of expertise
   among the core competencies of the Center.

3.  As we find ourselves in the post-genome era, science is progressing at a rapid pace.
   This makes it difficult to stay abreast of the current state of the science.  Clearly,
   being cognizant of and understanding the technologies and advanced methods in the
   areas of the omics, modeling, and statistics is of considerable interest to NCCT
   for several reasons, such as being able to make decisions about which technologies
   are best for the Center to pursue and most beneficial to the Agency.  Can the BOSC
   provide any suggestions on how best to keep pace with new technologies and
   methodologies?

   This is a problem that we all face, but is perhaps more severe for an integrating group
   such as NCCT. Partnerships with other organizations with similar/complementary
   interests may be the best way to facilitate keeping current. Active collaborations,
   which already are the stock-in-trade for the Center, publication, and participation in
   professional meetings will keep the Center staff fresh and well informed.  These
   efforts also will serve to attract the brightest students and post-doctoral fellows, who
   will bring with them the latest technologies.

Questions concerning the areas of emphasis, or "Concept Topics":

4.  Has the Center articulated a clear rationale for each topic area,  and has it provided
   evidence that the contemplated approaches will be able  to address the major goals
   stated in A Framework for a Computational Toxicology Research Program?

   The Subcommittee members believe that NCCT is on track. It will be important for
   the Center to prepare a synthesized set of goals/milestones for the numerous projects
   in which the Center is involved, explaining how each fulfills a need, and how each
   topic area will provide tools for the Agency.  The prioritization process that the
   Center leadership has developed is a good one, which works well in selecting
   program areas that are consistent with the Center's mission.


-------

5.   To be successful in addressing the Concept Topics, can you help identify potentially
   fruitful partnerships with others outside the Agency?

    The review provided plenty of evidence that the Center is reaching out to find
    potential collaborators among a diverse set of U.S. government and private
    institutions. Many of the collaborations discussed should be formalized in
    Memoranda of Understanding (MOUs), Interagency Agreements (IAGs), and other
    formal commitments to demonstrate the degree of cooperation, leverage, and interest
    generated with other partners.  Also, NCCT will need to have opportunities to work
    with scientists and regulatory authorities from countries around the world, as
    computational toxicology is an area of evolving science with expertise in Europe,
    Canada, Asia, perhaps Russia,  as well as the United States.

    One approach to broaden international contacts would be to consider development of
    ties with U.S.-based academic centers and institutions that have liaisons with
    international scientists and organizations. Also, Center management may want to
    specifically reserve some travel allocations to allow attendance at conferences,
    workshops, or technical exchanges and site visits  at leading international sites and
    organizations around the world. A world-class center will need worldwide
    perspectives in computational toxicology.

    NCCT already is doing a good job of establishing liaisons  with other organizations
    involved in aspects  of computational toxicology,  such as the National Center for
    Toxicogenomics at the National Institute of Environmental Health Sciences (NIEHS).
    Efforts should be continued to partner with private industry in areas  of mutual
    interest.

6.   Given the mission, staffing, and resources of the Center, what is your view of the
    depth and breadth of the areas currently selected for emphasis? Are there additional
    areas that should be considered?

    The Subcommittee members believe that the Center is doing a good job of
    maintaining broad coverage through its collaborations with multiple laboratories.
    Depth will come from the other laboratories and programs with which NCCT
    collaborates.

    The Center's goal to take advantage of opportunities to broaden and generalize the
    technical approaches to the diverse scope of Agency issues is an admirable goal, and
    one that will require a disciplined approach among the technical and managerial team
    to implement. The Subcommittee realizes that the endocrine disrupter studies offer
    many concrete examples of the kind of molecular and cellular work the NCCT can
    provide in the future.  It will be important that the Center quickly provides similar
    services and value to EPA programs that can benefit from  these tools applied to non-
    endocrine disruption issues.  Plans to broaden program office representation in the
    CTISC (to include the Offices of Solid Waste and Emergency Response and
    Homeland Security, and possibly others) should quickly bring these  opportunities to


-------

   the forefront. Discussions should proceed with Agency programs and offices dealing
   with waste management and issues surrounding remediation of contaminated sites;
   applications of environmental models to total maximum daily loads (TMDLs);
   environmental health monitoring programs such as the Environmental Monitoring
   and Assessment Program  (EMAP), various regional Bay programs (Chesapeake Bay,
   Great Lakes Program, Florida Everglades), as well as the air and water monitoring
   programs conducted by the states with federal assistance. Understanding the
   chemical and biological stressors encountered in these environmental health studies
   will broaden the types of contaminants and thus computational tools that must be
   considered by NCCT. It also will challenge applications of the Center's tools to
   issues with a broad temporal and spatial scale and provide opportunities to assess
   some dynamic aspects of human and animal populations.

In conclusion, the BOSC Subcommittee believes that NCCT has made great progress and
is on the right track to deliver against its mission. We are pleased to provide advice on
this important Center and look forward to our continuing oversight of NCCT.
Sincerely yours,
James H. Johnson, Jr.
Chair, Board of Scientific Counselors

-------
               UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
                              WASHINGTON, D.C. 20460
                                September 8, 2005
                                                                           OFFICE OF
                                                                    RESEARCH AND DEVELOPMENT
Dr. James H. Johnson, Jr.
Chair, Board of Scientific Counselors
Dean, College of Engineering, Architecture, and Computer Sciences
Howard University
2366 6th Street NW
Washington, DC 20059

Dear Dr. Johnson:

       The Office of Research and Development (ORD) would like to take this
opportunity to thank you and the members of the Board of Scientific Counselors (BOSC)
for the April 2005 progress review of the new National Center for Computational
Toxicology (NCCT). We especially thank the members of the Computational
Toxicology Subcommittee who conducted the review, Drs. George Daston (Chair),
James Clark, and Richard Di Giulio.

       We are pleased that the BOSC was very supportive of the NCCT and the direction
we are taking in this very important research program. Enclosed with this letter is our
response to the comments in your Letter Report of July 20, 2005.  Please feel free to
contact me if further information is needed.

       Again, thank you for your advice to ORD.

                                        Sincerely yours,
                                        William H. Farland, Ph.D.
                                        Acting Deputy Assistant Administrator
                                           for Science
Enclosure
cc:     Dr. George Daston (Computational Toxicology Subcommittee, Chair)
       Dr. James Clark
       Dr. Richard Di Giulio


-------
  ORD Response to Board of Scientific Counselors (BOSC) Review of the National
            Center for Computational Toxicology (NCCT) in April 2005

       The following is a narrative response to the comments and recommendations of
the BOSC review of ORD's National Center for Computational Toxicology. The review
was held April 25 - 26, 2005, in Research Triangle Park, NC. The committee considers
this to be part of a series of consultative reviews. For this review, the committee
concentrated on the NCCT's strategic goals; its collaborations, and connectedness to the
rest of the Agency and to outside scientists; its staffing plan; and its thematic choices.
They addressed a number of charge questions intended to focus on each of these areas.
Generally, the committee was very favorable to the formation of the NCCT and the
progress the Center has made since its inception a few months ago.  The committee
recognized the unique and important role for the Center because of its small size and
ability to establish strong collaborations with other groups within and outside of the
ORD. The committee emphasized the importance of collaborations and positively
commented on the number of collaborations that are already taking place. The committee
also commented favorably on the Center's four focus areas of Information Technologies,
Prioritization Tools, Biological Models, and Cumulative Risk. The committee
highlighted that the first two have the potential to address "significant issues in
toxicology..." The committee felt the NCCT has made appropriate choices in bringing
together expertise from several related disciplines to fulfill the Center's mission.

       Following are specific comments related to the charge questions made by the
committee. The committee's comments are written in italics and ORD's response
follows in regular type. Attached to this document is a summary table which provides a
summary of BOSC comments and proposed ORD actions.

I.  The first charge question asked for advice on collaboration with  other ORD
laboratories and centers and asked for advice to ensure that operations remain integrated
with those laboratories and centers.  The committee responded:

       One challenge will be to transition the  Center from a collection of experts in
       various fields to a center of excellence  in applying the broad tools of
       computational toxicology to address the human health and environmental health
       issues under the purview of EPA. Experts will need to develop procedures to
       capture the  essence of thought processes and computational tools that can be
       applied to the diversity of challenges the Agency addresses. Many of the Center
       staff will be required to shift their focus from finding computational approaches
       to address a set of specific issues to developing robust tools and procedures that
      provide computational frameworks that support ORD and Agency programs.

                    ORD strongly agrees with this comment.  The NCCT is writing an
             implementation plan that will outline the research being conducted over
             the next several years. In this plan there is a strong commitment to
             conduct work that addresses specific Agency needs. The plan will
             recognize the need for providing generic tools that will facilitate the


-------
       incorporation of computational methods into the hazard and risk
       assessment processes. An example of such a current activity is assisting in
       the compilation and web hosting of a database of parameters that can be
       used for physiologically based pharmacokinetic modeling across the life
       stages. The NCCT is also committed to the long-term goal of conducting
       two annual training workshops on topics that will help promote the use of
       computational approaches in ORD and the EPA. The plan will also
       discuss the interactions between activities within the Center and other
       components of ORD. For example, ongoing work on using the newest
       "omics" technologies and bioinformatics to better identify and
       characterize pathways of toxicity is being conducted in collaboration
       with scientists from the National Health and Environmental Effects
       Research Laboratory (NHEERL). The compounds under study include
       endocrine disrupting compounds and pesticides, all of interest and
       concern to Agency program offices. Similarly, NCCT scientists are
       working with scientists from the National Exposure Research Laboratory
       (NERL) to use computational chemistry methods to better quantify the
       rates of key biochemical processes that are important in the
       pharmacokinetic and pharmacodynamic modeling being conducted by
       scientists with the Center and other parts of ORD.  Often the models
       being developed and applied address specific needs of the Program
       Offices, such as for pyrethroid and N-methylcarbamate pesticides.

              Because of its small size, the Center staff has a good rapport and
       meets regularly to discuss and share their work and progress.  In these
       meetings there is a free exchange of ideas, and collaboration and
       interaction are easily promoted. Similar meetings have also been
       initiated with colleagues at the National Institute of Environmental
       Health Sciences (NIEHS). Because of the varied expertise within these
       groups, problems can be addressed with solutions as the goal rather than
       from the perspective of only a specific narrow discipline.

              It should also  be noted that NCCT scientists work closely with
       others within the Agency on projects specifically relevant and important to
       Program Offices.  Agency scientists work directly on assessments being
       performed by the Office of Pesticide Programs (OPP) in support of the
       Food Quality Protection Act mandated re-registration program,  for
       example. NCCT scientists serve on the Agency Risk Assessment Forum as
       well.  These kinds of activities help assure the NCCT will address a broad
       array of problems relevant to Agency needs.

       Not all the modeling expertise within EPA  resides within NCCT, let alone
the disciplines that rely on computational toxicology. The Center should consider
forming an informal "community of practice" within EPA that can serve a
networking function for interested scientists. This community of practice would
not be an administrative unit, but a virtual professional society within the Agency.


-------
Most of its business can be conducted via electronic media, with occasional
meetings.

              The NCCT recognizes the need for integration of computational
       efforts across ORD and has provided leadership for the formation of two
       Communities of Practice (CoP) within the EPA. One is on
       chemoinformatics and one is on biological modeling.  The goal of the
       chemoinformatics CoP is to  facilitate, coordinate and integrate efforts to
       address the challenges of chemical structure annotation (or indexing),
       retrieval, and mining of chemically-related data and documents, including
       newer toxicogenomics and metabonomics data, across EPA Program
       Offices, Labs and Centers. The goal of the biological modeling
       community of practice is to advance the principles for development and
       application of dosimetry  and other biologically based models within the
       Agency.  Dosimetry modeling includes multiple forms of toxicokinetic
       modeling (e.g., physiologically based toxicokinetic (PBTK) modeling,
       compartmental modeling), respiratory tract dosimetry modeling (e.g.,
       computational fluid dynamics), and related modeling (e.g., dermal
       absorption modeling).  The working group will also focus  on biologically
       based response modeling with special emphasis on using the newest
       "omics" information in biologically based models. A further goal is to
       foster adoption of modeling  science by Agency clients in regulatory
       decision making.

              Membership in these CoPs has initially been solicited from within
       ORD.  In the near future  this will be extended across the Agency. ORD is
       considering gaining endorsement for these groups from the Agency's
       Office of Science Policy.  It  is ORD's belief that this will extend the
       expertise and more importantly, assure that the CoPs will focus on
       important issues relevant to current and future Agency problems. These
       CoPs will operate as the committee has suggested, i.e., via electronic
       media and occasional meetings.

       The Subcommittee endorses  the Center's concept of trying to develop
various personnel alignments and management tools (e.g., appointing
agency/federal/academic scientists as adjunct or associate faculty of the Center)
to help recruit or gain input from a  broader number of scientists.  Those
individuals with technical expertise  aligned with  the Center's activities can be
encouraged to contribute to NCCT activities while being housed in other
organizations within ORD, EPA, or outside of the Agency; they will form the
nucleus of the community of practice.

              The NCCT endorses this recommendation and welcomes the
       possibility of individuals from outside the NCCT doing rotational details
       to acquire training and skills in  computational toxicology.  The CoPs
       mentioned above will help create networks of individuals working towards


-------
       similar objectives.  Besides the CoPs, a number of informal alignments
       have already occurred and resulted in fruitful endeavors.  One example is
       a working relationship in which NHEERL and NCCT scientists are working
       contractually with a private company to investigate the feasibility of
       using that company's capabilities in genomic signature development for
       screening and prioritization and for toxicity pathway identification.
       Another example is the aforementioned collaboration in which NCCT
       scientists in computational toxicology are contributing to a
       pharmacokinetic modeling project with NERL scientists.  Finally,
       interactions with the NTP/NIEHS are developing, as both groups have
       similar goals, as identified in the NTP Roadmap and the EPA's
       Computational Toxicology Framework.

       The CTISC  [Computational Toxicology Implementation Steering
Committee] should be explicitly tasked with identifying possible partnerships and
collaborations (and of prioritizing them, if necessary). ORD should continue to
hold regular meetings of its Laboratory and Center Directors,  at which
partnerships among centers, including NCCT, can be explored.

              The CTISC has been expanded to include the NHSRC and
       Program Office Staff; it now contains delegates from OPPTS (2), OW (2),
       OAQPS (1) and the Regions (1).  The topic of interactions between NCCT
       scientists and those  in other components of ORD has been forwarded as an
       upcoming agenda item for the CTISC.

              ORD Laboratory and Center Directors meet on  a regular basis,
       approximately once a month. Many topics including partnerships are
       regularly addressed at these meetings. In addition, the management team
       of the NCCT has scheduled monthly meetings with their counterparts in
       NHEERL and NERL and will hold quarterly meetings with the NPDs for
       Safe Pesticides and  for Human Health Research. A Memorandum of
       Agreement (MOA) has been established between the NCCT, NHEERL,
       and NERL to provide administrative support, which has provided a strong
       partnership  amongst the co-located units in RTP.

       The internal grant program that supports many of the NCCT
collaborations is important and likely to be highly successful. Future grant
programs should provide a preference for projects that collaborate with the
Center.

              ORD agrees that a program in which the NCCT works with other
       scientists and helps coordinate a great deal of the computational
       toxicology research is very fruitful in terms of promoting collaborations.
       While another round of requests for proposals has yet to be planned, the
       NCCT has committed to reserve at least 10% of its available extramural
       resources in the coming year to be used to augment or initiate

-------
              computational toxicology research within other laboratories and centers.
              Strong preference will be given to those projects in which NCCT staff will
              be involved.

       Finally, NCCT should develop a communications plan to share its
       accomplishments and capabilities with the rest of EPA and those external to the
       Agency.

                    A formal communication plan will be prepared in the coming year.
              At this time, NCCT continues to look for opportunities to informally
              communicate their capabilities and accomplishments. Recently we
              established an internet homepage for the research program and have
              initiated discussions with the communications team in HQ about a broader
              scale communications effort.

II.  The second charge question asked for advice on anticipated staffing. The committee
responded:

              NCCT may wish to consider adding one or two staff who have expertise in
       bioinformatics.  The planned grant for an external bioinformatics center will
       cover most of the Center's needs in this area, but having some internal expertise
       would complement the external bioinformatics efforts and provide a natural point
       of contact between the external group and NCCT. The Center also should
       consider whether there  are social science applications to computational
       toxicology, and if so, whether there is a social science expertise that should be
       represented on the staff.

                    The NCCT has already advertised for two staff positions with
              expertise in bioinformatics. Final selection is anticipated within the next
              month. We have also requested approval to use the new Title 42 hiring
              authority to attract a more senior level bioinformaticist. In addition, two
              Environmental Bioinformatic Research Centers are being established
              through the STAR program to bolster the state of the science of informatic
              analysis in environmental health sciences. A senior position seeking
              expertise that can help develop high-throughput screening and
              prioritization methods has also been advertised. Likewise, final selection
              is expected within the next month.

                    The NCCT is considering the possibility of hiring social science
              expertise in the future. Our short term plan is to provide postdoctoral
              support to the visual analytic effort looking at children's exposure issues
              as a first foray into this area. Additionally, ORD may choose to hire such
              expertise in other programs within  the other laboratories or centers.  This
              will be considered in ORD's overall work force planning activities.


-------
                    Finally, we note that the NCCT recently hired a well known senior
              scientist as an ST to take leadership in the area of systems biology.

III. The third charge question sought advice on how to best keep apace with new
technologies and methodologies. The committee response:

       This is a problem that we all face, but is perhaps more severe for an integrating
   group such as NCCT. Partnerships with other organizations with
   similar/complementary interests may be the best way to facilitate keeping current.
   Active collaborations, which already are the stock-in-trade for the Center,
   publication, and participation in professional meetings will keep the Center staff
   fresh and well informed.  These efforts also will serve to attract the brightest students
   and post-doctoral fellows, who will bring with them the latest technologies.

             The ORD and NCCT agree fully with this comment. The staff are and
       have been actively engaged in such activities both nationally and internationally
       (e.g., ILSI, WHO, OECD).  They also look for training opportunities. Resources
       are maintained for travel and training. Recently the Center selected a candidate
       for the cross-ORD post-doctoral program. This highly qualified candidate will be
       working with a senior scientist from the NCCT and one from the NERL on
       research very relevant to computational toxicology.

IV. The fourth charge question asked if the NCCT articulated a clear rationale for its
concept topic areas of research.  The committee response:

             The Subcommittee members believe that NCCT is on track.  It will be
       important for the Center to prepare a synthesized set of goals/milestones for the
       numerous projects in which the Center is involved, explaining how each fulfills a
       need, and how each topic area will provide tools for the Agency. The
       prioritization process that the Center leadership has developed is a good one,
       which works well in selecting program areas that are consistent with the Center's
       mission.

                    The NCCT appreciates the comments and the staff is currently
             preparing a research implementation plan that will address goals,
             rationale, and milestones over the next three years.  This plan is expected
             to be ready for review during September, 2005. An important component
             of this implementation plan is the launching of the ToxCast program,
             which is being designed to establish a process for the prioritization of
             chemicals for toxicological testing, one of the key driving forces for the
             inception of the computational toxicology program.

V. The next charge question asked the committee to help identify potentially fruitful
partnerships with others outside the Agency.  The response:


-------
       The review provided plenty of evidence that the Center is reaching out to
find potential collaborators among a diverse set of U.S. government and private
institutions. Many of the collaborations discussed should be formalized in
Memoranda of Understanding (MOU), Interagency Agreements (IAGs), and other
formal commitments to demonstrate  the degree of cooperation, leverage, and
interest generated with other partners. Also, NCCT will need to have
opportunities to work with scientists and regulatory authorities from countries
around the world, as computational  toxicology is an area of evolving science with
expertise in Europe, Canada, Asia, perhaps Russia, as well as the United States.
       One approach to broaden international contacts would be to consider
development of ties with U.S.-based  academic centers and institutions that have
liaisons with international scientists and organizations. Also, Center
management may want to specifically reserve some travel allocations to allow
attendance at conferences, workshops, or technical exchanges and site visits at
leading international sites and organizations around the world. A world-class
center will need worldwide perspectives in computational toxicology.

       NCCT already is doing a good job of establishing liaisons with other
organizations involved in aspects of computational toxicology, such as the
National Center for Toxicogenomics at the National Institute of Environmental
Health Sciences (NIEHS).  Efforts should be continued to partner with private
industry in areas of mutual interest.

              Since the review in April, the NCCT staff has visited programs in
       Russia seeking opportunities for collaboration.  Although not yet fruitful,
       several promising areas were identified and are being pursued.

              The recommendation to establish ties with U.S. based academic
       centers that have liaisons with international scientists is a good one and the
       Center will investigate such possibilities. Also, the NCCT staff will  be
       working closely with the STAR program Bioinformatics Centers. A one-day
       workshop was held in May 2005 with DOE's Pacific Northwest
       National Laboratory to develop communication links and begin to identify
       areas of collaboration between the two organizations.

              The NCCT management, as mentioned previously, has been and
       will continue to be careful about reserving sufficient resources to allow
       staff the ability to attend and present at conferences, workshops, etc.  In
       addition, the Center is planning a program of specific topic workshops to
       be conducted at the EPA and at national and international meetings of
       professional societies. The NCCT scientists are considering formulating
       and teaching courses in relevant areas. This will serve to show Center
       capabilities and extend the exchanges between experts from throughout
       the world and Center staff.


-------
                    Center staff are also actively involved with a number of activities
             of ILSI, the WHO and OECD and will have made more than a dozen
             presentations this year at international meetings specifically related to
             various aspects of computational toxicology. These presentations have
             helped communicate the formation of the NCCT to the international
             scientific community.

VI. The final charge question asked the committee to comment on the depth and breadth
of the emphasis areas and whether they recommended other areas for consideration. The
responses:

             The Subcommittee members believe that the Center is doing a good job of
       maintaining broad coverage through its collaborations with multiple
       laboratories. Depth will come from the other laboratories and programs with
       which NCCT collaborates.

                    Collaboration with other laboratories and centers is a centerpiece
             of NCCT's mode of operation and has been discussed in responses to
             previous comments.

             The Center's goal to take advantage of opportunities to broaden and
       generalize the technical approaches to the diverse scope of Agency issues is an
       admirable goal, and one that will require a disciplined approach among the
       technical and managerial team to implement. The Subcommittee realizes that the
       endocrine disruptor studies offer many concrete examples of the kind of
       molecular and cellular work the NCCT can provide in the future. It will be
       important that the Center quickly provides similar services and value to EPA
       programs that can benefit from these tools applied to non-endocrine disruption
       issues. Plans to broaden program office representation in the CTISC (to include
       the Offices of Solid Waste and Emergency Response and Homeland Security, and
       possibly others) should quickly bring these opportunities to the forefront.
       Discussions should proceed with Agency programs and offices dealing with waste
       management and issues surrounding remediation of contaminated sites;
       applications of environmental models to total maximum daily loads (TMDLs);
       environmental health monitoring programs such as the Environmental
       Monitoring and Assessment Program (EMAP), various regional Bay programs
       (Chesapeake Bay, Great Lakes Program, Florida Everglades), as well as the air
       and water monitoring programs conducted by the states with federal assistance.
       Understanding the chemical and biological stressors encountered in these
       environmental health studies will broaden the types of contaminants and thus
       computational tools that must be considered by NCCT. It also will challenge
       applications of the Center's tools to issues with a broad temporal and spatial
       scale and provide opportunities to assess some dynamic aspects of human and
       animal populations.


-------
                    As noted above, membership in the CTISC has been expanded.
              The Center in particular has extensive work dealing with pesticides and
              high production volume chemicals that include substances other than
              endocrine disruptors.  Discussions with parts of the Agency recommended
              by the committee have begun and will be expanded and continued in the
              coming year. At this time, however, it is not clear what role the NCCT
              itself will take in some of the more ecologically related areas mentioned
              by the Committee. Given the small size of the NCCT and the fact that
              some of those activities are well represented in the other laboratories and
              centers, these areas may be addressed through collaborative efforts and
              temporary assignments of those scientists to the NCCT. However, this
              needs further discussion within ORD.

                    Based upon the urgent need to develop a prioritization and
              categorization process for evaluating the large numbers of chemicals for
              which standard toxicological studies are not available, the NCCT will
              soon be launching the ToxCast project. This effort, to some extent, builds
              on the activities of the EDC proof of concept projects in that it will be
              using a variety of computational and molecular tools to collect biological
              activity patterns using high throughput screening devices.  If successful,
              this concept will provide multiple program offices with a solution to a
              vexing problem. We are now involved in a number of briefings and
              presentations across ORD and the Program Offices (e.g., OPPTS and OW) in
              order to build a consensus about the overall program and to fine tune the
              directional details. Completion of our staffing targets that are scheduled
              for this year will greatly facilitate our ability to broaden beyond the efforts
              presented to the BOSC during the April review.

       Recognizing that this review was a progress review early in the life of the NCCT,
it is expected that subsequent reviews by the Committee will take place.  The next
progress review is expected in late 2006 or early 2007.


-------
Computational Toxicology Program
Summary of BOSC Comments From July 2005 Letter Report and Proposed ORD
Actions
Each entry below lists the BOSC Recommendation, the ORD Action Items, and the
Timeline.

Charge Question 1, advice on collaboration:

  Recommendation: Shifting focus from finding computational approaches to
                  address a set of specific issues to developing robust tools
                  and procedures that provide computational frameworks that
                  support ORD and Agency programs.
  Action Items:   An Implementation Plan for the NCCT is being written that
                  incorporates all facets of the Computational Toxicology
                  Program including research, outreach, and operations.  The
                  plan will recognize the need for providing generic tools
                  that will facilitate the incorporation of computational
                  methods into the hazard and risk assessment processes.
                  The NCCT is also committed to the long term goal of
                  conducting two annual training workshops on topics that
                  will help promote the use of computational approaches in
                  ORD and EPA.
  Timeline:       September, 2005; current and on-going.

  Recommendation: Form an informal "community of practice" within EPA that
                  can serve a networking function for interested scientists.
  Action Items:   The NCCT recognizes the need for integration of
                  computational efforts across EPA.  Two such Communities of
                  Practice have been initiated - one for chemoinformatics and
                  one for biological modeling.
  Timeline:       Expect first meetings by October 30, 2005.

  Recommendation: Develop various personnel alignments and management tools
                  to help recruit or gain input from a broader number of
                  scientists.
  Action Items:   The NCCT welcomes the opportunity for staff from other
                  units of ORD or EPA to have rotational details for the
                  purpose of acquiring training and experiences in
                  computational methods.  The Communities of Practice offer
                  other means of gaining input from a broader range of
                  scientists.  The weekly work-in-progress meeting with the
                  NTP/NIEHS offers yet another input function.
  Timeline:       Current and on-going.

  Recommendation: The CTISC should be explicitly tasked with identifying
                  possible partnerships and collaborations (and of
                  prioritizing them, if necessary).
  Action Items:   The CTISC has been expanded to include the NHSRC and
                  Program Office staff.  The topic of interactions between
                  NCCT scientists and those in other components of ORD has
                  been forwarded as an upcoming agenda item for the CTISC.
  Timeline:       Late FY 2005/early FY 2006.

  Recommendation: ORD should continue to hold regular meetings of its
                  Laboratory and Center Directors, at which partnerships
                  among centers, including NCCT, can be explored.
  Action Items:   ORD Laboratory and Center Directors meet on a regular
                  basis, approximately once a month.  In addition, the
                  management team of the NCCT has scheduled monthly meetings
                  with their counterparts in NHEERL and NERL, and has agreed
                  to hold at least quarterly meetings with the NPDs for Safe
                  Pesticides and for Human Health Research.
  Timeline:       On-going.

  Recommendation: Future [internal] grant programs should provide a
                  preference for projects that collaborate with the Center.
  Action Items:   The NCCT has committed to reserve at least 10% of its
                  available extramural resources in the coming year to be
                  used to augment or initiate computational toxicology
                  research within other Laboratories and Centers.  Strong
                  preference will be given to those projects in which NCCT
                  staff will be involved.
  Timeline:       FY 2006 and beyond.

  Recommendation: NCCT should develop a communications plan to share its
                  accomplishments and capabilities with the rest of EPA and
                  those external to the Agency.
  Action Items:   Formal communications plan for NCCT to be developed and
                  implemented; an internet homepage has been established.
  Timeline:       FY 2006.

Charge Question 2, advice on anticipated staffing:

  Recommendation: NCCT may wish to consider adding one or two staff who have
                  expertise in bioinformatics.
  Action Items:   Two such positions have been advertised and the selection
                  process is on-going.  We have also requested approval to
                  use the new Title 42 hiring authority to attract a more
                  senior level bioinformaticist.
  Timeline:       Final selection by October 1, 2005.

  Recommendation: The Center also should consider whether there are social
                  science applications to computational toxicology, and if
                  so, whether there is a social science expertise that should
                  be represented on the staff.
  Action Items:   The NCCT recognizes the importance of this research
                  activity and it will be considered by NCCT and others in
                  ORD's overall work force planning activities.  The short
                  term plan is to provide postdoctoral support to the visual
                  analytic effort looking at children's exposure issues as a
                  first foray into this area.
  Timeline:       FY 2006.

Charge Question 3, advice on how to keep apace with new technologies and
methodologies:

  Recommendation: Consider partnerships with other organizations with
                  similar/complementary interests to facilitate keeping
                  fresh; publication and participation in professional
                  meetings will also keep the Center staff fresh and well
                  informed.
  Action Items:   Staff are actively engaged in National and International
                  activities (ILSI, WHO, OECD) and meetings - resources are
                  set aside for such activities.
  Timeline:       On-going.

Charge Question 4, has NCCT articulated a clear rationale for topic research
areas:

  Recommendation: Prepare a synthesized set of goals/milestones for the
                  numerous projects in which the Center is involved,
                  explaining how each fulfills a need, and how each topic
                  area will provide tools for the Agency.
  Action Items:   NCCT is developing a research implementation plan.  This
                  plan will articulate the particular directions and expected
                  milestones of the research program over the next three
                  years.
  Timeline:       First draft - September, 2005.

Charge Question 5, identification of fruitful partnerships with others outside
the Agency:

  Recommendation: NCCT will need to have opportunities to work with
                  scientists and regulatory authorities from countries around
                  the world, as computational toxicology is an area of
                  evolving science with expertise in Europe, Canada, Asia,
                  perhaps Russia, as well as the United States.
  Action Items:   The NCCT staff are involved in a number of ongoing
                  international efforts, including those with ILSI, the WHO
                  and OECD.  In addition, NCCT staff recently visited Russia
                  to develop potential working partnerships and we are now
                  working through OSP/ORD and ISTC/Russia to develop several
                  research proposals in computational toxicology.
  Timeline:       Ongoing.  Continued discussions with collaborative
                  proposals developed early in FY 2006.

  Recommendation: Consider development of ties with U.S.-based academic
                  centers and institutions that have liaisons with
                  international scientists and organizations.
  Action Items:   The newly established STAR Centers for Environmental
                  Bioinformatics should provide a logical starting place for
                  interactions with academic institutions.  A one-day
                  workshop was held in May 2005 with scientists from PNNL
                  looking to develop a collaborative relationship.
  Timeline:       Immediate and continuing; selection for new staff for
                  bioinformatics expected by Oct 1, 2005.

  Recommendation: Center management may want to specifically reserve some
                  travel allocations to allow attendance at conferences,
                  workshops, or technical exchanges and site visits at
                  leading international sites and organizations.
  Action Items:   NCCT agrees and has and will continue to reserve
                  sufficient resources to allow staff participation at
                  conferences, workshops, etc.  With our current budgetary
                  situation, we do not foresee any difficulty in supporting
                  this function.  NCCT is also preparing a program of
                  specific topic workshops.
  Timeline:       On-going.  Expected in 2006 and then continuing.

Charge Question 6, depth and breadth of emphasis areas and other possible
areas of consideration:

  Recommendation: The Subcommittee members believe that the Center is doing
                  a good job of maintaining broad coverage through its
                  collaborations with multiple laboratories.  Depth will come
                  from the other laboratories and programs with which NCCT
                  collaborates.
  Action Items:   The NCCT appreciates the positive feedback and will
                  continue to develop collaborations that will allow delivery
                  of important products to the Agency over the next 3-5
                  years.
  Timeline:       Ongoing.

  Recommendation: The Subcommittee realizes that the endocrine disruptor
                  studies offer many concrete examples of the kind of
                  molecular and cellular work the NCCT can provide in the
                  future.  It will be important that the Center quickly
                  provides similar services and value to EPA programs that
                  can benefit from these tools applied to non-endocrine
                  disruption issues.
  Action Items:   Launching of the ToxCast concept that builds on the
                  activities of the EDC proof of concept projects in that it
                  will be using a variety of computational and molecular
                  tools to collect biological activity patterns using high
                  throughput screening devices to prioritize and categorize
                  chemicals for more standard toxicological evaluation.
  Timeline:       FY 2006.




-------
               BOARD OF SCIENTIFIC  COUNSELORS
                         December 12, 2006

                         Dr. George Gray
                         Assistant Administrator
                         Office of Research and Development
                         U.S. Environmental Protection Agency
                         1200 Pennsylvania Avenue, NW
                         Washington, DC 20460

                         Dr. Robert Kavlock
                         Director
                         National Center for Computational Toxicology
                         Office of Research and Development
                         U.S. Environmental Protection Agency
                         Research Triangle Park, NC 27711

                         Dear Drs. Gray and Kavlock:

                         This is a letter report from the Board of Scientific Counselors (BOSC) reviewing
                         the Computational Toxicology Research Program conducted by the National
                         Center for Computational Toxicology (NCCT).  The Computational Toxicology
                         Subcommittee  of the BOSC reviewed NCCT's progress and plans during a 2-
                         day meeting on June 19-20, 2006,  at the EPA facility in Research Triangle Park,
                         North Carolina. The BOSC Subcommittee consists of George Daston (Chair),
                         James Clark, Michael Clegg, Richard DiGiulio, Muiz Mumtaz, and John
                         Quackenbush.

                         This is the second review of the NCCT. The first review of the Center was
                         conducted in May 2005. The Subcommittee was very pleased with the progress
                         that the NCCT has made towards its goals since that first review. The Center
                         first became operational in February 2005; during the 16 months between its
                         establishment and this review, the  NCCT has made substantial progress  in:
                         (1) establishing priorities and goals; (2) making connections within and outside
                         EPA to leverage the staff's considerable modeling expertise; (3) expanding its
                         capabilities in informatics; and (4) making significant contributions to research and
                         decision-making throughout the Agency.

                         Many of the recommendations made by the BOSC during its first review have
                         been acted on by NCCT. This includes expanding its capabilities in
                         bioinformatics  through the funding of two external centers and through staff
                         hires, expansion of its technical approaches to even more programs within the


A Federal Advisory Committee for the U.S. Environmental Protection Agency's Office of Research and Development

-------
                            December 2006 BOSC Computational Toxicology Letter Report

Agency, and the development of communities of practice (CoPs) throughout the EPA
research community in chemoinformatics, biological modeling, and chemical
prioritization. CoPs are cross-organizational groupings of experts who share an interest in
a common technology.

The Subcommittee addressed a number of charge questions during its review, the
responses to which provide a basis for comments on progress as well as specific
recommendations.

Question 1: What progress has been made in the last year in developing/maximizing
connections and collaborations within ORD and the Agency, through communities of
practice and other interactions?  Are there notable examples of collaborations that have
been established to increase the reach and effectiveness of NCCT? Are there additional
collaboration opportunities NCCT should explore?

During the review, the Subcommittee members heard reports on three active CoPs:
(1) Chemoinformatics; (2) Biological Modeling; and (3) Chemical Prioritization. They
also heard about one proposed CoP, Cumulative Risk.

All active CoPs have formal memberships and are chaired by NCCT staff. The Center
also has observed active participation among numerous EPA laboratories and centers and
several program offices.  The Chemoinformatics and Chemical Prioritization CoPs
already have demonstrated outreach to outside agencies, such as the National Institutes of
Health (NIH), National Institute of Environmental Health Sciences (NIEHS), and
National Toxicology Program (NTP).  Some are working with or soliciting international
and private sector collaboration. The CoPs have been effective in focusing on defining
problems and suggesting solutions, agreeing on modeling approaches and database
issues, and setting up forums and workshops for discussions. They will be responsible
for leading a better coordinated effort within EPA and among agencies.

The Subcommittee believes  that establishing a Cumulative Risk CoP is worthy of pursuit.
Such a CoP would provide significant opportunities to define areas for  improvement in
risk assessment practices and could provide  inventory tools and other benefits.  NCCT
should consider whether it would like to provide a facilitator role or leadership role in
this area.

With regard to other opportunities for exploration, the Subcommittee suggests that NCCT
seek broader program office input. Additionally, CoPs covering areas such as Mixtures,
Cross-Species Extrapolation, Population/Systems Dynamic Models, and Multimedia Fate
and Effects Modeling should be considered for either NCCT use or ORD's broader use.

Question 2: How does the work of the new extramural bioinformatics centers
complement the intramural program, and how should the outputs best be integrated into
NCCT strategic direction?

ORD funded two extramural Bioinformatics Centers, one at the University of North
Carolina directed by Fred Wright and a second at the University of Medicine and
Dentistry of New Jersey headed by William Welsh.  The Centers are used to extend the


-------

capabilities of the intramural program. Individually, the Bioinformatics Centers were
viewed to be excellent choices, each providing expertise and resources largely
complementary to each other and to the NCCT with little overlap. Although both Centers
are just beginning their work with EPA, there is great opportunity for synergy in
developing new approaches for the analysis of toxicogenomics data and the integration of
diverse information necessary to place these data into an appropriate context. In addition,
each Center has existing links to risk managers and risk management groups, providing
additional potential avenues for outreach to link the research programs of the NCCT and
the Centers to real problems.

Integration of the external Bioinformatics Centers and the programs within NCCT will
occur following the hiring of one senior and one junior bioinformatics scientist. This
may not represent sufficient personnel, however, to allow NCCT to  fully support its
overall mission.  Much of NCCT's program is focused on development of predictive
models using systems biology approaches.  Although this is a laudable approach, it
ultimately will be driven by the availability of high-quality, well-annotated data and their
integration with a wide range of other information. This will require significant effort.
Although there are efforts underway under the direction of various NCCT personnel to
begin this process, a more integrated approach is needed.

Consequently, NCCT needs to develop a more comprehensive strategic plan for data
collection, management, and integration through creation  of databases that model the
structure of the underlying information and its potential use. This will require a careful
assessment of the capabilities extant in each center so that necessary components, as well
as areas for future development, can be identified. Addressing these issues will provide
the structured data needed by NCCT's Systems Modeling and Computational Chemistry
groups.

It also was noted that there exists a need within the field for trained personnel in
computational toxicology.  In addition to the existing postdoctoral program, one feasible
approach would be to institute a career development award similar to the NIH "K"
awards that would provide mentored training and research to more senior personnel.

Question 3: Although the intent is not to review individual research programs, do the
research programs highlighted during this review offer the promise of increasing the use
and effectiveness of computational methods in Agency research? Do the efforts fulfill the
goal of leveraging the resources of NCCT to increase effectiveness?

The long-term goals (LTGs) of the Computational Toxicology Research Program are to
provide risk assessors with: (1) improved methods to understand the source-to-response
continuum, (2) advanced hazard characterization tools for prioritization and screening,
and (3) methods that enhance dose-response assessment and quantitative risk assessment.
The research efforts that were highlighted as part of the review cover each of these LTGs,
and have the potential to be broadly used within and outside the Agency.  This  included
efforts in high-throughput screening (HTS), modeling of molecular interactions with
biological targets, modeling of complex pharmacokinetic  and pharmacodynamic
behaviors of small molecules, and database development and management, among others.
The portfolio provided a mix of short- and long-term deliverables.  Many of the former
stand a good chance for application within program offices or other parts of ORD within


-------

months.  The research programs included those from external institutions. The
Subcommittee found that NCCT has effectively leveraged its limited resources.

One of the major aims of NCCT is to develop useful relational databases.  This also
presents a significant challenge in managing the information. The Center should develop
a strategic plan for data integration and for constructing databases that should be
considered as information models.

Question 4: Because a large part of the mission of NCCT is to accelerate the use of
computational tools in the mission of the Agency, please comment on:

•   Part A: Do the proposed computational models have the potential to identify and
    reduce uncertainties associated with the risk assessment process?

Yes, proposed computational models have the potential to identify and reduce
uncertainties associated with risk assessment.  Additional opportunities outside the
mechanistic models (especially in biomarkers that indicate exposure but that are not
immediately or directly linked to toxicological response) may exist to fulfill NCCT's
mission.

•   Part B: Will these models be able to help identify susceptible populations and
    compare potential risks to those populations with risks to the general and less
    susceptible population?

Ultimately, these and other models within NCCT and outside the Agency can help
identify susceptible populations.  Appropriately, models currently are being developed
for use in computational toxicology. Within 3-5 years, some of these models likely will
be sufficiently developed and validated to address susceptibility. "Susceptible
populations" may be defined to include life stages, gender, race, socioeconomic group,
species, and geographic distribution.

•   Part C: Is the coordination between model development and associated data
    collection sufficient to avoid problems with the models being either over- or under-
    determined?

Overall, data collection appears appropriately coordinated with model development. It
will be important to validate models based on genomic methodologies given the inherent
constraints in sample sizes, and other challenges, with these approaches.

Question 5: Please comment on the Computational Toxicology Implementation Plan,
focusing on the NCCT and Science To Achieve Results (STAR) components. Does it set
an achievable road map for accomplishing NCCT's major goals over the next 3 years, as
described in "A Framework for Computational Toxicology Research Program"?  Does it
set realistic and relevant milestones, and clearly articulate projected program outputs
that will result in environmental outcomes?

The Implementation Plan consists of five research tracks that are intended to fulfill three
long-term goals:


-------
1.  EPA risk assessors use improved methods and tools to better understand and describe
   linkages across the source-to-outcome paradigm;

2.  EPA program offices use advanced hazard characterization tools to prioritize and
   screen chemicals for toxicological evaluation;

3.  EPA risk assessors and regulators use new models based on the latest science to
   reduce uncertainties in dose-response assessment, cross-species extrapolation, and
   quantitative risk assessment.

The research tracks that will support these long term goals are: (1) development of data
for advanced biological models; (2) information technologies development and
application; (3) prioritization method development and application; (4) providing tools
and system models for extrapolation across dose, life stage,  and species; and
(5) advanced computational toxicology approaches to improve cumulative risk
predictions.

Each of the research areas is active. Tables  1 and 2 of the Center's Implementation Plan
provide details of projects and the outputs/outcomes and expected impacts of the projects.
NCCT has a core strength in modeling, and is expanding its expertise in informatics.  The
Center is leveraging its position by outreach to other EPA laboratories and programs via
internal research funding and communities of practice, and externally via STAR grants,
including the external bioinformatics centers. The addition  of the informatics centers in
particular strengthens NCCT's research in information technologies. This will be
strengthened further through the hiring of NCCT staff with informatics expertise. The
STAR grants greatly expand NCCT's capacities in the generation of high-information-
content data sets that will be needed to support model development.

Some challenges remain that will need to be overcome in the areas of database
development and management. More details are provided in our response to Question 2.
This will be especially important in the development and demonstration of biological
models derived from complex data sets. The Center is encouraged to do whatever it can,
within the boundaries of the grant process, to foster coordination of efforts between the
two external bioinformatics centers and NCCT's internal program.

The research has milestones with nearer term and longer term time horizons, which is
appropriate. It is clear that chemoinformatics tools and prioritization tools are well
underway and are likely to be applied by risk assessors and regulators within the next few
years.  The timelines are realistic and the milestones will provide practical tools and
methods to program offices. In the shorter term, information databases such as DSSTox
and prioritization models such as ToxCast will be important tools for the pesticides and
toxic substances programs, and will demonstrate the utility of computational toxicology
in an applied setting. In the longer term, biological models such as the virtual liver will
improve mechanistic understanding of toxicological response and provide support for
mechanism-based risk assessment.


-------

The BOSC recommends that the NCCT develop a more detailed work plan for the virtual
liver model, and that this plan be more extensively reviewed by the Computational
Toxicology Subcommittee during its next annual review.

Question 6: Please comment on the progress made in the five major research track
thematic areas of the Computational Toxicology Research Program, and whether the
current/planned research will address the major goals in the framework. The Center has
made staffing additions and initiated new research over the past year. Based on these
changes, what is the Subcommittee's view of the depth and breadth of the areas selected
for emphasis?

The Subcommittee believes that the research program covers the range of thematic areas.
Some areas, however, have deeper coverage than others.  The areas of cumulative risk
assessment and cross-species extrapolation are still under-represented, but given the state-
of-the-science, it is appropriate to place limited emphasis and continue to leverage
research outside the Agency in these areas for the next 3-5 years. The staffing  additions
in HTS, toxicogenomics, and biological modeling are all  strong and have improved the
strength and breadth of NCCT. The planned staff additions in bioinformatics will be
critical to the continuing success of the Center.  One of these additions should have
strong skills in data management systems.

Question 7: What evidence exists that NCCT is responsive to program office and
regional needs?

Most of the presentations addressed program office input in planning priorities and
approaches.

Some projects were formed to support program office issues, such as carbamate cumulative
risk, DSSTox, and RefTox DB.  The Subcommittee noted program office and regional
office staff as co-principal investigators on various projects.  The Implementation Plan
references a role for the Computational Toxicology Implementation and Steering
Committee (CTISC), which could be useful, if sustained.

Question 8: Please comment on how effectively NCCT is communicating its research
program to EPA program offices, regional offices, and other stakeholders to inform their
environmental decision making.

NCCT has components of both a research and service center—it both initiates and
receives new ideas.  For a young organization, NCCT has done very well in establishing
communication with its collaborators, contractors, and some stakeholders.  The
establishment of CoPs and participation of internal clients is a good start to
communication within the Agency. Also of note is NCCT's establishment of monthly
videoconference presentations. Within the past year, NCCT has commendably given 21
presentations to various offices within EPA to raise awareness.  Most of the other
communication activities seemed to be investigator-initiated. Given that the Center plans
to develop tools and methods that will be used by ORD and other EPA staff, NCCT
should establish a regularly scheduled plan for communication and updates. This process
will convey the sense that new ideas are welcomed by NCCT and allow the Center to
accept ideas and be aware of the needs of the program offices, regional offices, and


-------

stakeholders.  The establishment of such a process will enhance the marketing of tools
and methods developed by NCCT.  One way to give Agency clients part ownership in the
Center is to invite them to BOSC reviews, such as this one, and ask them to share how
they are using NCCT's methods, tools, and information.  The Subcommittee recommends
that NCCT communicate with the Regional Risk Assessor's Office and seek its
representation.

Question 9: Is the current research program designed to achieve environmental
outcomes? Please provide recommendations on how the NCCT can best measure these
outcomes.

The current program is designed to achieve environmental outcomes that are appropriate
to the Agency. Potential measures to determine these outcomes include:

•  Use of screening models for chemical prioritization.

•  Validation and use of genomics-associated biomarkers in field studies.

•  Use of computational models in the risk assessment process in the long term.

•  Success of databases (DSSTox, pesticides) in cleaning up and organizing disparate
   databases and making them widely useful to environmental science and regulatory
   communities.

•  Use of specific models (such as virtual liver, pyrethroid metabolism, macromolecular
   modeling, physiologically based pharmacokinetic (PBPK) models, steroidogenesis
   models, cumulative risk models, and so forth) by broader environmental science and
   risk assessment communities.

In conclusion, the Computational Toxicology Subcommittee of the BOSC believes that
NCCT is making exceptional progress towards its mission. We are pleased to provide
advice on this important Center and look forward to future opportunities to offer
suggestions for improving the NCCT.

Sincerely,
James R. Clark
Chair, Board of Scientific Counselors

-------
             UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
                           WASHINGTON, D.C. 20460


                                JAN 19 2007
                                                                        OFFICE OF
                                                                 RESEARCH AND DEVELOPMENT
Dr. James R. Clark
Chair, Board of Scientific Counselors
Exxon Mobil Research & Engineering Co.
3225 Gallows Road, Room 3A412
Fairfax, VA 22037

Dear Dr.  Clark:

      The Office of Research and Development (ORD) would like to take this
opportunity to thank you and the members of the Board of Scientific Counselors
(BOSC) for the June 2006 progress review of the National Center for
Computational Toxicology (NCCT). We especially thank the members of the
subcommittee who conducted the review: George Daston (Chair), James Clark,
Michael Clegg, Richard DiGiulio, Moiz Mumtaz, and John Quackenbush.

      Enclosed with this letter is ORD's response to the comments and
recommendations on the NCCT in your letter report of December 12, 2006. Please feel
free to contact me if further information is needed.

      We are pleased that the BOSC was very supportive of the NCCT and the direction
we are taking in this very important research program.

      Again, thank you for your advice to ORD.
                                 Sincerely,
                                       Teichman, Ph.D.
                                 Acting Deputy Assistant Administrator
                                    for Science
Enclosure

cc:     Dr. George Daston
       Dr. Michael Clegg
       Dr. Richard DiGiulio
       Dr. Moiz Mumtaz
       Dr. John Quackenbush




-------
   Office of Research and Development's (ORD) January 2007
Response to the Board of Scientific Counselors (BOSC) December
2006 Final Letter Report that Reviews ORD's National Center for
                   Computational Toxicology
                BOSC Computational Toxicology Subcommittee:
                     Dr. George Daston (Chair)
                     Dr. James R. Clark
                     Dr. Michael Clegg
                     Dr. Richard DiGiulio
                     Dr. M. Moiz Mumtaz
                     Dr. John Quackenbush
                     Submitted by:
                     Dr. Robert Kavlock
                     Director
                     National Center for Computational Toxicology
                     Office of Research and Development

-------
January 2007 ORD Response to BOSC December 2006 Computational Toxicology Final
                                  Letter Report

ORD Response to Board of Scientific Counselors (BOSC) June 2006 Review of the
National Center for Computational Toxicology (NCCT)

The following is a narrative response to the comments and recommendations of the
BOSC review of ORD's National Center for Computational Toxicology that was held on
June 19-20, 2006, in Research Triangle Park, NC. The review was conducted by a
standing subcommittee of the BOSC.  The subcommittee had previously reviewed the
NCCT on April 25-26, 2005, and ORD had responded to that review on September 8,
2005. In the second review, the BOSC noted that in its 16 months of existence, "NCCT
had made substantial progress in (1) establishing goals and priorities; (2) making
connections within and outside EPA to leverage the staff's considerable modeling
experience; (3) expanding its capabilities in informatics; and (4) significant contributions
to research and decision-making throughout the Agency." Furthermore, they noted,
"many of the recommendations made by the BOSC during its first review have been
acted on by NCCT."

Following are specific comments related to the charge questions made by the committee.
The charge questions are summarized in bold text, followed by the BOSC's comments in
italics, and ORD's response to the comments in regular type. Attached to this document
is a summary table of the BOSC comments and proposed ORD actions.

1. The first charge question asked  for an evaluation of progress the Center made
   during the past year in developing and maximizing connections and
   collaboration within ORD and the rest of the Agency. Specifically, the
   committee was asked about interactions, including the established Communities
   of Practice (CoPs) and other notable examples and if there are other
   opportunities that NCCT should explore.

   All active CoPs have formal memberships and are chaired by NCCT staff. The Center
   also has observed active participation among numerous EPA laboratories and
   centers and several program offices. The Chemoinformatics and Chemical
   Prioritization CoPs already have demonstrated outreach to outside agencies, such as
   the National Institutes of Health (NIH), National Institute of Environmental Health
   Sciences (NIEHS), and National Toxicology Program (NTP).  Some are working with
   or soliciting international and private sector collaboration. The CoPs have been
   effective in focusing on defining problems and suggesting solutions,  agreeing on
   modeling approaches and database issues,  and setting up forums and workshops for
   discussions. They will be responsible for leading a better coordinated effort within
   EPA and among agencies.  The Subcommittee believes that establishing a Cumulative
   Risk CoP is worthy of pursuit. Such a CoP would provide significant opportunities to
   define areas for improvement in risk assessment practices and could provide
   inventory tools and other benefits. NCCT should consider whether it would like to
   provide a facilitator role or leadership role in this area.

   Response: ORD appreciates the Committee's recognition of NCCT's current efforts.
   Also, ORD agrees on the importance of pursuing the formation of other relevant CoPs.


-------

   Due to the small size of the NCCT staff, ORD is concerned these could overtax the
   staff. NCCT is committed to supporting the three existing CoPs (Chemoinformatics,
   Chemical Prioritization, and Biological Modeling) and will take steps to ensure their
   vitality. For other CoP ideas, ORD will look across all the Labs and Centers to see if
   relevant similar work groups and committees are already established and could be
   amended to address such issues, or encourage the establishment of ones for which no
   precedent can be formed.  In particular, we have further considered a CoP centered on
   Cumulative Risk, as proposed during the review. While still favoring the idea, we
   have realized from the activities of the existing CoPs that they function best when
   aligned along a well-defined issue and have a commonly identified goal. For
   Cumulative Risk, our current opinion is the issue and goal of a dedicated CoP must be
   better refined, and we hope to work with other Agency scientists to foster this
   refinement and development.

   With regard to other opportunities for exploration, the subcommittee suggested
   NCCT seek broader program office input.  Additionally, CoPs covering areas such as
   Mixtures, Cross-Species Extrapolation, Population/Systems Dynamic Models, and
   Multimedia Fate and Effects Modeling should be considered for either NCCT use or
   ORD's broader use.

   Response: ORD also agrees with the Committee's recommendation of broader
   program office input. The Center is in the process of increasing the number and
   frequency of contacts and meetings with Program and Regional Offices. Formal
   presentations are often part of those contacts.  For example, we have recently given
   overview presentations of the program to EPA's Science Policy Council and Regional
   Risk Assessors, both of which drew considerable interest.  In addition, we are
   scheduled to present a detailed overview of the ToxCast program to Office of
   Prevention, Pesticides, and Toxic Substances (OPPTS) on February 1,  2007, in
   Washington, D.C. Finally, the upcoming International Forum on Computational
   Toxicology being organized by NCCT on behalf of ORD is expected to provide
   opportunities for contact within and outside the Agency.  NCCT is preparing and
   executing several means to better communicate progress, outputs, and  abilities to the
   rest of the Agency, and in particular we are working on ways to improve the content
   of our internet site. NCCT staff are also organizing a series of short courses in the
   field of computational biology for Agency staff and others as well.

2.  The second charge question dealt with interactions with the two newly funded
   STAR Environmental Bioinformatics Centers.

   Individually, the Bioinformatics Centers were viewed as excellent choices, each
   providing expertise and resources largely complementary to each other and to the
   NCCT with little overlap.  Although both Centers are just beginning their work with
   EPA, there is great opportunity for synergy in developing new approaches for the
   analysis of toxicogenomic data and integration of diverse information necessary to
   place these data into an appropriate context. Integration of the external
   Bioinformatics Centers and the programs within NCCT will occur following hiring of


-------

   one senior and one junior bioinformatics scientist. This may not represent sufficient
   personnel, however, to allow NCCT to fully support its overall mission.

   Response: NCCT is continuing to work with ORD's National Center for
   Environmental Research (NCER) to ensure the Bioinformatics Centers enhance the
   current state of the science in this critical research area.  The hiring of Dr. Richard
   Judson as a Title 42 Senior Bioinformatician in NCCT has greatly facilitated the
   interactions with the Centers. Dr. Judson coordinates a monthly EPA-wide seminar
   program (Info on Informatics), which features one of the project areas from each of
   the Centers.  The goal of the series is to promote EPA awareness of the objectives of
   the Centers and to help facilitate development of interactions with them.  Drs. Judson
   and Kavlock, together with staff from NCER,  performed a site visit to the University
   of Medicine & Dentistry of New Jersey (UMDNJ) Center in December, which
   resulted in very rewarding discussions concerning future interactions. A second site
   visit is planned for early 2007 by Dr. Judson and several other EPA scientists to
   further develop ties. Due to the geographical closeness, interactions with the
   University of North Carolina (UNC) Center have  been more frequent and targeted. A
   predoctoral student has been identified to interact on matters related to genomic data
   storage, analysis and interpretation, and several interactions have developed in
   conjunction with the chemical prioritization efforts of the ToxCast program.

   Consequently, NCCT needs to develop a more comprehensive strategic plan for data
   collection, management,  and integration through  creation of databases that model the
   structure  of the underlying information and its potential use.

   Response: Dr. Judson, working in conjunction with Dr. Imran Shah, our Title 42
   Computational Systems Biologist (both joined the Center in September), has also
   taken the lead in developing an overall framework for information management
   within NCCT.  In response to the strong recommendation of the BOSC related to the
   need to adequately address this topic, we propose a targeted briefing on our approach
   to information management and information technology for the BOSC sometime in
   the May-June 2007 time  frame.

   It was noted that there exists a need within the field for trained personnel in
   computational toxicology. In addition to the existing postdoctoral program, one
   feasible approach would be to institute a career development award similar to the
   NIH "K" awards that would provide mentored training and research to more senior
   personnel.

   Response: We appreciate the recommendation to strengthen our training component,
   as we view this as one of our three critical functions (in addition to providing a
   service function to other ORD researchers and conducting innovative research on the
   use of computational models in risk assessment).  We will work with appropriate
   human resource components within EPA to explore options for career development
   training of other scientists. We have also engaged in advanced discussions within


-------

   NCCT on hosting several advanced training courses for EPA staff.  Lead topics are
   Physiologically Based Pharmacokinetic Models and Chemical Prioritization Tools.

3.  The third charge question was designed to promote discussion about the
   potential of NCCT research programs making impacts on Agency function, and
   how well the NCCT was leveraging its resources in this regard.

   The portfolio provided a mix of short- and long-term deliverables. Many of the
   former stand a good chance for application within program offices  or other parts of
   ORD within months.  The research programs included those from external
   institutions.  NCCT has leveraged its limited resources to good effect.

   Response: The NCCT program was designed with the goal of having some short- to
   intermediate-term deliverables, as well as some projects with longer timelines, and
   we appreciate the recognition of the value of this by the BOSC.  We have continued
   to work to best leverage our resources, and present three examples of related efforts
   since the review. The first  is the establishment of an Interagency Agreement with the
   National Chemical Genomics Center (NCGC) of the NIH to conduct quantitative,
   high-throughput screening analysis of ToxCast chemicals against a number of nuclear
   receptor assays.  This IAG provides NCCT with a direct link to NIH's Molecular
   Library Initiative.  This Initiative will be providing extremely cost effective data to us
   over the next 5 years, as it taps  into a well established infrastructure geared to running
   these types of assays. NCCT has also started a series of high level meetings with the
   management of the National Health and Environmental Effects Research Laboratory
   (NHEERL), the National Environmental Research Laboratory (NERL), and the
   National Risk Management Research Laboratory (NRMRL) to best define working
   relationships between these groups and how to target the computational toxicology
   resources available in those laboratories.  The first of these meetings was held with
   NHEERL on January 18, 2007. Finally,  we have been working closely with staff in
   NCER to define the next Request for Applications (RFA) in computational
   toxicology. The objective  of this RFA will be to establish several academic centers
   working in areas of computational systems biology, and we are excited about the
   prospect of this activity to move us forward more rapidly in programs such as the
   Virtual Liver, as well as in  developing the computer infrastructure and computational
   approaches to systems biology from a toxicological viewpoint.

   One of the major aims of NCCT is to develop useful relational databases.  This also
   presents a significant challenge in managing the information. The Center should
   develop a strategic plan for data integration and for constructing databases that
   should be considered as information models.

   Response: As noted in the  response to Q2, NCCT is developing a strategic plan for
   data information and management, and is prepared to bring its plan  to the BOSC for
   comment within the next 6  months.


-------

4.  The fourth charge question, consisting of three parts, focused on NCCT's mission
   to accelerate the use of computational tools in the Agency mission.  The first part
   more specifically asked the committee to comment on whether the proposed
   computational models have the potential to identify and reduce uncertainties
   associated with risk assessment.

   Yes, proposed computational models have the potential to identify and reduce
   uncertainties associated with risk assessment. Additional opportunities outside the
   mechanistic models (especially in biomarkers that indicate exposure but that are not
   immediately or directly linked to toxicological response) may exist to fulfill NCCT's
   mission.

   Response: ORD is pleased the committee endorses the selection of computational
   models and the planned approach to develop, test, and use these models. ORD agrees
   there are other opportunities that can use approaches other than mechanistic models,
   including exposure biomarkers.  Since the review, new expert staff have come on board with
   systems modeling expertise.  Plans are being formulated  for an extensive liver model
   that can simulate its molecular processes and predict the  possible toxic effects of
   chemicals on liver function.  As part of this effort, modules at many different levels
   and complexities will be formulated, including those that relate data to tissue outcome
   without detailed specific knowledge of mechanism. Further,  work has now begun on
   computational approaches to apply advanced statistical and machine learning
   methods to evaluate human exposure and environmental  health data.  Target data
   include multiple types of biomarker and environmental exposure information.

   The second part of the charge question  addresses the models' ability to help identify
   susceptible populations and compare the risks to those  populations with the risks to
   the general population.

   Ultimately, these and other models within NCCT and outside the Agency can help
   identify susceptible populations.  Appropriately, models currently are being
   developed for use in computational toxicology.  Within 3-5 years, some of these
   models likely will be sufficiently developed and validated to address susceptibility.
   "Susceptible populations" may be defined to include life stages, gender, race,
   socioeconomic group, species, and geographic distribution.

   Response: ORD accepts this endorsement and will continue in its computational
   modeling activities to consider this an important goal.

   The last part of this question asked whether there was sufficient coordination between
   model development and associated data to avoid having the models being either over-
   or under-determined.

   Overall, data collection appears appropriately coordinated with model development.
   It will be important to validate models based on genomic methodologies given the
   inherent constraints in sample sizes, and other challenges, with these approaches.


-------
January 2007 ORD Response to BOSC December 2006 Computational Toxicology Final
                                  Letter Report
   Response: ORD agrees and recognizes the importance and challenge of validating
   and testing of all models.  NCCT modelers have established close working
   collaborations with laboratory biologists and chemists who are conducting many of
   the experiments or gathering and using existing data for model building and testing.

5.  The fifth charge question addressed whether the Computational Toxicology
   Implementation Plan described an achievable roadmap and set forth realistic
   milestones and outputs.

   Each of the research areas is active. NCCT has a core strength in modeling, and is
   expanding its expertise in informatics. The Center is leveraging its position by
   outreach to other EPA labs and programs via internal research funding and
   communities of practice, and externally via STAR grants and the external
   bioinformatics centers.  The addition of the informatics centers in particular
   strengthens NCCT's research in information technologies. This will be strengthened
   further through the hiring of NCCT staff with informatics expertise.  The STAR grants
   greatly expand NCCT's capacities in the generation of high-information-content data
   sets that will be needed to support model development.

   There are  still some challenges that will need to be overcome in the areas of database
   development and management. More details are provided in our response to
   question 2. This will be especially important in the development and demonstration
   of biological models derived from complex data sets.

   The research  has milestones with nearer term and longer term time horizons, which is
   appropriate.  It is clear that chemoinformatics tools and prioritization tools are well
   underway  and are likely to be applied by risk assessors and regulators within the next
   few years.

   Response: Some challenges remain that will need to be overcome in the areas of
   database development and management.  More details are provided in our response to
   Question 2. This will be especially important in the development  and demonstration
   of biological models derived from complex data sets. The Center will do whatever it
   can, within the boundaries of the grant process, to foster coordination of efforts
   between the two external bioinformatics  centers and NCCT's internal program.

   The BOSC recommends the NCCT develop a more detailed work plan for the virtual
   liver model, and that this plan be more extensively reviewed by the Computational
   Toxicology Subcommittee during its next annual review.

   Response: Development of the virtual liver model has gained momentum with the
   hiring of Dr. Imran Shah, a Title 42 Computational Systems Biologist. He has been
   leading biweekly discussions with relevant staff members from NCCT, NHEERL,
   NERL and NCEA to articulate reasonable goals and expectations  for this effort.
   NCCT proposes we schedule a teleconference with the BOSC in the third quarter of
   2007 to present a briefing and lead a discussion on development of the Virtual Liver
   activity.

6.  The sixth charge question addressed the depth and breadth of the resources
   directed at fulfilling the Implementation Plan.

   The subcommittee believes that the research program covers the range of thematic
   areas.  Some areas, however, have deeper coverage than others.  The areas of
   cumulative risk assessment and cross-species extrapolation are still under-
   represented, but given the state-of-the-science, it is appropriate to place limited
   emphasis on these areas for the next 3-5 years. The staffing additions in HTS,
   toxicogenomics, and biological modeling are all strong and have improved the
   strength and breadth of NCCT. The planned staff additions in bioinformatics will be
   critical to the continuing success of the Center. One of these additions should have
   strong skills in data management systems.

   Response: We agree, and the recently hired Title 42 scientists are filling two
   critical gaps in our expertise.  Their contribution will be evident when we brief the
   BOSC on our information management and virtual liver programs.  Together, they
   provide expertise in informatics and advanced computational methods, and are
   working in key areas for the NCCT. We have reserved a more junior level  position to
   support the programming needs of these two members, and are in the process of
   re-orienting the support provided to us by the Environmental Modeling and
   Visualization Laboratory of the Office of Environmental Information, which has been
   supplying a variety of support activities to the Computational  Toxicology Program
   for the past two years. Finally, we are in  advanced discussions with a senior level
   scientist in the area of toxicogenomics. We will know shortly whether this additional
   Title 42 position within the NCCT will provide senior leadership in genomics.

7.  The seventh charge question asked about evidence that NCCT is being
   responsive to program and regional office needs.

   Most of the presentations addressed program office input in planning priorities and
   approaches.

   Some projects were formed to support program office issues, such as carbamate
   cumulative risk, DSSTox, and RefToxDB. The Subcommittee noted program office
   and regional office staff as co-principal investigators on various projects. The
   Implementation Plan references a role for the Computational Toxicology
   Implementation and Steering Committee (CTISC), which could be useful, if sustained.

   Response: ORD thanks the committee for its response and encouragement. While
   ORD recognizes the usefulness of the  CTISC, its role is being reevaluated to
   determine if, in its current state, this is the most effective manner to ensure wide
   involvement and support from the Program and Regional Offices as  well as others.
   As mentioned in our discussion under Question 1, we are engaging many other
   opportunities, including the Communities of Practice, to this end. We are also taking
   the opportunity to brief various EPA groups about the research program, with recent
   presentations to the Science Policy Council, the Regional Risk Assessors (which
   consists of both EPA and state risk assessors involved in Superfund sites), and an
   upcoming presentation to the Office of Pesticide Programs.

8.  The eighth charge question dealt with communication issues.

   NCCT has components of both a research and service center—it both initiates and
   receives new ideas.  For a young organization, NCCT has done very well in
   establishing communication with its collaborators, contractors, and some
   stakeholders.  The establishment of CoPs and participation of internal clients is a
   good start to communication within the Agency. Also of note is NCCT's
   establishment of monthly videoconference presentations.  Most of the other
   communication activities seemed to be investigator-initiated.  Given that the Center
   plans to develop tools and methods that will be used by ORD and other EPA staff,
   NCCT should establish a regularly scheduled plan for communication and updates.
   This process will convey the sense that new ideas are welcomed by NCCT and allow
   NCCT to accept ideas and be aware of the needs of the program offices, regional
   offices, and stakeholders.  The establishment of such a process will enhance the
   marketing of tools and methods developed by NCCT.  One way to give Agency clients
   part ownership in the Center is to invite them to BOSC reviews, such as this, and ask
   them to share how they are using NCCT's methods, tools, and information.  The
   Subcommittee recommends that NCCT communicate with the Regional Risk
   Assessor's Office and seek its representation. Within the past year, NCCT has
   commendably given 21 presentations to various offices within EPA to raise
   awareness.

   Response: We agree and NCCT is paying close attention to this. Staff are regularly
   looking for and finding opportunities to interact with other scientists, organizations,
   and Agency programs.  In addition to scientific publications and presentations,
   feature articles are often written, such as one in the January, 2007 issue of EM
   highlighting the research activities of NCCT. NCCT is in the process of enhancing
   the Computational Toxicology website to communicate more effectively and in a
   more timely fashion. A senior ORD communications staff member and an intern in
   the communications office are working with NCCT to develop, publish, and
   disseminate appropriate messages. We expect to be releasing periodic updates on
   progress in implementing the ToxCast program, and we just completed a fact sheet
   describing the Interagency Agreement we just signed with NCGC/NIH.  This is the
   first tangible component of the ToxCast program.  As the various supporting
   contracts are awarded over the next six months, we will be posting updates on our
   website.  We  also will be using the upcoming International Science Forum on
   Computational Toxicology to engage a large number of Agency scientists.  Finally, as
   noted above, at their request we briefed the Regional Risk Assessors on the program,
   and received a number of emails following the presentation asking for additional
   details.


-------
9.  The ninth and final charge question asked if the current research program was
   designed to achieve environmental outcomes and how those outcomes could be
   measured.

   The current program is designed to achieve environmental outcomes that are
   appropriate to the Agency. Potential measures to determine these outcomes include:

       -  Use of screening models for chemical prioritization.
       -  Validation and use of genomics-associated biomarkers in field studies.
       -  Use of computational models in the risk assessment process in the long term.
       -  Success of databases (DSSTox, pesticides) in cleaning up and organizing
          disparate databases and making them widely useful to environmental science
          and regulatory communities.
       -  Use of specific models (such as virtual liver, pyrethroid metabolism,
          macromolecular modeling, physiologically based pharmacokinetic (PBPK)
          models, steroidogenesis models, cumulative risk models, and so forth) by
          broader environmental science and risk assessment communities.

   Response: ORD thanks the committee for these suggestions. NCCT is looking to
   develop specific ways to regularly gather information to apply to those measures.
   Progress and results will be shared with the committee at future meetings. Our
   current thinking is it would be best to engage the BOSC over the next year on specific
   projects, particularly ToxCast, the Virtual Liver, and our Information Management
   plans.  As these are programs still in rapid phases of evolution,  dialogue with the
   BOSC would be  beneficial to use in refining their approaches.  At the discretion of
   the BOSC, these could be done either in individual teleconferences over the next 6-9
   months, or at a face-to-face meeting, focusing on the three topic areas.

   We suggest the next all-encompassing review of the program be held in the first half
   of 2008. At that  time, we would have made considerable progress on a number of
   research fronts that would allow us to change the main purpose of the review from
   reviewing strategic directions to analyzing the research outcomes.


-------
Computational Toxicology Program
Summary of BOSC Comments and Recommendations from December 2006 Letter
Report and Proposed ORD Actions	
Recommendation: Establish a Community of Practice (CoP) for Cumulative Risk.
Action Items: We favor the idea. We have realized from the activities of the existing
CoPs that they function best when aligned along a well-defined issue and have a
commonly identified goal.  Upon further reflection since the BOSC review, our current
opinion is that we need to better refine the issue and goal of a dedicated Cumulative
Risk CoP, and we hope to work with other Agency scientists to foster its
conceptualization and development.
Timeline: 2007 and 2008

Recommendation: NCCT should seek broader program office input. Additionally, CoPs
covering areas such as Mixtures, Cross-Species Extrapolation, Population/Systems
Dynamic Models, and Multimedia Fate and Effects Modeling should be considered for
either NCCT use or ORD's broader use.
Action Items: We are committed to an increased number and frequency of contacts and
meetings with Program and Regional Offices with formal presentations; other
noteworthy communication events include the International Forum on Computational
Toxicology, short training courses, and improved Web site communication.  Regarding
the establishment of additional CoPs, ORD is concerned that these could overtax the
small staff of NCCT. NCCT remains committed to supporting the three existing CoPs and
taking steps to ensure their vitality. For other CoP ideas, ORD will look across all
the Labs and Centers to see if relevant similar work groups and committees are
already established that could be amended to address such issues, and encourage
establishment of ones for which no precedent can be found.
Timeline: 2007

Recommendation: Integration of the external Bioinformatics Centers and the programs
within NCCT will occur following hiring of one senior and one junior bioinformatics
scientist. This may not represent sufficient personnel, however, to allow NCCT to
fully support its overall mission.
Action Items: NCCT continues to work with NCER to ensure that the Bioinformatic
Centers enhance the current state of the science in this critical research area. The
hiring of Dr. Richard Judson as a Title 42 Senior Bioinformatician in NCCT has
greatly facilitated the interactions with the Centers.  Dr. Judson coordinates a
monthly EPA-wide seminar program (Info on Informatics), which features one of the
project areas from one of the Centers. Regular visits between the Centers and
EPA/NCCT are scheduled and have begun.
Timeline: Initiated and on-going

Recommendation: NCCT needs to develop a more comprehensive strategic plan for data
collection, management, and integration through creation of databases that model the
structure of the underlying information and its potential use.
Action Items: Dr. Judson, working in conjunction with Dr. Imran Shah, our Title 42
Computational Systems Biologist, has taken the lead in developing an overall
framework for information management within NCCT.  In response to the strong
recommendation of the BOSC related to the need to adequately address this topic, we
propose a targeted briefing on our approach to information management and
information technology to the BOSC sometime in mid 2007.
Timeline: On-going; proposed briefing for subcommittee in mid-2007.




-------
Recommendation: There is a need within the field for trained personnel in
computational toxicology.  In addition to the existing postdoctoral program, one
feasible approach would be to institute a career development award similar to the
NIH "K" awards that would provide mentored training and research to more senior
personnel.
Action Items: We view this as one of our three critical functions (in addition to
providing a service function to other ORD researchers and conducting innovative
research on the use of computational models in risk assessment).  We will work with
appropriate human resource components within EPA to explore options for career
development training of other scientists.  We are also having advanced discussions
within the NCCT on hosting several advanced training courses for EPA staff. Lead
topics are Physiologically Based Pharmacokinetic Models and Chemical Prioritization
Tools.
Timeline: Start in 2007 and continue beyond

Recommendation: Additional opportunities outside the mechanistic models (especially
in biomarkers that indicate exposure but that are not immediately or directly linked
to toxicological response) may exist to fulfill NCCT's mission.
Action Items: ORD is considering these and all types of model structures in its
developing model program. This is particularly true in NCCT's virtual liver project
currently under design.
Timeline: On-going

Recommendation: It will be important to validate models based on genomic
methodologies given the inherent constraints in sample sizes, and other challenges,
with these approaches.
Action Items: We agree and recognize the importance and challenge of validating and
testing of all models. The NCCT modelers have established close working
collaborations with laboratory biologists and chemists who are conducting many of
the experiments or gathering and using existing data for model building and testing.
Timeline: On-going

Recommendation: NCCT should develop a more detailed work plan for the virtual liver
model, and this plan should be more extensively reviewed by the Computational
Toxicology Subcommittee during its next annual review.
Action Items: Development of the Virtual Liver has gained momentum with the hiring
of Dr. Imran Shah, a Title 42 Computational Systems Biologist. He has been leading
biweekly discussions with relevant staff members from NCCT, NHEERL, NERL and NCEA to
articulate reasonable goals and expectations for this effort.  ORD is committed to
briefing the BOSC on the status of this project in mid 2007.
Timeline: 3rd Quarter of 2007 for a teleconference on plans and progress of the
virtual liver project with committee.
Recommendation: NCCT should establish a regularly scheduled plan for communication
and updates. NCCT should invite Agency clients to BOSC reviews, such as this, and
ask them to share how they are using NCCT's methods, tools, and information. The
Subcommittee recommends that NCCT communicate with the Regional Risk Assessor's
Office and seek its representation.
Action Items: Staff are regularly looking for and finding opportunities to interact
with other scientists, organizations, and Agency programs. NCCT is in the process of
enhancing the Computational Toxicology web site to communicate more information in a
more timely fashion. A senior ORD communications staff member and an intern in
communications are now working with NCCT to develop, publish, and disseminate
appropriate messages. We expect to be releasing periodic updates on progress in
implementing the ToxCast program, and we just completed a fact sheet describing the
Interagency Agreement we just signed with NCGC/NIH. As the various supporting
contracts are awarded over the next six months, we will be posting updates on our
website.  We also will be using the upcoming International Science Forum on
Computational Toxicology to engage a large number of Agency scientists. Finally, as
noted above, at their request we briefed the Regional Risk Assessors on the program,
and received a number of emails following the presentation asking for additional
details.
Timeline: On-going and continuous

Recommendation: Committee suggested several potential measures to determine outcomes.
Action Items: NCCT is looking to develop specific ways to regularly gather
information to apply to those measures. Progress and results will be shared with the
committee at future meetings.  Our current thinking is that it would be best to
engage the BOSC over the next year on specific projects, particularly ToxCast, the
virtual liver, and our information management plans. As these are programs still in
rapid phases of evolution, dialogue with the BOSC would be beneficial to use in
refining their approaches.  At the discretion of the BOSC, these could be done either
in individual teleconferences over the next 6-9 months, or at a face-to-face meeting
focused on the three topic areas.
Timeline: 2007 and beyond




-------
                       BOARD  OF  SCIENTIFIC  COUNSELORS
                                  September 16, 2008

                                  Dr. George Gray
                                  Assistant Administrator
                                  Office of Research and Development
                                  U.S. Environmental Protection Agency

                                  Dr. Robert Kavlock
                                  Director
                                  National Center for Computational Toxicology
                                  U.S. Environmental Protection Agency

                                  Dear Drs. Gray and Kavlock:

                                  This is a letter report from the Board of Scientific Counselors (BOSC)
                                  reviewing the National Center for Computational Toxicology (NCCT). The
                                  Computational Toxicology Subcommittee of the BOSC Executive
                                  Committee reviewed NCCT's progress and plans during a 2-day meeting
                                  held December 17-18, 2007, at the EPA facility in Research Triangle Park,
                                  North Carolina.  The BOSC Subcommittee consists of George Daston
                                  (Chair), James Clark, Richard DiGiulio, Muiz Mumtaz, John Quackenbush,
                                  and Cynthia Stokes.

                                  This is the third review of NCCT conducted by the BOSC. The
                                  Subcommittee was very pleased with the progress that the Center has made
                                  towards its goals. NCCT first became operational in February 2005; during
                                  the 2.5 years between its establishment and this review, NCCT has made
                                  substantial progress in establishing priorities and goals; making
                                  connections within and outside EPA to leverage the staff's considerable
                                  modeling expertise; expanding its capabilities in informatics; and making
                                  significant contributions to research and decision-making throughout the
                                  Agency. We are pleased to see that informatics tools developed by the
                                  Center already are being used by program offices, and that the program
                                  offices are taking advantage of the expertise of the Center in developing
                                  critical elements for risk assessment, such as a biologically-based
                                  dose-response (BBDR) model for arsenic, an environmental contaminant of
                                  considerable public health importance. Many of the recommendations made
                                  by BOSC during its earlier reviews have been acted on by NCCT. This
                                  includes improved capabilities in bioinformatics through the funding of
                                  two external centers and in informatics and systems biology through staff
                                  hires; expansion of its
         A Federal Advisory Committee for the U.S. Environmental Protection Agency's Office of Research and Development

-------
               September 2008 BOSC Computational Toxicology Letter Report


technical approaches to even more programs within the Agency; and the formation of an
extensive collaboration with the National Institute of Environmental Health Sciences (NIEHS)
and the National Human Genome Research Institute (NHGRI) for its ToxCast project.

The purpose of the December 2007 review was to continue to provide NCCT with advice on the
progress the Center has made, in the past year, in fulfilling its mission and strategic goals. In
particular, the Subcommittee addressed six charge questions for five NCCT activities (ToxCast,
Informatics Technology/Information Management (IT/IM) activities, Virtual Liver,
Developmental Systems Biology, and Arsenic BBDR). The Subcommittee's responses to these
questions follow.

Charge Question 1: Does the scope and involvement of expertise in the project reflect activities
consistent with the function of a Center?

The NCCT was founded only a few years ago and has been achieving a critical mass of expertise
through selective hiring, external grants, and the formation of connections with other groups of
experts within EPA. The purpose of this question was to gauge the progress of the Center in
achieving the level of expertise needed to pursue its mission.

The staff working in NCCT and those  scientists involved from outside the Agency who are
working as collaborators are highly qualified in various aspects of computational toxicology. The
Center's effort to solidify formal agreements in terms  of memoranda of understanding (MOUs),
cooperative research and development agreements (CRADAs), etc., with various organizations
has opened up a diversity of quality opportunities to leverage and enhance Office of Research
and Development (ORD) efforts. A timely  example is the February 14, 2008, announcement of
the collaboration between NIEHS, NHGRI, and EPA's NCCT.  As described in the press release,
this collaboration leverages the strengths of each group to use high-speed, automated screening
robots to  test suspected toxicants using cells and isolated molecular targets.

The staff and collaborators at the center have the appropriate expertise and insights. The utility of
the tools and deliverables can be enhanced if the staff moves toward being more explicit on how
the tools under development support EPA risk assessments. Some of the ORD researchers seem
to be searching for an application for their sophisticated tools, and discussions with Agency staff
practicing risk assessments (Office of Pollution Prevention and Toxics [OPPT]; Office of Water,
Office of Wastewater Management; Office of Prevention, Pesticides, and Toxic Substances
[OPPTS], etc.) could provide direction as to the appropriate milestones and deliverables for these
efforts. The BOSC reviews and the Center would benefit if representatives from these Agency
offices attended BOSC reviews to ensure that all parties understand how NCCT's efforts address
the most relevant needs of the Agency. The BOSC wants to ensure that this advice is seen as
encouragement to reach out to risk assessment practitioners. The ongoing work in developing the
analytical approaches and information databases is of high technical quality, as the Center staff
and collaborators are working on many new and exciting approaches. By holding research
planning  discussions with risk assessment practitioners, the applications of the computational
toxicology tools and resources can be directed to ensure the most relevant and efficient use of
data and models.

-------


One challenge for the center staff involved in developing informatics datasets will be to develop
efficient and effective ways to handle the wealth of data available in some areas to avoid
redundancies of data entries and to focus on the most informative data. Again, interactions with
various program offices and their risk assessment activities should provide a basis to set the
long-term goals for the Informatics/Data management team. This will allow the development of
structured short-term and mid-term activities needed to meet the long-term goals.

The BOSC noted that it remains somewhat unclear how the Center intends to use ToxCast and
associated analyses to approach risk assessment. For instance, species-to-species translation was
mentioned, and the data are being obtained from multiple species, not just humans, but how the
different species data will be reconciled was not discussed. Although the primary goal of the
ToxCast project is prioritization of chemicals for detailed risk assessment, not the risk
assessment itself, it is interesting to contemplate how the projected database and analysis might
be directly relevant. Similarly, it was noted that an early decision regarding ToxCast was that
ecology and paths of exposure were not going to be addressed in this project (at least not
initially). Nonetheless,  at several points, paths of exposure arose during the review because of
their obvious relevance. The Subcommittee is prompted to ask how they might be addressed in
future work.

The Subcommittee also noted that the means of using the eventual Virtual Liver models for
actual risk assessment at EPA is unclear. The BOSC encourages additional thought and efforts
along these lines, in collaboration with the appropriate EPA program office personnel. This is not
a criticism of the current project vision by any means, but because direct or indirect application
to risk  assessment would be a fantastic result, it seems prudent to consider the possibility earlier
rather than later.

Charge Question 2:  Are the goals and milestones suitably described, ambitious, and
innovative?

The purpose of this question was to determine whether NCCT is performing its mission of
providing novel approaches to the practical problems of toxicology and risk assessment that are
needed by EPA.

For the Center overall,  the answer to this question is "yes." In particular, the goals of the Center
are well-described, very ambitious and innovative, as well as important for the future of research
at EPA. The issue of "milestones" is somewhat more complex, in part due to the varying levels
of maturity for Center components. In most cases, previous accomplishments and current
activities are well described, but more detail concerning projected future milestones would be
helpful. It is recognized, however, that these projects are very innovative and substantial
flexibility is appropriate. This is particularly true for less mature but highly creative projects such
as the Virtual Liver and Virtual Embryo. Also, in considering goals and milestones, it may be
appropriate to consider the timely integration of each project's accomplishments into the
Agency's  risk assessment activities. In the following paragraphs, Charge Question 2 is addressed
in the context of the five major Center activities discussed at the review meeting.

-------


ToxCast. ToxCast is the most mature of the Center's projects, which is appropriate considering
that it is most central to NCCT's overall mission. The goal of this project, to provide a cost- and
time-efficient methodology for screening and prioritizing chemicals of concern to the Agency
(~11,000 by current estimates), is suitably ambitious and is well described.  Progress on this
project has been very strong and these accomplished milestones are very well described. Future
plans for the project also are well described, although a more detailed timetable for milestones
past 2008 would be helpful.

IT/IM Activities. This project is highly symbiotic with ToxCast, and the success of one is highly
dependent on the success of the  other. By extension, the success of the NCCT overall depends
upon the success of these two projects. The comments above for ToxCast also pertain to the
Informatics project. The project is highly and suitably ambitious, and its goals and substantial
progress are well described. Again, future plans are described well in a general way, but more
detail concerning future milestones (beyond 2008, which is well described) would be
appropriate.

Virtual Liver. Although narrower in scope than the foregoing projects, the Virtual Liver project
is very ambitious; it also is relatively young, apparently becoming fully operational with the
arrival of Dr. Imran Shah in September 2006. Its fit with the goals of NCCT is perhaps less clear
than the previous two projects; it is more "visionary" in nature, and less directly  applicable to
risk assessment, as described by one of the EPA scientists involved. The goals of the project  and
the nature of research to be performed to achieve those goals are clearly described. There is some
concern that this project may be overly ambitious. It may be helpful if key objectives were
delineated and prioritized,  perhaps indicating achievements that are critical to the success of  the
project and those that are highly desirable. Milestones for tracking the project's progress are  not
apparent, particularly in later years  (3-5). This relatively young and very innovative project
requires considerable flexibility, however, so the lack of detailed milestones in later years is very
reasonable.

Developmental Systems Biology (Virtual Embryo). This project is at a substantially earlier
stage than the Virtual Liver project; it is led by Dr. Thomas Knudsen who joined NCCT in
September 2007. The issues of goals and milestones are essentially the same as for the Virtual
Liver, that is, strong on the former,  but understandably weaker on the latter. It is  the
Subcommittee's expectation that a more concrete research plan with goals and milestones will be
developed over the coming months.

Arsenic BBDR. This also  is a relatively new effort, with planning beginning in 2006. This
project is unusual among NCCT projects in that it is oriented toward a specific chemical with a
specific issue (Safe Drinking Water Act revisions) rather than an approach  developed with
diverse chemicals in mind. However, this project is likely to inform the eventual  development of
other biologically-based dose-response models and their application to risk assessments by the
Agency. Thus, in addition to informing the  controversial issue of arsenic risk assessment, the
project is more broadly relevant to the mission of the NCCT. The goals of the project are very
clear and well described. Milestones, however, are not stated, and may be particularly important
for this project, which has a clear deadline (2011) in order to be useful for the 2012 Safe
Drinking Water Act review cycle.

Charge Question 3: Are there significant gaps in the approach that can be pointed out at this
point in the evolution of the project?

ToxCast. Dr. Dix and the ToxCast project contributors are commended for their progress in this
activity in terms of specification of desired data and the contracting of various entities to obtain
these data. The data acquisition is clearly well under way. The main gap noted is relevant to both
the ToxCast and IT/IM activities. Specifically, the Subcommittee notes that the structural
specification of the database for compilation and rigorous quantitative analysis of the ToxCast
data remains unclear. Because the data types are highly heterogeneous and the dataset is very
large, developing these structural specifications will be a challenge that the Subcommittee
suggests should be addressed as soon as possible. The IT/IM team acknowledges that this area is
a significant challenge (e.g.,  the description in the write-up provided to  the Subcommittee prior
to the review meeting). One  suggestion is that the ToxCast team compile a list of some specific
use cases, for example, specific questions that they intend to address with the database. This will
help make concrete the needed database attributes that will allow the analysis for the chemical
prioritization that is the end goal of the ToxCast project.

IT/IM Activities. The IT/IM activity group has clearly made significant progress since the last
BOSC Subcommittee meeting in terms of specification and development of various software and
database tools for storing and accessing various toxicology data in existence as well as being
generated (e.g., in the ToxCast project). The fact that their ToxRef database and utility are being
used already by the Office of Pesticide Programs  to retroactively explore its own data
demonstrates early utility and applicability beyond NCCT itself. The major gap noted for this
activity was described in the ToxCast project section above. In addition, finding an efficient and
effective methodology for extracting data from text sources was a concern for the Subcommittee.
A trial of natural language processing (NLP) for pulling information into some of the databases
was described. The Subcommittee notes that this method has been attempted with limited
success by various research groups for perhaps two decades and therefore encourages the
exploration of other possible approaches as well.

Virtual Liver. Dr. Shah and his group are commended for having good command of the
significant breadth of biology, toxicology, and modeling that impact the project. In addition, the
"big picture" vision described is useful—there are many important questions in the field and not
limiting the vision too early is appropriate. The Subcommittee believes  that this should be
balanced, however, with some very specific goals, milestones, and timelines for the next few
years that are clearly attainable with the resources at hand in order to assure some useful concrete
outcomes. In a project with this possible magnitude, it can be tempting to try to do everything,
both in terms of the various project approaches (knowledgebase (KB), biological modeling,
dosimetry modeling, etc.) as well as the scope within any one approach (breadth of the KB,
breadth and detail of every model, etc.), and thereby end up with little actually completed. One
suggestion is that Dr. Shah and  the group develop a short prioritized list of specific scientific
research questions relevant to EPA's goals that they desire to address as soon as possible, and
use this to focus first iterations of development of both the KB and model(s). More explicit
milestones and goals for these highest priority questions then can be developed. Later iterations
of KB development and modeling can add scope (breadth/depth) to allow NCCT to address
additional research questions. The Virtual Liver activity will result in models of parts of the
biology being developed simultaneously and presumably by different individuals. Because the
idea is to integrate these models eventually to predict effects from molecular function to
physiologic outcome, the compatibility of the models is paramount. Dr. Shah indicated that he is
cognizant of and planning to manage this issue, for instance, by looking into the efforts of the
international Physiome Project. The Subcommittee members note that, to their knowledge, the
Physiome Project has addressed the issue of a common coding language quite extensively but does
not appear to have addressed more subtle but critical compatibility issues
concerning biological and mathematical specifications among models, such as compatibility of
assumptions, equilibrium approximations, time scales, and so forth. Hence, beyond managing
compatible coding, the activity group is encouraged to actively plan for and manage on an
ongoing basis the specifications that must be shared among models so as to produce
compatibility when it is needed.
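
One way to make such shared specifications explicit is to record them alongside each model and compare them before any integration is attempted. The following is a minimal sketch in Python under stated assumptions: the `ModelSpec` record, its fields, and the example values are hypothetical illustrations and do not describe the Virtual Liver project's actual models or the Physiome Project's formats.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical record of the specifications a model must share with others."""
    name: str
    time_unit: str              # time scale the model is written in, e.g. "s" or "h"
    concentration_unit: str     # e.g. "uM"
    assumptions: frozenset = field(default_factory=frozenset)


def compatibility_issues(a: ModelSpec, b: ModelSpec) -> list:
    """Return human-readable mismatches between two model specifications."""
    issues = []
    if a.time_unit != b.time_unit:
        issues.append(f"time unit: {a.time_unit} vs {b.time_unit}")
    if a.concentration_unit != b.concentration_unit:
        issues.append(
            f"concentration unit: {a.concentration_unit} vs {b.concentration_unit}"
        )
    # Assumptions (including equilibrium approximations) held by only one model.
    for item in sorted(a.assumptions ^ b.assumptions):
        issues.append(f"assumption not shared: {item}")
    return issues


# Illustrative placeholder models, not real project components.
cell_model = ModelSpec("cell", "s", "uM", frozenset({"quasi_steady_state_binding"}))
tissue_model = ModelSpec("tissue", "h", "uM")

for issue in compatibility_issues(cell_model, tissue_model):
    print(issue)
```

A check of this kind does not resolve a mismatch, but it surfaces differences in time scale or assumptions before two models are coupled rather than afterward.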

Developmental Systems Biology (Virtual Embryo).  This project is very early in its
development, and already shows interesting progress based on the continuation of the earlier
work by Dr. Knudsen. Because the data needs of the proposed models may be significant, the
Subcommittee notes that it will be critical to identify and enlist appropriate supporters and
collaborators to provide such data. The track record of the principal investigator suggests that
this will develop naturally.

Arsenic BBDR. No specific technical gaps in the approach were noted for this activity. Because
the goal is to use the project's resulting model(s) for the 2012 review cycle of the Safe Drinking
Water Act, the Subcommittee encourages continuous communication with the appropriate
program office personnel so that concerns, objections, and skepticism can be addressed early and
on an ongoing basis. The group is commended for having such communication already in place
and it is encouraged to maintain that communication to the greatest degree possible.

Charge Question 4: Does the work offer to significantly improve environmental health impacts
and is the path toward regulatory acceptance and utilization apparent?

This question was included so that the Subcommittee could provide an opinion as to the potential
for NCCT's research to impact decision-making by the programs and offices that administer
regulations that are important for public and environmental health.

The Subcommittee believes that the work being reviewed has the potential to significantly
improve a number of aspects of the risk assessment process, and in so doing will lead to
substantial improvements in environmental health. As noted in the responses to previous
questions, the programs under review are at different levels of maturity and will deliver results at
different time points. The potential to improve the public and environmental health protection
role of the Agency, however, is enormous. These improvements will come in the form of better
tools for the prioritization of chemicals to evaluate and assess, early insight into the potential
toxicity of new substances by improved capabilities of searching for structural analogs for which
data already exist, better understanding of the fundamental molecular processes that underlie
toxicity and variability in response, and better methods for incorporating that information into
risk assessment. As with the other responses, each project that was reviewed will be discussed in
separate paragraphs below (with the exception of the Arsenic BBDR, which will be addressed
under Charge Question 4a).

ToxCast. This project already has begun to generate a considerable database on the cellular and
biochemical effects of approximately 300 well-studied chemicals, mostly pesticides. The
advantage of choosing this set of chemicals, nominated by the Office of Pesticide Programs, is
that they already have been assessed for their potential to cause toxicity using a comprehensive
set of toxicity tests. This will provide the phenotypic anchoring for responses that are observed in
the high-throughput and other test methodologies that ToxCast is employing. ToxCast has the
goal of providing a scientific foundation for predicting the potential hazards of chemicals by
evaluating the responses of relevant molecular and cellular markers in simpler experimental
systems.  This will lead to an improved ability to prioritize testing, better test methods, testing
strategies that are tailored to the chemical being tested, and perhaps ultimately to the replacement
of existing test methods with ones that are not encumbered with much of the uncertainty inherent
in traditional toxicity tests. In order to reach this potential, ToxCast will need to generate a lot of
new data. The recently  announced collaboration between NCCT, NIEHS, and NHGRI will
accelerate progress in this area and is a wise use of limited resources.

IT/IM Activities.  The database and software development has been outstanding. ToxRef
already is in use at the Office of Pesticide Programs and is allowing toxicologists and risk
assessors to query large databases of chemical  structures for common toxicological properties.
Relational databases of this type provide novel opportunities for risk assessors to consider the
potential biological activity of new chemicals instead of just production volume (or other
surrogates of potential exposure) in prioritizing them for further evaluation and testing.  This
already is a major achievement with practical applications.

Virtual Liver and Virtual Embryo. These programs have longer time horizons but have
significant potential to improve risk assessment. The liver is a common target organ for toxic
agents, and is a primary site of metabolism of xenobiotic compounds. Adverse effects on
embryonic development usually are irreversible, and the economic and emotional consequences
of adverse developmental outcome are significant. Therefore, the choice of these two systems for
intense investigative and modeling approaches is appropriate for an agency interested in the
public health consequences of toxicant exposure. As noted in previous responses, these  programs
will need to progress a little farther before enough of a scientific foundation is created to
accurately determine how they will  be incorporated  into the risk assessment process. It already is
clear that the information being generated will be important in reducing the uncertainty
associated with determining which chemicals pose hazards, variability in susceptibility in a
heterogeneous population, and other critical questions.

Charge Question 4a: In addition, specifically for the Arsenic BBDR project:
Does the proposed computational model have the potential to identify and reduce
uncertainties with the risk assessment process?

The answer to this  question is yes, depending on data gaps identified and resources made
available. This study might not give all the answers but will get us halfway there. EPA
recognizes that developing a universal  arsenic model describing several cancer endpoints is  a
formidable challenge. Hence a step-wise research project with an eye for the future is proposed.
Initially, a generic model for cancer will be developed that will incorporate key steps of the
mode of action commonly shared for multiple cancer types such as oxidative stress. This model,
in turn, will serve as an engine to develop specific cancer models as the need arises and resources
become available. To ascertain whether appropriate steps are being incorporated, a thorough
literature review of experimental and epidemiological data and expert consultation has been
proposed. It also is acknowledged that, although many data are available, they are somewhat
weak for generating exposure time-course response curves. Appropriate experiments have been
proposed to fill the research needs to develop a realistic model.

Charge Question 4b: Will the model be able to help identify susceptible populations and
compare potential risks in those populations with less susceptible populations?

Yes, the initial generic model development exercise will allow identification of issues such as
mechanisms that operate in general versus subpopulations, such as susceptible populations with
varying degree of arsenic methylation.  Such issues could be the subject of workshops to explore
the issue of the extent of polymorphism in the human population.

The short-term (1-2 years) goal is the establishment of a coordinated program of laboratory
research to generate essential data needed to develop  a BBDR model that will increase
confidence in the predictions. To start with, the model development will be initiated with
available data. Work proposed includes multistage clonal growth modeling, target tissue
dosimetry, and methylated metabolites  of arsenic.

The long-term (3-5 years) goal of developing a robust version might be too optimistic.  As the
project gets underway, new  questions and issues might be identified that will require additional
laboratory research and continued resources. The project has a good future, as it can be easily
adapted to the latest (2007) National Academies toxicity testing report, which recommends
integrating systems biology and computational tools.

Charge Question 4c: Is coordination between model development and associated data
collection sufficient to avoid problems with models being either over- or under-determined?

Yes, it is desirable to see what health effects are caused at lower doses to avoid the potential of
compromise in setting an arsenic standard based on cost-benefit analysis.

Charge Question 5: Have appropriate data management and analysis tools been incorporated
into the project?

Previous reviews have highlighted the importance, as well as the challenges, of developing
useful relational databases. This question was included so that the Subcommittee could evaluate
the Center's progress in developing and implementing strategies for data management and
analysis.

ToxCast.  With regard to ToxCast, NCCT has made great progress in the past 18 months in
hiring bioinformatics and computational biology  scientists and staff members to establish the
infrastructure necessary to begin meeting the needs of the program. The challenges here also are
the strengths of ToxCast: the diversity of the data that it will generate and the need to
effectively organize that information to facilitate its analysis and interpretation. The approach
taken by Dr. Richard Judson and his group is a sensible one given the state of the field:
information from each technology with which data will be generated will be captured in a
technology-specific database, and this information ultimately will be collected in a central data
warehouse linking the information together. The advantage of this solution is that it allows the
data from each assay to be stored in a rational format while deferring the question of how the
information will be combined to address questions relevant to the mission of EPA.
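
The arrangement described above, with technology-specific stores linked through a central warehouse, can be illustrated with a minimal sketch using Python's built-in `sqlite3` module. Every table name, column name, and data value below is a hypothetical placeholder chosen for illustration; nothing here reflects NCCT's actual schema or any ToxCast result.

```python
import sqlite3


def build_warehouse(conn: sqlite3.Connection) -> None:
    """Create two hypothetical technology-specific tables and a linking view."""
    conn.executescript(
        """
        CREATE TABLE chemical (chem_id TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE hts_assay (chem_id TEXT, assay TEXT, ac50_um REAL);
        CREATE TABLE in_vivo_study (chem_id TEXT, species TEXT, endpoint TEXT);
        -- The "warehouse" is a view that links each store by chemical identifier.
        CREATE VIEW warehouse AS
            SELECT c.chem_id, c.name, h.assay, h.ac50_um, v.species, v.endpoint
            FROM chemical c
            LEFT JOIN hts_assay h ON h.chem_id = c.chem_id
            LEFT JOIN in_vivo_study v ON v.chem_id = c.chem_id;
        """
    )


def load_placeholder_rows(conn: sqlite3.Connection) -> None:
    """Insert made-up rows so the example can be queried."""
    conn.executemany(
        "INSERT INTO chemical VALUES (?, ?)",
        [("CHEM-001", "placeholder A"), ("CHEM-002", "placeholder B")],
    )
    conn.executemany(
        "INSERT INTO hts_assay VALUES (?, ?, ?)",
        [("CHEM-001", "assay_x", 1.5), ("CHEM-002", "assay_x", 40.0)],
    )
    conn.executemany(
        "INSERT INTO in_vivo_study VALUES (?, ?, ?)",
        [("CHEM-001", "rat", "liver_effect")],
    )


def chemicals_with_endpoint(conn: sqlite3.Connection, endpoint: str) -> list:
    """Example use case: which chemicals have a given in vivo endpoint recorded?"""
    rows = conn.execute(
        "SELECT DISTINCT chem_id FROM warehouse WHERE endpoint = ? ORDER BY chem_id",
        (endpoint,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
build_warehouse(conn)
load_placeholder_rows(conn)
print(chemicals_with_endpoint(conn, "liver_effect"))
```

Even in this toy form, the linking view has to commit to a join key and to the questions it will answer, which is why the choice of use cases shapes the warehouse design.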

The construction of the warehouse remains an open question. Ultimately, a database is a model
of the interactions that exist in the underlying data and the relationships relevant to the analysis
that will be performed. The diversity of the data, representing a wide range of in vivo  and in vitro
assays from multiple species, makes building such a model a significant challenge. The project
seems to lack a set of analytical objectives necessary for building the relevant use cases
that will inform the process of database construction and, in turn, determine the
database's utility. At this stage, ToxCast needs to begin to define analytical outcomes in order
to set goals and milestones with regard to developing and validating analytical protocols. This is
an essential  step at this point as it will  help to anchor future development and make it relevant.
This also will help to define the requirements of the interfaces that are built to access  the data.

Further, the ToxCast group should be encouraged to release the data and databases at the earliest
possible time and to consider a "CAMDA-like" workshop in which the research community is
offered access to the data with the challenge of using the data to effectively predict end points.
At least three advantages to the program will be derived from these efforts. First, public release
will help to drive the creation of relevant use cases that will further database development.
Second, it will assist in evaluating data access protocols and tools to assure the greatest utility to
the research and regulatory community. Third, it will accelerate the development of predictive
algorithms to combine the data to make predictions about relevant phenotypic outcomes.

Virtual Liver. The Virtual Liver is a  very ambitious project designed to simulate molecular,
cellular, physiological, and organ-level computational models that ultimately can be used to
make predictions regarding the toxic effects of various compounds. To limit the scope of the
project to  something that might be manageable, its initial focus will be nuclear receptor-mediated
non-genotoxic liver cancer. The group should be applauded for this decision as it will give  staff
the opportunity to focus enough to make progress.

The starting point and first challenge will be the construction of a liver KB. In any domain, this
is a nontrivial problem and ultimately will require  linking information in the literature and a host
of public data resources. The use of publicly  available resources and tools and the commitment
to making the KB available are commendable not only because it will be widely useful to the
broader community, but also because it will accelerate the development and curation of the
information within the KB.

With regard to populating the KB, the use of NLP probably is not the best solution. NLP does
not work well with the scientific literature, and its  application in this  domain remains  an area of
active research. Application of NLP has the potential to introduce a great deal of noise in the
system, leading to many potential false associations that could lead to more problems than it
solves. Consequently, other methods, including expert or community curation, should be
explored.

On a larger scale, the greatest potential problem will be linking each of the domain-specific
models to build a predictive system. Again, this remains an area of active research and one that
may present significant barriers to developing verifiable solutions. The greatest challenges will
be to validate any models that emerge from the analysis.

Finally, there is a need to develop standards for interactivity and try to interface with developing
standards within the community.

Virtual Embryo. This project is in its early stages, with Dr. Knudsen only having arrived 3
months prior to this review. As such, it is still not well integrated with the overall NCCT
program, and in particular ToxCast. It remains to be seen how well it will eventually integrate
with the overall program, and its integration with other internal and external initiatives needs to
be resolved. Nevertheless, it appears that this project could provide an opportunity to explore the
results emerging from ToxCast,  and it may help direct selection of the next generation of
compounds for analysis in ToxCast.

Charge Question 6: How would you assess the outreach to other groups in executing the
projects?

Because of NCCT's limited size, it is vitally important that the Center be connected in
meaningful ways to other groups of experts who can augment the Center's capabilities. The
purpose of this question was to determine the Center's progress in making and leveraging
connections.

NCCT has done an admirable job in reaching out to other groups, both inside and outside the
Agency. Because of the relatively small size of the Center, outreach is important as a way of
augmenting its productivity. Outreach also is important in engaging others in understanding the
capabilities of computational toxicology, which will be crucial in  convincing program offices to
use the tools developed by the Center.  NCCT is doing a good job on both counts.

NCCT has been successful at developing partnerships at several levels. Within ORD, NCCT has
developed successful partnerships with the National Health and Environmental Effects Research
Laboratory (NHEERL) and the National Exposure Research Laboratory  (NERL), which can
conduct experiments and supply data for analysis and modeling by NCCT scientists. The Center
is tied into a number of the research activities in these laboratories, including the Endocrine
Disrupting Chemicals, Drinking Water, Safe Pesticides/Safe Products, and Human Health
Research Programs. NCCT has allocated a fraction of its resources toward the achievement of
goals within those ORD programs.

NCCT also has developed three  Communities of Practice (CoP) in chemi-informatics, biological
modeling, and categorization and prioritization. The purpose of the CoPs is to unite scientists
who have a common interest in an area in which NCCT is a center of excellence. The CoPs are
becoming a means of coordinating activity and communicating progress on an informal, grass-
roots level. Outreach also has taken place to program offices within EPA, especially the Office
of Pesticide Programs. This office is supplying data that are being used as part of the ToxCast
project, and is likely to be an early adopter of the predictive and priority-setting tools being
developed by the Center.

NCCT is doing a good job of joining forces with others outside the Agency, particularly at
NIEHS. The arm of the National Toxicology Program that operates at NIEHS has a strong
interest in high-throughput methods for predicting toxicity, a project that is complementary to
activity at the Center. NCCT and NIEHS have done a good job of information sharing and have
developed a constructive working partnership in which data and analysis methods will be shared.
NCCT also is establishing collaborations internationally, coordinated through the Organization
for Economic Co-operation and Development (OECD). OECD's project entitled "Molecular
Screening for Characterizing Individual Chemicals and Chemical Categories" has goals similar
to those of the ToxCast project. OECD has recognized that ToxCast can serve as a foundation for its
project and is developing an international consortium that will build on ToxCast. It is likely that
a number of nations and private companies will join this consortium in the coming year.
Furthermore, the recent MOU among NCCT, NIEHS, and NHGRI promises to be the most
important and extensive collaboration yet for ToxCast and NCCT. In summary, NCCT is doing
an excellent job at outreach, which in turn is enhancing its ability to fulfill its mission.

In conclusion, the BOSC Computational  Toxicology Subcommittee believes that NCCT is
making exceptional progress toward its mission. We are pleased to provide advice on this
important Center and look forward to future opportunities to provide timely advice to guide and
improve NCCT  and its programs.

Sincerely,
Gary S. Sayler, Ph.D.
Chair, BOSC

-------
               UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
                              WASHINGTON, D.C. 20460
                                                                          OFFICE OF
                                                                   RESEARCH AND DEVELOPMENT
Dr. Gary S. Sayler
Chair, Board of Scientific Counselors
The University of Tennessee
676 Dabney Hall
Knoxville, Tennessee 37996

Dear Dr. Sayler:

       The Office of Research and Development (ORD) would like to take this opportunity to
thank you and the members of the Board of Scientific Counselors (BOSC) for the December,
2007, progress review of the National Center for Computational Toxicology (NCCT). We
greatly appreciate the efforts of the members of the subcommittee who conducted the review,
Drs. George Daston, James Clark, Richard DiGiulio, Muiz Mumtaz, John Quackenbush, and
Cynthia Stokes.

       The subcommittee was requested to provide advice on the progress made by the NCCT,
and specifically on the ToxCast™ project, IT/IM activities, the virtual tissues projects, and
biologically based dose-response modeling for arsenic.

       Enclosed  is our response to the comments and recommendations in your letter of
September  16, 2008. Please feel free to contact me if further information is needed.

       We are pleased that the BOSC was very supportive of NCCT and the direction we are
taking in this  very important research program.  The guidance of the BOSC has been of great
assistance to NCCT in crafting the second generation  Implementation Plan. This plan, which
covers FY09-12, will be presented at the fourth BOSC review later this year.
       Again, thank you for your efforts on behalf of ORD.
                                 Sincerely,
                                 Kevin Teichman, Ph.D.
                                 Deputy Assistant Administrator for Science
Enclosure
Cc:    Dr. George Daston
       Dr. James Clark
       Dr. Richard DiGiulio
       Dr. Muiz Mumtaz
       Dr. John Quackenbush
       Dr. Cynthia Stokes




-------
            Office of Research and Development's Response to the
                  Board of Scientific Counselors Report on
            ORD's National Center for Computational Toxicology
                   (final report received September 2008)

                                February 2009
                  BOSC Computational Toxicology Subcommittee
                           Dr. George Daston (Chair)
                                Dr. James Clark
                              Dr. Richard DiGiulio
                               Dr. Muiz Mumtaz
                             Dr. John Quackenbush
                              Dr. Cynthia Stokes
Submitted by:
Dr. Robert Kavlock
Director, National Center for Computational Toxicology
Office of Research and Development

-------
          ORD Response to BOSC Computational Toxicology Letter Report
                                 February 2009

The following is a narrative response to the comments and recommendations of the
BOSC review of ORD's National Center for Computational Toxicology (NCCT), held
December 17 and 18, 2007, in Research Triangle Park, NC.  The review was conducted
by a standing subcommittee of the BOSC. The subcommittee had previously reviewed
the NCCT in April 2005 and June 2006 and ORD responded to those reviews.  In this
third review, the BOSC noted, "...during the 2.5 years between its establishment and this
review, NCCT has made substantial progress in establishing priorities and goals; making
connections  within and outside EPA to leverage the staff's considerable modeling
expertise; expanding its capabilities in informatics; and making significant contributions
to research and decision-making throughout the Agency." Furthermore they noted,
"...many of the recommendations made by BOSC during its earlier reviews have been
acted on by NCCT. This includes improved capabilities in bioinformatics through the
funding of two external centers and in informatics and systems biology through staff
hires; expansion of its technical approaches to even more programs within the Agency;
and the formation of an extensive collaboration with the National Institute of
Environmental Health Sciences (NIEHS) and the National Human Genome Research
Institute (NHGRI) for its ToxCast™ project."

Each charge question is shown below in bold, followed by the BOSC's comments in
italics and ORD's response to the comments in regular type. A summary of the BOSC
recommendations and ORD's responses is provided in Table  1 at the end of this report.

Charge Question 1:  Does the scope and involvement of expertise in the project
reflect activities consistent with the function of a Center?

The NCCT was founded only a few years ago and has been achieving a critical mass of
expertise through selective hiring, external grants,  and the formation of connections with
other groups of experts within EPA. The purpose of this question was to gauge the
progress of the Center in achieving the level of expertise needed to pursue its mission.
The staff working in NCCT and those scientists involved from outside the Agency who are
working as collaborators are highly qualified in various aspects of computational
toxicology.  The Center's effort to solidify formal agreements in terms of memoranda of
understanding (MOUs), cooperative research and development agreements  (CRADAs),
etc., with various organizations has opened up a diversity of quality opportunities to
leverage and enhance Office of Research and Development (ORD) efforts. A timely
example is the February 14, 2008, announcement of the collaboration between NIEHS,
NHGRI, and EPA's NCCT. As described in the press release, this collaboration
leverages the strengths of each group to use high-speed, automated screening robots to
test suspected toxicants using cells and isolated molecular targets.

The staff and collaborators at the center have the appropriate expertise and insights.
The utility of the tools and deliverables can be enhanced if the staff moves toward being
more explicit on how  the tools under development support EPA risk assessments. Some
of the ORD researchers seem to be  searching for an application for  their sophisticated
tools,  and discussions with Agency  staff practicing risk assessments  (Office of Pollution
Prevention and Toxics [OPPT]; Office of Water, Office of Wastewater Management;
Office of Prevention, Pesticides, and Toxic Substances [OPPTS], etc.) could provide
direction as to the appropriate milestones and deliverables for these efforts. The BOSC
reviews and the Center would benefit if representatives from these Agency offices
attended BOSC reviews to ensure that all parties understand how NCCT's efforts address
the most relevant needs of the Agency.  The BOSC wants to ensure that this advice is seen
as encouragement to reach out to risk assessment practitioners. The ongoing work in
developing the analytical approaches and information databases is of high technical
quality, as the Center staff and collaborators are working on many new and exciting
approaches. By holding research planning discussions with risk assessment
practitioners, the applications of the computational toxicology tools and resources can be
directed to ensure the most relevant and efficient use of data and models.
(Recommendations #1 and #2 in Table 1)

ORD Response: ORD appreciates this recommendation. As noted in the report, NCCT
regularly meets with program offices, risk assessors, and other potential practitioners in
planning and conducting this research.  A priority action item of the  NCCT for FY2009 is
to improve connectivity with NHEERL, NERL and NCEA relative to building the
foundation for a transformation in the conduct of evaluating the toxicity of chemicals.
We are continuing to engage Communities of Practice to help achieve this end.  In
previous reviews, some of these stakeholders were invited and attended meetings of this
BOSC subcommittee.  The next review will be a broad review of the computational
toxicology program and the new implementation plan. For this and future meetings,
Agency stakeholders will be invited to attend the meeting and enter discussions as
appropriate. Further, the NCCT will ask such stakeholders  to review and comment on the
new implementation plan prior to the next BOSC meeting.

Charge question 1 continued:

One challenge for the center staff involved in developing informatics datasets will be to
develop efficient and effective ways to handle  the wealth of data available in some areas
to avoid redundancies of data entries and to focus on  the most informative data. Again,
interactions with various program offices and their risk assessment activities should
provide a basis to set the long-term goals for the Informatics/Data management team.
This will allow the development of structured short-term and mid-term activities needed
to meet the long-term goals.  (Recommendation #3 in Table 1)

ORD Response: To address this important issue the NCCT has five main database-
related, data-intensive  projects: ACToR, ToxRefDB, ToxMiner, the  ToxCast™ chemical
registry, and DSSTox. ACToR (http://actor.epa.gov/actor) is the global repository of data
that is relevant to environmental chemicals. It is populated from more than 200 public
repositories of toxicity data to provide a broad, but in  many cases shallow view of the
universe of data available on chemicals of interest to the NCCT and  the EPA. ToxRefDB
is focused on extracting high quality in vivo toxicology data on chemicals in the
ToxCast™ program, capturing study information down to the treatment group level, and
extracting these into a  relational database well-suited to predictive modeling.  ToxRefDB
is also being developed into a web-accessible  resource that can be queried to derive
treatment related toxicity effects directly from the database.  ToxMiner is a compilation
of statistical tools capable of analyzing relationships between ToxCast™ and ToxRefDB
data and deriving predictive signatures.  The ToxCast™ chemical registry is used to
track nominations for ToxCast™ screening, to track chemical procurement, sample
identity and sample QC, and finally to link actual samples to ToxCast™ data. DSSTox is
adding the quality reviewed chemical structure layer to data sets of interest to NCCT, and
publishing additional inventories and toxicity data sets of interest to EPA and external
groups. The underlying data model and database tables for all but DSSTox are being
consolidated to remove data redundancy and to reduce the effort required to manage
multiple related systems. DSSTox is primarily a file-based system, and as data are
curated, they are entered into the ACToR system for further use. We are actively
working with other partners (ORD, OPP, OPPT, OW, OHS, NCEA, the Tox21 partners)
to prioritize chemicals to be entered into the system and to obtain and enter data. We
believe this compilation of information on the toxicity of chemicals provides a solid
foundation for the NCCT to not only understand the extent of public information on
chemicals, but also to provide public access to this increasingly data rich repository of
information on chemicals.

Charge question 1 continued:

The BOSC noted that it remains somewhat unclear how the Center intends to use
ToxCast and associated analyses to approach risk assessment.  For instance,  species-to-
species translation was mentioned, and the data are being obtained from multiple
species, not just humans, but how the different species data will be reconciled was not
discussed. Although the primary goal of the ToxCast project is prioritization of
chemicals for detailed risk assessment, not the risk assessment itself, it is interesting to
contemplate how the projected database and analysis might be directly relevant.
Similarly, it was noted that an early decision regarding ToxCast was that ecology and
paths of exposure were not going to be addressed in this project (at least not initially).
Nonetheless, at several points, paths of exposure arose during the review because of their
obvious relevance. The Subcommittee is prompted to ask how it might be addressed in
future work. (Recommendation #4 in Table 1)

ORD Response:  The NCCT has recognized the opportunity to address the full source-to-
outcome continuum of risk assessment, and has recently done this in several ways.  This
need is reflected in the FY2009 priorities for NCCT that include increased connectivity
with other components of ORD. Thus, NCCT has organized an ORD-wide workgroup to
expand an overarching strategy for developing a high-throughput approach to risk
assessment, building from the example and lessons from ToxCast™ and expanding on
applications to exposure and mode of action assessment.  One part of this approach will
be to develop exposure predictions on the thousands of chemicals relevant to ToxCast™,
in a Center project tentatively called ExpoCast.  Finally, the translation of ToxCast™
predictions directly to humans is being accomplished by direct comparison of results for
rodent and human targets and pathways interrogated by complementary assays.  In
addition, a proposal has been accepted for consideration by the HESI Emerging Issues
Program at its annual meeting in January 2009 to establish collaborations with the
pharmaceutical industry to supply chemicals with identified human toxicity for use in
Phase IIb of ToxCast™. This phase would include at least 100 pharmaceutical
compounds with known human toxicities and would extend ToxCast™ predictive
signatures from Phase I of rodent toxicity endpoints, to similar toxicity endpoints in
humans.

Charge question 1 continued:

The Subcommittee also noted that the means of using the eventual Virtual Liver models
for actual risk assessment at EPA  is unclear.  The BOSC encourages additional thought
and efforts along these lines, in collaboration with the appropriate EPA program office
personnel.  This is not a criticism of the current project vision by any means, but because
direct or indirect application to risk assessment would be a fantastic result, it seems
prudent to consider the possibility earlier rather than later. (Recommendation #5 in
Table 1)

ORD Response: The Virtual Liver (v-Liver) is being developed in conjunction with
NHEERL research activities.  A detailed plan for v-Liver will be presented to the BOSC
at the 2009 review.  The objective of the v-Liver project is to coordinate an integrated in
vitro  and in virtuo program in the long-term for toxicity testing that is efficient, relevant
to humans and less dependent on animals, with the ultimate goal of use in risk
assessment. We agree that stakeholder involvement from EPA program offices is a
critical requirement for the success of the v-Liver project. Although program office
personnel were not directly involved in the early v-Liver research planning phase, senior
scientists from NCEA/RTP, NHEERL and NCCT who have a good grasp of risk
assessment needs for fulfilling EPA's mission, are part of the core  team.  Their collective
insight into key challenges facing risk assessment and the requirement for future toxicity
testing have been vital for shaping the vision for the v-Liver system. Therefore we
believe the v-Liver project is poised to actively engage program office personnel to
address challenges in mode of action (MOA) elucidation and quantitative dose-response
prediction for chronic liver injury.

Program office personnel will be engaged in the  design, development, and utilization of
the system. This is being accomplished through a few practical use cases that demonstrate
the value of Virtual Tissues for developing a proof of concept (PoC) for assessing the risk
of environmental chemicals to liver physiology and human health. Over the next two
years, the v-Liver PoC will define a subset of hepatic effects, apical endpoints, and
relevant environmental chemicals  which will be developed in close collaboration with
program office personnel to ensure application to risk assessment and relevance to the
EPA  mission. In addition to developing a Virtual Tissues platform that will contribute in
the long-term to the future of toxicity testing, the short-term milestones of the v-Liver
PoC will also aim to address current client needs.

The v-Liver project plan (please see Appendix for outputs) outlines how  stakeholders will
be involved.  Currently, the project is aligned closely with the ToxCast™, ToxRefDB and
ToxMiner projects to develop methods to select environmental chemicals for the v-Liver
PoC focusing on nuclear receptor (NR) mediated hepatocarcinogenesis. Analyzing data
from ToxCast™ and ToxRefDB has identified a range of pesticides and persistent toxic
chemicals that match these criteria. Around ten chemicals will be used for the PoC and
these will be selected in collaboration with program office personnel who are actively
involved in their risk assessment and/or have substantial expertise in their MOA. We
plan to develop these collaborations with stakeholders by providing them two main types
of computational tools. In the short-term (FY09), interactive tools to aid hepatic MOA
organization and analysis will be developed. In the medium (FY10) to long-term, these
will be extended with prototype tissue-level  simulation tools that will aid in investigating
the  quantitative relationships between MOA(s) and adverse effects.

The first deliverable for risk assessors is the  v-Liver Knowledgebase (v-Liver-KB),
which formally organizes information on normal hepatic functions and their perturbation
by chemical stressors into pathophysiologic  states. Information about hepatic physiology
relevant for MOA analysis is dispersed across scores of public domain repositories as
well as the biomedical literature and the v-Liver-KB will leverage semantic approaches,
which are being increasingly adopted by the biomedical community, to provide effective
tools that fill the gaps in toxicity MOA organization and inference.  The v-Liver-KB will be
deployed as an interactive web-based and desktop tool to intuitively browse and query
physiologic knowledge on PoC chemicals, to derive MOA(s) and to link assay results
from ToxCast™, species-specific effects from ToxRefDB, and other evidence curated
from the literature. We believe this system will provide computable information on key
events that transparently indicate the uncertainties and data gaps and that make inferences
on MOA from experimental data.  In addition, we will work closely with risk assessors to
customize the system for specific requirements. The v-Liver-KB will be deployed over
the  next two years and updated quarterly with any new information on the PoC
chemicals.

The second deliverable (FY10), the v-Liver Simulator (v-Liver-Sim), dynamically
simulates the key molecular and cellular perturbations leading to  adverse effects in
hepatic tissues. Initially, it will focus on modeling MOA leading to proliferative and
neoplastic liver lesions at a hepatic lobular scale.  The v-Liver-Sim is being developed as
a cellular systems model of the hepatic lobule that will use MOA information from the v-
Liver-KB to initially provide two outputs: the visualization of tissue changes at a
histological scale and the assessment of lesion incidence. A version of this system will
also be provided as a web-based/desktop tool to enable risk assessors to perform
interactive and quantitative  simulation of chemical induced perturbations of physiologic
processes leading to toxic histopathologic effects.  Eventually, the liver simulator will be
integrated with PBPK models to model alternative exposure scenarios. Over the course
of the project, the system will be evaluated in collaboration with risk assessors using PoC
chemicals, in vitro data from ToxCast™, and published in vivo data from rodents and
humans.

Charge Question 2: Are the goals and milestones suitably described, ambitious, and
innovative?

For the Center overall, the answer to this question is "yes." In particular, the goals of
the Center are well-described, very ambitious and innovative, as well as important for
the future of research at EPA. The issue of "milestones" is somewhat more complex, in
part due  to the varying levels of maturity for Center components. In most cases, previous
accomplishments and current activities are well described, but more detail concerning
projected future milestones would be helpful.  It is recognized, however, that these
projects are very innovative and substantial flexibility is appropriate. This is particularly
true for less mature but highly creative projects such as the Virtual Liver and Virtual
Embryo.  Also, in considering goals and milestones, it may be appropriate to consider the
timely integration of each project's accomplishments into the Agency's risk assessment
activities. In the following paragraphs, Charge Question 2 is addressed in the context of
the five major Center activities discussed at the review meeting.

ToxCast: Future plans for the project also are well described, although a more detailed
timetable for milestones past 2008 would be helpful. (Recommendation #6 in Table 1)

ORD  Response:  ORD agrees with this recommendation and has a more detailed
timetable for ToxCast™ milestones which centers around the release of data, validation
of predictive signatures, and generation of data as chemicals are tested in Phase II. With
considerable data and experience now in hand from Phase I contractors and additional
collaborations on the Phase I chemical library with laboratories within NHEERL and
outside EPA, it will be possible to better articulate the directions for Phase II of
ToxCast™. In addition, activities of the Tox21 consortium between NTP/NIEHS,
NCGC/NHGRI and NCCT/ORD are maturing and beginning to identify near-term and
medium-term goals. These activities will be described in  greater detail in the second
generation Implementation Plan, which we are now developing and will present to the
BOSC at the next NCCT review.  Please see the appendix for a detailed listing of milestones.

Charge question 2 continued:

IT/IM Activities: The project is highly and suitably ambitious, and its goals and
substantial progress are well described. Again, future plans are described well in a
general way, but more detail concerning future milestones (beyond 2008, which is well
described) would be appropriate. (Recommendation #6 in Table 1)

ORD  Response:  Again, ORD agrees and has a detailed timetable that emphasizes the
deployment and continual upgrade of ACToR, integration of ToxCast™ and ToxRefDB
in vivo toxicology data, and importation of available exposure, neurotoxicity, and
reproductive toxicity data.  A detailed listing of ACToR and ToxRefDB related
milestones can be found in Appendix I.

Regarding ToxMiner, the first goal in FY09 is to incorporate all of the ToxCast™ Phase I
data into ToxMiner. This involves processing the many individual data sets to eliminate
faulty data, to perform scaling and normalization, and to extract computationally useful
parameters such as maximum effect levels and IC50 values.  The second main task is to
integrate the ToxMiner database with analysis tools for statistical analysis and machine
learning. A third task is to integrate other biological information to help interpret the
results of statistical analyses. In particular, we are incorporating pathway information
and using this as an organizing principle to make sense of the results from the hundreds
of individual ToxCast™ assays.  The major goal of ToxCast™ Phase I is to develop a
series of "signatures" linking in vitro data with in vivo toxicology. The related ToxMiner
goal for FY09-FY10 is to produce and store these signatures and have them ready for
validation on ToxCast™ Phase II chemicals. Planning is well underway for a ToxCast™
Data Summit in May 2009, which will provide a forum for external scientists to come
and discuss alternatives for deriving predictive signatures of ToxCast™ HTS data
relative to ToxRefDB-identified phenotypes.
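
As an illustration of the pre-processing described above (scaling and normalization, then extraction of maximum effect levels and IC50 values), the following is a minimal sketch only. It is not the ToxMiner implementation: the function names, the percent-of-control scaling, and the use of log-linear interpolation between tested concentrations (rather than a fitted curve) are all assumptions made for the example.

```python
import math


def normalize_to_percent_effect(raw, baseline, full_effect):
    """Scale raw readouts to percent effect (hypothetical scaling rule).

    `baseline` is the readout with no effect (0 %), `full_effect` the readout
    at complete effect (100 %). Both are assumed to come from plate controls.
    """
    span = full_effect - baseline
    return [100.0 * (r - baseline) / span for r in raw]


def summarize_curve(concs, effects, threshold=50.0):
    """Return (max_effect, ic50) for one chemical in one assay.

    `concs` must be positive and ascending. The IC50 is taken at the first
    upward crossing of `threshold`, interpolated linearly on a log10
    concentration axis. It is None if no crossing occurs between tested
    concentrations.
    """
    max_effect = max(effects)
    ic50 = None
    points = list(zip(concs, effects))
    for (c_lo, e_lo), (c_hi, e_hi) in zip(points, points[1:]):
        if e_lo < threshold <= e_hi:
            frac = (threshold - e_lo) / (e_hi - e_lo)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            ic50 = 10 ** (log_lo + frac * (log_hi - log_lo))
            break
    return max_effect, ic50
```

In practice a parametric fit (for example a Hill model) would usually replace the interpolation step; the sketch only shows how many heterogeneous data sets can be reduced to a few comparable parameters per chemical and assay.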

DSSTox will increase its interactions and alignment with major NCCT projects
(ToxCast™, ToxRefDB, ACToR) and broader Agency and outside projects (NHEERL,
OPPT, NTP,  CEBS, EU REACH), providing key cheminformatics support, expanding
DSSTox data file publications of toxicological data in support of predictive modeling,
and enhancing linkages to resources such as PubChem for disseminating EPA,
ToxRefDB and ToxCast™ bioassay results to the broader modeling community.
Detailed milestones are found in Appendix I.

Charge question 2 continued:

Virtual Liver: Although narrower in scope than the foregoing projects,  the Virtual Liver
project is very ambitious; it also is relatively young, apparently becoming fully
operational with the arrival of Dr. Imran Shah in September 2006. Its fit with the goals
of NCCT is perhaps less clear than the previous two projects; it is more "visionary" in
nature, and less directly applicable to risk assessment, as described by one of the EPA
scientists involved. The goals of the project and the nature of research to be performed
to achieve those goals are clearly described. There is some concern that this project may
be overly ambitious. It may be helpful if key objectives were delineated and prioritized,
perhaps indicating achievements that are critical to the success of the project and those
that are highly desirable.  Milestones for tracking the project's progress are not
apparent, particularly in later years (3-5). This relatively young and very innovative
project requires considerable flexibility, however, so the lack of detailed milestones in
later years is very reasonable. (Recommendation #7 in Table 1)

ORD Response: The importance of developing  and applying computational system level
models of key phenotypic outcomes is reflected in the second goal of the new EPA
Strategic Plan for Evaluating the Toxicity of Chemicals that is currently working its way
through final  concurrence by the Agency.  NCCT recognizes the need to better delineate
the goals and milestones of the v-Liver project, and we have made this a key activity in
response to the  comments of the BOSC. NCCT is convinced the future  of toxicology will
be heavily dependent upon the development of computational systems level models and
has played a key role in the development of this plan and its execution through this
project. Current and additional details will be provided at the next review of the BOSC.
The short-term goals for the v-Liver project are to identify environmental chemicals for
the PoC system. Once there is buy-in from EPA stakeholders (program offices and
NCEA) on these chemicals, the team will begin populating the v-Liver-KB with relevant
mechanistic and MOA information on these chemicals including in vitro data from
ToxCast™ and in vivo data from the literature. Concurrently, the team will develop a
prototype virtual hepatic lobule to understand the key cellular responses necessary for
modeling cancer progression beginning with nuclear receptor activation. Data generated
by ToxCast™ as well as external collaborators/new contracts will be used to begin
quantitative parameterization of the cellular and molecular responses, and their
evaluation using published in vivo rodent data. The detailed milestones for the project
are described in Appendix I.

Charge question 2 continued:

Developmental Systems Biology (Virtual Embryo).  This project is at a substantially
earlier stage than the Virtual Liver project; it is led by Dr. Thomas Knudsen who joined
NCCT in September 2007. The issues of goals and milestones are essentially the same as
for the Virtual Liver, that is, strong on the former, but understandably weaker on the
latter.  It is the Subcommittee's expectation that a more concrete research plan with
goals and milestones will be developed over the coming months. (Recommendation #8 in
Table 1)

ORD Response: A formal research plan for the Virtual Embryo, including goals and
milestones, has been developed. The long-term goal is to provide a computational
framework that enables predictive modeling of prenatal developmental  toxicity.  The
project is  motivated by scientific and regulatory needs to understand how chemicals
affect biological pathways in developing tissues, and through this knowledge a more
ambitious undertaking to predict developmental toxicity.  The research  plan is built on an
expanded outlook of experimental-based techniques that aim to identify 'developmental
toxicity pathways' and an expanded scope of computational search-based techniques that
apply such knowledge into models for chemical dysmorphogenesis. Dr. Knudsen, the
lead scientist for this program, was recently invited to NCEA where he  provided an
overview  of the project.  This has led to close coordination between the computational
models and the risk assessment priorities.

Virtual Embryo's short-term goals address the knowledgebase (VT-KB) and simulation
engine (VT-SE) to enable in silico reconstruction of key developmental landmarks that
are sensitive to environmental chemicals.  Initial research focuses on early eye
development.  Proof-of-principle (2yrs) will be measured by high fidelity simulation
models to demonstrate several generalized principles, including the ability to reconstruct
genetic defects in silico, classify abnormal developmental trajectories from genetic
network inference, and predict teratogen-induced defects from pathway-level data.  A
much more detailed research plan will be provided to the BOSC in its 2009 review of the
NCCT, and detailed examples of currently envisioned milestones are found in Appendix I.

Charge question 2 continued:

Arsenic BBDR: This project is unusual among NCCT projects in that it is oriented
toward a specific chemical with a specific issue (Safe Drinking Water Act revisions)
rather than an approach developed with diverse chemicals in mind. However, this
project is likely to inform the eventual development of other biologically-based dose-
response models and their application to risk assessments by the Agency.  Thus, in
addition to informing the controversial issue of arsenic risk assessment, the project is
more broadly relevant to the mission of the NCCT.  The goals of the project are very
clear and well described. Milestones, however, are not stated, and may be particularly
important for this project, which has a clear deadline (2011) in order to be useful for the
2012 Safe Drinking Water Act review cycle.  (Recommendation #9 in Table 1)

ORD Response: At the time of the BOSC review in December, 2007, considerable effort
had been devoted to planning the development of a BBDR model for carcinogenic effects
of inorganic arsenic (iAs).  The initial focus of the planning process was a literature
review to identify data needs.  This review had shown that the pharmacokinetics (PK) of
iAs were relatively well-studied, though there were some significant remaining PK
uncertainties.  The literature was not, however, sufficient to identify with any confidence
the relevant mode or modes of action (MoA) of iAs responsible for its carcinogenic
effects. We therefore developed a generic experimental design that focused on: (1) the
description of a potential MoA as a sequence of key events; and (2) experimental
characterization of the dose-time response surfaces for the key events.  For any given
candidate MoA, it was anticipated that this experimental approach would have provided
sufficient data to allow ranking of candidate MoAs by dose and time course.  The MoA
or MoAs acting at the lowest doses and earliest time points would be considered to be the
drivers for the apical cancer outcomes.
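
The ranking logic described in this paragraph can be sketched as follows. This is an illustrative sketch under stated assumptions, not the planned BBDR analysis: the record type, the rule that a candidate MoA is summarized by the lowest dose and earliest time at which any of its key events responds, and the choice to break dose ties by time are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyEventOnset:
    """Onset of one key event, read off a dose-time response surface."""
    moa: str                      # candidate mode of action (label only)
    key_event: str                # key event within that MoA
    lowest_effective_dose: float  # arbitrary dose units (assumption)
    earliest_time_h: float        # hours after exposure (assumption)


def rank_candidate_moas(onsets):
    """Order candidate MoAs by lowest dose, then earliest time point.

    Each MoA is summarized by the minimum dose and minimum time observed
    across its key events (an assumed summary rule).
    """
    best = {}
    for o in onsets:
        dose, time_h = best.get(o.moa, (float("inf"), float("inf")))
        best[o.moa] = (min(dose, o.lowest_effective_dose),
                       min(time_h, o.earliest_time_h))
    return sorted(best, key=lambda moa: best[moa])
```

Under this rule, the first entries in the returned list would be the candidate drivers of the apical outcome. Any values supplied to the function are placeholders chosen by the user, not measured results.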

The next step in the process was to elicit research proposals from NHEERL iAs
researchers that were to be based on the suggested experimental approach for
characterizing candidate MoAs.  The literature is consistent with a relatively large
number of MoAs for iAs. These include (among others) oxidative stress, cytolethality
and regenerative cellular proliferation, altered patterns of DNA methylation, altered  DNA
repair, and DNA damage. Receipt of the proposals was followed by an external peer
review meeting. The outside experts judged that the proposals received did not
adequately represent plausible modes of action, which caused NHEERL management to
markedly reduce the planned BBDR modeling effort and  focus on-going research on iAs
PK, with particular emphasis on  evaluation of the arsenic 3-methyl transferase knockout
mouse. The NCCT involvement in the arsenic mode of action BBDR models has been
redirected to stronger interactions with existing NCCT projects in ToxCast™ and the v-
Liver, and will be presented to the BOSC at its next review of the Center.

Charge Question 3: Are there significant gaps in the approach that can be pointed
out at this point in the evolution of the project?

ToxCast:  Specifically, the Subcommittee notes that the structural specification of the
database for compilation and rigorous quantitative analysis of the ToxCast data remains
unclear. Because the data types are highly heterogeneous and the dataset is very large,
developing these structural specifications will be a challenge that the Subcommittee
suggests should be addressed as soon as possible. The IT/IM team acknowledges that
this area is a significant challenge (e.g., the description in the write-up provided to the
Subcommittee prior to the review meeting). One suggestion is that the ToxCast team
compiles a list of some specific use cases, for example, specific questions that they intend
to address with the database.  This will help make concrete the needed database
attributes  that will allow the analysis for the chemical prioritization that is the end goal
of the ToxCast project. (Recommendation #10 in Table 1)

ORD Response:  Over the last several months, these issues have become clearer, mainly
due to the fact that we now have access to  large parts of the ToxCast™ data. With the
exception of the microarray genomics data, which has been delayed due to lack of
consensus on the most appropriate bioassay conditions, the results of all of the assays can
be reduced to a small number of summary parameters. In most cases, one of these will be
a characteristic concentration for each chemical in the assay (EC50, IC50, lowest
observed concentration at which a significant effect is seen).  The second parameter will
often be a magnitude of response. For all of the assays, we can extract a relevant
concentration and for many, a response magnitude. Related to this, the endpoint data we
will be predicting from ToxRefDB are characteristic concentrations, which are the lowest
doses at which a particular effect was seen with statistical significance. A third variable
in some assays is time - cell based assay data in some cases is provided at 2-3 time points
(e.g. 6, 24 and 48 hours). We track these times, but treat each of the times as separate
assays.  Finally, most assays can be linked to biological pathways, either directly through
the gene or protein, or through a higher-order process being probed. Although
ToxCast™ was envisioned to support chemical prioritization efforts of Agency regulatory
offices, it has since been viewed as a source of ancillary information that can be used in
evaluating risks.  Examples of this include interest of the toxic substances office on the
effects of perfluoroacids, NCEA with phthalates, and the pesticide office with conazoles.
Such interest demonstrates the multiple values the information emerging from ToxCast™
is having on the regulatory programs of EPA beyond chemical prioritization.  We
anticipate  continued interest in the use of ToxCast™ in risk assessment considerations
and are engaging NCEA in optimal ways to bridge the applications.

As already stated, the goal of ToxCast™ Phase I, as supported by the ToxMiner system,
is to find links between in vitro assays and in vivo toxicity as captured in ToxRefDB.
These can be statistical correlations or more biologically-based toxicity pathway
linkages.  Given this, the ToxMiner database has been organized into five main pieces:

-------
           ORD Response to BOSC Computational Toxicology Letter Report
                                  February 2009


    1.  Chemical information - this holds chemical identity and structure
    2.  Assay information - this holds the summary values extracted from in vitro assays
       and from ToxRefDB  (concentrations, response magnitude), as well as other
       related quantitative and qualitative information on chemicals such as physico-
       chemical properties and chemical class information.
    3.  Data preparation - for many of the data sets, several pre-processing steps need to
       be undertaken to map raw data onto the canonical chemical and assay data
       structure.  These tables and data structures enable these steps to be carried out in
       a well-controlled manner.
    4.  Statistical analysis workflow - many calculations need to be carried out to find
       signatures and the results need to be tracked and made available to the ToxCast™
       team on the web.  We are implementing specific data tables and code to carry out
       these  steps.
    5.  Pathway information - this set of data tables and tools is being designed to allow
       the analysis of the ToxCast™ data in terms of biological pathways.
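
The chemical and assay pieces of the organization listed above could be represented along the following lines. This is only an illustrative sketch, not the ToxMiner schema: every class, field, and function name (Chemical, AssayResult, assay_id_for, results_by_assay) is a hypothetical stand-in. It also illustrates the convention, described earlier in this response, of treating each time point of a cell-based assay as a separate assay.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Chemical:
    """Chemical information: identity and structure (hypothetical fields)."""
    chemical_id: str
    name: str
    smiles: Optional[str] = None  # structure, if available


@dataclass(frozen=True)
class AssayResult:
    """Assay information: one summary row per chemical per assay."""
    chemical_id: str
    assay_id: str                        # the time point is folded into this ID
    concentration_um: Optional[float]    # e.g. EC50, IC50, or lowest effect concentration
    response_magnitude: Optional[float] = None
    pathway: Optional[str] = None        # linked biological pathway, if known


def assay_id_for(base_assay: str, hours: Optional[int] = None) -> str:
    """Build an assay ID so that each time point counts as a separate assay."""
    return base_assay if hours is None else f"{base_assay}_{hours}h"


def results_by_assay(results):
    """Group summary rows by assay ID for downstream statistical analysis."""
    grouped = {}
    for row in results:
        grouped.setdefault(row.assay_id, []).append(row)
    return grouped
```

Under this assumed layout, a cell-based assay read at 6 and 24 hours would yield two assay IDs, each with its own characteristic concentration and response magnitude per chemical.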

Charge Question 3 continued:

IT/IM Activities:  The major gap noted for this activity was described in the ToxCast
project section above. In addition, finding an efficient and effective methodology for
extracting data from text sources was a concern for the Subcommittee. A trial of natural
language processing (NLP) for pulling information into some of the databases was
described. The Subcommittee notes that this method has been attempted rather
unsuccessfully by various research groups over probably 2 decades and thereby
encourages the exploration of other possible approaches as well.  (Recommendation #11
in Table  1)

ORD Response: NCCT agrees and is developing two main uses for literature mining, for
which we believe current technology is  suitable. In the first case, we need to  extract
tabular data for use in ToxCast™ and the virtual tissue project. These are, for instance,
quantitative values associated with in vivo toxicity or in vitro assays.  Here we are using
text mining as a sophisticated version of a PubMed search to prioritize documents for
data extraction and to do an initial automated data extract.  The results are then presented
to an analyst  to do manual quality control and data cleaning.
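
As a rough illustration of the document prioritization step described above, the sketch below ranks documents by how many query terms they mention. The scoring rule, the function name prioritize, and the example inputs are assumptions made for illustration; they do not describe the text mining tools NCCT uses, and the ranked list would still go to an analyst for manual review.

```python
import re


def prioritize(documents, terms):
    """Rank document IDs by the number of distinct query terms each one mentions.

    documents: dict mapping a document ID to its text.
    terms: iterable of single-word query terms.
    Documents that mention none of the terms are dropped.
    """
    scored = []
    for doc_id, text in documents.items():
        words = set(re.findall(r"[a-z0-9]+", text.lower()))
        score = sum(1 for term in terms if term.lower() in words)
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties are broken alphabetically by document ID.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [doc_id for _, doc_id in scored]
```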

The second task is to generate hypotheses about biological processes such as the  co-
occurrence of gene expression changes and the observation of higher-order phenotypes.
The lack of success that the reviewer alludes to, we would argue, is in taking these
hypotheses and assigning some truth value to them based on statistical arguments. We
are using these simply as starting points for building representations of pathways and
processes that will be tested through further experiments and analyses. A more detailed
explanation of our approach to literature mining and evidence of utility will be presented
at the next BOSC review.

Charge Question 3 continued:

Virtual Liver:  Dr. Shah and his group are commended for having a good command of
the significant breadth of biology, toxicology, and modeling that impacts the project. In
addition, the "big picture" vision described is useful; there are many important
questions in the field and not limiting the vision too early is appropriate. The
Subcommittee  believes that this should be balanced, however, with some very specific
goals, milestones, and timelines for the next few years that are clearly attainable with the
resources at hand in order to assure some useful concrete outcomes. In a project with
this possible magnitude, it can be tempting to try to do everything, both in terms of the
various project approaches (knowledgebase (KB), biological modeling, dosimetry
modeling, etc.) as well as the scope within any one approach (breadth of the KB, breadth
and detail of every model, etc.), and thereby end up with little actually completed. One
suggestion is that Dr. Shah and the group develop a short prioritized list of specific
scientific research questions relevant to EPA's goals that they desire to address as soon
as possible, and use this to focus first iterations of development of both the KB and
model(s). More explicit milestones and goals for these highest priority questions then
can be developed. Later iterations of KB development and modeling can add scope
(breadth/depth) to allow NCCT to address additional research questions.
(Recommendations #6 and #12 in Table 1)

ORD Response:  The question, "How can in vivo tissue level adverse outcomes in
humans be predicted using in vitro data?" is the "Grand Challenge" scientific problem
in toxicology that motivates the v-Liver project.  This is a very ambitious goal and
infeasible to achieve in the broad sense in just a few years. Hence, the  v-Liver project
will take a few steps towards realizing this long-term objective by focusing on a tractable
proof of concept (PoC) system using ten environmental chemicals that  activate nuclear
receptors and cause  a range of apical effects in cancer progression (non-proliferative
lesions, pre-neoplastic lesions, and neoplastic lesions). The project will engage program
office personnel to ensure relevance to EPA's mission and provide deliverables for risk
assessment within the first two years. These deliverables focus on two main scientific
questions:

a) How can tissue level adverse effects be modeled to enable extrapolation? The v-Liver
leverages the Mode  of Action Framework and public sources of mechanistic information
to formalize the description of key events leading to adverse hepatic outcomes. Our
claim is that MOA knowledge  can be universally described across species, organs,
chemicals and doses, using genes, their interactions, pathways and cellular responses that
lead to toxic effects.  This claim will be tested in the PoC by: (a) organizing sufficient
information about the 20 nuclear receptor-activators to demonstrate that key events in the
MOA(s) can be described generally for extrapolation across chemicals  and species, and
(b) using semantic methods to build an ontology for the physiologic processes, a
knowledgebase to integrate this information, and inference tools for extrapolation. The
result of this exercise will be delivered as the v-Liver-KB.

b) How can the tissue level effects be extrapolated across doses and time? Our claim is
that quantitative tissue level effects can be generated from qualitative logical descriptions
of the MOA(s), chemical-specific data for key events and simulation of the tissues as a
cellular system. The rationale for the v-Liver Simulator is to implement a virtual hepatic
lobule as a complex cellular system to investigate emergent tissue-level effects due to
alternative MOA(s) at very low environmentally relevant doses.  To extrapolate between
species, chemicals and doses, the v-Liver team is collaborating across ORD and
through extramural funding to develop in vitro models and assays that generate relevant
quantitative data on key events. In addition, to estimate internal dose and to model alternative exposure
scenarios the project is working closely with PBPK modeling efforts across ORD. The
deliverable for this part of the project will be the v-Liver Simulator.

Charge Question 3 continued:

Virtual Liver:  The Virtual Liver activity will result in models of parts of the biology
being developed simultaneously and presumably by different individuals. Because the
idea is to integrate these models eventually to predict effects from molecular function to
physiologic outcome, the compatibility of the models is paramount.  Dr. Shah indicated
that he is cognizant of and planning to manage this issue, for instance,  by looking into the
efforts of the international Physiome Project.  The Subcommittee members note that, to
their knowledge, the Physiome Project has addressed the issue of a common coding
language quite extensively but does not appear to have addressed more subtle but
critical compatibility issues concerning biological and mathematical specifications
among models, such as compatibility of assumptions, equilibrium approximations, time
scales, and so forth.  Hence, beyond managing compatible coding, the activity group is
encouraged to actively plan for and manage on an ongoing basis the specifications that
must be shared among models so as to produce compatibility when it is needed.
(Recommendation #13 in Table 1)

ORD Response: This is indeed a difficult and very important issue to consider.  To this
end NCCT is beginning to address the issue on two fronts:

    1.  NCCT plans to raise this issue for discussion by multi-scale modeling experts at
       the NCCT-organized International Workshop on Virtual Tissues, to be held in
       Research Triangle Park, NC, April 21-23, 2009. This workshop will have
       representation from the Physiome project and the SBML project and is co-
       sponsored by the European Union.
    2.  In  addition, the NCCT is actively collaborating with PBPK modelers in the
       Agency to develop a formal specification that will ease the integration with v-
       Liver-Sim. The effort is using semantic technology to define physiologic models
       at the organism level that can interface with existing tools in NERL.

These two integrated efforts will be important early steps for addressing this problem.

Charge Question 3 continued:

Virtual Embryo: Because the data needs of the proposed models may be significant, the
Subcommittee notes that it will be critical to identify and enlist appropriate supporters
and collaborators to provide such data.  The track record of the principal investigator
suggests that this will develop naturally. (Recommendation #14 in Table 1)

ORD Response:  With successful proof-of-principle (2 yrs), the computational model of
early eye development will be used to create general models of morphogenesis during
subsequent years. Any proposed model of chemical dysmorphogenesis must be
sufficiently abstract to be computationally feasible and yet detailed enough to enable the
realistic expression of developmental defects across chemicals, doses, tissues, stages, and
species. The data needs of the proposed models will be significant as noted by the
Subcommittee. Preliminary computational models can attach existing data from in vitro
studies and semi-arbitrary parameters from in silico resources. These models will be
calibrated across species (zebrafish, mouse, rat, human) and tested for predictive
capacity. In this regard, the Virtual Embryo will leverage data generated by NCCT's
high-throughput chemical screening and prioritization research program (ToxCast™,
ToxRefDB) to model developmental toxicity pathways.

Importantly, to stimulate research in this area, NCER  released a funding opportunity
under its Science To Achieve Results (STAR) research program,  "Computational
Toxicology Research Centers: in vitro and in silico models  of developmental toxicity
pathways" (EPA-G2008-STAR-W). Collaboration with future STAR center(s) can
provide experimental data to identify developmental toxicity pathways and computational
models for developmental defects.

Because conservation of cell signaling is a founding principle of early development
across species and stages,  the in silico toolbox is likely to be extensible across
morphoregulatory responses. As such, in silico models built from scratch can be
generalized to other systems (neural tube, cardiac, urogenital) and alternative models
(embryonic stem cell assays, zebrafish embryos) for chemical-pathway interactions. In
this regard, the Virtual Embryo has begun to identify and enlist collaborators at NHEERL
to help provide such data.

High-throughput platforms now offer a powerful means of data gathering to discover key
biological pathways leading to apical endpoints of toxicity, and computational modeling
structures our ability to integrate these data across biological scales to build predictive
models that address mode-of-action. Successful computational models can become
increasingly important in EPA efforts to translate pathway-level data into risk
assessments, and in that regard the Virtual Embryo has also begun to identify and enlist
support from NCEA. A website has been developed to communicate publicly about
the project (http://www.epa.gov/ncct/v-Embryo/).

Charge Question 3 continued:

Arsenic BBDR: The Subcommittee encourages continuous communication with the
appropriate program office personnel so that concerns, objections, and skepticism can be
addressed early and on an ongoing basis.  The group is commended for having such
communication already in place and it is encouraged to maintain that communication to
the greatest degree possible.  (Recommendation #15 in Table 1)

ORD Response:  As discussed in the response to charge question 2, this project was
largely terminated in 2008, with the exception of a few smaller efforts on
pharmacokinetics of arsenic. NCCT efforts are being redirected to incorporate concepts
of BBDR in the virtual tissue models, particularly from the viewpoint of dose-response
extrapolation.  Additional NCCT efforts are being directed at interpreting the results of
ToxCast™ in vitro concentration responses relative to the range of potential external
exposures that could provide equivalent tissue level responses (i.e., reverse
toxicokinetics). As we move forward in these areas, we will ensure adequate discussion
with client offices in EPA takes place on a routine basis.
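
One simple way to frame the reverse toxicokinetics idea mentioned above is to scale an in vitro concentration by the steady-state plasma concentration expected per unit of external dose. The sketch below assumes linear kinetics and a known steady-state concentration for a dose of 1 mg/kg/day. Both assumptions, and the function and parameter names, are illustrative only and are not taken from the NCCT work.

```python
def oral_equivalent_dose(in_vitro_conc_um, css_um_per_unit_dose):
    """Estimate the external dose (mg/kg/day) matching an in vitro concentration.

    in_vitro_conc_um: characteristic in vitro concentration, in micromolar.
    css_um_per_unit_dose: assumed steady-state plasma concentration (micromolar)
        produced by a constant dose of 1 mg/kg/day.
    Linear kinetics are assumed, so concentration scales in proportion to dose.
    """
    if css_um_per_unit_dose <= 0:
        raise ValueError("steady-state concentration per unit dose must be positive")
    return in_vitro_conc_um / css_um_per_unit_dose
```

For example, under these assumptions an in vitro concentration of 10 micromolar and a steady-state concentration of 2 micromolar per 1 mg/kg/day would correspond to an external dose of 5 mg/kg/day, which could then be compared with the range of potential exposures.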

Charge Question 4:  Does the work offer to significantly improve environmental
health impacts and is the path toward regulatory acceptance and utilization
apparent?

ORD Response:  ORD is very appreciative of the committee's affirmation of work and
progress  in ToxCast™, Informatics, and the virtual tissues. The NCCT will present
further updates on progress at the next committee review.

Charge Question 4a: In addition, specifically for the Arsenic BBDR project:
Does the proposed computational model have the potential to identify and reduce
uncertainties with the risk assessment process?

The answer to this question is yes, depending on data gaps identified and resources made
available. This study might not give all the answers but will get us halfway there. EPA
recognizes that developing a universal arsenic model describing several cancer
endpoints is a formidable challenge. Hence a step-wise research project with an eye for
the future is proposed. Initially, a generic model for cancer will be developed that will
incorporate key steps of the mode of action commonly shared for multiple cancer types
such as oxidative stress. This model, in turn, will serve as an engine to develop specific
cancer models as the need arises and resources become available. To ascertain whether
appropriate steps are being incorporated, a thorough literature review of experimental
and epidemiological data and expert consultation has been proposed. It also is
acknowledged that even though there is a lot of data, they are somewhat weak to
generate exposure time course response curves. Appropriate experiments have been
proposed to fill the research needs to develop a realistic model.

ORD Response:  Please see earlier response regarding the arsenic BBDR project.

Charge Question 4b: Will the model be able to help identify susceptible populations
and compare potential risks in those populations with less susceptible populations?

Yes, the initial generic model development exercise will allow identification of issues
such as mechanisms that operate in general versus in subpopulations, such as susceptible
populations with varying degrees of arsenic methylation. Such issues could be the subject
of workshops to explore the issue of the extent of polymorphism in the human population.

The short-term (1-2 years) goal is the establishment of a coordinated program of
laboratory research to generate essential data needed to develop a BBDR model that will
increase confidence in the predictions. To start with, the model development will be
initiated with available data. Work proposed includes multistage clonal growth modeling,
target tissue dosimetry, and methylated metabolites of arsenic.
The long-term (3-5 years) goal of developing a robust version might be too optimistic. As
the project gets underway, new questions and issues might be identified that will require
additional laboratory research and continued resources. The project has a good future as
it can be easily adapted to the latest (2007) National Academies toxicity testing report
that recommends a systems biology and computational tool integration.

ORD Response:  Please see earlier response regarding the arsenic BBDR project.

Charge Question 4c: Is coordination between model development and associated
data collection sufficient to avoid problems with models being either over- or under-
determined?

Yes, it is desirable to see what health effects are caused at lower doses to avoid the
potential of compromise in setting an arsenic standard based on cost-benefit analysis.

ORD Response:  Please see earlier response regarding the arsenic BBDR project.

Charge Question 5: Have appropriate data management and analysis tools been
incorporated into the project?

ToxCast™: The construction of the warehouse remains an open question.  Ultimately, a
database is a model of the interactions that exist in the underlying data and the relationships
relevant to the analysis that will be performed.  The diversity of the data, representing a wide
range of in vivo and in vitro assays from multiple species, makes building such a model a
significant challenge.  The project seems to be lacking a set of analytical objectives
necessary for building the relevant use cases that ultimately will inform the process of
database construction, and this ultimately will determine its utility. At this stage, ToxCast
needs to begin to define analytical outcomes in order to set goals and milestones with regard
to developing and validating analytical protocols. This is an essential step at this point as it
will help to anchor future development and make it relevant.  This also will help to define the
requirements of the interfaces that are built to access the data.

Further, the ToxCast group should be encouraged to release the data and databases at the
earliest possible time and to consider a "CAMDA-like " workshop in which the research
community is offered access to the data with the challenge of using the data to effectively
predict end points. At least three advantages to the program will be derived from these
efforts. First, public release will help to drive the creation of relevant use cases that will
further database development. Second, it will assist in evaluating data access protocols and
tools to assure the greatest utility to the research and regulatory community.  Third, it will
accelerate the development of predictive algorithms to combine the data to make predictions
about relevant phenotypic outcomes. (Recommendations #6, 10 and 16 in Table 1)

ORD Response:  The first part of this question (database design and construction) was
addressed in the  response to  charge question 3. The ToxMiner database is able to capture
and provide all of the summary information which we believe is going to be useful for
statistical and pathway-based analysis of the ToxCast data sets.

The second question relates to analytical outcomes. By this we assume the reviewer
means the desired outcomes  of analyses of the ToxCast data.  We believe that the
outcome of ToxCast will be  a series of well-defined procedures that take as input the
results of a set of in vitro assays run on a chemical and give a result which is a statement
about the likelihood that the  chemical will lead to a particular toxicity phenotype.  The
simplest procedure is a formula (e.g. a logistic regression model) that uses the IC50
values for several assays and gives a binary prediction for a particular toxicity. More
complex procedures would use the results from a set of assays to predict whether a
particular pathway is activated. Then we could have a function that predicts the
likelihood of the outcome, given the activation of one or more pathways. The current
database has been designed to hold both the numerical data required to test these models,
and the model parameters and outcomes.  In summary, we feel that this issue has in
general been resolved over the last several months although many details still need to be
worked out, particularly regarding the best statistical approaches to be used, and the
precise way that pathway information will be incorporated.
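
The simplest procedure described above, a logistic regression over IC50 values that yields a binary prediction, can be sketched as follows. The coefficients, intercept, threshold, assay names, and the choice of log10-transformed IC50 inputs are all hypothetical; in practice they would be fitted to and tested against the in vivo data, and this sketch makes no claim about the models the ToxCast team actually uses.

```python
import math


def predict_toxicity(log_ic50_by_assay, coefficients, intercept, threshold=0.5):
    """Return (probability, binary call) from a logistic model over assay IC50s.

    log_ic50_by_assay: dict mapping assay name to log10(IC50 in micromolar).
    coefficients: dict mapping assay name to its (hypothetical) fitted weight.
    intercept: (hypothetical) fitted intercept.
    A KeyError is raised if an assay named in coefficients has no input value.
    """
    z = intercept + sum(
        weight * log_ic50_by_assay[assay] for assay, weight in coefficients.items()
    )
    probability = 1.0 / (1.0 + math.exp(-z))
    return probability, probability >= threshold


# Example with made-up numbers: negative weights mean that a lower IC50
# (a more potent chemical) raises the predicted likelihood of the toxicity.
weights = {"assay_a": -1.0, "assay_b": -0.5}
probability, is_toxic = predict_toxicity({"assay_a": 0.0, "assay_b": 0.0}, weights, 2.0)
```

A pathway-based procedure could reuse the same form in two stages, first predicting pathway activation from assay results and then predicting the outcome from the activated pathways.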

With regard to the last comment by the reviewer, a recommendation that we hold a
CAMDA-like workshop, we are currently planning such a meeting to be held in May
2009.  We plan to make all of the ToxCast™ data available to analysis partners in early
2009.  By having a larger community trying many analysis techniques on this data, we
will maximize our chances of success.

Charge Question 5 continued:

V-Liver:  With regard to populating the KB, the use of NLP probably is not the best
solution. NLP does not work well with the scientific literature, and its application in this
domain remains an area of active research.  Application of NLP has the potential to
introduce a great deal of noise in the system, leading to many potential false associations
that could lead to more problems than it solves.  Consequently, other methods, including
expert or community curation, should be explored.

On a larger scale, the greatest potential problem will be linking each of the domain-specific
models to build a predictive system. Again, this remains an area of active
research and one that may present significant barriers to developing verifiable solutions.
The greatest challenges will be to validate any models that emerge from the analysis.
Finally, there is a need to develop standards for interactivity and try to interface with
developing standards within the community. (Recommendation #17 in Table  1)

ORD Response: Linguistic resources have several applications in the Virtual Embryo,
although an important challenge noted by the Subcommittee is to unambiguously code
unstructured (text) data in a form that can be processed by a computer to derive
interesting relationships and causality.  Querying within the proper context can make
these more precise and less noisy.  NLP enhances the coarse semantic search for specific
concepts and then provides a way to automatically extract the key facts, relationships and
quantitative information.  The results are then presented to an analyst to do manual
quality control and data cleaning. As such, NLP complements, but does not replace the need
for, a formal concept model (ontology) to organize the relevant information about
developmental processes and toxicities that is often present in literature in an
unstructured format.

As also noted by the Subcommittee, a broader network of expertise within the
developmental toxicology community may be useful for building the information network.
Virtual Embryo has incorporated two open ontologies to arrange information, one for
embryology and the other for developmental toxicology, and implemented this ontology in
Protege (http://obofoundry.org/). This formal ontology will be available for community
participation in linking  each of the domain-specific models to build a predictive system
for the embryo as a whole. Furthermore, informal ontologies that include less explicit
information about a pattern of malformations and underlying embryology can make a
useful contribution when the end-user is knowledgeable about the field.  Hence, Virtual
Embryo is piloting a Wiki-space (http://v-embryo.wikispaces.com/) to generate
hypotheses about the co-occurrence of specific malformations to common embryology,
or the relationship of genetic defects to higher-order phenotypes, for building
representations of pathways and processes that can be tested through further experiments
and analyses.

Charge Question 5 continued:
V-Embryo: It remains to be seen how well it will eventually integrate with the overall
program, and its integration with other internal and external initiatives needs to be
resolved.  Nevertheless, it appears that this project could provide an  opportunity to
explore the results emerging from ToxCast, and it may help direct selection of the next
generation of compounds for analysis in ToxCast. (Recommendation #18 in Table 1)
ORD Response: Although still early in its development, Virtual Embryo has begun to
integrate with other activities, especially ToxCast™ and the Virtual Liver. Since its
inception last December and the review addressed here, the v-Embryo has been:
    1.  integrated into NCCT's Computational Toxicology Research second
       generation Implementation Plan;
    2.  presented at five seminars at EPA (including NCEA) and six seminars
       outside EPA (including a Gordon Research Conference);
    3.  introduced at NCCT's Computational Toxicology education course, at two
       presentations describing the implementation of prenatal developmental
       studies in ToxRefDB (manuscript in preparation), and one presentation on
       ToxCast™'s NovaScreen assay (manuscript in preparation);
    4.  the topic of one book chapter (in print) and seven abstracts (five in print
       and two accepted);
    5.  reflected in one submitted abstract in collaboration with Virtual Liver, and
       three submitted abstracts in collaboration with ToxCast™; and
    6.  presented in the virtual tissue research section at the Human Health
       Program Review (BOSC, January 2009).

                  Appendix I: Summary Action Items
                Detailed Milestones in response to Charge Question 2

ToxCast™:

FY09
   •   First initial publications and public access to ToxCast™ in vitro assay data
   •   Completion of generating all of the ToxCast™ Phase I data
   •   Sharing of ToxCast™ Phase I data with data analysis partners and hosting of the
       first "ToxCast™ Data Analysis Summit"
   •   Develop a series of "signatures" linking ToxCast™ in vitro data with ToxRefDB
       in vivo toxicology.
   •   Initiate generation of ToxCast™ Phase II data
   •   Quarterly public releases of new ToxCast™ data of various study types
FY10
   •   Quarterly public releases with new ToxCast™ data
   •   Completion of generating all of the ToxCast™ Phase II data
   •   Sharing of ToxCast™ Phase II data with data analysis partners and hosting of the
       second "ToxCast™ Data Analysis Summit"
   •   Validation of predictive "signatures" linking ToxCast™ in vitro data with
       ToxRefDB in vivo toxicology
FY11
   •   Quarterly public releases with new ToxCast™ data
   •   Application of toxicity predictions from Phases I and II of ToxCast™ to chemical
       prioritizations in EPA Program Offices
   •   Initiate generation of ToxCast™ Phase III data on chemicals and nanomaterials
       requiring prioritization

ACToR:

FY09
   •   Initial public deployment
   •   Significant version 2, including refined chemical structure information
   •   Develop workflow for tabularization of data buried in text reports
   •   Integrate all ToxCast™ and ToxRefDB data
   •   Quarterly releases with new ToxCast™ data
FY10
   •   Implementation of a process to gather tabular data on priority chemicals from text
       reports
   •   Perform survey of sources of exposure data and import any remaining sources
   •   Develop flexible query interface and data download process
   •   Develop process to extract data from open literature
FY11
   •   Quarterly releases with new ToxCast™ data

ToxRefDB:

FY09
   •   Initial public deployment of chronic toxicity data
   •   Public deployment of reproductive and developmental toxicity data
   •   Develop flexible query interface and data download process
   •   Develop workflow for curation of similar, but non-guideline chronic, reproductive
       and developmental study types
   •   Public deployment of developmental neurotoxicity data
   •   Quarterly public releases of new data of various study types
FY10
   •   Quarterly releases with new ToxCast™ data
   •   Implementation of a process to curate data on ToxCast™ Phase II chemicals from
       multiple sources
FY11
   •   Quarterly releases with new ToxCast™ data

DSSTox:

FY09
   •   Publish paper and property files on ToxCast 320 chemical inventory, with
       guidance for SAR modeling study
   •   Publish DSSTox ToxCast 320 categories file and DSSTox ToxRef summary data
       files
   •   Coordinate efforts to structure-annotate and provide effective linkages to
       microarray data for toxicogenomics
   •   Compile and publish public genetic toxicity data and SAR predictions for
       ToxCast 320
   •   Restart Chemoinformatics Communities of Practice using EPA's Science Portal
FY10
   •   Publish new DSSTox database and doc
   •   Explore new approaches to SAR modeling based on feature categories within
       existing DSSTox files and ToxCast™ data
   •   Expand CEBS collaboration to incorporate DSSTox chemical content and create
       chemical linkages to external projects
   •   Separately publish DSSTox structure inventory with various chemical
       classifications for use in modeling using publicly available tools
FY11
   •   In collaboration with ACToR, establish procedures and protocols for automating
       chemical annotation of new experimental data submitted to CEBS or NHEERL
   •   Document and employ PubChem analysis tools in relation to published DSSTox
       and ToxCast™ data inventory in PubChem
   •   Collaborate with SAR modeling efforts to predict ToxCast™ endpoints using in
       vitro data
   •   Continue expansion of DSSTox public toxicity database inventory for use in
       modeling with co-publication and linkage to ACToR and PubChem

v-Liver:

 FY09
   •   Prioritize proof of concept (PoC) environmental chemicals with clients. Using
       toxicity data from ToxRefDB and bioactivity data from ToxCast™, a subset of
       Phase I chemicals will be selected for the PoC, which will be finalized in
       collaboration with program offices to ensure relevance to EPA needs.
   •   Begin deployment of v-Liver KB on physiologic processes perturbed by PoC
       chemicals. The first version of the KB will focus on the PoC chemicals and will be
       populated mostly with their molecular activity data from ToxCast™, and cellular
       and tissue level outcomes from ToxRefDB and the literature.
   •   Deploy KB visualization tool for client interaction. Access to the KB will be
       provided using open source tools for biological data analysis.
   •   Simulate liver lesions for alternative MOA/toxicity pathways. The prototype of
       the lesion simulator will implement the main MOA for hepatocarcinogenesis.
FY10
   •   Evaluate simulator using PoC chemicals and ToxCast data to predict outcomes.
   •   Quarterly update of v-Liver KB
   •   v-Liver KB inference tool for analyzing MOA for new chemicals/mixtures
   •   Extend v-Liver Simulator to liver and integrate with PBPK model

   •   Evaluate impact of genomic variation on cellular responses and lesion formation
   •   Evaluate v-Liver for simulating human pathology  outcomes using clinical  data

Most milestones will also include manuscript submissions describing the computational
methods and their biological/toxicological relevance.

v-Embryo:

   •   Literature-mining tools to index relevant facts about early eye development and
       concept model (ontology) to support this knowledge representation [2];
   •   Ocular gene network schema specified by gene-gene and gene-phenotype
       associations and subjected to dynamical network inference analysis;
   •   Computer program of early eye development that reconstructs lens vesicle
       induction in silico using cell-based simulators and system-wiring diagrams;
   •   Perturbation analysis of the computational (in silico) model with pathway-level
       data for normal and abnormal (toxicological) phenotypes in vitro and in vivo.

FY09
   •   Project plan and quality assurance plans for VT-KB and VT-SE
   •   Recruit:  student contractor and postdoctoral fellow
   •   Manuscript: application of VT-KB to analyze ToxRefDB developmental toxicity
       studies
   •   Model: VT-KB based qualitative (structural) model of self-regulating ocular gene
       network
   •   Model: VT-SE based cell-based computational model of lens-retina induction
   •   Manuscript: ocular morphogenesis, gene network inference, analysis and
       modeling
FY10
   •   Project plan: extend lens-retina model to other stages and species
   •   Model: incorporate pathway data from ToxCast™, mESC and ZF embryos
   •   Manuscript: sensitivity analysis for key biological pathways
   •   Manuscript: analyze developmental trajectories and phenotypes in computational
       models
   •   Project plan: integrate with other morphogenetic models
FY11
   •   Manuscript: test model against predictions for pathway-based dose-response
       relationship
   •   Manuscript: uncertainty analysis of models for complex systems
   •   Model: computer program of early eye development using rules-based architecture,
       cell-based simulators and systems-wiring diagrams

-------
     Table 1.  Summary of Recommendations and Proposed Actions.

1.  Recommendation: Involve stakeholders in future BOSC meetings.
    Action Item: Identify and invite key stakeholders to attend and participate in BOSC
        meetings. Further, these stakeholders will also be asked to review and comment
        on new implementation plan.
    Time Line: Early 2009

2.  Recommendation: Hold discussions with risk assessment practitioners.
    Action Item: Discussions with NCEA and others are underway, and regular Communities
        of Practice are also an ongoing activity to help achieve this end.
    Time Line: Ongoing

3.  Recommendation: Develop effective ways of dealing with wealth of data and interact
        with program offices on this issue.
    Action Item: Extensive suite of interactive databases is under development and
        prioritization of data input is in consultation with program offices and others.
    Time Line: Ongoing

4.  Recommendation: Relevance of ToxCast™ beyond prioritization to risk assessment,
        including exposure paths, and ecology.
    Action Item: Expanded workgroups to address exposure pathways through ExpoCast.
        Partnering with NHEERL and NERL for HTS for ecological species other than human.
        Testing of pharmaceuticals in Phase II of ToxCast™ to compare results to known
        human toxicities.
    Time Line: 2009-2012

5.  Recommendation: Involve risk assessors and others in program offices for planning on
        eventual application of v-Liver to risk assessment.
    Action Item: Project team includes NCEA and will be expanded to include others.
        Consultations with program office to identify practical use cases that
        demonstrate utility of virtual tissues.
    Time Line: Begin in 2009

6.  Recommendation: Detailed time table for milestones: ToxCast™, v-tissues, IT/IM.
    Action Item: Developed and put in place (See Appendix I for details).
    Time Line: Ongoing

7.  Recommendation: Identify and prioritize key objectives for v-Liver with milestones.
    Action Item: Short-term goals: Identify environmental chemicals for proof of concept
        (PoC) in consultation with stakeholders; use of ToxCast™ data to begin
        quantitative parameterization of cellular and molecular responses - see
        Appendix I for detailed milestones. Long-term goals - use in RA with details
        being developed and to be presented at next BOSC review.
    Time Line: 2009

8.  Recommendation: Develop more detailed plan for v-Embryo with milestones.
    Action Item: Plan has been developed with long-term goal of a computational framework
        enabling predictive modeling of prenatal developmental toxicity; please see
        Appendix I for milestones.
    Time Line: 2009

9.  Recommendation: Develop milestones for Arsenic BBDR.
    Action Item: Please see narrative for significant change in plans for this work.
    Time Line: N/A

10. Recommendation: Compile a list of specific use cases for specific questions that will
        be addressed with the database of ToxCast™ data.
    Action Item: A goal of Phase I of ToxCast™ is to find links between in vitro and in
        vivo toxicity as captured in ToxRefDB. To achieve this, the ToxMiner database is
        being organized into five main pieces - please see narrative for details.
    Time Line: 2009

11. Recommendation: Exploration of alternatives to natural language processing (NLP).
    Action Item: Approaches for improving on previous uses of NLP are underway - greater
        dependence upon further testing and analyses for starting points derived
        through NLP.
    Time Line: Ongoing

12. Recommendation: Develop explicit milestones and research questions for addressing
        EPA's goals, and use this to focus first iterations of development of both the
        KB and model(s).
    Action Item: Questions and associated milestones have been developed:
        1)  Modeling of tissue level adverse effects to enable better extrapolation by
            formalizing the description of key events leading to adverse outcomes;
        2)  Extrapolating the tissue level effects across doses and time.
    Time Line: Ongoing

13. Recommendation: Delineate model specifications for sharing between models of
        different scales that can then be interconnected when appropriate.
    Action Item: International workshop in April 2009 will include multi-scale modeling
        experts to consider this issue and NCCT is collaborating with PBPK modelers to
        develop formal specifications to ease model integration.
    Time Line: Ongoing

14. Recommendation: Enlist appropriate supporters and collaborators to gain necessary
        data for developing v-embryo.
    Action Item: NCCT has worked with NCER to develop STAR funding opportunities that,
        through collaboration, can provide key data. In addition, collaborators in
        NHEERL have been identified and discussions have begun.
    Time Line: Ongoing

15. Recommendation: Continuous communication with program office personnel regarding
        Arsenic BBDR.
    Action Item: Please see memo regarding the suspension of this project. ORD's decision
        on this project was in consultation with program office and based on the program
        office's plans for this chemical and their reduced need for this modeling effort.
    Time Line: 2008

16. Recommendation: ToxCast needs to define analytical outcomes to develop and validate
        analytical methods.
    Action Item: The outcome of ToxCast will be a series of well-defined procedures that
        take as input the results of a set of in vitro assays run on a chemical and give
        a result which is a statement about the likelihood that the chemical will lead
        to a particular toxicity phenotype.
    Time Line: Ongoing

17. Recommendation: Limitations of natural language processing (NLP) for v-embryo and
        using alternatives.
    Action Item: The NCCT proposes using NLP as a starting point and then presenting the
        results to an analyst for manual quality control. NLP is used to extend, but not
        replace, the need for formal concept modeling (ontology) to organize relevant
    Time Line: Ongoing
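
As an illustration of the kind of procedure described under Recommendation 16 (in vitro assay results in, a likelihood statement about a toxicity phenotype out), the following minimal Python sketch scores binary assay hits with a logistic function. This is an assumption made purely for illustration and is not the ToxCast method; the assay names, weights, and bias are hypothetical placeholders, not fitted or published values.

```python
import math


def toxicity_likelihood(assay_hits, weights, bias=0.0):
    """Map binary assay hits (1 = active, 0 = inactive) to a value in (0, 1).

    assay_hits: dict of assay name -> 0 or 1 for one chemical.
    weights:    dict of assay name -> hypothetical weight; assays without a
                weight contribute nothing to the score.
    bias:       hypothetical baseline term.
    """
    z = bias + sum(weights.get(name, 0.0) * hit for name, hit in assay_hits.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical inputs for a single chemical (placeholders only).
hits = {"assay_a": 1, "assay_b": 0, "assay_c": 1}
weights = {"assay_a": 1.2, "assay_b": 0.8, "assay_c": -0.3}

score = toxicity_likelihood(hits, weights, bias=-1.0)
print(f"likelihood score: {score:.3f}")
```

In practice the weights would have to be derived from, and validated against, in vivo reference data; the sketch only shows the input/output shape of such a procedure.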

-------
  EPA CompTox Research Program FY2009-2012                 BOSC Review Draft- 24 August, 2009
Synopsis: ACToR is a web-based informatics platform, organized at the top level by chemical
and chemical structure, that indexes, collects, and organizes many types of data on
environmental chemicals. Environmental chemicals are defined as those likely to be in the
environment, including all chemicals regulated or tracked by the EPA, as well as related
chemicals, such as pharmaceuticals that find their way into water sources. ACToR indexes
and links to data from hundreds of sources, including the EPA, FDA, CDC, NIH, academic
groups, other governmental agencies (state and national) and international organizations, such as
the WHO. Information being indexed and gathered includes in vivo toxicity, in vitro bioassay
data, use levels, exposure information, chemical structure, regulatory information and other
descriptive data. Planning for the project began in mid-FY07; beta versions have been available
inside the EPA since early FY08, and a public version became available in December 2008. ACToR
consists of a back-end database and a front-end web interface built on low-cost, publicly
accessible applications and tools. Over the next 3 years, ACToR will expand to include more
publicly available resources and data, including more information extracted from text reports and
tabularized, and more information on chemical use and exposure. The latter effort will be
coordinated with the efforts of ExpoCast™ and NERL to identify, index and extract data from
exposure-related resources of highest interest and importance to EPA programs. In planned
upgrades to ACToR, the ability of users to perform flexible searches across different layers of
data will be enhanced, and customized data downloads will be implemented. ACToR will serve
as the primary vehicle to aggregate and publicly disseminate all published data associated with
the ToxCast™, ToxRef, and Tox21 research projects. Additionally, the ToxMiner and NCCT
Chemical Repository systems are being developed as part of ACToR. These are data repositories
and data analysis engines for the ToxCast/Tox21 projects.
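
As an illustration only, the top-level organization described in the synopsis (records from many sources grouped by chemical, then searched across data types) can be sketched in a few lines of Python. The chemical identifiers, source names, field names, and functions below are hypothetical and are not taken from ACToR itself; the values are placeholders, not real data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One indexed data point (all field names are illustrative)."""
    chemical_id: str  # top-level key; a real system might use a registry number
    source: str       # originating agency or database
    data_type: str    # e.g. "in vivo toxicity", "exposure"
    value: str        # placeholder payload


def index_by_chemical(records):
    """Group records by chemical, then by data type."""
    index = defaultdict(lambda: defaultdict(list))
    for rec in records:
        index[rec.chemical_id][rec.data_type].append(rec)
    return index


def search(index, chemical_id, data_type=None):
    """Return records for a chemical, optionally limited to one data type."""
    by_type = index.get(chemical_id, {})
    if data_type is not None:
        return list(by_type.get(data_type, []))
    return [rec for recs in by_type.values() for rec in recs]


records = [
    Record("CHEM-001", "SourceA", "in vivo toxicity", "placeholder 1"),
    Record("CHEM-001", "SourceB", "exposure", "placeholder 2"),
    Record("CHEM-002", "SourceA", "in vitro bioassay", "placeholder 3"),
]
index = index_by_chemical(records)
print(len(search(index, "CHEM-001")))
print([r.source for r in search(index, "CHEM-001", "exposure")])
```

Keying everything on the chemical first is what lets a single query pull together toxicity, exposure, and regulatory records that originate from unrelated sources.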

Partnerships/Collaborations (Internal & External):
   1. EPA ToxCast™ program - provide data for use in selecting chemicals and providing
      toxicology data for validation; provide route for publication of data
   2. Tox21 partnership - provide data for use in selecting chemicals and providing toxicology
      data for validation; provide route for publication of data
   3. DSSTox coordination - align methods for registering high-interest chemical inventories
      (ToxCast™, ToxRef, Tox21, DSSTox published data files), utilizing DSSTox chemical
      information quality review and structure-annotation within ACToR
   4. EPA Centers and Offices (OPPT/OPP/NCEA/OW) - provide data on chemicals of
      interest

Milestones/Products:
FY09

    1. Initial public deployment.
   2. Significant version 2, including refined chemical structure information.
   3. Develop workflow for tabularization of data buried in text reports.
   4. Integrate all ToxCast™ and ToxRefDB data.
   5. Quarterly releases with new data.

-------