Thursday
December 5, 1991
Part V
Guidelines for Developmental Toxicity
Risk Assessment; Notice

-------
         63798           Federal Register / Vol. 56, No. 234 / Thursday, December 5, 1991 / Notices
         ENVIRONMENTAL PROTECTION
         AGENCY
        [FRL-4038-3]
        Guidelines for Developmental Toxicity
        Risk Assessment
        AGENCY: U.S. Environmental Protection
        Agency (EPA).
        ACTION: Final Guidelines for
        Developmental Toxicity Risk
        Assessment.

        SUMMARY: The U.S. Environmental
        Protection Agency (EPA) is today
        issuing final amended guidelines for
        assessing the risks for developmental
        toxicity from exposure to environmental
        agents. As background information for
        this guidance, this notice describes the
        scientific basis for concern about
        exposure to agents that cause
        developmental toxicity, outlines the
        general process for assessing potential
        risk to humans because of
        environmental contaminants,
        summarizes the history of these
        guidelines, and addresses public and
        Science Advisory Board comments on
        the 1989 "Proposed Amendments to the
        Guidelines for the Health Assessment of
        Suspect Developmental Toxicants" [54
        FR 9380-9403]. These guidelines, which
        have been renamed "Guidelines for
        Developmental Toxicity Risk
        Assessment" (hereafter "Guidelines"),
        outline principles and methods for
        evaluating data from animal and human
        studies, exposure data, and other
        information to characterize risk to
        human development, growth, survival,
        and function because of exposure prior
        to conception, prenatally, or to infants
        and children. These Guidelines amend
        and replace EPA's 1986 "Guidelines for
        the Health Assessment of Suspect
        Developmental Toxicants" [51 FR 34028-
        34040] by adding new guidance on the
        relationship between maternal and
        developmental toxicity, characterization
        of the health-related data base for
        developmental toxicity risk assessment,
        use of the reference dose or reference
        concentration for developmental toxicity
        (RfDDT or RfCDT), and use of the
        benchmark dose approach. In addition,
        the Guidelines were reorganized to
        combine hazard identification and dose-
        response evaluation since these are
        usually done together in assessing risk
        for human health effects other than
        cancer.
        EFFECTIVE DATE: The Guidelines will be
        effective December 5, 1991.
        FOR FURTHER INFORMATION CONTACT:
        Dr. Carole A. Kimmel, Reproductive and
        Developmental Toxicology Branch,
        Human Health Assessment Group,
  Office of Health and Environmental
  Assessment (RD-689), U.S.	
  Environmental Protection Agency, 401 M
  Street, SW., Washington, DC 20460, TEL:
  202-260-7331, FAX: 202-260-3803.
  SUPPLEMENTARY INFORMATION: The
  Clean Air Act (CAA), the Toxic
  Substances Control Act (TSCA), the
	  Federal Insecticide, Fungicide, and
  Rodenticide Act (FIFRA) and other
  statutes administered by the EPA
  authorize the Agency to protect public
  health against adverse effects from
  environmental pollutants. One type of
  adverse effect of great concern is
  developmental toxicity, i.e., adverse
  effects produced prior to conception,
  during pregnancy and childhood.
  Exposure to agents affecting
  development can result in any one or
  more of the following manifestations of
  developmental toxicity: Death,
  structural abnormality, growth
  alteration, and/or functional deficit.
  These manifestations encompass a wide
  array of adverse developmental end
  points, such as spontaneous abortions,
  stillbirths, malformations, early
  postnatal mortality, reduced birth
  weight, mental retardation, sensory loss,
  and other adverse functional or physical
  changes that are manifested postnatally.
 The Role of Environmental Agents in
 Developmental Toxicity

   Several environmental agents are
 established as causing developmental
 toxicity in humans (e.g., lead,
 polychlorinated biphenyls,
 methylmercury, ionizing radiation),
 while many others are suspected of
 causing developmental toxicity in
 humans based on data from
 experimental animal studies (e.g., some
 pesticides, other heavy metals, glycol
 ethers, alcohols, and phthalates). Data
 for several of the agents identified as
 causing human developmental toxicity
 have been compared to the experimental
 animal data (Nisbet and Karch, 1983;
 Kimmel et al., 1984; Hemminki and
 Vineis, 1985; Kimmel et al., 1990a). In
 these comparisons, the agents causing
 human developmental toxicity in almost
 all cases were found to produce effects
 in experimental animal studies and, in
 at least one species tested, types of
 effects similar to those in humans were
generally seen. This information
provides a strong basis for the use of
animal data in conducting human health
risk assessments. On the other hand, a
number of agents found to cause
developmental toxicity in experimental
animal studies have not shown clear
evidence of hazard in humans, but the
available human data are often too
limited to evaluate a cause and effect
  relationship. The comparison of dose-
  response relationships is hampered by
  differences in route, timing and duration
  of exposure. When careful comparisons
  have been done taking these factors into
  account, the minimally effective dose for
  the most sensitive animal species was
  generally higher than that for humans,
  usually within 10-fold of the human
  effective dose, but sometimes was 100
  times or more higher (e.g.,
  polychlorinated biphenyls [Tilson et al.,
  1990]). Thus, the experimental animal
  data were generally predictive of
  adverse developmental effects in
  humans, but in some cases, the
  administered dose or exposure level
  required to achieve these adverse
  effects was much higher than the
  effective dose in humans.
    In most cases, the toxic effects of an
  agent on human development have not
  been fully studied, even though
  exposure of humans to that agent may
  have been established. At the same
  time, there are many developmental
  effects in humans with unknown causes
  and no clear link with exposure to
  environmental agents. The background
  incidence of human spontaneous
  abortion, for example, was estimated by
  Hertig (1967) to be approximately 50% of
  all conceptions, and more recently,
  Wilcox et al. (1985), using sensitive
  techniques for detecting pregnancy as
  early as 9 days postconception,
  observed that 35% of postimplantation
 pregnancies ended in an embryonic or
 fetal loss. Of those infants born alive,
 approximately 7.4% are reduced in
 weight at birth (i.e., below 2500 g)
 (Selevan, 1981), approximately 3% are
 found to have one or more congenital
 malformations at birth, and by the end
 of the  first postnatal year, about 3%
 more are found to have serious
 developmental defects (Shepard, 1986).
 Of those children born with
 developmental defects, it has been
 estimated that 20% are  due to  genetic
 transmission and 10% can be attributed
 to known exogenous factors (including
 drugs,  infections, ionizing radiation, and
 environmental agents), leaving the
 remaining 70% with unknown causes
 (Wilson, 1977). In a recent hospital-
 based  surveillance study (Nelson and
 Holmes, 1989), 50.7% of congenital
 malformations were estimated to be due
 to genetic or multifactorial causes, while
 3.2% were associated with exposure to
 exogenous agents and 2.9% to twinning
 or uterine factors, leaving 43.2% to
 unknown causes. The proportion of the
 effects with unknown causes that may
 be attributable to environmental agents
 or to a  combination of factors, such as
 environmental agents and genetic
 factors, nutritional deficiencies, alcohol
 consumption, direct or indirect exposure
 to tobacco smoke, use of prescribed and
 illicit drugs, etc., is unknown.
   The social and economic impact of
 developmental disabilities on the
 population is extremely high. Close to
 one-half of the children in hospital
 wards are there because of prenatally
 acquired malformations (Shepard, 1980).
 According to the Centers for Disease
 Control, congenital anomalies, sudden
 infant death syndrome, and prematurity
 combined account for more than 50% of
 infant mortality among all races in the
 United States (National Center for
 Health Statistics, 1988). In addition,
 among the leading causes of estimated
 years of potential life lost (YPLL) due to
 death before the age of 65, congenital
 anomalies, prematurity, and sudden
 infant death syndrome combined rank
 third (Centers for Disease Control,
 1988a, b). The YPLL estimates for
 developmental defects may actually
 underestimate the public health impact
 because the estimates do not include
 prenatal deaths, they are based only on
 those cases that die before age 65 and
 do not account for limited quality of life,
 and pregnancies may be terminated
 early due to prenatal diagnosis of
 developmental defects.
   These data provide the basis for a
 long-standing interest by Federal
 agencies that deal with human health to
 protect against exposures to agents that
 cause developmental toxicity, and most
 of these regulatory agencies have
 provisions for considering data on
 developmental toxicity in protecting
 human health. As a step in developing
 procedures for interpreting toxicity data
 in the regulatory context, the National
 Academy of Sciences/National
 Research Council, in 1983, published a
 framework for the risk assessment
 process, which EPA uses as the basis for
 its risk assessment guidelines and for
 the assessment of risk due to
 environmental agents.

 The Risk Assessment Process and Its
 Application to Developmental Toxicity
   Risk assessment is the process by
 which scientific judgments are made
 concerning the potential for toxicity to
 occur in humans. The National Research
 Council (1983) has defined risk
 assessment as including some or all of
 the following components: Hazard
 identification, dose-response
 assessment, exposure assessment, and
 risk characterization. In general, the
 process of assessing the risk of human
 developmental toxicity may be adapted
 to this format. In practice, however,
 hazard identification for developmental
 toxicity and other noncancer health
 effects is usually done in conjunction
 with an evaluation of dose-response
 relationships, since the determination of
 a hazard is often dependent on whether
 a dose-response relationship is present
 (Kimmel et al., 1990b). One advantage of
 this approach is that it reflects hazard
 within the context of dose, route,
 duration and timing of exposure, all of
 which are important in comparing the
 toxicity information available to
 potential human exposure scenarios.
 Secondly, this approach avoids labelling
 of chemicals as developmental toxicants
 on a purely qualitative basis. For these
 reasons, the Guidelines combine hazard
 identification and dose-response
 evaluation under one section (Section
 III), and characterize both hazard and
 dose information as part of the health-
 related data base for risk assessment. If
 data are considered sufficient for risk
 assessment, an oral or dermal reference
 dose for developmental toxicity (RfDDT)
 or an inhalation reference concentration
 for developmental toxicity (RfCDT) is
 then derived for comparison with human
 exposure estimates. A statement of the
 potential for human risk and the
 consequences of exposure can come
 only from integrating the hazard
 identification/dose-response evaluation
 with the human exposure estimates in
 the final risk characterization.
 Combining hazard identification and
 dose-response evaluation, as well as
 development of the RfDDT and RfCDT, are
 revisions of the 1986 Guidelines.
  Hazard identification/dose-response
 evaluation involves examining all
 available experimental animal and
 human data and the associated doses,
 routes, timing and duration of exposures
 to determine if an agent causes
 developmental toxicity and/or maternal
 or paternal toxicity in that species and
 under what exposure conditions. The
 no-observed-adverse-effect-level
 (NOAEL) and/or the lowest-observed-
 adverse-effect-level (LOAEL) are
 determined for each study and type of
 effect. Based upon the hazard
 identification/dose-response evaluation
 and criteria provided in these
 Guidelines, the health-related data base
 can be characterized as sufficient or
 insufficient for use in risk assessment
 (Section III.C). Because of the limitations
 associated with the use of the NOAEL,
 the Agency is evaluating the use of an
 additional approach, i.e., the benchmark
 dose approach (Crump, 1984), for more
 quantitative dose-response evaluation
 when sufficient data are available. The
 benchmark dose provides an indication
 of the risk associated with exposures
 near the NOAEL, taking into account the
 variability in the data and the slope of
 the dose-response curve.
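
   The NOAEL/LOAEL determination described above can be illustrated
 with a minimal sketch. It assumes that each treated dose group has
 already been judged by the reviewer as showing or not showing an
 adverse effect for one study and one type of effect. The DoseGroup
 type, the function name, the units, and the example doses are
 hypothetical and are not part of these Guidelines.

```python
from typing import List, NamedTuple, Optional, Tuple


class DoseGroup(NamedTuple):
    dose: float    # administered dose (hypothetical units, e.g. mg/kg/day)
    adverse: bool  # reviewer's judgment: adverse effect observed at this dose?


def noael_loael(groups: List[DoseGroup]) -> Tuple[Optional[float], Optional[float]]:
    """Return (NOAEL, LOAEL) for one study and one type of effect.

    The control group (dose 0) is excluded. The LOAEL is the lowest treated
    dose judged adverse; the NOAEL is the highest treated dose below the
    LOAEL that was judged not adverse. Either value may be None.
    """
    treated = sorted((g for g in groups if g.dose > 0), key=lambda g: g.dose)
    loael = next((g.dose for g in treated if g.adverse), None)
    no_effect = [g.dose for g in treated
                 if not g.adverse and (loael is None or g.dose < loael)]
    noael = max(no_effect) if no_effect else None
    return noael, loael


# Hypothetical study: control plus three treated groups.
study = [DoseGroup(0, False), DoseGroup(10, False),
         DoseGroup(50, False), DoseGroup(200, True)]
print(noael_loael(study))
```

   When the lowest treated dose is itself judged adverse, the sketch
 returns no NOAEL, which corresponds to the case in which only a LOAEL
 is available for the study.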
   For the determination of the RfDDT or
 the RfCDT, uncertainty factors are
 applied to the NOAEL (or LOAEL, if a
 NOAEL has not been established) to
 account for extrapolation from
 experimental animals to humans and for
 variability within the human population.
 The RfDDT or RfCDT is generally based
 on a short duration of exposure as is
 typically used in developmental toxicity
 studies in experimental animals. The
 use of the terms RfDDT and RfCDT
 distinguishes them from the oral or dermal
 reference dose (RfD) and the inhalation
 reference concentration (RfC), which
 refer primarily to chronic exposure
 situations (U.S. EPA, 1991). Uncertainty
 factors may also be applied to a
 benchmark dose for calculating the
 RfDDT or RfCDT, but the Agency has little
 experience with applying this approach
 and is currently supporting research
 efforts to determine the appropriate
 methods. As more information becomes
 available, guidance will be written and
 published as an addendum to these
 Guidelines. These approaches are
 discussed further in section III.D.
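
   The application of uncertainty factors described above can be
 sketched as a simple division of the NOAEL (or LOAEL) by the product
 of the factors. This is an illustration under stated assumptions only:
 the function name, the factor labels, the factor values of 10, and the
 example NOAEL are hypothetical, and the choice of factors for a given
 agent is a matter of scientific judgment that the code does not
 capture.

```python
import math
from typing import Iterable


def reference_dose_dt(point_of_departure: float,
                      uncertainty_factors: Iterable[float]) -> float:
    """Divide a NOAEL (or LOAEL) by the product of the uncertainty factors."""
    factors = list(uncertainty_factors)
    if point_of_departure <= 0:
        raise ValueError("point of departure must be positive")
    if any(f < 1 for f in factors):
        raise ValueError("uncertainty factors are assumed to be >= 1")
    return point_of_departure / math.prod(factors)


# Hypothetical inputs: a NOAEL of 50 mg/kg/day and two assumed factors,
# one for animal-to-human extrapolation and one for human variability.
noael = 50.0
factors = {"animal_to_human": 10.0, "human_variability": 10.0}
rfd_dt = reference_dose_dt(noael, factors.values())
print(rfd_dt)  # 50 / (10 * 10) = 0.5 mg/kg/day
```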
   The exposure assessment identifies
 human populations exposed or
 potentially exposed to an agent,
 describes their composition and size,
 and presents the types, magnitudes,
 frequencies, and durations of exposure
 to the agent. The exposure assessment
 provides an estimate of human  exposure
 levels for particular populations from all
 potential sources.
   In risk characterization, the hazard
 identification/dose-response evaluation
 and the exposure assessment for given
 populations are combined to estimate
 some measure of the risk for
 developmental toxicity. As part of risk
 characterization, a summary of the
 strengths and weaknesses in each
 component of the risk assessment is
 discussed along with major
 assumptions, scientific judgments, and,
 to the extent possible, qualitative and
 quantitative estimates of the
 uncertainties. Confidence in the health-
 related data is always presented in
 conjunction with information on dose-
 response and the RfDDT or RfCDT. If
 human exposure estimates are
 available, the exposure basis used for
 the risk assessment is clearly described,
 e.g., highly exposed individuals, or
 highly sensitive or susceptible
 individuals. The NOAEL may be
 compared to the various  estimates of
 human exposure to calculate the
 margin(s) of exposure (MOE). The
 considerations for determining
 adequacy of the MOE are similar to
     those used in determining the
     appropriate size of the uncertainty
     factor
-------
  5. Other Risk Descriptors
  E. Communicating Results
  VI. Summary and Research Needs
  VII. References
Part B: Response to Public and Science
Advisory Board Comments
I. Introduction
II. Intent of the Guidelines
III. Basic Assumptions
IV. Maternal/Developmental Toxicity
V. Functional Developmental Toxicity
VI. Weight-of-Evidence Scheme
VII. Applicability of the RfDDT Concept and
   the Benchmark Dose Approach
Part A: Guidelines for Developmental Toxicity
Risk Assessment

I. Introduction
  These  Guidelines describe the
procedures that the EPA follows in
evaluating potential developmental
toxicity associated with human
exposure to environmental agents. The
Agency has sponsored or participated in
several conferences that addressed
issues related to such evaluations and
that provide some of the scientific basis
for these Guidelines (U.S. EPA, 1982a;
Kimmel et al., 1982b, 1987; Hardin, 1987;
Perlin and McCormack, 1988; Kimmel et
al., 1989; Kimmel and Francis, 1990;
Kimmel et al., 1990a). The Agency's
authority to regulate substances that
have the potential to interfere  with
human development is derived from a
number of statutes that are implemented
through multiple offices within the EPA.
The procedures described herein are
 intended to promote consistency in the
 assessment of developmental toxic
 effects across program offices  within the
 Agency.
   These Guidelines provide a  general
 format for analyzing and organizing the
 available data  for conducting risk
 assessments. The Agency previously has
 issued testing guidelines (U.S. EPA,
 1982b, 1985a, 1989a, 1991a) that provide
 protocols designed to determine the
 potential of a test substance to induce
 structural and/or other adverse effects
 during development. These risk
 assessment Guidelines do not change
 any prescribed statutory or regulatory
 standards for the type of data necessary
 for regulatory action, but rather provide
 guidance for the interpretation of studies
 that follow the testing guidelines, and in
 addition, provide limited information for
 interpretation  of other studies (e.g.,
 epidemiologic  data, functional
 developmental toxicity studies, and
 short-term tests) that are not routinely
 required, but may be encountered when
  reviewing data on particular agents.
   Since the purpose of risk assessment
  is to make inferences about potential
  risks to human health, the most
  appropriate data to be used are those
 deriving from studies of humans. If
 adequate human data are not available,
 then it is necessary to use data obtained
 from other species. There are a number
 of unknowns in the extrapolation of
 data from animal studies to humans.
 Therefore, a number of assumptions
 must be made on the relevance of
 effects to potential human risk which
 are generally applied in the absence  of
 data. These assumptions provide the
 inferential basis for the approaches
 taken to risk assessment in these
 Guidelines.
   First, it is assumed that an agent that
 produces an adverse developmental
 effect in experimental animal studies
 will potentially pose a hazard to humans
 following sufficient exposure during
 development. This assumption is based
 on the comparisons of data for agents
 known to cause human developmental
 toxicity (Nisbet and Karch, 1983; Kimmel
 et al., 1984; Hemminki and Vineis, 1985;
 Kimmel et al., 1990a), which indicate
 that, in almost all cases, experimental
 animal data are predictive of a
 developmental effect in humans.
   It is assumed that all of the four
 manifestations of developmental
 toxicity (death, structural abnormalities,
 growth alterations, and functional
 deficits) are of concern. In the past,
 there has been a tendency to consider
 only malformations or malformations
 and death as end points of concern.
 From the data on agents that are known
 to cause human developmental toxicity
 (Nisbet and Karch, 1983; Kimmel et al.,
 1984; Hemminki and Vineis, 1985;
 Kimmel et al., 1990a), there is usually at
 least one experimental species that
 mimics the types of effects seen in
 humans, but in other species tested, the
 type of developmental perturbation may
 be different. Thus, a biologically
 significant increase in any of the four
 manifestations is considered indicative
 of an agent's potential for disrupting
 development and producing a
 developmental hazard.
   It is assumed that the types of
 developmental effects seen in animal
 studies are not necessarily the same as
 those that may be produced in
 humans. This assumption is made
 because it is impossible to determine
 which will be the most appropriate
 species in terms of predicting the
 specific types of effects seen in humans.
 The fact that every species may not
 react in the same way could be due to
 species-specific differences in critical
 periods, differences in timing of
 exposure, metabolism, developmental
 patterns, placentation, or mechanisms of
 action.
   The most appropriate species is used
 to estimate human risk when data are
available (e.g., pharmacokinetics). In the
absence of such data, it is assumed that
the most sensitive species is appropriate
for use, based on observations that
humans are as sensitive or more so than
the most sensitive animal species tested
for the majority of agents known to
cause human developmental toxicity
(Nisbet and Karch, 1983; Kimmel et al.,
1984; Hemminki and Vineis, 1985;
Kimmel et al., 1990a).
  In general, a threshold is assumed for
the dose-response curve for agents that
produce developmental toxicity. This is
based on the known capacity of the
developing organism to compensate for
or to repair a certain amount of damage
at the cellular, tissue, or organ level. In
addition, because of the multipotency of
cells at certain stages of development,
multiple insults at the molecular or
cellular level may be required to
produce an effect on the whole
organism.

II. Definitions and Terminology
  The Agency recognizes that there are
differences in the use of terms in the
field of developmental toxicology. For
the purposes of these Guidelines the
following definitions will be used.
  Developmental toxicology—The study
of adverse effects on the developing
organism that may result from exposure
prior to conception (either parent),
during prenatal development, or
postnatally to  the time of sexual
maturation. Adverse developmental
effects may be detected at any point in
the life span of the organism.  The major
manifestations of developmental
toxicity include: (1) Death of the
developing organism, (2) structural
abnormality, (3) altered growth, and (4)
functional deficiency.
  Altered growth—An alteration in
offspring organ or body weight or size.
Changes in one end point may or may
not be accompanied by other signs of
altered growth (e.g., changes in body
weight may or may not be accompanied
by changes in  crown-rump length and/or
skeletal ossification). Altered growth
can be induced at any stage of
development, may be reversible, or may
result in a permanent change.
   Functional developmental
toxicology—The study of alterations or
delays in the physiological and/or
biochemical competence of an organism
or organ system following exposure to
an agent during critical periods of
development pre- and/or postnatally.
   Structural abnormalities—Structural
alterations in development that include
both malformations and variations.
   Malformations and variations—A
malformation is usually defined as a
    permanent structural change that may
    adversely affect survival, development,
    or function. The term teratogenicity is
    used in these Guidelines to refer only to
    malformations. The term variation is
    used to indicate a divergence beyond
    the usual range of structural constitution
    that may not adversely affect survival or
    health. Distinguishing between
    variations and malformations is difficult
    since there exists a continuum of
    responses from the normal to the
    extremely deviant. There is no generally
    accepted classification of malformations
    and variations. Other terms that are
    often used, but no better defined,
    include anomalies, deformations, and
    aberrations.

    III. Hazard Identification/Dose-
    Response Evaluation of Agents That
    Cause Developmental Toxicity
      This section discusses the evaluation
    and interpretation of hazards for a
    variety of end points of developmental
    toxicity seen in both human and animal
    studies, and describes the criteria for
    characterizing the sufficiency of the
    health-related data base for conducting
    a developmental toxicity risk
    assessment. It also details the use of
    dose-response data for determining
    potential hazards, and describes the
    calculation of the RfDDT or RfCDT, a dose
    or concentration that is assumed to be
    without appreciable risk of deleterious
    developmental effects for a given agent.
      Developmental toxicity is expressed
    as one  or more of a number of possible
    end points that may be used for
    evaluating the potential of an agent to
    cause abnormal development.
    Developmental toxicity generally occurs
    in a dose-related manner, may result
    from short-term exposure (including
   single exposure situations) or from
   longer-term low-level exposure, may be
   produced by various routes of exposure,
   and the types of effects may vary
   depending on the timing of exposure
   because of a number of critical periods
   of development for various organs and
   functional systems.
      The four major manifestations of
   developmental toxicity are death,
   structural abnormality, altered growth,
   and functional deficit. The relationship
   among these manifestations may vary
   with increasing dose, and especially at
   higher doses, death of the conceptus
   may preclude expression of other
   manifestations. Of these, all four
   manifestations have been evaluated in
   human studies, but only the first three
   are traditionally measured in laboratory
   animals using the conventional
   developmental toxicity (also called
   teratogenicity or Segment II) testing
   protocol as well as in other study
 protocols, such as the multigeneration
 study or the continuous breeding study.
 Although functional deficits seldom
 have been evaluated in routine testing
 studies in experimental animals,
 functional evaluations are beginning to
 be required in certain regulatory
 situations (U.S. EPA, 1986a, 1988a,
 1989b, 1991a).
   Developmental toxicity can be
 considered a component of reproductive
 toxicity, and often it is difficult to
 distinguish between effects mediated
 through the parents versus direct
 interaction with developmental
 processes. For example, developmental
 toxicity may be influenced by the effects
 of toxic agents on the maternal system
 when exposure occurs during pregnancy
 or lactation. In addition, following
 parental exposure prior to conception,
 developmental toxicity may result in
 their offspring and, potentially, in
 subsequent generations. Therefore, it is
 useful to consult the "Proposed
 Guidelines for Assessing Male
 Reproductive Risk" (U.S. EPA, 1988b)
 and the "Proposed Guidelines for
 Assessing Female Reproductive Risk"
 (U.S. EPA, 1988c) in conjunction with
 these Guidelines. Mutational events that
 occur as a result of exposure to agents
 that cause developmental toxicity may
 be difficult to discriminate from other
 possible mechanisms in standard
 studies of developmental toxicity. When
 mutational events are suspected, the
 "Guidelines for Mutagenicity Risk
 Assessment" (U.S. EPA, 1986c), which
 specifically address the risks of
 heritable mutation, should be consulted.
   Carcinogenic effects have occurred in
 humans following developmental
 exposures to diethylstilbestrol (Herbst
 et al., 1971). Several additional agents
 (e.g., direct-acting alkylating agents)
 have been shown to cause cancer
 following developmental exposures in
 experimental animals, and it appears
 from the data collected thus far that
 agents capable of causing cancer in
 adults may also cause transplacental or
 neonatal carcinogenesis (Anderson et
 al., 1985). Currently, there is no way to
 predict whether the developing offspring
 or adult will be more sensitive to the
 carcinogenic effects of an agent. At
 present, testing for carcinogenesis
 following developmental exposure is not
routinely required. However, if this type
 of effect is reported for an agent, it is
 considered appropriate to use the
 "Guidelines for Carcinogen Risk
Assessment" (U.S. EPA, 1986b) for
assessing human risk.
  A. Developmental Toxicity Studies: End
  Points and Their Interpretation

  1. Laboratory Animal Studies

   This section discusses the end points
  examined in routinely used protocols as
  well as the use of other types of studies,
  including functional studies and short-
  term tests.
   The most commonly used protocol for
  assessing developmental toxicity in
  laboratory animals involves the
  administration of a test substance to
  pregnant animals (usually mice, rats, or
  rabbits) during the period of major
  organogenesis, evaluation of maternal
  responses throughout pregnancy, and
  examination of the dam and the uterine
  contents just prior  to term (U.S. EPA,
  1982b, 1985a; Food and Drug
  Administration (FDA), 1966, 1970;
  Organization for Economic Cooperation
  and Development (OECD), 1981).  Some
  studies may use exposures of one to a
  few days to investigate periods of
  particular sensitivity for induction of
  abnormalities in specific organs or organ
  systems. In addition, developmental
  toxicity may be evaluated in studies
  involving exposure to one or both
  parents prior to conception, to the
  conceptus during pregnancy and over
  several generations, or to offspring
  during the prenatal and preweaning
  periods (U.S. EPA,  1982b, 1985a, 1986a,
  1988a, 1991a; FDA, 1966, 1970; OECD,
  1981; Lamb, 1985). These Guidelines are
  intended to provide information for
  interpreting developmental effects
  related to any of these types of
  exposure.
   Appropriate study designs include a
  number of important factors. For
 example, test animal selection is
 generally based on considerations of
 species, strain, age, weight, and health
 status. Assignment of animals to dose
 groups by stratified randomization (on
 the basis of body weight) reduces bias
 and provides a basis for performing
 valid statistical tests. At a minimum, a
 high dose, a low dose, and one
 intermediate dose are included. The high
 dose is selected to produce some
 minimal maternal or adult toxicity (i.e.,
 a level that at the least produces
 marginal but significantly reduced body
 weight, reduced weight gain, or specific
 organ toxicity, and at the most produces
 no more than 10% mortality). At doses
 that cause excessive maternal toxicity
 (that is, significantly greater than the
 minimal  toxic level), information on
 developmental effects may be difficult
 to interpret and of limited value. The
 low dose is generally a NOAEL for adult
 and offspring effects, although if the low
 dose produces a biologically or
statistically significant increase in
response, it is considered a LOAEL (see
section IH.A.l.f for a discussion of
biological versus statistical
significance). A concurrent control group
treated with .the vehicle used for agent
administration is a critical component of
a well-designed study.
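
  The stratified randomization on the basis of body weight mentioned
above can be sketched as follows. This is a minimal illustration under
stated assumptions, not a procedure prescribed by these Guidelines: the
function name, the block scheme (rank animals by body weight, cut the
ranking into blocks as large as the number of dose groups, and shuffle
the group labels within each block), the animal IDs, and the weights
are all hypothetical.

```python
import random

def stratified_assignment(body_weights, groups, seed=None):
    """Assign animals to dose groups, stratifying on body weight.

    body_weights: dict mapping animal ID -> body weight (g).
    groups: list of dose-group labels (hypothetical names).
    Returns a dict mapping animal ID -> group label.
    """
    rng = random.Random(seed)
    # Rank animals from lightest to heaviest.
    ranked = sorted(body_weights, key=body_weights.get)
    assignment = {}
    # Walk the ranking in blocks of len(groups); within each block,
    # the group labels are handed out in random order.
    for start in range(0, len(ranked), len(groups)):
        block = ranked[start:start + len(groups)]
        labels = list(groups)
        rng.shuffle(labels)
        for animal, label in zip(block, labels):
            assignment[animal] = label
    return assignment

# Hypothetical example: 8 animals, a control group and three dose groups.
weights = {"A1": 251, "A2": 263, "A3": 240, "A4": 270,
           "A5": 255, "A6": 248, "A7": 266, "A8": 259}
groups = ["control", "low", "mid", "high"]
assignment = stratified_assignment(weights, groups, seed=1)
```

  Because every weight block contributes one animal to each group, the
groups start with similar body-weight distributions while the
assignment within a block remains random.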
  The route of exposure in these studies
is usually oral, unless the chemical or
physical characteristics of the test
substance or pattern of human exposure
suggest a more appropriate route of
administration. In the case of dermal
exposure, developmental toxicity
studies showing no indication of
maternal or developmental toxicity are
considered insufficient for risk
assessment unless accompanied by
absorption data (Kimmel and Francis,
1990). Dermal developmental toxicity
studies in which skin irritation is too
marked (moderate erythema and/or
moderate edema, i.e., raised
approximately 1 mm) also are
considered insufficient, since excessive
maternal toxicity may be produced from
the irritation rather than from systemic
exposure to the agent. Assessment of
maternal toxicity is based on signs of
systemic toxicity rather than on local
effects such as skin irritation.
Absorption data and limited
pharmacokinetic data collected in
dermal developmental toxicity studies
provide very useful information in the
evaluation  of study design and data
interpretation (Kimmel and Francis,
1990). Many of these points also  are
pertinent to studies by other routes of
exposure.
  The evaluation of specific end points
of maternal and developmental toxicity
is discussed in the next several sections.
Appropriate historical control data
sometimes  can be very useful in the
interpretation of these end points.
Comparison of data from treated
animals with concurrent study controls
should always take precedence over
comparison with historical control data.
The most appropriate historical control
data are those from the same laboratory
in which studies were conducted. Even
data from the same laboratory, however,
should be used  cautiously and examined
for subtle changes over time that may
result from genetic alterations in the
strain or stock of the species used,
changes in environmental conditions
both in the breeding colony of the
supplier and in the laboratory, and
changes in personnel conducting studies
and collecting data (Kimmel and Price,
1990). Study data should be compared
with recent as well as cumulative
historical data. Any change in
laboratory procedure that might affect
control data should be noted and the
data accumulated separately from
previous data.

  The next three sections (a-c) discuss
individual end points of maternal and
developmental toxicity as measured in
the conventional developmental toxicity
study, the multigeneration study, and,
when available, in postnatal studies.
Other end points specifically related to
reproductive toxicity are covered in the
relevant risk assessment guidelines (U.S.
EPA, 1988b, 1988c). The fourth section
(d) deals with the integrated evaluation
of all data, including the relative effects
of exposure on maternal animals and
their offspring, which is important in
assessing the level of concern about a
particular agent.

  a. End Points of Maternal Toxicity. A
number of end points that may be
observed as possible indicators of
maternal toxicity are listed in Table 1.
Maternal mortality is an obvious end
point of toxicity; however, a number of
other end points can be observed that
may give an indication of the more
subtle adverse effects of an agent. For
example, in well-conducted studies, the
mating and fertility indices provide
information on the general fertility rate
of the animal stock used and are
important indicators of toxic effects to
adults if treatment begins prior to
mating or implantation. Changes in
gestation length may indicate effects on
the process of parturition.


  Table 1.—End Points of Maternal Toxicity

Mortality
Mating Index [(no. with seminal plugs or sperm/no. mated) × 100]
Fertility Index [(no. with implants/no. of matings) × 100]
Gestation Length (useful when animals are allowed to deliver pups)
Body Weight
  Day 0
  During gestation
  Day of necropsy
Body Weight Change
  Throughout gestation
  During treatment (including increments of time within treatment period)
  Post-treatment to sacrifice
  Corrected maternal (body weight change throughout gestation minus
    gravid uterine weight or litter weight at sacrifice)
Organ Weights (in cases of suspected target organ toxicity and
    especially when supported by adverse histopathology findings)
  Absolute
  Relative to body weight
  Relative to brain weight
Food and Water Consumption (where relevant)
Clinical Evaluations
  Types, incidence, degree, and duration of clinical signs
  Enzyme markers
  Clinical chemistries
Gross Necropsy and Histopathology

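
  The two indices defined in Table 1 are simple percentages. The sketch
below, offered only as an illustration, computes them for one dose
group; the function names and the counts are hypothetical.

```python
def mating_index(n_with_plugs_or_sperm, n_mated):
    """Mating Index: (no. with seminal plugs or sperm / no. mated) x 100."""
    return 100 * n_with_plugs_or_sperm / n_mated

def fertility_index(n_with_implants, n_matings):
    """Fertility Index: (no. with implants / no. of matings) x 100."""
    return 100 * n_with_implants / n_matings

# Hypothetical counts for a single dose group.
mi = mating_index(23, 25)
fi = fertility_index(21, 23)
print(f"Mating index: {mi:.1f}%  Fertility index: {fi:.1f}%")
```
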
   Body weight and the change in body
 weight are viewed collectively as
 indicators of maternal toxicity for most
 species, although these end points may
 not be as useful in rabbits, because
 body weight changes are usually more
 variable (Kimmel and Price, 1990), and
 in some strains of rabbits, body weight
 is not a good indicator of pregnancy
 status. Body weight changes may
 provide more information than a daily
 body weight measured during treatment
 or during gestation. Changes in weight
 gain during treatment could occur that
 would not be reflected in the total
 weight change throughout gestation,
 because of compensatory weight gain
 that may occur following treatment but
 before sacrifice. For this reason, changes
 in weight gain during treatment can be
 examined as another indicator of
 maternal toxicity.

   Changes in maternal body weight
 corrected for gravid uterine weight at
 sacrifice may indicate whether the effect
 is primarily maternal or intrauterine. For
 example, a significant reduction in
 weight gain throughout gestation and in
 gravid uterine weight without any
 change in corrected maternal weight
 gain generally would indicate an
 intrauterine effect. Conversely, a change
 in corrected weight  gain and no change
 in gravid uterine weight generally would
 suggest maternal toxicity and little or no
 intrauterine effect. An alternate estimate
 of maternal weight change during
 gestation can be obtained by subtracting
 the sum of the weights of the fetuses.
 However, this weight does not include
 the uterine or placental tissue, or the
 amniotic fluid.
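
   The corrected maternal weight change described above, and the
 alternate estimate based on fetal weights, reduce to simple
 subtractions. The following sketch is illustrative only; the function
 name and the weights for the single dam shown are hypothetical.

```python
def corrected_weight_change(bw_day0, bw_sacrifice, gravid_uterine_weight):
    """Weight change throughout gestation minus gravid uterine weight (g)."""
    return (bw_sacrifice - bw_day0) - gravid_uterine_weight

# Hypothetical dam: weights in grams.
bw_day0, bw_sacrifice = 250, 390
gravid_uterus = 85
fetal_weights = [3.5] * 12  # twelve fetuses, hypothetical weights

corrected = corrected_weight_change(bw_day0, bw_sacrifice, gravid_uterus)

# Alternate estimate: subtract only the summed fetal weights. This leaves
# uterine tissue, placentas, and amniotic fluid in the maternal figure.
alternate = (bw_sacrifice - bw_day0) - sum(fetal_weights)
```

   In this example the alternate estimate is larger than the corrected
 value because the fetuses alone weigh less than the gravid uterus.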

      Changes in other end points may also
   be important. For example, changes in
   relative and absolute organ weights may
   be signs of a maternal effect especially
   when an agent is suspected of causing
   specific organ toxicity and when such
   findings are supported by adverse
   histopathologic findings in those organs.
   Food and water consumption data are
   useful, especially if the agent is
   administered in the diet or drinking
   water. The amount ingested (total and
   relative to body weight) and the dose of
   the agent (relative to body weight) can
   then be calculated, and changes in food
   and water consumption related to
   treatment can be evaluated along with
   changes in body weight and body
   weight gain. Data on food and water
   consumption also are useful when an
   agent is suspected of affecting appetite,
   water intake, or excretory function.
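
     For dietary administration, the amount ingested and the dose of the
   agent relative to body weight can be worked out as in the sketch
   below. It is a minimal example under assumed units (agent
   concentration in mg per kg of diet, food intake in g per day, body
   weight in g); the function names and the numbers are hypothetical.

```python
def food_intake_per_kg(food_g_per_day, body_weight_g):
    """Food consumed relative to body weight (g food/kg body weight/day)."""
    return food_g_per_day / (body_weight_g / 1000)

def achieved_dose(conc_mg_per_kg_diet, food_g_per_day, body_weight_g):
    """Agent ingested relative to body weight (mg/kg body weight/day)."""
    agent_mg_per_day = conc_mg_per_kg_diet * food_g_per_day / 1000
    return agent_mg_per_day / (body_weight_g / 1000)

# Hypothetical dam: 250 g, eating 20 g/day of diet containing 500 mg/kg.
relative_intake = food_intake_per_kg(20, 250)
dose = achieved_dose(500, 20, 250)
print(f"{relative_intake:.0f} g food/kg/day, {dose:.0f} mg agent/kg/day")
```

     A drop in food consumption at constant dietary concentration lowers
   the achieved dose, which is one reason consumption and body weight
   data are evaluated together.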
     Clinical evaluations of toxicity also
   may be used as indicators of maternal
   toxicity. Daily clinical observations may
   be useful in describing the profile of
   maternal toxicity and alterations in
   general homeostasis. Enzyme markers
   and clinical chemistries may be useful
   indicators of exposure but must be
   interpreted carefully as to whether or
   not a change constitutes toxicity. Gross
   necropsy and histopathology data (when
   specified in the protocol) may aid in
   determining toxic dose levels. The
   minimum amount of information
   considered useful for evaluating
   maternal toxicity [as noted in the
   "Proceedings of the Workshop on the
   Evaluation of Maternal and
   Developmental Toxicity" (Kimmel et al.,
   1987)], includes: morbidity or mortality,
   maternal body weight and body weight
   gain, clinical signs of toxicity, food and
   water consumption (especially if dosing
   is via food or water), and necropsy for
   gross evidence of organ toxicity. In a
   well-designed study, maternal toxicity is
   determined in the pregnant and/or
   lactating animal over an appropriate
   part of gestation and/or the neonatal
   period, and is not assumed or
   extrapolated from other adult toxicity
   studies.

    b. End Points of Developmental
  Toxicity: Altered Survival, Growth, and
  Morphological Development. Because
  the maternal animal, and not the
  conceptus, is the individual treated
  during gestation, data generally are
  calculated as incidence per litter or as
  number and percent of litters with
  particular end points. Table 2 indicates
  the ways in which offspring and litter
  end points may be expressed.

  Table 2.—End Points of Developmental Toxicity

Litters with implants
  No. implantation sites/dam
  No. corpora lutea (CL)/dam ^a
  Percent preimplantation loss: [(CL - implantations) × 100]/CL ^a
  No. and percent live offspring ^b/litter
  No. and percent resorptions/litter
  No. and percent litters with resorptions
  No. and percent late fetal deaths/litter
  No. and percent nonlive (late fetal deaths + resorptions)
    implants/litter
  No. and percent litters with nonlive implants
  No. and percent affected (nonlive + malformed) implants/litter
  No. and percent litters with affected implants
  No. and percent litters with total resorptions
  No. and percent stillbirths/litter
  No. and percent litters with live offspring
Litters with live offspring
  No. and percent live offspring/litter
  Viability of offspring ^c
  Sex ratio/litter
  Mean offspring body weight/litter ^c
  Mean male or female body weight/litter ^c
  No. and percent offspring with external, visceral, or skeletal
    malformations/litter
  No. and percent malformed offspring/litter
  No. and percent litters with malformed offspring
  No. and percent malformed males or females/litter
  No. and percent offspring with external, visceral, or skeletal
    variations/litter
  No. and percent offspring with variations/litter
  No. and percent litters having offspring with variations
  Types and incidence of individual malformations
  Types and incidence of individual variations
  Individual offspring and their malformations and variations
    (grouped according to litter and dose)
  Clinical signs (type, incidence, duration, and degree)
  Gross necropsy and histopathology

  ^a Important when treatment begins prior to implantation. May be
 difficult to assess in mice.
  ^b Offspring refers both to fetuses observed prior to term or to pups
 following birth. The end points examined depend on the protocol used
 for each study.
  ^c Measured at selected intervals until termination of the study.

   When treatment of females begins
 prior to implantation, an increase in
 preimplantation loss could indicate an
 adverse effect on gamete transport, the
 fertilization process, uterine toxicity, the
 developing blastocyst, or on the process
 of implantation itself. If treatment
 begins around the time of implantation
 (i.e., day 6 of gestation in the mouse, rat,
 or rabbit), an increase in
 preimplantation loss probably reflects
 variability that  is not treatment-related
 in the animals being used, but the data
 should be examined carefully to
  determine if there is a dose-response
  relationship. If preimplantation loss is
  related to dose, further studies would be
  necessary to determine the mechanism
  and extent of such effects.
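
    The percent preimplantation loss formula in Table 2 can be applied
  per dam and then averaged within each dose group, as in the sketch
  below. The dose levels and counts are hypothetical, and the function
  name is an assumption made for illustration.

```python
def percent_preimplantation_loss(corpora_lutea, implantations):
    """Percent preimplantation loss for one dam: (CL - implants) x 100 / CL."""
    return 100 * (corpora_lutea - implantations) / corpora_lutea

# Hypothetical (corpora lutea, implantation sites) counts per dam,
# keyed by dose level in arbitrary units.
dams_by_dose = {
    0:   [(14, 13), (15, 15), (12, 11)],
    50:  [(14, 12), (13, 11), (15, 13)],
    200: [(14, 9), (15, 10), (13, 8)],
}

# Mean of the per-dam percentages within each dose group.
mean_loss = {
    dose: sum(percent_preimplantation_loss(cl, imp) for cl, imp in dams) / len(dams)
    for dose, dams in dams_by_dose.items()
}

for dose, loss in sorted(mean_loss.items()):
    print(f"dose {dose}: mean preimplantation loss {loss:.1f}%")
```

    A rise in the group means with dose, as in these made-up data, is
  only a prompt to examine the dose-response relationship; statistical
  testing and biological judgment are not shown here.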
    The number and percent of live
  offspring per litter, based on all litters,
  may include litters that have no live
  implants. The number and percent of
  resorptions and late fetal deaths give
  some indication of when the conceptus
  died, and the number and percent of
  nonlive implants per litter
  (postimplantation loss) is a combination
  of these two measures. Expression of
  data as the number and percent of litters
  showing an increased incidence for
  these end points may be less useful than
  incidence per litter because, in the
  former case, a litter is counted whether
  one  or all implants were resorbed, dead,
  or nonlive.
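
    The difference between incidence per litter and the percent of
  litters affected can be illustrated with a short sketch. The litter
  counts are hypothetical and chosen only to make the point; the
  function name and the tuple layout are assumptions.

```python
def summarize_nonlive(litters):
    """Summarize postimplantation loss for one dose group.

    litters: list of (implants, resorptions, late_fetal_deaths) per litter.
    Returns (mean percent nonlive implants per litter,
             percent of litters with at least one nonlive implant).
    """
    pct_per_litter = [100 * (r + d) / n for n, r, d in litters]
    mean_pct = sum(pct_per_litter) / len(litters)
    affected = sum(1 for _, r, d in litters if r + d > 0)
    return mean_pct, 100 * affected / len(litters)

# Hypothetical groups: three of four litters are affected in both.
group_a = [(12, 1, 0), (12, 0, 1), (12, 0, 0), (12, 1, 0)]
group_b = [(12, 6, 0), (12, 10, 2), (12, 0, 0), (12, 3, 1)]

mean_a, litters_a = summarize_nonlive(group_a)
mean_b, litters_b = summarize_nonlive(group_b)
```

    Both groups have the same percent of litters affected, yet the mean
  incidence per litter is several times higher in the second group,
  which is the information the litter count alone hides.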
   If  a significant increase in
  postimplantation loss is found after
  exposure to an agent, the data may be
  compared not only with concurrent
  controls, but also with recent historical
 control data (preferably from the same
 laboratory), since there is considerable
 interlitter variability in the incidence of
 postimplantation loss (Kimmel and
 Price, 1990). If a given study control
 group exhibits an unusually high or low
 incidence of postimplantation loss
 compared to historical controls, then
 scientific judgment must be used to
 determine the adequacy of the study for
 risk assessment purposes.
   The end point "affected implants"
 (i.e.,  the combination of nonlive and
 malformed conceptuses) sometimes
 reflects a better dose-response
 relationship than does the incidence of
 nonlive or malformed offspring taken
 individually. This is especially true at
 the high end of the dose-response curve
 in cases when the incidence of nonlive
 implants per litter is greatly increased.
 In such cases, the malformation rate
 may  appear to decrease because only
 unaffected offspring have survived. If
 the incidence of prenatal deaths or
 malformations is unchanged, then the
 incidence of affected implants will not
 provide any additional dose-response
 information. In studies where maternal
 animals are allowed to deliver pups
 normally, the number of stillbirths per
 litter  should also be noted.
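
   The way the "affected implants" end point can preserve a
 dose-response relationship that the malformation rate alone obscures
 is sketched below for two hypothetical litters. The function name and
 all counts are assumptions made for illustration.

```python
def litter_rates(implants, nonlive, malformed_live):
    """Return (percent nonlive implants,
               percent malformed among live offspring,
               percent affected = nonlive + malformed, per implant)."""
    live = implants - nonlive
    pct_nonlive = 100 * nonlive / implants
    # The malformation rate is undefined when no offspring survive.
    pct_malformed = 100 * malformed_live / live if live else None
    pct_affected = 100 * (nonlive + malformed_live) / implants
    return pct_nonlive, pct_malformed, pct_affected

# Hypothetical litters at an intermediate and a high dose.
mid = litter_rates(implants=12, nonlive=2, malformed_live=4)
high = litter_rates(implants=12, nonlive=9, malformed_live=1)
```

   In this example the percent malformed among live offspring is lower
 at the high dose than at the intermediate dose, while the percent of
 affected implants continues to rise.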
  The number of live offspring per litter,
 based on those litters that have one or
 more live offspring, may be unchanged
 even  though the incidence of nonlive in
 all litters is increased. This could occur
 either because of an increase in the
number of litters with no live offspring,
or an increase in the number of implants
per litter. A decrease in the number of
 live offspring per litter is generally
 accompanied by an increase in the
 incidence of nonlive implants per litter
 unless the implant numbers differ among
 dose groups. In postnatal studies, the
 viability of live-born offspring should be
 determined at selected intervals until
 termination of the study.
   The sex ratio per litter, as well as the
 body weights of males and females, can
 be examined to determine whether or
 not one sex is preferentially affected by
 this agent. However, this is an unusual
 occurrence.
   A change in offspring body weight is a
 sensitive indicator of developmental
 toxicity, in part because it is a
 continuous variable. In some cases,
 offspring weight reduction may be the
 only indicator of developmental toxicity.
 While there is always a question as to
 whether weight reduction is a
 permanent or transitory effect, little is
 known about the long-term
 consequences of short-term fetal or
 neonatal weight changes. Therefore,
 when significant weight reduction
 effects are noted, they are used as a
 basis to establish the NOAEL. Several
 other factors should be considered in
 the evaluation of fetal or neonatal
 weight changes; for example, in
 polytocous animals, fetal and neonatal
 weights are usually inversely correlated
 with litter size, and the upper end of the
 dose-response curve may be affected by
 smaller litters and increased fetal or
 neonatal weight. Additionally, the
 average body weight of males is greater
 than that of females in the more
 commonly used laboratory animals.
   Live offspring are generally examined
 for external, visceral, and skeletal
 malformations and variations. If only a
 portion of the litter is examined for one
 or more end points, then random
 selection of those pups examined
 introduces less bias in the data. An
 increase in the incidence of malformed
 offspring may be indicated by a change
 in one or more of the following end
 points: the incidence of malformed
 offspring per litter, the number and
 percent of litters with malformed
 offspring, or the number of offspring or
 litters with a particular malformation
 that appears to increase with dose (as
 indicated by the incidence of individual
 types of malformations).
   Other ways of examining the data
 include determining the incidence of
 external, visceral, and skeletal
 malformations and variations that may
 indicate the organs or organ systems
 affected. A listing of individual offspring
 with  their malformations and variations
 may give an indication of the pattern of
 developmental deviations. All of these
 methods of expressing and examining
 the data are valid for determining the
 effects of an agent on structural
 development. However, care must be
 taken to avoid counting offspring more
 than once in the evaluation of any single
 end point based on number or percent of
 offspring or litters. The incidence of
 individual types of malformations and
 variations may indicate significant
 changes that are masked if the data on
 all malformations and/or variations are
 pooled. Appropriate historical control
 data can be especially helpful in the
 interpretation of malformations and
 variations, particularly those that
 normally occur at a low incidence and
 may or may not be related to dose in an
 individual study.
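
   The caution about counting offspring only once in any single end
 point can be made concrete with a small sketch. The record layout,
 litter and pup identifiers, and findings are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical findings: (litter, pup, malformation). A pup with two
# malformations appears in two records.
findings = [
    ("L1", "p1", "cleft palate"),
    ("L1", "p1", "fused ribs"),
    ("L1", "p3", "cleft palate"),
    ("L2", "p2", "fused ribs"),
]

# Malformed offspring per litter: collect pups in a set so that each pup
# is counted once, however many malformations it carries.
malformed_pups = defaultdict(set)
for litter, pup, _ in findings:
    malformed_pups[litter].add(pup)
malformed_per_litter = {lit: len(pups) for lit, pups in malformed_pups.items()}

# Incidence of individual malformation types: each distinct
# (litter, pup, type) combination is counted once.
by_type = Counter(m for _, _, m in set(findings))
```

   Summing the records directly would give four malformed offspring,
 whereas only three pups are malformed; the per-type tally, by contrast,
 is meant to count a pup under each malformation it shows.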
   Although a dose-related increase in
 malformations is interpreted as an
 adverse developmental effect of
 exposure to an  agent, the biological
 significance of an altered incidence of
 anatomical variations is more difficult to
 assess, and must take into account what
 is known about developmental stage
 (e.g., with skeletal ossification),
 background incidence of certain
 variations (e.g., 12 or 13 pairs of ribs in
 rabbits), or other strain- or species-
 specific factors. However, if variations
 are significantly increased in a dose-
 related manner, these should also be
 evaluated as a possible indication of
 developmental  toxicity.
   In addition, although some
 investigators have considered certain of
 these effects to  simply be associated
 with manifestations of maternal toxicity
 noted at similar dose levels (Khera,
 1984, 1985, 1987), such effects are still
 toxic manifestations and as such are
 generally considered a reasonable basis
 for Agency regulation and/or risk
 assessment. On a somewhat similar
 note, the conclusion of participants in a
 "Workshop on Reproductive Toxicity
 Risk Assessment" (Kimmel et al., 1986)
 was that dose-related increases in
 defects that may occur spontaneously
 are as relevant  as dose-related increases
 in any other developmental toxicity end
 points.
   c. End Points  of Developmental
 Toxicity: Functional Deficits.
 Developmental  effects that are induced
 by exogenous agents are not limited to
 death, structural abnormalities, and
 altered growth.  Rather, it has been
 demonstrated in a number of instances
 that alterations in the functional
 competence of an organ or a variety of
 organ systems may result from exposure
 during critical developmental periods
 that may occur between conception and
 sexual maturation. Sometimes, these
 functional defects are observed at dose
 levels below those at which other
 indicators of developmental toxicity are
 evident (Rodier, 1978). Such effects may
 be transient or reversible in nature, but
 generally are considered adverse
 effects. Testing for functional
 developmental toxicity has not been
 required routinely by regulatory
 agencies in the United States, but
 studies in developmental neurotoxicity
 are beginning to be required by the EPA
 when other information indicates the
 potential for adverse functional
 developmental effects (U.S. EPA, 1986a,
 1988a, 1989b, 1991a). Data from
 postnatal studies, when available, are
 considered very useful for further
 assessment of the relative importance
 and severity of findings in the fetus and
 neonate. Often, the long-term
 consequences of adverse developmental
 outcomes noted at birth are unknown,
 and further data on postnatal
 development and function are necessary
 to determine the full spectrum of
 potential developmental effects. Useful
 data can also be derived from well-
 conducted multigeneration studies,
 although the dose levels used in these
 studies may be much lower than in
 studies with shorter-term exposure.
   Much of the early work in functional
 developmental toxicology was related to
 behavioral evaluations, and the term
 "behavioral teratology" became
 prominent in the mid-1970s. Recent
 advances in this area have been
 reviewed in several publications (Riley
 and Vorhees, 1986; Kimmel, 1988;
 Kimmel et al., 1990a). Several expert
 groups have focused on the functions
 that should be included in a behavioral
 testing battery (World Health
 Organization [WHO], 1984; Buelke-Sam
 et al., 1985; Leukroth, 1986). These
 include: sensory systems, neuromotor
 development, locomotor activity,
 learning and memory, reactivity and/or
 habituation, and reproductive behavior.
 No testing battery has fully addressed
 all of these functions, but it is important
 to include as many as possible, and
 several testing batteries have been
 developed and evaluated for use in
 testing (Buelke-Sam et al., 1985;
 Tanimura, 1986; Eisner et al., 1986).
   The Agency recently has developed a
 "generic" developmental neurotoxicity
 test guideline that can be used for both
 pesticides and industrial chemicals (U.S.
 EPA, 1991a). Because of its design, the
 developmental neurotoxicity testing
 protocol may be conducted as a
 separate study, concurrently with or as
 a follow-up to a developmental toxicity
 (Segment II) study, or be folded into a
 multigeneration study in the second
 generation. Testing is generally
 conducted in the rat. In the protocol for
 the separate study, the test agent is
 administered orally (other routes may be
 used on a case-by-case basis) to at least
 three treated groups and one concurrent
 control group of animals on day 6 of
 gestation through day 10 postnatally.
 The highest dose level is selected to
 induce some overt signs of maternal
 toxicity, but not result in more than a
 20% reduction in weight gain during
 gestation and lactation. This dose also is
 selected to avoid in utero or neonatal
 death or malformations sufficient to
 preclude a meaningful evaluation of
 developmental neurotoxicity. At least 20
 litters are required per treatment group.
 For behavioral tests, one female and one
 male pup per litter are randomly
 selected and assigned to one of the
 following tests: motor activity, auditory
 startle, and learning and memory in
 animals at weaning and as adults.
    Neuropathological evaluation and
    determination of brain weights are
    conducted on selected pups at postnatal
    day 11 and at termination of the study.
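
   The random selection of pups for behavioral testing can be sketched
 as follows. This is one reading of the protocol description above, not
 the test guideline itself: it assumes that, within each litter, one
 male and one female are drawn for each test and that no pup serves in
 more than one test. The function name and pup identifiers are
 hypothetical.

```python
import random

TESTS = ["motor activity", "auditory startle", "learning and memory"]

def select_pups(litter, rng):
    """Randomly pick one male and one female pup per behavioral test.

    litter: dict with 'males' and 'females' lists of pup IDs.
    Returns {test: (male_pup, female_pup)}. Pups are drawn without
    replacement, so no pup is assigned to two tests (an assumption).
    Raises ValueError if either sex has fewer pups than there are tests.
    """
    males = rng.sample(litter["males"], len(TESTS))
    females = rng.sample(litter["females"], len(TESTS))
    return {test: (m, f) for test, m, f in zip(TESTS, males, females)}

# Hypothetical litter.
rng = random.Random(0)
litter = {"males": ["m1", "m2", "m3", "m4"],
          "females": ["f1", "f2", "f3", "f4", "f5"]}
selection = select_pups(litter, rng)
```

   Drawing within each litter keeps the litter as the unit from which
 each test's animals come, so every litter is represented in every
 test.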
       Several criteria for selecting agents
    for developmental neurotoxicity testing
    have been suggested (Buelke-Sam et al.,
    1985; Levine and Butcher, 1990),
    including: agents that cause central
    nervous system malformations,
    psychoactive drugs and chemicals,
    agents that cause adult neurotoxicity,
    hormonally-active agents, and chemicals
    that are structurally related to others
    that cause developmental neurotoxicity
    or for which widespread exposure
    and/or release is expected. Data from
    developmental neurotoxicity studies
    should be evaluated in light of the data
    that may have triggered such testing as
    well as all other toxicity data available.
       Less work has been done on other
    developing functional systems, but the
    assessment of postnatal renal
    morphological and functional
    development may serve as a model for
    the use of postnatal evaluations in the
    risk assessment process. As an example,
    standard morphological analyses of the
    kidneys of fetal rodents have detected
    treatment-related changes in the relative
    growth of the renal papilla versus the
    renal cortex, an effect considered in
    some cases to be a malformation
    (hydronephrosis), while in other cases a
    variation (apparent hydronephrosis,
    enlarged or dilated renal pelvis). While
    some investigators (Woo and Hoar,
    1972) have provided data suggesting that
    the morphological effect represents a
    transient developmental delay, others
    have shown that it can persist well into
    postnatal life and that physiological
    function is compromised in  the affected
    individuals (Kavlock et al., 1987a, 1988;
    Daston et al., 1988; Couture, 1990). Thus,
    the biological interpretation of this
 effect on the basis of fetal examinations
 alone is tenuous (U.S. EPA, 1985b). In
 addition, the critical period for inducing
 renal morphological abnormalities
 extends into the postnatal period
 (Couture, 1990), and studies on
 perinatally-induced renal growth
 retardation (Kavlock et al., 1986, 1987b;
 Slotkin et al., 1988; Gray et al., 1989;
 Gray and Kavlock, 1991) have shown
 that renal function is generally altered in
 such conditions, but that manifestation
 of the dysfunction is not readily
 predictable. Thus, both morphological
 and functional assessment of the
 kidneys after birth can provide useful
 and complementary information on the
 persistence and biological significance
 of expressions of developmental
 toxicity.
  Although not as well studied, data
 indicate that the cardiovascular,
 respiratory, immune, endocrine,
 reproductive, and digestive systems also
 are subject to alterations in functional
 competence (Kavlock and Grabowski,
 1983; Fujii and Adams, 1987) following
 exposure during development. Currently,
 there are no standard testing procedures
 for these functional systems; however,
 when data are encountered on a
 chemical under review, they are
 considered in the risk assessment
 process.
  Direct extrapolation of functional
 developmental effects to humans is
 limited in the same way as for other end
 points of developmental toxicity, i.e., by
 the lack of knowledge about underlying
 toxicological mechanisms and their
 significance. In evaluations of a limited
 number of agents known to cause
 developmental neurotoxic effects in
 humans, Adams (1986) concluded that
 these agents produce similar
 developmental neurotoxic effects in
 animals and humans. This conclusion
 was strongly supported by the results of
 a recent "Workshop on the Qualitative
 and Quantitative Comparability of
 Human and Animal Developmental
 Neurotoxicity," sponsored by EPA and
 the National Institute on Drug Abuse
 (NIDA), at which participants  critically
 evaluated and compared the effects of
 agents known to cause human
 developmental neurotoxicity with the
 effects  seen in experimental animal
 studies (Kimmel et al., 1990a). The high
 degree  of qualitative correlation
 between human and experimental
 animal data for the agents evaluated
 lends strong support for the use of
 experimental animals in assessing the
 potential risk for developmental
 neurotoxicity in humans. Thus, as for
 other end points of developmental
 toxicity, the assumption can be made
                                                              that functional effects in animal studies
                                                              indicate the potential for altered
                                                              development in humans, although the
                                                              types of developmental effects seen in
                                                              experimental animal studies will not
                                                              necessarily be the same as those that
                                                              may be produced in humans. Thus,
                                                              when data from functional
                                                              developmental toxicity studies are
                                                              encountered for particular agents, they
                                                              should be considered in the risk
                                                              assessment process.
                                                                Some guidance is provided here
                                                              concerning important general concepts
                                                              of study design and evaluation for
                                                              functional developmental toxicity
                                                              studies.
                                                                • Several aspects of study design are
                                                              similar to those important in standard
                                                              developmental toxicity studies (e.g., a
                                                              dose-response approach with the
                                                              highest dose producing minimal overt
                                                              maternal or  perinatal toxicity, number of
                                                              litters large enough for adequate
                                                              statistical power, randomization of
                                                              animals to dose groups and test groups,
                                                              litter generally considered the statistical
                                                              unit, etc.).
                                                                • A replicate study design provides
                                                              added confidence in  the interpretation
                                                              of data.
                                                                • Use of a pharmacological/
                                                              physiological challenge may be valuable
                                                              in evaluating function and "unmasking"
                                                              effects not otherwise detectable,
                                                              particularly in the case of organ systems
                                                              that are endowed with a reasonable
                                                              degree of functional reserve capacity.
                                                                • Use of functional tests with a
                                                              moderate degree of background
                                                              variability may be more sensitive to the
                                                              effects of an agent on behavioral end
                                                              points than are tests  with low variability
                                                              that may be impossible to disrupt
                                                              without being life-threatening (Butcher
                                                              et al., 1980).
                                                                • A battery of functional tests, in
                                                              contrast to a single test, is usually
                                                              needed to evaluate the full complement
                                                              of organ function in an animal;  tests
                                                              conducted at several ages may provide
                                                              more information about maturational
                                                              changes and their persistence.
                                                                • Critical  periods for the disruption of
                                                              functional competence include  both the
                                                              prenatal and the postnatal periods to the
                                                              time of sexual maturation, and  the effect
                                                              is likely to vary depending on the time
                                                              and degree of exposure.
                                                                • Interpretation  of data from studies
                                                              in which postnatal exposure is  included
                                                              should take into account possible
                                                              interaction of the agent with maternal
                                                              behavior, milk composition, pup
                                                              suckling behavior,  possible direct
                                                              exposure of pups via dosed feed or
                                                              water, etc.
  Although interpretation of functional
data may be limited at present, it is
clear that functional effects must be
evaluated in light of other toxicity data,
including other forms of developmental
toxicity (e.g., structural abnormalities,
perinatal death, and growth
retardation). The level of confidence in
an adverse effect may be as important
as the type of change seen, and
confidence may be increased by such
factors as replicability of the effect
either in another study of the same
function, or by convergence of data from
tests that purport to measure similar
functions. A dose-response relationship
is considered an important measure of
chemical effect; in the case of functional
effects, both monotonic and biphasic
dose-response curves are likely,
depending on the function being tested.
  Finally, there are at least three
general ways in which the data from
these studies may be useful for risk
assessment purposes: (1) To help
elucidate the long-term consequences of
fetal and neonatal effects; (2) to indicate
the potential for an agent to cause
functional alterations and the effective
doses relative to those that produce
other forms of toxicity; and (3) for
existing environmental agents, to
suggest organ systems to be evaluated in
exposed human populations.
  d. Overall Evaluation of Maternal and
Developmental Toxicity. As discussed
previously, individual end points of
maternal and developmental toxicity are
evaluated in developmental toxicity
studies. In order to interpret the data
fully, an integrated evaluation must be
performed considering all maternal and
developmental end points.
  Agents that produce developmental
toxicity at a dose that is not toxic to the
maternal animal are especially of
concern because the developing
organism is affected but toxicity is not
apparent in the adult. However, the
more common situation is when adverse
developmental effects are produced only
at doses that cause minimal maternal
toxicity; in these cases, the
developmental effects are still
considered to represent developmental
toxicity and should not be discounted as
being secondary to maternal toxicity. At
doses causing excessive maternal
toxicity (that is, significantly greater
than the minimal toxic dose),
information on developmental effects
may be difficult to interpret and of
limited value. Current information is
inadequate to assume that
developmental effects at maternally
toxic doses result only from maternal
toxicity; rather, when the LOAEL is the
same for the adult and developing
organisms, it may simply indicate that
both are sensitive to that dose level.
Moreover, whether developmental
effects are secondary to maternal
toxicity or not, the maternal effects may
be reversible while effects on the
offspring may be permanent. These are
important considerations for agents to
which humans may be exposed at
minimally toxic levels either voluntarily
or involuntarily, since several agents are
known to produce adverse
developmental effects at minimally toxic
doses in adult humans (e.g., smoking,
alcohol, isotretinoin).
  Since the final risk assessment not
only takes into account the potential
hazard of an agent, but also the nature
of the dose-response relationship, it is
important that the relationship of
maternal and developmental toxicity be
evaluated and described. Then,
information from the exposure
assessment is used to determine the
likelihood of exposure to levels near the
maternally toxic dose for each agent
and the risk for developmental toxicity
in humans.
  Although the evaluation of
developmental toxicity is the primary
objective of standard studies within this
area, maternal effects seen within the
context of developmental toxicity
studies should be evaluated as part of
the overall toxicity profile for a given
chemical. Maternal toxicity may be seen
in the absence of or at dose levels lower
than those producing developmental
toxicity. If the maternal effect level is
lower than that in other evaluations of
adult toxicity, this implies that the
pregnant female is likely to be more
sensitive than the nonpregnant female.
Data from reproductive and
developmental toxicity studies on the
pregnant female should be used in the
overall assessment of risk.
  Approaches for ranking agents
according to their relative maternal and
developmental toxicity have been
proposed; Schardein (1983) has
reviewed several of these. Several
approaches involve the calculation of
ratios relating an adult toxic dose to a
developmentally toxic dose (Johnson,
1981; Fabro et al., 1982; Johnson and
Gabel, 1983; Brown and Freeman, 1984).
Such ratios may describe in a
qualitative and roughly quantitative
fashion the relationship of maternal
(adult) and developmental toxicity.
However, at the U.S. EPA-sponsored
"Workshop on the Evaluation of
Maternal and Developmental Toxicity"
(Kimmel et al., 1987), there was no
agreement as to the validity or utility of
these approaches in other aspects of the
risk assessment process. This is due in
part to uncertainty about factors that
can affect the ratios. For example, the
number and spacing of dose levels,
differences in study design (e.g., route
and/or timing of exposure), the relative
thoroughness in the assessment of
maternal and developmental end points
examined, species differences in
response, and differences in the slope of
the dose-response curves for maternal
and developmental toxicity, can all
influence the maternal and
developmental effects observed and the
resulting ratios (Kimmel et al., 1987; U.S.
EPA, 1985b). Also, maternal and
developmental end points used in the
ratios need to be better defined to
permit cross-species comparison. Until
such information is available, the
applicability of these approaches in risk
assessment is not justified.
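  As an illustration of why such ratios are sensitive to study design, the
following minimal Python sketch shows how the number and spacing of dose
levels alone can change an adult-to-developmental ratio for the same agent.
Everything in it is an assumption made for illustration: the ratio is
computed from observed LOAELs, the effect thresholds and dose levels are
hypothetical, and the function names are not taken from any of the cited
ranking schemes.

```python
def observed_loael(tested_doses, true_threshold):
    """Lowest tested dose at or above a (hypothetical) true effect threshold."""
    at_or_above = [d for d in sorted(tested_doses) if d >= true_threshold]
    return at_or_above[0] if at_or_above else None


def adult_to_developmental_ratio(tested_doses, adult_threshold, dev_threshold):
    """Ratio of the observed adult LOAEL to the observed developmental LOAEL."""
    adult = observed_loael(tested_doses, adult_threshold)
    dev = observed_loael(tested_doses, dev_threshold)
    if adult is None or dev is None:
        return None  # ratio is undefined if either effect was never observed
    return adult / dev


# Hypothetical agent: adult effects begin at 40 mg/kg, developmental at 15 mg/kg.
ADULT_THRESHOLD = 40
DEV_THRESHOLD = 15

wide_spacing = [10, 100, 1000]       # hypothetical 10-fold dose spacing
narrow_spacing = [10, 20, 50, 100]   # hypothetical closer dose spacing

ratio_wide = adult_to_developmental_ratio(wide_spacing, ADULT_THRESHOLD, DEV_THRESHOLD)
ratio_narrow = adult_to_developmental_ratio(narrow_spacing, ADULT_THRESHOLD, DEV_THRESHOLD)

print("ratio with wide spacing:  ", ratio_wide)
print("ratio with narrow spacing:", ratio_narrow)
```

  In this sketch the wide spacing assigns both effects to the same tested
dose, while the narrow spacing separates them, so the same hypothetical
agent yields two different ratios.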
  e. Short-Term Testing in
Developmental Toxicity. The need for
short-term tests for developmental
toxicity has arisen from the need to
establish testing priorities for the large
number of agents in or entering the
environment, the interest in reducing the
number of animals used for routine
testing, and the expense of testing.
These approaches may be useful in
making preliminary evaluations of
potential developmental toxicity, for
evaluating structure-activity
relationships, and for assigning
priorities for further, more extensive
testing. Furthermore,  as the risk
assessment process begins to
incorporate more pharmacokinetic and
mechanistic data, short-term tests
should be particularly useful. Kimmel
(1990) has recently discussed the
potential application  of in vitro systems
in risk assessment in a context that is
broader than chemical screening.
However, the Agency currently
considers a short-term test as
"insufficient" by itself to carry out a risk
assessment (see Section III.C).
  Although short-term tests for
developmental toxicity are not routinely
required, such data are encountered in
the review of chemicals. Two
approaches are considered here in terms
of their contribution to the overall
testing process: (1) An in vivo
mammalian screen, and (2) in vitro test
systems.
  (1) In vivo mammalian developmental
toxicity tests. The most widely studied
in vivo short-term approach is that
developed by Chernoff and Kavlock
(1982). This approach is based on the
hypothesis that a prenatal injury that
results in altered development will be
manifested postnatally as reduced
viability and/or impaired growth. When
originally proposed, the test substance
 was administered to mice over the
 period of major organogenesis at a
 single dose level that would elicit some
 degree of maternal toxicity. At the
 NIOSH "Workshop on the Evaluation of
 the Chernoff/Kavlock Test for
 Developmental Toxicity" (Hardin, 1987),
 use of a second, lower dose level was
 encouraged to potentially reduce the
 chance of false positive results, and the
 recording of implantation sites was
 recommended to provide a more precise
 estimate of postimplantation loss
 (Kavlock et al., 1987c). In this approach,
 the pups are counted and weighed
 shortly after birth, and again after 3-4
 days. End points that are considered in
 the evaluation include: general maternal
 toxicity (including survival and weight
 gain), litter size, pup viability and
 weight, and gross malformations in the
 offspring. Several schemes have been
 proposed for ranking the results as a
 means of prioritizing agents for further
 testing (Chernoff and Kavlock, 1982;
 Brown, 1984; Schuler et al., 1984).
   The mouse was chosen originally for
 this test because of its low cost, but the
 procedure has been applied to the rat as
 well (Wickramaratne, 1987). The test
 can predict the potential for
 developmental toxicity of an agent in
 the species used while extrapolation of
 risk to other species, including humans,
 has the same limitations as for other
 testing protocols. The EPA Office of
 Toxic Substances has developed testing
 guidelines for this procedure (U.S. EPA,
 1985c), and the Office of Pesticide
 Programs has applied similar protocols
 on a case-by-case basis (U.S. EPA,
 1983b). The National Toxicology
 Program also has developed a protocol
 that incorporates aspects of a range-
 finding study, with the intent of
 providing information on appropriate
 exposure levels should a standard
 developmental toxicity study be
 required (Morrissey et al., 1989).
 Although testing guidelines are
 available, such procedures are required
 on a case-by-case basis. Application of
 this procedure in the risk assessment
 process within the Office of Toxic
 Substances has been described (Francis
 and Farland, 1987), and the experiences
 of a number of laboratories are detailed
 in the proceedings of a NIOSH-
 sponsored workshop (Hardin, 1987).
   Recently, the OECD developed a
 screening protocol to be used for
 prioritizing existing chemicals for further
 testing (draft as of March 22, 1990). This
 protocol is similar to the design of the
 Chernoff-Kavlock Test except that it
 involves exposure of male and female
 rats 2 weeks prior to mating, throughout
 mating and gestation, and postnatally to
day 4. Male animals are exposed
following mating for a period
corresponding to that of the females.
Adult animals are evaluated for general
toxicity and effects on reproductive
organs. Pups are counted, weighed, and
examined for any gross physical or
behavioral abnormalities at birth and
postnatal day 4. This protocol permits
evaluation of reproductive and
developmental toxicity following
repeated dosing with an agent, provides
an indication for the need to conduct
additional studies, and provides
guidance in the design of further studies.
Currently, this study design is
insufficient by itself to make an estimate
of human risk without further studies to
confirm and extend the observations.
  (2) In vitro developmental toxicity
screens. Test systems that fall under the
general heading of "in vitro"
developmental toxicity screens include
any system that employs a test subject
other than the intact pregnant mammal.
Examples of such systems include:
isolated whole mammalian embryos in
culture, tissue/organ culture, cell
culture, and developing nonmammalian
organisms. These systems have long
been used to assess events associated
with normal and abnormal development,
but more recently they have been
considered for their potential as screens
in testing (Wilson, 1978; Kimmel et al.,
1982b; Brown and Fabro, 1982). Many of
these systems are now being evaluated
for their ability to predict the
developmental toxicity of various agents
in intact mammalian systems. This
validation process requires certain
considerations in study design, including
defined end points for toxicity and an
understanding of the system's ability to
handle various test agents (Kimmel et
al., 1982a; Kimmel, 1985; FDA, 1987;
Brown, 1987).
  While in vitro test systems can
provide significant information, they are
considered insufficient, by themselves,
for carrying out a risk assessment (see
Section III.C). In part, this is due to
limitations in the application of the data
to the whole animal situation. But it is
also due to the lack of assays that have
been fully validated, as has been noted
in several reviews of available in vitro
systems (FDA, 1987; Brown, 1987;
Faustman, 1988) and at a recent
workshop on in vitro teratology
(Morrissey et al., 1991).
  f. Statistical Considerations. In the
assessment of developmental toxicity
data, statistical considerations require
special attention. Since the litter is
generally considered the experimental
unit in most developmental toxicity
studies, and fetuses or pups within
litters do not respond independently, the
statistical analyses are generally
designed to analyze the relevant data
based on incidence per litter or on the
number of litters with a particular end
point. The analytical procedures used
and the results, as well as an indication
of the variance in each end point, should
be evaluated carefully when reviewing
data for risk assessment purposes.
Analysis of variance (ANOVA)
techniques, with litter nested within
dose in the model, take the litter
variable into account while allowing use
of individual offspring data and an
evaluation of both within and between
litter variance as well as dose effects.
Nonparametric and categorical
procedures have also been widely used
for binomial or incidence data. In
addition, tests for dose-response trends
can be applied. Although a single
statistical approach has not been agreed
upon, a number of factors important in
the analysis of developmental toxicity
data have been discussed (Haseman
and Kupper, 1979; Kimmel et al., 1986).
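  The practical meaning of treating the litter as the experimental unit can
be sketched in a few lines of Python. The example below is only an
illustration under stated assumptions: the litter counts are hypothetical,
the function names are invented for this sketch, and it summarizes data on
a per-litter basis rather than performing any particular significance test.

```python
from statistics import mean, variance


def litter_summary(litters):
    """Summarize one dose group from (affected, total) counts per litter."""
    proportions = [affected / total for affected, total in litters if total > 0]
    return {
        "n_litters": len(proportions),
        "litters_affected": sum(
            1 for affected, total in litters if total > 0 and affected > 0
        ),
        "mean_litter_proportion": mean(proportions),
        "between_litter_variance": (
            variance(proportions) if len(proportions) > 1 else 0.0
        ),
    }


def pooled_fetal_incidence(litters):
    """Naive incidence that ignores litter membership (shown for contrast only)."""
    affected = sum(a for a, _ in litters)
    total = sum(t for _, t in litters)
    return affected / total


# Hypothetical data: (affected fetuses, total fetuses) for each litter.
control = [(0, 12), (1, 10), (0, 11), (0, 13)]
treated = [(6, 12), (0, 10), (0, 11), (0, 12)]

control_summary = litter_summary(control)
treated_summary = litter_summary(treated)
treated_pooled = pooled_fetal_incidence(treated)

print("treated, per-litter summary:", treated_summary)
print("treated, pooled fetal incidence:", round(treated_pooled, 3))
```

  In the hypothetical treated group all six affected fetuses come from a
single litter. Counting fetuses as independent would suggest six separate
responses, whereas the per-litter summary shows one responding litter out
of four, together with the between-litter variance that a reviewer would
want reported.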
  Studies that employ a replicate
experimental design (e.g., two or three
replicates with 10 litters per dose per
replicate rather than a single experiment
with 20 to 30 litters per dose group)
allow broader interpretation of study
results since the variability between
replicates can be accounted for using
ANOVA techniques. Replication of
effects due to a given agent within a
study, as well as among studies or
laboratories, provides added strength in
the use of data for the estimation of risk.
  An important factor to consider in
evaluating data is the power of a study
(i.e., the probability that a study will
demonstrate a true effect), which is
limited by the sample size used in the
study, the background incidence  of the
end point observed, the variability in the
incidence of the end point, and the
analysis method. As an example, Nelson
and Holson (1978) have shown that the
number of litters needed to detect a 5%
or 10% change was dramatically lower
for fetal weight (a continuous variable
with low variability) than for
resorptions (a binomial response with
high variability). With the current
recommendation in testing protocols
being 20 rodents per dose group (U.S.
EPA, 1982b, 1985a), the minimum change
detectable is  an increased incidence of
malformations 5 to 12 times above
control levels, an increase 3 to 6 times
the in utero death rate, and a decrease
0.15 to 0.25 times the fetal weight. Thus,
even within the same study, the ability
to detect a change in fetal weight is
much greater than for the other end
points measured. Consequently, for
statistical reasons only, changes in fetal
weight are often observed at doses
below those producing other signs of
developmental toxicity. Any risk
assessment should present the detection
sensitivity for the study design used and
for the end point(s) evaluated.
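  The dependence of detection sensitivity on end point variability can be
sketched with a standard normal-approximation sample size formula. This is
a simplified illustration, not the method of Nelson and Holson (1978): it
assumes a two-sided comparison of two group means, and the between-litter
coefficients of variation used for fetal weight and resorptions are
hypothetical values chosen only to contrast a low-variability end point
with a high-variability one.

```python
from math import ceil
from statistics import NormalDist


def litters_per_group(cv, fractional_change, alpha=0.05, power=0.80):
    """Approximate litters per group needed to detect a fractional change
    in a mean, given the between-litter coefficient of variation (cv).

    Uses n = 2 * ((z_alpha + z_beta) * cv / change)^2 for a two-sided test.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * cv / fractional_change) ** 2)


# Hypothetical coefficients of variation (assumptions for illustration only).
FETAL_WEIGHT_CV = 0.08   # continuous end point with low variability
RESORPTION_CV = 0.80     # incidence end point with high variability

weight_litters = litters_per_group(FETAL_WEIGHT_CV, 0.10)
resorption_litters = litters_per_group(RESORPTION_CV, 0.10)

print("litters per group for a 10% change in fetal weight:", weight_litters)
print("litters per group for a 10% change in resorptions: ", resorption_litters)
```

  Under these assumptions the required number of litters scales with the
square of the coefficient of variation, which is why a study sized for one
end point may have little ability to detect changes in another.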
  Although statistical analyses are
important in determining the effects of a
particular agent, the biological
significance of data is most relevant. It
is important to be aware that with the
number of end points that can be
observed in standard protocols for
developmental toxicity studies, a few
statistically significant differences may
occur by chance. On the other hand,
apparent trends with dose may be
biologically relevant even though pair-
wise comparisons do not indicate a
statistically significant effect. This may
be true especially for the incidence of
malformations or in utero death because
of the low power of standard study
designs in which a relatively large
difference is required to be statistically
significant. It should be apparent from
this discussion that a great deal of
scientific judgment, based on experience
with developmental toxicity data and
with principles of experimental design
and statistical analysis, may be required
to adequately evaluate such data.
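  The point that a few statistically significant differences may occur by
chance can be made concrete with a short Python calculation. The sketch
assumes that every end point is tested independently at the same
significance level; end points in a real study are usually correlated, so
the figures it produces are a rough guide only, and the end point counts
used are hypothetical.

```python
def prob_at_least_one_false_positive(n_endpoints, alpha=0.05):
    """Chance of one or more spurious 'significant' results when no true
    effect exists, assuming independent tests at level alpha."""
    return 1 - (1 - alpha) ** n_endpoints


def expected_false_positives(n_endpoints, alpha=0.05):
    """Expected number of spurious 'significant' results under the same
    assumptions."""
    return n_endpoints * alpha


# Hypothetical numbers of end points examined in one study.
for n in (1, 5, 20):
    print(
        n,
        "end points:",
        round(prob_at_least_one_false_positive(n), 3),
        "probability of at least one chance finding;",
        expected_false_positives(n),
        "expected",
    )
```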
2. Human Studies
  In principle, human data are preferred
for risk assessment. However, the
complexities of obtaining sufficient
human data are such that these data are
not available for many potential
toxicants. The following describes the
methods of generation of human data,
their evaluation, and the weight they
should be given in risk assessments.
  The category of "human studies"
includes both epidemiologic studies and
other reports of individual cases or
clusters of events. Greatest weight
should be given to carefully designed
epidemiologic studies with more precise
measures of exposure, since they can
best evaluate exposure-response
relationships (see Section IV).
Epidemiologic studies in which exposure
is presumed based on occupational title
or residence (e.g., some case-referent
and all ecologic studies) may contribute
data to qualitative risk assessments, but
are of limited use for quantitative risk
assessments because of the generally
broad categorical groupings. Reports of
individual cases or clusters of events
may generate hypotheses of exposure-
outcome associations, but require
further confirmation with well-designed
epidemiologic or laboratory studies.
These reports of cases or clusters may
give added support to associations
suggested by other human or animal
 data, but cannot stand by themselves in
 risk assessments. Risk assessors should
 seek the assistance of professionals
 trained in epidemiology when
 conducting a detailed analysis.
  a. Epidemiologic Studies. Good
 epidemiologic studies provide the most
 relevant information for assessing
 human risk. As there are many different
 designs for epidemiologic studies,
 simple rules for their evaluation do not
 exist.
  (1) General design considerations. The
 factors that enhance a study and thus
 increase its usefulness for risk
 assessment have been noted in a
 number of publications (Selevan, 1980;
 Bloom, 1981; U.S. EPA, 1981; Wilcox,
 1983; Sever and Hessol, 1984; Axelson,
 1985; Tilley et al., 1985; Kimmel et al.,
 1988). Some of the more prominent
 factors are as follows:
   (a) The power of the study: The
 power, or ability of a study to detect a
 true effect, is dependent on the size of
 the study group, the frequency of the
 outcome in the general population, and
 the level of excess risk to be identified.
 In a cohort study, common outcomes,
 such as recognized fetal loss, require
 hundreds of pregnancies in order to
 have a high probability of detecting a
 modest increase in risk (e.g., 133 in both
 exposed and unexposed groups to detect
 a doubling of background; alpha = 0.05,
 power = 80%), while less common
 outcomes, such as the total of all
 malformations recognized at birth,
 require thousands of pregnancies to
 have the same probability (e.g., more
 than 1,200 in both exposed and
 unexposed groups) (Bloom, 1981;
 Selevan, 1981; Sever and Hessol, 1984;
 Selevan, 1985; Stein et al., 1985; Kimmel
 et al., 1986). In case-referent studies,
 study sizes are dependent on  the
 frequency of exposure within the source
 population. The confidence one has in
 the results of a study without positive
 findings is related to the power of the
 study to detect meaningful differences in
 the end points studied.
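   The relationship between outcome frequency and required study size can
 be sketched with a textbook two-proportion sample size formula. This is an
 illustration under stated assumptions, not the calculation behind the
 figures quoted above: the formula is the normal approximation without a
 continuity correction, the test is taken as two-sided, and the background
 rates of 15% and 3% are hypothetical stand-ins for a common and a less
 common outcome. Its results therefore need not match the cited numbers.

```python
from math import ceil, sqrt
from statistics import NormalDist


def pregnancies_per_group(background_rate, relative_risk, alpha=0.05, power=0.80):
    """Approximate pregnancies needed in each of two equal-sized cohort
    groups to detect a given relative risk over a background rate."""
    p1 = background_rate
    p2 = background_rate * relative_risk
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("rates must lie strictly between 0 and 1")
    if p1 == p2:
        raise ValueError("relative risk of 1 leaves no difference to detect")
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Hypothetical background rates (assumptions for illustration only).
common_outcome = pregnancies_per_group(0.15, 2.0)  # e.g., a frequent outcome
rare_outcome = pregnancies_per_group(0.03, 2.0)    # e.g., a less frequent outcome

print("per group, common outcome:", common_outcome)
print("per group, rare outcome:  ", rare_outcome)
```

   The same function can be run with the design of a completed study to
 judge, after the fact, whether a null result came from a study large
 enough to have detected a meaningful difference.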
  Power may be enhanced by combining
 populations from several studies using a
 metaanalysis (Greenland, 1987). The
 combined analysis would increase
 confidence in the absence of risk for
 agents with negative findings. However,
 care must be exercised in the
 combination of potentially dissimilar
 study groups.
   A posteriori determination of power
 of the actual study may be useful in
 evaluating contradictory studies in risk
 assessment. Absence of positive
findings in a study of low power would
 be given less weight than either a
 positive study or a null study  (one with
 no significant differences) with high
 power. Positive findings from very small
 studies are open to question due to the
 instability of the risk estimates and the
 potential for highly selected study
 groups.
   (b) Potential bias in data collection:
 Sources of bias may include selection
 bias and information bias (Rothman,
 1985). Selection bias may occur when an
 individual's willingness to participate
 varies with certain characteristics
 relating to the exposure status or health
 status of that individual. In addition,
 selection bias may operate in the
 identification of subjects for study. For
 example, in studies of embryonic loss,
 use of hospital records to identify
 embryonic or early fetal loss will
 underascertain events, because women
 are not always hospitalized for these
 outcomes. More weight might be given
 in a risk assessment to a study in which
 a more complete list of pregnancies is
 obtained by, for example, collecting
 biological data [e.g., human chorionic
 gonadotropin (hCG) measurements] on
 pregnancy status from study members.
 These studies may also be affected by
 bias. The representativeness of these
 data may be affected by selection
 factors related to the willingness of
 different groups of women to continue
 participation over the total length of the
 study. Interview data result in more
 complete ascertainment; however, this
 strategy carries with it the potential for
 recall bias,  discussed in further detail
 below. A second example of different
 levels of ascertainment of events is the
 use of hospital records to study
 congenital malformations. Hospital
 records contain more complete data on
 malformations than do birth certificates
 (Mackeprang et al., 1972). Consequently,
 birth defects registries that are based on
 searches of hospital records are more
 complete than those based on vital
 records (Selevan, 1986). Thus, a study
 using hospital records to identify
 congenital malformations would be
 given more emphasis in a risk
 assessment than one using birth
 certificates.
   Studies of working women present the
 potential for additional bias since some
 factors that influence employment  status
 may also be associated with
 reproductive end points. For example,
 due to child-care responsibilities,
 women may terminate employment, as
 might women with a history of
 reproductive problems who wish to have
 children and are concerned about
 workplace exposures (Joffe, 1985).
   Information bias may result from
 misclassification of characteristics of
 individuals or events identified for
   study. Recall bias, one type of
   information bias, may occur when
   respondents with specific exposures or
   outcomes recall information differently
   than those without the exposures or
   outcomes. Interview bias may result
   when the interviewer knows a priori the
   category of exposure (for cohort studies)
   or outcome (for case-referent studies) in
   which the respondent belongs. Use of
   highly structured questionnaires and/or
   "blinding" of the interviewer will reduce
   the likelihood of such bias. Studies with
   lower likelihood of the above-listed bias
   should carry more weight in a risk
   assessment.
    When data are collected by interview
   or questionnaire, the appropriate
   respondent depends on the type of data
   or study. For example, a comparison of
   husband-wife interviews on
   reproduction found the wives' responses
   to questions on pregnancy-related
   events to be considerably more
   complete and valid than those of the
   husbands (Selevan, 1980). A more recent
   study (Schnatter, 1990) found small,
   nonsignificant improvements in
   reporting  of birth weights by mothers
   compared to fathers, and that males
   who provide early fetal loss data with
   the aid of their wives give better data
   (borderline significance). Studies based
   on interview data from the appropriate
   respondent(s) (e.g., the specific
   individual when examining exposure
   history, and the woman or both partners
   when examining pregnancy history)
   would carry more weight than those
   from proxy respondents.
    Data from any source may be prone to
   errors or bias. All types of bias are
   difficult to assess; however, validation
   with an independent data source (e.g.,
   vital or hospital records), or use of
   biomarkers of exposure or outcome,
   where possible, may indicate the degree
   of bias present and increase confidence
   in the results of the study. Those studies
   with a low probability of biased data
   should carry more weight (Axelson,
   1985; Stein and Hatch, 1987).
    Differential misclassification, i.e.,
   when certain subgroups are more likely
   to have misclassified data than others,
   may either raise or lower the risk
   estimate. Nondifferential
   misclassification will bias the results
   toward a finding of "no effect"
   (Rothman, 1986).
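
    The attenuation caused by nondifferential misclassification can be
   sketched numerically for a dichotomous exposure. In the Python example
   below, the counts, the sensitivity of 0.8, and the specificity of 0.9 are
   hypothetical; the same error rates are applied to cases and noncases,
   which is what makes the misclassification nondifferential.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_noncases, unexposed_noncases):
    """Odds ratio from the four cells of a 2x2 table."""
    return (exposed_cases * unexposed_noncases) / (unexposed_cases * exposed_noncases)


def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected counts classified as exposed and unexposed under imperfect
    exposure measurement (hypothetical helper)."""
    observed_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    observed_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return observed_exposed, observed_unexposed


# Hypothetical true counts: (exposed, unexposed).
cases = (40, 60)
noncases = (20, 80)
true_or = odds_ratio(cases[0], cases[1], noncases[0], noncases[1])

# Identical error rates in cases and noncases (nondifferential).
obs_cases = misclassify(*cases, sensitivity=0.8, specificity=0.9)
obs_noncases = misclassify(*noncases, sensitivity=0.8, specificity=0.9)
observed_or = odds_ratio(obs_cases[0], obs_cases[1], obs_noncases[0], obs_noncases[1])

print(f"true odds ratio:     {true_or:.2f}")
print(f"observed odds ratio: {observed_or:.2f}")
```

    In this sketch the observed odds ratio lies between 1 and the true odds
   ratio, that is, it is biased toward a finding of "no effect."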
    (c) Collection of data on other risk
   factors, effect modifiers, and
   confounders: Risk factors for
   reproductive and developmental toxicity
   include such characteristics as age,
   smoking, alcohol consumption, drug use,
   and past reproductive history.
   Additionally, occupational and
   environmental exposures are potential
 risk factors for reproductive and
 developmental effects. Known and
 potential risk factors should be
 examined to identify those that may be
 effect modifiers or confounders. An
 effect modifier is a factor that produces
 different exposure-response
 relationships at different levels of that
 factor. For example, maternal age  would
 be an effect modifier if the risk
 associated with a given exposure
 increased with the mother's age. A
 confounder is a variable that is a risk
 factor for the disease under study  and is
 associated with the exposure under
 study, but is not a consequence  of the
 exposure. A confounder may distort
 both the magnitude and  direction of  the
 measure of association between the
 exposure of interest and the outcome.
 For example, socioeconomic status
 might be a confounder in a study of the
 association of smoking and fertility,
 since socioeconomic status may be
 associated with both.
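
   The distortion produced by a confounder can be sketched with a
 stratified analysis. The Python example below uses hypothetical counts in
 which the exposure has no effect within either stratum of the confounder,
 yet the crude (unstratified) odds ratio suggests an association. The
 Mantel-Haenszel summary odds ratio is used here as one common way to
 control for a stratification variable; it is an illustrative choice, not
 a method prescribed by these Guidelines.

```python
def odds_ratio(a, b, c, d):
    """a, b = exposed with/without outcome; c, d = unexposed with/without outcome."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio over a list of (a, b, c, d) strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts by level of the confounder (e.g., socioeconomic status).
strata = [
    (30, 70, 15, 35),   # lower stratum: exposure common, outcome common
    (5, 45, 20, 180),   # higher stratum: exposure rare, outcome rare
]

# Crude analysis: collapse the strata into a single 2x2 table.
crude_or = odds_ratio(*(sum(cell) for cell in zip(*strata)))
adjusted_or = mantel_haenszel_or(strata)

print(f"crude odds ratio:    {crude_or:.2f}")
print(f"adjusted odds ratio: {adjusted_or:.2f}")
```

   Here the stratum-specific odds ratios are both 1, so the apparent
 association in the crude table arises entirely from the confounder.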
   Studies that fail to account for effect
 modifiers and confounders should be
 given less weight in a risk assessment.
 Both of these important factors need to
 be controlled in the study design and/or
 analysis to improve the estimate of the
 effects of exposure (Kleinbaum et al.,
 1982). A more in-depth discussion  may
 be found elsewhere (Epidemiology
 Workgroup, 1981; Kleinbaum et  al., 1982;
 Rothman, 1986). The statistical
 techniques used to control for these
 factors require careful consideration in
 their application and interpretation
 (Kleinbaum et al., 1982; Rothman, 1986).
   (d) Statistical factors: As in animal
 studies, pregnancies experienced by the
 same woman are not independent
 events (Kissling, 1981; Selevan, 1985).
 Women who have had embryo/fetal loss
 are reported to be more likely to have
 subsequent losses (Leridon, 1977).  In
 animal studies, the litter is generally
 used as the unit of measure to deal with
 nonindependence of events. In studies of
 humans, pregnancies are sequential with
 the risk factors changing for different
 pregnancies, making analyses
 considering nonindependence of events
 very difficult (Epidemiology Workgroup,
 1981; Kissling, 1981). If more than one
 pregnancy per woman is included, as is
 often necessary due to small study
 groups, the use of nonindependent
 observations overestimates the true size
 of the groups being compared, thus
 artificially increasing the probability of
 reaching statistical significance
 (Stiratelli et al., 1984). Biased estimates
 of risk might also result if family size
 confounds the relationship between
 exposure and outcome. Some
 approaches to deal with these issues
 have been suggested (Kissling, 1981;
  Stiratelli et al., 1984; Selevan, 1985). At
  this point in time, a generally accepted
  solution to this problem has not been
  developed.
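
    The overestimation of group size from nonindependent observations can
  be illustrated with the design-effect approximation used for clustered
  data. This is a simplified sketch, not one of the approaches cited above:
  it assumes an equal number of pregnancies per woman and a single
  within-woman correlation, and the function name and numerical values are
  hypothetical.

```python
def effective_sample_size(n_pregnancies, pregnancies_per_woman, within_woman_corr):
    """Approximate number of independent observations represented by
    clustered pregnancies (design-effect approximation)."""
    design_effect = 1 + (pregnancies_per_woman - 1) * within_woman_corr
    return n_pregnancies / design_effect


# Hypothetical study: 100 women contributing 3 pregnancies each.
nominal_n = 300
effective_n = effective_sample_size(nominal_n, pregnancies_per_woman=3,
                                    within_woman_corr=0.2)
print(f"nominal n = {nominal_n}, effective n = {effective_n:.0f}")
```

    When the assumed correlation is zero the effective and nominal sizes
  coincide; any positive correlation means that treating all pregnancies as
  independent overstates the information in the study.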
    (2) Selection of outcomes for study. As
  already discussed, a number of end
  points can be considered in the
  evaluation of adverse developmental
  effects. However, some of the outcomes
  are not easily observed in humans, such
  as early embryonic loss and
  reproductive capacity of the offspring.
  Currently, the most feasible end points
  for epidemiologic studies are
  reproductive history studies of some
  pregnancy outcomes (e.g., embryo/fetal
  loss,  birth weight, sex ratio,  congenital
  malformations, postnatal function, and
  neonatal growth and survival) and
  measures of fertility/infertility which
  would include indirect evaluations of
  very  early embryonic loss. Postnatal
  outcomes for examination could include
  physical growth and development, organ
  or system function and behavioral
  effects of exposure. Factors requiring
  control in the design or analysis (such as
  effect modifiers  and confounders) may
  vary depending  on the specific outcomes
  selected for study.
    The developmental outcomes
  available for epidemiologic examination
  are limited by a number of factors,
  including the relative magnitude of the
  exposure since differing spectra of
  outcomes may occur at different
  exposure levels, the size  and
  demographic characteristics of the
  population, and  the ability to observe
  the developmental outcome in humans.
  Improved methods for identifying some
  outcomes such as very early embryonic
  loss using new hCG assays may change
  the spectrum of outcomes available for
  study (Wilcox et al., 1985; Sweeney et
  al., 1988).
    Demographic characteristics of the
  population, such as marital status, age
  distribution, education, socioeconomic
  status (SES) and prior reproductive
  history are associated with the
  probability of whether couples will
  attempt to have  children. Differences in
  the use of birth control would also affect
  the number of outcomes available for
  study. In addition, women with live
 births are more likely to terminate
  employment than are those with other
  outcomes, such as infertility or early
  embryonic loss. Thus, retrospective
  studies of female exposure that do not
 include women who have left
 employment may be of limited use in risk
 assessment because the level of risk for
 these outcomes is likely to be
 overestimated (Lemasters and Pinney, 1989).
   In addition to the above-mentioned
 factors, developmental end points may
 be envisioned as effects recognized at
 various points in a continuum, starting
 at conception through death of the
 offspring. Thus, a malformed stillbirth
 would not be included in a study of
 defects observed at live birth, even
 though the etiology could be identical
 (Stein et al., 1975; Bloom, 1981). A shift
 in the patterns of outcomes could result
 from differences in timing or in level of
 exposure (Selevan and LeMasters, 1987).
   (3) Reproductive history studies, (a)
 Measures of fertility: Normally, studies
 of sub- or infertility would not be
 included in an evaluation of
 developmental effects. However, in
 humans it is  difficult to identify very
 early embryonic loss, and distinguish it
 from sub- or  infertility. Thus, studies
 that examine sub- or infertility indirectly
 examine loss very early in the
 gestational period. Infertility or
 subfertility may be thought of as a
 nonevent: A  couple is unable to have
 children within a specific time frame.
 Therefore, the epidemiologic
 measurement of reduced fertility is
 typically indirect, and is accomplished
 by comparing birth rates or time
 intervals between births or pregnancies.
 In these evaluations, the couple's joint
 ability to procreate is estimated. One
 method, the Standardized Birth Ratio
 (SBR; also referred to as the
 Standardized Fertility Ratio), compares
 the number of births observed to those
 expected based on the person-years of
 observation stratified by factors such as
 time period, age, race, marital status,
 parity, contraceptive use, etc. (Wong et
 al., 1979; Levine et al., 1980, 1981; Levine,
 1983; Starr et al., 1986). The SBR is
 analogous to the Standardized Mortality
 Ratio (SMR), a measure frequently used
 in studies of occupational cohorts,  and
 has similar limitations in interpretation
 (Gaffey, 1976; McMichael, 1976; Tsai and
 Wen, 1986).
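
   The SBR calculation described above can be sketched as the ratio of
 observed births to the births expected if reference birth rates applied
 to the cohort's person-years in each stratum. In the Python example
 below, the strata, person-years, reference rates, and birth counts are
 hypothetical, and only age is used for stratification to keep the
 example short.

```python
def standardized_birth_ratio(strata):
    """Observed births divided by births expected from reference rates.

    Each stratum is a dict with hypothetical keys:
    'person_years', 'reference_rate' (births per person-year), 'births'.
    """
    observed = sum(s["births"] for s in strata)
    expected = sum(s["person_years"] * s["reference_rate"] for s in strata)
    return observed / expected


# Hypothetical cohort stratified by age group.
cohort = [
    {"age": "20-29", "person_years": 400, "reference_rate": 0.12, "births": 40},
    {"age": "30-39", "person_years": 600, "reference_rate": 0.06, "births": 30},
]

sbr = standardized_birth_ratio(cohort)
print(f"SBR = {sbr:.2f}")
```

   An SBR below 1 indicates fewer births than expected in the cohort; as
 with the SMR, its interpretation depends on how comparable the reference
 population is to the cohort.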
   Analysis of the time period between
 recognized pregnancies or live births
 has been suggested as another indirect
 measure of fertility (Dobbins et al., 1978;
 Baird et al., 1986; Weinberg and Gladen,
 1986). Because the time interval between
 births increases with increasing parity
 (Leridon, 1977), comparisons within
 birth order (parity) are more
 appropriate. A statistical method (Cox
 regression) can stratify by birth or
 pregnancy order to help control for
 nonindependence of these events in the
 same woman.
   Fertility may also be affected by
 alterations in sexual behavior. However,
 limited data  are available linking toxic
 exposures to these alterations in
 humans. Moreover, such data are not
 easily obtained in epidemiology studies.
 More information on this subject is
 available in the proposed male and
 female reproductive risk assessment
 guidelines (U.S. EPA, 1988b, 1988c).
   (b) Pregnancy outcomes: Pregnancy
 outcomes examined in human studies of
 parental exposures may include
 embryo/fetal loss, congenital
 malformations, birth weight, sex ratio at
 birth, and postnatal effects (e.g.,
 physical growth and development, organ
 or system function, and behavioral
 effects of exposure). Postnatal effects
 are discussed in more detail in the next
 section. As mentioned previously,
 epidemiologic studies that focus on only
 one type of pregnancy outcome may
 miss a true effect of exposure due to the
 continuum of outcomes. Examination of
 individual outcomes could mask a true
 effect due to reduced power resulting
 from fewer events for study. Studies that
 examine multiple end points could yield
 more information, but the results may be
 difficult to interpret.
   Evidence of a dose-response
 relationship is usually an important
 criterion in the assessment of a toxic
 exposure. However, traditional dose-
 response relationships may not always
 be observed for some end points. For
 example, with increasing dose, a
 pregnancy might end in a fetal loss
 rather than a live birth with
 malformations. A shift in the patterns of
 outcomes could result from differences
 either in level of exposure or in timing
 (Wilson, 1973; Selevan and Lemasters,
 1987) (for a more detailed description,
 see Section III.A.2.3.5). Therefore, a risk
 assessment should, when possible,
 attempt to look at the interrelationship
 of different reproductive end points and
 patterns of exposure.
   (c) Postnatal developmental effects:
 These effects may include changes in
 growth, behavior, organ or system
 function, or cancer. Studies of
 neurological and reproductive function
 are discussed here as examples.
 Postnatal behavioral and functional
 effects in humans have been examined
 for a small number of environmental
 and occupational agents (e.g., lead,
 PCBs, methyl mercury, alcohol). For
 some agents (e.g., lead and PCBs), subtle
 changes have been observed in groups
 of children at lower exposures than for
 other developmental effects (e.g.,
Bellinger et al., 1987; Needleman, 1988;
 Davis et al., 1990; Tilson et al., 1990).
This may not be true for all toxic agents.
These subtle differences would be
difficult to identify in individuals, but
could result in an overall shifting of
mean values when comparing groups of
exposed and unexposed children. Some
postnatal studies have examined infants
  or young children using standard
  developmental scales (e.g., Brazelton
  Neonatal Behavioral Assessment Scale,
  Bayley Scales of Infant Development,
  Stanford Binet IV, and Wechsler Scales)
  and some biologic measure of exposure
  (e.g., blood lead levels). These tests are
  designed to examine certain end points
  and have been developed to cover
  certain age ranges. Certain tests
  examine specific aspects of
  development. For example, the Bayley
  Scales look at motor and language
  development, but do not examine
  sensory function. Batteries of tests are
  important for a proper evaluation due to
  the possibility of interrelated effects,
  e.g., hearing deficits and language
  development. Thus, batteries of tests
  will give a clearer indication of direct
  effects of exposure resulting in postnatal
  developmental deficits.
    Factors that may influence the
  examination of these effects include
  parental education, SES, obstetrical
  history, and health characteristics
  independent of exposure that may affect
  functional measurement (e.g., injuries
  and infections). Many social and
  lifestyle factors may also affect scoring
  on these scales (e.g., neonatal-maternal
  interactions, SES, home environment).
    Studies of premature infants carry
  special problems. For proper
  comparisons, tests keyed to age in very
  young children (less than 2.5 years of
  age) need to "correct" the age for
  premature infants to the age they would
  have been had they been born at term.
  In addition, premature infants or those
  with low birth weight for their
  gestational age may have problems
  resulting from the birth process not
  directly related to exposure (e.g.,
  intraventricular hemorrhage in the brain
  which can then cause developmental
  problems). Thus, the developmental
  effects resulting from exposure may
  have their own sequelae.
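
    The age "correction" for premature infants mentioned above amounts to
  subtracting the number of weeks by which the birth preceded term. The
  Python sketch below assumes a term gestation of 40 weeks; that value and
  the function name are illustrative assumptions, not specifications from
  these Guidelines.

```python
TERM_WEEKS = 40  # assumed length of a term gestation, in weeks


def corrected_age_weeks(chronological_age_weeks, gestational_age_at_birth_weeks):
    """Age the child would have been had the birth occurred at term."""
    weeks_early = max(0, TERM_WEEKS - gestational_age_at_birth_weeks)
    return chronological_age_weeks - weeks_early


# Hypothetical infant born at 32 weeks and tested one year after birth.
print(corrected_age_weeks(52, 32))
# An infant born at term needs no correction.
print(corrected_age_weeks(52, 40))
```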
   Other studies may examine effects
  occurring at a later age (e.g., in utero
  exposure and cancer in young women).
  This long time interval typically carries
  with it the need for retrospective
  studies, with the inherent limitations in
  accurate determination of exposure,
 effect modifiers, and confounders. Risk
  assessment methods for cancer are
  described in the "Guidelines for
 Carcinogen Risk Assessment" (U.S.
 EPA, 1986b).
   Reproductive effects may result from
 developmental exposures. For example,
 environmental exposures may result in
 oocyte toxicity, in which a loss of
 primordial oocytes irreversibly affects a
 woman's fertility. The exposures of
 importance may occur during both the
 prenatal period and after birth. Oocyte
 depletion is difficult to examine directly
 in women due to the invasiveness of the
 tests required; however, it can be
 studied indirectly through evaluation of
 the age at reproductive senescence
 (menopause) (Everson et al., 1986). Risk
 assessment methods for female
 reproductive effects are described in the
 "Proposed Guidelines for Assessing
 Female Reproductive Risk" (U.S. EPA,
 1988c).
   Developmental exposures to males
 could affect their reproductive function
 (e.g., deplete stem or Sertoli cells
 potentially affecting sperm production)
 (Zenick and Clegg, 1989). If stem cell
 death occurs with exposure at any age,
 recovery is possible as long as some
 stem cells survive. The same is true for
 Sertoli  cells, except that they cease
 multiplication before puberty. Thus, cell
 replication cannot compensate for
 Sertoli  cell death after puberty. Human
 studies of stem and Sertoli cells would
 be difficult due to the invasiveness of
 the measure. Less direct measures, e.g.,
 sperm count, morphology, and motility,
 could be evaluated but this would not
 indicate what cells or stage  of
 spermatogenesis had been affected. Risk
 assessment methods for male
 reproductive effects are described in the
 "Proposed Guidelines for Assessing
 Male Reproductive Risk" (U.S. EPA,
 1988b).
   In addition to the above effects,
 genetic damage to germ cells may result
 from developmental exposures.
 Outcomes resulting from germ-cell
 mutations could include reduced
 probability of conception as well as
 increased probability of embryo/fetal
 loss and other developmental  effects.
 These end points could be studied using
 the approaches described above.
 However, a human germ-cell mutagen
 has not yet been demonstrated (U.S.
 EPA, 1986c). Based on animal studies,
 critical exposures are to germ cells or
 early zygotes. Germ-cell mutagenicity
 could also be expressed as genetic
 diseases in future generations.
 Unfortunately, these studies would be
 very difficult to conduct in human
 populations due to the long time lag
 between exposure and outcome. For
 more information, refer to the
 "Guidelines for Mutagenicity Risk
 Assessment" (U.S. EPA, 1986c).
   (4) Community studies/surveillance
 programs. Epidemiologic studies may
 also be based on broad populations
 such as a community, a nationwide
 probability sample, or surveillance
 programs (such as birth defects
 registries). Other studies have  examined
 environmental exposures, such as toxic
 agents in the water system, and adverse
 pregnancy outcome (Swan et al., 1989;
 Deane et al., 1989). Unfortunately, in
 these studies maternally-mediated
 effects may be difficult to distinguish
 from paternally-mediated effects. In
 addition, the presumably lower
 exposure levels (compared to industrial
 settings) may require very large groups
 for study. A number of case-referent
 studies have examined the relationship
 between broad classes of parental
 occupation in certain communities or
 countries, and embryo/fetal loss
 (Silverman et al., 1985), birth defects
 (Hemminki et al., 1980; Kwa and Fine,
 1980; Papier, 1985), and childhood
 cancer (Kwa and Fine, 1980; Zack et al.,
 1980; Hemminki et al., 1981; Peters et al.,
 1981). In these reports, jobs are typically
 classified into broad categories based
 on the probability of exposure to certain
 classes or levels of exposure (e.g., Kwa
 and Fine, 1980). Such studies are most
 helpful in the identification of topics for
 additional study. However, because of
 the broad groupings of types or levels of
 exposure, such studies are not typically
 useful for risk assessment of a particular
 agent.
  Surveillance programs may also exist
 in occupational settings. In this case,
 reproductive histories and/or clinical
 evaluations could be followed to
 monitor for reproductive effects of
 exposures. Both could yield very useful
 data for risk assessment; however, a
 clinical evaluation program would be
 costly to maintain, and there are
 numerous impediments to the collection
 of reliable  and valid information in the
 workplace. These might include similar
 concerns to those previously discussed
 plus potentially low participation rates
 due to employee sensitivities and
 confidentiality concerns.
  (5) Identification of exposures
 important for developmental effects. For
 all examinations of the relationship
between developmental effects and
potentially toxic exposures, the
identification of the appropriate
exposure is crucial. Preconceptional
 exposures to either parent and in utero
exposures have been associated with
 the more commonly examined outcomes
 (e.g., fetal loss, malformations, birth
weight, and measures of infertility).
These exposures, plus postnatal
exposure from breast milk, food, and the
general environment, may be associated
with postnatal developmental effects
(e.g., changes in behavioral and
cognitive function, or growth). The
magnitude  of exposure may affect the
spectrum of outcomes observed. This
issue is discussed in more detail in
sections III.A.1.b and III.B.
   Infants and young children may
 receive disproportionate levels of
 exposure due to their tendency to "put
 everything" in their mouths (pica) and
 the greater time they spend on the floor.
 Carpets may serve as a reservoir for
 toxic agents (e.g., pesticides and lead
 dust), and the air nearer the floor may
 have greater levels of certain airborne
 toxicants (e.g., mercury from latex
 paints).
   Exposures in environmental settings
 are frequently lower than in industrial
 and agricultural settings. However, this
 relationship may change as exposures
 are reduced in workplaces, and as more
 is learned about environmental
 exposures (e.g., indoor air exposures,
 pesticide usage). Larger populations are
 necessary in settings with lower
 exposures (Lemasters and Selevan,
 1984). Other factors affect the
 identification of reproductive or
 developmental events with various
 levels of exposure. Exposed individuals
 may move in and out of areas with
 differing levels and types of exposures,
 affecting the number of exposed and
 comparison events for study. Thus,
 exposures can be short-term or chronic.
   Data on exposure from human studies
 are frequently qualitative, such as
 employment or residence histories. More
 quantitative data may be difficult to
 obtain due to the nature of certain study
 designs (e.g., retrospective studies) and
 historical limitations in exposure
 measurements. Many developmental
 outcomes result from exposures during
 certain critical times. The appropriate
 exposure classification depends on the
 outcome(s) studied, the biologic
 mechanism affected by exposure, and
 the biologic half-life of the agent. The
 biologic half-life, in combination with
 the patterns of exposure (e.g.,
 continuous or intermittent), affects the
 individual's body burden and
 consequently the "true" dose during the
 critical period. The probability of
 misclassification of exposure status may
 affect the ability to recognize a true
 effect in a study  (Selevan, 1981; Hogue,
 1984; Lemasters and Selevan, 1984;
 Sever and Hessol, 1984; Kimmel et al.,
 1986). As more prospective studies are
 done, better estimates of exposure will
 be developed.
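
   The interaction of biologic half-life and exposure pattern noted above
 can be sketched with a simple one-compartment model in which the body
 burden declines by first-order elimination between daily doses. The
 model, the unit doses, the 5-days-on/2-days-off pattern, and the
 half-lives below are illustrative assumptions only; actual agents may
 require more detailed pharmacokinetic models.

```python
from math import exp, log


def body_burden(daily_doses, half_life_days):
    """Body burden after each day's dose under first-order elimination
    (hypothetical one-compartment sketch)."""
    elimination_rate = log(2) / half_life_days
    burden = 0.0
    history = []
    for dose in daily_doses:
        # One day of elimination, then the new dose is added.
        burden = burden * exp(-elimination_rate) + dose
        history.append(burden)
    return history


# Four weeks of intermittent exposure: 5 days on, 2 days off.
intermittent = ([1.0] * 5 + [0.0] * 2) * 4

short_half_life = body_burden(intermittent, half_life_days=1)
long_half_life = body_burden(intermittent, half_life_days=30)

print(f"final burden, 1-day half-life:  {short_half_life[-1]:.2f}")
print(f"final burden, 30-day half-life: {long_half_life[-1]:.2f}")
```

   With the same external exposure history, the agent with the longer
 half-life accumulates, so the burden present during a later critical
 period can be much larger than the exposure on any single day would
 suggest.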
  b. Examination of Clusters or Case
Reports/Series. The identification of
 cases or clusters of adverse pregnancy
 outcomes is generally limited to those
 identified by the women involved, or
 clinically by their physicians. Examples
of outcomes more easily identified
include mid to late fetal loss or
congenital malformations. Identification
of other effects, such as very early
 embryonic loss may be difficult to
 separate from the study of sub- or
 infertility. Such "nonevents" (e.g., lack
 of pregnancies or children) are much
 harder to recognize than are
 developmental effects such as
 malformations resulting from in utero
 exposure. While case reports have been
 important in the recognition of some
 agents that cause developmental
 toxicity, they may be of greatest use in
 suggesting topics for further
 investigation (Hogue, 1985). Reports of
 clusters and case reports/series are best
 used in risk assessment in conjunction
 with strong laboratory data to suggest
 that effects observed in animals also
 occur in humans. Previous discussion of
 the use of human data should be taken
 into account wherever possible.

 3. Other Considerations
  Several other types of information
 may be considered in the evaluation and
 interpretation of human and animal
 data. Information on pharmacokinetics
 and structure-activity relationships may
 be very useful, but is often lacking for
 developmental toxicity risk
 assessments.
   a. Pharmacokinetics. Extrapolation of
 toxicity data between species can be
 aided considerably by the availability of
 data on the pharmacokinetics of a
 particular agent in the species tested
 and, when available, in humans.
 Information on absorption, half-life,
 steady-state and/or peak plasma
 concentrations, placental metabolism
 and transfer, excretion in breast milk,
 comparative metabolism, and
 concentrations of the parent compound
 and metabolites may be useful in
 predicting risk for developmental
 toxicity. Such data may also be helpful
 in defining the dose-response curve,
 developing a more accurate comparison
 of species sensitivity (Wilson et al.,
 1975, 1977), determining dosimetry at
 target sites, and comparing
 pharmacokinetic profiles for various
 dosing regimens or routes of exposure.
 Pharmacokinetic studies in
 developmental toxicology are most
useful if conducted in animals at the
 stage when developmental insults occur.
The correlation of pharmacokinetic
parameters and developmental toxicity
 data may be useful in determining the
 contribution of specific pharmacokinetic
parameters to the effects observed
 (Kimmel and Young, 1983).
  While human pharmacokinetic data
 are often lacking, absorption data in
laboratory animals for studies
 conducted by any relevant route of
 exposure may assist in the
interpretation of the developmental
toxicity studies in the animal models for
the purposes of risk assessment. Specific
guidance regarding both the
development and application of
pharmacokinetic data was agreed upon
by the participants at the "Workshop on
the Acceptability and Interpretation of
Dermal Developmental Toxicity
Studies" (Kimmel and Francis, 1990). It
was concluded that absorption data are
needed both when a dermal
developmental toxicity study shows no
developmental effects and when
developmental effects are seen. The
results of a dermal developmental
toxicity study showing no adverse
developmental effects and without
blood level data (as evidence of dermal
absorption)  are potentially misleading
and would be insufficient for risk
assessment, especially if interpreted as
a "negative" study. In studies where
developmental toxicity is detected,
regardless of the route of exposure,
absorption data  can be used to establish
the internal  dose in maternal animals for
risk extrapolation purposes.
  b. Comparisons of Molecular
Structure. Comparisons of the chemical
or physical properties of an agent with
those known to cause developmental
toxicity may indicate a potential for
developmental toxicity. Such
information may be helpful in setting
priorities for testing of agents or for
evaluation of potential toxicity when
only minimal data are available.
Structure-activity relationships have not
been well studied in developmental
toxicology, although data are available
that suggest structure-activity
relationships for certain classes of
chemicals (e.g., glycol ethers, steroids,
retinoids). Under certain circumstances
(e.g., in the case  of new chemicals), this
is one of several procedures used to
evaluate the potential for toxicity when
little or no data are available.

B. Dose-Response Evaluation
  The evaluation of dose-response
relationships for developmental toxicity
includes the evaluation of data from
both human and animal studies. When
quantitative dose-response data are
available in humans with a sufficient
range of exposure, dose-response
relationships may be examined. Since
data on human dose-response
relationships have been available
infrequently, the dose-response
evaluation is usually based on  the
assessment of data from tests performed
in laboratory animals.
  Evidence for a dose-response
relationship is an important criterion in
the assessment of developmental
toxicity, which is usually based on
limited data from standard studies using
three dose groups and a control group.
Most agents causing developmental
toxicity in humans alter development at
doses within a narrow range near the
lowest maternally toxic dose (Kimmel et
al., 1984). Therefore, for most agents, the
exposure situations of concern will be
those that are potentially near the
maternally toxic dose range. For those
few agents that produce developmental
effects at much lower levels than
maternal effects, the potential for
exposing the conceptus to damaging
doses is much greater than when the
maternal and developmental toxic doses
are similar. As mentioned previously
(Section III.A.1.b), however, traditional
dose-response relationships may not
always be observed for some end
points. For example, as exposure
increases, embryolethal levels may be
reached, resulting in an observed
decrease in malformations with
increasing dose (Wilson, 1973; Selevan
and LeMasters, 1987). The potential for
this response pattern indicates that
dose-response relationships of
individual end points as well as
combinations of end points (e.g., dead
and malformed combined) must be
carefully examined and interpreted.
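
  As an illustration only, the response pattern described above can be sketched in a few lines of Python. The dose levels and counts are hypothetical and are not taken from any study cited in these Guidelines; the helper names are likewise assumptions made for the example. The sketch shows malformations falling at the highest dose while the combined end point (dead plus malformed) continues to rise.

```python
# Hypothetical counts per dose group (illustrative only, not study data).
doses = [0, 10, 30, 100]        # mg/kg/day
implants = [200, 200, 200, 200]
dead = [10, 20, 60, 150]        # resorptions plus dead fetuses
malformed = [4, 18, 35, 15]     # malformed live fetuses


def rates(counts, totals):
    """Incidence per implant for each dose group."""
    return [c / t for c, t in zip(counts, totals)]


def is_monotonic_increasing(values):
    """True when no value is lower than the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


malformed_rate = rates(malformed, implants)
combined_rate = rates([d + m for d, m in zip(dead, malformed)], implants)

for dose, m, c in zip(doses, malformed_rate, combined_rate):
    print(f"dose {dose:>3}: malformed {m:.3f}, dead or malformed {c:.3f}")

# Malformations alone decline at the top dose; the combined end point does not.
print(is_monotonic_increasing(malformed_rate))  # False
print(is_monotonic_increasing(combined_rate))   # True
```

  Expressing both end points per implant is one possible convention; other denominators (e.g., live fetuses for malformations) may be more appropriate for a given study design.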
  The evaluation of dose-response
relationships includes the identification
of effective dose levels as well as doses
that are associated with no increased
incidence of adverse effects when
compared with controls. Much of the
focus is on the identification of the
critical effect(s) (i.e., the adverse
effect(s) observed at the lowest dose
level) and the LOAEL and NOAEL
associated with that developmental
effect, which may be any of the four
manifestations of developmental
toxicity. The NOAEL is defined as the
highest dose at which there is no
statistically or biologically significant
increase in the frequency of an adverse
effect in any of the possible
manifestations of developmental
toxicity when compared with the
appropriate control group in a data base
characterized as having sufficient
evidence for use in a risk assessment
(see Section III.C). The LOAEL is the
lowest dose at which there is a
statistically or biologically significant
increase in the frequency of adverse
developmental effects when compared
with the appropriate control group in a
data base characterized as having
sufficient evidence. Although a
threshold is assumed for developmental
effects, the existence of a NOAEL in an
animal study does not prove or disprove
the existence or level of a biological
threshold; it only defines the highest
level of exposure under the conditions of
the study that is not associated with a
significant increase in adverse effects.
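
  The definitions above can be sketched as a small Python routine. This is a minimal illustration under stated assumptions: the `DoseGroup` type and `noael_loael` function are hypothetical names, the judgment of whether an increase is statistically or biologically significant is supplied as a ready-made flag, and the NOAEL is taken as the highest non-adverse dose below the LOAEL, which is a common convention rather than a rule stated in these Guidelines.

```python
from typing import List, NamedTuple, Optional, Tuple


class DoseGroup(NamedTuple):
    dose: float     # e.g., mg/kg/day
    adverse: bool   # significant increase in an adverse effect vs. control


def noael_loael(groups: List[DoseGroup]) -> Tuple[Optional[float], Optional[float]]:
    """Return (NOAEL, LOAEL); either may be None if the study does not define it."""
    ordered = sorted(groups, key=lambda g: g.dose)
    # LOAEL: lowest dose showing a significant adverse increase.
    loael = next((g.dose for g in ordered if g.adverse), None)
    # NOAEL: highest dose without such an increase, restricted to doses below the LOAEL.
    below = [g.dose for g in ordered
             if not g.adverse and (loael is None or g.dose < loael)]
    noael = max(below) if below else None
    return noael, loael


groups = [DoseGroup(10.0, False), DoseGroup(30.0, False), DoseGroup(100.0, True)]
print(noael_loael(groups))  # (30.0, 100.0)
```

  When every treated group shows an effect, the routine returns no NOAEL, which corresponds to the situation in which only a LOAEL is available.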
  Several limitations in the use of the
NOAEL have been described (Gaylor,
1983; Crump, 1984; Kimmel and Gaylor,
1988; Gaylor, 1989; Brown and Erdreich,
1989; Kimmel, 1990): (1) Use of the
NOAEL ... dose chosen for the NOAEL;
(5) since the NOAEL is defined as a dose
that does not produce an observed increase
in adverse responses from control levels
and is dependent on the power of the
study, theoretically, the risk associated
with it may fall anywhere between zero
and an incidence just below that
detectable from control levels (usually
in the range of 7% to 10% for quantal
data). Crump (1984) and Gaylor (1989)
have estimated the upper confidence
limit on risk at the NOAEL to be 2% to
6% for specific developmental end
points from several data sets.
  Because of the limitations associated
with the use of the NOAEL (Kimmel and
Gaylor, 1988; Gaylor, 1989; Kimmel,
1990), the Agency is evaluating the use
of an additional approach for more
quantitative dose-response evaluation
when sufficient data are available, i.e.,
the benchmark dose (Crump, 1984). The
benchmark dose is based on a
model-derived estimate of a particular
incidence level, such as 10% incidence.
More specifically, the benchmark dose
(BD) is derived by modeling the data in
the observed range, selecting an
incidence level within or near the
observed range (e.g., the effective dose
to produce a 10% increased incidence of
response, the ED10), and determining the
upper confidence limit on the model.
The upper confidence value
corresponding to, for example, a 10%
excess in response is used to derive the
BD, which is the lower confidence limit
on dose for that level of excess
response, in this case, the LED10 (see
Figure 1).
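
  The steps just described (fit a model in the observed range, select an incidence level, and take a confidence limit) can be sketched as follows. This is an illustrative sketch only, not the Agency's procedure. It assumes a one-stage (quantal-linear) model, P(d) = 1 - exp(-(a + b*d)), defines the 10% level as extra risk over background, obtains the limit by profile likelihood (one-sided 95%), and treats fetuses as independent, ignoring the intralitter correlations that a real analysis would need to address. The data and all function names are hypothetical.

```python
import math


def log_likelihood(a, b, doses, n, k):
    """Binomial log-likelihood under P(d) = 1 - exp(-(a + b*d))."""
    total = 0.0
    for d, n_i, k_i in zip(doses, n, k):
        p = 1.0 - math.exp(-(a + b * d))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep the logarithms finite
        total += k_i * math.log(p) + (n_i - k_i) * math.log(1.0 - p)
    return total


def argmax_concave(f, lo, hi, iters=60):
    """Ternary search for the maximizer of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0


def profile(b, doses, n, k):
    """Log-likelihood maximized over the background term a for a fixed slope b."""
    a_hat = argmax_concave(lambda a: log_likelihood(a, b, doses, n, k), 0.0, 5.0)
    return log_likelihood(a_hat, b, doses, n, k)


def benchmark_dose(doses, n, k, bmr=0.10, drop=1.3528):
    """Return (ED, LED) at extra risk bmr; drop is half the 0.90 chi-square quantile (1 df)."""
    b_max = 10.0 / max(doses)
    b_hat = argmax_concave(lambda b: profile(b, doses, n, k), 0.0, b_max)
    target = profile(b_hat, doses, n, k) - drop
    lo, hi = b_hat, b_max  # the profile is above target at lo and below it at hi
    for _ in range(60):    # bisection for the upper confidence limit on the slope
        mid = (lo + hi) / 2.0
        if profile(mid, doses, n, k) >= target:
            lo = mid
        else:
            hi = mid
    scale = -math.log(1.0 - bmr)  # solves 1 - exp(-b*d) = bmr for d
    return scale / b_hat, scale / lo


# Hypothetical study: dose (mg/kg/day), fetuses examined, fetuses affected.
ed10, led10 = benchmark_dose([0, 10, 30, 100], [100, 100, 100, 100], [5, 8, 15, 40])
print(f"ED10 = {ed10:.1f}, LED10 = {led10:.1f}")
```

  Because the upper confidence limit on the slope exceeds the fitted slope, the LED10 falls below the ED10, which matches the relationship between the two quantities described in the text. The sketch presumes a positive fitted slope; data with no dose-related increase would need separate handling.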
  Various mathematical approaches
have been proposed for deriving the
benchmark dose for developmental
toxicity data (e.g., Crump, 1984; Rai and
Van Ryzin, 1985; Kimmel and Gaylor,
1988; Faustman et al., 1988; Chen and
Kodell, 1989; Kodell et al., 1991). Such
models may be used to calculate the
benchmark dose, and the particular
model used may be less critical since
estimation of the benchmark dose is
limited to the observed dose range.
Since the model is only used to fit the
observed data, the assumptions about
the existence or nonexistence of a
threshold are not as pertinent. Thus,
models that fit the empirical data well
may provide a reasonable estimate of
the benchmark dose, although biological
factors known to influence data should
be incorporated into the model (e.g.,
intralitter correlations, correlations
among end points (Ryan et al., 1991)).
The Agency is currently conducting
studies to evaluate the application of
several models to actual data sets for
calculating the benchmark dose, to
determine the minimum data required
for modeling, and to develop methods
for application to continuous data. In
addition, information from these studies
will be used to develop guidance for
application of the benchmark dose
approach to the calculation of the RfDDT
or the RfCDT, since the Agency has
limited experience with this approach
(see Section III.D for a discussion of the
RfDDT and RfCDT).
  Using the benchmark dose approach,
an LED10 can be calculated for each
effect of an agent for which there is a
data base with sufficient evidence to
conduct a risk assessment. In some
cases, the data may be sufficient to also
estimate the ED05 or ED01, which should
be closer to a true no effect dose. A
level between the ED01 and the ED10
usually corresponds to the lowest level
of risk that can be estimated for
binomial end points from standard
developmental toxicity studies.
  Certain principles are especially
applicable for determining the NOAEL,
LOAEL, and benchmark dose for
developmental toxicity studies. First, the
NOAEL, LOAEL, or benchmark dose is
identified for both developmental and
maternal or adult toxicity, based on the
information available from studies in
which developmental toxicity has been
evaluated. The NOAEL, LOAEL, or
benchmark dose for maternal or adult
toxicity should be compared with the
corresponding values from other adult
toxicity data to determine if the
pregnant or lactating female or the
paternal animal (if exposure is prior to
mating) may be more sensitive to an
agent than adult males or nonpregnant
females in other toxicity studies that
generally involve longer exposure times.
  Second, for developmental toxic
effects, a primary assumption is that a
single exposure at a critical time in
development may produce an adverse
developmental effect, i.e., repeated
exposure is not a necessary prerequisite
for developmental toxicity to be
manifested. In most cases, however, the
data available for developmental
toxicity risk assessment are from studies
using exposures over several days of
development, and the NOAEL, LOAEL,
and/or benchmark dose is most often
based on a daily dose, e.g., mg/kg/day.
Usually, the daily dose is not adjusted
for duration of exposure because
appropriate pharmacokinetic data are
not available. In cases where such data
are available, adjustments may be made
to provide an estimate of equal average
concentration at the site of action for the
human exposure scenario of concern.
For example, inhalation studies often
use 6 hr/day exposures during
development. If the human exposure
scenario is continuous and
pharmacokinetic data indicate an
accumulation with continuous exposure,
appropriate adjustments can be made.
If, on the other hand, the human
exposure scenario of concern is very
brief or intermittent, pharmacokinetic
data indicating a long half-life may also
require adjustment of dose. When
quantitative absorption data by any
route of exposure are available, the
NOAEL may be adjusted accordingly;
e.g., absorption of 50% of administered
dose could result in a 50% reduction in
the NOAEL. If absorption in the
experimental species has been
determined, but human absorption is not
known, human absorption is generally
assumed to be the same as that for the
species with the greatest degree of
absorption. NOAELs from inhalation
exposure studies are adjusted to derive
a human equivalent concentration
(HEC) by taking into account known
anatomical and physiological species
differences (e.g., minute volume,
respiratory rate, etc.) (U.S. EPA, 1991b).
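
  The absorption adjustment described above amounts to simple arithmetic, sketched here in Python. The function names and the numerical values are hypothetical, and the sketch covers only the absorption example and the default assumption for unknown human absorption; it does not attempt the duration or human equivalent concentration adjustments, which depend on pharmacokinetic and physiological data.

```python
def absorption_adjusted_noael(noael, absorbed_fraction):
    """Scale an administered-dose NOAEL by the fraction actually absorbed."""
    if not 0.0 < absorbed_fraction <= 1.0:
        raise ValueError("absorbed_fraction must be in (0, 1]")
    return noael * absorbed_fraction


def assumed_human_absorption(species_fractions):
    """Default when human absorption is unknown: use the species with the greatest absorption."""
    return max(species_fractions.values())


# 50% absorption of the administered dose gives a 50% reduction in the NOAEL.
print(absorption_adjusted_noael(20.0, 0.5))                    # 10.0
print(assumed_human_absorption({"rat": 0.3, "rabbit": 0.5}))   # 0.5
```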
  In summary, the dose-response
evaluation identifies the NOAEL,
LOAEL, or benchmark dose, defines the
range of doses for a given agent that are
effective in producing developmental
and maternal toxicity, the route, timing
and duration of exposure, species
specificity of effects, and any
pharmacokinetic or other considerations
that might influence the comparison
with human exposure scenarios. This
information should always accompany
the characterization of the
health-related data base (discussed in the
next section).

 C. Characterization of the Health-
 Related Data Base

   This section describes the process for
 evaluating the health-related data base
 as a whole on a particular agent and
 provides criteria for characterizing the
 evidence for judging a potential
 developmental hazard in humans within
 the context of expected exposure or
 dose. This determination provides the
 basis for judging whether or not there
 are sufficient data for proceeding further
 in the risk assessment process. This
 section does not address the nature and
 magnitude of human health risks, which
 are discussed as part of the final
 characterization of risk along with
 estimates of potential human exposure
 and the  relevancy of available data for
 estimating human risk. Characterization
 of hazard potential within the context of
 exposure or dose should assist the risk
 assessor in clarifying the strengths and
 uncertainties associated with a
 particular data base. Because a complex
 interrelationship exists among study
 design, statistical analysis, and
 biological significance of the data, a
 great deal of scientific judgment, based
 on experience with developmental
 toxicity  data and with the principles of
 study design and statistical analysis,
 may be required to adequately evaluate
 the data base. Scientific judgment is
 always necessary, and in many cases,
 interaction with scientists in specific
 disciplines (e.g., developmental
 toxicology, epidemiology, statistics) is
 recommended.
   A categorization scheme for
 characterizing the evidence for
 developmental toxicity is presented in
 Table 3. The categorization scheme
 contains two broad categories, sufficient
 evidence and insufficient evidence,
 which are defined in the  table. Data
 from all available studies, whether
 indicative of potential hazard or not,
 must be  evaluated and factored into a
 judgment as to the strength of evidence
 available to support a complete risk
 assessment for developmental toxicity.
 The primary considerations are the
 human data, if available, and the
 experimental animal data. The judgment
 of whether the data are sufficient or
 insufficient should consider quality of
 the data, power of the studies, number
 and types of end points examined,
 replication of effects, relevance of the
 test species to humans, relevance of
 route and timing of exposure for both
 human and animal studies,
 appropriateness of the dose selection in
 animal studies, and number of species
 examined. In addition, pharmacokinetic
 data and structure-activity
 considerations, data from other toxicity
 studies, as well as other factors that
 may affect the strength of the evidence,
 should be taken into account.

 Table 3.—Categorization of the Health-
 Related Data Base for Hazard Identifica-
 tion/Dose-Response Evaluation

            Sufficient Evidence

  The  sufficient evidence category includes
 data that  collectively provide enough infor-
 mation  to judge whether or not a human
 developmental hazard could exist within the
 context of dose, duration, timing  and route of
 exposure. This category includes both human
 and experimental animal evidence.
  Sufficient Human Evidence: This category
 includes data from epidemiologic studies
 (e.g., case control and cohort) that provide
 convincing evidence for the scientific
 community to judge that a causal relationship
 is or is not supported. A case series in
 conjunction with strong supporting evidence
 may also be used. Supporting animal data may
 or may not be available.
  Sufficient Experimental Animal Evidence/
 Limited Human Data: This category includes
 data from experimental animal studies and/or
 limited human data that provide convincing
 evidence for the scientific community to
 judge if the potential for developmental
 toxicity exists. The minimum evidence
 necessary to judge that a potential hazard
 exists generally would be data demonstrating
 an adverse developmental effect in a single,
 appropriate, well-conducted study in a single
 experimental animal species. The minimum
 evidence needed to judge that a potential
 hazard does not exist would include data
 from appropriate, well-conducted laboratory
 animal studies in several species (at least
 two) which evaluated a variety of the
 potential manifestations of developmental
 toxicity, and showed no developmental effects
 at doses that were minimally toxic to the
 adult.
          Insufficient Evidence
  This category includes situations for which
 there is less than the minimum sufficient
 evidence necessary for assessing the
 potential for developmental toxicity, such as
 when no data are available on developmental
 toxicity, as well as for data bases from
 studies in animals or humans that have a
 limited study design (e.g., small numbers,
 inappropriate dose selection/exposure
 information, other uncontrolled factors), or
 data from a single species reported to have
 no adverse developmental effects, or data
 bases limited to information on
 structure/activity relationships, short-term
 tests, pharmacokinetics, or metabolic
 precursors.

  In general, the categorization is based
 on criteria that define the minimum
 evidence necessary to conduct a hazard
 identification/dose-response evaluation.
 Establishing the minimum sufficient
 human evidence necessary to do a
 hazard identification/dose-response
 evaluation is difficult, since there are
 often considerable variations in study
 designs and study group selection. The
 body of human data should contain
 convincing evidence as described in the
 "Sufficient Human Evidence" category.
 Because the human data necessary to
 judge whether or not a causal
 relationship exists are generally limited,
 there are currently few agents that can
 be classified in this category. In the case
 of animal data, agents that have been
 tested adequately in laboratory animals
 according to current test guidelines
 generally would be included in the
 "Sufficient Experimental Animal
 Evidence/Limited Human Data"
 category. The strength of evidence for a
 data base increases with replication of
 the findings and with additional animal
 species tested. Information on
 pharmacokinetics or mechanisms, or on
 more than one route of exposure may
 reduce uncertainties in extrapolation to
 the human.
                             More evidence is necessary to judge
                           that an agent is unlikely to pose a
                           hazard for developmental toxicity than
                           that required to judge a potential
                           hazard. This is because it is more
                           difficult, both biologically and
                           statistically, to support a finding of no
                           apparent adverse effect than a finding of
                           an adverse effect. For example, to judge
                           that a hazard for developmental toxicity
                           could exist for a given agent, the
                           minimum evidence necessary would be
                           data from a single, appropriate, well-
                           executed study in a single experimental
                           animal species that demonstrate
                           developmental toxicity, and/or
                           suggestive evidence from adequately
                           conducted clinical/epidemiologic
                           studies. On the other hand, to judge that
                           an agent is unlikely to pose a hazard for
                           developmental toxicity, the minimum
                           evidence would include data from
                           appropriate, well-executed laboratory
                           animal studies in several species (at
                           least two) which evaluated a variety of
                           the potential manifestations of
                           developmental toxicity and showed no
                           adverse developmental effects at doses
                           that were minimally toxic to the adult
                           animal. In addition, there may be human
                           data from appropriate studies
                           supportive of no adverse developmental
                           effects.
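
  The asymmetry in minimum evidence described above can be sketched as a simple decision routine. This is a deliberately simplified illustration and not a substitute for the scientific judgment the Guidelines call for: it considers animal studies only, reduces each study to a few hypothetical yes/no fields, and omits considerations such as the variety of end points examined, study power, and supporting human data. The `Study` type and `characterize` function are assumed names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Study:
    species: str
    adequate: bool                        # appropriate and well conducted
    developmental_effect: bool            # adverse developmental effect observed
    reached_minimal_adult_toxicity: bool  # doses were minimally toxic to the adult


def characterize(studies: List[Study]) -> str:
    """Apply the minimum-evidence rules to a set of animal studies."""
    adequate = [s for s in studies if s.adequate]
    # One adequate study showing an effect in one species is enough to judge a potential hazard.
    if any(s.developmental_effect for s in adequate):
        return "sufficient: potential hazard"
    # Judging that a hazard is unlikely needs adequate negative studies in at least two species.
    negative_species = {s.species for s in adequate if s.reached_minimal_adult_toxicity}
    if len(negative_species) >= 2:
        return "sufficient: hazard unlikely"
    return "insufficient"


print(characterize([Study("rat", True, False, True)]))  # insufficient
print(characterize([Study("rat", True, False, True),
                    Study("rabbit", True, False, True)]))  # sufficient: hazard unlikely
```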
  If a data base on a particular agent
includes less than the minimum
sufficient evidence (as defined in the
"Insufficient Evidence" category)
necessary for a risk assessment, but
some data are available, this
information could be used to determine
the need for additional testing. In the
event that a substantial data base exists
for a given chemical, but no single study
meets current test guidelines, the risk
assessor should use scientific judgment
to determine whether the composite
data base may be viewed as meeting the
"Sufficient Evidence" criteria. In some
cases, a data base may contain
conflicting data. In these instances, the
risk assessor must consider each study's
strengths and weaknesses within the
context of the overall data base in an
attempt to define the strength of
evidence of the data base for assessing
the potential for developmental toxicity.
  Judging that the health-related data
base is sufficient to indicate a potential
developmental hazard does not mean
that the agent will be  a hazard at every
exposure level (because of the
assumption of a threshold) or in every
situation (e.g., hazard may vary
significantly depending on route and
timing of exposure). In the final risk
characterization, the characterization of
the health-related data base should
always be presented with information
on the dose-response evaluation (e.g.,
LOAEL, NOAEL, and/or benchmark
dose), exposure route, timing and
duration of exposure, and with the
human exposure estimate.

D. Determination of the Reference Dose
(RfDDT) or Reference Concentration
(RfCDT) for Developmental Toxicity

  The RfDDT or RfCDT is an estimate of a
daily exposure to the  human population
that is assumed to be  without
appreciable risk of deleterious
developmental effects. The use of the
subscript DT is intended to distinguish
these terms from the reference dose
(RfD) for oral or dermal exposure or the
reference concentration (RfC) for
inhalation exposure, terms that refer
primarily to chronic exposure situations
(U.S. EPA, 1991b). The RfDDT or RfCDT is
derived by applying uncertainty factors
to the NOAEL (or the  LOAEL, if a
NOAEL is not available), or the
benchmark dose. To date, the Agency
has applied uncertainty factors only to
the NOAEL or LOAEL to derive an
RfDDT or RfCDT. The Agency is planning
eventually to use the benchmark dose
approach as the basis for derivation of
the RfDDT or RfCDT and will develop
guidance as information is acquired and
analyzed from ongoing Agency studies.
  The most sensitive developmental
effect (i.e., the critical effect) from the
most appropriate and/or sensitive
mammalian species is used for
determining the NOAEL, LOAEL, or the
benchmark dose in deriving the RfDDT or
RfCDT (Section III.B). Uncertainty factors
(UFs) for developmental and maternal
toxicity applied to the NOAEL generally
include a 10-fold factor for interspecies
variation and a 10-fold factor for
intraspecies variation. In general, an
uncertainty factor is not applied to
account for duration of exposure.
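
  The derivation just described is a division of the NOAEL by the product of the uncertainty factors, sketched below. The function name, parameter names, and example values are hypothetical; the two 10-fold defaults follow the text above, and any further factor is left as a parameter for the assessor to justify.

```python
def reference_dose(noael, interspecies_uf=10.0, intraspecies_uf=10.0, additional_uf=1.0):
    """Divide the NOAEL (or other starting dose) by the product of the uncertainty factors."""
    factors = (interspecies_uf, intraspecies_uf, additional_uf)
    if noael <= 0 or min(factors) < 1.0:
        raise ValueError("NOAEL must be positive and each factor at least 1")
    return noael / (interspecies_uf * intraspecies_uf * additional_uf)


# A hypothetical NOAEL of 50 mg/kg/day with the two 10-fold factors:
print(reference_dose(50.0))  # 0.5 (mg/kg/day)
```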
  Additional factors may be applied to
account for other uncertainties or
additional information that may exist in
the data base. For example, the
standard study design for a
developmental toxicity study calls for a
low dose that demonstrates a NOAEL,
but in some cases, the lowest dose
administered may cause significant
adverse effect(s) and, thus, be identified
as the LOAEL. In circumstances where
only a LOAEL is available, the use of an
additional uncertainty factor of up to 10
may be required, depending on the
sensitivity of the end points evaluated,
adequacy of dose levels tested, or
general confidence in the LOAEL. In
addition, if a benchmark dose has been
calculated, it may be used to help
interpret ...
-------
 to be manifested, although it should be
 considered in cases where there is
 evidence of cumulative exposure or
 where the half-life of the agent is
 sufficiently long to produce an
 increasing body burden over time).
 Therefore, it is assumed that, in most
 cases, a single exposure at any of
 several developmental stages  may be
 sufficient to produce an adverse
 developmental effect. Most of the data
 available for risk assessment involve
 exposures over several days of
 development. Thus, human exposure
 estimates used to calculate margins of
 exposure (MOE, see following section)
 or to compare with the RfDDT or RfCDT
 are usually based on a daily dose that is
 not adjusted for duration or pattern of
 exposure. For example, it would be
 inappropriate in developmental toxicity
 risk assessments to use time-weighted
 averages or adjustment of exposure over
 a different time frame than that actually
 encountered (such as the adjustment of
 a 6-hour inhalation exposure to account
 for a 24-hour exposure scenario), unless
 pharmacokinetic data were available to
 indicate an accumulation with
 continuous exposure. In the case of
 intermittent exposures, examination of
 the peak exposure(s), as well as the
 average exposure over the time period
 of exposure, would be important.
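   A minimal sketch of the point about intermittent exposures follows. It reports both the peak daily dose and the average over the period of exposure, with no adjustment to a different time frame. The function name, the record layout (one unadjusted dose per day of the exposure period), and the example values are assumptions made only for illustration.

```python
def summarize_intermittent_exposure(daily_doses):
    """Return the peak and the average daily dose over the period of exposure.

    daily_doses: one unadjusted dose per day of the exposure period
    (assumed layout); days without exposure are recorded as 0.
    """
    if not daily_doses:
        raise ValueError("at least one day of exposure data is required")
    peak = max(daily_doses)
    average = sum(daily_doses) / len(daily_doses)
    return {"peak": peak, "average": average}


# Hypothetical five-day record in mg/kg/day: exposure on days 1, 3, and 5.
summary = summarize_intermittent_exposure([0.5, 0.0, 2.0, 0.0, 0.5])
```

 Reporting the peak alongside the average keeps a single high exposure visible, which matters here because a single exposure is assumed to be potentially sufficient to produce an adverse developmental effect.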
   It should be recognized that, based on
 the definition used in these Guidelines
 for developmental toxicity, exposure of
 almost any segment of the human
 population may lead to risk to the
 developing organism. This would
 include fertile men and women, the
 developing embryo and fetus,  and
 children up to the age of sexual
 maturation. Although some effects of
 developmental exposures may be
 manifested while the exposure is
 occurring (e.g., spontaneous abortion,
 structural abnormality present at birth,
 childhood mental retardation), some
 effects may not be detectable  until later
 in life, long after exposure has ceased
 (e.g., perinatally induced carcinogenesis,
 impaired reproductive function,
 shortened lifespan).
 V. Risk Characterization
 A. Overview
   Risk characterization is the
 culmination  of the risk assessment
 process. In this final step, risk
 characterization involves integration of
 the toxicity information from the hazard
 identification/dose-response evaluation
 with the human exposure estimates and
 provides an  evaluation of the  overall
 quality of the assessment, describes risk
 in terms of the nature and extent of
 harm, and communicates the results of
the risk assessment to a risk manager.
The risk manager can then use the risk
assessment, along with other risk
management elements, to make public
health decisions. The following sections
describe these three aspects of the risk
characterization in more detail, but do
not attempt to provide a full discussion
of risk characterization. Rather, these
Guidelines point out issues that are
important to risk characterization for
developmental toxicity.

B. Integration of the Hazard
Identification/Dose-Response
Evaluation and Exposure Assessment

  In developing the hazard
identification/dose-response and
exposure portions of the risk
assessment, the risk assessor makes
many judgments concerning human
relevance of the toxicity data, including
the appropriateness of the various
animal models for which data are
available, the route, timing, and duration
of exposure relative to expected human
exposure, etc. These judgments should
be summarized at each stage of the risk
assessment process (e.g., judgments about
the biological relevance of anatomical
variations may be made in the hazard
identification process, and judgments
about species differences in metabolic
patterns in the dose-response
evaluation). When data are not
available to make such judgments, as is
often the case, the background
information and assumptions discussed
in the Introduction (Section I) provide a
default position. The risk assessor must
determine if some of these judgments
have implications for other portions of
the assessment, and whether the various
components of the assessment are
compatible.
  The description of the relevant data
should convey the major strengths and
weaknesses of the assessment that arise
from availability of data and the current
limits of understanding of the
mechanisms of toxicity. Confidence in
the results of a risk assessment is a
function of confidence in the results of
the analysis of these elements. Each of
these elements should have its own
characterization as a part of it.
Interpretation of data should be
explained, and the risk manager should
be given a clear picture of consensus or
lack of consensus that exists about
significant aspects of the assessment.
Whenever more than one view is
supported by the data and choosing
between them is difficult, both views
should be presented. If one has been
selected over another, the  rationale
should be given; if not, then both should
be presented as plausible alternative
results.
  The risk characterization should not
only examine the judgments, but also
explain the constraints of available data
and the state of knowledge about the
phenomena studied in making them,
including:
  • The qualitative conclusions about
the likelihood that the agent may pose a
specific hazard to human health, the
nature of the observed effects, under
what conditions (route, dose levels,
time, and duration) of exposure these
effects occur, and whether the health-
related data are sufficient to use in a
risk assessment;
  • A discussion of the dose-response
patterns for the critical effect(s), data
such as the shapes and slopes of the
dose-response curves for the various
end points, the rationale behind the
determination of the NOAEL, LOAEL,
and/or calculation of the benchmark
dose, and the assumptions underlying
the estimation of the RfDDT or RfCDT;
and
  • The estimates of the magnitude of
human exposure, the route, duration,
and pattern of the exposure, relevant
pharmacokinetics, and the size and
characteristics of the populations
exposed.
  The risk characterization of an agent
should be based on data from the most
appropriate species, or, if such
information is not available, on the most
sensitive species tested. It should also
be based on the most sensitive indicator
of toxicity, whether maternal, paternal,
or developmental, when such data are
available, and should be considered in
relationship to other forms of toxicity.
  If data used  in characterizing risk are
from a route of exposure other than the
expected human exposure, then
pharmacokinetic data should be used, if
available, to extrapolate across routes
of exposure. If such data are not
available, the Agency makes certain
assumptions concerning the amount of
absorption likely or the applicability of
the data from one route to another (U.S.
EPA, 198d985b).
  The level of confidence in the hazard
identification/dose-response evaluation
should be stated to the extent possible,
including determination of the
appropriate category regarding
sufficiency of the health-related data. A
comprehensive risk assessment ideally
includes information on a variety of end
points that provide insight into the full
spectrum of developmental responses. A
profile that integrates both human and
test species data and incorporates a
broad range of developmental effects
provides more confidence in a risk
assessment for a given agent.

    The ability to describe the nature of
  human exposure is important for
  prediction of specific outcomes and the
  likelihood of permanence or reversibility
  of the effect. An important part of this
  effort is a description of the nature of
  the exposed populations. For example,
  the consequences of exposure to the
  developing individual versus the adult
  can differ markedly and again can
  influence whether the effects are
  transient or permanent. Other
  considerations relative to human
  exposure might include potential
  synergistic effects, increased
  susceptibility resulting from concurrent
  exposures to other agents,
  concurrent disease, and nutritional
  status.
  C. Descriptors of Developmental
  Toxicity Risk
    There are a number of ways to
  describe risks. These include:
  1. Estimation of the Number of
  Individuals Exposed to Levels of
  Concern
    The RfDDT or RfCDT is assumed to be a
  level at or below which no significant
  risk occurs. Therefore, information on
  the populations at or below the RfDDT or
  RfCDT ("not likely to be at risk") and
  above the RfDDT or RfCDT ("may be at
  risk") may be useful information for risk
  managers.
    This method is particularly useful to a
  risk manager considering possible
  actions to ameliorate risk for a
  population. If the number of persons in
  the "at risk" category can be estimated,
  then the number of persons potentially
  removed from the "at risk" category
  after a contemplated action is taken can
  be used as an indication of the efficacy
  of that action.
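    The counting described in this subsection can be sketched as follows. Individuals whose estimated exposure exceeds the reference value are counted as "may be at risk," and the change in that count after a contemplated action serves as the indicator of efficacy. The function names, the per-person exposure lists, and the reference value used in the example are hypothetical.

```python
def count_may_be_at_risk(exposures, reference_value):
    """Count individuals whose exposure is above the reference value.

    Exposures at or below the reference value fall in the
    "not likely to be at risk" category and are not counted.
    """
    return sum(1 for exposure in exposures if exposure > reference_value)


def persons_removed_from_at_risk(before, after, reference_value):
    """Change in the "may be at risk" count after a contemplated action."""
    return (count_may_be_at_risk(before, reference_value)
            - count_may_be_at_risk(after, reference_value))


# Hypothetical exposures (mg/kg/day) for four persons, before and after an action.
before_action = [0.2, 0.6, 0.9, 1.4]
after_action = [0.1, 0.3, 0.45, 0.7]
removed = persons_removed_from_at_risk(before_action, after_action, 0.5)
```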
  2. Presenting Specific Scenarios
    Presenting specific scenarios in the
  form of "what if?" questions is
  particularly useful to give perspective to
  the risk manager, especially where
  criteria, tolerance limits, or media
  quality limits are being set. The question
  being asked in these cases is, "At this
  proposed limit, what would be the
  resulting risk for developmental toxicity
  above
  3. Risk Characterization for Highly
  Exposed Individuals
    This measure and the next are
  examples of specific scenarios. The
  purpose of this measure is to describe
  the upper end of the exposure
  distribution. This allows risk managers
  to evaluate whether certain individuals
  are at disproportionately high or
  unacceptably high risk.
    The objective of looking at the upper
  end of the exposure distribution is to
  derive a realistic estimate of a relatively
  highly exposed individual(s), for
  example by identifying a specified upper
  percentile of exposure in the population
  and/or by estimating the exposure of the
  most highly exposed individual(s).
  Whenever possible, it is important to
  express the number of individuals who
  comprise the highly exposed group and
  discuss the potential for exposure at still
  higher levels.
    If population data are absent, it will
  often be possible to describe a scenario
  representing high end exposures using
  upper percentile or judgment-based
  values for exposure variables. In these
  instances, caution should be taken not
  to overestimate the high end values if a
  "reasonable" exposure estimate is to be
  achieved.
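
    One way to describe the upper end of the exposure distribution is sketched below. It assumes that individual exposure estimates are available and uses a nearest-rank percentile; the choice of the 95th percentile, the function name, and the example data are illustrative assumptions and are not prescribed by these Guidelines.

```python
def high_end_exposure(exposures, percentile=95):
    """Describe the upper end of an exposure distribution.

    Uses the nearest-rank percentile with an integer percentile (1-100).
    Returns the percentile cutoff, the number of individuals at or above
    it, and the maximum observed exposure.
    """
    if not exposures:
        raise ValueError("exposures must not be empty")
    if not isinstance(percentile, int) or not 0 < percentile <= 100:
        raise ValueError("percentile must be an integer from 1 to 100")
    ordered = sorted(exposures)
    # Integer ceiling division avoids floating-point rounding in the rank.
    rank = -(-percentile * len(ordered) // 100)
    cutoff = ordered[rank - 1]
    n_highly_exposed = sum(1 for value in ordered if value >= cutoff)
    return {
        "cutoff": cutoff,
        "n_highly_exposed": n_highly_exposed,
        "maximum": ordered[-1],
    }


# Hypothetical population of 20 individuals with exposures 1 through 20.
result = high_end_exposure(list(range(1, 21)), percentile=95)
```

  Returning the size of the highly exposed group and the maximum alongside the cutoff mirrors the recommendation to express the number of individuals in that group and to discuss the potential for exposure at still higher levels.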

  4. Risk Characterization for Highly
  Sensitive or Susceptible Individuals
    The purpose of this measure is to
  quantify exposure of identified
  populations that are sensitive or
  susceptible to the effect of concern.
  Sensitive or susceptible
  individuals are those within the exposed
  population at increased risk of
  expressing the adverse effect. All stages
  of development might be considered
  highly sensitive or susceptible, but
  certain subpopulations can sometimes
  be identified because of critical periods
  for exposure; for example, pregnant or
  lactating women, infants, children,
  adolescents.
    In general, not enough is understood
  about the mechanisms of toxicity to
  identify sensitive subgroups for all
  agents, although factors such as
  nutrition, personal habits (e.g., smoking,
  alcohol consumption, illicit drug abuse),
  or pre-existing disease (e.g., diabetes)
  may predispose some individuals to be
  more sensitive to the developmental
  effects of various agents.
  5. Other Risk Descriptors
    In risk characterization, dose-
  response information and the human
  exposure estimates may be combined
  either by comparing the RfDDT or RfCDT
  and the human exposure estimate or by
  calculating the margin of exposure
  (MOE). The MOE is the ratio of the
  NOAEL from the most appropriate or
  sensitive species to the estimated
  human exposure level from all potential
  sources (U.S. EPA, 1985b). If a NOAEL is
  not available, a LOAEL may be used in
  the calculation of the MOE, but
  considerations for the acceptability
  would be different than when a NOAEL
  is used. Considerations for the
  acceptability of the MOE are similar to
  those for the uncertainty factor applied to
the LOAEL, NOAEL, or the benchmark
dose. The MOE is presented along with
the characterization of the data base,
including the strengths and weaknesses
of the toxicity and exposure  data, the
number of species affected, and the
dose-response, route, timing, and
duration information. The RfDDT or
RfCDT comparison with the human
exposure estimate and the calculation of
the MOE are conceptually similar but
are used in different regulatory
situations. If the MOE is equal to or
more than the uncertainty factor used as
a basis for an RfDDT or RfCDT, then the
need for regulatory concern is likely to
be reduced.
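  The MOE calculation and its comparison with the uncertainty factor can be sketched as follows. The function names and the example NOAEL, exposure level, and uncertainty factor are hypothetical; only the ratio itself and the "equal to or more than" comparison are taken from the text above.

```python
def margin_of_exposure(noael, human_exposure):
    """MOE: ratio of the NOAEL to the estimated human exposure level.

    Both values must be in the same units (e.g., mg/kg/day).
    """
    if noael <= 0 or human_exposure <= 0:
        raise ValueError("NOAEL and human exposure must be positive")
    return noael / human_exposure


def moe_meets_uncertainty_factor(moe, total_uncertainty_factor):
    """True when the MOE is equal to or more than the uncertainty factor."""
    return moe >= total_uncertainty_factor


# Hypothetical NOAEL of 50 mg/kg/day and exposure of 0.25 mg/kg/day.
moe = margin_of_exposure(50.0, 0.25)
# Compare with a hypothetical total uncertainty factor of 100.
concern_likely_reduced = moe_meets_uncertainty_factor(moe, 100.0)
```

A result of `True` indicates only that the need for regulatory concern is likely to be reduced; it is not itself an estimate of risk.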
  The choice of approach is dependent
upon several factors, including the
statute involved, the  situation being
addressed, the data base used, and the
needs of the decision maker. While
these methods of describing risk do not
actually estimate risks per se, they give
the risk manager some sense of how
close the exposures are to levels of
concern. The RfDDT, RfCDT, and/or the
MOE are considered along with other
risk assessment and risk management
issues in making risk management
decisions, and the scientific issues that
must be taken into account in
establishing them have been addressed
here.
D. Communicating Results

  Once the risk characterization is
completed, the focus  turns to
communicating results to the risk
manager. The risk manager uses the
results of the risk characterization, other
technologic factors, and
nontechnological social and economic
considerations in reaching a  regulatory
decision. Because of the way in which
these risk management factors may
impact different cases, consistent but
not necessarily identical risk
management decisions must  be made on
a case-by-case basis. Consequently, it is
entirely possible and appropriate that an
agent with a specific risk
characterization may be regulated
differently under different statutes.
These Guidelines are not intended to
give guidance on the nonscientific
aspects of risk management decisions.
VI. Summary and Research Needs

  These Guidelines summarize the
procedures that the U.S. Environmental
Protection Agency uses in evaluating the
potential for agents to cause
developmental toxicity. While these are
the first amendments  to the
developmental toxicity guidelines issued
in 1986, further revisions and updates
will be made as advances occur in the
 field. These Guidelines discuss the
 assumptions that should be made in risk
 assessment for developmental toxicity
 because of gaps in our knowledge about
 underlying biological processes and how
 these compare across species.
   Research to improve  the risk
 assessment process is needed in a
 number of areas. For example, research
 is needed to delineate the mechanisms
 of developmental toxicity and
 pathogenesis, provide comparative
 pharmacokinetic data, examine the
 validity of short-term in vivo and in
 vitro tests, elucidate possible functional
 alterations and their critical periods of
 exposure to toxic agents, develop
 improved animal models to examine the
 developmental effects of exposure
 during the premating and early
 postmating periods and in neonates,
 further evaluate  the relationship
 between maternal and developmental
 toxicity, provide insight into the concept
 of threshold, develop approaches for
 improved mathematical modeling of
 adverse developmental effects, and
 improve animal models for examining
 the effects of agents given by various
 routes of exposure. Epidemiologic
 studies with quantitative measures of
 exposure are also strongly encouraged.
 Such research will aid in the evaluation
 and interpretation of data on
 developmental toxicity, and should
 provide methods to more precisely
 assess risk.

 VII. References

 Adams, J. (1986) Clinical relevance of
    experimental behavioral teratology.
    Neurotoxicology 7:19-34.
 Anderson, L.M.; Donovan,  P.J.; Rice, J.M.
    (1985) Risk assessment for transplacental
    carcinogens. In: Li, A.P., ed. New
    approaches in  toxicity testing and their
    application in human risk assessment.
    New York, NY: Raven Press, pp. 179-202.
Axelson, O. (1985) Epidemiologic methods in
    the study of spontaneous abortions:
    source of data, methods, and sources of
    error. In: Hemminki, K.; Sorsa, M.;
    Vainio, H., eds. Occupational hazards
    and reproduction. Washington, DC:
    Hemisphere Pub., pp. 231-236.
Baird, D.D.; Wilcox, A.J.; Weinberg, C.R.
    (1986) Use of time to pregnancy to study
    environmental exposures. Am. J.
    Epidemiol. 124:470-480.
Bellinger, D.; Leviton, A.; Waternaux, C.;
    Needleman, H.; Rabinowitz, M. (1987)
    Longitudinal analyses of prenatal and
    postnatal lead exposure and early
    cognitive development. N. Engl. J. Med.
    316:1037-1043.
  Bloom, A.D. (1981) Guidelines for
      reproductive studies in exposed human
      populations. Report of Panel II. In:
      Guidelines for studies of human
      populations exposed to mutagenic and
      reproductive hazards. White Plains, NY:
      March of Dimes Birth Defects
      Foundation, pp. 37-110.
  Brown, J.M. (1984) Validation of an in vivo
      screen for the determination of embryo/
      fetal toxicity in mice. Prepared by SRI
      International for the U.S. EPA,
      Washington,  DC, under EPA contract no.
      68-01-5079.
  Brown, N.A. (1987) Teratogenicity testing in
      vitro: status of validation studies. Arch.
      Toxicol. Suppl. 11:105-114.
  Brown, K.G.; Erdreich, L.S. (1989) Statistical
      uncertainty in the no-observed-adverse-
      effect level. Fundam. Appl. Toxicol.
      13:235-244.
  Brown, N.A.; Fabro, S.E. (1982) The in vitro
      approach to teratogenicity testing. In:
      Snell, K., ed. Developmental toxicology.
      London, England: Croom-Helm, pp. 31-57.
  Brown, N.A.; Freeman, S.J. (1984) Alternative
      tests for teratogenicity. Alternatives Lab.
      Anim. 12:7-23.
  Buelke-Sam, J.; Kimmel, C.A.; Adams, J., eds.
      (1985) Design considerations in screening
      for behavioral teratogens: results of the
      Collaborative Behavioral Teratology Study.
      Neurobehav. Toxicol. Teratol.
      7(6):537-789.
  Butcher, R.E.; Wootten, V.; Vorhees, C.V.
     (1980) Standards in behavioral teratology
      testing: test variability and sensitivity.
     Teratogenesis Carcinog. Mutagen. 1:49-
     61.
 Centers for Disease Control. (1988a) Trends
     in years of potential life lost due to infant
      mortality and perinatal conditions, 1980-
      1983 and 1984-1985. Morbidity and
      Mortality Weekly Report 37:249-256.
 Centers for Disease Control. (1988b)
     Premature mortality due to congenital
     anomalies—United States. Morbidity and
     Mortality Weekly Report 37:505-506.
 Chen, J.J.; Kodell, R.L. (1989) Quantitative risk
     assessment for teratological effects. J.
      Amer. Statistical Assoc. 84:966-971.
 Chernoff, N.; Kavlock, R.J. (1982) An in vivo
     teratology screen utilizing pregnant mice.
     J. Toxicol. Environ. Health 10:541-550.
 Couture, L.A. (1990) 2,3,7,8-
     Tetrachlorodibenzo-p-dioxin-induced
     hydronephrosis: characterization of the
     peak period of sensitivity for placentally-
     and lactationally induced renal lesions,
     and assessment of persistence
     [dissertation]. Chapel Hill, NC:
     University of North Carolina. Available
     from: University of Michigan,
      Dissertation Library, Ann Arbor, MI.
 Crump, K.S. (1984) A new method for
     determining allowable daily intakes.
     Fundam. Appl. Toxicol. 4:854-871.
 Daston, G.P.; Rehnberg, B.F.; Carver, B.A.;
      Kavlock, R.J. (1988) Functional teratogens
      of the rat kidney. II. Nitrofen and
      ethylenethiourea. Fundam. Appl. Toxicol.
      11:401-415.
 Davis, J.M.; Otto, D.A.; Weil, D.E.; Grant, L.D.
      (1990) The comparative developmental
      neurotoxicity of lead in humans and
      animals. Neurotoxicol. Teratol. 12:215-
      229.
 Deane, M.; Swan,  S.H.; Harris, J.A.; Epstein,
     D.M.; Neutra, R.R. (1989) Adverse
     pregnancy outcomes in relation to water
     contamination, Santa Clara County, CA,
     1980-1981. Am. J. Epidemiol. 129:894-904.
 Dobbins, J.G.; Eifler, C.W.; Buffler, P.A. (1978)
     The use of parity survivorship analysis in
     the study of reproductive outcomes.
      Presented at the Society for
     Epidemiologic Research Conference;
     June; Seattle, WA.
 Elsner, J.; Suter, K.E.; Ulbrich, B.; Schreiner,
      G. (1986) Testing strategies in behavioral
      teratology: IV. Review and general
      conclusions. Neurobehav. Toxicol.
      Teratol. 8:585-590.
 Epidemiology Workgroup of the Interagency
      Regulatory Liaison Group (1981)
      Guidelines for documentation of
      epidemiologic studies. Am. J. Epidemiol.
      114(5):609-613.
 Everson, R.B.; Sandler, D.P.; Wilcox, A.J.;
     Schreinemachers, D.; Shore, D.L.;
     Weinberg, C. (1986) Effect of passive
     exposure to smoking on age at natural
     menopause. Br. Med. J. 293(6550):792.
 Fabro, S.; Shull, G.; Brown, N.A. (1982) The
     relative teratogenic index and
     teratogenic potency: proposed
     components of developmental toxicity
     risk assessment. Teratogenesis Carcinog.
     Mutagen. 2:61-76.
 Faustman, E.M. (1988) Short-term tests for
     teratogens. Mutat. Res. 205:355-384.
 Faustman, E.M.; Wellington, D.G.; Smith,
    W.P.; Kimmel, C.A. (1989)
    Characterization of a developmental
     toxicity dose-response model. Environ.
     Health Perspect. 79:229-241.
 Food and Drug Administration. (1966)
     Guidelines for reproduction studies
    for safety evaluation of drugs for human
    use. Bureau of Drugs, Rockville, MD.
 Food and Drug Administration. (1970)
    Advisory Committee on Protocols for
    Safety Evaluations. Panel on
    reproduction report on reproduction
    studies in the safety evaluation of food
    additives and pesticide residues. Toxicol.
     Appl. Pharmacol. 16:264-296.
 Food and Drug Administration (1987) Report
    of the in vitro teratology task force.
    Environ. Health Perspect. 72:201-249.
 Francis, E.Z.; Farland, W.H. (1987)
    Application of the preliminary
     developmental toxicity screen for
     chemical hazard identification under the
     Toxic Substances Control Act.
     Teratogenesis Carcinog. Mutagen. 7:107-
     117.
Fujii, T.; Adams, P.M. (1987) Functional
    teratogenesis: functional effects on the
    offspring after parental drug exposure.
    Tokyo, Japan: Teikyo University Press.
Gaffey, W.R. (1976) A critique of the standard
    mortality ratio. J. Occup. Med. 18:157-
    160.
Gaylor, D.W. (1983) The use of safety factors
    for controlling risk. J. Toxicol. Environ.
    Health 11:329-336.

Gaylor, D.W. (1989) Quantitative risk
    analysis for quantal reproductive and
    [...]
[...] (1971) Adenocarcinoma of the vagina:
    association of maternal stilbestrol
    therapy with appearance in young
    women. N. Engl. J. Med. 284:878.
Hertig, A.T. (1967) The overall problem in
    man. In: Benirschke, K., ed. Comparative
    aspects of reproductive failure. New
    York, NY: Springer-Verlag, pp. 11-41.
Hogue, C.J.R. (1984) Reducing
    misclassification errors through
    questionnaire design. In: Lockey, J.E.;
    Lemasters, G.K.; Keye, W.R., eds.
    Reproduction: the new frontier in
    occupational and environmental health
    research. New York, NY: Alan R. Liss,
    Inc., pp. 81-97.
Hogue, C.J.R. (1985) Developmental risks.
    Presented at: symposium on
    epidemiology and health risk
    assessment; May 14; Columbia, MD.
Joffe, M. (1985) Biases in research on
    reproduction and women's work. Int. J.
    Epidemiol. 14(1):118-23.
Johnson, E.M. (1981) Screening for teratogenic
    hazards: nature of the problem. Annu.
    Rev. Pharmacol. Toxicol. 21:417-429.
Johnson, E.M.; Gabel, B.E.G. (1983) An
    artificial embryo for detection of
    abnormal developmental biology.
    Fundam. Appl. Toxicol. 3:243-249.
Kavlock, R.J.; Grabowski, C.T., eds. (1983)
    Abnormal functional development of the
    heart, lungs, and kidneys: approaches to
    functional teratology. Prog. Clin. Biol.
    Res., vol. 140. New York, NY: Alan R.
    Liss, Inc.
Kavlock, R.J.; Rehnberg, B.F.; Rogers, E.H.
    (1986) Congenital renal hypoplasia:
    effects on basal renal function in the
    developing rat. Toxicology 40:247-258.
Kavlock, R.J.; Rehnberg, B.F.; Rogers, E.H.
    (1987a) The fate of adriamycin induced
    dilated renal pelvis in the fetal rat:
    physiological and morphological effects
    in the offspring. Teratology 36:51-58.
Kavlock, R.J.; Rehnberg, B.F.; Rogers, E.H.
    (1987b) Critical prenatal periods for
    chlorambucil induced functional
    teratology of the kidneys. Toxicology
    43:51-64.
Kavlock, R.J.; Short, R.D., Jr.; Chernoff, N.
    (1987c) Further evaluation of an in vivo
    teratology screen. Teratogenesis
    Carcinog. Mutagen. 7:7-16.
Kavlock, R.J.; Hoyle, B.R.; Rehnberg, B.F.;
    Rogers, E. (1988) The significance of
    dilated renal pelvis in the nitrofen
    exposed fetal rat. Toxicol. Appl.
    Pharmacol. 94:287-296.
Khera, K.S. (1984) Maternal toxicity—a
    possible factor in fetal malformations in
    mice. Teratology 29:411-418.
Khera, K.S. (1985) Maternal toxicity: a
    possible etiologic factor in embryo-fetal
    deaths and fetal malformations in
    rodent-rabbit species. Teratology 31:129-
    153.
Khera, K.S. (1987) Maternal toxicity of drugs
    and metabolic disorders—a possible
    etiologic factor in the intrauterine death
    and congenital malformation: a critique
    on human data. CRC Crit. Rev. Toxicol.
    17:345-375.
Kimmel, C.A. (1988) Current status of
    behavioral teratology—science and
    regulation. CRC Crit. Rev. Toxicol.
Kimmel, C.A. (1990) Quantitative approaches
    to human risk assessment for noncancer
    health effects. Neurotoxicology 11:189-
    198.
Kimmel, G.L. (1985) In vitro tests in screening
    teratogens: considerations to aid the
    validation process. In: Marois, M., ed.
    Prevention of physical and mental
    congenital defects. Part C. New York,
    NY: Alan R. Liss, Inc., pp. 259-263.
Kimmel, G.L. (1990) In vitro assays in
    developmental toxicology: their potential
    application in risk assessment. In: In
    vitro methods in developmental
    toxicology: use in defining mechanisms
    and risk parameters. Kimmel, G.L.;
    Kochhar, D.M., eds. Boca Raton, FL: CRC
    Press, pp. [...]
Kimmel, C.A.; Francis, E.Z. (1990)
    Proceedings of the workshop on the
    acceptability and interpretation of
    dermal developmental toxicity studies.
    Fundam. Appl. Toxicol. 14:386-398.
Kimmel, C.A.; Gaylor, D.W. (1988) Issues in
    qualitative and quantitative risk analysis
    for developmental toxicology. Risk Anal.
    8:15-20.
Kimmel, C.A.; Price, C.J. (1990)
    Developmental toxicity studies. In:
    Arnold, D.L.; Grice, H.C.; Krewski, D.R.,
    eds. Handbook of in vivo toxicity testing.
    San Diego, CA: Academic Press, pp. 271-
    301.
Kimmel, C.A.; Young, J.F. (1983) Correlating
    pharmacokinetics and teratogenic end
    points. Fundam. Appl. Toxicol. 3:250-255.
Kimmel, G.L.; Smith, K.; Kochhar, D.M.; Pratt,
    R.M. (1982a) Overview of in vitro
    teratogenicity testing: aspects of
    validation and application to screening.
    Teratogenesis Carcinog. Mutagen. 2:221-
    229.
Kimmel, G.L.; Smith, K.; Kochhar, D.M.; Pratt,
    R.M. (1982b) Proceedings of the
    consensus workshop on in vitro
    teratogenesis testing. Teratogenesis
    Carcinog. Mutagen. 2:221-374.
Kimmel, C.A.; Holson, J.F.; Hogue, C.J.; Carlo,
    G.L. (1984) Reliability of experimental
    studies for predicting hazards to human
    development. National Center for
    Toxicological Research, Jefferson, AR.
    NCTR Technical Report for Experiment
    No. 6015.
Kimmel, C.A.; Kimmel, G.L.; Frankos, V., eds.
    (1986) Interagency Regulatory Liaison
    Group workshop on reproductive toxicity
    risk assessment. Environ. Health
    Perspect. 66:193-221.
Kimmel, G.L.; Kimmel, C.A.; Francis, E.Z.,
    eds. (1987) Evaluation of maternal and
    developmental toxicity. Teratogenesis
    Carcinog. Mutagen. 7:203-338.
Kimmel, C.A.; Wellington, D.G.; Farland, W.;
    Ross, P.; Manson, J.M.; Chernoff, N.;
    Young, J.F.; Selevan, S.G.; Kaplan, N.;
    Chen, C.; Chitlik, L.D.; Siegel-Scott, C.L.;
    Valaoras, G.; Wells, S. (1989) Overview
    of a workshop on quantitative models for
    developmental toxicity risk assessment.
    Environ. Health Perspect. 79:209-215.
Kimmel, C.A.; Rees, D.C.; Francis, E.Z., eds.
    (1990a) Proceedings of the Workshop on
    the Qualitative and Quantitative
    Comparability of Human and Animal
    Developmental Neurotoxicity.
    Neurotoxicol. Teratol. 12(3):173-292.
Kimmel, C.A.; Kimmel, G.L.; Francis, E.Z.;
    Chitlik, L.D. (1990b) An overview of the
    U.S. EPA's proposed amendments to the
    guidelines for the health assessment of
    suspect developmental toxicants. J. Am.
    Coll. Toxicol. 9:39-47.
Kissling, G. (1981) A generalized model for
    analysis of non-independent
    observations [dissertation]. Chapel Hill,
    NC: University of North Carolina.
    Available from: University Microfilms,
    Ann Arbor, MI.
Kleinbaum, D.G.; Kupper, L.L.; Morgenstern,
    H. (1982) Epidemiologic research:
    principles and quantitative methods.
    London: Lifetime Learning Publications.
Kodell, R.L.; Howe, R.B.; Chen, J.J.; Gaylor,
    D.W. (1991) Mathematical modelling of
    reproductive and developmental toxic
    effects for quantitative risk assessment.
    Risk Analysis 11, in press.
Kwa, S.-L.; Fine, L.J. (1980) The association
    between parental occupation and
    childhood malignancy. J. Occup. Med.
    22:792-794.
 Lamb, J.C., IV. (1985) Reproductive toxicity
     testing: evaluating and developing new
     testing systems. J. Am. Coll. Toxicol.
     4:163-171.
 Lemasters, G.K.; Selevan, S.G. (1984) Use of
     exposure data in occupational
     reproductive studies. Scand. J. Work
     Environ. Health 10:1-6.
 Lemasters, G.K.; Pinney, S.M. (1989)
     Employment status as a confounder
     when assessing occupational exposures
     and spontaneous abortion. J. Clin.
     Epidemiol. 42:975-981.
 Leridon, H. (1977) Human fertility: the basic
     components. Chicago, IL: The University
     of Chicago Press.
 Leukroth, R.W., ed. (1986) Predicting
     neurotoxicity and behavioral dysfunction
     from preclinical toxicologic data.
     Neurotoxicol. Teratol. 9:395-471.
 Levine, R.J. (1983) Methods for detecting
     occupational causes of male infertility:
     reproductive history versus semen
     analysis. Scand. J. Work Environ. Health
     9:371-376.
 Levine, T.E.; Butcher, R.E. (1990) Workshop
     on the qualitative and quantitative
     comparability of human and animal
     developmental neurotoxicity. Work
     group IV report: Triggers for
     developmental neurotoxicity testing.
     Neurotoxicol. Teratol. 12:281-284.
 Levine, R.J.; Symons, M.J.; Balogh, S.A.;
     Arndt, D.M.; Kaswandik, N.R.; Gentile,
     J.W. (1980) A method for monitoring the
     fertility of workers: I. Method and pilot
     studies. J. Occup. Med. 22:781-791.
 Levine, R.J.; Symons, M.J.; Balogh, S.A.;
     Milby, T.H.; Whorton, M.D. (1981) A
     method for monitoring the fertility of
     workers: II. Validation of the method
     among workers exposed to
     dibromochloropropane. J. Occup. Med.
     23:183-188.
Mackeprang, M.; Hay, S.; Lunde, A.S. (1972)
     Completeness and accuracy of reporting
     of malformations on birth certificates.
     HSMHA Health Reports 84:43-49.
McMichael, A.J. (1976) Standardized
     mortality ratios and the 'healthy worker
     effect': scratching beneath the surface. J.
     Occup. Med. 18:165-168.
Morrissey, R.E.; Harris, M.W.; Schwetz, B.A.
     (1989) Developmental toxicity screen:
    results of rat studies with diethylhexyl
     phthalate and ethylene glycol
    monomethyl ether. Teratogenesis
    Carcinog. Mutagen.  9:119-129.
Morrissey, R.E.; Welsch, F.; Kavlock, R.J.;
    Schwetz, B.A. (1991) Proceedings of a
    conference on in vitro teratology.
    Environ. Health Perspect., in press.
National Center for Health Statistics. (1988)
    Advance report of final mortality
    statistics, 1986. Monthly Vital Statistics
    Report 37(6): Suppl. NCHS, Hyattsville,
    MD. DHHS Publ. No. (PHS) 88-1120.
National Research Council. (1983) Risk
    assessment in the Federal government:
    managing the process. Committee on the
    Institutional Means for the Assessment
    of Risks to Public Health. Commission on
    Life Sciences, National Research
    Council. Washington, DC: National
    Academy Press, pp.  17-83.
 Needleman, H. (1988) The neurotoxic,
     teratogenic, and behavioral teratogenic
     effects of lead at low dose: a paradigm
     for transplacental toxicants. In:
     Transplacental effects on fetal health.
     New York, NY: Alan R. Liss, Inc., pp.
     279-287.
 Nelson, C.J.; Holson, J.F. (1978) Statistical
     analysis of teratogenic data: problems
     and advancements. J. Environ. Pathol.
     Toxicol. 2:187-199.
 Nelson, K.; Holmes, L.B. (1989) Malformations
     due to presumed spontaneous mutations
     in newborn infants. New Engl. J. Med.
     320:19-23.
 Nisbet, I.C.T.; Karch, N.J. (1983) Chemical
     hazards to human reproduction. Park
     Ridge, IL: Noyes Data Corp.
 Organization for Economic Cooperation and
     Development (OECD). (1981) Guideline
     for testing of chemicals: teratogenicity.
 Papier, C.M. (1985) Parental occupation and
     congenital malformations in a series of
     35,000 births in Israel. Prog.  Clin. Biol.
     Res. 163:291-294.
 Perlin, S.A.; McCormack, C. (1988) Using
     weight-of-evidence classification
     schemes in the assessment of non-cancer
     health risks. In: Proceedings of the 5th
     National Conference on Hazardous
     Wastes and Hazardous Materials
     (HWHM '88); April 19-21; Las Vegas, NV.
 Peters, J.M.; Preston-Martin, S.; Yu, M.C.
     (1981) Brain tumors in children and
     occupational exposure of parents.
     Science 213:235-237.
 Rai, K.; Van Ryzin, J. (1985) A dose-response
     model for teratological experiments
     involving quantal responses. Biometrics
     41:1-9.
 Riley, E.P.; Vorhees, C.V., eds. (1986)
     Handbook of behavioral teratology. New
     York, NY: Plenum Press.
 Rodier, P.M. (1978) Behavioral teratology. In:
     Wilson, J.G.; Fraser, F.C., eds. Handbook
     of teratology, vol. 4. New York, NY:
     Plenum Press, pp. 397-428.
 Rothman, K.J. (1986) Modern epidemiology.
     Boston, MA: Little, Brown and Co., pp.
     83-94.
 Ryan, L.M.; Catalano, P.J.; Kimmel, C.A.;
     Kimmel, G.L. (1991) Relationship
     between fetal weight and malformation
     in developmental toxicity studies.
     Teratology 44:215-223.
 Schardein, J.L. (1983) Teratogenic risk
     assessment. In: Kalter, H., ed. Issues and
     reviews in teratology, vol. 1. New York,
     NY: Plenum Press, pp. 181-214.
 Schnatter,  A.R.L. (1990) The development of
     methods for implementing industry-
     based reproductive surveillance
     [dissertation]. New York, NY: Columbia
     University. Available from: University
     Microfilms, Ann Arbor, MI.
 Schuler, R.; Hardin, B.; Niemeyer, R.; Booth,
     G.; Hazelden, K.; Piccirillo, V.; Smith, K.
     (1984) Results of testing fifteen glycol
     ethers in a short-term, in vivo
     reproductive toxicity assay. Environ.
     Health Perspect. 57:141-148.
 Selevan, S.G. (1980) Evaluation of data
     sources for occupational pregnancy
     outcome studies [dissertation].
     Cincinnati, OH: University of Cincinnati.
     Available from: University Microfilms,
     Ann Arbor, MI.
 Selevan, S.G. (1981) Design considerations in
     pregnancy outcome studies of
     occupational populations. Scand. J. Work
     Environ. Health 7:76-82.
 Selevan, S.G. (1985) Design of pregnancy
     outcome studies of industrial exposure.
     In: Hemminki, K.; Sorsa, M.; Vainio, H.,
     eds. Occupational hazards and
     reproduction. Washington, DC:
     Hemisphere Pub., pp. 219-229.
 Selevan, S.G.; Hemminki, K.; Lindbohm, M-L.
     (1986) Linking data to study reproductive
     effects of occupational exposures.
     Occupational Medicine: State of the Art
     Reviews 1(3):445-455.
 Selevan, S.G.; Lemasters, G.K. (1987) The
     dose-response fallacy in human
     reproductive studies of toxic exposures.
     J. Occup. Med. 29:451-454.
 Sever, L.E.; Hessol, N.A.  (1984) Overall design
     considerations in male and female
     occupational reproductive studies. In:
     Lockey, J.E.; LeMasters, G.K.; Keye, W.R.,
     eds. Reproduction: the new frontier in
     occupational and environmental
     research. New York, NY: Alan R. Liss,
     Inc., pp. 15-47.
 Shepard, T.H. (1980) Catalog of teratogenic
     agents. Third edition. Baltimore, MD:
     Johns Hopkins University Press.
 Shepard, T.H. (1986) Human teratogenicity.
     Adv. Pediatr. 33:225-268.
 Silverman, J.; Kline, J.; Hutzler, M.; Stein, Z.;
     Warburton, D. (1985) Maternal
     employment and the chromosomal
     characteristics of spontaneously aborted
     conceptions. J. Occup. Med. 27:427-438.
 Slotkin, T.A.; Lau, C.; Kavlock, R.J.; Gray,
     J.A.; Orband-Miller, L.; Queen, K.L.;
     Baker, F.E.; Cameron, A.M.; Antolick, L.;
     Haim, K.; Bartolome, M.; Bartolome, J.
     (1988) Role of sympathetic neurons in
     biochemical and functional development
     of the kidney: neonatal sympathectomy
    with 6-hydroxydopamine. J. Pharmacol.
    Exp. Ther. 246:427-433.
Starr, T.B.; Dalcorso, R.D.; Levine, R.J. (1986)
    Fertility of workers: a comparison of
    logistic regression and indirect
    standardization. Am. J. Epidemiol.
    123:490-498.
Stein, Z.; Hatch, M.  (1987) Biological markers
    in reproductive epidemiology: prospects
    and precautions. Environ. Health
    Perspect. 74:67-75.
Stein, Z.; Susser, M.; Warburton, D.; Wittes,
    J.; Kline, J. (1975) Spontaneous abortion
    as a screening device. The effect of fetal
    surveillance on the incidence of birth
    defects. Am. J. Epidemiol. 102:275-290.
Stein, Z.; Kline, J.; Shrout, P. (1985) Power in
    surveillance. In: Hemminki, K.; Sorsa, M.;
    Vainio, H., eds. Occupational hazards
    and reproduction. Washington, DC:
    Hemisphere Pub., pp. 203-208.
Stiratelli, R.; Laird, N.; Ware, J.H. (1984)
    Random-effects models for serial
    observations with binary responses.
    Biometrics 40:961-971.

 Swan, S.H.; Shaw, G.; Harris, J.A.; Neutra,
     R.R. (1989) Congenital cardiac anomalies
     in relation to water contamination, Santa
     Clara County, CA, 1981-1983. Am. J.
     Epidemiol. 129:885-893.
 Sweeney, A.M.; Meyer, M.R.; Aarons, J.H.;
     Mills, J.L.; LaPorte, R.E. (1988) Evaluation
     of methods for the prospective
     identification of early fetal losses in
     environmental epidemiology studies. Am.
     J. Epidemiol. 127:843-850.
 Tanimura, T. (1986) Collaborative studies on
     behavioral teratology in Japan.
     Neurotoxicology 7:35-46.
 Tilley, B.C.; Barnes, A.B.; Bergstralh, E.;
     Labarthe, D.; Noller, K.L.; Colton, T.;
     Adam, E. (1985) A comparison of
     pregnancy history recall and medical
     records: implications for retrospective
     studies. Am. J. Epidemiol. 121:269-281.
 Tilson, H.A.; Jacobson, J.L.; Rogan, W.J. (1990)
     Polychlorinated biphenyls and the
     developing nervous system: cross-species
     comparisons. Neurotoxicol. Teratol.
     12:239-248.
 Tsai, S.P.; Wen, C.P. (1986) A review of
     methodological issues of the
     [remainder of entry illegible in source]
 [beginning of entry illegible in source]
     test guidelines; final rules. Federal
     Register 50:39425-39428 and 39433-39434.
 U.S. Environmental Protection Agency.
     (1985b) Hazard Evaluation Division
     standard evaluation procedure:
     teratology studies, pp. 22-23. Office of
     Pesticide Programs, Washington, DC.
     EPA-540/9-85-018.
 U.S. Environmental Protection Agency.
     (1985c) Toxic Substances Control Act
     test guidelines; final rules. Federal
     Register 50:39428-39429.
 U.S. Environmental Protection Agency.
     (1986a) Triethylene glycol monomethyl,
     monoethyl, and monobutyl ethers;
     proposed test rule. Federal Register
     51:17883-17894.
 U.S. Environmental Protection Agency.
     (1986b, Sept. 24) Guidelines for
     carcinogen risk assessment. Federal
     Register 51(185):33992-34003.
 U.S. Environmental Protection Agency.
     (1986c, Sept. 24) Guidelines for
     mutagenicity risk assessment. Federal
     Register 51(185):34006-34012.
 U.S. Environmental Protection Agency.
     (1986d, Sept. 24) Guidelines for
     estimating exposures. Federal Register
     51(185):34042-34054.
 U.S. Environmental Protection Agency.
     (1988a, Feb. 26) Diethylene glycol butyl
     ether and diethylene glycol butyl ether
     acetate; final test rule. Federal Register
     53:5932-5953.
 U.S. Environmental Protection Agency.
     (1988b) Proposed guidelines for assessing
     male reproductive risk. Federal Register
     53:24850-24869.
 U.S. Environmental Protection Agency.
     (1988c) Proposed guidelines for assessing
     female reproductive risk. Federal
     Register 53:24834-24847.
 U.S. Environmental Protection Agency.
     (1989a) FIFRA accelerated reregistration
     phase 3 technical guidance, Appendix D.
     Office of Pesticides and Toxic
     Substances, Washington, DC. EPA No.
     540/09-90-078. Available from: NTIS,
     Springfield, VA.
 U.S. Environmental Protection Agency.
     (1989b) Triethylene glycol monomethyl
     ether; final test rule. Federal Register
     54:13472-13477.
 U.S. Environmental Protection Agency.
     (1991a) Pesticide assessment guidelines,
     subdivision F. Hazard evaluation: human
     and domestic animals. Addendum 10:
     Neurotoxicity, series 81, 82, and 83.
     Office of Pesticides and Toxic
     Substances, Washington, DC. EPA 540/
     09-91-123. Available from: NTIS,
     Springfield, VA. PB91-154817.
 U.S. Environmental Protection Agency.
     (1991b) Integrated Risk Information
     System (IRIS). Online. Office of Health
     and Environmental Assessment,
     Washington, DC.
 Weinberg, C.R.; Gladen, B.C. (1986) The beta-
     geometric distribution applied to
     comparative fecundability studies.
     Biometrics 42:547-560.
 Wickramaratne, G.A. de S. (1987) The
     Chernoff-Kavlock assay: its validation
     and application in rats. Teratogenesis
     Carcinog. Mutagen. 7:73-83.
 Wilcox, A.J. (1983) Surveillance of pregnancy
     loss in human populations. Am. J. Ind.
     Med. 4:285-291.
 Wilcox, A.J.; Weinberg, C.R.; Wehmann, R.E.;
     Armstrong, E.G.; Canfield, R.E.; Nisula,
     B.C. (1985) Measuring early pregnancy
     loss: laboratory and field methods. Fertil.
     Steril. 44:366-374.
 Wilson, J.G. (1973) Environment and birth
     defects. New York, NY: Academic Press,
     pp. 30-32.
 Wilson, J.G. (1977) Embryotoxicity of drugs in
     man. In: Wilson, J.G.; Fraser, F.C., eds.
     Handbook of teratology. New York, NY:
     Plenum Press, pp. 309-355.
 Wilson, J.G. (1978) Survey of in vitro systems:
     their potential use in teratogenicity
     screening. In: Wilson, J.G.; Fraser, F.C.,
     eds. Handbook of teratology, vol. 4. New
     York, NY: Plenum Press, pp. 135-153.
 Wilson, J.G.; Scott, W.J.; Ritter, E.J.; Fradkin,
     R. (1975) Comparative distribution and
     embryotoxicity of hydroxyurea in
     pregnant rats and rhesus monkeys.
     Teratology 11:169-178.
 Wilson, J.G.; Ritter, E.J.; Scott, W.J.; Fradkin,
     R. (1977) Comparative distribution and
     embryotoxicity of acetylsalicylic acid in
     pregnant rats and rhesus monkeys.
     Toxicol. Appl. Pharmacol. 41:67-78.
 Wong, O.; Utidjian, H.M.D.; Karten, V.S.
     (1979) Retrospective evaluation of
     reproductive performance of workers
     exposed to ethylene dibromide. J. Occup.
     Med. 21:93-102.
 Woo, D.C.; Hoar, R.M. (1972) "Apparent
     hydronephrosis" as a normal aspect of
     renal development in late gestation of
     rats: the effect of methyl salicylate.
     Teratology 6:191-196.
 World Health Organization. (1984) Principles
     for evaluating health risks to progeny
     associated with exposure to chemicals
     during pregnancy. In: Environmental
     Health Criteria, vol. 30. Geneva: World
     Health Organization.
 Zack, M.; Cannon, S.; Lloyd, D.; Heath, C.W.,
     Jr.; Falletta, J.M.; Jones, B.; Housworth, J.;
     Crowley, S. (1980) Cancer in children of
     parents exposed to hydrocarbon-related
     industries and occupations. Am. J.
     Epidemiol. 111:329-336.
 Zenick, H.; Clegg, E.D. (1989) Assessment of
     male reproductive toxicity: a risk
     assessment approach. In: Hayes, A.W.,
     ed. Principles and methods of toxicology.
     Second ed. New York, NY: Raven Press,
     pp. 279-309.
 PART B: RESPONSE TO PUBLIC AND
 SCIENCE ADVISORY BOARD COMMENTS

 I. Introduction

  This section summarizes the major
 issues raised in the public and Science
 Advisory Board (SAB) comments on the
 Proposed Amendments to the Guidelines
 for the Health Assessment of Suspect
 Developmental Toxicants published
 March 6, 1989 (54 FR 9386-9403).
 Comments were received from 25
 individuals or organizations. The
 Agency's initial summary of the public
 comments and proposed responses were
 presented to the Environmental Health
 Committee of the SAB on October 27,
 1989. The report of the SAB Committee
 was provided to the Agency on April 23,
 1990.
  The SAB and public comments were
 diverse and addressed issues from a
 variety of perspectives. The majority of
 the comments were favorable and in
 support of the Proposed Amendments to
 the Guidelines. Many praised the
 Agency's efforts as being timely and
 well-justified. Most commentors also
 gave specific comments or criticisms for
 further consideration, clarification, or
 re-evaluation. For example, there was
 concern expressed about the Guidelines
 imposing further testing requirements,
 particularly functional testing, and many
 commentors felt that the Proposed
 Amendments discounted the role of
 maternal toxicity in developmental
 toxicity. In addition, there was concern
 that the proposed weight-of-evidence
 scheme would promote labeling of
 agents as causing developmental
 toxicity before the entire risk
 assessment process was completed.
    The SAB Committee also indicated
  that the proposed revisions were
  adequately founded in developmental
  toxicology and represented a step
  forward for the Agency. They suggested
  that the Agency revisit the weight-of-
  evidence scheme, to avoid confusion
  with more commonly applied uses of
  such classifications, and to develop a
  more powerful conceptual approach.
  Further, the SAB Committee urged that
  the Agency begin to move away from
  the current use of the no-observed-
  adverse-effect level (NOAEL) and
  lowest-observed-adverse-effect level
  (LOAEL) basis for calculating the
  reference dose for developmental
  toxicity to a benchmark dose and
  confidence limit approach tied to
  empirical models of dose-response
  relationships.
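
  The contrast between the two approaches can be illustrated with a
  minimal sketch. It is not an Agency method. It assumes hypothetical
  dose groups, each flagged for whether an adverse effect was observed,
  and, for the benchmark dose, an assumed one-parameter quantal-linear
  model P(d) = 1 - exp(-(a + b*d)), for which the extra risk at dose d is
  1 - exp(-b*d). The function names, the dose values, and the slope are
  illustrative assumptions, and only a point estimate is computed; the
  confidence limit mentioned by the SAB is not.

```python
import math


def noael_loael(groups):
    """Return (NOAEL, LOAEL) from hypothetical dose groups.

    groups: list of (dose, adverse_effect_observed) tuples, controls excluded.
    LOAEL is the lowest dose with an adverse effect; NOAEL is the highest
    dose below the LOAEL with no adverse effect. Either may be None.
    """
    ordered = sorted(groups)
    loael = next((dose for dose, adverse in ordered if adverse), None)
    below = [dose for dose, adverse in ordered
             if not adverse and (loael is None or dose < loael)]
    noael = max(below) if below else None
    return noael, loael


def benchmark_dose(slope, bmr=0.10):
    """Dose giving extra risk `bmr` under the assumed quantal-linear model.

    Extra risk = (P(d) - P(0)) / (1 - P(0)) = 1 - exp(-slope * d),
    so d = -ln(1 - bmr) / slope.
    """
    if slope <= 0 or not 0 < bmr < 1:
        raise ValueError("slope must be positive and bmr must lie in (0, 1)")
    return -math.log(1.0 - bmr) / slope


# Hypothetical study: doses in mg/kg-day, flag = adverse effect observed.
study = [(10, False), (30, False), (100, True), (300, True)]
print(noael_loael(study))            # (30, 100)
print(round(benchmark_dose(0.002), 2))
```

  The NOAEL and LOAEL can only take values that were actually tested,
  whereas the benchmark dose is read from the fitted dose-response
  curve, which is the distinction the SAB Committee's recommendation
  turns on.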
    In response to the comments, the
  Agency has modified or clarified many
  sections of the Guidelines. For the
  purposes of this discussion, the major
  issues reflected by the public and SAB
  comments are discussed. Several minor
  recommendations, which are not
  discussed specifically here, also were
  considered by the Agency in the
  revision of these Guidelines.
  II. Intent of the Guidelines
    Many of the public comments
  indicated some misunderstanding of the
  intent of the Guidelines, apparently
  assuming that the risk assessment
  guidelines impose testing requirements.
  In particular, some commentors
  suggested that because the Agency was
  providing guidance on the interpretation
  of tests not required in the EPA testing
 guidelines, the Agency was suggesting
  that these tests be required in the future.
   The 1986 Guidelines and the 1989
  Proposed Amendments clearly state that
  these guidelines are not Agency testing
 guidelines, but rather are intended to
 ensure uniform interpretation of all
 existing, relevant data. However, to
 avoid any confusion, the discussion of
 study designs has been changed to
 avoid the impression that these
 Guidelines set  testing requirements. In
 the evaluation of data on an agent for
 risk assessment, relevant data are often
 encountered that have been generated
 from nontraditional tests. In such cases,
 it is imperative that the Agency provide
 guidance so that all data considered to
 be relevant are included in the risk
 assessment and are interpreted
 uniformly.

 III. Basic Assumptions
   In the 1986 Guidelines, several
 assumptions were implicit in the
 approach to risk assessment, but were
 not explicitly stated. These assumptions
 were detailed in the 1989 Proposed
 Amendments. Comments received from
 the public and the SAB favored
 presentation of these assumptions and
 generally agreed with the wording,
 except for the fourth assumption which
 concerns the use of the most relevant or
 most sensitive species. The 1989
 Proposed Amendments stated that "it is
 assumed that the most sensitive species
 should be used to estimate human risk.
 When data are available (e.g.,
 pharmacokinetic, metabolic) to suggest
 the most appropriate species, that
 species will be used for extrapolation."
 The SAB recommended that, for this
 assumption, the basic position  of the
 Agency should be to use data from the
 most relevant species, and that use of
 data from the most sensitive species
 should be the default position. In
 addition, the  SAB recommended that the
 threshold assumption be considered
 carefully in the dose-response
 assessment of any agent, and that the
 Agency develop more comprehensive
 approaches to risk assessment  as
 discussed further in the following
 sections.
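
 The ordering recommended by the SAB (most relevant species first, most
 sensitive species as the default) can be sketched as a simple
 selection rule. This is an illustration only: the function name, the
 species, and the NOAEL values are hypothetical, and treating the
 species with the lowest NOAEL as the "most sensitive" is an assumption
 made for the sketch.

```python
def select_species(noaels, most_relevant=None):
    """Pick the species whose data are used for extrapolation.

    noaels: dict mapping species name -> NOAEL (hypothetical, mg/kg-day).
    most_relevant: species indicated by pharmacokinetic or metabolic data,
        or None when no such data are available.
    """
    if most_relevant is not None:
        if most_relevant not in noaels:
            raise KeyError(f"no NOAEL available for {most_relevant!r}")
        return most_relevant
    # Default position: most sensitive species, taken here as lowest NOAEL.
    return min(noaels, key=noaels.get)


data = {"rat": 50.0, "rabbit": 20.0}
print(select_species(data))                       # rabbit
print(select_species(data, most_relevant="rat"))  # rat
```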
  Changes have been made in the
 statement of the basic assumptions in
 line with the SAB and public comments
 that clarify, but do not alter, the intent of
 the assumptions.

 IV. Maternal/Developmental Toxicity
  The 1989 Proposed Amendments
 stated that "when adverse
 developmental effects are produced only
 at maternally toxic doses, they  are still
 considered to represent developmental
 toxicity and should not be discounted as
 being secondary to maternal toxicity."
 This statement and others concerning
 the interpretation of developmental
 toxicity in the presence of maternal
 toxicity were the subject of a
 considerable number of public
 comments and were also addressed by
 the SAB. Commentors were divided on
 whether they supported the Agency's
 statements or felt that they discounted
 the role of maternal toxicity in
 developmental toxicity, but in general
 the recommended changes did not
 significantly alter the intent of the
 statements. The SAB endorsed the
 proposed revision, and suggested that
 the Agency retain the statement that
 was made in the Proposed Amendments.
   In these Guidelines, the position is
 further clarified by indicating that when
 maternal toxicity is significantly greater
 than the minimal maternally toxic dose,
 developmental effects at that dose may
 be difficult to interpret. This statement
 is added to clarify, but not to change,
 the intent or meaning of the statements
 regarding the relationship between
 maternal and developmental toxicity.
 From a risk assessment point of view,
 whether a developmental effect is or is
 not secondary to maternal toxicity does
 not affect the selection of the
 NOAEL or other dose-response
 methodology.

 V. Functional Developmental Toxicity

   The 1989 Proposed Amendments
 provided information on the state-of-the-
 art in the evaluation of functional effects
 resulting from developmental exposures.
 Several commentors voiced strong
 objection to this section because they
 perceived it as indicating an imminent
 requirement for testing. Several
 indicated there are no standard methods
 for functional testing, some felt that
 functional end points should not be used
 to establish the NOAEL, and others
 voiced concern about the problems with
 using postnatal exposures in animal
 studies.
   The final Guidelines further update
 this section to include a discussion of
 the latest changes in the requirements
 for functional developmental toxicity
 testing by the Agency, and reflect the
 current approach to interpretation of
 such data, with incorporation of
 information from the EPA/NIDA-
 sponsored "Workshop on the
 Qualitative and Quantitative
 Comparability of Human and Animal
 Developmental Neurotoxicity" (1990).
 The intent of these Guidelines, as stated
 above, is not to change testing
 requirements but to give guidance when
 these types of data are encountered in
 the risk assessment process. The
 Guidelines also indicate that functional
 developmental toxicity end points will
 be used for establishing the NOAEL
 when they are found to be the adverse
 effect occurring at the lowest dose in
 appropriate, well-conducted studies.
Interpretation of postnatal exposure
 data is a concern, and must take into
consideration effects on the mother, her
offspring, and possible interactions; a
statement to this effect has been added.
 Further interpretation of data will be
 discussed in the guidance being
 developed by the Agency on
 neurotoxicity risk assessment.
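
 As an illustration only, and not as part of the Guidelines, the principle stated above can be sketched in a few lines of Python: the NOAEL is set by whichever adverse effect occurs at the lowest dose in appropriate, well-conducted studies, whether that effect is functional or not. The function name `select_noael`, the end point names, and the dose values are hypothetical. The sketch assumes that each end point has already been judged adverse or not at each tested (nonzero) dose.

```python
from typing import Dict, List, Optional, Tuple


def select_noael(
    tested_doses: List[float],
    lowest_adverse_dose: Dict[str, Optional[float]],
) -> Tuple[Optional[float], Optional[float], List[str]]:
    """Return (NOAEL, LOAEL, critical end points) for one study.

    tested_doses: nonzero treatment doses used in the study (hypothetical units).
    lowest_adverse_dose: for each end point, the lowest tested dose at which
        an adverse effect was seen, or None if no adverse effect was seen.
    """
    # Keep only end points that showed an adverse effect at some dose.
    adverse = {ep: d for ep, d in lowest_adverse_dose.items() if d is not None}
    if not adverse:
        # No adverse effect at any dose: the highest tested dose is the NOAEL.
        return max(tested_doses), None, []

    # The LOAEL is the lowest dose at which any end point is adverse,
    # regardless of whether that end point is structural or functional.
    loael = min(adverse.values())
    critical = sorted(ep for ep, d in adverse.items() if d == loael)

    # The NOAEL is the highest tested dose below the LOAEL, if one exists.
    below = [d for d in tested_doses if d < loael]
    noael = max(below) if below else None
    return noael, loael, critical


# Hypothetical study: the functional end point is adverse at the lowest dose,
# so it determines the NOAEL.
result = select_noael(
    [10.0, 50.0, 200.0],
    {"fetal_weight": 200.0, "malformations": None, "learning_deficit": 50.0},
)
print(result)
```

 In this hypothetical study the functional end point is the critical effect. If it were excluded, the NOAEL would rise from 10 to 50, which is why the Guidelines do not set functional end points aside.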

 VI. Weight-of-Evidence Scheme

   The 1989 Proposed Amendments
 described important considerations in
 determining the relative weight of
 various kinds of data in estimating the
 risk of developmental toxicity in
 humans. The intent of the proposed
 weight-of-evidence (WOE) scheme was
 that it not be used in isolation, but be
 used as the first step in the risk
 assessment process, to be integrated
 with dose-response information and the ...
 The WOE scheme was the subject of a ...
-------