Thursday
December 5, 1991
Part V
Guidelines for Developmental Toxicity
Risk Assessment; Notice
-------
63798 Federal Register / Vol. 56, No. 234 / Thursday, December 5, 1991 / Notices
ENVIRONMENTAL PROTECTION
AGENCY
[FRL-4038-3]
Guidelines for Developmental Toxicity
Risk Assessment
AGENCY: U.S. Environmental Protection
Agency (EPA).
ACTION: Final Guidelines for
Developmental Toxicity Risk
Assessment.
SUMMARY: The U.S. Environmental
Protection Agency (EPA) is today
issuing final amended guidelines for
assessing the risks for developmental
toxicity from exposure to environmental
agents. As background information for
this guidance, this notice describes the
scientific basis for concern about
exposure to agents that cause
developmental toxicity, outlines the
general process for assessing potential
risk to humans because of
environmental contaminants,
summarizes the history of these
guidelines, and addresses public and
Science Advisory Board comments on
the 1989 "Proposed Amendments to the
Guidelines for the Health Assessment of
Suspect Developmental Toxicants" [54
FR 9380-9403]. These guidelines, which
have been renamed "Guidelines for
Developmental Toxicity Risk
Assessment" (hereafter "Guidelines"),
outline principles and methods for
evaluating data from animal and human
studies, exposure data, and other
information to characterize risk to
human development, growth, survival,
and function because of exposure prior
to conception, prenatally, or to infants
and children. These Guidelines amend
and replace EPA's 1986 "Guidelines for
the Health Assessment of Suspect
Developmental Toxicants" [51 FR 34028-
34040] by adding new guidance on the
relationship between maternal and
developmental toxicity, characterization
of the health-related data base for
developmental toxicity risk assessment,
use of the reference dose or reference
concentration for developmental toxicity
(RfDDT or RfCDT), and use of the
benchmark dose approach. In addition,
the Guidelines were reorganized to
combine hazard identification and dose-
response evaluation since these are
usually done together in assessing risk
for human health effects other than
cancer.
EFFECTIVE DATE: The Guidelines will be
effective December 5, 1991.
FOR FURTHER INFORMATION CONTACT:
Dr. Carole A. Kimmel, Reproductive and
Developmental Toxicology Branch,
Human Health Assessment Group,
Office of Health and Environmental
Assessment (RD-689), U.S.
Environmental Protection Agency, 401 M
Street, SW., Washington, DC 20460, TEL:
202-260-7331, FAX: 202-260-3803.
SUPPLEMENTARY INFORMATION: The
Clean Air Act (CAA), the Toxic
Substances Control Act (TSCA), the
Federal Insecticide, Fungicide, and
Rodenticide Act (FIFRA) and other
statutes administered by the EPA
authorize the Agency to protect public
health against adverse effects from
environmental pollutants. One type of
adverse effect of great concern is
developmental toxicity, i.e., adverse
effects produced prior to conception,
during pregnancy and childhood.
Exposure to agents affecting
development can result in any one or
more of the following manifestations of
developmental toxicity: Death,
structural abnormality, growth
alteration, and/or functional deficit.
These manifestations encompass a wide
array of adverse developmental end
points, such as spontaneous abortions,
stillbirths, malformations, early
postnatal mortality, reduced birth
weight, mental retardation, sensory loss,
and other adverse functional or physical
changes that are manifested postnatally.
The Role of Environmental Agents in
Developmental Toxicity
Several environmental agents are
established as causing developmental
toxicity in humans (e.g., lead,
polychlorinated biphenyls,
methylmercury, ionizing radiation),
while many others are suspected of
causing developmental toxicity in
humans based on data from
experimental animal studies (e.g., some
pesticides, other heavy metals, glycol
ethers, alcohols, and phthalates). Data
for several of the agents identified as
causing human developmental toxicity
have been compared to the experimental
animal data (Nisbet and Karch, 1983;
Kimmel et al., 1984; Hemminki and
Vineis, 1985; Kimmel et al., 1990a). In
these comparisons, the agents causing
human developmental toxicity in almost
all cases were found to produce effects
in experimental animal studies and, in
at least one species tested, types of
effects similar to those in humans were
generally seen. This information
provides a strong basis for the use of
animal data in conducting human health
risk assessments. On the other hand, a
number of agents found to cause
developmental toxicity in experimental
animal studies have not shown clear
evidence of hazard in humans, but the
available human data are often too
limited to evaluate a cause and effect
relationship. The comparison of dose-
response relationships is hampered by
differences in route, timing and duration
of exposure. When careful comparisons
have been done taking these factors into
account, the minimally effective dose for
the most sensitive animal species was
generally higher than that for humans,
usually within 10-fold of the human
effective dose, but sometimes was 100
times or more higher (e.g.,
polychlorinated biphenyls [Tilson et al.,
1990]). Thus, the experimental animal
data were generally predictive of
adverse developmental effects in
humans, but in some cases, the
administered dose or exposure level
required to achieve these adverse
effects was much higher than the
effective dose in humans.
In most cases, the toxic effects of an
agent on human development have not
been fully studied, even though
exposure of humans to that agent may
have been established. At the same
time, there are many developmental
effects in humans with unknown causes
and no clear link with exposure to
environmental agents. The background
incidence of human spontaneous
abortion, for example, was estimated by
Hertig (1967) to be approximately 50% of
all conceptions, and more recently,
Wilcox et al. (1985), using sensitive
techniques for detecting pregnancy as
early as 9 days postconception,
observed that 35% of postimplantation
pregnancies ended in an embryonic or
fetal loss. Of those infants born alive,
approximately 7.4% are reduced in
weight at birth (i.e., below 2500 g)
(Selevan, 1981), approximately 3% are
found to have one or more congenital
malformations at birth, and by the end
of the first postnatal year, about 3%
more are found to have serious
developmental defects (Shepard, 1986).
Of those children born with
developmental defects, it has been
estimated that 20% are due to genetic
transmission and 10% can be attributed
to known exogenous factors (including
drugs, infections, ionizing radiation, and
environmental agents), leaving the
remaining 70% with unknown causes
(Wilson, 1977). In a recent hospital-
based surveillance study (Nelson and
Holmes, 1989), 50.7% of congenital
malformations were estimated to be due
to genetic or multifactorial causes, while
3.2% were associated with exposure to
exogenous agents and 2.9% to twinning
or uterine factors, leaving 43.2% to
unknown causes. The proportion of the
effects with unknown causes that may
be attributable to environmental agents
or to a combination of factors, such as
environmental agents and genetic
factors, nutritional deficiencies, alcohol
consumption, direct or indirect exposure
to tobacco smoke, use of prescribed and
illicit drugs, etc., is unknown.
The social and economic impact of
developmental disabilities on the
population is extremely high. Close to
one-half of the children in hospital
wards are there because of prenatally
acquired malformations (Shepard, 1980).
According to the Centers for Disease
Control, congenital anomalies, sudden
infant death syndrome, and prematurity
combined account for more than 50% of
infant mortality among all races in the
United States (National Center for
Health Statistics, 1988). In addition,
among the leading causes of estimated
years of potential life lost (YPLL) due to
death before the age of 65, congenital
anomalies, prematurity, and sudden
infant death syndrome combined rank
third (Centers for Disease Control,
1988a, b). The YPLL estimates for
developmental defects may actually
underestimate the public health impact
because the estimates do not include
prenatal deaths, they are based only on
those cases that die before age 65 and
do not account for limited quality of life,
and pregnancies may be terminated
early due to prenatal diagnosis of
developmental defects.
These data provide the basis for a
long-standing interest by Federal
agencies that deal with human health to
protect against exposures to agents that
cause developmental toxicity, and most
of these regulatory agencies have
provisions for considering data on
developmental toxicity in protecting
human health. As a step in developing
procedures for interpreting toxicity data
in the regulatory context, the National
Academy of Sciences/National
Research Council, in 1983, published a
framework for the risk assessment
process, which EPA uses as the basis for
its risk assessment guidelines and for
the assessment of risk due to
environmental agents.
The Risk Assessment Process and Its
Application to Developmental Toxicity
Risk assessment is the process by
which scientific judgments are made
concerning the potential for toxicity to
occur in humans. The National Research
Council (1983) has defined risk
assessment as including some or all of
the following components: Hazard
identification, dose-response
assessment, exposure assessment, and
risk characterization. In general, the
process of assessing the risk of human
developmental toxicity may be adapted
to this format. In practice, however,
hazard identification for developmental
toxicity and other noncancer health
effects is usually done in conjunction
with an evaluation of dose-response
relationships, since the determination of
a hazard is often dependent on whether
a dose-response relationship is present
(Kimmel et al., 1990b). One advantage of
this approach is that it reflects hazard
within the context of dose, route,
duration and timing of exposure, all of
which are important in comparing the
toxicity information available to
potential human exposure scenarios.
Secondly, this approach avoids labelling
of chemicals as developmental toxicants
on a purely qualitative basis. For these
reasons, the Guidelines combine hazard
identification and dose-response
evaluation under one section (Section
III), and characterize both hazard and
dose information as part of the health-
related data base for risk assessment. If
data are considered sufficient for risk
assessment, an oral or dermal reference
dose for developmental toxicity (RfDDT)
or an inhalation reference concentration
for developmental toxicity (RfCDT) is
then derived for comparison with human
exposure estimates. A statement of the
potential for human risk and the
consequences of exposure can come
only from integrating the hazard
identification/dose-response evaluation
with the human exposure estimates in
the final risk characterization.
Combining hazard identification and
dose-response evaluation, as well as
development of the RfDDT and RfCDT, are
revisions of the 1986 Guidelines.
Hazard identification/dose-response
evaluation involves examining all
available experimental animal and
human data and the associated doses,
routes, timing and duration of exposures
to determine if an agent causes
developmental toxicity and/or maternal
or paternal toxicity in that species and
under what exposure conditions. The
no-observed-adverse-effect-level
(NOAEL) and/or the lowest-observed-
adverse-effect-level (LOAEL) are
determined for each study and type of
effect. Based upon the hazard
identification/dose-response evaluation
and criteria provided in these
Guidelines, the health-related data base
can be characterized as sufficient or
insufficient for use in risk assessment
(Section III.C). Because of the limitations
associated with the use of the NOAEL,
the Agency is evaluating the use of an
additional approach, i.e., the benchmark
dose approach (Crump, 1984), for more
quantitative dose-response evaluation
when sufficient data are available. The
benchmark dose provides an indication
of the risk associated with exposures
near the NOAEL, taking into account the
variability in the data and the slope of
the dose-response curve.
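As an illustration only, and not as a statement of Agency procedure, the following Python sketch shows one way the NOAEL and LOAEL might be read from a single study for one type of effect. It assumes the conventional reading that the LOAEL is the lowest tested dose judged to produce an adverse effect and the NOAEL is the highest tested dose below the LOAEL. The `DoseGroup` type, the `noael_loael` function, and the example doses are hypothetical and do not come from these Guidelines.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class DoseGroup:
    dose: float           # administered dose (hypothetical units, e.g. mg/kg/day)
    adverse_effect: bool  # analyst's judgment that a significant adverse effect occurred


def noael_loael(groups: Sequence[DoseGroup]) -> Tuple[Optional[float], Optional[float]]:
    """Return (NOAEL, LOAEL) for one study and one type of effect.

    Assumption: LOAEL is the lowest treated dose with an adverse effect;
    NOAEL is the highest treated dose below the LOAEL. Either may be None.
    """
    treated = sorted((g for g in groups if g.dose > 0), key=lambda g: g.dose)
    effect_doses = [g.dose for g in treated if g.adverse_effect]
    loael = min(effect_doses) if effect_doses else None
    below = [g.dose for g in treated if loael is None or g.dose < loael]
    noael = max(below) if below else None
    return noael, loael


study = [
    DoseGroup(0.0, False),    # concurrent control
    DoseGroup(10.0, False),
    DoseGroup(30.0, True),
    DoseGroup(100.0, True),
]
print(noael_loael(study))  # (10.0, 30.0)
```

When the lowest treated dose already shows an effect, the sketch returns no NOAEL, which corresponds to the case in which only a LOAEL is available for the study.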
For the determination of the RfDDT or
the RfCDT, uncertainty factors are
applied to the NOAEL (or LOAEL, if a
NOAEL has not been established) to
account for extrapolation from
experimental animals to humans and for
variability within the human population.
The RfDDT or RfCDT is generally based
on a short duration of exposure as is
typically used in developmental toxicity
studies in experimental animals. The
use of the terms RfDDT and RfCDT
distinguishes them from the oral or dermal
reference dose (RfD) and the inhalation
reference concentration (RfC), which
refer primarily to chronic exposure
situations (U.S. EPA, 1991). Uncertainty
factors may also be applied to a
benchmark dose for calculating the
RfDDT or RfCDT, but the Agency has little
experience with applying this approach
and is currently supporting research
efforts to determine the appropriate
methods. As more information becomes
available, guidance will be written and
published as an addendum to these
Guidelines. These approaches are
discussed further in section III.D.
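The arithmetic described above can be sketched in Python as follows. This is a minimal illustration that assumes the uncertainty factors are applied by dividing the NOAEL (or LOAEL) by their product. The function name `reference_dose_dt`, the example NOAEL, and the factor values of 10 are hypothetical; this passage does not specify the magnitude of any factor.

```python
from math import prod
from typing import Iterable


def reference_dose_dt(point_of_departure: float,
                      uncertainty_factors: Iterable[float]) -> float:
    """Divide a NOAEL (or LOAEL) by the product of the uncertainty factors.

    Assumption: the factors combine multiplicatively. Units of the result
    match the units of the point of departure.
    """
    factors = list(uncertainty_factors)
    if point_of_departure <= 0:
        raise ValueError("point of departure must be positive")
    if any(f < 1 for f in factors):
        raise ValueError("uncertainty factors are expected to be >= 1")
    return point_of_departure / prod(factors)


# Hypothetical example: NOAEL of 50 mg/kg/day, one factor for
# animal-to-human extrapolation and one for human variability.
print(reference_dose_dt(50.0, [10, 10]))  # 0.5
```

The same function would accept an additional factor when a LOAEL is used in place of a NOAEL; whether and how large such a factor should be is a matter for the discussion in section III.D, not for this sketch.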
The exposure assessment identifies
human populations exposed or
potentially exposed to an agent,
describes their composition and size,
and presents the types, magnitudes,
frequencies, and durations of exposure
to the agent. The exposure assessment
provides an estimate of human exposure
levels for particular populations from all
potential sources.
In risk characterization, the hazard
identification/dose-response evaluation
and the exposure assessment for given
populations are combined to estimate
some measure of the risk for
developmental toxicity. As part of risk
characterization, a summary of the
strengths and weaknesses in each
component of the risk assessment is
discussed along with major
assumptions, scientific judgments, and,
to the extent possible, qualitative and
quantitative estimates of the
uncertainties. Confidence in the health-
related data is always presented in
conjunction with information on dose-
response and the RfDDT or RfCDT. If
human exposure estimates are
available, the exposure basis used for
the risk assessment is clearly described,
e.g., highly exposed individuals, or
highly sensitive or susceptible
individuals. The NOAEL may be
compared to the various estimates of
human exposure to calculate the
margin(s) of exposure (MOE). The
considerations for determining
adequacy of the MOE are similar to
those used in determining the
appropriate size of the uncertainty
factor
5. Other Risk Descriptors
E. Communicating Results
VI. Summary and Research Needs
VII. References
Part B: Response to Public and Science
Advisory Board Comments
I. Introduction
II. Intent of the Guidelines
III. Basic Assumptions
IV. Maternal/Developmental Toxicity
V. Functional Developmental Toxicity
VI. Weight-of-Evidence Scheme
VII. Applicability of the RfDDT Concept and
the Benchmark Dose Approach
Part A: Guidelines for Developmental Toxicity
Risk Assessment
I. Introduction
These Guidelines describe the
procedures that the EPA follows in
evaluating potential developmental
toxicity associated with human
exposure to environmental agents. The
Agency has sponsored or participated in
several conferences that addressed
issues related to such evaluations and
that provide some of the scientific basis
for these Guidelines (U.S. EPA, 1982a;
Kimmel et al., 1982b, 1987; Hardin, 1987;
Perlin and McCormack, 1988; Kimmel et
al., 1989; Kimmel and Francis, 1990;
Kimmel et al., 1990a). The Agency's
authority to regulate substances that
have the potential to interfere with
human development is derived from a
number of statutes that are implemented
through multiple offices within the EPA.
The procedures described herein are
intended to promote consistency in the
assessment of developmental toxic
effects across program offices within the
Agency.
These Guidelines provide a general
format for analyzing and organizing the
available data for conducting risk
assessments. The Agency previously has
issued testing guidelines (U.S. EPA,
1982b, 1985a, 1989a, 1991a) that provide
protocols designed to determine the
potential of a test substance to induce
structural and/or other adverse effects
during development. These risk
assessment Guidelines do not change
any prescribed statutory or regulatory
standards for the type of data necessary
for regulatory action, but rather provide
guidance for the interpretation of studies
that follow the testing guidelines, and in
addition, provide limited information for
interpretation of other studies (e.g.,
epidemiologic data, functional
developmental toxicity studies, and
short-term tests) that are not routinely
required, but may be encountered when
reviewing data on particular agents.
Since the purpose of risk assessment
is to make inferences about potential
risks to human health, the most
appropriate data to be used are those
deriving from studies of humans. If
adequate human data are not available,
then it is necessary to use data obtained
from other species. There are a number
of unknowns in the extrapolation of
data from animal studies to humans.
Therefore, a number of assumptions
must be made on the relevance of
effects to potential human risk which
are generally applied in the absence of
data. These assumptions provide the
inferential basis for the approaches
taken to risk assessment in these
Guidelines.
First, it is assumed that an agent that
produces an adverse developmental
effect in experimental animal studies
will potentially pose a hazard to humans
following sufficient exposure during
development. This assumption is based
on the comparisons of data for agents
known to cause human developmental
toxicity (Nisbet and Karch, 1983; Kimmel
et al., 1984; Hemminki and Vineis, 1985;
Kimmel et al., 1990a), which indicate
that, in almost all cases, experimental
animal data are predictive of a
developmental effect in humans.
It is assumed that all of the four
manifestations of developmental
toxicity (death, structural abnormalities,
growth alterations, and functional
deficits) are of concern. In the past,
there has been a tendency to consider
only malformations or malformations
and death as end points of concern.
From the data on agents that are known
to cause human developmental toxicity
(Nisbet and Karch, 1983; Kimmel et al.,
1984; Hemminki and Vineis, 1985;
Kimmel et al., 1990a), there is usually at
least one experimental species that
mimics the types of effects seen in
humans, but in other species tested, the
type of developmental perturbation may
be different. Thus, a biologically
significant increase in any of the four
manifestations is considered indicative
of an agent's potential for disrupting
development and producing a
developmental hazard.
It is assumed that the types of
developmental effects seen in animal
studies are not necessarily the same as
those that may be produced in
humans. This assumption is made
because it is impossible to determine
which will be the most appropriate
species in terms of predicting the
specific types of effects seen in humans.
The fact that every species may not
react in the same way could be due to
species-specific differences in critical
periods, differences in timing of
exposure, metabolism, developmental
patterns, placentation, or mechanisms of
action.
The most appropriate species is used
to estimate human risk when data are
available (e.g., pharmacokinetics). In the
absence of such data, it is assumed that
the most sensitive species is appropriate
for use, based on observations that
humans are as sensitive or more so than
the most sensitive animal species tested
for the majority of agents known to
cause human developmental toxicity
(Nisbet and Karch, 1983; Kimmel et al.,
1984; Hemminki and Vineis, 1985;
Kimmel et al., 1990a).
In general, a threshold is assumed for
the dose-response curve for agents that
produce developmental toxicity. This is
based on the known capacity of the
developing organism to compensate for
or to repair a certain amount of damage
at the cellular, tissue, or organ level. In
addition, because of the multipotency of
cells at certain stages of development,
multiple insults at the molecular or
cellular level may be required to
produce an effect on the whole
organism.
II. Definitions and Terminology
The Agency recognizes that there are
differences in the use of terms in the
field of developmental toxicology. For
the purposes of these Guidelines the
following definitions will be used.
Developmental toxicology—The study
of adverse effects on the developing
organism that may result from exposure
prior to conception (either parent),
during prenatal development, or
postnatally to the time of sexual
maturation. Adverse developmental
effects may be detected at any point in
the life span of the organism. The major
manifestations of developmental
toxicity include: (1) Death of the
developing organism, (2) structural
abnormality, (3) altered growth, and (4)
functional deficiency.
Altered growth—An alteration in
offspring organ or body weight or size.
Changes in one end point may or may
not be accompanied by other signs of
altered growth (e.g., changes in body
weight may or may not be accompanied
by changes in crown-rump length and/or
skeletal ossification). Altered growth
can be induced at any stage of
development, may be reversible, or may
result in a permanent change.
Functional developmental
toxicology—The study of alterations or
delays in the physiological and/or
biochemical competence of an organism
or organ system following exposure to
an agent during critical periods of
development pre- and/or postnatally.
Structural abnormalities—Structural
alterations in development that include
both malformations and variations.
Malformations and variations—A
malformation is usually defined as a
permanent structural change that may
adversely affect survival, development,
or function. The term teratogenicity is
used in these Guidelines to refer only to
malformations. The term variation is
used to indicate a divergence beyond
the usual range of structural constitution
that may not adversely affect survival or
health. Distinguishing between
variations and malformations is difficult
since there exists a continuum of
responses from the normal to the
extremely deviant. There is no generally
accepted classification of malformations
and variations. Other terms that are
often used, but no better defined,
include anomalies, deformations, and
aberrations.
III. Hazard Identification/Dose-
Response Evaluation of Agents That
Cause Developmental Toxicity
This section discusses the evaluation
and interpretation of hazards for a
variety of end points of developmental
toxicity seen in both human and animal
studies, and describes the criteria for
characterizing the sufficiency of the
health-related data base for conducting
a developmental toxicity risk
assessment. It also details the use of
dose-response data for determining
potential hazards, and describes the
calculation of the RfDDT or RfCDT, a dose
or concentration that is assumed to be
without appreciable risk of deleterious
developmental effects for a given agent.
Developmental toxicity is expressed
as one or more of a number of possible
end points that may be used for
evaluating the potential of an agent to
cause abnormal development.
Developmental toxicity generally occurs
in a dose-related manner, may result
from short-term exposure (including
single exposure situations) or from
longer-term low-level exposure, may be
produced by various routes of exposure,
and the types of effects may vary
depending on the timing of exposure
because of a number of critical periods
of development for various organs and
functional systems.
The four major manifestations of
developmental toxicity are death,
structural abnormality, altered growth,
and functional deficit. The relationship
among these manifestations may vary
with increasing dose, and especially at
higher doses, death of the conceptus
may preclude expression of other
manifestations. Of these, all four
manifestations have been evaluated in
human studies, but only the first three
are traditionally measured in laboratory
animals using the conventional
developmental toxicity (also called
teratogenicity or Segment II) testing
protocol as well as in other study
protocols, such as the multigeneration
study or the continuous breeding study.
Although functional deficits seldom
have been evaluated in routine testing
studies in experimental animals,
functional evaluations are beginning to
be required in certain regulatory
situations (U.S. EPA, 1986a, 1988a,
1989b, 1991a).
Developmental toxicity can be
considered a component of reproductive
toxicity, and often it is difficult to
distinguish between effects mediated
through the parents versus direct
interaction with developmental
processes. For example, developmental
toxicity may be influenced by the effects
of toxic agents on the maternal system
when exposure occurs during pregnancy
or lactation. In addition, following
parental exposure prior to conception,
developmental toxicity may result in
their offspring and, potentially, in
subsequent generations. Therefore, it is
useful to consult the "Proposed
Guidelines for Assessing Male
Reproductive Risk" (U.S. EPA, 1988b)
and the "Proposed Guidelines for
Assessing Female Reproductive Risk"
(U.S. EPA, 1988c) in conjunction with
these Guidelines. Mutational events that
occur as a result of exposure to agents
that cause developmental toxicity may
be difficult to discriminate from other
possible mechanisms in standard
studies of developmental toxicity. When
mutational events are suspected, the
"Guidelines for Mutagenicity Risk
Assessment" (U.S. EPA, 1986c), which
specifically address the risks of
heritable mutation, should be consulted.
Carcinogenic effects have occurred in
humans following developmental
exposures to diethylstilbestrol (Herbst
et al., 1971). Several additional agents
(e.g., direct-acting alkylating agents)
have been shown to cause cancer
following developmental exposures in
experimental animals, and it appears
from the data collected thus far that
agents capable of causing cancer in
adults may also cause transplacental or
neonatal carcinogenesis (Anderson et
al., 1985). Currently, there is no way to
predict whether the developing offspring
or adult will be more sensitive to the
carcinogenic effects of an agent. At
present, testing for carcinogenesis
following developmental exposure is not
routinely required. However, if this type
of effect is reported for an agent, it is
considered appropriate to use the
"Guidelines for Carcinogen Risk
Assessment" (U.S. EPA, 1986b) for
assessing human risk.
A. Developmental Toxicity Studies: End
Points and Their Interpretation
1. Laboratory Animal Studies
This section discusses the end points
examined in routinely used protocols as
well as the use of other types of studies,
including functional studies and short-
term tests.
The most commonly used protocol for
assessing developmental toxicity in
laboratory animals involves the
administration of a test substance to
pregnant animals (usually mice, rats, or
rabbits) during the period of major
organogenesis, evaluation of maternal
responses throughout pregnancy, and
examination of the dam and the uterine
contents just prior to term (U.S. EPA,
1982b, 1985a; Food and Drug
Administration (FDA), 1966, 1970;
Organization for Economic Cooperation
and Development (OECD), 1981). Some
studies may use exposures of one to a
few days to investigate periods of
particular sensitivity for induction of
abnormalities in specific organs or organ
systems. In addition, developmental
toxicity may be evaluated in studies
involving exposure to one or both
parents prior to conception, to the
conceptus during pregnancy and over
several generations, or to offspring
during the prenatal and preweaning
periods (U.S. EPA, 1982b, 1985a, 1986a,
1988a, 1991a; FDA, 1966, 1970; OECD,
1981; Lamb, 1985). These Guidelines are
intended to provide information for
interpreting developmental effects
related to any of these types of
exposure.
Appropriate study designs include a
number of important factors. For
example, test animal selection is
generally based on considerations of
species, strain, age, weight, and health
status. Assignment of animals to dose
groups by stratified randomization (on
the basis of body weight) reduces bias
and provides a basis for performing
valid statistical tests. At a minimum, a
high dose, a low dose, and one
intermediate dose are included. The high
dose is selected to produce some
minimal maternal or adult toxicity (i.e.,
a level that at the least produces
marginal but significantly reduced body
weight, reduced weight gain, or specific
organ toxicity, and at the most produces
no more than 10% mortality). At doses
that cause excessive maternal toxicity
(that is, significantly greater than the
minimal toxic level), information on
developmental effects may be difficult
to interpret and of limited value. The
low dose is generally a NOAEL for adult
and offspring effects, although if the low
dose produces a biologically or
statistically significant increase in
response, it is considered a LOAEL (see
section III.A.1.f for a discussion of
biological versus statistical
significance). A concurrent control group
treated with the vehicle used for agent
administration is a critical component of
a well-designed study.
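The stratified randomization mentioned above can be sketched in Python as follows. This is one common blocking scheme offered as an assumption, not a procedure prescribed by these Guidelines: animals are ranked by body weight, cut into blocks of as many animals as there are dose groups, and each block is spread across the groups in random order. The function name `stratified_assignment`, the animal identifiers, the weights, and the group labels are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple


def stratified_assignment(
    animals: Sequence[Tuple[str, float]],  # (animal id, body weight)
    group_names: Sequence[str],
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Assign animals to groups so each group spans the body-weight range."""
    rng = random.Random(seed)              # seeded so the allocation is reproducible
    ranked = sorted(animals, key=lambda a: a[1])
    groups: Dict[str, List[str]] = {name: [] for name in group_names}
    k = len(group_names)
    for start in range(0, len(ranked), k):
        block = ranked[start:start + k]    # animals of similar body weight
        order = list(group_names)
        rng.shuffle(order)                 # random group order within the block
        for (animal_id, _weight), name in zip(block, order):
            groups[name].append(animal_id)
    return groups


# Hypothetical example: eight animals, a control group and three dose groups.
animals = [(f"A{i}", 200.0 + 5 * i) for i in range(8)]
allocation = stratified_assignment(animals, ["control", "low", "mid", "high"])
for name, members in allocation.items():
    print(name, members)
```

Because every block contributes at most one animal to each group, mean body weight is roughly balanced across the control and dose groups while the assignment within each block remains random.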
The route of exposure in these studies
is usually oral, unless the chemical or
physical characteristics of the test
substance or pattern of human exposure
suggest a more appropriate route of
administration. In the case of dermal
exposure, developmental toxicity
studies showing no indication of
maternal or developmental toxicity are
considered insufficient for risk
assessment unless accompanied by
absorption data (Kimmel and Francis,
1990). Dermal developmental toxicity
studies in which skin irritation is too
marked (moderate erythema and/or
moderate edema, i.e., raised
approximately 1 mm) also are
considered insufficient, since excessive
maternal toxicity may be produced from
the irritation rather than from systemic
exposure to the agent. Assessment of
maternal toxicity is based on signs of
systemic toxicity rather than on local
effects such as skin irritation.
Absorption data and limited
pharmacokinetic data collected in
dermal developmental toxicity studies
provide very useful information in the
evaluation of study design and data
interpretation (Kimmel and Francis,
1990). Many of these points also are
pertinent to studies by other routes of
exposure.
The evaluation of specific end points
of maternal and developmental toxicity
is discussed in the next several sections.
Appropriate historical control data
sometimes can be very useful in the
interpretation of these end points.
Comparison of data from treated
animals with concurrent study controls
should always take precedence over
comparison with historical control data.
The most appropriate historical control
data are those from the same laboratory
in which studies were conducted. Even
data from the same laboratory, however,
should be used cautiously and examined
for subtle changes over time that may
result from genetic alterations in the
strain or stock of the species used,
changes in environmental conditions
both in the breeding colony of the
supplier and in the laboratory, and
changes in personnel conducting studies
and collecting data (Kimmel and Price,
1990). Study data should be compared
with recent as well as cumulative
historical data. Any change in
laboratory procedure that might affect
control data should be noted and the
data accumulated separately from
previous data.
The next three sections (a-c) discuss
individual end points of maternal and
developmental toxicity as measured in
the conventional developmental toxicity
study, the multigeneration study, and,
when available, in postnatal studies.
Other end points specifically related to
reproductive toxicity are covered in the
relevant risk assessment guidelines (U.S.
EPA, 1988b, 1988c). The fourth section
(d) deals with the integrated evaluation
of all data, including the relative effects
of exposure on maternal animals and
their offspring, which is important in
assessing the level of concern about a
particular agent.
a. End Points of Maternal Toxicity. A
number of end points that may be
observed as possible indicators of
maternal toxicity are listed in Table 1.
Maternal mortality is an obvious end
point of toxicity; however, a number of
other end points can be observed that
may give an indication of the more
subtle adverse effects of an agent. For
example, in well-conducted studies, the
mating and fertility indices provide
information on the general fertility rate
of the animal stock used and are
important indicators of toxic effects to
adults if treatment begins prior to
mating or implantation. Changes in
gestation length may indicate effects on
the process of parturition.
Table 1.—End Points of Maternal Toxicity

Mortality
Mating Index [(no. with seminal plugs or sperm/no. mated) X 100]
Fertility Index [(no. with implants/no. of matings) X 100]
Gestation Length (useful when animals are allowed to deliver pups)
Body Weight
  Day 0
  During gestation
  Day of necropsy
Body Weight Change
  Throughout gestation
  During treatment (including increments of time within treatment period)
  Post-treatment to sacrifice
  Corrected maternal (body weight change throughout gestation minus gravid uterine weight or litter weight at sacrifice)
Organ Weights (in cases of suspected target organ toxicity and especially when supported by adverse histopathology findings)
  Absolute
  Relative to body weight
  Relative to brain weight
Food and Water Consumption (where relevant)
Clinical Evaluations
  Types, incidence, degree, and duration of clinical signs
  Enzyme markers
  Clinical chemistries
Gross Necropsy and Histopathology
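The mating and fertility indices in Table 1 are simple percentages. The following is a minimal sketch of that arithmetic in Python; the function names and the animal counts are hypothetical illustrations and are not taken from the Guidelines. In particular, using the number of females with evidence of mating as the "no. of matings" is an assumption made only for this example.

```python
def mating_index(n_with_plugs_or_sperm, n_mated):
    """Mating index = (no. with seminal plugs or sperm / no. mated) x 100."""
    if n_mated <= 0:
        raise ValueError("n_mated must be positive")
    return 100.0 * n_with_plugs_or_sperm / n_mated


def fertility_index(n_with_implants, n_matings):
    """Fertility index = (no. with implants / no. of matings) x 100."""
    if n_matings <= 0:
        raise ValueError("n_matings must be positive")
    return 100.0 * n_with_implants / n_matings


# Hypothetical group: 25 females paired, 24 with plugs or sperm, 22 with implants.
# Assumption for this sketch: the 24 females with evidence of mating are the "matings".
group_mating_index = mating_index(24, 25)
group_fertility_index = fertility_index(22, 24)
print(f"mating index: {group_mating_index:.1f}%")
print(f"fertility index: {group_fertility_index:.1f}%")
```

Comparing these indices across dose groups is only informative when treatment begins before mating or implantation, as noted in the text.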
Body weight and the change in body
weight are viewed collectively as
indicators of maternal toxicity for most
species, although these end points may
not be as useful in rabbits, because
body weight changes are usually more
variable (Kimmel and Price, 1990), and
in some strains of rabbits, body weight
is not a good indicator of pregnancy
status. Body weight changes may
provide more information than a daily
body weight measured during treatment
or during gestation. Changes in weight
gain during treatment could occur that
would not be reflected in the total
weight change throughout gestation,
because of compensatory weight gain
that may occur following treatment but
before sacrifice. For this reason, changes
in weight gain during treatment can be
examined as another indicator of
maternal toxicity.
Changes in maternal body weight
corrected for gravid uterine weight at
sacrifice may indicate whether the effect
is primarily maternal or intrauterine. For
example, a significant reduction in
weight gain throughout gestation and in
gravid uterine weight without any
change in corrected maternal weight
gain generally would indicate an
intrauterine effect. Conversely, a change
in corrected weight gain and no change
in gravid uterine weight generally would
suggest maternal toxicity and little or no
intrauterine effect. An alternate estimate
of maternal weight change during
gestation can be obtained by subtracting
the sum of the weights of the fetuses.
However, this weight does not include
the uterine or placental tissue, or the
amniotic fluid.
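The corrected maternal weight gain described above is the gestational weight gain minus the gravid uterine weight at sacrifice. A minimal sketch in Python follows; the function name and the weights (in grams) are hypothetical values chosen only to illustrate the "intrauterine effect" pattern in the text.

```python
def corrected_maternal_gain(day0_weight_g, sacrifice_weight_g, gravid_uterus_g):
    """Weight gain throughout gestation minus gravid uterine weight at sacrifice."""
    gestation_gain = sacrifice_weight_g - day0_weight_g
    return gestation_gain - gravid_uterus_g


# Hypothetical control dam: 250 g on day 0, 400 g at sacrifice, 90 g gravid uterus.
control_gain = corrected_maternal_gain(250, 400, 90)
# Hypothetical treated dam: lower total gain and a lighter gravid uterus.
treated_gain = corrected_maternal_gain(250, 370, 60)

print(control_gain, treated_gain)
# Total gain fell (150 g vs. 120 g) and uterine weight fell (90 g vs. 60 g),
# but corrected gain is unchanged, which is the pattern the text associates
# with a primarily intrauterine effect.
```

Substituting the summed fetal weights for the gravid uterine weight gives the alternate estimate mentioned in the text, with the caveat that it omits uterine and placental tissue and amniotic fluid.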
Changes in other end points may also
be important. For example, changes in
relative and absolute organ weights may
be signs of a maternal effect especially
when an agent is suspected of causing
specific organ toxicity and when such
findings are supported by adverse
histopathologic findings in those organs.
Food and water consumption data are
useful, especially if the agent is
administered in the diet or drinking
water. The amount ingested (total and
relative to body weight) and the dose of
the agent (relative to body weight) can
then be calculated, and changes in food
and water consumption related to
treatment can be evaluated along with
changes in body weight and body
weight gain. Data on food and water
consumption also are useful when an
agent is suspected of affecting appetite,
water intake, or excretory function.
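As an illustration of the calculation described above, the sketch below converts a dietary concentration and measured food consumption into an achieved dose relative to body weight. The units (mg agent per kg diet, g food per day, g body weight), the function names, and the numbers are assumptions made for this example; the Guidelines do not prescribe a particular formula or unit convention.

```python
def dietary_dose_mg_per_kg_day(conc_mg_per_kg_diet, food_g_per_day, body_weight_g):
    """Achieved dose of the agent relative to body weight (mg/kg body weight/day)."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    food_kg_per_day = food_g_per_day / 1000.0
    body_weight_kg = body_weight_g / 1000.0
    return conc_mg_per_kg_diet * food_kg_per_day / body_weight_kg


def relative_food_intake(food_g_per_day, body_weight_g):
    """Food consumed relative to body weight (g food/kg body weight/day)."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return food_g_per_day / (body_weight_g / 1000.0)


# Hypothetical dam: 250 g body weight, diet containing 500 mg agent per kg feed.
nominal_dose = dietary_dose_mg_per_kg_day(500, 20, 250)   # eats 20 g/day
reduced_dose = dietary_dose_mg_per_kg_day(500, 15, 250)   # eats only 15 g/day
print(round(nominal_dose, 2), round(reduced_dose, 2))
```

The second call shows why consumption data matter: if treated animals eat less, the dose actually received falls below the nominal dose even though the dietary concentration is unchanged.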
Clinical evaluations of toxicity also
may be used as indicators of maternal
toxicity. Daily clinical observations may
be useful in describing the profile of
maternal toxicity and alterations in
general homeostasis. Enzyme markers
and clinical chemistries may be useful
indicators of exposure but must be
interpreted carefully as to whether or
not a change constitutes toxicity. Gross
necropsy and histopathology data (when
specified in the protocol) may aid in
determining toxic dose levels. The
minimum amount of information
considered useful for evaluating
maternal toxicity [as noted in the
"Proceedings of the Workshop on the
Evaluation of Maternal and
Developmental Toxicity" (Kimmel et al.,
1987)] includes: morbidity or mortality,
maternal body weight and body weight
gain, clinical signs of toxicity, food and
water consumption (especially if dosing
is via food or water), and necropsy for
gross evidence of organ toxicity. In a
well-designed study, maternal toxicity is
determined in the pregnant and/or
lactating animal over an appropriate
part of gestation and/or the neonatal
period, and is not assumed or
extrapolated from other adult toxicity
studies.
b. End Points of Developmental
Toxicity: Altered Survival, Growth, and
Morphological Development. Because
the maternal animal, and not the
conceptus, is the individual treated
during gestation, data generally are
calculated as incidence per litter or as
number and percent of litters with
particular end points. Table 2 indicates
the ways in which offspring and litter
end points may be expressed.
Table 2.—End Points of Developmental Toxicity

Litters with implants
  No. implantation sites/dam
  No. corpora lutea (CL)/dam (a)
  Percent preimplantation loss (a): [(CL - implantations) x 100]/CL
  No. and percent live offspring (b)/litter
  No. and percent resorptions/litter
  No. and percent litters with resorptions
  No. and percent late fetal deaths/litter
  No. and percent nonlive (late fetal deaths + resorptions) implants/litter
  No. and percent litters with nonlive implants
  No. and percent affected (nonlive + malformed) implants/litter
  No. and percent litters with affected implants
  No. and percent litters with total resorptions
  No. and percent stillbirths/litter
  No. and percent litters with live offspring
Litters with live offspring
  No. and percent live offspring/litter
  Viability of offspring (c)
  Sex ratio/litter
  Mean offspring body weight/litter (c)
  Mean male or female body weight/litter (c)
  No. and percent offspring with external, visceral, or skeletal malformations/litter
  No. and percent malformed offspring/litter
  No. and percent litters with malformed offspring
  No. and percent malformed males or females/litter
  No. and percent offspring with external, visceral, or skeletal variations/litter
  No. and percent offspring with variations/litter
  No. and percent litters having offspring with variations
  Types and incidence of individual malformations
  Types and incidence of individual variations
  Individual offspring and their malformations and variations (grouped according to litter and dose)
  Clinical signs (type, incidence, duration, and degree)
  Gross necropsy and histopathology

(a) Important when treatment begins prior to implantation. May be difficult to assess in mice.
(b) Offspring refers both to fetuses observed prior to term or to pups following birth. The end points examined depend on the protocol used for each study.
(c) Measured at selected intervals until termination of the study.
When treatment of females begins
prior to implantation, an increase in
preimplantation loss could indicate an
adverse effect on gamete transport, the
fertilization process, uterine toxicity, the
developing blastocyst, or on the process
of implantation itself. If treatment
begins around the time of implantation
(i.e., day 6 of gestation in the mouse, rat,
or rabbit), an increase in
preimplantation loss probably reflects
variability that is not treatment-related
in the animals being used, but the data
should be examined carefully to
determine if there is a dose-response
relationship. If preimplantation loss is
related to dose, further studies would be
necessary to determine the mechanism
and extent of such effects.
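The percent preimplantation loss formula from Table 2, and a simple check for a dose-related trend, can be sketched as follows. The data layout (a list of corpora lutea and implantation counts per dam), the function names, and all counts and dose labels are hypothetical; a real evaluation would apply an appropriate statistical trend test rather than comparing group means by eye.

```python
from statistics import mean


def percent_preimplantation_loss(corpora_lutea, implantations):
    """Table 2 formula: (CL - implantations) x 100 / CL, for one dam."""
    if corpora_lutea <= 0:
        raise ValueError("corpora lutea count must be positive")
    return 100.0 * (corpora_lutea - implantations) / corpora_lutea


def group_mean_loss(dams):
    """Mean of the per-dam percentages, so that the dam (litter) is the unit."""
    return mean(percent_preimplantation_loss(cl, imp) for cl, imp in dams)


# Hypothetical (CL, implantations) counts per dam, keyed by dose in mg/kg/day.
groups = {
    0: [(14, 13), (12, 12)],
    10: [(14, 12), (12, 10)],
    100: [(14, 9), (12, 8)],
}
group_means = {dose: group_mean_loss(dams) for dose, dams in groups.items()}
for dose in sorted(group_means):
    print(dose, round(group_means[dose], 1))
```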
The number and percent of live
offspring per litter, based on all litters,
may include litters that have no live
implants. The number and percent of
resorptions and late fetal deaths give
some indication of when the conceptus
died, and the number and percent of
nonlive implants per litter
(postimplantation loss) is a combination
of these two measures. Expression of
data as the number and percent of litters
showing an increased incidence for
these end points may be less useful than
incidence per litter because, in the
former case, a litter is counted whether
one or all implants were resorbed, dead,
or nonlive.
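The difference between the two ways of expressing the data can be seen in a small sketch. Each litter is represented here as a hypothetical (implants, nonlive implants) pair; the function names and counts are illustrative assumptions only.

```python
def mean_percent_nonlive_per_litter(litters):
    """Average of each litter's percent nonlive implants (incidence per litter)."""
    per_litter = [100.0 * nonlive / implants for implants, nonlive in litters]
    return sum(per_litter) / len(per_litter)


def percent_litters_with_nonlive(litters):
    """Percent of litters having at least one nonlive implant."""
    affected = sum(1 for _, nonlive in litters if nonlive > 0)
    return 100.0 * affected / len(litters)


# Hypothetical (implants, nonlive) counts for four litters per group.
control = [(12, 1), (10, 1), (14, 1), (12, 0)]
treated = [(12, 6), (10, 5), (14, 7), (12, 0)]

for name, group in (("control", control), ("treated", treated)):
    print(name,
          round(percent_litters_with_nonlive(group), 1),
          round(mean_percent_nonlive_per_litter(group), 1))
```

Both groups have the same percent of litters with nonlive implants, because a litter is counted once whether one or half of its implants are nonlive, while the per-litter incidence separates the groups clearly.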
If a significant increase in
postimplantation loss is found after
exposure to an agent, the data may be
compared not only with concurrent
controls, but also with recent historical
control data (preferably from the same
laboratory), since there is considerable
interlitter variability in the incidence of
postimplantation loss (Kimmel and
Price, 1990). If a given study control
group exhibits an unusually high or low
incidence of postimplantation loss
compared to historical controls, then
scientific judgment must be used to
determine the adequacy of the study for
risk assessment purposes.
The end point "affected implants"
(i.e., the combination of nonlive and
malformed conceptuses) sometimes
reflects a better dose-response
relationship than does the incidence of
nonlive or malformed offspring taken
individually. This is especially true at
the high end of the dose-response curve
in cases when the incidence of nonlive
implants per litter is greatly increased.
In such cases, the malformation rate
may appear to decrease because only
unaffected offspring have survived. If
the incidence of prenatal deaths or
malformations is unchanged, then the
incidence of affected implants will not
provide any additional dose-response
information. In studies where maternal
animals are allowed to deliver pups
normally, the number of stillbirths per
litter should also be noted.
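A brief sketch shows how the "affected implants" end point can preserve a dose-response relationship that the malformation rate alone obscures. The counts are hypothetical, and expressing malformations as a percent of live offspring is an assumption made for this example.

```python
def percent_malformed_of_live(implants, nonlive, malformed):
    """Malformed offspring as a percent of live offspring in a litter."""
    live = implants - nonlive
    if live <= 0:
        return float("nan")  # no live offspring to examine
    return 100.0 * malformed / live


def percent_affected_implants(implants, nonlive, malformed):
    """Affected implants = nonlive + malformed, as a percent of all implants."""
    return 100.0 * (nonlive + malformed) / implants


# Hypothetical litters, each given as (implants, nonlive, malformed).
mid_dose = (12, 2, 5)
high_dose = (12, 9, 1)

for name, litter in (("mid", mid_dose), ("high", high_dose)):
    print(name,
          round(percent_malformed_of_live(*litter), 1),
          round(percent_affected_implants(*litter), 1))
```

At the hypothetical high dose the malformation rate among survivors falls because most conceptuses have died, while the percent of affected implants continues to rise.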
The number of live offspring per litter,
based on those litters that have one or
more live offspring, may be unchanged
even though the incidence of nonlive in
all litters is increased. This could occur
either because of an increase in the
number of litters with no live offspring,
or an increase in the number of implants
per litter. A decrease in the number of
live offspring per litter is generally
accompanied by an increase in the
incidence of nonlive implants per litter
unless the implant numbers differ among
dose groups. In postnatal studies, the
viability of live-born offspring should be
determined at selected intervals until
termination of the study.
The sex ratio per litter, as well as the
body weights of males and females, can
be examined to determine whether or
not one sex is preferentially affected by
this agent. However, this is an unusual
occurrence.
A change in offspring body weight is a
sensitive indicator of developmental
toxicity, in part because it is a
continuous variable. In some cases,
offspring weight reduction may be the
only indicator of developmental toxicity.
While there is always a question as to
whether weight reduction is a
permanent or transitory effect, little is
known about the long-term
consequences of short-term fetal or
neonatal weight changes. Therefore,
when significant weight reduction
effects are noted, they are used as a
basis to establish the NOAEL. Several
other factors should be considered in
the evaluation of fetal or neonatal
weight changes; for example, in
polytocous animals, fetal and neonatal
weights are usually inversely correlated
with litter size, and the upper end of the
dose-response curve may be affected by
smaller litters and increased fetal or
neonatal weight. Additionally, the
average body weight of males is greater
than that of females in the more
commonly used laboratory animals.
Live offspring are generally examined
for external, visceral, and skeletal
malformations and variations. If only a
portion of the litter is examined for one
or more end points, then random
selection of those pups examined
introduces less bias in the data. An
increase in the incidence of malformed
offspring may be indicated by a change
in one or more of the following end
points: the incidence of malformed
offspring per litter, the number and
percent of litters with malformed
offspring, or the number of offspring or
litters with a particular malformation
that appears to increase with dose (as
indicated by the incidence of individual
types of malformations).
Other ways of examining the data
include determining the incidence of
external, visceral, and skeletal
malformations and variations that may
indicate the organs or organ systems
affected. A listing of individual offspring
with their malformations and variations
may give an indication of the pattern of
developmental deviations. All of these
methods of expressing and examining
the data are valid for determining the
effects of an agent on structural
development. However, care must be
taken to avoid counting offspring more
than once in the evaluation of any single
end point based on number or percent of
offspring or litters. The incidence of
individual types of malformations and
variations may indicate significant
changes that are masked if the data on
all malformations and/or variations are
pooled. Appropriate historical control
data can be especially helpful in the
interpretation of malformations and
variations, particularly those that
normally occur at a low incidence and
may or may not be related to dose in an
individual study.
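The caution about counting offspring more than once can be illustrated with a short sketch. The record layout (litter ID, pup ID, finding) and the findings themselves are hypothetical; the point is only that offspring-based totals should count distinct offspring, while type-specific incidences are tallied separately.

```python
from collections import defaultdict

# Hypothetical findings: pup p1 in litter L1 has two different malformations.
findings = [
    ("L1", "p1", "cleft palate"),
    ("L1", "p1", "fused ribs"),
    ("L1", "p2", "fused ribs"),
    ("L2", "p1", "cleft palate"),
]


def malformed_offspring_per_litter(records):
    """Count each malformed pup once per litter, however many findings it has."""
    pups = defaultdict(set)
    for litter, pup, _finding in records:
        pups[litter].add(pup)
    return {litter: len(ids) for litter, ids in pups.items()}


def incidence_by_type(records):
    """Number of distinct offspring showing each individual malformation."""
    by_type = defaultdict(set)
    for litter, pup, finding in records:
        by_type[finding].add((litter, pup))
    return {finding: len(ids) for finding, ids in by_type.items()}


per_litter = malformed_offspring_per_litter(findings)
print(per_litter)
print(incidence_by_type(findings))
print("malformed offspring:", sum(per_litter.values()), "of", len(findings), "findings")
```

Keeping the type-specific tallies alongside the pooled count also makes it possible to see a change in one malformation that pooling would mask.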
Although a dose-related increase in
malformations is interpreted as an
adverse developmental effect of
exposure to an agent, the biological
significance of an altered incidence of
anatomical variations is more difficult to
assess, and must take into account what
is known about developmental stage
(e.g., with skeletal ossification),
background incidence of certain
variations (e.g., 12 or 13 pairs of ribs in
rabbits), or other strain- or species-
specific factors. However, if variations
are significantly increased in a dose-
related manner, these should also be
evaluated as a possible indication of
developmental toxicity.
In addition, although some
investigators have considered certain of
these effects to simply be associated
with manifestations of maternal toxicity
noted at similar dose levels (Khera,
1984, 1985, 1987), such effects are still
toxic manifestations and as such are
generally considered a reasonable basis
for Agency regulation and/or risk
assessment. On a somewhat similar
note, the conclusion of participants in a
"Workshop on Reproductive Toxicity
Risk Assessment" (Kimmel et al., 1986)
was that dose-related increases in
defects that may occur spontaneously
are as relevant as dose-related increases
in any other developmental toxicity end
points.
c. End Points of Developmental
Toxicity: Functional Deficits.
Developmental effects that are induced
by exogenous agents are not limited to
death, structural abnormalities, and
altered growth. Rather, it has been
demonstrated in a number of instances
that alterations in the functional
competence of an organ or a variety of
organ systems may result from exposure
during critical developmental periods
that may occur between conception and
sexual maturation. Sometimes, these
functional defects are observed at dose
levels below those at which other
indicators of developmental toxicity are
evident (Rodier, 1978). Such effects may
be transient or reversible in nature, but
generally are considered adverse
effects. Testing for functional
developmental toxicity has not been
required routinely by regulatory
agencies in the United States, but
studies in developmental neurotoxicity
are beginning to be required by the EPA
when other information indicates the
potential for adverse functional
developmental effects (U.S. EPA, 1986a,
1988a, 1989b, 1991a). Data from
postnatal studies, when available, are
considered very useful for further
assessment of the relative importance
and severity of findings in the fetus and
neonate. Often, the long-term
consequences of adverse developmental
outcomes noted at birth are unknown,
and further data on postnatal
development and function are necessary
to determine the full spectrum of
potential developmental effects. Useful
data can also be derived from well-
conducted multigeneration studies,
although the dose levels used in these
studies may be much lower than in
studies with shorter-term exposure.
Much of the early work in functional
developmental toxicology was related to
behavioral evaluations, and the term
"behavioral teratology" became
prominent in the mid-1970s. Recent
advances in this area have been
reviewed in several publications (Riley
and Vorhees, 1986; Kimmel, 1988;
Kimmel et al., 1990a). Several expert
groups have focused on the functions
that should be included in a behavioral
testing battery (World Health
Organization [WHO], 1984; Buelke-Sam
et al., 1985; Leukroth, 1986). These
include: sensory systems, neuromotor
development, locomotor activity,
learning and memory, reactivity and/or
habituation, and reproductive behavior.
No testing battery has fully addressed
all of these functions, but it is important
to include as many as possible, and
several testing batteries have been
developed and evaluated for use in
testing (Buelke-Sam et al., 1985;
Tanimura, 1986; Eisner et al., 1986).
The Agency recently has developed a
"generic" developmental neurotoxicity
test guideline that can be used for both
pesticides and industrial chemicals (U.S.
EPA, 1991a). Because of its design, the
developmental neurotoxicity testing
protocol may be conducted as a
separate study, concurrently with or as
a follow-up to a developmental toxicity
(Segment II) study, or be folded into a
multigeneration study in the second
generation. Testing is generally
conducted in the rat. In the protocol for
the separate study, the test agent is
administered orally (other routes may be
used on a case-by-case basis) to at least
three treated groups and one concurrent
control group of animals on day 6 of
gestation through day 10 postnatally.
The highest dose level is selected to
induce some overt signs of maternal
toxicity, but not result in more than a
20% reduction in weight gain during
gestation and lactation. This dose also is
selected to avoid in utero or neonatal
death or malformations sufficient to
preclude a meaningful evaluation of
developmental neurotoxicity. At least 20
litters are required per treatment group.
For behavioral tests, one female and one
male pup per litter are randomly
selected and assigned to one of the
following tests: motor activity, auditory
startle, and learning and memory in
animals at weaning and as adults.
Neuropathological evaluation and
determination of brain weights are
conducted on selected pups at postnatal
day 11 and at termination of the study.
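The random selection of pups for behavioral testing might be sketched as follows. This is one reading of the protocol summary above, under the assumption that each test receives a different male and a different female from each litter; the data layout, pup IDs, and function name are hypothetical, and the test guideline itself (U.S. EPA, 1991a) governs the actual assignment scheme.

```python
import random

TESTS = ("motor activity", "auditory startle", "learning and memory")


def assign_pups(litter, rng):
    """Randomly assign one male and one female pup from a litter to each test.

    `litter` is a dict with 'males' and 'females' lists of pup IDs. Sampling is
    without replacement, so no pup is assigned to more than one test.
    """
    males = rng.sample(litter["males"], len(TESTS))
    females = rng.sample(litter["females"], len(TESTS))
    return {test: {"male": m, "female": f}
            for test, m, f in zip(TESTS, males, females)}


# Hypothetical litter with four pups of each sex.
litter = {"males": ["m1", "m2", "m3", "m4"], "females": ["f1", "f2", "f3", "f4"]}
assignment = assign_pups(litter, random.Random(0))  # seeded only for repeatability
for test, pups in assignment.items():
    print(test, pups)
```

`random.Random.sample` raises `ValueError` when a litter has fewer pups of one sex than there are tests, which a real protocol would need to handle explicitly.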
Several criteria for selecting agents
for developmental neurotoxicity testing
have been suggested (Buelke-Sam et al.,
1985; Levine and Butcher, 1990),
including: agents that cause central
nervous system malformations,
psychoactive drugs and chemicals,
agents that cause adult neurotoxicity,
hormonally-active agents, and chemicals
that are structurally related to others
that cause developmental neurotoxicity
or for which widespread exposure and/or
release is expected. Data from
developmental neurotoxicity studies
should be evaluated in light of the data
that may have triggered such testing as
well as all other toxicity data available.
Less work has been done on other
developing functional systems, but the
assessment of postnatal renal
morphological and functional
development may serve as a model for
the use of postnatal evaluations in the
risk assessment process. As an example,
standard morphological analyses of the
kidneys of fetal rodents have detected
treatment-related changes in the relative
growth of the renal papilla versus the
renal cortex, an effect considered in
some cases to be a malformation
(hydronephrosis), while in other cases a
variation (apparent hydronephrosis,
enlarged or dilated renal pelvis). While
some investigators (Woo and Hoar,
1972) have provided data suggesting that
the morphological effect represents a
transient developmental delay, others
have shown that it can persist well into
postnatal life and that physiological
function is compromised in the affected
individuals (Kavlock et al., 1987a, 1988;
Daston et al., 1988; Couture, 1990). Thus,
the biological interpretation of this
effect on the basis of fetal examinations
alone is tenuous (U.S. EPA, 1985b). In
addition, the critical period for inducing
renal morphological abnormalities
extends into the postnatal period
(Couture, 1990), and studies on
perinatally-induced renal growth
retardation (Kavlock et al., 1986, 1987b;
Slotkin et al., 1988; Gray et al., 1989;
Gray and Kavlock, 1991) have shown
that renal function is generally altered in
such conditions, but that manifestation
of the dysfunction is not readily
predictable. Thus, both morphological
and functional assessment of the
kidneys after birth can provide useful
and complementary information on the
persistence and biological significance
of expressions of developmental
toxicity.
Although not as well studied, data
indicate that the cardiovascular,
respiratory, immune, endocrine,
reproductive, and digestive systems also
are subject to alterations in functional
competence (Kavlock and Grabowski,
1983; Fujii and Adams, 1987) following
exposure during development. Currently,
there are no standard testing procedures
for these functional systems; however,
when data are encountered on a
chemical under review, they are
considered in the risk assessment
process.
Direct extrapolation of functional
developmental effects to humans is
limited in the same way as for other end
points of developmental toxicity, i.e., by
the lack of knowledge about underlying
toxicological mechanisms and their
significance. In evaluations of a limited
number of agents known to cause
developmental neurotoxic effects in
humans, Adams (1986) concluded that
these agents produce similar
developmental neurotoxic effects in
animals and humans. This conclusion
was strongly supported by the results of
a recent "Workshop on the Qualitative
and Quantitative Comparability of
Human and Animal Developmental
Neurotoxicity," sponsored by EPA and
the National Institute on Drug Abuse
(NIDA), at which participants critically
evaluated and compared the effects of
agents known to cause human
developmental neurotoxicity with the
effects seen in experimental animal
studies (Kimmel et al., 1990a). The high
degree of qualitative correlation
between human and experimental
animal data for the agents evaluated
lends strong support for the use of
experimental animals in assessing the
potential risk for developmental
neurotoxicity in humans. Thus, as for
other end points of developmental
toxicity, the assumption can be made
that functional effects in animal studies
indicate the potential for altered
development in humans, although the
types of developmental effects seen in
experimental animal studies will not
necessarily be the same as those that
may be produced in humans. Thus,
when data from functional
developmental toxicity studies are
encountered for particular agents, they
should be considered in the risk
assessment process.
Some guidance is provided here
concerning important general concepts
of study design and evaluation for
functional developmental toxicity
studies.
• Several aspects of study design are
similar to those important in standard
developmental toxicity studies (e.g., a
dose-response approach with the
highest dose producing minimal overt
maternal or perinatal toxicity, number of
litters large enough for adequate
statistical power, randomization of
animals to dose groups and test groups,
litter generally considered the statistical
unit, etc.).
• A replicate study design provides
added confidence in the interpretation
of data.
• Use of a pharmacological/
physiological challenge may be valuable
in evaluating function and "unmasking"
effects not otherwise detectable,
particularly in the case of organ systems
that are endowed with a reasonable
degree of functional reserve capacity.
• Use of functional tests with a
moderate degree of background
variability may be more sensitive to the
effects of an agent on behavioral end
points than are tests with low variability
that may be impossible to disrupt
without being life-threatening (Butcher
et al., 1980).
• A battery of functional tests, in
contrast to a single test, is usually
needed to evaluate the full complement
of organ function in an animal; tests
conducted at several ages may provide
more information about maturational
changes and their persistence.
• Critical periods for the disruption of
functional competence include both the
prenatal and the postnatal periods to the
time of sexual maturation, and the effect
is likely to vary depending on the time
and degree of exposure.
• Interpretation of data from studies
in which postnatal exposure is included
should take into account possible
interaction of the agent with maternal
behavior, milk composition, pup
suckling behavior, possible direct
exposure of pups via dosed feed or
water, etc.
Although interpretation of functional
data may be limited at present, it is
clear that functional effects must be
evaluated in light of other toxicity data,
including other forms of developmental
toxicity (e.g., structural abnormalities,
perinatal death, and growth
retardation). The level of confidence in
an adverse effect may be as important
as the type of change seen, and
confidence may be increased by such
factors as replicability of the effect
either in another study of the same
function, or by convergence of data from
tests that purport to measure similar
functions. A dose-response relationship
is considered an important measure of
chemical effect; in the case of functional
effects, both monotonic and biphasic
dose-response curves are likely,
depending on the function being tested.
Finally, there are at least three
general ways in which the data from
these studies may be useful for risk
assessment purposes: (1) To help
elucidate the long-term consequences of
fetal and neonatal effects; (2) to indicate
the potential for an agent to cause
functional alterations and the effective
doses relative to those that produce
other forms of toxicity; and (3) for
existing environmental agents, to
suggest organ systems to be evaluated in
exposed human populations.
d. Overall Evaluation of Maternal and
Developmental Toxicity. As discussed
previously, individual end points of
maternal and developmental toxicity are
evaluated in developmental toxicity
studies. In order to interpret the data
fully, an integrated evaluation must be
performed considering all maternal and
developmental end points.
Agents that produce developmental
toxicity at a dose that is not toxic to the
maternal animal are especially of
concern because the developing
organism is affected but toxicity is not
apparent in the adult. However, the
more common situation is when adverse
developmental effects are produced only
at doses that cause minimal maternal
toxicity; in these cases, the
developmental effects are still
considered to represent developmental
toxicity and should not be discounted as
being secondary to maternal toxicity. At
doses causing excessive maternal
toxicity (that is, significantly greater
than the minimal toxic dose),
information on developmental effects
may be difficult to interpret and of
limited value. Current information is
inadequate to assume that
developmental effects at maternally
toxic doses result only from maternal
toxicity; rather, when the LOAEL is the
same for the adult and developing
organisms, it may simply indicate that
both are sensitive to that dose level.
Moreover, whether developmental
effects are secondary to maternal
toxicity or not, the maternal effects may
be reversible while effects on the
offspring may be permanent. These are
important considerations for agents to
which humans may be exposed at
minimally toxic levels either voluntarily
or involuntarily, since several agents are
known to produce adverse
developmental effects at minimally toxic
doses in adult humans (e.g., smoking,
alcohol, isotretinoin).
Since the final risk assessment not
only takes into account the potential
hazard of an agent, but also the nature
of the dose-response relationship, it is
important that the relationship of
maternal and developmental toxicity be
evaluated and described. Then,
information from the exposure
assessment is used to determine the
likelihood of exposure to levels near the
maternally toxic dose for each agent
and the risk for developmental toxicity
in humans.
Although the evaluation of
developmental toxicity is the primary
objective of standard studies within this
area, maternal effects seen within the
context of developmental toxicity
studies should be evaluated as part of
the overall toxicity profile for a given
chemical. Maternal toxicity may be seen
in the absence of or at dose levels lower
than those producing developmental
toxicity. If the maternal effect level is
lower than that in other evaluations of
adult toxicity, this implies that the
pregnant female is likely to be more
sensitive than the nonpregnant female.
Data from reproductive and
developmental toxicity studies on the
pregnant female should be used in the
overall assessment of risk.
Approaches for ranking agents
according to their relative maternal and
developmental toxicity have been
proposed; Schardein (1983) has
reviewed several of these. Several
approaches involve the calculation of
ratios relating an adult toxic dose to a
developmentally toxic dose (Johnson,
1981; Fabro et al., 1982; Johnson and
Gabel, 1983; Brown and Freeman, 1984).
Such ratios may describe in a
qualitative and roughly quantitative
fashion the relationship of maternal
(adult) and developmental toxicity.
However, at the U.S. EPA-sponsored
"Workshop on the Evaluation of
Maternal and Developmental Toxicity"
(Kimmel et al., 1987), there was no
agreement as to the validity or utility of
these approaches in other aspects of the
risk assessment process. This is due in
part to uncertainty about factors that
can affect the ratios. For example, the
number and spacing of dose levels,
differences in study design (e.g., route
and/or timing of exposure), the relative
thoroughness in the assessment of
maternal and developmental end points
examined, species differences in
response, and differences in the slope of
the dose-response curves for maternal
and developmental toxicity, can all
influence the maternal and
developmental effects observed and the
resulting ratios (Kimmel et al., 1987; U.S.
EPA, 1985b). Also, maternal and
developmental end points used in the
ratios need to be better defined to
permit cross-species comparison. Until
such information is available, the
applicability of these approaches in risk
assessment is not justified.
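As an illustration of why such ratios
are sensitive to study design, the
following sketch computes a ratio of the
adult LOAEL to the developmental LOAEL
for one hypothetical agent tested under
two different dose spacings. The function
names, dose levels, and assumed
thresholds are illustrative assumptions
and are not taken from the cited papers.

```python
def observed_loael(dose_levels, true_threshold):
    """Return the lowest tested dose at or above an assumed true threshold."""
    for dose in sorted(dose_levels):
        if dose >= true_threshold:
            return dose
    return None  # no effect observed at any tested dose


def adult_to_developmental_ratio(dose_levels, adult_threshold, dev_threshold):
    """Ratio of the observed adult LOAEL to the observed developmental LOAEL."""
    adult = observed_loael(dose_levels, adult_threshold)
    dev = observed_loael(dose_levels, dev_threshold)
    if adult is None or dev is None:
        return None
    return adult / dev


# Hypothetical agent: adult effects begin at 40 mg/kg/day and
# developmental effects at 25 mg/kg/day (both values are assumptions).
wide_spacing = [10, 100, 1000]
narrow_spacing = [10, 30, 90]

ratio_wide = adult_to_developmental_ratio(wide_spacing, 40, 25)
ratio_narrow = adult_to_developmental_ratio(narrow_spacing, 40, 25)
print(ratio_wide, ratio_narrow)
```

Under these assumptions the same agent
yields a different ratio in each design,
because the observed LOAEL can only fall
on a tested dose level.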
e. Short-Term Testing in
Developmental Toxicity. The need for
short-term tests for developmental
toxicity has arisen from the need to
establish testing priorities for the large
number of agents in or entering the
environment, the interest in reducing the
number of animals used for routine
testing, and the expense of testing.
These approaches may be useful in
making preliminary evaluations of
potential developmental toxicity, for
evaluating structure-activity
relationships, and for assigning
priorities for further, more extensive
testing. Furthermore, as the risk
assessment process begins to
incorporate more pharmacokinetic and
mechanistic data, short-term tests
should be particularly useful. Kimmel
(1990) has recently discussed the
potential application of in vitro systems
in risk assessment in a context that is
broader than chemical screening.
However, the Agency currently
considers a short-term test as
"insufficient" by itself to carry out a risk
assessment (see Section III.C).
Although short-term tests for
developmental toxicity are not routinely
required, such data are encountered in
the review of chemicals. Two
approaches are considered here in terms
of their contribution to the overall
testing process: (1) An in vivo
mammalian screen, and (2) in vitro test
systems.
(1) In vivo mammalian developmental
toxicity tests. The most widely studied
in vivo short-term approach is that
developed by Chernoff and Kavlock
(1982). This approach is based on the
hypothesis that a prenatal injury, which
results in altered development, will be
manifested postnatally as reduced
viability and/or impaired growth. When
originally proposed, the test substance
was administered to mice over the
period of major organogenesis at a
single dose level that would elicit some
degree of maternal toxicity. At the
NIOSH "Workshop on the Evaluation of
the Chernoff/Kavlock Test for
Developmental Toxicity" (Hardin, 1987),
use of a second lower dose level was
encouraged to potentially reduce the
chance of false positive results, and the
recording of implantation sites was
recommended to provide a more precise
estimate of postimplantation loss
(Kavlock et al., 1987c). In this approach,
the pups are counted and weighed
shortly after birth, and again after 3-4
days. End points that are considered in
the evaluation include: general maternal
toxicity (including survival and weight
gain), litter size, pup viability and
weight, and gross malformations in the
offspring. Several schemes have been
proposed for ranking the results as a
means of prioritizing agents for further
testing (Chernoff and Kavlock, 1982;
Brown, 1984; Schuler et al., 1984).
The mouse was chosen originally for
this test because of its low cost, but the
procedure has been applied to the rat as
well (Wickramaratne, 1987). The test
can predict the potential for
developmental toxicity of an agent in
the species used while extrapolation of
risk to other species, including humans,
has the same limitations as for other
testing protocols. The EPA Office of
Toxic Substances has developed testing
guidelines for this procedure (U.S. EPA,
1985c), and the Office of Pesticide
Programs has applied similar protocols
on a case-by-case basis (U.S. EPA,
1983b). The National Toxicology
Program also has developed a protocol
that incorporates aspects of a range-
finding study, with the intent of
providing information on appropriate
exposure levels should a standard
developmental toxicity study be
required (Morrissey et al., 1989).
Although testing guidelines are
available, such procedures are required
on a case-by-case basis. Application of
this procedure in the risk assessment
process within the Office of Toxic
Substances has been described (Francis
and Farland, 1987), and the experiences
of a number of laboratories are detailed
in the proceedings of a NIOSH-
sponsored workshop (Hardin, 1987).
Recently, the OECD developed a
screening protocol to be used for
prioritizing existing chemicals for further
testing (draft as of March 22, 1990). This
protocol is similar to the design of the
Chernoff-Kavlock Test except that it
involves exposure of male and female
rats 2 weeks prior to mating, throughout
mating and gestation, and postnatally to
day 4. Male animals are exposed
following mating for a period
corresponding to that of the females.
Adult animals are evaluated for general
toxicity and effects on reproductive
organs. Pups are counted, weighed and
examined for any gross physical or
behavioral abnormalities at birth and on
postnatal day 4. This protocol permits
evaluation of reproductive and
developmental toxicity following
repeated dosing with an agent, provides
an indication for the need to conduct
additional studies, and provides
guidance in the design of further studies.
Currently, this study design is
insufficient by itself to make an estimate
of human risk without further studies to
confirm and extend the observations.
(2) In vitro developmental toxicity
screens. Test systems that fall under the
general heading of "in vitro"
developmental toxicity screens include
any system that employs a test subject
other than the intact pregnant mammal.
Examples of such systems include:
isolated whole mammalian embryos in
culture, tissue/organ culture, cell
culture, and developing nonmammalian
organisms. These systems have long
been used to assess events associated
with normal and abnormal development,
but more recently they have been
considered for their potential as screens
in testing (Wilson, 1978; Kimmel et al.,
1982b; Brown and Fabro, 1982). Many of
these systems are now being evaluated
for their ability to predict the
developmental toxicity of various agents
in intact mammalian systems. This
validation process requires certain
considerations in study design, including
defined end points for toxicity and an
understanding of the system's ability to
handle various test agents (Kimmel et
al., 1982a; Kimmel, 1985; FDA, 1987;
Brown, 1987).
While in vitro test systems can
provide significant information, they are
considered insufficient, by themselves,
for carrying out a risk assessment (see
Section III.C). In part, this is due to
limitations in the application of the data
to the whole animal situation. But it is
also due to the lack of assays that have
been fully validated, as has been noted
in several reviews of available in vitro
systems (FDA, 1987; Brown, 1987;
Faustman, 1988) and at a recent
workshop on in vitro teratology
(Morrissey et al., 1991).
f. Statistical Considerations. In the
assessment of developmental toxicity
data, statistical considerations require
special attention. Since the litter is
generally considered the experimental
unit in most developmental toxicity
studies, and fetuses or pups within
litters do not respond independently, the
statistical analyses are generally
designed to analyze the relevant data
based on incidence per litter or on the
number of litters with a particular end
point. The analytical procedures used
and the results, as well as an indication
of the variance in each end point, should
be evaluated carefully when reviewing
data for risk assessment purposes.
Analysis of variance (ANOVA)
techniques, with litter nested within
dose in the model, take the litter
variable into account while allowing use
of individual offspring data and an
evaluation of both within and between
litter variance as well as dose effects.
Nonparametric and categorical
procedures have also been widely used
for binomial or incidence data. In
addition, tests for dose-response trends
can be applied. Although a single
statistical approach has not been agreed
upon, a number of factors important in
the analysis of developmental toxicity
data have been discussed (Haseman
and Kupper, 1979; Kimmel et al., 1986).
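The effect of treating the litter as the
experimental unit can be sketched as
follows. The example summarizes
incidence per litter and the number of
litters affected, alongside the pooled
fetal incidence that ignores litter
membership. The litter counts and
function names are hypothetical and are
given only for illustration.

```python
from statistics import mean


def per_litter_incidence(litters):
    """litters: list of (affected, total) pairs, one pair per litter."""
    return [affected / total for affected, total in litters if total > 0]


def summarize(litters):
    """Summaries based on the litter, plus the pooled per-fetus incidence."""
    props = per_litter_incidence(litters)
    total_affected = sum(a for a, _ in litters)
    total_fetuses = sum(t for _, t in litters)
    return {
        "n_litters": len(props),
        "mean_incidence_per_litter": mean(props),
        "litters_affected": sum(1 for a, _ in litters if a > 0),
        "pooled_fetal_incidence": total_affected / total_fetuses,
    }


# Hypothetical dose group: six affected fetuses, all from one litter.
treated = [(6, 12), (0, 10), (0, 11), (0, 12)]
treated_summary = summarize(treated)
print(treated_summary)
```

In this hypothetical group, six affected
fetuses count as a single affected
litter, which is the unit on which the
statistical comparison would rest.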
Studies that employ a replicate
experimental design (e.g., two or three
replicates with 10 litters per dose per
replicate rather than a single experiment
with 20 to 30 litters per dose group)
allow broader interpretation of study
results since the variability between
replicates can be accounted for using
ANOVA techniques. Replication of
effects due to a given agent within a
study, as well as among studies or
laboratories, provides added strength in
the use of data for the estimation of risk.
An important factor to consider in
evaluating data is the power of a study
(i.e., the probability that a study will
demonstrate a true effect), which is
limited by the sample size used in the
study, the background incidence of the
end point observed, the variability in the
incidence of the end point, and the
analysis method. As an example, Nelson
and Holson (1978) have shown that the
number of litters needed to detect a 5%
or 10% change was dramatically lower
for fetal weight (a continuous variable
with low variability) than for
resorptions (a binomial response with
high variability). With the current
recommendation in testing protocols
being 20 rodents per dose group (U.S.
EPA, 1982b, 1985a), the minimum change
detectable is an increased incidence of
malformations 5 to 12 times above
control levels, an increase 3 to 6 times
the in utero death rate, and a decrease
0.15 to 0.25 times the fetal weight. Thus,
even within the same study, the ability
to detect a change in fetal weight is
much greater than for the other end
points measured. Consequently, for
statistical reasons only, changes in fetal
weight are often observed at doses
below those producing other signs of
developmental toxicity. Any risk
assessment should present the detection
sensitivity for the study design used and
for the end point(s) evaluated.
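The difference in detection sensitivity
between a continuous and a binomial end
point can be sketched with standard
normal-approximation sample size
formulas. The sketch treats the litter
as the unit of observation and assumes a
two-sided test. The input values (a
litter-mean fetal weight with a standard
deviation of 5% of the mean, and 30% of
control litters showing the binary end
point) are assumptions chosen for
illustration; they are not the values
used by Nelson and Holson (1978).

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_for_means(delta, sd, alpha=0.05, power=0.80):
    """Litters per group to detect a difference `delta` in a mean."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_a + z_b) * sd / delta) ** 2)


def n_for_proportions(p0, p1, alpha=0.05, power=0.80):
    """Litters per group to detect a change in proportion from p0 to p1."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p0 + p1) / 2
    num = (z_a * sqrt(2 * p_bar * (1 - p_bar))
           + z_b * sqrt(p0 * (1 - p0) + p1 * (1 - p1)))
    return ceil((num / (p1 - p0)) ** 2)


# Assumed inputs, expressed relative to the control value.
n_weight = n_for_means(delta=0.05, sd=0.05)    # 5% change in mean weight
n_binary = n_for_proportions(p0=0.30, p1=0.33)  # 10% relative change
print(n_weight, n_binary)
```

Under these assumptions the binary end
point requires far more litters per
group than the continuous end point to
detect a change of similar relative
size.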
Although statistical analyses are
important in determining the effects of a
particular agent, the biological
significance of data is most relevant. It
is important to be aware that with the
number of end points that can be
observed in standard protocols for
developmental toxicity studies, a few
statistically significant differences may
occur by chance. On the other hand,
apparent trends with dose may be
biologically relevant even though
pairwise comparisons do not indicate a
statistically significant effect. This may
be true especially for the incidence of
malformations or in utero death because
of the low power of standard study
designs in which a relatively large
difference is required to be statistically
significant. It should be apparent from
this discussion that a great deal of
scientific judgment, based on experience
with developmental toxicity data and
with principles of experimental design
and statistical analysis, may be required
to adequately evaluate such data.
2. Human Studies
In principle, human data are preferred
for risk assessment. However, the
complexities of obtaining sufficient
human data are such that these data are
not available for many potential
toxicants. The following describes the
methods of generation of human data,
their evaluation, and the weight they
should be given in risk assessments.
The category of "human studies"
includes both epidemiologic studies and
other reports of individual cases or
clusters of events. Greatest weight
should be given to carefully designed
epidemiologic studies with more precise
measures of exposure, since they can
best evaluate exposure-response
relationships (see Section IV).
Epidemiologic studies in which exposure
is presumed based on occupational title
or residence (e.g., some case-referent
and all ecologic studies) may contribute
data to qualitative risk assessments, but
are of limited use for quantitative risk
assessments because of the generally
broad categorical groupings. Reports of
individual cases or clusters of events
may generate hypotheses of exposure-
outcome associations, but require
further confirmation with well-designed
epidemiologic or laboratory studies.
These reports of cases or clusters may
give added support to associations
suggested by other human or animal
data, but cannot stand by themselves in
risk assessments. Risk assessors should
seek the assistance of professionals
trained in epidemiology when
conducting a detailed analysis.
a. Epidemiologic Studies. Good
epidemiologic studies provide the most
relevant information for assessing
human risk. As there are many different
designs for epidemiologic studies,
simple rules for their evaluation do not
exist.
(1) General design considerations. The
factors that enhance a study and thus
increase its usefulness for risk
assessment have been noted in a
number of publications (Selevan, 1980;
Bloom, 1981; U.S. EPA, 1981; Wilcox,
1983; Sever and Hessol, 1984; Axelson,
1985; Tilley et al., 1985; Kimmel et al.,
1988). Some of the more prominent
factors are as follows:
(a) The power of the study: The
power, or ability of a study to detect a
true effect, is dependent on the size of
the study group, the frequency of the
outcome in the general population, and
the level of excess risk to be identified.
In a cohort study, common outcomes,
such as recognized fetal loss, require
hundreds of pregnancies in order to
have a high probability of detecting a
modest increase in risk (e.g., 133 in both
exposed and unexposed groups to detect
a doubling of background; alpha = 0.05,
power = 80%), while less common
outcomes, such as the total of all
malformations recognized at birth,
require thousands of pregnancies to
have the same probability (e.g., more
than 1,200 in both exposed and
unexposed groups) (Bloom, 1981;
Selevan, 1981; Sever and Hessol, 1984;
Selevan, 1985; Stein et al., 1985; Kimmel
et al., 1986). In case-referent studies,
study sizes are dependent on the
frequency of exposure within the source
population. The confidence one has in
the results of a study without positive
findings is related to the power of the
study to detect meaningful differences in
the end points studied.
Power may be enhanced by combining
populations from several studies using a
metaanalysis (Greenland, 1987). The
combined analysis would increase
confidence in the absence of risk for
agents with negative findings. However,
care must be exercised in the
combination of potentially dissimilar
study groups.
A posteriori determination of power
of the actual study may be useful in
evaluating contradictory studies in risk
assessment. Absence of positive
findings in a study of low power would
be given less weight than either a
positive study or a null study (one with
no significant differences) with high
power. Positive findings from very small
studies are open to question due to the
instability of the risk estimates and the
potential for highly selected study
groups.
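An a posteriori power determination of
the kind described above might be
sketched as follows, using a normal
approximation for a two-sided comparison
of two independent proportions. The
background rate of 15%, the doubling to
30%, and the two study sizes are
assumptions made for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p0, p1, n_per_group, alpha=0.05):
    """Approximate power to detect p1 versus p0 with equal group sizes."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    p_bar = (p0 + p1) / 2
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_group)
    se_alt = sqrt((p0 * (1 - p0) + p1 * (1 - p1)) / n_per_group)
    z = (abs(p1 - p0) - z_a * se_null) / se_alt
    return NormalDist().cdf(z)


# Assumed background of 15% and an assumed doubling to 30%.
power_small = power_two_proportions(0.15, 0.30, n_per_group=30)
power_large = power_two_proportions(0.15, 0.30, n_per_group=150)
print(round(power_small, 2), round(power_large, 2))
```

Under these assumptions a null result
from the smaller study would warrant
much less weight than a null result from
the larger one.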
(b) Potential bias in data collection:
Sources of bias may include selection
bias and information bias (Rothman,
1985). Selection bias may occur when an
individual's willingness to participate
varies with certain characteristics
relating to the exposure status or health
status of that individual. In addition,
selection bias may operate in the
identification of subjects for study. For
example, in studies of embryonic loss,
use of hospital records to identify
embryonic or early fetal loss will
underascertain events, because women
are not always hospitalized for these
outcomes. More weight might be given
in a risk assessment to a study in which
a more complete list of pregnancies is
obtained by, for example, collecting
biological data [e.g., human chorionic
gonadotropin (hCG) measurements] on
pregnancy status from study members.
These studies may also be affected by
bias. The representativeness of these
data may be affected by selection
factors related to the willingness of
different groups of women to continue
participation over the total length of the
study. Interview data result in more
complete ascertainment; however, this
strategy carries with it the potential for
recall bias, discussed in further detail
below. A second example of different
levels of ascertainment of events is the
use of hospital records to study
congenital malformations. Hospital
records contain more complete data on
malformations than do birth certificates
(Mackeprang et al., 1972). Consequently,
birth defects registries that are based on
searches of hospital records are more
complete than those based on vital
records (Selevan, 1986). Thus, a study
using hospital records to identify
congenital malformations would be
given more emphasis in a risk
assessment than one using birth
certificates.
Studies of working women present the
potential for additional bias since some
factors that influence employment status
may also be associated with
reproductive end points. For example,
due to child-care responsibilities,
women may terminate employment, as
might women with a history of
reproductive problems who wish to have
children and are concerned about
workplace exposures (Joffe, 1985).
Information bias may result from
misclassification of characteristics of
individuals or events identified for
study. Recall bias, one type of
information bias, may occur when
respondents with specific exposures or
outcomes recall information differently
than those without the exposures or
outcomes. Interview bias may result
when the interviewer knows a priori the
category of exposure (for cohort studies)
or outcome (for case-referent studies) in
which the respondent belongs. Use of
highly structured questionnaires and/or
"blinding" of the interviewer will reduce
the likelihood of such bias. Studies with
lower likelihood of the above-listed bias
should carry more weight in a risk
assessment.
When data are collected by interview
or questionnaire, the appropriate
respondent depends on the type of data
or study. For example, a comparison of
husband-wife interviews on
reproduction found the wives' responses
to questions on pregnancy-related
events to be considerably more
complete and valid than those of the
husbands (Selevan, 1980). A more recent
study (Schnatter, 1990) found small,
nonsignificant improvements in
reporting of birth weights by mothers
compared to fathers, and that males
who provide early fetal loss data with
the aid of their wives give better data
(borderline significance). Studies based
on interview data from the appropriate
respondent(s) would carry more weight
than those from proxy respondents (e.g.,
the specific individual when examining
exposure history and the woman or both
partners when examining pregnancy
history).
Data from any source may be prone to
errors or bias. All types of bias are
difficult to assess; however, validation
with an independent data source (e.g.,
vital or hospital records), or use of
biomarkers of exposure or outcome,
where possible, may indicate the degree
of bias present and increase confidence
in the results of the study. Those studies
with a low probability of biased data
should carry more weight (Axelson,
1985; Stein and Hatch, 1987).
Differential misclassification, i.e.,
when certain subgroups are more likely
to have misclassified data than others,
may either raise or lower the risk
estimate. Nondifferential
misclassification will bias the results
toward a finding of "no effect"
(Rothman, 1986).
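The attenuation produced by
nondifferential misclassification can be
sketched with expected counts for a
two-group comparison. Here exposure
status is misclassified with the same
sensitivity and specificity regardless
of outcome. The cohort sizes, risks, and
error rates are hypothetical.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)


def misclassify_exposure(cases_exp, n_exp, cases_unexp, n_unexp,
                         sensitivity, specificity):
    """Expected counts when exposure is misclassified at the same
    rates in subjects with and without the outcome."""
    obs_cases_exp = sensitivity * cases_exp + (1 - specificity) * cases_unexp
    obs_n_exp = sensitivity * n_exp + (1 - specificity) * n_unexp
    obs_cases_unexp = (cases_exp + cases_unexp) - obs_cases_exp
    obs_n_unexp = (n_exp + n_unexp) - obs_n_exp
    return obs_cases_exp, obs_n_exp, obs_cases_unexp, obs_n_unexp


# Hypothetical cohort: risk of 30% in exposed and 15% in unexposed.
true_counts = (300, 1000, 150, 1000)
true_rr = risk_ratio(*true_counts)
observed_counts = misclassify_exposure(*true_counts,
                                       sensitivity=0.8, specificity=0.9)
observed_rr = risk_ratio(*observed_counts)
print(true_rr, round(observed_rr, 2))
```

In this hypothetical cohort the observed
risk ratio lies between 1 and the true
value, that is, closer to a finding of
"no effect."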
(c) Collection of data on other risk
factors, effect modifiers, and
confounders: Risk factors for
reproductive and developmental toxicity
include such characteristics as age,
smoking, alcohol consumption, drug use,
and past reproductive history.
Additionally, occupational and
environmental exposures are potential
risk factors for reproductive and
developmental effects. Known and
potential risk factors should be
examined to identify those that may be
effect modifiers or confounders. An
effect modifier is a factor that produces
different exposure-response
relationships at different levels of that
factor. For example, maternal age would
be an effect modifier if the risk
associated with a given exposure
increased with the mother's age. A
confounder is a variable that is a risk
factor for the disease under study and is
associated with the exposure under
study, but is not a consequence of the
exposure. A confounder may distort
both the magnitude and direction of the
measure of association between the
exposure of interest and the outcome.
For example, socioeconomic status
might be a confounder in a study of the
association of smoking and fertility,
since socioeconomic status may be
associated with both.
Studies that fail to account for effect
modifiers and confounders should be
given less weight in a risk assessment.
Both of these important factors need to
be controlled in the study design and/or
analysis to improve the estimate of the
effects of exposure (Kleinbaum et al.,
1982). A more in-depth discussion may
be found elsewhere (Epidemiology
Workgroup, 1981; Kleinbaum et al., 1982;
Rothman, 1986). The statistical
techniques used to control for these
factors require careful consideration in
their application and interpretation
(Kleinbaum et al., 1982; Rothman, 1986).
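One common way to control for a
confounder in the analysis is
stratification. The sketch below
compares a crude risk ratio with a
Mantel-Haenszel summary risk ratio
across two strata of a hypothetical
confounder (for instance, low and high
socioeconomic status). The
Mantel-Haenszel estimator is chosen here
only as a familiar example of a
stratified technique; the counts are
invented so that exposure has no effect
within either stratum.

```python
def crude_risk_ratio(strata):
    """strata: list of (cases_exp, n_exp, cases_unexp, n_unexp) tuples."""
    a = sum(s[0] for s in strata)
    n1 = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    n0 = sum(s[3] for s in strata)
    return (a / n1) / (c / n0)


def mantel_haenszel_risk_ratio(strata):
    """Summary risk ratio pooled over strata of the confounder."""
    num = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    den = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return num / den


# Hypothetical strata: risk is 20% in the first and 5% in the second,
# regardless of exposure, but exposure is far more common in the first.
strata = [(160, 800, 40, 200), (10, 200, 40, 800)]
crude_rr = crude_risk_ratio(strata)
adjusted_rr = mantel_haenszel_risk_ratio(strata)
print(crude_rr, adjusted_rr)
```

With these invented counts the crude
comparison suggests more than a doubling
of risk, while the stratified estimate
shows none; the apparent association is
produced entirely by the confounder.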
(d) Statistical factors: As in animal
studies, pregnancies experienced by the
same woman are not independent
events (Kissling, 1981; Selevan, 1985).
Women who have had embryo/fetal loss
are reported to be more likely to have
subsequent losses (Leridon, 1977). In
animal studies, the litter is generally
used as the unit of measure to deal with
nonindependence of events. In studies of
humans, pregnancies are sequential with
the risk factors changing for different
pregnancies, making analyses
considering nonindependence of events
very difficult (Epidemiology Workgroup,
1981; Kissling, 1981). If more than one
pregnancy per woman is included, as is
often necessary due to small study
groups, the use of nonindependent
observations overestimates the true size
of the groups being compared, thus
artificially increasing the probability of
reaching statistical significance
(Stiratelli et al., 1984). Biased estimates
of risk might also result if family size
confounds the relationship between
exposure and outcome. Some
approaches to deal with these issues
have been suggested (Kissling, 1981;
Stiratelli et al., 1984; Selevan, 1985). At
this point in time, a generally accepted
solution to this problem has not been
developed.
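The extent to which nonindependent
observations overstate the size of a
study group can be roughly gauged with
the design effect used in cluster
sampling, 1 + (m - 1) x ICC, where m is
the average number of pregnancies per
woman and ICC is the within-woman
correlation of outcomes. This is offered
only as a simplified approximation and
is not one of the approaches proposed in
the papers cited above; the ICC value and
counts are assumptions.

```python
def effective_sample_size(n_pregnancies, n_women, icc):
    """Approximate number of independent observations represented by
    correlated pregnancies clustered within women."""
    avg_per_woman = n_pregnancies / n_women
    design_effect = 1 + (avg_per_woman - 1) * icc
    return n_pregnancies / design_effect


# Hypothetical study: 300 pregnancies among 100 women, assumed ICC of 0.2.
n_eff = effective_sample_size(300, 100, icc=0.2)
print(round(n_eff, 1))
```

Under these assumptions the 300
pregnancies carry roughly the
information of 214 independent
observations.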
(2) Selection of outcomes for study. As
already discussed, a number of end
points can be considered in the
evaluation of adverse developmental
effects. However, some of the outcomes
are not easily observed in humans, such
as early embryonic loss and
reproductive capacity of the offspring.
Currently, the most feasible end points
for epidemiologic studies are
reproductive history studies of some
pregnancy outcomes (e.g., embryo/fetal
loss, birth weight, sex ratio, congenital
malformations, postnatal function, and
neonatal growth and survival) and
measures of fertility/infertility which
would include indirect evaluations of
very early embryonic loss. Postnatal
outcomes for examination could include
physical growth and development, organ
or system function and behavioral
effects of exposure. Factors requiring
control in the design or analysis (such as
effect modifiers and confounders) may
vary depending on the specific outcomes
selected for study.
The developmental outcomes
available for epidemiologic examination
are limited by a number of factors,
including the relative magnitude of the
exposure since differing spectra of
outcomes may occur at different
exposure levels, the size and
demographic characteristics of the
population, and the ability to observe
the developmental outcome in humans.
Improved methods for identifying some
outcomes such as very early embryonic
loss using new hCG assays may change
the spectrum of outcomes available for
study (Wilcox et al., 1985; Sweeney et
al., 1988).
Demographic characteristics of the
population, such as marital status, age
distribution, education, socioeconomic
status (SES) and prior reproductive
history are associated with the
probability of whether couples will
attempt to have children. Differences in
the use of birth control would also affect
the number of outcomes available for
study. In addition, women with live
births are more likely to terminate
employment than are those with other
outcomes, such as infertility or early
embryonic loss. Thus, retrospective
studies of female exposure that do not
include terminated women workers may
be of limited use in risk assessment
because the level of risk for these
outcomes is likely to be overestimated
(Lemasters and Pinney, 1989).
In addition to the above-mentioned
factors, developmental end points may
be envisioned as effects recognized at
various points in a continuum, starting
at conception through death of the
offspring. Thus, a malformed stillbirth
would not be included in a study of
defects observed at live birth, even
though the etiology could be identical
(Stein et al., 1975; Bloom, 1981). A shift
in the patterns of outcomes could result
from differences in timing or in level of
exposure (Selevan and LeMasters, 1987).
(3) Reproductive history studies, (a)
Measures of fertility: Normally, studies
of sub- or infertility would not be
included in an evaluation of
developmental effects. However, in
humans it is difficult to identify very
early embryonic loss, and distinguish it
from sub- or infertility. Thus, studies
that examine sub- or infertility indirectly
examine loss very early in the
gestational period. Infertility or
subfertility may be thought of as a
nonevent: A couple is unable to have
children within a specific time frame.
Therefore, the epidemiologic
measurement of reduced fertility is
typically indirect, and is accomplished
by comparing birth rates or time
intervals between births or pregnancies.
In these evaluations, the couple's joint
ability to procreate is estimated. One
method, the Standardized Birth Ratio
(SBR; also referred to as the
Standardized Fertility Ratio), compares
the number of births observed to those
expected based on the person-years of
observation stratified by factors such as
time period, age, race, marital status,
parity, contraceptive use, etc. (Wong et
al., 1979; Levine et al., 1980, 1981; Levine,
1983; Starr et al., 1986). The SBR is
analogous to the Standardized Mortality
Ratio (SMR), a measure frequently used
in studies of occupational cohorts, and
has similar limitations in interpretation
(Gaffey, 1976; McMichael, 1976; Tsai and
Wen, 1986).
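The SBR calculation can be sketched as
the ratio of observed births to the
number expected when stratum-specific
reference birth rates are applied to the
person-years of observation in the study
group. The age strata, person-years,
reference rates, and observed count
below are hypothetical.

```python
def standardized_birth_ratio(observed_births, person_years, reference_rates):
    """Return (SBR, expected births). Both dicts are keyed by stratum;
    rates are births per person-year in the reference population."""
    expected = sum(person_years[s] * reference_rates[s] for s in person_years)
    return observed_births / expected, expected


# Hypothetical cohort stratified by age only.
person_years = {"20-24": 400, "25-29": 500, "30-34": 300}
reference_rates = {"20-24": 0.12, "25-29": 0.10, "30-34": 0.06}

sbr, expected = standardized_birth_ratio(87, person_years, reference_rates)
print(round(sbr, 2), round(expected, 1))
```

A ratio below 1, as in this hypothetical
cohort, indicates fewer births than
expected; as with the SMR, its
interpretation depends on how comparable
the reference population is to the study
group.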
Analysis of the time period between
recognized pregnancies or live births
has been suggested as another indirect
measure of fertility (Dobbins et al., 1978;
Baird et al., 1986; Weinberg and Gladen,
1986). Because the time interval between
births increases with increasing parity
(Leridon, 1977), comparisons within
birth order (parity) are more
appropriate. A statistical method (Cox
regression) can stratify by birth or
pregnancy order to help control for
nonindependence of these events in the
same woman.
Fertility may also be affected by
alterations in sexual behavior. However,
limited data are available linking toxic
exposures to these alterations in
humans. Moreover, such data are not
easily obtained in epidemiology studies.
More information on this subject is
available in the proposed male and
female reproductive risk assessment
guidelines (U.S. EPA, 1988b, 1988c).
(b) Pregnancy outcomes: Pregnancy
outcomes examined in human studies of
parental exposures may include
embryo/fetal loss, congenital
malformations, birth weight, sex ratio at
birth, and postnatal effects (e.g.,
physical growth and development, organ
or system function, and behavioral
effects of exposure). Postnatal effects
are discussed in more detail in the next
section. As mentioned previously,
epidemiologic studies that focus on only
one type of pregnancy outcome may
miss a true effect of exposure due to the
continuum of outcomes. Examination of
individual outcomes could mask a true
effect due to reduced power resulting
from fewer events for study. Studies that
examine multiple end points could yield
more information, but the results may be
difficult to interpret.
Evidence of a dose-response
relationship is usually an important
criterion in the assessment of a toxic
exposure. However, traditional dose-
response relationships may not always
be observed for some end points. For
example, with increasing dose, a
pregnancy might end in a fetal loss
rather than a live birth with
malformations. A shift in the patterns of
outcomes could result from differences
either in level of exposure or in timing
(Wilson, 1973; Selevan and Lemasters,
1987) (for a more detailed description,
see Section III.A.2.3.5). Therefore, a risk
assessment should, when possible,
attempt to look at the interrelationship
of different reproductive end points and
patterns of exposure.
(c) Postnatal developmental effects:
These effects may include changes in
growth, behavior, organ or system
function, or cancer. Studies of
neurological and reproductive function
are discussed here as examples.
Postnatal behavioral and functional
effects in humans have been examined
for a small number of environmental
and occupational agents (e.g., lead,
PCBs, methyl mercury, alcohol). For
some agents (e.g., lead and PCBs), subtle
changes have been observed in groups
of children at lower exposures than for
other developmental effects (e.g.,
Bellinger et al., 1987; Needleman, 1988;
Davis et al., 1990; Tilson et al., 1990).
This may not be true for all toxic agents.
These subtle differences would be
difficult to identify in individuals, but
could result in an overall shifting of
mean values when comparing groups of
exposed and unexposed children. Some
postnatal studies have examined infants
or young children using standard
developmental scales (e.g., Brazelton
Neonatal Behavioral Assessment Scale,
Bayley Scales of Infant Development,
Stanford Binet IV, and Wechsler Scales)
and some biologic measure of exposure
(e.g., blood lead levels). These tests are
designed to examine certain end points
and have been developed to cover
certain age ranges. Certain tests
examine specific aspects of
development. For example, the Bayley
Scales look at motor and language
development, but do not examine
sensory function. Batteries of tests are
important for a proper evaluation due to
the possibility of interrelated effects,
e.g., hearing deficits and language
development. Thus, batteries of tests
will give a clearer indication of direct
effects of exposure resulting in postnatal
developmental deficits.
Factors that may influence the
examination of these effects include
parental education, SES, obstetrical
history, and health characteristics
independent of exposure that may affect
functional measurement (e.g., injuries
and infections). Many social and
lifestyle factors may also affect scoring
on these scales (e.g., neonatal-maternal
interactions, SES, home environment).
Studies of premature infants carry
special problems. For proper
comparisons, tests keyed to age in very
young children (less than 2.5 years of
age) need to "correct" the age for
premature infants to the age they would
have been had they been born at term.
In addition, premature infants or those
with low birth weight for their
gestational age may have problems
resulting from the birth process not
directly related to exposure (e.g.,
intraventricular hemorrhage in the brain
which can then cause developmental
problems). Thus, the developmental
effects resulting from exposure may
have their own sequelae.
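The age correction for premature infants described above amounts to subtracting the weeks of prematurity from the chronological age. The sketch below is a minimal illustration; the 40-week term length and the function name are assumptions made here and are not specified in these Guidelines.

```python
TERM_WEEKS = 40  # assumed length of a term gestation (not stated in the text)

def corrected_age_weeks(chronological_age_weeks, gestational_age_at_birth_weeks):
    """Age the infant would have been had it been born at term."""
    weeks_premature = max(0, TERM_WEEKS - gestational_age_at_birth_weeks)
    return max(0, chronological_age_weeks - weeks_premature)

# Hypothetical infant born at 32 weeks and tested one year after birth.
print(corrected_age_weeks(52, 32))
```

A test keyed to age would then be scored against the corrected age rather than the chronological age.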
Other studies may examine effects
occurring at a later age (e.g., in utero
exposure and cancer in young women).
This long time interval typically carries
with it the need for retrospective
studies, with the inherent limitations in
accurate determination of exposure,
effect modifiers, and confounders. Risk
assessment methods for cancer are
described in the "Guidelines for
Carcinogen Risk Assessment" (U.S.
EPA, 1986b).
Reproductive effects may result from
developmental exposures. For example,
environmental exposures may result in
oocyte toxicity, in which a loss of
primordial oocytes irreversibly affects a
woman's fertility. The exposures of
importance may occur during both the
prenatal period and after birth. Oocyte
depletion is difficult to examine directly
in women due to the invasiveness of the
tests required; however, it can be
studied indirectly through evaluation of
the age at reproductive senescence
(menopause) (Everson et al., 1986). Risk
assessment methods for female
reproductive effects are described in the
"Proposed Guidelines for Assessing
Female Reproductive Risk" (U.S. EPA,
1988c).
Developmental exposures to males
could affect their reproductive function
(e.g., deplete stem or Sertoli cells
potentially affecting sperm production)
(Zenick and Clegg, 1989). If stem cell
death occurs with exposure at any age,
recovery is possible as long as some
stem cells survive. The same is true for
Sertoli cells, except that they cease
multiplication before puberty. Thus, cell
replication cannot compensate for
Sertoli cell death after puberty. Human
studies of stem and Sertoli cells would
be difficult due to the invasiveness of
the measure. Less direct measures, e.g.,
sperm count, morphology, and motility,
could be evaluated, but this would not
indicate what cells or stage of
spermatogenesis had been affected. Risk
assessment methods for male
reproductive effects are described in the
"Proposed Guidelines for Assessing
Male Reproductive Risk" (U.S. EPA,
1988b).
In addition to the above effects,
genetic damage to germ cells may result
from developmental exposures.
Outcomes resulting from germ-cell
mutations could include reduced
probability of conception as well as
increased probability of embryo/fetal
loss and other developmental effects.
These end points could be studied using
the approaches described above.
However, a human germ-cell mutagen
has not yet been demonstrated (U.S.
EPA, 1986c). Based on animal studies,
critical exposures are to germ cells or
early zygotes. Germ-cell mutagenicity
could also be expressed as genetic
diseases in future generations.
Unfortunately, these studies would be
very difficult to conduct in human
populations due to the long time lag
between exposure and outcome. For
more information, refer to the
"Guidelines for Mutagenicity Risk
Assessment" (U.S. EPA, 1986c).
(4) Community studies/surveillance
programs. Epidemiologic studies may
also be based on broad populations
such as a community, a nationwide
probability sample, or surveillance
programs (such as birth defects
registries). Other studies have examined
environmental exposures, such as toxic
agents in the water system, and adverse
pregnancy outcome (Swan et al., 1989;
Deane et al., 1989). Unfortunately, in
these studies maternally-mediated
effects may be difficult to distinguish
from paternally-mediated effects. In
addition, the presumably lower
exposure levels (compared to industrial
settings) may require very large groups
for study. A number of case-referent
studies have examined the relationship
between broad classes of parental
occupation in certain communities or
countries, and embryo/fetal loss
(Silverman et al., 1985), birth defects
(Hemminki et al., 1980; Kwa and Fine,
1980; Papier, 1985), and childhood
cancer (Kwa and Fine, 1980; Zack et al.,
1980; Hemminki et al., 1981; Peters et al.,
1981). In these reports, jobs are typically
classified into broad categories based
on the probability of exposure to certain
classes or levels of exposure (e.g., Kwa
and Fine, 1980). Such studies are most
helpful in the identification of topics for
additional study. However, because of
the broad groupings of types or levels of
exposure, such studies are not typically
useful for risk assessment of a particular
agent.
Surveillance programs may also exist
in occupational settings. In this case,
reproductive histories and/or clinical
evaluations could be followed to
monitor for reproductive effects of
exposures. Both could yield very useful
data for risk assessment; however, a
clinical evaluation program would be
costly to maintain, and there are
numerous impediments to the collection
of reliable and valid information in the
workplace. These might include similar
concerns to those previously discussed
plus potentially low participation rates
due to employee sensitivities and
confidentiality concerns.
(5) Identification of exposures
important for developmental effects. For
all examinations of the relationship
between developmental effects and
potentially toxic exposures, the
identification of the appropriate
exposure is crucial. Preconceptional
exposures to either parent and in utero
exposures have been associated with
the more commonly examined outcomes
(e.g., fetal loss, malformations, birth
weight, and measures of infertility).
These exposures, plus postnatal
exposure from breast milk, food, and the
general environment, may be associated
with postnatal developmental effects
(e.g., changes in behavioral and
cognitive function, or growth). The
magnitude of exposure may affect the
spectrum of outcomes observed. This
issue is discussed in more detail in
sections III.A.1.b and III.B.
Infants and young children may
receive disproportionate levels of
exposure due to their tendency to "put
everything" in their mouths (pica) and
the greater time they spend on the floor.
Carpets may serve as a reservoir for
toxic agents (e.g., pesticides and lead
dust), and the air nearer the floor may
have greater levels of certain airborne
toxicants (e.g., mercury from latex
paints).
Exposures in environmental settings
are frequently lower than in industrial
and agricultural settings. However, this
relationship may change as exposures
are reduced in workplaces, and as more
is learned about environmental
exposures (e.g., indoor air exposures,
pesticides usage). Larger populations are
necessary in settings with lower
exposures (Lemasters and Selevan,
1984). Other factors affect the
identification of reproductive or
developmental events with various
levels of exposure. Exposed individuals
may move in and out of areas with
differing levels and types of exposures,
affecting the number of exposed and
comparison events for study. Thus,
exposures can be short-term or chronic.
Data on exposure from human studies
are frequently qualitative, such as
employment or residence histories. More
quantitative data may be difficult to
obtain due to the nature of certain study
designs (e.g., retrospective studies) and
historical limitations in exposure
measurements. Many developmental
outcomes result from exposures during
certain critical times. The appropriate
exposure classification depends on the
outcome(s) studied, the biologic
mechanism affected by exposure, and
the biologic half-life of the agent. The
biologic half-life, in combination with
the patterns of exposure (e.g.,
continuous or intermittent), affects the
individual's body burden and
consequently the "true" dose during the
critical period. The probability of
misclassification of exposure status may
affect the ability to recognize a true
effect in a study (Selevan, 1981; Hogue,
1984; Lemasters and Selevan, 1984;
Sever and Hessol, 1984; Kimmel et al.,
1986). As more prospective studies are
done, better estimates of exposure will
be developed.
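The interaction between biologic half-life and pattern of exposure can be sketched with a simple one-compartment model with first-order elimination. This model, the dosing schedules, and the half-lives below are illustrative assumptions only; they are not a method prescribed by these Guidelines.

```python
def body_burden(daily_doses, half_life_days):
    """Body burden immediately after each day's dose, assuming a
    one-compartment model with first-order elimination."""
    retained = 0.5 ** (1.0 / half_life_days)  # fraction remaining after one day
    burden, history = 0.0, []
    for dose in daily_doses:
        burden = burden * retained + dose
        history.append(burden)
    return history

DAYS = 28
# Two hypothetical schedules delivering the same total dose.
continuous = [1.0] * DAYS
intermittent = [7.0 if day % 7 == 0 else 0.0 for day in range(DAYS)]

for half_life in (1.0, 60.0):
    end_continuous = body_burden(continuous, half_life)[-1]
    end_intermittent = body_burden(intermittent, half_life)[-1]
    print(f"half-life {half_life:>4} d: continuous {end_continuous:.2f}, "
          f"intermittent {end_intermittent:.2f}")
```

Under these assumptions, an agent with a long half-life yields a similar body burden under either schedule, while for an agent with a short half-life the burden on any given day depends strongly on when the last exposure occurred.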
b. Examination of Clusters or Case
Reports/Series. The identification of
cases or clusters of adverse pregnancy
outcomes is generally limited to those
identified by the women involved, or
clinically by their physicians. Examples
of outcomes more easily identified
include mid to late fetal loss or
congenital malformations. Identification
of other effects, such as very early
embryonic loss may be difficult to
separate from the study of sub- or
infertility. Such "nonevents" (e.g., lack
of pregnancies or children) are much
harder to recognize than are
developmental effects such as
malformations resulting from in utero
exposure. While case reports have been
important in the recognition of some
agents that cause developmental
toxicity, they may be of greatest use in
suggesting topics for further
investigation (Hogue, 1985). Reports of
clusters and case reports/series are best
used in risk assessment in conjunction
with strong laboratory data to suggest
that effects observed in animals also
occur in humans. Previous discussion of
the use of human data should be taken
into account wherever possible.
3. Other Considerations
Several other types of information
may be considered in the evaluation and
interpretation of human and animal
data. Information on pharmacokinetics
and structure-activity relationships may
be very useful, but is often lacking for
developmental toxicity risk
assessments.
a. Pharmacokinetics. Extrapolation of
toxicity data between species can be
aided considerably by the availability of
data on the pharmacokinetics of a
particular agent in the species tested
and, when available, in humans.
Information on absorption, half-life,
steady-state and/or peak plasma
concentrations, placental metabolism
and transfer, excretion in breast milk,
comparative metabolism, and
concentrations of the parent compound
and metabolites may be useful in
predicting risk for developmental
toxicity. Such data may also be helpful
in defining the dose-response curve,
developing a more accurate comparison
of species sensitivity (Wilson et al.,
1975, 1977), determining dosimetry at
target sites, and comparing
pharmacokinetic profiles for various
dosing regimens or routes of exposure.
Pharmacokinetic studies in
developmental toxicology are most
useful if conducted in animals at the
stage when developmental insults occur.
The correlation of pharmacokinetic
parameters and developmental toxicity
data may be useful in determining the
contribution of specific pharmacokinetic
parameters to the effects observed
(Kimmel and Young, 1983).
While human pharmacokinetic data
are often lacking, absorption data in
laboratory animals for studies
conducted by any relevant route of
exposure may assist in the
interpretation of the developmental
toxicity studies in the animal models for
the purposes of risk assessment. Specific
guidance regarding both the
development and application of
pharmacokinetic data was agreed upon
by the participants at the "Workshop on
the Acceptability and Interpretation of
Dermal Developmental Toxicity
Studies" (Kimmel and Francis, 1990). It
was concluded that absorption data are
needed both when a dermal
developmental toxicity study shows no
developmental effects, as well as when
developmental effects are seen. The
results of a dermal developmental
toxicity study showing no adverse
developmental effects and without
blood level data (as evidence of dermal
absorption) are potentially misleading
and would be insufficient for risk
assessment, especially if interpreted as
a "negative" study. In studies where
developmental toxicity is detected,
regardless of the route of exposure,
absorption data can be used to establish
the internal dose in maternal animals for
risk extrapolation purposes.
b. Comparisons of Molecular
Structure. Comparisons of the chemical
or physical properties of an agent with
those known to cause developmental
toxicity may indicate a potential for
developmental toxicity. Such
information may be helpful in setting
priorities for testing of agents or for
evaluation of potential toxicity when
only minimal data are available.
Structure-activity relationships have not
been well studied in developmental
toxicology, although data are available
that suggest structure-activity
relationships for certain classes of
chemicals (e.g., glycol ethers, steroids,
retinoids). Under certain circumstances
(e.g., in the case of new chemicals), this
is one of several procedures used to
evaluate the potential for toxicity when
little or no data are available.
B. Dose-Response Evaluation
The evaluation of dose-response
relationships for developmental toxicity
includes the evaluation of data from
both human and animal studies. When
quantitative dose-response data are
available in humans and with sufficient
range of exposure, dose-response
relationships may be examined. Since
data on human dose-response
relationships have been available
infrequently, the dose-response
evaluation is usually based on the
assessment of data from tests performed
in laboratory animals.
Evidence for a dose-response
relationship is an important criterion in
the assessment of developmental
toxicity, which is usually based on
limited data from standard studies using
three dose groups and a control group.
Most agents causing developmental
toxicity in humans alter development at
doses within a narrow range near the
lowest maternally toxic dose (Kimmel et
al., 1984). Therefore, for most agents, the
exposure situations of concern will be
those that are potentially near the
maternally toxic dose range. For those
few agents that produce developmental
effects at much lower levels than
maternal effects, the potential for
exposing the conceptus to damaging
doses is much greater than when the
maternal and developmental toxic doses
are similar. As mentioned previously
(Section III.A.1.b), however, traditional
dose-response relationships may not
always be observed for some end
points. For example, as exposure
increases, embryolethal levels may be
reached, resulting in an observed
decrease in malformations with
increasing dose (Wilson, 1973; Selevan
and Lemasters, 1987). The potential for
this response pattern indicates that
dose-response relationships of
individual end points as well as
combinations of end points (e.g., dead
and malformed combined) must be
carefully examined and interpreted.
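The response pattern described above can be made concrete with a small tabulation. The counts below are hypothetical and were chosen only to show how the incidence of malformations can fall at an embryolethal dose while the combined incidence of dead and malformed continues to rise.

```python
# Hypothetical counts per dose group:
# (dose in mg/kg/day, implants, dead implants, malformed live fetuses)
GROUPS = [
    (0, 200, 10, 2),
    (10, 200, 14, 10),
    (30, 200, 40, 30),
    (100, 200, 150, 12),
]

def incidence_table(groups):
    """Per-implant incidence of each end point and of the combined end point."""
    rows = []
    for dose, implants, dead, malformed in groups:
        rows.append({
            "dose": dose,
            "dead_per_implant": dead / implants,
            "malformed_per_implant": malformed / implants,
            "affected_per_implant": (dead + malformed) / implants,
        })
    return rows

for row in incidence_table(GROUPS):
    print(f"dose {row['dose']:>3}: dead {row['dead_per_implant']:.2f}, "
          f"malformed {row['malformed_per_implant']:.2f}, "
          f"dead + malformed {row['affected_per_implant']:.2f}")
```

In this example the malformation incidence alone would suggest a weaker effect at the highest dose, whereas the combined end point increases monotonically with dose.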
The evaluation of dose-response
relationships includes the identification
of effective dose levels as well as doses
that are associated with no increased
incidence of adverse effects when
compared with controls. Much of the
focus is on the identification of the
critical effect(s) (i.e., the adverse
effect(s) observed at the lowest dose
level) and the LOAEL and NOAEL
associated with that developmental
effect, which may be any of the four
manifestations of developmental
toxicity. The NOAEL is defined as the
highest dose at which there is no
statistically or biologically significant
increase in the frequency of an adverse
effect in any of the possible
manifestations of developmental
toxicity when compared with the
appropriate control group in a data base
characterized as having sufficient
evidence for use in a risk assessment
(see Section III.C). The LOAEL is the
lowest dose at which there is a
statistically or biologically significant
increase in the frequency of adverse
developmental effects when compared
with the appropriate control group in a
data base characterized as having
sufficient evidence. Although a
threshold is assumed for developmental
effects, the existence of a NOAEL in an
animal study does not prove or disprove
the existence or level of a biological
threshold; it only defines the highest
level of exposure under the conditions of
the study that is not associated with a
significant increase in adverse effects.
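The statistical part of the NOAEL and LOAEL definitions can be sketched as a comparison of each dose group with the control group. The example below uses a one-sided Fisher exact test on hypothetical counts. It treats each fetus as independent and ignores litter effects and biological significance, both of which require judgment, so it is a simplified illustration rather than the Agency's procedure.

```python
from math import comb

def fisher_one_sided(a, n_treated, c, n_control):
    """One-sided Fisher exact p-value: probability of a or more affected in
    the treated group given the margins (hypergeometric null)."""
    affected = a + c
    denominator = comb(n_treated + n_control, affected)
    upper = min(n_treated, affected)
    return sum(comb(n_treated, k) * comb(n_control, affected - k)
               for k in range(a, upper + 1)) / denominator

def noael_loael(groups, alpha=0.05):
    """groups: (dose, n, affected) tuples sorted by dose, control first.
    Returns (NOAEL, LOAEL); either may be None."""
    _, n_control, x_control = groups[0]
    noael = None
    for dose, n, x in groups[1:]:
        if fisher_one_sided(x, n, x_control, n_control) < alpha:
            return noael, dose  # lowest dose with a significant increase
        noael = dose            # highest dose so far without one
    return noael, None

# Hypothetical study with a control and three dose groups of 20 each.
GROUPS = [(0, 20, 1), (10, 20, 1), (30, 20, 3), (100, 20, 9)]
print(noael_loael(GROUPS))
```

In this example the 30 mg/kg/day group has three times the control incidence yet is not statistically distinguishable from control with 20 animals per group, which illustrates why a NOAEL does not establish the absence of an effect.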
Several limitations in the use of the
NOAEL have been described (Gaylor,
1983; Crump, 1984; Kimmel and Gaylor,
1988; Gaylor, 1989; Brown and Erdreich,
[...]): (1) Use of the NOAEL [...]
dose chosen for the NOAEL; (5) Since
the NOAEL is defined as a dose that
does not produce an observed increase
in adverse responses from control levels
and is dependent on the power of the
study, theoretically, the risk associated
with it may fall anywhere between zero
and an incidence just below that
detectable from control levels (usually
in the range of 7% to 10% for quantal
data). Crump (1984) and Gaylor (1989)
have estimated the upper confidence
limit on risk at the NOAEL to be 2% to
6% for specific developmental end
points from several data sets.
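The dependence of the risk at the NOAEL on study size can be illustrated with an exact one-sided upper confidence limit on a proportion. The sketch below is not the calculation used by Crump (1984) or Gaylor (1989); it bounds the incidence in a single group rather than the excess over control, and the group sizes are hypothetical.

```python
from math import comb

def binomial_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))

def upper_confidence_limit(x, n, confidence=0.95):
    """Exact (Clopper-Pearson) one-sided upper limit on the true incidence
    when x of n are affected, found by bisection."""
    if x >= n:
        return 1.0
    lo, hi = x / n, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if binomial_cdf(x, n, mid) > 1 - confidence:
            lo = mid  # incidence this low is still compatible with the data
        else:
            hi = mid
    return hi

for group_size in (20, 100):
    limit = upper_confidence_limit(0, group_size)
    print(f"0 affected of {group_size}: upper limit {limit:.3f}")
```

Even with no affected individuals observed, a small group leaves a substantial incidence that cannot be excluded.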
Because of the limitations associated
with the use of the NOAEL (Kimmel and
Gaylor, 1988; Gaylor, 1989; Kimmel,
1990), the Agency is evaluating the use
of an additional approach for more
quantitative dose-response evaluation
when sufficient data are available, i.e.,
the benchmark dose (Crump, 1984). The
benchmark dose is based on a model-
derived estimate of a particular
incidence level, such as 10% incidence.
More specifically, the benchmark dose
(BD) is derived by modeling the data in
the observed range, selecting an
incidence level within or near the
observed range (e.g., the effective dose
to produce a 10% increased incidence of
response, the ED10), and determining the
upper confidence limit on the model.
The upper confidence value
corresponding to, for example, a 10%
excess in response is used to derive the
BD which is the lower confidence limit
on dose for that level of excess
response, in this case, the LED10 (see
Figure 1).
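The derivation of an ED10 and LED10 can be sketched numerically. The example below assumes a quantal-linear (one-hit) model, P(d) = p0 + (1 - p0)(1 - exp(-b d)), defines excess response as extra risk over background, and obtains the confidence limit from the profile likelihood. The model, the data, and the function names are all assumptions made for illustration; the Guidelines do not prescribe a particular model.

```python
import math

# Hypothetical quantal data: (dose in mg/kg/day, number examined, number affected)
DATA = [(0, 50, 2), (10, 50, 5), (30, 50, 12), (100, 50, 30)]

def log_likelihood(p0, b, data):
    """Binomial log-likelihood for P(d) = p0 + (1 - p0) * (1 - exp(-b * d))."""
    total = 0.0
    for dose, n, affected in data:
        p = p0 + (1.0 - p0) * (1.0 - math.exp(-b * dose))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
        total += affected * math.log(p) + (n - affected) * math.log(1.0 - p)
    return total

def argmax_unimodal(f, lo, hi, iters=100):
    """Ternary search for the maximizer of a unimodal function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0

def profile_ll(b, data):
    """Log-likelihood at slope b, maximized over the background rate p0."""
    p0 = argmax_unimodal(lambda p: log_likelihood(p, b, data), 0.0, 0.999)
    return log_likelihood(p0, b, data)

def benchmark_dose(data, excess=0.10):
    """Return (ED, LED): the maximum likelihood dose for the given extra risk
    and its one-sided 95% lower confidence limit."""
    b_max = 10.0 / max(dose for dose, _, _ in data)  # assumed search bound
    b_hat = argmax_unimodal(lambda b: profile_ll(b, data), 0.0, b_max)
    # Upper 95% limit on the slope: the profile log-likelihood falls by
    # half the 90th percentile of chi-square with 1 df (2.706).
    cutoff = profile_ll(b_hat, data) - 2.706 / 2.0
    lo, hi = b_hat, b_max
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if profile_ll(mid, data) > cutoff:
            lo = mid
        else:
            hi = mid
    scale = -math.log(1.0 - excess)  # extra risk = 1 - exp(-b * d)
    return scale / b_hat, scale / hi

ed10, led10 = benchmark_dose(DATA)
print(f"ED10 = {ed10:.1f} mg/kg/day, LED10 = {led10:.1f} mg/kg/day")
```

The upper confidence limit on the fitted slope corresponds to the lower confidence limit on dose, so the LED10 is always below the ED10.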
Various mathematical approaches
have been proposed for deriving the
benchmark dose for developmental
toxicity data (e.g., Crump, 1984; Rai and
Van Ryzin, 1985; Kimmel and Gaylor,
1988; Faustman et al., 1988; Chen and
Kodell, 1989; Kodell et al., 1991). Such
models may be used to calculate the
benchmark dose, and the particular
model used may be less critical since
estimation of the benchmark dose is
limited to the observed dose range.
Since the model is only used to fit the
observed data, the assumptions about
the existence or nonexistence of a
threshold are not as pertinent. Thus,
models that fit the empirical data well
may provide a reasonable estimate of
the benchmark dose, although biological
factors known to influence data should
be incorporated into the model (e.g.,
intralitter correlations, correlations
among end points [Ryan et al., 1991]).
The Agency is currently conducting
studies to evaluate the application of
several models to actual data sets for
calculating the benchmark dose, to
determine the minimum data required
for modeling, and to develop methods
for application to continuous data. In
addition, information from these studies
will be used to develop guidance for
application of the benchmark dose
approach to the calculation of the RfDDT
or the RfCDT, since the Agency has
limited experience with this approach
(see Section III.D for a discussion of the
RfDDT and RfCDT).
Using the benchmark dose approach,
an LED10 can be calculated for each
effect of an agent for which there is a
data base with sufficient evidence to
conduct a risk assessment. In some
cases, the data may be sufficient to also
estimate the ED05 or ED01, which should
be closer to a true no-effect dose. A
level between the ED01 and the ED10
usually corresponds to the lowest level
of risk that can be estimated for
binomial end points from standard
developmental toxicity studies.
Certain principles are especially
applicable for determining the NOAEL,
LOAEL, and benchmark dose for
developmental toxicity studies. First, the
NOAEL, LOAEL, or benchmark dose are
identified for both developmental and
maternal or adult toxicity, based on the
information available from studies in
which developmental toxicity has been
evaluated. The NOAEL, LOAEL, or
benchmark dose for maternal or adult
toxicity should be compared with the
corresponding values from other adult
toxicity data to determine if the
pregnant or lactating female or the
paternal animal (if exposure is prior to
mating) may be more sensitive to an
agent than adult males or nonpregnant
females in other toxicity studies that
generally involve longer exposure times.
Second, for developmental toxic
effects, a primary assumption is that a
single exposure at a critical time in
development may produce an adverse
developmental effect, i.e., repeated
exposure is not a necessary prerequisite
for developmental toxicity to be
manifested. In most cases, however, the
data available for developmental
toxicity risk assessment are from studies
using exposures over several days of
development, and the NOAEL, LOAEL,
and/or benchmark dose is most often
based on a daily dose, e.g., mg/kg/day.
Usually, the daily dose is not adjusted
for duration of exposure because
appropriate pharmacokinetic data are
not available. In cases where such data
are available, adjustments may be made
to provide an estimate of equal average
concentration at the site of action for the
human exposure scenario of concern.
For example, inhalation studies often
use 6 hr/day exposures during
development. If the human exposure
scenario is continuous and
pharmacokinetic data indicate an
accumulation with continuous exposure,
appropriate adjustments can be made.
If, on the other hand, the human
exposure scenario of concern is very
brief or intermittent, pharmacokinetic
data indicating a long half-life may also
require adjustment of dose. When
quantitative absorption data by any
route of exposure are available, the
NOAEL may be adjusted accordingly;
e.g., absorption of 50% of administered
dose could result in a 50% reduction in
the NOAEL. If absorption in the
experimental species has been
determined, but human absorption is not
known, human absorption is generally
assumed to be the same as that for the
species with the greatest degree of
absorption. NOAELs from inhalation
exposure studies are adjusted to derive
a human equivalent concentration
(HEC) by taking into account known
anatomical and physiological species
differences (e.g., minute volume,
respiratory rate, etc.) (U.S. EPA, 1991b).
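The arithmetic behind two of the adjustments mentioned above can be sketched as follows. The absorption adjustment follows the 50% example in the text. The duration adjustment assumes simple proportionality between concentration and exposure time, which is an assumption made here for illustration and would need to be supported by pharmacokinetic data; the function names and input values are hypothetical.

```python
def absorption_adjusted_noael(noael_mg_kg_day, fraction_absorbed):
    """Scale an administered-dose NOAEL by the fraction of dose absorbed."""
    if not 0.0 < fraction_absorbed <= 1.0:
        raise ValueError("fraction_absorbed must be in (0, 1]")
    return noael_mg_kg_day * fraction_absorbed

def continuous_equivalent(concentration, hours_per_day):
    """Time-weighted average concentration over 24 hours, assuming effect is
    proportional to concentration x time (an illustrative assumption)."""
    if not 0.0 < hours_per_day <= 24.0:
        raise ValueError("hours_per_day must be in (0, 24]")
    return concentration * hours_per_day / 24.0

# Hypothetical values: a 10 mg/kg/day NOAEL with 50% absorption, and a
# 100 ppm inhalation NOAEL from a 6 hr/day exposure study.
print(absorption_adjusted_noael(10.0, 0.5))
print(continuous_equivalent(100.0, 6.0))
```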
In summary, the dose-response
evaluation identifies the NOAEL,
LOAEL, or benchmark dose, defines the
range of doses for a given agent that are
effective in producing developmental
and maternal toxicity, the route, timing
and duration of exposure, species
specificity of effects, and any
pharmacokinetic or other considerations
that might influence the comparison
with human exposure scenarios. This
information should always accompany
the characterization of the health-
related data base (discussed in the next
section).
C. Characterization of the Health-
Related Data Base
This section describes the process for
evaluating the health-related data base
as a whole on a particular agent and
provides criteria for characterizing the
evidence for judging a potential
developmental hazard in humans within
the context of expected exposure or
dose. This determination provides the
basis for judging whether or not there
are sufficient data for proceeding further
in the risk assessment process. This
section does not address the nature and
magnitude of human health risks which
are discussed as part of the final
characterization of risk along with
estimates of potential human exposure
and the relevancy of available data for
estimating human risk. Characterization
of hazard potential within the context of
exposure or dose should assist the risk
assessor in clarifying the strengths and
uncertainties associated with a
particular data base. Because a complex
interrelationship exists among study
design, statistical analysis, and
biological significance of the data, a
great deal of scientific judgment, based
on experience with developmental
toxicity data and with the principles of
study design and statistical analysis,
may be required to adequately evaluate
the data base. Scientific judgment is
always necessary, and in many cases,
interaction with scientists in specific
disciplines (e.g., developmental
toxicology, epidemiology, statistics) is
recommended.
A categorization scheme for
characterizing the evidence for
developmental toxicity is presented in
Table 3. The categorization scheme
contains two broad categories, sufficient
evidence and insufficient evidence,
which are defined in the table. Data
from all available studies, whether
indicative of potential hazard or not,
must be evaluated and factored into a
judgment as to the strength of evidence
available to support a complete risk
assessment for developmental toxicity.
The primary considerations are the
human data, if available, and the
experimental animal data. The judgment
of whether the data are sufficient or
insufficient should consider quality of
the data, power of the studies, number
and types of end points examined,
replication of effects, relevance of the
test species to humans, relevance of
route and timing of exposure for both
human and animal studies,
appropriateness of the dose selection in
animal studies, and number of species
examined. In addition, pharmacokinetic
data and structure-activity
considerations, data from other toxicity
studies, as well as other factors that
may affect the strength of the evidence,
should be taken into account.
Table 3.—Categorization of the Health-
Related Data Base for Hazard Identifica-
tion/Dose-Response Evaluation
Sufficient Evidence
The sufficient evidence category includes
data that collectively provide enough infor-
mation to judge whether or not a human
developmental hazard could exist within the
context of dose, duration, timing and route of
exposure. This category includes both human
and experimental animal evidence.
Sufficient Human Evidence: This category
includes data from epidemiologic studies
(e.g., case control and cohort) that provide
convincing evidence for the scientific com-
munity to judge that a causal relationship is
or is not supported. A case series in conjunc-
tion with strong supporting evidence may
also be used. Supporting animal data may or
may not be available.
Sufficient Experimental Animal Evidence/
Limited Human Data: This category includes
data from experimental animal studies and/
or limited human data that provide convinc-
ing evidence for the scientific community to
judge if the potential for developmental tox-
icity exists. The minimum evidence neces-
sary to judge that a potential hazard exists
generally would be data demonstrating an
adverse developmental effect in a single, ap-
propriate, well-conducted study in a single
experimental animal species. The minimum
evidence needed to judge that a potential
hazard does not exist would include data
from appropriate, well-conducted laboratory
animal studies in several species (at least
two) which evaluated a variety of the poten-
tial manifestations of developmental toxicity,
and showed no developmental effects at
doses that were minimally toxic to the adult.
Insufficient Evidence
This category includes situations for which
there is less than the minimum sufficient
evidence necessary for assessing the poten-
tial for developmental toxicity, such as when
no data are available on developmental tox-
icity, as well as for data bases from studies
in animals or humans that have a limited
study design (e.g., small numbers, inappro-
priate dose selection/exposure information,
other uncontrolled factors), or data from a
single species reported to have no adverse
developmental effects, or data bases limited
to information on structure/activity relation-
ships, short-term tests, pharmacokinetics, or
metabolic precursors.
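The minimum-evidence rules for experimental animal data in Table 3 can be sketched as a simple decision function. This is a deliberately reduced illustration: it ignores human data, the variety of end points examined, and the scientific judgment the categorization requires, and the class and field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class AnimalStudy:
    species: str
    adequate: bool                # appropriate, well-conducted study
    adverse_effect: bool          # adverse developmental effect demonstrated
    adult_toxic_dose_tested: bool = False  # doses reached minimal adult toxicity

def categorize(studies):
    """Apply the minimum-evidence rules for animal data only."""
    adequate = [s for s in studies if s.adequate]
    # One adequate positive study in a single species is the minimum
    # needed to judge that a potential hazard exists.
    if any(s.adverse_effect for s in adequate):
        return "sufficient evidence: potential hazard"
    # Judging that a hazard does not exist needs adequate negative studies
    # in at least two species at doses minimally toxic to the adult.
    negative_species = {s.species for s in adequate if s.adult_toxic_dose_tested}
    if len(negative_species) >= 2:
        return "sufficient evidence: no apparent hazard"
    return "insufficient evidence"

print(categorize([AnimalStudy("rat", True, False, True)]))
```

A single adequate negative study therefore falls in the insufficient evidence category, consistent with the table's treatment of data from a single species reported to have no adverse developmental effects.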
In general, the categorization is based
on criteria that define the minimum
evidence necessary to conduct a hazard
identification/dose-response evaluation.
Establishing the minimum sufficient
human evidence necessary to do a
hazard identification/dose-response
evaluation is difficult, since there are
often considerable variations in study
designs and study group selection. The
body of human data should contain
convincing evidence as described in the
"Sufficient Human Evidence" category.
Because the human data necessary to
judge whether or not a causal
relationship exists are generally limited,
there are currently few agents that can
be classified in this category. In the case
of animal data, agents that have been
tested adequately in laboratory animals
according to current test guidelines
generally would be included in the
"Sufficient Experimental Animal
Evidence/Limited Human Data"
category. The strength of evidence for a
data base increases with replication of
the findings and with additional animal
species tested. Information on
pharmacokinetics or mechanisms, or on
more than one route of exposure may
reduce uncertainties in extrapolation to
the human.
More evidence is necessary to judge
that an agent is unlikely to pose a
hazard for developmental toxicity than
that required to judge a potential
hazard. This is because it is more
difficult, both biologically and
statistically, to support a finding of no
apparent adverse effect than a finding of
an adverse effect. For example, to judge
that a hazard for developmental toxicity
could exist for a given agent, the
minimum evidence necessary would be
data from a single, appropriate, well-
executed study in a single experimental
animal species that demonstrate
developmental toxicity, and/or
suggestive evidence from adequately
conducted clinical/epidemiologic
studies. On the other hand, to judge that
an agent is unlikely to pose a hazard for
developmental toxicity, the minimum
evidence would include data from
appropriate, well-executed laboratory
animal studies in several species (at
least two) which evaluated a variety of
the potential manifestations of
developmental toxicity and showed no
adverse developmental effects at doses
that were minimally toxic to the adult
animal. In addition, there may be human
data from appropriate studies
supportive of no adverse developmental
effects.
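As an illustration only, the minimum-evidence rules for animal data described above can be sketched as a small decision function. The class, field names, and return strings are hypothetical and are not part of these Guidelines; real categorization depends on scientific judgment about study quality and conflicting results that a pair of counts cannot capture.

```python
from dataclasses import dataclass


@dataclass
class AnimalEvidence:
    # Hypothetical simplification of judgments the risk assessor makes.
    # Species showing an adverse developmental effect in an appropriate,
    # well-executed study.
    species_with_effect: int
    # Species tested in appropriate, well-executed studies that evaluated a
    # variety of end points at doses up to minimal adult toxicity and showed
    # no adverse developmental effect.
    species_without_effect: int


def minimum_animal_evidence(ev: AnimalEvidence) -> str:
    """Apply the minimum-evidence rules to animal data alone."""
    # One adequate positive study in one species is enough to judge
    # that a potential hazard could exist.
    if ev.species_with_effect >= 1:
        return "sufficient: potential hazard"
    # Judging that a hazard is unlikely needs negative data in at
    # least two species.
    if ev.species_without_effect >= 2:
        return "sufficient: hazard unlikely"
    return "insufficient evidence"
```

The asymmetry is visible in the thresholds: one positive species suffices for a potential hazard, while a single negative species falls into the insufficient category.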
If a data base on a particular agent
includes less than the minimum
sufficient evidence (as defined in the
"Insufficient Evidence" category)
necessary for a risk assessment, but
some data are available, this
information could be used to determine
the need for additional testing. In the
event that a substantial data base exists
for a given chemical, but no single study
meets current test guidelines, the risk
assessor should use scientific judgment
to determine whether the composite
data base may be viewed as meeting the
"Sufficient Evidence" criteria. In some
cases, a data base may contain
conflicting data. In these instances, the
risk assessor must consider each study's
strengths and weaknesses within the
context of the overall data base in an
attempt to define the strength of
evidence of the data base for assessing
the potential for developmental toxicity.
Judging that the health-related data
base is sufficient to indicate a potential
developmental hazard does not mean
that the agent will be a hazard at every
exposure level (because of the
assumption of a threshold) or in every
situation (e.g., hazard may vary
significantly depending on route and
timing of exposure). In the final risk
characterization, the characterization of
the health-related data base should
always be presented with information
on the dose-response evaluation (e.g.,
LOAEL, NOAEL, and/or benchmark
dose), exposure route, timing and
duration of exposure, and with the
human exposure estimate.
D. Determination of the Reference Dose
(RfDDT) or Reference Concentration
(RfCDT) for Developmental Toxicity
The RfDDT or RfCDT is an estimate of a
daily exposure to the human population
that is assumed to be without
appreciable risk of deleterious
developmental effects. The use of the
subscript DT is intended to distinguish
these terms from the reference dose
(RfD) for oral or dermal exposure or the
reference concentration (RfC) for
inhalation exposure, terms that refer
primarily to chronic exposure situations
(U.S. EPA, 1991b). The RfDDT or RfCDT is
derived by applying uncertainty factors
to the NOAEL (or the LOAEL, if a
NOAEL is not available), or the
benchmark dose. To date, the Agency
has applied uncertainty factors only to
the NOAEL or LOAEL to derive an
RfDDT or RfCDT. The Agency is planning
eventually to use the benchmark dose
approach as the basis for derivation of
the RfDDT or RfCDT and will develop
guidance as information is acquired and
analyzed from ongoing Agency studies.
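The derivation described here amounts to dividing the NOAEL (or LOAEL) by the product of the applicable uncertainty factors. The sketch below illustrates that arithmetic under stated assumptions: the function name and the dose values are hypothetical, and the factors shown follow the defaults discussed in this section (10-fold interspecies, 10-fold intraspecies, and an additional factor of up to 10 when only a LOAEL is available).

```python
import math


def reference_dose_dt(point_of_departure, uncertainty_factors):
    """Divide a NOAEL or LOAEL by the product of the uncertainty factors.

    point_of_departure: NOAEL or LOAEL (units, e.g. mg/kg/day, carry through).
    uncertainty_factors: iterable of factors, each >= 1.
    """
    if point_of_departure <= 0:
        raise ValueError("point of departure must be positive")
    factors = list(uncertainty_factors)
    if any(uf < 1 for uf in factors):
        raise ValueError("uncertainty factors must be >= 1")
    return point_of_departure / math.prod(factors)


# Hypothetical numbers, for illustration only.
# NOAEL of 50 mg/kg/day with interspecies and intraspecies factors.
rfd_from_noael = reference_dose_dt(50.0, [10, 10])
# Same dose identified only as a LOAEL, with an additional factor of 10.
rfd_from_loael = reference_dose_dt(50.0, [10, 10, 10])
```

Passing the factors as a list keeps each source of uncertainty visible, so the rationale for every factor can be documented alongside the result.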
The most sensitive developmental
effect (i.e., the critical effect) from the
most appropriate and/or sensitive
mammalian species is used for
determining the NOAEL, LOAEL, or the
benchmark dose in deriving the RfDDT or
RfCDT (Section III.B). Uncertainty factors
(UFs) for developmental and maternal
toxicity applied to the NOAEL generally
include a 10-fold factor for interspecies
variation and a 10-fold factor for
intraspecies variation. In general, an
uncertainty factor is not applied to
account for duration of exposure.
Additional factors may be applied to
account for other uncertainties or
additional information that may exist in
the data base. For example, the
standard study design for a
developmental toxicity study calls for a
low dose that demonstrates a NOAEL,
but in some cases, the lowest dose
administered may cause significant
adverse effect(s) and, thus, be identified
as the LOAEL. In circumstances where
only a LOAEL is available, the use of an
additional uncertainty factor of up to 10
may be required, depending on the
sensitivity of the end points evaluated,
adequacy of dose levels tested, or
general confidence in the LOAEL. In
addition, if a benchmark dose has been
calculated, it may be used to help
interpret how [...]
-------
to be manifested, although it should be
considered in cases where there is
evidence of cumulative exposure or
where the half-life of the agent is
sufficiently long to produce an
increasing body burden over time).
Therefore, it is assumed that, in most
cases, a single exposure at any of
several developmental stages may be
sufficient to produce an adverse
developmental effect. Most of the data
available for risk assessment involve
exposures over several days of
development. Thus, human exposure
estimates used to calculate margins of
exposure (MOE, see following section)
or to compare with the RfDDT or RfCDT
are usually based on a daily dose that is
not adjusted for duration or pattern of
exposure. For example, it would be
inappropriate in developmental toxicity
risk assessments to use time-weighted
averages or adjustment of exposure over
a different time frame than that actually
encountered (such as the adjustment of
a 6-hour inhalation exposure to account
for a 24-hour exposure scenario), unless
pharmacokinetic data were available to
indicate an accumulation with
continuous exposure. In the case of
intermittent exposures, examination of
the peak exposure(s), as well as the
average exposure over the time period
of exposure, would be important.
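The contrast between an unadjusted exposure and a time-weighted average can be made concrete with a short sketch. The function names, the `accumulation_shown` flag, and the numbers are hypothetical; the flag stands in for the pharmacokinetic evidence of accumulation mentioned above, which in practice is a matter of scientific judgment rather than a switch.

```python
def time_weighted_average(concentration, hours_exposed, hours_averaged=24.0):
    """Spread an exposure over a longer averaging period."""
    return concentration * hours_exposed / hours_averaged


def exposure_for_comparison(concentration, hours_exposed, accumulation_shown=False):
    """Exposure value to compare with the RfC for developmental toxicity.

    By default the exposure actually encountered is used unadjusted.
    The time-weighted adjustment is applied only when pharmacokinetic
    data indicate accumulation with continuous exposure.
    """
    if accumulation_shown:
        return time_weighted_average(concentration, hours_exposed)
    return concentration


def summarize_intermittent(daily_exposures):
    """Return (peak, average) over the period of exposure."""
    peak = max(daily_exposures)
    average = sum(daily_exposures) / len(daily_exposures)
    return peak, average


# Hypothetical 6-hour inhalation exposure at 100 concentration units.
unadjusted = exposure_for_comparison(100.0, 6.0)
adjusted = exposure_for_comparison(100.0, 6.0, accumulation_shown=True)
```

With these inputs the adjusted value is one quarter of the unadjusted one, which shows how routine time-weighting could understate an exposure that is sufficient to produce an effect on a single day.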
It should be recognized that, based on
the definition used in these Guidelines
for developmental toxicity, exposure of
almost any segment of the human
population may lead to risk to the
developing organism. This would
include fertile men and women, the
developing embryo and fetus, and
children up to the age of sexual
maturation. Although some effects of
developmental exposures may be
manifested while the exposure is
occurring (e.g., spontaneous abortion,
structural abnormality present at birth,
childhood mental retardation), some
effects may not be detectable until later
in life, long after exposure has ceased
(e.g., perinatally induced carcinogenesis,
impaired reproductive function,
shortened lifespan).
V. Risk Characterization
A. Overview
Risk characterization is the
culmination of the risk assessment
process. In this final step, risk
characterization involves integration of
the toxicity information from the hazard
identification/dose-response evaluation
with the human exposure estimates and
provides an evaluation of the overall
quality of the assessment, describes risk
in terms of the nature and extent of
harm, and communicates the results of
the risk assessment to a risk manager.
The risk manager can then use the risk
assessment, along with other risk
management elements, to make public
health decisions. The following sections
describe these three aspects of the risk
characterization in more detail, but do
not attempt to provide a full discussion
of risk characterization. Rather, these
Guidelines point out issues that are
important to risk characterization for
developmental toxicity.
B. Integration of the Hazard
Identification/Dose-Response
Evaluation and Exposure Assessment
In developing the hazard
identification/dose-response and
exposure portions of the risk
assessment, the risk assessor makes
many judgments concerning human
relevance of the toxicity data, including
the appropriateness of the various
animal models for which data are
available, the route, timing, and duration
of exposure relative to expected human
exposure, etc. These judgments should
be summarized at each stage of the risk
assessment process (e.g., the biological
relevance of anatomical variations may
be made in the hazard identification
process, or species differences in
metabolic patterns in the dose-response
evaluation). When data are not
available to make such judgments, as is
often the case, the background
information and assumptions discussed
in the Introduction (Section I) provide a
default position. The risk assessor must
determine if some of these judgments
have implications for other portions of
the assessment, and whether the various
components of the assessment are
compatible.
The description of the relevant data
should convey the major strengths and
weaknesses of the assessment that arise
from availability of data and the current
limits of understanding of the
mechanisms of toxicity. Confidence in
the results of a risk assessment is a
function of confidence in the results of
the analysis of these elements. Each of
these elements should have its own
characterization as a part of it.
Interpretation of data should be
explained, and the risk manager should
be given a clear picture of consensus or
lack of consensus that exists about
significant aspects of the assessment.
Whenever more than one view is
supported by the data and choosing
between them is difficult, both views
should be presented. If one has been
selected over another, the rationale
should be given; if not, then both should
be presented as plausible alternative
results.
The risk characterization should not
only examine the judgments, but also
explain the constraints of available data
and the state of knowledge about the
phenomena studied in making them,
including:
• The qualitative conclusions about
the likelihood that the agent may pose a
specific hazard to human health, the
nature of the observed effects, under
what conditions (route, dose levels,
time, and duration) of exposure these
effects occur, and whether the health-
related data are sufficient to use in a
risk assessment;
• A discussion of the dose-response
patterns for the critical effect(s), data
such as the shapes and slopes of the
dose-response curves for the various
end points, the rationale behind the
determination of the NOAEL, LOAEL,
and/or calculation of the benchmark
dose, and the assumptions underlying
the estimation of the RfDDT or RfCDT;
and
• The estimates of the magnitude of
human exposure, the route, duration,
and pattern of the exposure, relevant
pharmacokinetics, and the size and
characteristics of the populations
exposed.
The risk characterization of an agent
should be based on data from the most
appropriate species, or, if such
information is not available, on the most
sensitive species tested. It should also
be based on the most sensitive indicator
of toxicity, whether maternal, paternal,
or developmental, when such data are
available, and should be considered in
relationship to other forms of toxicity.
If data used in characterizing risk are
from a route of exposure other than the
expected human exposure, then
pharmacokinetic data should be used, if
available, to extrapolate across routes
of exposure. If such data are not
available, the Agency makes certain
assumptions concerning the amount of
absorption likely or the applicability of
the data from one route to another (U.S.
EPA, 198d985b).
The level of confidence in the hazard
identification/dose-response evaluation
should be stated to the extent possible,
including determination of the
appropriate category regarding
sufficiency of the health-related data. A
comprehensive risk assessment ideally
includes information on a variety of end
points that provide insight into the full
spectrum of developmental responses. A
profile that integrates both human and
test species data and incorporates a
broad range of developmental effects
provides more confidence in a risk
assessment for a given agent.
The ability to describe the nature of
human exposure is important for
prediction of specific outcomes and the
likelihood of permanence or reversibility
of the effect. An important part of this
effort is a description of the nature of
the exposed populations. For example,
the consequences of exposure to the
developing individual versus the adult
can differ markedly and again can
influence whether the effects are
transient or permanent. Other
considerations relative to human
exposure might include potential
synergistic effects, increased
susceptibility resulting from concurrent
exposures to other agents,
concurrent disease, and nutritional
status.
C. Descriptors of Developmental
Toxicity Risk
There are a number of ways to
describe risks. These include:
1. Estimation of the Number of
Individuals Exposed to Levels of
Concern
The RfDDT or RfCDT is assumed to be a
level at or below which no significant
risk occurs. Therefore, information on
the populations at or below the RfDDT or
RfCDT ("not likely to be at risk") and
above the RfDDT or RfCDT ("may be at
risk") may be useful information for risk
managers.
This method is particularly useful to a
risk manager considering possible
actions to ameliorate risk for a
population. If the number of persons in
the "at risk" category can be estimated,
then the number of persons potentially
removed from the "at risk" category
after a contemplated action is taken can
be used as an indication of the efficacy
of that action.
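A minimal sketch of this descriptor follows, assuming individual exposure estimates are available as a simple list in the same units as the RfDDT. The function names and the numbers are hypothetical and serve only to illustrate the counting.

```python
def count_at_risk(exposures, rfd_dt):
    """Number of individuals whose exposure is above the reference dose."""
    return sum(1 for exposure in exposures if exposure > rfd_dt)


def removed_from_at_risk(before, after, rfd_dt):
    """Persons moved out of the "at risk" category by a contemplated action."""
    return count_at_risk(before, rfd_dt) - count_at_risk(after, rfd_dt)


# Hypothetical exposure estimates (mg/kg/day) for five individuals,
# before and after an action that halves every exposure.
rfd_dt = 0.5
before_action = [0.2, 0.6, 0.9, 1.4, 0.4]
after_action = [x / 2 for x in before_action]

at_risk_before = count_at_risk(before_action, rfd_dt)
removed = removed_from_at_risk(before_action, after_action, rfd_dt)
```

Individuals exactly at the reference dose are counted as not at risk here, matching the "at or below" wording of the text.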
2. Presenting Specific Scenarios
Presenting specific scenarios in the
form of "what if?" questions is
particularly useful to give perspective to
the risk manager, especially where
criteria, tolerance limits, or media
quality limits are being set. The question
being asked in these cases is, "At this
proposed limit, what would be the
resulting risk for developmental toxicity
above
3. Risk Characterization for Highly
Exposed Individuals
This measure and the next are
examples of specific scenarios. The
purpose of this measure is to describe
the upper end of the exposure
distribution. This allows risk managers
to evaluate whether certain individuals
are at disproportionately high or
unacceptably high risk.
The objective of looking at the upper
end of the exposure distribution is to
derive a realistic estimate of a relatively
highly exposed individual(s), for
example by identifying a specified upper
percentile of exposure in the population
and/or by estimating the exposure of the
most highly exposed individual(s).
Whenever possible, it is important to
express the number of individuals who
comprise the highly exposed group and
discuss the potential for exposure at still
higher levels.
If population data are absent, it will
often be possible to describe a scenario
representing high end exposures using
upper percentile or judgment-based
values for exposure variables. In these
instances, caution should be taken not
to overestimate the high end values if a
"reasonable" exposure estimate is to be
achieved.
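When population data are available, the upper end of the exposure distribution can be described with a specified percentile. The sketch below is one way to do this, not a prescribed method: the function name, the choice of the 95th percentile, and the data are hypothetical, and the interpolation rule is that of Python's `statistics.quantiles` with `method="inclusive"`.

```python
import statistics


def high_end_exposure(exposures, percentile=95):
    """Return (cutoff, count) for a specified upper percentile.

    cutoff: the exposure at the given percentile of the data.
    count: the number of individuals at or above that cutoff.
    """
    if not 1 <= percentile <= 99:
        raise ValueError("percentile must be between 1 and 99")
    # n=100 yields 99 cut points; index p - 1 holds the p-th percentile.
    cut_points = statistics.quantiles(exposures, n=100, method="inclusive")
    cutoff = cut_points[percentile - 1]
    count = sum(1 for exposure in exposures if exposure >= cutoff)
    return cutoff, count


# Hypothetical population of 101 individuals with exposures 0, 1, ..., 100.
cutoff, n_high = high_end_exposure(list(range(101)), percentile=95)
```

Reporting the count alongside the cutoff addresses the point above that the number of individuals in the highly exposed group should be stated whenever possible.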
4. Risk Characterization for Highly
Sensitive or Susceptible Individuals
The purpose of this measure is to
quantify exposure to identified sensitive
or susceptible populations to the effect
of concern. Sensitive or susceptible
individuals are those within the exposed
population at increased risk of
expressing the adverse effect. All stages
of development might be considered
highly sensitive or susceptible, but
certain subpopulations can sometimes
be identified because of critical periods
for exposure; for example, pregnant or
lactating women, infants, children,
adolescents.
In general, not enough is understood
about the mechanisms of toxicity to
identify sensitive subgroups for all
agents, although factors such as
nutrition, personal habits (e.g., smoking,
alcohol consumption, illicit drug abuse),
or pre-existing disease (e.g., diabetes)
may predispose some individuals to be
more sensitive to the developmental
effects of various agents.
5. Other Risk Descriptors
In risk characterization, dose-
response information and the human
exposure estimates may be combined
either by comparing the RfDDT or RfCDT
and the human exposure estimate or by
calculating the margin of exposure
(MOE). The MOE is the ratio of the
NOAEL from the most appropriate or
sensitive species to the estimated
human exposure level from all potential
sources (U.S. EPA, 1985b). If a NOAEL is
not available, a LOAEL may be used in
the calculation of the MOE, but
considerations for the acceptability
would be different than when a NOAEL
is used. Considerations for the
acceptability of the MOE are similar to
those for the uncertainty factor applied to
the LOAEL, NOAEL, or the benchmark
dose. The MOE is presented along with
the characterization of the data base,
including the strengths and weaknesses
of the toxicity and exposure data, the
number of species affected, and the
dose-response, route, timing, and
duration information. The RfDDT or
RfCDT comparison with the human
exposure estimate and the calculation of
the MOE are conceptually similar but
are used in different regulatory
situations. If the MOE is equal to or
more than the uncertainty factor used as
a basis for an RfDDT or RfCDT, then the
need for regulatory concern is likely to
be reduced.
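The margin of exposure and its comparison with the uncertainty factor can be illustrated with a short sketch. The function names and the numbers are hypothetical; the comparison shown is only the screening relationship stated above, and does not replace the accompanying characterization of the data base.

```python
def margin_of_exposure(noael, human_exposure):
    """Ratio of the NOAEL to the estimated human exposure (same units)."""
    if human_exposure <= 0:
        raise ValueError("human exposure estimate must be positive")
    return noael / human_exposure


def concern_likely_reduced(moe, total_uncertainty_factor):
    """True when the MOE is equal to or more than the uncertainty factor
    that would be used as the basis for an RfDDT or RfCDT."""
    return moe >= total_uncertainty_factor


# Hypothetical values: NOAEL of 50 mg/kg/day, total uncertainty factor of 100.
moe_low_exposure = margin_of_exposure(50.0, 0.25)   # exposure of 0.25 mg/kg/day
moe_high_exposure = margin_of_exposure(50.0, 1.0)   # exposure of 1.0 mg/kg/day
```

With these inputs the lower exposure gives an MOE of 200, which exceeds the factor of 100, while the higher exposure gives an MOE of 50, which does not. This is the same conclusion that comparing each exposure with an RfDDT of 0.5 mg/kg/day would give, which shows why the two approaches are conceptually similar.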
The choice of approach is dependent
upon several factors, including the
statute involved, the situation being
addressed, the data base used, and the
needs of the decision maker. While
these methods of describing risk do not
actually estimate risks per se, they give
the risk manager some sense of how
close the exposures are to levels of
concern. The RfDDT, RfCDT, and/or the
MOE are considered along with other
risk assessment and risk management
issues in making risk management
decisions, and the scientific issues that
must be taken into account in
establishing them have been addressed
here.
E. Communicating Results
Once the risk characterization is
completed, the focus turns to
communicating results to the risk
manager. The risk manager uses the
results of the risk characterization, other
technologic factors, and
nontechnological social and economic
considerations in reaching a regulatory
decision. Because of the way in which
these risk management factors may
impact different cases, consistent but
not necessarily identical risk
management decisions must be made on
a case-by-case basis. Consequently, it is
entirely possible and appropriate that an
agent with a specific risk
characterization may be regulated
differently under different statutes.
These Guidelines are not intended to
give guidance on the nonscientific
aspects of risk management decisions.
VI. Summary and Research Needs
These Guidelines summarize the
procedures that the U.S. Environmental
Protection Agency uses in evaluating the
potential for agents to cause
developmental toxicity. While these are
the first amendments to the
developmental toxicity guidelines issued
in 1986, further revisions and updates
will be made as advances occur in the
field. These Guidelines discuss the
assumptions that should be made in risk
assessment for developmental toxicity
because of gaps in our knowledge about
underlying biological processes and how
these compare across species.
Research to improve the risk
assessment process is needed in a
number of areas. For example, research
is needed to delineate the mechanisms
of developmental toxicity and
pathogenesis, provide comparative
pharmacokinetic data, examine the
validity of short-term in vivo and in
vitro tests, elucidate possible functional
alterations and their critical periods of
exposure to toxic agents, develop
improved animal models to examine the
developmental effects of exposure
during the premating and early
postmating periods and in neonates,
further evaluate the relationship
between maternal and developmental
toxicity, provide insight into the concept
of threshold, develop approaches for
improved mathematical modeling of
adverse developmental effects, and
improve animal models for examining
the effects of agents given by various
routes of exposure. Epidemiologic
studies with quantitative measures of
exposure are also strongly encouraged.
Such research will aid in the evaluation
and interpretation of data on
developmental toxicity, and should
provide methods to more precisely
assess risk.
VII. References
Adams, J. (1986) Clinical relevance of
experimental behavioral teratology.
Neurotoxicology 7:19-34.
Anderson, L.M.; Donovan, P.J.; Rice, J.M.
(1985) Risk assessment for transplacental
carcinogens. In: Li, A.P., ed. New
approaches in toxicity testing and their
application in human risk assessment.
New York, NY: Raven Press, pp. 179-202.
Axelson, O. (1985) Epidemiologic methods in
the study of spontaneous abortions:
source of data, methods, and sources of
error. In: Hemminki, K.; Sorsa, M.;
Vainio, H., eds. Occupational hazards
and reproduction. Washington, DC:
Hemisphere Pub., pp. 231-236.
Baird, D.D.; Wilcox, A.J.; Weinberg, C.R.
(1986) Use of time to pregnancy to study
environmental exposures. Am. J.
Epidemiol. 124:470-480.
Bellinger, D.; Leviton, A.; Waternaux, C.;
Needleman, H.; Rabinowitz, M. (1987)
Longitudinal analyses of prenatal and
postnatal lead exposure and early
cognitive development. N. Engl. J. Med.
316:1037-1043.
Bloom, A.D. (1981) Guidelines for
reproductive studies in exposed human
populations. Report of Panel II. In:
Guidelines for studies of human
populations exposed to mutagenic and
reproductive hazards. White Plains, NY:
March of Dimes Birth Defects
Foundation, pp. 37-110.
Brown, J.M. (1984) Validation of an in vivo
screen for the determination of embryo/
fetal toxicity in mice. Prepared by SRI
International for the U.S. EPA,
Washington, DC, under EPA contract no.
68-01-5079.
Brown, N.A. (1987) Teratogenicity testing in
vitro: status of validation studies. Arch.
Toxicol. Suppl. 11:105-114.
Brown, K.G.; Erdreich, L.S. (1989) Statistical
uncertainty in the no-observed-adverse-
effect level. Fundam. Appl. Toxicol.
13:235-244.
Brown, N.A.; Fabro, S.E. (1982) The in vitro
approach to teratogenicity testing. In:
Snell, K., ed. Developmental toxicology.
London, England: Croom-Helm, pp. 31-57.
Brown, N.A.; Freeman, S.J. (1984) Alternative
tests for teratogenicity. Alternatives Lab.
Anim. 12:7-23.
Buelke-Sam, J.; Kimmel, C.A.; Adams, J., eds.
(1985) Design considerations in screening
for behavioral teratogens: results of the
Collaborative Behavioral Teratology Study.
Neurobehav. Toxicol. Teratol.
7(6):537-789.
Butcher, R.E.; Wootten, V.; Vorhees, C.V.
(1980) Standards in behavioral teratology
testing: test variability and sensitivity.
Teratogenesis Carcinog. Mutagen. 1:49-
61.
Centers for Disease Control. (1988a) Trends
in years of potential life lost due to infant
mortality and perinatal conditions, 1980-
1983 and 1984-1985. Morbidity and
Mortality Weekly Report 37:249-256.
Centers for Disease Control. (1988b)
Premature mortality due to congenital
anomalies—United States. Morbidity and
Mortality Weekly Report 37:505-506.
Chen, J.J.; Kodell, R.L. (1989) Quantitative risk
assessment for teratological effects. J.
Amer. Statistical Assoc. 84:966-971.
Chernoff, N.; Kavlock, R.J. (1982) An in vivo
teratology screen utilizing pregnant mice.
J. Toxicol. Environ. Health 10:541-550.
Couture, L.A. (1990) 2,3,7,8-
Tetrachlorodibenzo-p-dioxin-induced
hydronephrosis: characterization of the
peak period of sensitivity for placentally-
and lactationally induced renal lesions,
and assessment of persistence
[dissertation]. Chapel Hill, NC:
University of North Carolina. Available
from: University of Michigan,
Dissertation Library, Ann Arbor, MI.
Crump, K.S. (1984) A new method for
determining allowable daily intakes.
Fundam. Appl. Toxicol. 4:854-871.
Daston, G.P.; Rehnberg, B.F.; Carver, B.A.;
Kavlock, R.J. (1988) Functional teratogens
of the rat kidney. II. Nitrofen and
ethylenethiourea. Fundam. Appl. Toxicol.
11:401-415.
Davis, J.M.; Otto, D.A.; Weil, D.E.; Grant, L.D.
(1990) The comparative developmental
neurotoxicity of lead in humans and
animals. Neurotoxicol. Teratol. 12:215-
229.
Deane, M.; Swan, S.H.; Harris, J.A.; Epstein,
D.M.; Neutra, R.R. (1989) Adverse
pregnancy outcomes in relation to water
contamination, Santa Clara County, CA,
1980-1981. Am. J. Epidemiol. 129:894-904.
Dobbins, J.G.; Eifler, C.W.; Buffler, P.A. (1978)
The use of parity survivorship analysis in
the study of reproductive outcomes.
Presented at the Society for
Epidemiologic Research Conference;
June; Seattle, WA.
Elsner, J.; Suter, K.E.; Ulbrich, B.; Schreiner,
G. (1986) Testing strategies in behavioral
teratology: IV. Review and general
conclusions. Neurobehav. Toxicol.
Teratol. 8:585-590.
Epidemiology Workgroup of the Interagency
Regulatory Liaison Group (1981)
Guidelines for documentation of
epidemiologic studies. Am. J. Epidemiol.
114(5):609-613.
Everson, R.B.; Sandler, D.P.; Wilcox, A.J.;
Schreinemachers, D.; Shore, D.L.;
Weinberg, C. (1986) Effect of passive
exposure to smoking on age at natural
menopause. Br. Med. J. 293(6550):792.
Fabro, S.; Shull, G.; Brown, N.A. (1982) The
relative teratogenic index and
teratogenic potency: proposed
components of developmental toxicity
risk assessment. Teratogenesis Carcinog.
Mutagen. 2:61-76.
Faustman, E.M. (1988) Short-term tests for
teratogens. Mutat. Res. 205:355-384.
Faustman, E.M.; Wellington, D.G.; Smith,
W.P.; Kimmel, C.A. (1989)
Characterization of a developmental
toxicity dose-response model. Environ.
Health Perspect. 79:229-241.
Food and Drug Administration. (1966)
Guidelines for reproduction studies
for safety evaluation of drugs for human
use. Bureau of Drugs, Rockville, MD.
Food and Drug Administration. (1970)
Advisory Committee on Protocols for
Safety Evaluations. Panel on
reproduction report on reproduction
studies in the safety evaluation of food
additives and pesticide residues. Toxicol.
Appl. Pharmacol. 16:264-296.
Food and Drug Administration (1987) Report
of the in vitro teratology task force.
Environ. Health Perspect. 72:201-249.
Francis, E.Z.; Farland, W.H. (1987)
Application of the preliminary
developmental toxicity screen for
chemical hazard identification under the
Toxic Substances Control Act.
Teratogenesis Carcinog. Mutagen. 7:107-
117.
Fujii, T.; Adams, P.M. (1987) Functional
teratogenesis: functional effects on the
offspring after parental drug exposure.
Tokyo, Japan: Teikyo University Press.
Gaffey, W.R. (1976) A critique of the standard
mortality ratio. J. Occup. Med. 18:157-
160.
Gaylor, D.W. (1983) The use of safety factors
for controlling risk. J. Toxicol. Environ.
Health 11:329-336.
Gaylor, D.W. (1989) Quantitative risk
analysis for quantal reproductive and
[...]
[...] (1971) Adenocarcinoma of the vagina:
association of maternal stilbestrol
therapy with appearance in young
women. N. Engl. J. Med. 284:878.
Hertig, A.T. (1967) The overall problem in
man. In: Benirschke, K., ed. Comparative
aspects of reproductive failure. New
York, NY: Springer-Verlag, pp. 11-41.
Hogue, C.J.R. (1984) Reducing
misclassification errors through
questionnaire design. In: Lockey, J.E.;
Lemasters, G.K.; Keye, W.R., eds.
Reproduction: the new frontier in
occupational and environmental health
research. New York, NY: Alan R. Liss,
Inc., pp. 81-97.
Hogue, C.J.R. (1985) Developmental risks.
Presented at: symposium on
epidemiology and health risk
assessment; May 14; Columbia, MD.
Joffe, M. (1985) Biases in research on
reproduction and women's work. Int. J.
Epidemiol. 14(1):118-123.
Johnson, E.M. (1981) Screening for teratogenic
hazards: nature of the problem. Annu.
Rev. Pharmacol. Toxicol. 21:417-429.
Johnson, E.M.; Gabel, B.E.G. (1983) An
artificial embryo for detection of
abnormal developmental biology.
Fundam. Appl. Toxicol. 3:243-249.
Kavlock, R.J.; Grabowski, C.T., eds. (1983)
Abnormal functional development of the
heart, lungs, and kidneys: approaches to
functional teratology. Prog. Clin. Biol.
Res., vol. 140. New York, NY: Alan R.
Liss, Inc.
Kavlock, R.J.; Rehnberg, B.F.; Rogers, E.H.
(1986) Congenital renal hypoplasia:
effects on basal renal function in the
developing rat. Toxicology 40:247-258.
Kavlock, R.J.; Rehnberg, B.F.; Rogers, E.H.
(1987a) The fate of adriamycin induced
dilated renal pelvis in the fetal rat:
physiological and morphological effects
in the offspring. Teratology 36:51-58.
Kavlock, R.J.; Rehnberg, B.F.; Rogers, E.H.
(1987b) Critical prenatal periods for
chlorambucil induced functional
teratology of the kidneys. Toxicology
43:51-64.
Kavlock, R.J.; Short, R.D., Jr.; Chernoff, N.
(1987c) Further evaluation of an in vivo
teratology screen. Teratogenesis
Carcinog. Mutagen. 7:7-16.
Kavlock, R.J.; Hoyle, B.R.; Rehnberg, B.F.;
Rogers, E. (1988) The significance of
dilated renal pelvis in the nitrofen
exposed fetal rat. Toxicol. Appl.
Pharmacol. 94:287-296.
Khera, K.S. (1984) Maternal toxicity—a
possible factor in fetal malformations in
mice. Teratology 29:411-418.
Khera, K.S. (1985) Maternal toxicity: a
possible etiologic factor in embryo-fetal
deaths and fetal malformations in
rodent-rabbit species. Teratology 31:129-
153.
Khera, K.S. (1987) Maternal toxicity of drugs
and metabolic disorders—a possible
etiologic factor in the intrauterine death
and congenital malformation: a critique
on human data. CRC Crit. Rev. Toxicol.
17:345-375.
Kimmel, C.A. (1988) Current status of
behavioral teratology—science and
regulation. CRC Crit. Rev. Toxicol.
Kimmel, C.A. (1990) Quantitative approaches
to human risk assessment for noncancer
health effects. Neurotoxicology 11:189-
198.
Kimmel, G.L. (1985) In vitro tests in screening
teratogens: considerations to aid the
validation process. In: Marois, M., ed.
Prevention of physical and mental
congenital defects. Part C. New York,
NY: Alan R. Liss, Inc., pp. 259-263.
Kimmel, G.L. (1990) In vitro assays in
developmental toxicology: their potential
application in risk assessment. In: In
vitro methods in developmental
toxicology: use in defining mechanisms
and risk parameters. Kimmel, G.L.;
Kochhar, D.M., eds. Boca Raton, FL: CRC
Press, pp. 163-173.
Kimmel, C.A.; Francis, E.Z. (1990)
Proceedings of the workshop on the
acceptability and interpretation of
dermal developmental toxicity studies.
Fundam. Appl. Toxicol. 14:386-398.
Kimmel, C.A.; Gaylor, D.W. (1988) Issues in
qualitative and quantitative risk analysis
for developmental toxicology. Risk Anal.
8:15-20.
Kimmel, C.A.; Price, C.J. (1990)
Developmental toxicity studies. In:
Arnold, D.L.; Grice, H.C.; Krewski, D.R.,
eds. Handbook of in vivo toxicity testing.
San Diego, CA: Academic Press, pp. 271-
301.
Kimmel, C.A.; Young, J.F. (1983) Correlating
pharmacokinetics and teratogenic end
points. Fundam. Appl. Toxicol. 3:250-255.
Kimmel, G.L.; Smith, K.; Kochhar, D.M.; Pratt,
R.M. (1982a) Overview of in vitro
teratogenicity testing: aspects of
validation and application to screening.
Teratogenesis Carcinog. Mutagen. 2:221-
229.
Kimmel, G.L.; Smith, K.; Kochhar, D.M.; Pratt,
R.M. (1982b) Proceedings of the
consensus workshop on in vitro
teratogenesis testing. Teratogenesis
Carcinog. Mutagen. 2:221-374.
Kimmel, C.A.; Holson, J.F.; Hogue, C.J.; Carlo,
G.L. (1984) Reliability of experimental
studies for predicting hazards to human
development. National Center for
Toxicological Research, Jefferson, AR.
NCTR Technical Report for Experiment
No. 6015.
Kimmel, C.A.; Kimmel, G.L.; Frankos, V., eds.
(1986) Interagency Regulatory Liaison
Group workshop on reproductive toxicity
risk assessment. Environ. Health
Perspect. 66:193-221.
Kimmel, G.L.; Kimmel, C.A.; Francis, E.Z.,
eds. (1987) Evaluation of maternal and
developmental toxicity. Teratogenesis
Carcinog. Mutagen. 7:203-338.
Kimmel, C.A.; Wellington, D.G.; Farland, W.;
Ross, P.; Manson, J.M.; Chernoff, N.;
Young, J.F.; Selevan, S.G.; Kaplan, N.;
Chen, C.; Chitlik, L.D.; Siegel-Scott, C.L.;
Valaoras, G.; Wells, S. (1989) Overview
of a workshop on quantitative models for
developmental toxicity risk assessment.
Environ. Health Perspect. 79:209-215.
Kimmel, C.A.; Rees, D.C.; Francis, E.Z., eds.
(1990a) Proceedings of the Workshop on
the Qualitative and Quantitative
Comparability of Human and Animal
Developmental Neurotoxicity.
Neurotoxicol. Teratol. 12(3):173-292.
Kimmel, C.A.; Kimmel, G.L.; Francis, E.Z.;
Chitlik, L.D. (1990b) An overview of the
U.S. EPA's proposed amendments to the
guidelines for the health assessment of
suspect developmental toxicants. J. Am.
Coll. Toxicol. 9:39-47.
Kissling, G. (1981) A generalized model for
analysis of non-independent
observations [dissertation]. Chapel Hill,
NC: University of North Carolina.
Available from: University Microfilms,
Ann Arbor, MI.
Kleinbaum, D.G.; Kupper, L.L.; Morgenstern,
H. (1982) Epidemiologic research:
principles and quantitative methods.
London: Lifetime Learning Publications.
Kodell, R.L.; Howe, R.B.; Chen, J.J.; Gaylor,
D.W. (1991) Mathematical modelling of
reproductive and developmental toxic
effects for quantitative risk assessment.
Risk Analysis 11, in press.
Kwa, S.-L.; Fine, L.J. (1980) The association
between parental occupation and
childhood malignancy. J. Occup. Med.
22:792-794.
Lamb, J.C., IV. (1985) Reproductive toxicity
testing: evaluating and developing new
testing systems. J. Am. Coll. Toxicol.
4:163-171.
Lemasters, G.K.; Selevan, S.G. (1984) Use of
exposure data in occupational
reproductive studies. Scand. J. Work
Environ. Health 10:1-6.
Lemasters, G.K.; Pinney, S.M. (1989)
Employment status as a confounder
when assessing occupational exposures
and spontaneous abortion. J. Clin.
Epidemiol. 42:975-81.
Leridon, H. (1977) Human fertility: the basic
components. Chicago, IL: The University
of Chicago Press.
Leukroth, R.W., ed. (1986) Predicting
neurotoxicity and behavioral dysfunction
from preclinical toxicologic data.
Neurotoxicol. Teratol. 9:395-471.
Levine, R.J. (1983) Methods for detecting
occupational causes of male infertility:
reproductive history versus semen
analysis. Scand. J. Work Environ. Health
9:371-376.
Levine, T.E.; Butcher, R.E. (1990) Workshop
on the qualitative and quantitative
comparability of human and animal
developmental neurotoxicity. Work
group IV report: Triggers for
developmental neurotoxicity testing.
Neurotoxicol. Teratol. 12:281-284.
Levine, R.J.; Symons, M.J.; Balogh, S.A.;
Arndt, D.M.; Kaswandik, N.R.; Gentile,
J.W. (1980) A method for monitoring the
fertility of workers: I. Method and pilot
studies. J. Occup. Med. 22:781-791.
Levine, R.J.; Symons, M.J.; Balogh, S.A.;
Milby, T.H.; Whorton, M.D. (1981) A
method for monitoring the fertility of
workers: II. Validation of the method
among workers exposed to
dibromochloropropane. J. Occup. Med.
23:183-188.
Mackeprang, M.; Hay, S.; Lunde, A.S. (1972)
Completeness and accuracy of reporting
of malformations on birth certificates.
HSMHA Health Reports 84:43-49.
McMichael, A.J. (1976) Standardized
mortality ratios and the 'healthy worker
effect': scratching beneath the surface. J.
Occup. Med. 18:165-168.
Morrissey, R.E.; Harris, M.W.; Schwetz, B.A.
(1989) Developmental toxicity screen:
results of rat studies with diethylhexyl
phthalate and ethylene glycol
monomethyl ether. Teratogenesis
Carcinog. Mutagen. 9:119-129.
Morrissey, R.E.; Welsch, F.; Kavlock, R.J.;
Schwetz, B.A. (1991) Proceedings of a
conference on in vitro teratology.
Environ. Health Perspect., in press.
National Center for Health Statistics. (1988)
Advance report of final mortality
statistics, 1986. Monthly Vital Statistics
Report 37(6): Supp 1. NCHR, Hyattsville,
MD. DHHS Publ. No. (PHS) 88-1120.
National Research Council. (1983) Risk
assessment in the Federal government:
managing the process. Committee on the
Institutional Means for the Assessment
of Risks to Public Health. Commission on
Life Sciences, National Research
Council. Washington, DC: National
Academy Press, pp. 17-83.
Needleman, H. (1988) The neurotoxic,
teratogenic, and behavioral teratogenic
effects of lead at low dose: a paradigm
for transplacental toxicants. In:
Transplacental effects on fetal health.
New York, NY: Alan R. Liss, Inc., pp.
279-287.
Nelson, C.J.; Holson, J.F. (1978) Statistical
analysis of teratogenic data: problems
and advancements. J. Environ. Pathol.
Toxicol. 2:187-199.
Nelson, K.; Holmes, L.B. (1989) Malformations
due to presumed spontaneous mutations
in newborn infants. New Engl. J. Med.
320:19-23.
Nisbet, I.C.T.; Karch, N.J. (1983) Chemical
hazards to human reproduction. Park
Ridge, IL: Noyes Data Corp.
Organization for Economic Cooperation and
Development (OECD). (1981) Guideline
for testing of chemicals: teratogenicity.
Papier, C.M. (1985) Parental occupation and
congenital malformations in a series of
35,000 births in Israel. Prog. Clin. Biol.
Res. 163:291-294.
Perlin, S.A.; McCormack, C. (1988) Using
weight-of-evidence classification
schemes in the assessment of non-cancer
health risks. In: Proceedings of the 5th
National Conference on Hazardous
Wastes and Hazardous Materials
(HWHM '88); April 19-21; Las Vegas, NV.
Peters, J.M.; Preston-Martin, S.; Yu, M.C.
(1981) Brain tumors in children and
occupational exposure of parents.
Science 213:235-237.
Rai, K.; Van Ryzin, J. (1985) A dose-response
model for teratological experiments
involving quantal responses. Biometrics
41:1-9.
Riley, E.P.; Vorhees, C.V., eds. (1986)
Handbook of behavioral teratology. New
York, NY: Plenum Press.
Rodier, P.M. (1978) Behavioral teratology. In:
Wilson, J.G.; Fraser, F.C., eds. Handbook
of teratology, vol. 4. New York, NY:
Plenum Press, pp. 397-428.
Rothman, K.J. (1986) Modern epidemiology.
Boston, MA: Little, Brown and Co., pp.
83-94.
Ryan, L.M.; Catalano, P.J.; Kimmel, C.A.;
Kimmel, G.L. (1991) Relationship
between fetal weight and malformation
in developmental toxicity studies.
Teratology 44:215-223.
Schardein, J.L. (1983) Teratogenic risk
assessment. In: Kalter, H., ed. Issues and
reviews in teratology, vol. 1. New York,
NY: Plenum Press, pp. 181-214.
Schnatter, A.R.L. (1990) The development of
methods for implementing industry-
based reproductive surveillance
[dissertation]. New York, NY: Columbia
University. Available from: University
Microfilms, Ann Arbor, MI.
Schuler, R.; Hardin, B.; Niemeyer, R.; Booth,
G.; Hazelden, K.; Piccirillo, V.; Smith, K.
(1984) Results of testing fifteen glycol
ethers in a short-term, in vivo
reproductive toxicity assay. Environ.
Health Perspect. 57:141-148.
Selevan, S.G. (1980) Evaluation of data
sources for occupational pregnancy
outcome studies [dissertation].
Cincinnati, OH: University of Cincinnati.
Available from: University Microfilms,
Ann Arbor, MI.
Selevan, S.G. (1981) Design considerations in
pregnancy outcome studies of
occupational populations. Scand. J. Work
Environ. Health 7:76-82.
Selevan, S.G. (1985) Design of pregnancy
outcome studies of industrial exposure.
In: Hemminki, K.; Sorsa, M.; Vainio, H.,
eds. Occupational hazards and
reproduction. Washington, DC:
Hemisphere Pub., pp. 219-229.
Selevan, S.G.; Hemminki, K.; Lindbohm, M-L.
(1986) Linking data to study reproductive
effects of occupational exposures.
Occupational Medicine: State of the Art
Reviews 1(3):445-455.
Selevan, S.G.; Lemasters, G.K. (1987) The
dose-response fallacy in human
reproductive studies of toxic exposures.
J. Occup. Med. 29:451-454.
Sever, L.E.; Hessol, N.A. (1984) Overall design
considerations in male and female
occupational reproductive studies. In:
Lockey, J.E.; LeMasters, G.K.; Keye, W.R.,
eds. Reproduction: the new frontier in
occupational and environmental
research. New York, NY: Alan R. Liss,
Inc., pp. 15-47.
Shepard, T.H. (1980) Catalog of teratogenic
agents. Third edition. Baltimore, MD:
Johns Hopkins University Press.
Shepard, T.H. (1986) Human teratogenicity.
Adv. Pediatr. 33:225-268.
Silverman, J.; Kline, J.; Hutzler, M.; Stein, Z.;
Warburton, D. (1985) Maternal
employment and the chromosomal
characteristics of spontaneously aborted
conceptions. J. Occup. Med. 27:427-438.
Slotkin, T.A.; Lau, C.; Kavlock, R.J.; Gray,
J.A.; Orband-Miller, L.; Queen, K.L.;
Baker, F.E.; Cameron, A.M.; Antolick, L.;
Haim, K.; Bartolome, M.; Bartolome, J.
(1988) Role of sympathetic neurons in
biochemical and functional development
of the kidney: neonatal sympathectomy
with 6-hydroxydopamine. J. Pharmacol.
Exp. Ther. 246:427-433.
Starr, T.B.; Dalcorso, R.D.; Levine, R.J. (1986)
Fertility of workers: a comparison of
logistic regression and indirect
standardization. Am. J. Epidemiol.
123:490-498.
Stein, Z.; Hatch, M. (1987) Biological markers
in reproductive epidemiology: prospects
and precautions. Environ. Health
Perspect. 74:67-75.
Stein, Z.; Susser, M.; Warburton, D.; Wittes,
J.; Kline, J. (1975) Spontaneous abortion
as a screening device. The effect of fetal
surveillance on the incidence of birth
defects. Am. J. Epidemiol. 102:275-290.
Stein, Z.; Kline, J.; Shrout, P. (1985) Power in
surveillance. In: Hemminki, K.; Sorsa, M.;
Vainio, H., eds. Occupational hazards
and reproduction. Washington, DC:
Hemisphere Pub., pp. 203-208.
Stiratelli, R.; Laird, N.; Ware, J.H. (1984)
Random-effects models for serial
observations with binary responses.
Biometrics 40:961-971.
Swan, S.H.; Shaw, G.; Harris, J.A.; Neutra,
R.R. (1989) Congenital cardiac anomalies
in relation to water contamination, Santa
Clara County, CA, 1981-1983. Am. J.
Epidemiol. 129:885-893.
Sweeney, A.M.; Meyer, M.R.; Aarons, J.H.;
Mills, J.L.; LaPorte, R.E. (1988) Evaluation
of methods for the prospective
identification of early fetal losses in
environmental epidemiology studies. Am.
J. Epidemiol. 127:843-850.
Tanimura, T. (1986) Collaborative studies on
behavioral teratology in Japan.
Neurotoxicology 7:35-46.
Tilley, B.C.; Barnes, A.B.; Bergstralh, E.;
Labarthe, D.; Noller, K.L.; Colton, T.;
Adam, E. (1985) A comparison of
pregnancy history recall and medical
records: implications for retrospective
studies. Am. J. Epidemiol. 121:269-281.
Tilson, H.A.; Jacobson, J.L.; Rogan, W.J. (1990)
Polychlorinated biphenyls and the
developing nervous system: cross-species
comparisons. Neurotoxicol. Teratol.
12:239-248.
Tsai, S.P.; Wen, C.P. (1986) A review of
methodological issues of the [...]
[...] test guidelines; final rules. Federal
Register 50:39425-39428 and 39433-39434.
U.S. Environmental Protection Agency.
(1985b) Hazard Evaluation Division
standard evaluation procedure:
teratology studies, pp. 22-23. Office of
Pesticide Programs, Washington, DC.
EPA-540/9-85-018.
U.S. Environmental Protection Agency.
(1985c) Toxic Substances Control Act
test guidelines; final rules. Federal
Register 50:39428-39429.
U.S. Environmental Protection Agency.
(1986a) Triethylene glycol monomethyl,
monoethyl, and monobutyl ethers;
proposed test rule. Federal Register
51:17883-17894.
U.S. Environmental Protection Agency.
(1986b, Sept. 24) Guidelines for
carcinogen risk assessment. Federal
Register 51(185):33992-34003.
U.S. Environmental Protection Agency.
(1986c, Sept. 24) Guidelines for
mutagenicity risk assessment. Federal
Register 51(185):34006-34012.
U.S. Environmental Protection Agency.
(1986d, Sept. 24) Guidelines for
estimating exposures. Federal Register 51
(185):34042-34054.
U.S. Environmental Protection Agency.
(1988a, Feb. 26) Diethylene glycol butyl
ether and diethylene glycol butyl ether
acetate; final test rule. Federal Register
53:5932-5953.
U.S. Environmental Protection Agency.
(1988b) Proposed guidelines for assessing
male reproductive risk. Federal Register
53:24850-24869.
U.S. Environmental Protection Agency.
(1988c) Proposed guidelines for assessing
female reproductive risk. Federal
Register 53:24834-24847.
U.S. Environmental Protection Agency.
(1989a) FIFRA accelerated reregistration
phase 3 technical guidance, Appendix D.
Office of Pesticides and Toxic
Substances, Washington, DC. EPA No.
540/09-90-078. Available from: NTIS,
Springfield, VA.
U.S. Environmental Protection Agency.
(1989b) Triethylene glycol monomethyl
ether; final test rule. Federal Register
54:13472-13477.
U.S. Environmental Protection Agency.
(1991a) Pesticide assessment guidelines,
subdivision F. Hazard evaluation: human
and domestic animals. Addendum 10:
Neurotoxicity, series 81, 82, and 83.
Office of Pesticides and Toxic
Substances, Washington, DC. EPA 540/
09-91-123. Available from: NTIS,
Springfield, VA. PB91-154817.
U.S. Environmental Protection Agency.
(1991b) Integrated Risk Information
System (IRIS). Online. Office of Health
and Environmental Assessment,
Washington, DC.
Weinberg, C.R.; Gladen, B.C. (1986) The beta-
geometric distribution applied to
comparative fecundability studies.
Biometrics 42:547-580.
Wickramaratne, G.A. de S. (1987) The
Chernoff-Kavlock assay: its validation
and application in rats. Teratogenesis
Carcinog. Mutagen. 7:73-83.
Wilcox, A.J. (1983) Surveillance of pregnancy
loss in human populations. Am. J. Ind.
Med. 4:285-291.
Wilcox, A.J.; Weinberg, C.R.; Wehmann, R.E.;
Armstrong, E.G.; Canfield, R.E.; Nisula,
B.C. (1985) Measuring early pregnancy
loss: laboratory and field methods. Fertil.
Steril. 44:366-374.
Wilson, J.G. (1973) Environment and birth
defects. New York, NY: Academic Press,
pp. 30-32.
Wilson, J.G. (1977) Embryotoxicity of drugs in
man. In: Wilson, J.G.; Fraser, F.C., eds.
Handbook of teratology. New York, NY:
Plenum Press, pp. 309-355.
Wilson, J.G. (1978) Survey of in vitro systems:
their potential use in teratogenicity
screening. In: Wilson, J.G.; Fraser, F.C.,
eds. Handbook of teratology, vol. 4. New
York, NY: Plenum Press, pp. 135-153.
Wilson, J.G.; Scott, W.J.; Ritter, E.J.; Fradkin,
R. (1975) Comparative distribution and
embryotoxicity of hydroxyurea in
pregnant rats and rhesus monkeys.
Teratology 11:169-178.
Wilson, J.G.; Ritter, E.J.; Scott, W.J.; Fradkin,
R. (1977) Comparative distribution and
embryotoxicity of acetylsalicylic acid in
pregnant rats and rhesus monkeys.
Toxicol. Appl. Pharmacol. 41:67-78.
Wong, O.; Utidjian, H.M.D.; Karten, V.S.
(1979) Retrospective evaluation of
reproductive performance of workers
exposed to ethylene dibromide. J. Occup.
Med. 21:93-102.
Woo, D.C.; Hoar, R.M. (1972) "Apparent
hydronephrosis" as a normal aspect of
renal development in late gestation of
rats: the effect of methyl salicylate.
Teratology 6:191-196.
World Health Organization. (1984) Principles
for evaluating health risks to progeny
associated with exposure to chemicals
during pregnancy. In: Environmental
Health Criteria, vol. 30. Geneva: World
Health Organization.
Zack, M.; Cannon, S.; Lloyd, D.; Heath, C.W.,
Jr.; Falletta, J.M.; Jones, B.; Housworth, J.;
Crowley, S. (1980) Cancer in children of
parents exposed to hydrocarbon-related
industries and occupations. Am. J.
Epidemiol. 111:329-336.
Zenick, H.; Clegg, E.D. (1989) Assessment of
male reproductive toxicity: a risk
assessment approach. In: Hayes, A.W.,
ed. Principles and methods of toxicology.
Second ed. New York, NY: Raven Press,
pp. 279-309.
PART B: RESPONSE TO PUBLIC AND
SCIENCE ADVISORY BOARD COMMENTS
I. Introduction
This section summarizes the major
issues raised in the public and Science
Advisory Board (SAB) comments on the
Proposed Amendments to the Guidelines
for the Health Assessment of Suspect
Developmental Toxicants published
March 6, 1989 (54 FR 9386-9403).
Comments were received from 25
individuals or organizations. The
Agency's initial summary of the public
comments and proposed responses were
presented to the Environmental Health
Committee of the SAB on October 27,
1989. The report of the SAB Committee
was provided to the Agency on April 23,
1990.
The SAB and public comments were
diverse and addressed issues from a
variety of perspectives. The majority of
the comments were favorable and in
support of the Proposed Amendments to
the Guidelines. Many praised the
Agency's efforts as being timely and
well-justified. Most commentors also
gave specific comments or criticisms for
further consideration, clarification, or
re-evaluation. For example, there was
concern expressed about the Guidelines
imposing further testing requirements,
particularly functional testing, and many
commentors felt that the Proposed
Amendments discounted the role of
maternal toxicity in developmental
toxicity. In addition, there was concern
that the proposed weight-of-evidence
scheme would promote labeling of
agents as causing developmental
toxicity before the entire risk
assessment process was completed.
The SAB Committee also indicated
that the proposed revisions were
adequately founded in developmental
toxicology and represented a step
forward for the Agency. They suggested
that the Agency revisit the weight-of-
evidence scheme, to avoid confusion
with more commonly applied uses of
such classifications, and to develop a
more powerful conceptual approach.
Further, the SAB Committee urged that
the Agency begin to move away from
the current use of the no-observed-
adverse-effect level (NOAEL) and
lowest-observed-adverse-effect level
(LOAEL) basis for calculating the
reference dose for developmental
toxicity to a benchmark dose and
confidence limit approach tied to
empirical models of dose-response
relationships.
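For illustration only, the NOAEL and LOAEL referred to above can be read
off a set of dose groups once each group has been judged adverse or not.
The Python sketch below uses hypothetical names (DoseGroup, noael_loael)
and made-up doses; it is not an Agency procedure. It does not implement
the benchmark dose approach, which would require fitting a dose-response
model and computing a confidence limit.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DoseGroup:
    """One dose group from a study (hypothetical structure)."""
    dose: float      # administered dose, e.g. mg/kg/day; 0 for controls
    adverse: bool    # True if the group was judged to show an adverse effect


def noael_loael(groups: List[DoseGroup]) -> Tuple[Optional[float], Optional[float]]:
    """Return (NOAEL, LOAEL) for the treated groups.

    LOAEL: lowest treated dose judged adverse.
    NOAEL: highest treated dose below the LOAEL, or the highest treated
    dose when no group is adverse. Either value is None if undefined.
    """
    # Controls (dose 0) are the comparison group, not a candidate NOAEL.
    treated = sorted((g for g in groups if g.dose > 0), key=lambda g: g.dose)
    loael = next((g.dose for g in treated if g.adverse), None)
    if loael is None:
        noael = treated[-1].dose if treated else None
    else:
        below = [g.dose for g in treated if g.dose < loael]
        noael = below[-1] if below else None
    return noael, loael


if __name__ == "__main__":
    # Made-up example study with a control and four treated groups.
    study = [
        DoseGroup(0.0, False),
        DoseGroup(10.0, False),
        DoseGroup(50.0, False),
        DoseGroup(100.0, True),
        DoseGroup(200.0, True),
    ]
    print(noael_loael(study))
```

Because the result depends entirely on which doses happened to be
tested, a sketch like this also shows why a model-based benchmark dose
was suggested as an alternative.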
In response to the comments, the
Agency has modified or clarified many
sections of the Guidelines. For the
purposes of this discussion, the major
issues reflected by the public and SAB
comments are discussed. Several minor
recommendations, which are not
discussed specifically here, also were
considered by the Agency in the
revision of these Guidelines.
II. Intent of the Guidelines
Many of the public comments
indicated some misunderstanding of the
intent of the Guidelines, apparently
assuming that the risk assessment
guidelines impose testing requirements.
In particular, some commentors
suggested that because the Agency was
providing guidance on the interpretation
of tests not required in the EPA testing
guidelines, the Agency was suggesting
that these tests be required in the future.
The 1986 Guidelines and the 1989
Proposed Amendments clearly state that
these guidelines are not Agency testing
guidelines, but rather are intended to
ensure uniform interpretation of all
existing, relevant data. However, to
avoid any confusion, the discussion of
study designs has been changed to
avoid the impression that these
Guidelines set testing requirements. In
the evaluation of data on an agent for
risk assessment, relevant data are often
encountered that have been generated
from nontraditional tests. In such cases,
it is imperative that the Agency provide
guidance so that all data considered to
be relevant are included in the risk
assessment and are interpreted
uniformly.
III. Basic Assumptions
In the 1986 Guidelines, several
assumptions were implicit in the
approach to risk assessment, but were
not explicitly stated. These assumptions
were detailed in the 1989 Proposed
Amendments. Comments received from
the public and the SAB favored
presentation of these assumptions and
generally agreed with the wording,
except for the fourth assumption which
concerns the use of the most relevant or
most sensitive species. The 1989
Proposed Amendments stated that "it is
assumed that the most sensitive species
should be used to estimate human risk.
When data are available (e.g.,
pharmacokinetic, metabolic) to suggest
the most appropriate species, that
species will be used for extrapolation."
The SAB recommended that, for this
assumption, the basic position of the
Agency should be to use data from the
most relevant species, and that use of
data from the most sensitive species
should be the default position. In
addition, the SAB recommended that the
threshold assumption be considered
carefully in the dose-response
assessment of any agent, and that the
Agency develop more comprehensive
approaches to risk assessment as
discussed further in the following
sections.
Changes have been made in the
statement of the basic assumptions in
line with the SAB and public comments
that clarify, but do not alter, the intent of
the assumptions.
IV. Maternal/Developmental Toxicity
The 1989 Proposed Amendments
stated that "when adverse
developmental effects are produced only
at maternally toxic doses, they are still
considered to represent developmental
toxicity and should not be discounted as
being secondary to maternal toxicity."
This statement and others concerning
the interpretation of developmental
toxicity in the presence of maternal
toxicity were the subject of a
considerable number of public
comments and were also addressed by
the SAB. In general, commentors were
divided in their opinions on whether
they supported the Agency's statements
or felt that they discounted the role of
maternal toxicity in developmental
toxicity, but in general, the
recommended changes did not
significantly alter the intent of the
statements. The SAB endorsed the
proposed revision, and suggested that
the Agency retain the statement that
was made in the Proposed Amendments.
In these Guidelines, the position is
further clarified by indicating that when
maternal toxicity is significantly greater
than the minimal maternally toxic dose,
developmental effects at that dose may
be difficult to interpret. This statement
is added to clarify, but not to change,
the intent or meaning of the statements
regarding the relationship between
maternal and developmental toxicity.
From a risk assessment point of view,
whether a developmental effect is or is
not secondary to maternal toxicity, does
not impact on the selection of the
NOAEL or other dose-response
methodology.
V. Functional Developmental Toxicity
The 1989 Proposed Amendments
provided information on the state-of-the-
art in the evaluation of functional effects
resulting from developmental exposures.
Several commentors voiced strong
objection to this section because they
perceived it as indicating an imminent
requirement for testing. Several
indicated there are no standard methods
for functional testing, some felt that
functional end points should not be used
to establish the NOAEL, and others
voiced concern about the problems with
using postnatal exposures in animal
studies.
The final Guidelines further update
this section to include a discussion of
the latest changes in the requirements
for functional developmental toxicity
testing by the Agency, and reflect the
current approach to interpretation of
such data, with incorporation of
information from the EPA/NIDA-
sponsored "Workshop on the
Qualitative and Quantitative
Comparability of Human and Animal
Developmental Neurotoxicity" (1990).
The intent of these Guidelines, as stated
above, is not to change testing
requirements but to give guidance when
these types of data are encountered in
the risk assessment process. The
Guidelines also indicate that functional
developmental toxicity end points will
be used for establishing the NOAEL
when they are found to be the adverse
effect occurring at the lowest dose in
appropriate, well-conducted studies.
Interpretation of postnatal exposure
data is a concern, and must take into
consideration effects on the mother, her
offspring, and possible interactions; a
statement to this effect has been added.
Further interpretation of data will be
discussed in the guidance being
developed by the Agency on
neurotoxicity risk assessment.
VI. Weight-of-Evidence Scheme
The 1989 Proposed Amendments
described important considerations in
determining the relative weight of
various kinds of data in estimating the
risk of developmental toxicity in
humans. The intent of the proposed
weight-of-evidence (WOE) scheme was
that it not be used in isolation, but be
used as the first step in the risk
assessment process, to be integrated with
dose-response information and the [...]
The WOE scheme was the subject of a