United States	Prevention, Pesticides	EPA 740-C-09-009
Environmental Protection	and Toxic Substances	October 2009
Agency		
Endocrine Disruptor
Screening Program
Test Guidelines
OPPTS 890.1450:
Pubertal Development
and Thyroid Function in
Intact Juvenile/
Peripubertal Female Rats

This guideline is one of a series of test guidelines established by the Office of
Prevention, Pesticides and Toxic Substances (OPPTS), United States Environmental Protection
Agency for use in testing pesticides and chemical substances to develop data for submission to
the Agency under the Toxic Substances Control Act (TSCA) (15 U.S.C. 2601, et seq.), the
Federal Insecticide, Fungicide and Rodenticide Act (FIFRA) (7 U.S.C. 136, et seq.), and section
408 of the Federal Food, Drug and Cosmetic Act (FFDCA) (21 U.S.C. 346a).
The OPPTS test guidelines serve as a compendium of accepted scientific
methodologies and protocols that are intended to provide data to inform regulatory decisions
under TSCA, FIFRA, and/or FFDCA. This document provides guidance for conducting the test,
and is also used by EPA, the public, and the companies that are subject to data submission
requirements under TSCA, FIFRA and/or the FFDCA. As a guidance document, these
guidelines are not binding on either EPA or any outside parties, and the EPA may depart from
the guidelines where circumstances warrant and without prior notice. The procedures contained
in this guideline are strongly recommended for generating the data that are the subject of the
guideline, but EPA recognizes that departures may be appropriate in specific situations. You
may propose alternatives to the recommendations described in these guidelines, and the
Agency will assess them for appropriateness on a case-by-case basis.
For additional information about OPPTS harmonized test guidelines and to access the
guidelines electronically, please go to http://www.epa.gov/oppts and select "Test Methods &
Guidelines" on the left side navigation menu. You may also access the guidelines in
http://www.regulations.gov grouped by Series under Docket ID #s: EPA-HQ-OPPT-2009-0150
through EPA-HQ-OPPT-2009-0159, and EPA-HQ-OPPT-2009-0576.

OPPTS 890.1450: Pubertal Development and Thyroid Function in Intact Juvenile/
Peripubertal Female Rats
(a)	Scope.
(1)	Applicability. This guideline is intended to meet testing requirements of
the Toxic Substances Control Act (TSCA) (15 U.S.C. 2601, et seq.), the
Federal Insecticide, Fungicide, and Rodenticide Act (FIFRA) (7 U.S.C.
136, et seq.), and the Federal Food, Drug, and Cosmetic Act (FFDCA) (21
U.S.C. 346a).
(2)	Background. The Endocrine Disruptor Screening Program (EDSP)
reflects a two-tiered approach to implement the statutory testing
requirements of FFDCA section 408(p) (21 U.S.C. 346a). In general, EPA
intends to use the data collected under the EDSP, along with other
information, to determine if a pesticide chemical, or other substances, may
pose a risk to human health or the environment due to disruption of the
endocrine system.
This test guideline is intended to be used in conjunction with other
guidelines in the OPPTS 890 series that make up the full screening
battery under the EDSP to identify substances that have the potential to
interact with the estrogen, androgen, or thyroid hormone systems (Tier 1
"screening"). The determination will be made on a weight-of-evidence
basis taking into account data from the Tier 1 assays and other
scientifically relevant information available. The fact that a substance may
interact with a hormone system, however, does not mean that when the
substance is used, it will cause adverse effects in humans or ecological
systems.
Chemicals that go through Tier 1 screening and are found to have the
potential to interact with the estrogen, androgen, or thyroid hormone
systems will proceed to the next stage of the EDSP where EPA will
determine which, if any, of the Tier 2 tests are necessary based on the
available data. Tier 2 testing is designed to identify any adverse
endocrine-related effects caused by the substance, and establish a
quantitative relationship between the dose and that endocrine effect.
(b)	Purpose. The purpose of the female pubertal assay is to help identify chemicals
or mixtures that have the potential to interact with the endocrine system, by
identifying effects on pubertal development and thyroid function in the intact
juvenile/peripubertal female rat. This assay is capable of detecting anti-thyroid,
estrogenic or anti-estrogenic chemicals (including agents which act via
alterations in receptor binding or steroidogenesis), or agents which alter pubertal
development via changes in luteinizing hormone, follicle stimulating hormone,
prolactin or growth hormone levels or via alterations in hypothalamic function.

(c)	Endpoints.
-	Growth (daily body weight)
-	Age and body weight at vaginal opening
-	Organ weights
	-	Uterus (blotted)
	-	Ovaries (paired)
	-	Thyroid (after fixation)
	-	Kidneys (paired)
	-	Adrenals (paired)
-	Histology
	-	Thyroid (colloid area and follicular cell height)
-	Hormones
	-	Serum thyroxine (T4), total
	-	Serum thyroid stimulating hormone (TSH)
-	Estrous cyclicity
	-	Age at first vaginal estrus after vaginal opening
	-	Length of cycle
	-	Percent of animals cycling
	-	Percent of animals cycling regularly
-	Clinical (serum) chemistry
	-	Standard blood panel, including creatinine and blood urea nitrogen
(d)	General Conditions. Conduct the study in facilities accredited by the
Association for Assessment and Accreditation of Laboratory Animal Care
International (AAALAC) if in the U.S., or the applicable national or international
accreditation authority if outside the U.S. Care should be taken to minimize
stress from all sources including noise, other species housed nearby, or other
Rats are housed in clear plastic cages (approximately 20 x 25 x 47 cm) with
heat-treated1 laboratory-grade wood shavings other than cedar as bedding.
Corn cob bedding should not be used due to its potential to disrupt endocrine
activity2. Wire-mesh-bottomed caging should not be used due to the potential for
pup loss. Do not use polycarbonate water supply equipment.
1	to eliminate resins that induce liver enzymes.
2	Markaverich BM, Alejandro MA, Markaverich D, et al. 2002. Biochem Biophys Res Commun
291 (3):692-700.

Prior to the onset of the study, pregnant female rats are housed individually. At
weaning, pups are housed in groups of either two or three females of the same
treatment group per cage.
Animals are maintained on a balanced laboratory diet3 and deionized water4 ad
libitum, in a room with a 14:10 hour light:dark photoperiod (on at 0500 hours, off
at 1900 hours local time). Temperature, humidity, and other conditions not
specified above should be in accordance with the recommendations contained in
Guide for the Care and Use of Laboratory Animals5 or other appropriate
Report the diet that the timed pregnant females were fed by the supplier prior to
shipment to the test laboratory.
(e) Animals: Juvenile Female Rats. The Sprague-Dawley strain of rats is the
preferred strain for this assay until a more-appropriate strain (or set of strains) is
identified and associated performance criteria developed. Results similar to
those from Sprague-Dawley rats have been produced using Wistar and Long-
Evans rats in this assay or relevant modifications of this assay, suggesting that
strain is not the major determinant of sensitivity in this assay.
Juvenile female rats are derived from individually housed pregnant females that
were bred in-house or purchased from a supplier as "timed pregnant" dams.
Dams obtained and transported from an external supplier may not be used in the
same study as dams bred in-house. All dams must be pregnant for the first time
and timed to deliver on the same day. If purchased from a supplier, all dams
should be on the same gestation day (GD) but that GD may be GD 7, 8, 9, or 10
at the time of arrival at the performing laboratory (where GD 0 = day of sperm
positive). Dams are allowed to deliver their pups naturally. Any litters with fewer
than 8 total pups/litter (i.e., including both males and females) and any litters not
delivered by GD 23 are excluded from the study. To maximize uniformity in
growth rates, the litters are standardized to 8 to 10 pups per litter between post-
natal days (PND) 3 and 5. (PND 0 is defined as the day on which the pup is first
seen, assuming that the cages are checked for new births daily, in the morning.)
All litters within a study are standardized to the same number of pups (8, 9, or
10). Reducing litter size to 8-to-10 is required when dams have more than
enough pups, but cross-fostering to raise litter size to the required number is not
acceptable. Body weights are monitored weekly and any unthrifty litters or
runted pups are excluded from the study. Enough litters should be available to
assure that a sufficient number of juvenile females are available for 15 female
3	N.B.: Genistein-equivalent content of genistein plus daidzein (aglycone forms) of each batch must be
less than or equal to 300 ug/g, and the same batch of feed must be used for treated and control groups at
all times. ("Genistein-equivalent content" of daidzein is approximately 0.8. Owens WB, Ashby J, Odum J,
Onyon L. 2003. Environ Health Perspect 111(12):1559-1567.)
4	Deionized water is required. Tap water is not acceptable.
5	Institute of Laboratory Animal Resources. Washington, DC: National Academy Press. 1996.

pups per treatment group. (If parallel male and female pubertal studies are being
conducted, males and females from the same litter can be used in their
respective studies.) Also, enough litters should be available to avoid the need for
placing littermates in the same experimental group.
The pups are weaned on PND 21. Also on day 21, female pups are marked by
litter, and then all of the female pups from all of the litters are weighed
individually to the nearest 0.1 g and ranked by body weight. A population of
female pups that is as homogeneous as possible is selected for the study by
eliminating an equal number of pups from the heavy end and the light end of the
distribution, leaving the number of animals needed for the study in the middle. In
this way, one nuisance variable (viz., body weight at weaning) is experimentally
controlled. The female pups are assigned to treatment groups such that the
mean body weights and variances for all groups are similar. Avoid placing
littermates in the same group.
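
The selection and assignment steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not a procedure prescribed by the guideline: the function names are hypothetical, an odd surplus of pups is resolved by trimming one extra pup from the heavy end, and a serpentine ("snake") allocation by weight rank is used as one simple way to keep group means similar. Littermate conflicts are only reported, not resolved.

```python
from statistics import mean

def select_middle(weights, n_needed):
    """Rank pups by PND 21 body weight and keep the middle n_needed.

    weights: dict mapping pup id -> body weight in grams.
    Assumption: if the surplus is odd, the extra pup is dropped
    from the heavy end (the guideline does not address this case).
    """
    ranked = sorted(weights, key=weights.get)
    excess = len(ranked) - n_needed
    if excess < 0:
        raise ValueError("not enough pups for the planned study size")
    trim_light = excess // 2
    trim_heavy = excess - trim_light
    return ranked[trim_light:len(ranked) - trim_heavy]

def assign_serpentine(ranked_ids, n_groups):
    """Deal weight-ranked pups to groups in a back-and-forth order."""
    groups = [[] for _ in range(n_groups)]
    for i, pup in enumerate(ranked_ids):
        rnd, pos = divmod(i, n_groups)
        g = pos if rnd % 2 == 0 else n_groups - 1 - pos
        groups[g].append(pup)
    return groups

def groups_with_littermates(groups, litter_of):
    """Return indices of groups that contain two or more pups of one litter."""
    return [i for i, g in enumerate(groups)
            if len({litter_of[p] for p in g}) < len(g)]

if __name__ == "__main__":
    weights = {f"p{i}": 40.0 + i for i in range(10)}  # made-up weights
    selected = select_middle(weights, 6)
    groups = assign_serpentine(selected, 3)
    for g in groups:
        print(g, mean(weights[p] for p in g))
```

Any group flagged by groups_with_littermates would need a manual swap with a pup of similar weight from another group, after which the group means and variances should be compared again.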
After assignment to treatment groups, female offspring belonging to the same
treatment group are housed in groups of either 2 or 3 per cage, such that each
cage has the same number of animals. (In the case of housing 2 per cage and a
planned N of 15, it will be necessary to add an additional rat to the last cage.)
It is imperative that treatments be initiated no later than PND 22. Waiting just a
few days longer can result in failure of the study as the onset of puberty (i.e.,
vaginal opening) in the control female rats will begin within a few days.
As described in more detail later in this document, the preferred procedure is to
kill all the animals on a single day to close the in-life portion of the study, but if
the number of planned necropsies is considered too large to allow careful
measurement of endpoints on one day (e.g., when multiple chemicals are being
tested simultaneously) kills may be conducted over two days rather than one,
with half of each group killed on each day. Day of kill is assigned to individuals at
the time of distribution of the pups to groups; kills over two days must not be
adopted as a matter of convenience on the day of necropsy. If kills over two
days are planned, the body weights are distributed across kill groups such that
the mean body weights and variances for all groups are similar, and, when
possible, litter mates are not in the same group.
(f) Experimental Design. This protocol uses a randomized complete block design
(time-separated necropsy is the blocking factor) with fifteen female rats in each
treatment group. The treatment conditions are (1) vehicle-treated and (2)
xenobiotic-treated (two dose levels). The highest dose level should be at or just
below the Maximum Tolerated Dose (MTD) level but need not exceed the limit
dose of 1 g/kg/day. A dose level will generally be considered to be at or just
below the Maximum Tolerated Dose level if it causes a statistically significant
reduction in terminal body weight gain in treated animals vs. controls, the
reduction is no greater than approximately 10% of the mean terminal body weight
for the controls, and no clinical signs of toxicity associated with the dose level are
observed throughout the study. In addition, abnormal blood chemistry values at
termination (particularly creatinine and blood urea nitrogen (BUN)) may indicate
that MTD was exceeded, even in the absence of a reduction in terminal body
weight compared to controls. Histopathology of the kidney (or any other organ
where gross observations indicate damage) may be used as evidence that MTD
was exceeded.
The second dose level should be one half of the highest dose level being tested
unless justification is provided for testing at a different level.
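
As a rough illustration of the dose-setting arithmetic above, the following Python sketch derives the default second dose level and expresses a terminal body weight reduction as a percentage of the control mean. The function names are hypothetical, and reading the "approximately 10%" criterion as (control mean minus treated mean) divided by control mean is an assumption. Statistical significance and clinical signs, which are also part of the MTD judgment, are not evaluated here.

```python
from statistics import mean

LIMIT_DOSE_MG_PER_KG = 1000.0  # limit dose of 1 g/kg/day from the text

def default_dose_levels(high_dose_mg_per_kg):
    """Return (high, low) dose levels, with low set to half of high."""
    return high_dose_mg_per_kg, high_dose_mg_per_kg / 2

def terminal_bw_reduction_pct(control_weights, treated_weights):
    """Reduction in mean terminal body weight, as % of the control mean."""
    control_mean = mean(control_weights)
    return 100.0 * (control_mean - mean(treated_weights)) / control_mean

if __name__ == "__main__":
    high, low = default_dose_levels(500.0)  # made-up high dose
    print(high, low, high <= LIMIT_DOSE_MG_PER_KG)
    pct = terminal_bw_reduction_pct([200.0, 200.0], [180.0, 180.0])
    print(pct, pct <= 10.0)
```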
If necessary, the study can be conducted in time-separated blocks rather than at
one time. In this case, each block should contain all treatment groups and be
balanced with respect to numbers of animals and body weight at weaning.
(g) Test substance. Chemical purity and stability in vehicle must be known prior to
testing so that dose levels are correctly prepared.
The test substance is dissolved or suspended in a suitable vehicle. In choosing
a vehicle, consideration should be given to the following characteristics: effects
on the absorption, distribution, metabolism, or retention of the test substance;
effects on the chemical properties of the test substance which may alter its toxic
characteristics; and effects on the food or water consumption or the nutritional
status of the animals. Use of vehicles with potential intrinsic toxicity should be
avoided (e.g., acetone, DMSO). If corn oil is used, it must be clear and free of
sediment. It should have a bland odor, free from rancid, musty, metallic, putrid or
any other undesirable odor. Other solvents such as water or
carboxymethylcellulose may be used where appropriate. Gentle warming may
be used to assist solubilization but the solution must not be administered warm
and the solution should be checked to make sure that precipitation did not occur
upon cooling. Use of intermediate solvents (e.g., ethanol) to assist in
solubilization is not appropriate. If the test substance is not soluble in any of the
conventional solvents, it is administered as a suspension. Sonication may be
used to assist in suspending particles. It is important that the dosing solution or
suspension be well-mixed to keep the chemical well-distributed prior to and
throughout dosing, and care must be taken to ensure that the particle size of
insoluble substances does not interfere with delivery of the full dose through the
gavage tube or needle tip.
(h) Treatment. Each animal is weighed daily, prior to treatment, and the body
weight recorded. Clinical observations are also recorded daily. Animals which
are found dead or which must be euthanized because they are near the point of
death are removed from the cage. Endpoint measures (organ weights, hormone
levels, histology, etc.) are not taken from these animals.
Treatments are administered daily by oral gavage from PND 22 through PND 42.
This duration of treatment is unnecessary to detect estrogenic chemicals, but is
required for the detection of pubertal delay and antithyroid effects. Test
chemicals are administered in 2.5 to 5.0 ml vehicle/kg body weight (see footnote 6) at 0700-0900
daily using an 18 gauge gavage needle (1 to 1½ inch length, 2.25 mm ball) and a
1 cc (disposable) tuberculin syringe for each treatment. Needle gauge may be
optimized to animal size but must be constructed of metal to avoid the potential
for absorption by or leaching of substances from rubber or plastic tubing. The
treatments are administered on a mg/kg body weight basis using the current
day's weight, and volume of the dose administered is recorded each day.
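
The daily dosing arithmetic can be sketched as below. This is a minimal example assuming a fixed dosing volume within the 2.5 to 5.0 ml/kg range (footnote 6) and body weights recorded in grams; the function names are hypothetical and the numbers are made up.

```python
def dosing_concentration_mg_per_ml(dose_mg_per_kg, volume_ml_per_kg):
    """Concentration of the dosing solution needed for a given dose level."""
    if not 2.5 <= volume_ml_per_kg <= 5.0:
        raise ValueError("dosing volume outside the 2.5 to 5.0 ml/kg range")
    return dose_mg_per_kg / volume_ml_per_kg

def gavage_volume_ml(body_weight_g, volume_ml_per_kg):
    """Volume to administer today, based on the current day's body weight."""
    return volume_ml_per_kg * body_weight_g / 1000.0

if __name__ == "__main__":
    # A 60 g pup dosed at 100 mg/kg with a study-wide volume of 5.0 ml/kg
    print(dosing_concentration_mg_per_ml(100.0, 5.0))  # mg/ml
    print(gavage_volume_ml(60.0, 5.0))                 # ml
```

Because the volume per kg is held constant across the study, only the solution concentration differs between dose groups, and the administered volume changes each day with body weight.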
In the absence of other clinical signs that would normally lead to removal of an
animal from the study, failure to gain weight at the same rate as controls is
generally not a reason to remove a treated animal during the course of the study.
However, it is recognized that severe failure to grow may be a reason to
disqualify an animal even in the absence of other signs of toxicity. As general
guidance, EPA suggests that a reduction in body weight gain when compared to
controls of more than 20% in the absence of other signs of toxicity may justify
(i) Vaginal Opening. Beginning on PND 22, females are examined daily for vaginal
opening. The appearance of a small "pin hole"7, a vaginal thread, and complete
vaginal opening are all recorded on the days they are observed. The day of
complete vaginal opening is the endpoint used in the analysis for the age at
vaginal opening; a "pin hole" or thread does not represent complete vaginal
opening, even though it must be recorded when observed. However, if any
animal within any treatment group shows incomplete opening (such as persistent
threads or a "pin hole") for greater than three days, a separate analysis is
conducted using the ages at which incomplete opening was first observed.
Documentation of a vaginal thread even if vaginal opening otherwise appears
complete is important. It is also critical that "initiation" of vaginal opening be
recorded. It is preferred but not critical that vaginal opening observations be
taken after the daily dosing. Whether collected before or after dosing, the vaginal
opening observations must be collected at approximately the same time each day.
(j) Estrous Cyclicity. Beginning on the day of vaginal opening, through to and
including the day of necropsy, daily vaginal smears are obtained and evaluated
under a low-power light microscope for the presence of leukocytes, nucleated
epithelial cells, or cornified epithelial cells8. The vaginal smears are classified as
diestrus (predominance of leukocytes mixed with some cornified epithelial cells),
proestrus (predominance of clumps of round, nucleated epithelial cells), or estrus
(predominance of cornified epithelial cells) and the stage is recorded daily.
(Metestrus is classified as (an early part of) diestrus rather than (a late part of)
6	Dosage volume per kg body weight must be the same for all treated animals in the experiment but the
value chosen for the study may be anywhere in the specified range.
7	See attached photograph for an example of a "pin hole" vs. complete vaginal opening.
8	See Cooper RL, Goldman JM, Vandenbergh JG. Monitoring of estrus cyclicity in the laboratory rodent
by vaginal lavage. In: Methods in Toxicology. Vol III, Part B. Female Reproductive Toxicology. Edited
by Chapin RE and Heindel J. Academic Press: Orlando. 1993. pp. 45-56.

estrus.) Age at first vaginal estrus is noted9. It is preferred but not critical that
estrous cycle observations be taken after the daily dosing. Whether collected
before or after dosing, estrous cycle observations must be collected at
approximately the same time each day.
At the end of the study, the overall pattern of each female is characterized as
regularly cycling (having recurring 4- to 5-day cycles), irregularly cycling (having
cycles with a period of diestrus longer than 3 days or a period of cornification
longer than 2 days), or not cycling (having prolonged periods of either vaginal
cornification or leukocytic smears). In cases where there are too few days
between vaginal opening and the end of the study to observe more than one
cycle, classification will have to be based on the available data with the default
assumption that animals are cycling regularly if the partial data fit the definition,
and are irregularly cycling if the study ends without being able to distinguish
between irregular cycling and not-cycling.
(k) Necropsy. Females are killed on PND 42. If necessary, one half of the females
may be killed on PND 42 and the remaining females on PND 43 as long as the
animals in each treatment group are equally dispersed between the two necropsy
days. Animals killed on PND 43 are dosed and treated on the day of kill just like
the animals killed on PND 42 with regard to time of dosing, collection of vaginal
opening and estrous cycle observations, etc. All animals are dosed between
0700 and 0900 hours local time, and killed beginning 2 hours following dosing. It
is critical that kills are completed by 1300 hours due to normal diurnal fluctuation
in thyroid hormone levels.
On the day before the kills are to be performed, the animals are moved to a
holding room separate from the room in which the kills and/or necropsies are
performed. On the day of kill, dosing is done in the holding room. During kills,
the holding room should be undisturbed except for the removal of the next
individual to be killed. Only the animal which is to be killed next should be
transferred from the holding room to the room in which the kill is performed, and
the time for transfer and kill should be as brief as possible.
The preferred method of kill is by injectable anesthetic, followed immediately by
decapitation in order to obtain a sufficient volume of blood for the T4 and TSH
measurements. If necessary to obtain a sufficient volume of blood, the
decapitation may be performed after anesthesia has been achieved but before
death. Carbon dioxide is not an anesthetic. A less preferred but still acceptable
method of kill is by decapitation without anesthesia. Decapitation has generally
not been found to interfere with the integrity of the thyroid, which must be
maintained in order to obtain thyroid weight and histology sections.
9 See Goldman, J.M., Murr, A.S., and Cooper, R.L. (2007) The rodent estrous cycle: characterization of
vaginal cytology and its utility in toxicological studies. Birth Defects Research (Part B) 80:84-97. A typical
cycle consists of two or three days of diestrus, one day of proestrus, and one or two days of estrus. In
the postpubertal female, this pattern develops shortly after vaginal opening and regular cycling is the
norm for the young adult female.

The order of necropsy is randomized or otherwise evenly distributed across all
groups being necropsied that day. That is, do not necropsy all animals in one
group before moving to the next group. When two or more test chemicals use
the same control group, it is particularly important to intersperse the control
animal necropsies across the entire time span in which all of the necropsies for
all the test chemicals and dose levels are conducted.
Blood from the trunk of the animal is collected immediately (e.g., by inversion
over a funnel). Collect the blood in serum separation tubes (i.e., without EDTA or
heparin). The amount of blood needed is specified by the hormone kits'
manufacturers. After collection, the blood is centrifuged at 3000 × g for 30
minutes. The serum is pipetted into siliconized microcentrifuge tubes10 and
stored at -20 °C or colder for subsequent hormone and blood chemistry
At necropsy, the ovaries (without oviducts), uterus, thyroid (with attached portion
of trachea), liver, kidneys, pituitary, and adrenals are removed and the weights of
each except the thyroid/trachea and uterus recorded in milligrams to one decimal
place with the exception of kidney and liver, which are recorded in grams, to two
decimal places. (Kidneys, adrenals, and ovaries are weighed as pairs.) Care
must be taken to remove mesenteric fat from the uterine horns with small
surgical scissors. The uterus and cervix are separated from the vagina11. The
uterus is then placed on filter paper, slit to allow the fluid contents to leak out,
gently blotted dry and weighed. Small tissues such as the adrenals and pituitary,
as well as tissues that contain fluid, should be weighed immediately to prevent
tissues from drying out prior to weighing. Measures to prevent drying out may be
necessary if such organs cannot be weighed immediately. For example, the
organs may be placed in a weigh-boat and a moist paper towel used to cover the
weigh-boat, but the paper towel must not come into contact with the organs at
any time.
The ovaries and uterus are placed in 10% buffered formalin for at least 24 hours,
after which they are rinsed and stored in 70% ethanol until embedded in paraffin.
They are then stained with hematoxylin and eosin (H&E) for subsequent
histological evaluations. The thyroid, with attached trachea, is fixed in 10%
buffered formalin for at least 24 hours. Then the thyroid (with parathyroids) is
dissected from the trachea, blotted and weighed to the nearest 0.01 mg, placed
in 70% ethanol until embedded in paraffin, stained with H&E, and histologically
evaluated. Kidney, like thyroid, is fixed in 10% buffered formalin for at least 24
hours, then placed in 70% ethanol until embedded in paraffin, stained with H&E,
and histologically evaluated.
10	If there is a greater volume of blood than will fit in one microcentrifuge tube, prepare as many separate
aliquots as appropriate before freezing. Do not freeze in large aliquots, to avoid excessive freeze/thaw
cycles.
11	See attachment for guidance.

(l) Hormonal Assays. Hormonal measurements can be conducted using
radioimmunoassay (RIA), immunoradiometric assay (IRMA), enzyme-linked
immunosorbent assay (ELISA), or time-resolved immunofluorescent procedures.
Regardless of which is used, always include multiple quality control (QC)
samples run in duplicate that are dispersed among the test samples. Any
measurement kit that is used must be shown to yield appropriate values for
control rats at the laboratory performing the pubertal assay. This includes
demonstrating that QC was performed as described by the kit manufacturer and
that the performance falls within the range defined by the manufacturer. If the kit
does not provide or specify a standard control, then the lab should use its own
historical quality control samples. The lab's criteria for evaluating the kit's
performance must be included in the study report. If the laboratory has never
had experience with the kit for making measurements specifically in the rat, it
should test the kit in one or more untreated rats outside of the pubertal assay
before relying on it for the full study.
(m) Blood Chemistry. Any standard panel of blood chemistry tests that includes
creatinine and blood urea nitrogen (BUN) may be used as long as the
measurements are calibrated for rats and the normal ranges for controls are
reported. The normal ranges for controls may be from the literature (in which
case the reference should be given), or from historical controls.
(n) Histology. Uterus, thyroid, one ovary, and one kidney are evaluated for
pathologic abnormalities and potential treatment-related effects.
Thyroid sections are subjectively evaluated for follicular cell height and colloid
area using a five point grading scale (1 = shortest/smallest; 5 = tallest/largest)12
and any abnormalities/lesions noted. A minimum of two sections of each of the
two lobes of the thyroid are evaluated. Example photomicrographs are attached.
The examples illustrate the magnitude of differences that are typically evaluated
as separate scores, but the reader will need to establish the range appropriate
for the particular study being evaluated.
Ovarian histology following H&E staining should include an evaluation of follicular
development (including presence/absence of tertiary/antral follicles,
presence/absence of corpora lutea, changes in corpus luteum development,
changes in number of both primary and atretic follicles) in addition to any
abnormalities/lesions, such as ovarian atrophy. Five random sections are
evaluated using the method of Smith BJ et al.13
12	see Capen CC, Martin SL. 1989. The effects of xenobiotics on the structure and function of thyroid
follicular and C-cells. Toxicol Pathol 17(2):266-93.
13	Smith BJ, Plowchalk DR, Sipes IG, Mattison DR. 1991. Comparison of random and serial sections in
assessment of ovarian toxicity. Reproductive Toxicology 5(4):379-383. As cited in Plowchalk DR, Smith
BJ, Mattison DR. "Assessment of toxicity to the ovary using follicle quantitation and morphometrics".
Chapter 5 in Tyson CA, Witschi H, Methods in Toxicology, Vol 3B, Female reproductive toxicology,
Heindel JJ, Chapin RE eds.,1993.

Uterine histology should document cases of uterine hyper- or hypotrophy as
characterized by changes in uterine horn diameter and myometrial, stromal, or
endometrial gland development.
The final histological assessment should take into account the stage of the
estrous cycle of the female at the time of necropsy as ovarian and uterine cellular
changes are dependent upon endocrine status.
(o) Statistical Analysis. Consideration should be given to whether there are any
data points that should be excluded from the data set, and whether any data
points that are identified as outliers by an appropriate statistical test should
actually not be excluded, based on toxicological judgment. Values due to
obvious technical errors are excluded. Justification for exclusion of each data
point must be given. Outliers must be specified in the raw data. Do not test
incidence data, e.g., from histopathology evaluation, for outliers.
All data except histology and cyclicity evaluations (i.e., initial body weight [PND
22], body weight gain14, age and body weight at vaginal opening, body and organ
weights at necropsy, and serum hormones) are analyzed by Analysis of Variance
(ANOVA). If the study was conducted in blocks, then the analysis is a two-way
ANOVA with Block and Treatment as main effects. Age and body weight at VO,
and all organ weights are also to be analyzed by Analysis of Covariance
(ANCOVA) using the body weight at PND 21 as the covariate15. When
statistically significant effects are observed (p < 0.05), treatment means are
examined further using appropriate pairwise comparison tests to compare the
control with each dose group16. Where there is heterogeneity of variance, data
are to be transformed appropriately prior to ANOVA/ANCOVA, or analyzed using
an appropriate nonparametric test. Non-parametric analysis should be the
method of last resort since it does not allow analysis of covariation. In addition to
ANOVA and ANCOVA, examine the unadjusted and adjusted values for linear
trend with dose level.
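
For orientation, the simplest case of the analysis above (a one-way ANOVA across treatment groups, with no blocks and no covariate) can be sketched in plain Python. This is an illustrative sketch only: the function name is hypothetical, it returns the F statistic and degrees of freedom but not the p-value (which requires the F distribution from a statistical package), and it does not cover the two-way, ANCOVA, pairwise comparison, or trend analyses named in the text.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

if __name__ == "__main__":
    # Made-up values for control, low-dose, and high-dose groups
    control = [1.0, 2.0, 3.0]
    low = [2.0, 3.0, 4.0]
    high = [3.0, 4.0, 5.0]
    print(one_way_anova_f([control, low, high]))
```

In practice a validated statistics package would be used for the full analysis, including the variance homogeneity checks and the covariate adjustment for PND 21 body weight.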
In cases where vaginal opening has not occurred prior to necropsy, use the last
day of observation +1 as the age at vaginal opening when determining the mean
for each group. For example, if the animal was killed on PND 42 without vaginal
opening, use PND 43 as the value for that animal when determining the mean for
the treatment group.
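
The substitution rule above can be expressed as a short Python sketch. The function names and the record layout (a pair of vaginal opening day, or None if it was never observed, and the last day of observation) are assumptions made for illustration.

```python
from statistics import mean

def age_at_vo_for_mean(vo_day, last_observation_day):
    """Age at vaginal opening to use when computing the group mean.

    If vaginal opening was never observed (vo_day is None), use the
    last day of observation + 1, as described in the text.
    """
    if vo_day is None:
        return last_observation_day + 1
    return vo_day

def group_mean_age_at_vo(records):
    """records: list of (vo_day or None, last_observation_day) per animal."""
    return mean(age_at_vo_for_mean(vo, last) for vo, last in records)

if __name__ == "__main__":
    # Two animals opened on PND 33 and 35; one had not opened by PND 42
    print(group_mean_age_at_vo([(33, 42), (35, 42), (None, 42)]))
```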
14	Use the body weight on the last day all the animals were weighed. Specifically, if kills were performed
over two days, do not use the day when only the last half of the animals were available.
15	The covariate is body weight on the day of weaning rather than on the day of kill because ANCOVA
assumes that the covariate that is being adjusted for is not affected by the treatment, whereas in this
assay endocrine-active substances may affect the overall body weight gain and thus body weight at kill.
Using body weight at kill as covariate could mask a potentially endocrine effect on an organ. The Agency
understands that using body weight at kill as covariate might identify which organs are more sensitive (or
less sensitive) than body weight to potentially endocrine effects, but has chosen to maximize the potential
of identifying organ-specific effects rather than relying on bodyweight as an indicator of potential
endocrine activity.
16	Comparison of the means of dose groups to each other is not required.

Chi-square analysis is used to determine significant differences between the
cycling status (cycling vs. not cycling) of the treated groups from the control
group. Similarly, chi-square analysis is used to determine significant differences
between treated groups from the control group for the percent of animals cycling
Cycle length may be defined as either the number of days from one proestrus to
the next proestrus, or from one diestrus to the next diestrus. Whichever
definition is chosen must be applied uniformly to all groups in the study.
Incomplete cycles are not counted in calculating mean cycle length. Mean cycle
length for each animal is calculated first, and the mean of these means is then
calculated to represent the group.
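The mean-of-means calculation can be illustrated with a short sketch. The stage codes ("P", "E", "D"), the proestrus-to-proestrus definition, and the example records are assumptions chosen for illustration; animals with no complete cycle are simply left out of the group mean here, which is also an assumption.

```python
from statistics import mean

def cycle_lengths(stages, marker="P"):
    """Lengths (in days) of complete cycles, measured from the first day
    of one `marker` stage to the first day of the next."""
    onsets = [i for i, s in enumerate(stages)
              if s == marker and (i == 0 or stages[i - 1] != marker)]
    # Only intervals between two observed onsets count; the partial
    # cycles before the first and after the last onset are ignored.
    return [b - a for a, b in zip(onsets, onsets[1:])]

def group_mean_cycle_length(animals, marker="P"):
    """Mean of the per-animal mean cycle lengths."""
    per_animal = [mean(lengths)
                  for lengths in (cycle_lengths(a, marker) for a in animals)
                  if lengths]
    return mean(per_animal) if per_animal else None

# Hypothetical daily smear records for three animals.
animal_a = "D D P E D D P E D D D P".split()   # cycles of 4 and 5 days
animal_b = "P E D D P E D D P".split()         # cycles of 4 and 4 days
animal_c = "D D D D D D P E D".split()         # no complete cycle

result = group_mean_cycle_length([animal_a, animal_b, animal_c])
print(result)
```

Averaging within each animal first keeps an animal with many short cycles from dominating the group value.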
When possible, appropriate statistical analysis should be applied to the histology
(p) Data Summary. The Agency requests that all raw data and data summaries be
provided in electronic format (spreadsheet or comma-separated values), along
with all formatting information that is necessary to read the data. The Agency
intends to provide a suggested template with the posting of this guideline on the
Agency's Web site.17 Provide the following figure and tables for each test
chemical along with the respective control. Be sure to use the units shown in
the example tables. Provide values to one decimal place for the organs
reported in milligrams, and to two decimal places for those organs that are
required to be reported in grams.
Prepare an executive summary describing the number and strain of rats used in
the study, the dose levels and chemicals tested, and the effects, with levels of
statistical significance for all endpoints except histology. Include a summary of
the histological findings.
Electronic and hard copies of spreadsheets containing the raw data from all
animals for each endpoint are to be submitted to the EPA. Also provide the full
report of the histological findings with photomicrographs of significant
17 Available on-line at: http://www.epa.gov/oppts (select "Test Methods & Guidelines" on the left side
navigation menu). You may also access the guidelines in http://www.regulations.gov under Docket ID #:
(1) Figure 1. Graph the mean body weight +/- standard deviation (SD) for
each day during dosing for each treatment group, including vehicle
control. (If animals were necropsied over 2 days, do not include the body
weight from the last day of necropsy since only half of the animals are
available.) Place an arrow at the mean age of the controls at VO. All
three groups (control, dose level 1, and dose level 2) are plotted on the
same graph but are distinguished from each other by point and line styles.
(2) Table 1. Vaginal Opening; General Growth. Report the mean,
standard deviation, coefficient of variation, number of animals (N), and
p-value for the following endpoints, for each test-chemical dose group and
control, both unadjusted (U) and adjusted (A) for body weight on PND 21:
•	Age at VO
•	Body weight at the age of VO
•	Initial body weight (PND 22)
•	Body weight on the last day all the animals were weighed (i.e., if
kills were performed over two days, do not use the day when only
the last half of the animals were available)
•	Final body weight as percent of control (leave control column blank)
•	Body weight gain from first dose to the (first) day of necropsy
Mark endpoints that show an effect (by ANOVA/ANCOVA or a non-
parametric test, as appropriate) with an asterisk in the "Effect" column.
List the transformation (if any) used to eliminate heterogeneity of variance,
or the non-parametric test used, in the "Transform or nonparam" column.
Name the pairwise test used to compare the means of dosed groups to
the mean of controls in the "Pairwise test" column.
Mark means that are significantly different from control means (p < 0.05)
by shading the cell (rather than by the traditional asterisk).
Show the proportion of animals in which VO had not occurred by the time
of necropsy (e.g., X/15), and explain that age-at-necropsy-plus-one (e.g.,
43) was used for those animals when calculating the mean.

Table 1. Vaginal Opening; General Growth.
Column headings: Vehicle Control | Dose Level 1 | Dose Level 2
Row labels:
  Age at VO
  Body weight at VO
  Initial body weight (PND 22, g)
  Final body weight
  Final body weight (% of control)
  Body weight gain (final minus initial body weight) (g)
  Proportion unopened (#/N)
*Means different from controls at p<0.05 are marked by a shaded cell.

(3) Table 2. Estrous Cycle Status and Organ Weights at Necropsy.
Report the number of animals in each stage of the estrous cycle at
necropsy, for each treatment group. An animal is considered to be "not
cycling" if she shows three or more consecutive days of estrus or five or
more consecutive days of diestrus.
Report the mean, standard deviation, coefficient of variation, number of
animals (N), and p-value for liver, kidneys, pituitary, adrenals, ovaries,
uterus, and thyroid weights, for each treatment group, both unadjusted (U)
and adjusted (A) for body weight on PND 21. Also report the mean,
standard deviation, and p-value of the organ-weight-to-body-weight ratio
for liver, kidney, adrenals, and pituitary only. For ovaries, uterus, and
thyroid weights, do not use relative organ to body weight ratios, and do not
adjust for body weight at necropsy.
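The "not cycling" definition given above for Table 2 (three or more consecutive days of estrus, or five or more consecutive days of diestrus) can be expressed as a small check. The stage codes "E" and "D" and the function names are assumptions for illustration only.

```python
from itertools import groupby

def longest_run(stages, stage):
    """Length of the longest run of consecutive days scored as `stage`."""
    return max((len(list(run)) for s, run in groupby(stages) if s == stage),
               default=0)

def is_not_cycling(stages):
    """Apply the Table 2 rule: >= 3 consecutive days of estrus ("E")
    or >= 5 consecutive days of diestrus ("D")."""
    return longest_run(stages, "E") >= 3 or longest_run(stages, "D") >= 5

# Hypothetical daily smear records.
regular = "P E D D P E D D".split()
persistent_estrus = "P E E E D D".split()
persistent_diestrus = "D D D D D P E".split()

print(is_not_cycling(regular),
      is_not_cycling(persistent_estrus),
      is_not_cycling(persistent_diestrus))
```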

Table 2. Estrous Cycle Status and Organ Weights at Necropsy.
Column headings: Chemical Name | Vehicle Control | (Dose Level 1) | (Dose Level 2)
Cycle status at kill
  Not cycling
Organ weights (additional column headings: Transform or nonparam | Pairwise test)
  Uterus, wet
  Uterus, blotted
*Means different from controls at p<0.05 are marked by a shaded cell.
U=Unadjusted, A=Adjusted, R=Organ-to-body-weight ratio

(4) Table 3. Estrous Cyclicity. Show the mean age at first vaginal estrus
for each treatment group and mark those groups which are significantly
different from the vehicle control group (p < 0.05). Report the mean cycle
length for the group. Report also the percent of each group cycling, and
the percent cycling regularly, and mark those groups which are
significantly different from the vehicle control group.
Table 3. Estrous Cyclicity.
Column headings: Test Article | Mean Age at First Vaginal Estrus (PND) | Mean Cycle
Row labels:
  Test chemical

Differences from controls at p<0.05 are marked by a shaded cell.
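The comparison of percent cycling in a treated group against the vehicle control can be sketched as a 2 x 2 chi-square test using only the standard library. This is an illustration under stated assumptions: it uses the uncorrected Pearson statistic (the guideline does not specify a continuity correction), the function name is hypothetical, and the counts are invented.

```python
import math

def chi_square_2x2(cycling_ctrl, n_ctrl, cycling_trt, n_trt):
    """Pearson chi-square (1 degree of freedom, no continuity correction)
    for cycling vs. not cycling in control vs. one treated group.
    Returns (chi2, p)."""
    a, b = cycling_ctrl, n_ctrl - cycling_ctrl
    c, d = cycling_trt, n_trt - cycling_trt
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero margin (e.g., every animal cycling) gives no test.
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Upper-tail probability of chi-square with 1 df: erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Hypothetical counts: 14 of 15 controls cycling, 8 of 15 treated cycling.
chi2, p = chi_square_2x2(14, 15, 8, 15)
print(round(chi2, 3), round(p, 4), p < 0.05)
```

Each treated group is compared with the control separately, in line with the text's focus on differences from the vehicle control group.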

(5) Table 4. T4 and TSH Levels. Report the mean, standard deviation, coefficient of variation, number of
animals, and p-value for the T4 and TSH levels, for each treatment group, including vehicle control.
Table 4. T4 and TSH Levels.
Column headings: Chemical name | Vehicle Control | (Dose Level 1) | (Dose Level 2) | Transform or nonparam | Pairwise test
Row labels:
  Serum T4, total
  Serum TSH,
*Means different from controls at p<0.05 are marked by a shaded cell.
(6) Table 5. Blood Chemistry. Similarly to Table 4, report the mean, standard deviation, coefficient of
variation, number of animals, and p-value for each of the parameters measured. Also report the normal
range for each parameter, and indicate whether these normal values are from the literature (provide
reference) or from historical controls.
Table 5. Blood Chemistry.
Column headings: Chemical Name | Vehicle Control | (Dose Level 1) | (Dose Level 2) | Transform or nonparam | Pairwise test | Normal range+
*Means different from controls at p<0.05 are marked by a shaded cell.

(q) Performance Criteria. The following performance criteria have been
established for the vehicle-control animals. See the Data Interpretation
Procedure for use of the performance criteria. Units for the endpoints are shown
in the table. Coefficients of variation (CVs) are in percent. The "mean", "2 SDs",
"CV", and "1.5 CV" columns describe the mean, two standard deviations,
coefficient of variation, and 1.5 times the coefficient of variation for that endpoint
in historical controls. Mean values and CVs for the vehicle control group must
fall in the acceptable range of each to be considered fully acceptable.
Table 6. Performance Criteria for Controls (Sprague-Dawley Strain).
Column headings: 2 SDs | Acceptable range | 1.5 CV | Top of
Endpoint                                                Acceptable range
Uterus, blotted (milligrams)                            187.40 to 410.38
Ovaries (milligrams)                                    36.54 to 114.77
T4 (total, µg/dl)                                       2.69 to 5.38
Thyroid weight (milligrams)                             6.20 to 22.20
Age at VO (postnatal day, where day of birth = PND 0)   30.67 to 35.62
Weight at VO (grams)                                    101.71 to 131.44
Final body weight (grams)                               104.86 to 204.55
Adrenals (milligrams)                                   38.34 to 48.84
Kidneys (grams)                                         0.95 to 2.20
Liver (grams)                                           4.32 to 11.78
Pituitary (milligrams)                                  5.86 to 12.08
a Bottom of the acceptable range for coefficient of variation is zero.
No performance criteria have been established yet for TSH since there were too
few studies from which reliable historical control values resulting from the same
analytical method could be obtained. Such criteria may be established in the
future as more data become available.
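A comparison of control-group values against the performance criteria might be sketched as below. The sketch assumes that the ranges listed in Table 6 apply to the control-group mean; the dictionary keys, function name, example data, and the CV ceiling of 25 percent are hypothetical, and the CV ceiling must be taken from the guideline's table, not from this example.

```python
from statistics import mean, stdev

# Acceptable ranges copied from Table 6 (a subset, for illustration).
ACCEPTABLE_MEAN_RANGE = {
    "uterus_blotted_mg": (187.40, 410.38),
    "ovaries_mg": (36.54, 114.77),
    "t4_total_ug_dl": (2.69, 5.38),
    "thyroid_mg": (6.20, 22.20),
    "age_at_vo_pnd": (30.67, 35.62),
}

def check_control_endpoint(name, values, cv_ceiling=None):
    """Compare the measured (unadjusted) control mean and CV (percent)
    with the criteria. cv_ceiling is supplied by the caller."""
    m = mean(values)
    cv = 100 * stdev(values) / m
    low, high = ACCEPTABLE_MEAN_RANGE[name]
    return {
        "mean": m,
        "cv_percent": cv,
        "mean_ok": low <= m <= high,
        # The bottom of the acceptable CV range is zero (footnote a),
        # so only the ceiling needs checking.
        "cv_ok": None if cv_ceiling is None else cv <= cv_ceiling,
    }

# Hypothetical control thyroid weights (mg) and hypothetical CV ceiling.
report = check_control_endpoint("thyroid_mg", [12.0, 14.0, 16.0, 18.0],
                                cv_ceiling=25.0)
print(report)
```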
(r) Data Interpretation Procedure. The female pubertal assay is intended to be
one of a suite of in vitro and in vivo assays for determining the potential of a
substance to interact with the endocrine system (Tier 1 assays). Therefore, it is
important to emphasize that the data interpretation of a specific chemical will be
a combination of the results from a number of these Tier-1 screening assays
taken as a whole and not merely the sum of results of assays interpreted in
isolation. That said, there are certain guidelines that can be given for interpreting
data from a female pubertal assay.

First, the dose levels tested should be examined to see if a Maximum Tolerated
Dose was used. (The highest dose level need not exceed a limit dose of 1
g/kg/day, even if MTD has not been reached.) Body weight loss (compared to
controls at termination) that does not exceed approximately 10% is an indication
that MTD was approached but not exceeded. Adverse clinical observations or
histopathology of the kidney and/or other organs, and/or significant deviations of
blood chemistry values of treated animals vs. controls may be indications that
MTD was exceeded.
Negative results for interaction with the endocrine system in the pubertal assay
will generally require demonstration that the highest dose level tested was at or
near the MTD. Positive results in the assay generally require no such proof, but
will generally require demonstration that interference due to decrease in body
weight gain compared to controls per se was not a factor in generating the
results. Studies that suggest interaction with endocrine systems only at a dose
level that causes more than approximately 10% decrease in body weight gain at
termination compared to controls may require additional studies and/or a weight-
of-evidence approach using other information in order to be interpretable.
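The body-weight screen described above amounts to a percent-decrease calculation against controls. The sketch below is illustrative only: the function names and example gains are assumptions, and because the text says "approximately 10%", the threshold is a default to be weighed with judgment, not a hard cutoff.

```python
def percent_decrease_vs_control(control_mean, treated_mean):
    """Percent decrease of a treated-group mean relative to the control mean."""
    return 100.0 * (control_mean - treated_mean) / control_mean

def exceeds_weight_gain_threshold(control_gain, treated_gain,
                                  threshold_pct=10.0):
    """True when the decrease in body weight gain at termination,
    compared to controls, is more than the (approximate) threshold."""
    return percent_decrease_vs_control(control_gain, treated_gain) > threshold_pct

# Hypothetical mean body weight gains in grams.
control_gain = 100.0
low_dose_gain = 92.0    # 8% decrease
high_dose_gain = 85.0   # 15% decrease

print(exceeds_weight_gain_threshold(control_gain, low_dose_gain),
      exceeds_weight_gain_threshold(control_gain, high_dose_gain))
```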
The endpoint values for the control group should be compared to the
performance criteria. Comparison should be made on the basis of the measured
values, not adjusted values. Any endpoints which do not meet the performance
criteria in controls will generally be given little weight for the test chemicals if they
are negative but may provide useful information if they are positive.
Information that is missing due to inability to meet a performance criterion is not
the same as a negative result. The more endpoints that are missing, the less
likely the study will be regarded as adequate. No firm rules can be given for the
minimum number of endpoints that must be available for evaluation since some
of the endpoints are somewhat redundant while others are not. In general,
however, missing one or two performance criteria will not be regarded as fatal to
the study.
More emphasis will be placed on meeting performance criteria for the coefficients
of variation than for the endpoint control means. Laboratories may submit
historical data for their own colonies to substantiate claims that tissue weights or
other endpoints in the study being evaluated are in line with historical values of
controls in that laboratory.
Once the data set has been compared to the performance criteria, it is evaluated
to see if there is evidence of interaction of the test chemical with the endocrine
Due to the covariance of certain organ weights with body weight, care should be
taken in interpreting pituitary, liver, and kidney weight changes. Only if a change
in the organ weight relative to body weight is significant for these particular
organs (i.e., not all the organs) should the weights adjusted for covariance with
body weight at weaning for these particular organs be interpreted as relevant.
Do not evaluate endpoints other than pituitary, liver, and kidney weights on their
values relative to terminal body weight, nor use analysis of covariance with
terminal body weight for interpretation of any but pituitary, liver, and kidney
weights. Since endocrine-active agents themselves may have an effect on body
weight, it is most appropriate to adjust for covariance with body weight at
weaning, before chemical treatment began.
Weight of ovaries and uterus must be interpreted carefully, due to the natural
variability in these endpoints in cycling animals. In general, regularity of cycling
should be given more weight than lack of statistical significance for the difference
in weight of ovary or uterus in treated animals compared to controls. (Presence
of a statistically significant difference from controls should be considered more
informative than absence of such a difference.)
The judgment of the histopathologist as to whether an effect on ovary, uterus,
and/or thyroid is associated with exposure to the test chemical is important to
consider when evaluating the organ weights and hormone levels measured in
this assay. Severity and incidence of effect(s), and dose-response relationship
may also be important information to consider.
Because there are multiple endpoints examined in this assay, there is
redundancy for the detection of potential endocrine system interaction. For
example, both strong (ethynyl estradiol) and weak (methoxychlor) estrogens
dramatically advanced the age of vaginal opening, altered body weight at VO,
and age at first vaginal estrus. Redundancy is particularly useful when the
responses from all the redundant endpoints are consistently positive since it
gives greater confidence that the interaction with the endocrine system is real.
However, consistency across all redundant endpoints is not required in order to
infer interaction with the endocrine system. There may be valid reasons for
apparently-redundant endpoints to differ in their response.
If an isolated endpoint is positive at the lower dose and no effect is seen at the
higher dose, then the effect and the overall conclusions about the substance may
need to be questioned. However, since the assay provides information from only
two dose levels, the dose-response information from the female pubertal assay is
sparse and informs the weight of evidence for interaction with the endocrine
system but generally does not control it.
Compounds that exert effects via various mechanisms or modes of interaction
with the endocrine system can be identified using the female protocol. A
summary of the kinds of effects that might be seen from various different modes
of action is shown in Table 7. The table is provided to help with interpretation of
results, but determining a mode of action is not required in order to consider the
assay positive for interaction with the endocrine system. Furthermore, this table
is not to be interpreted as requiring that all of the endpoints shown respond as
indicated for a particular mode. Interaction with the endocrine system may be
occurring without the complete profile shown.
Table 7. Potential Changes Indicative of Different Modes of Action that May Be
Observed in the Female Pubertal Protocol.
Estrogen Agonist
Inhibition of
Disruption of Hypo-
pit axis1
Early VO,
pseudoprecocious puberty
Delayed VO
Alterations in VO
Decreased T4
Reduced BW at VO
Delayed first estrus
Alterations in cyclicity
Alterations in TSH
Early first estrus
Persistent diestrus
Altered ovarian,
uterine or pituitary
Changes in thyroid
Altered organ histology
Reduced uterine
Altered organ
Changes in thyroid
Possible persistent estrus
Altered organ histology

Changes in liver
weight/enzyme profile
Reduced ovarian weight

Increased uterine weight

Changes in hypothalamic-pituitary function may advance or delay puberty, modify the ovarian cycling by
inducing early cycles, alter the regularity of cycles and alter tissue weights depending on whether the
chemical activates or inhibits pubertal development.
Figure 1. Removal and Preparation of the Uterine and Ovarian Tissues for Weight

Make a medial incision approximately five inches long on the ventral aspect of
the rat from the vaginal opening towards the head. Locate the vagina ventral to
the urinary bladder. Locate the uterine horns and ovaries bilaterally and detach
from the dorsal abdominal wall. Detach the uterus and vagina from the body by
incising the vaginal wall at A. Transfer the tissues to a tared weigh boat. Detach
the ovaries by cutting between the small white oviducts and the ends of the
uterine horns (B). Before measuring the uterine weight, carefully trim away the
excess fat and connective tissue from the uterine horns and body and remove
the vagina by cutting just caudal to the uterine cervix as shown in the figure (C).
Uterine weight without luminal fluid (i.e., blotted weight) should be measured.
Finally, cut between each oviduct and ovary (D), remove the ovarian bursa, and
trim away the ovarian fat before measuring the weight of each ovary.

Stages of Vaginal Opening

(s) Explanation of Thyroid Slides. The slides are coded as follows:
•	Positions 1 and 2: follicular cell height score, 1 to 5 with 5 as greatest
•	Positions 3 and 4: colloid area score, 1 to 5 with 5 as greatest area.
"F5C1" represents a follicular score of 5 and colloid score of 1.
"F1C5" is considered "normal".
These slides were taken at 20X and excluded the edge of the tissue.
The slides have been chosen to illustrate both follicular cell height and colloid area
changes in the minimum number of slides. While all of these examples show scores for
follicular cell height and colloid area that happen to add up to "6", this is not expected to
be the case for most slides.