Investigation of Cancer Risk
      Assessment Methods
      Volume 3.  Analyses
      Clement Associates, Inc., Ruston, LA


      Prepared for

      Environmental Protection Agency.  Washington, DC
      Sep 87
                                                                  PB88-127139

-------
TECHNICAL REPORT DATA
(Please read Instructions on the reverse before completing)
1. REPORT NO.
EPA/600/6-87/007d
2.
3. PB88-127139
4. TITLE AND SUBTITLE
Investigation of Cancer Risk Assessment Methods:
Volume 3. Analyses
5. REPORT DATE
September 1987
6. PERFORMING ORGANIZATION CODE
7. AUTHOR(S)
Bruce C. Allen, Annette M. Shipp, Kenny S. Crump, Bryan Kill an, Mary Lee Hogg, Joe Tudor,
Barbara Keller
8. PERFORMING ORGANIZATION REPORT NO.
9. PERFORMING ORGANIZATION NAME AND ADDRESS
Clement Associates, Inc.
1201 Gaines Street
Ruston, LA 71270
10. PROGRAM ELEMENT NO.
11. CONTRACT/GRANT NO.
68-01-6807
12. SPONSORING AGENCY NAME AND ADDRESS
Office of Health and Environmental Assessment
Carcinogen Assessment Group (RD-689)
U.S. Environmental Protection Agency
Washington, DC 20460
13. TYPE OF REPORT AND PERIOD COVERED
14. SPONSORING AGENCY CODE
EPA/600/21
15. SUPPLEMENTARY NOTES
EPA project officer: Chao Chen, Carcinogen Assessment Group,
Office of Health and Environmental Assessment, Washington, DC (382-5719)
16. ABSTRACT
The major focus of this study is upon making quantitative comparisons of
carcinogenic potency in animals and humans for 23 chemicals for which suitable
animal and human data exist.  These comparisons are based upon estimates of risk
related doses (RRDs) obtained from both animal and human data.  An RRD represents
the average daily dose per body weight of a chemical that would result in an extra
cancer risk of 25%.  Animal data on these and 21 other chemicals of interest to the
EPA and the DOD are coded into an animal data base that permits evaluation by
computer of many risk assessment approaches.
This report is the result of a two-year study to examine the assumptions,
other than those involving low dose extrapolation, used in quantitative cancer risk
assessment.  The study was funded by the Department of Defense [through an
interagency transfer of funds to the Environmental Protection Agency (EPA)], the EPA,
the Electric Power Research Institute and, in its latter stages, by the Risk Science
Institute.
17. KEY WORDS AND DOCUMENT ANALYSIS
a. DESCRIPTORS
b. IDENTIFIERS/OPEN ENDED TERMS
c. COSATI Field/Group
REPRODUCED BY
U.S. DEPARTMENT OF COMMERCE
NATIONAL TECHNICAL INFORMATION SERVICE
SPRINGFIELD, VA 22161
18. DISTRIBUTION STATEMENT
Distribute to public
19. SECURITY CLASS (This Report)
Unclassified
20. SECURITY CLASS (This page)
Unclassified
21. NO. OF PAGES
22. PRICE

-------
                                 DISCLAIMER

     This document has been reviewed in accordance with the U.S. Environmental
Protection Agency's peer and administrative review policies and approved for
publication.  Mention of trade names or commercial products does not constitute
endorsement or recommendation for use.  The information in this document has
been funded by the U.S. Environmental Protection Agency, the Department of
Defense (through Interagency Agreement Number RW97Q751Q1), the Electric
Power Research Institute, and the Risk Science Institute.

-------
                                CONTENTS

Section

1    METHODOLOGY
          Introduction
          Analysis of Bioassay Data
             Definition of Analysis Methods
             Description of Approaches to Components
             Definition of Standard Methods
             Sieve
          Comparison With Epidemiological Results
             Correlation Analysis
             Prediction Analyses
             Uncertainty

2    RESULTS
          Correlation Analysis
             Evaluation of Sieve
             Analyses That Use Combination of
              All Significant Individual Responses
             Analyses That Utilize Malignant Neoplasms Only
             Adjustment for Early Deaths by Considering Only
              Animals Alive at the Time of Occurrence of the
              First Tumor
             Analyses That Utilize Only Long Studies
             Analyses That Use Same Tumor Response in
              Animals as in Humans
             Comparison of Correlations from Data from
              Specific Animal Species
             Choice of Dose Units
             Identification of Analyses Yielding Higher
              Correlations
          Prediction Analysis
             Sieve
             Predictors
             Comparison of Analysis Methods
             Asymmetric Loss
             Animal-to-Human Conversion
             Uncertainty

3    DISCUSSION
          Positive Correlation
          Data Quality and Data Screening
          Application of Analysis Results in Extrapolating
            from Animals to Humans
          Identification of Good Methods
             Predictors
             Analysis Methods
          Component-Specific Uncertainty
          Options for Presenting a Range of Risk Estimates
             Option 1
             Options 2 and 3
             Comparison of Options
             Examples
          General Considerations and Major Conclusions
          Directions for Future Research

-------
                              ILLUSTRATIONS
1-1   Final RRD Estimate
2-1   Correlation Analysis: Standard Analysis (0)
2-2   Correlation Analysis: Standard Analysis (0)
2-3   Correlation Analysis: Standard Analysis (0)
2-4   Correlation Analysis: Standard Analysis (0)
2-5   Correlation Analysis: Long Experiment Only (1)
2-6   Correlation Analysis: Long Dosing Only (2)
2-7   Correlation Analysis: Long Dosing Only (2)
2-8   Correlation Analysis: Route That Humans Encounter (3a)
2-9   Correlation Analysis: Any Route of Exposure (3b)
2-10  Correlation Analysis: Any Route of Exposure (3b)
2-11  Correlation Analysis: Any Route of Exposure (3b)
2-12  Correlation Analysis: Any Route of Exposure (3b)
2-13  Correlation Analysis: Average Dose Over 80% of
      Experiment (5)
2-14  Correlation Analysis: Malignant Tumors Only (7)
2-15  Correlation Analysis: Combination of Significant
      Responses (8a)
2-16  Correlation Analysis: Total Tumor-Bearing Animals (8b)
2-17  Correlation Analysis: Response That Humans Get (8a)
2-18  Correlation Analysis: Average Over Sex (9)
2-19  Correlation Analysis: Average Over Sex (9)
2-20  Correlation Analysis: Average Over Study (10)
2-21  Correlation Analysis: Average Over Study (10)
2-22  Correlation Analysis: Average Over All Species (11a)
2-23  Correlation Analysis: Average Over All Species (11a)
2-24  Correlation Analysis: Average Over Rats and Mice (11b)
2-25  Correlation Analysis: Average Over Rats and Mice (11b)
2-26  Correlation Analysis: Rat Data Only (11c)
2-27  Correlation Analysis: Mouse Data Only (11d)
2-28  Correlation Analysis: Average Over Sex, Study,
      and Species (12)
2-29  Correlation Analysis: Average Over Sex, Study,
      and Species (12)
2-30  Correlation Analysis: Average Over Sex, Study,
      and Species (12)
2-31  Correlation Analysis: Average Over Sex, Study,
      and Species (12)
2-32  Correlation Analysis: Average Over All: Combination of
      Significant Response (16)
2-33  Correlation Analysis: Average Over All: Total Tumor-Bearing
      Animals (20)
2-34  Correlation Analysis: Route and Response Like Humans (25)
2-35  Prediction Analysis: Analysis 17, Median Lower Bound
      Predictor
2-36  Prediction Analysis: Analysis 17, Median Lower Bound
      Predictor

-------
2-37  Prediction Analysis: Analysis 3b, Median Lower Bound
      Predictor
2-38  Prediction Analysis: Analysis 20, Median Lower Bound
      Predictor
2-39  Prediction Analysis: Analysis 3b, Median Lower Bound;
      Best-Fitting Lines with Increasing Degrees of Asymmetry
2-40  Prediction Analysis: Analysis 20, Median Lower Bound;
      Best-Fitting Lines with Increasing Degrees of Asymmetry
2-41  Prediction Analysis: Analysis 22, Median Lower Bound;
      Best-Fitting Lines with Increasing Degrees of Asymmetry
2-42  Component-Specific Uncertainty; Ratios of RRDs for
      Analysis 31 (mg/m²/day) to RRDs for Analysis 30
2-43  Component-Specific Uncertainty; Ratios of RRDs for
      Analysis 32 (ppm diet) to RRDs for Analysis 30
2-44  Component-Specific Uncertainty; Ratios of RRDs for
      Analysis 33 (ppm air) to RRDs for Analysis 30
2-45  Component-Specific Uncertainty; Ratios of RRDs for
      Analysis 34 (mg/kg/lifetime) to RRDs for Analysis 30
2-46  Component-Specific Uncertainty; Ratios of RRDs for
      Analysis 35 (Long Experiments Only) to RRDs for Analysis 30
2-47  Component-Specific Uncertainty; Ratios of RRDs for
      Analysis 36 (Long Dosing Only) to RRDs for Analysis 30
2-48  Component-Specific Uncertainty; Ratios of RRDs for
      Analysis 37 (Route Like Humans) to RRDs for Analysis 30
2-49  Component-Specific Uncertainty; Ratios of RRDs for
      Analysis 38 (Inhalation, Oral, Gavage, Route Like Humans)
      to RRDs for Analysis 30
2-50  Component-Specific Uncertainty; Ratios of RRDs for
      Analysis 41 (Malignant Tumors Only) to RRDs for Analysis 30
2-51  Component-Specific Uncertainty; Ratios of RRDs for
      Analysis 42 (Combination of Significant Responses)
      to RRDs for Analysis 30
2-52  Component-Specific Uncertainty; Ratios of RRDs for
      Analysis 43 (Total Tumor-Bearing Animals) to RRDs for Analysis 30
2-53  Component-Specific Uncertainty; Ratios of RRDs for
      Analysis 44 (Response Like Humans) to RRDs for Analysis 30
2-54  Component-Specific Uncertainty; Ratios of RRDs for
      Analysis 45 (Average Over Sex) to RRDs for Analysis 30
2-55  Component-Specific Uncertainty; Ratios of RRDs for
      Analysis 46 (Average Over Study) to RRDs for Analysis 30
2-56  Component-Specific Uncertainty; Ratios of RRDs for
      Analysis 47 (Average Over All Species) to RRDs for Analysis 30
2-57  Component-Specific Uncertainty; Ratios of RRDs for
      Analysis 48 (Average Over Rats and Mice) to RRDs for Analysis 30
2-58  Component-Specific Uncertainty; Ratios of RRDs for
      Analysis 49 (Rat Data Only) to RRDs for Analysis 30
2-59  Component-Specific Uncertainty; Ratios of RRDs for
      Analysis 50 (Mouse Data Only) to RRDs for Analysis 30

-------
                                 TABLES
Table

1-1   Approaches to Risk Assessment Components
1-2   Approaches Used for Initial Thirty-Eight Analyses
1-3   Standard Values Used in Analysis of Animal Bioassay Data
1-4   Approaches Used for Supplemental Analyses
1-5   Descriptions of All Analyses
1-6   Ranks Based on Length of Experiment
      and Number of Treated Animals
2-1   Correlation Coefficients and Associated p-Values,
      by Analysis Method and Sieve
2-2   Abbreviations for Chemicals Included in the Study
2-3   Average Loss as Determined by the Symmetric DISTANCE2
      Loss Function, by Analysis Method, Predictor, and Sieve
2-4   Average Loss as Determined by the Symmetric CAUCHY
      Loss Function, by Analysis Method, Predictor, and Sieve
2-5   Average Loss as Determined by the Symmetric TANH
      Loss Function, by Analysis Method, Predictor, and Sieve
2-6   Comparison of Analyses; Five Best Analyses,
      by Predictor and Loss Function
2-7   Comparison of Analyses; Five Best Analyses, Excluding
      Analyses 6, 18, and 19, by Predictor and Loss Function
2-8   Total Incremental Normalized Losses, by Analysis and Sieve
2-9   Average Loss for Restricted Sets of Chemicals
      for Analyses 3b, 17, and 20, by Loss Function
2-10  Average Loss as Determined by the Asymmetric TANH Loss
      Function for LM, by Analysis and Degree of Asymmetry
2-11  Average Loss as Determined by the Asymmetric TANH Loss
      Function for L2Q, by Analysis and Degree of Asymmetry
2-12  Y-Intercept Values for Best-Fitting Lines, LM Predictor,
      by Analysis, Sieve, and Loss Function
2-13  Y-Intercept Values for Best-Fitting Lines, L2Q Predictor,
      by Analysis, Sieve, and Loss Function
2-14  Y-Intercept Values for Best-Fitting Lines, MLEM Predictor,
      by Analysis, Sieve, and Loss Function
2-15  Y-Intercept Values for Best-Fitting Lines, MLE2Q Predictor,
      by Analysis, Sieve, and Loss Function
2-16  Average Loss for Supplemental Analyses With the L2Q
      Predictor, by Analysis, Sieve, and Loss Function
2-17  Y-Intercept Values for Best-Fitting Lines, Among
      Supplemental Analyses, by Analysis, Sieve, and Loss Function
2-18  Average Loss, by Dose Units, Sieve, and Loss Function
2-19  Y-Intercepts by Dose Units, Sieve, and Loss Function
2-20  Conversion Factors for All Dose Units, by Method of
      Analysis and Sieve
2-21  Uncertainty Factors for Analyses Without the Sieve
2-22  Uncertainty Factors for Analyses With the Sieve
2-23  Component-Specific Uncertainty: Modes and Dispersion
      Factors for Ratios of RRDs, by Supplemental Analysis
3-1   Comparison of Selected Results for Selected Analyses

-------
3-2   Median Lower Bound RRD Estimates, by Chemical
      and Analysis Method
3-3   RRD Predictions, by Chemical and Analysis Method
3-4   Uncertainty Intervals for RRD Predictions, by
      Chemical and Analysis Method
3-5   Ranges of Human RRDs Derived from the Recommended
      Set of Analyses
3-6   Ranges of Human RRDs Derived from the Recommended
      Set of Analyses Ignoring Analysis 43

-------
                                Section 1
                               METHODOLOGY
INTRODUCTION

One goal of this project is to examine various methods for analyzing
bioassay data to determine which methods produce results that correlate
well with the results obtained from epidemiological data and to
characterize the uncertainties involved.  For this to be possible,
reasonable alternative methods of analysis need to be defined.  Recall
that the introductory section (in Volume 1 of this report) listed
the components of risk assessment and several approaches for each
component; that list is reproduced in Table 1-1.  Consider Figure 1-1,
which depicts the process of risk assessment based on bioassay data: for
several experiments in each of a few species, particular carcinogenic
responses yield estimates of RRDs that are combined in some way to yield
the final estimate.  The components listed in Table 1-1 correspond to
the different levels in the tree shown in Figure 1-1 and the approaches
specify how to handle the corresponding level.  The basic method for
defining analysis methods has been to select different combinations of
the approaches, as is described in this section.

Also in this section is a description of the methods used to compare the
bioassay-based results to the epidemiologically derived estimates.  A
nonparametric generalized rank test is used to evaluate the correlation
between the two sets of estimates.  When specific point estimates from
the bioassay analyses are employed as predictors, their performance is
compared on the basis of the fit of a straight line with slope of one to
the data.  Three approaches (loss functions) used to fit the line are
described.

-------
ANALYSIS OF BIOASSAY DATA

For each chemical being analyzed, the procedure described here is
followed to derive the RRDs of interest.  For each carcinogenic response
coded from a study testing the chemical of interest, the multistage
model that best describes the response rates from all dose groups is fit
to the dose-response data.  The multistage model has the form

     P(d) = 1 - \exp\{-(q_0 + q_1 d + \cdots + q_k d^k)\},            (1-1)

where P(d) is the probability of cancer when exposed to average daily
dose d; q_0, q_1, ..., q_k ≥ 0; and k is equal to one less than the number of
dose groups.  The model is fit by an updated version of GLOBAL82 (1)
that gives maximum likelihood, lower bound, and upper bound estimates
for the dose D such that

     \frac{P(D) - P(0)}{1 - P(0)} = 0.25,

i.e., D is the dose corresponding to an extra risk of one in four.  This
dose will be called a risk related dose (RRD) corresponding to a risk of
one in four.  Similar definitions of doses corresponding to
particular levels of risk can be found in the literature.  Sawyer et al.
(2), for example, discuss "TD50", the daily dose required to halve the
probability of remaining tumorless.
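
As a rough illustration of the definition above, the following sketch evaluates the extra risk under the multistage model and solves numerically for the dose D.  It is not GLOBAL82: the coefficients are taken as given rather than fit by maximum likelihood, no lower or upper bounds are computed, and the function names and the bisection solver are assumptions introduced here for illustration only.

```python
import math


def multistage_prob(d, q):
    """P(d) = 1 - exp(-(q[0] + q[1]*d + ... + q[k]*d**k)), as in Eq. 1-1."""
    poly = sum(qi * d**i for i, qi in enumerate(q))
    return 1.0 - math.exp(-poly)


def extra_risk(d, q):
    """Extra risk over background: (P(d) - P(0)) / (1 - P(0))."""
    p0 = multistage_prob(0.0, q)
    return (multistage_prob(d, q) - p0) / (1.0 - p0)


def rrd(q, risk=0.25, tol=1e-12):
    """Dose D at which the extra risk equals `risk`, found by bisection.

    Assumes every q[i] >= 0 and at least one of q[1:] is positive, so that
    extra risk increases monotonically with dose.
    """
    if not any(qi > 0 for qi in q[1:]):
        raise ValueError("no dose-related term; the RRD is undefined")
    lo, hi = 0.0, 1.0
    while extra_risk(hi, q) < risk:      # widen the bracket until it holds D
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):  # halve the bracket until it is tight
        mid = 0.5 * (lo + hi)
        if extra_risk(mid, q) < risk:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Calling rrd(q, risk=0.5) with the same coefficients gives the dose that halves the probability of remaining tumorless, the TD50-style quantity mentioned above.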

Actually, the model is fit to each combination of dose and response
values that might arise by combination of the approaches listed in Table
1-1.  In particular, the components that affect the fitting of the model
are numbered 4, 5, and 6 in that table; 20 combinations of the
approaches to these components are possible.  Hence, as many as 20
models have been fit to each response.  (Many responses have been fit by
only 10 models since the data needed to analyze only the effective
number of animals, approach b to component 6, are not generally
available in the published literature.)  The triple of estimates
composed of the 95% lower bound, the MLE, and the 95% upper bound for
the RRD corresponding to an extra risk of 0.25, which is labeled (DL,
...), is available for each of the models fit to each response.

Definition of Analysis Methods

Each analysis method specifies which species of animal to consider, what
criteria the experiments on those species must satisfy, which responses
within those experiments to consider, and which of the 20 model results
to use.  (Throughout this report, "experiment" denotes the data from all
dose groups in a single bioassay of one species and one sex of test
animal, except when results for two sexes are reported together and
cannot be separated.)  In every case, the first step is to assign one
triple to each experiment, selecting the triple from the responses that
are eligible for that method (components 7 and 8).  The triple that is
selected is the one that has the smallest DL, the lower limit on the RRD.  This
procedure is adopted because we are interested in the evidence for
carcinogenicity and the manipulation of that evidence.  The eligible
response with the smallest DL is the one that is consistent with the
highest carcinogenic potential and may, therefore, provide the best
evidence of carcinogenicity from the experiment under consideration.
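
The assignment of one triple to each experiment can be sketched as follows.  The Triple type, the function name, and the response labels in the test data are hypothetical and serve only to illustrate the selection rule (smallest lower bound DL among the eligible responses).

```python
from typing import NamedTuple


class Triple(NamedTuple):
    lower: float  # 95% lower bound on the RRD (DL)
    mle: float    # maximum likelihood estimate of the RRD
    upper: float  # 95% upper bound on the RRD


def select_triple(eligible):
    """Return (response, triple) for the eligible response with the smallest DL.

    `eligible` maps a response label to the Triple estimated for that
    response in a single experiment.
    """
    if not eligible:
        raise ValueError("no eligible responses for this experiment")
    response = min(eligible, key=lambda r: eligible[r].lower)
    return response, eligible[response]
```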

Given this method of assignment of RRD triples to experiments, and given
the approaches listed in Table 1-1, 88320 possible analysis methods
could be defined.  Thirty-eight analyses were run as the first set
(Table 1-2).  The thirty-eight analysis methods fall into five
categories that are defined by the manner in which the data from
individual experiments are combined to yield the twelve values that are
of interest in the investigation.  Those twelve values are the minimum,
the first quartile, the median, and the third quartile of the lower
bounds, MLEs, and upper bounds.  The five categories are described below.

No Averaging.  The first category of analyses includes those that treat
each species, each study within species, and each sex within study
separately (approach a for component 9, a for 10, and d for 11; see
Table 1-1).  Let Y_{ki} be the ith lower bound for RRD in species k, and
assume that the Y_{ki} values are ordered with Y_{k1} the smallest and Y_{k,n(k)}
the largest; n(k) is the number of experiments in species k.  Define the
species-specific quartile values as follows:

     Y_{k,1Q} = Y_{k1}                     for n(k) \le 4             (1-2)
              = \ldots                     for n(k) \ge 5
     Y_{k,2Q} = Y_{k1}                     for n(k) \le 2
     Y_{k,3Q} = Y_{k,n(k)}                 for n(k) \le 4
              = Y_{k,n(k)-[n(k)/4]}        for n(k) \ge 5

Then the minimum and quartile values of the lower bounds for the
analysis are defined by

     Y_{min} = \min_k (Y_{k1})                                        (1-3)
     Y_{1Q}  = \min_k (Y_{k,1Q})
     Y_{2Q}  = \mathrm{median}_k (Y_{k,2Q})
     Y_{3Q}  = \max_k (Y_{k,3Q}).

The minimum and maximum over species are adequate to define the first
and third quartiles, respectively, because rarely are there more than
four species tested for any chemical.  The MLE and upper bound values
are defined in exactly the same manner.  Analyses 0 through 8c, 11c,
11d, and 25 are in this category.
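
The combination over species in Eq. 1-3 can be sketched as below.  The species-specific values are taken as inputs, since the sketch does not attempt to reproduce Eq. 1-2; the dictionary layout and key names are assumptions made here.  When an even number of species is present, the standard-library median averages the two middle values, which is likewise an assumption rather than something stated in the text.

```python
from statistics import median


def combine_species(species_values):
    """Combine species-specific values into analysis-level values (Eq. 1-3).

    `species_values` maps a species name to a dict with keys
    "min", "q1", "q2", and "q3" for that species.
    """
    vals = list(species_values.values())
    if not vals:
        raise ValueError("at least one species is required")
    return {
        "min": min(v["min"] for v in vals),    # smallest species minimum
        "q1": min(v["q1"] for v in vals),      # minimum of first quartiles
        "q2": median(v["q2"] for v in vals),   # median of second quartiles
        "q3": max(v["q3"] for v in vals),      # maximum of third quartiles
    }
```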

Averaging Over Sex.  The second category of analyses is represented by
Analysis 9, where the results from different sexes tested in the same
study are combined.  A study may include as many as two experiments.
Two experiments are considered to be from the same study only if they
were carried out in the same laboratory by the same experimenters, the
same moiety of chemical was used, the same strain of animal was tested,
the numbers of animals initially on test were nearly identical, and the
study protocols were nearly identical.  If that is the case, this
analysis method calls for harmonically averaging the values from the
two experiments, lower bound with lower bound, MLE with MLE, and upper
bound with upper bound.  The weights for the average are equal to the
initial numbers of animals on test in each experiment.  After the
averaging, one triple is associated with each study.  These can be
ordered, the species-specific quartiles defined, and the minimum and
quartile values for the analysis defined in exactly the same manner as
in the first category.
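
A minimal sketch of the weighted harmonic average used to combine the two sexes is given below.  The function names and the representation of a triple as a (lower bound, MLE, upper bound) tuple are assumptions made for illustration.

```python
def weighted_harmonic_mean(values, weights):
    """Weighted harmonic mean: sum(w) / sum(w / x)."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal in length")
    if any(v <= 0 for v in values):
        raise ValueError("harmonic averaging requires positive values")
    return sum(weights) / sum(w / v for v, w in zip(values, weights))


def average_over_sex(male, female, n_male, n_female):
    """Average two (lower, MLE, upper) triples component by component.

    The weights are the initial numbers of animals on test in each experiment.
    """
    return tuple(
        weighted_harmonic_mean([m, f], [n_male, n_female])
        for m, f in zip(male, female)
    )
```

Because the average is harmonic, the smaller RRD (the result indicating greater potency) pulls the combined value toward itself more strongly than an arithmetic average would.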

Averaging Over Study.  The third category entails combining studies
within species (Analysis 10).  Note that different experiments falling
under the same study are not averaged, so each study may contain more
than one triple of estimates.  Once again, let Y_{ki} be the ith ordered
lower bound from an experiment testing species k (the same procedure is
followed for MLEs and upper bounds).  Species-specific minimum values
are obtained by harmonically averaging the smallest Y_{ki} values from each
study.  Species-specific quartiles are obtained by randomly sampling a
single Y_{ki} value from each study and then harmonically averaging the
values selected.  A total of 100 samples is taken for each species so
that when the averages are ordered, the 25th, 50th, and 75th estimate
the first, second, and third quartiles, respectively.  The weight
attached to each study for the harmonic averages is the total number of
animals initially on test from all experiments under that study.  The
minimum value associated with the analysis is the smallest of the
species-specific minimums and the quartiles associated with the analysis
are defined from the species-specific quartiles as shown in Eq. 1-3.
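
The sampling procedure for a single species can be sketched as follows.  The input layout (one list of lower bounds and one animal count per study), the function names, and the use of a seeded generator are assumptions made here; the report does not say how the random draws were generated.

```python
import random

N_SAMPLES = 100  # number of random samples taken for each species


def weighted_harmonic_mean(values, weights):
    """Weighted harmonic mean: sum(w) / sum(w / x)."""
    return sum(weights) / sum(w / v for v, w in zip(values, weights))


def species_min_and_quartiles(studies, seed=0):
    """Species-specific minimum and quartiles when averaging over study.

    `studies` is a list of (values, n_animals) pairs: the lower bounds from
    the experiments in one study, and the total number of animals initially
    on test in that study (the study's weight).
    """
    rng = random.Random(seed)
    weights = [n for _, n in studies]
    # Minimum: harmonic average of the smallest value from each study.
    minimum = weighted_harmonic_mean([min(v) for v, _ in studies], weights)
    # Quartiles: draw one value per study, average, and repeat 100 times.
    averages = sorted(
        weighted_harmonic_mean([rng.choice(v) for v, _ in studies], weights)
        for _ in range(N_SAMPLES)
    )
    # The 25th, 50th, and 75th ordered averages (1-based positions).
    return minimum, averages[24], averages[49], averages[74]
```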

Averaging Over Species.  Examples of the fourth category of analyses are
provided by Analyses 11a and 11b, in which results from different
species are averaged.  Once again, species-specific results are averaged
harmonically; in this case an unweighted average is used.  To obtain the
minimum average lower bound, one selects the smallest lower bound found
among experiments in each species, and then these species minimums are
averaged.  The lower bound quartile values associated with the analysis
are estimated by random sampling: 100 times, a lower bound is randomly
selected from each species and the average computed.  When ordered, the
25th, 50th, and 75th averages represent the first, second, and third
quartiles, respectively.  The same procedure is followed for MLEs and
upper bounds.

Averaging Over Sex, Study, and Species.  The final category of analyses
includes Analyses 12 through 24d.  In this category, results are
sequentially averaged over experiment within study, over study within
species, and finally, over species.  Note that at each step, one triple
(averaged) is associated with each study, species, and analysis,
respectively.  Consequently, only one average lower bound, one average
MLE, and one average upper bound are associated with such an analysis.
As a result, the minimum and all quartiles of the lower bounds are the
same, i.e., the one overall average of lower bounds.  The same is tri
-------
all experiments contained within a study when averaging over study
within species, and the constant 1 when averaging over species.
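
The sequential averaging used in this final category can be sketched as follows.  The sketch assumes harmonic averages at every level, as in the preceding categories, with weights equal to the initial number of animals in each experiment when averaging within a study, the total number of animals in all experiments of a study when averaging within a species, and the constant 1 when averaging over species.  The first of these weights, the nested data layout, and the function names are assumptions made here for illustration.

```python
def weighted_harmonic_mean(values, weights):
    """Weighted harmonic mean: sum(w) / sum(w / x)."""
    return sum(weights) / sum(w / v for v, w in zip(values, weights))


def average_triples(triples, weights):
    """Average (lower, MLE, upper) triples component by component."""
    return tuple(
        weighted_harmonic_mean(component, weights) for component in zip(*triples)
    )


def sequential_average(data):
    """Average over experiment within study, study within species, then species.

    `data` maps species -> list of studies; each study is a list of
    (triple, n_animals) pairs, one per experiment.
    """
    species_triples = []
    for studies in data.values():
        study_triples, study_weights = [], []
        for experiments in studies:
            triples = [t for t, _ in experiments]
            counts = [n for _, n in experiments]
            study_triples.append(average_triples(triples, counts))
            study_weights.append(sum(counts))  # all animals in the study
        species_triples.append(average_triples(study_triples, study_weights))
    # Species are combined with equal weight (the constant 1).
    return average_triples(species_triples, [1] * len(species_triples))
```
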
Description of Approaches to Components

The previous discussion has described the manner in which the different
analysis methods are defined.  In so doing, it has described the
approaches to several of the components of risk assessment.  In
particular, it has been shown how the approaches to components 
-------
possible, experiment-specific, indeed dose-group-specific, body weights
and food intakes have been used to convert among the units.  Standard
values (Table 1-3) have been used when necessary.

Calculation of Average Dose.  Component 5 relates to the calculation of
average doses for each dose group.  Either all dosing is considered
(approach a), in which case the average is calculated over the entire
experiment, or only dosing over the first 80% of the experiment is
considered (approach b), in which case average daily dose is based on
that time period as well.  The 80% figure is predicated on the
assumption that exposures during the last 20% of the life of an animal
are unlikely to affect cancer incidence due to the latency period.
Crump and Howe (4) observe that "chemically induced tumors are apt to
have a latent period of about 1/5 of the life span of the species."
(The same assumption is used when specifying approach b to component 2,
i.e., when restricting experiments to those with "long" dosing of 80% or
more.)  As an example of the difference between the approaches to
component 5, consider a 100-week study with a dose group receiving 1
mg/kg/day for 90 weeks.  In approach a, average daily dose is calculated
as

      (90 × 1 + 10 × 0)/100 = 0.9 mg/kg/day.

For approach b, the average daily dose is

      (80 × 1)/80 = 1 mg/kg/day.
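
The two approaches to component 5 can be sketched in code as follows.  The function name, the representation of a dosing schedule as (start week, end week, dose) periods, and the `fraction` parameter are assumptions introduced for illustration.

```python
def average_daily_dose(dose_periods, experiment_weeks, approach="a", fraction=0.8):
    """Time-weighted average daily dose for one dose group.

    `dose_periods` is a list of (start_week, end_week, dose) tuples; weeks
    not covered by any period count as zero dose.  Approach "a" averages
    over the whole experiment; approach "b" considers only the first
    `fraction` of the experiment and averages over that period.
    """
    if approach == "a":
        horizon = float(experiment_weeks)
    elif approach == "b":
        horizon = fraction * experiment_weeks
    else:
        raise ValueError("approach must be 'a' or 'b'")
    total = 0.0
    for start, end, dose in dose_periods:
        # Only the part of the dosing period inside the horizon counts.
        overlap = max(0.0, min(end, horizon) - max(start, 0.0))
        total += overlap * dose
    return total / horizon
```

For the 100-week example above, a single period (0, 90, 1.0) gives 0.9 mg/kg/day under approach a and 1 mg/kg/day under approach b.
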

Tumor Type to Use.  As discussed in Volume 2 of this report, special
codes have been designated for two types of response, all tumors and the
combination of all significantly increased tumors.  The codes used, in
general, are based on the International Classification of Diseases for
Oncology (5).  The topology-morphology code applied for all tumors is
1000-8000; for the combination of significantly increased tumors it is
1000-7000.  These are the responses included in approaches b and a,
respectively, to component 8.

Definition of Standard Methods

Analysis 0 resembles the procedure employed by EPA's Carcinogen
Assessment Group in many respects: mg/m²/day are the units assumed to
yield human and animal equivalence; species, studies, and experiments
are not combined so that the minimum lower bound comes from the most
sensitive species and sex and from the experiment yielding the smallest
RRD lower bound (largest upper bound on risk); and route of exposure is
limited to the more common routes, inhalation, gavage, and oral, unless
humans are exposed by some other route.  Of course, no automatic
procedure can exactly duplicate the decision-laden process of risk
assessment.  Nevertheless, Analysis 0 is one reasonable procedure and,
more importantly for this project, is the one that serves as a template
for defining other analyses.

Another standard method has been defined.  It is called Analysis 30 and
has been used as a template to define an additional set of twenty
analysis methods (Table 1-4).  Analysis 30 differs from Analysis 0 in
that mg/kg/day rather than mg/m²/day are the units assumed to yield
equivalence between humans and animals for extrapolation of RRD
estimates.  Moreover, the route of administration of the test chemicals
is not limited to any particular route; injection and instillation
studies are included along with gavage, oral, or inhalation experiments.
The eighteen methods that are single-component variations of
Analysis 30, i.e., Analyses 31 through 50 (Analyses 39 and 40 were not
performed), are not duplicated in Analyses 1 through 25, except for
Analysis 31, which is the same as Analysis 3b, and Analysis 38, which is
the same as 4a.

This alternative standard and its single-component variants are not used
in the majority of the analyses performed for this project.  That set of
methods was defined only after the bulk of the analysis was completed.
Its purpose is to provide information on the uncertainty associated with
single components of the risk assessment process.  It is used to
investigate component-specific uncertainty or variability in the manner
described  later in this section.  However, information on the
predictiveness of the estimates from the supplemental analyses is
considered when the best method(s) are identified.

Table 1-5  gives a verbal description of the 38 initial analyses and the
19 supplemental analyses.
Sieve
In addition to criteria restricting the type of experiments that are
used in some analyses, another procedure has been set up to select
subsets of the data for analysis.  This procedure is called a sieve and
operates as follows.  Criteria are defined that rank experiments in
terms of preference for analysis.  Say a rank of 1 is preferred over 2,
2 over 3, etc.  The experiments and responses that are used in any
specific analysis are those that have the lowest rank; if there are any
rank 1 data sets those and only those are used, if no rank 1 data sets
are available all the rank 2 data sets are used, etc.  This procedure is
an attempt to use the best data that are available but yet to do
something when the best type of data is unavailable.

The sieve may have more than one level.  That is, a selection from among
the experiments may be made on the basis of one criterion and then the
selected bioassays may be subjected to further screening on the basis of
another criterion.  In each case, the best, i.e. the lowest rank, data sets
survive the screening and become available for analysis.  If one
criterion is based on properties of the carcinogenic responses (e.g.,
rank 1 is given to responses that show a significant relationship to
dosing) as opposed to another criterion that is based on features of
the entire experiment (e.g., rank 1 is given to experiments that have at
least 50 dosed animals) the former screening is applied first.  In that
way, the greatest amount of data passes from one level of the sieve to
the other; individual responses, not entire experiments, are eliminated
at the first stage.  Also, for the example given above, a significant
response (i.e. evidence of carcinogenicity) is to be preferred, no
matter how many animals are tested.  This is not guaranteed to happen if
the first screening is based on number of dosed animals.
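The sieve logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `sieve` and `sieve_levels`, the tuple layout of the records, and the example ranks are hypothetical and do not come from the report's software.

```python
def sieve(data_sets, rank_of):
    """Keep only the data sets that share the best (lowest) rank present."""
    if not data_sets:
        return []
    best = min(rank_of(d) for d in data_sets)
    return [d for d in data_sets if rank_of(d) == best]


def sieve_levels(data_sets, rank_functions):
    """Apply several screening levels in order; survivors feed the next level."""
    survivors = list(data_sets)
    for rank_of in rank_functions:
        survivors = sieve(survivors, rank_of)
    return survivors


# Hypothetical records: (label, significance rank, quality rank).
responses = [
    ("liver tumors, study A", 2, 1),
    ("lung tumors, study A", 1, 3),
    ("liver tumors, study B", 1, 2),
]


def by_significance(record):
    return record[1]


def by_quality(record):
    return record[2]


# The response-level criterion is applied first, then the experiment-level one.
selected = sieve_levels(responses, [by_significance, by_quality])
```

Because each level keeps the best rank that is actually present, a non-empty input always yields a non-empty output, which is why the sieve alone can never exclude a chemical.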

The sieve technique is designed to work with any of the analyses defined
in terms of the components of risk assessment as described above.  The
sieve is applied only after any inclusion criteria specific to an
analysis.  For example, in Analysis 1, only experiments that lasted at
least 90% of the standard experiment length are included; the sieve is
applied after that selection is made.  Note also that the selections
that define the analyses are unlike the selection procedure for the
sieve.  The analysis-defining selections do not rank studies.  If there
are no experiments that lasted at least 90% of the standard experiment
length for a chemical, that chemical is not included in Analysis 1.  The
sieve technique does rank experiments so that the best can be used; a
chemical cannot be excluded from an analysis because of the action of
the sieve.

The sets  of  analyses that employ  a sieve  use  one  or both of two screens.
The first examines each response to see if a significant dose-related
effect on response rates is evident.  Priority (i.e., the lowest rank)
is given to those responses that do show such a significant
relationship.  The second screen is based on a combination of two
features of each experiment, the length of observation and the number of
dosed animals.

Significance of a relationship between dose and response rates is
assessed by use of Fisher's exact test (6) and the Cochran-Armitage
trend test (7).  If the response rate in any treated group is
significantly greater than that in the control group at the 0.05 level
as determined by Fisher's exact test or if the trend of response rates
is significant at the 0.05 level as determined by the Cochran-Armitage
trend test, then the response is considered significant and is given
lower rank.  Otherwise, the higher rank is assigned.  This screening is
called the significance screening.
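A minimal sketch of the significance screening follows, using only the Python standard library. Several details are assumptions made for illustration, not statements about the report's implementation: the tests are taken to be one-sided, the Cochran-Armitage test uses the usual normal approximation with the dose levels as scores, the first group is taken to be the control, and all function names are hypothetical.

```python
from math import comb, erfc, sqrt


def fisher_one_sided(resp_treated, n_treated, resp_control, n_control):
    """One-sided Fisher exact p-value for an excess of responders in the treated group."""
    total = n_treated + n_control
    responders = resp_treated + resp_control
    upper = min(n_treated, responders)
    favourable = sum(
        comb(responders, x) * comb(total - responders, n_treated - x)
        for x in range(resp_treated, upper + 1)
    )
    return favourable / comb(total, n_treated)


def trend_one_sided(doses, n_animals, n_responders):
    """One-sided Cochran-Armitage trend p-value (normal approximation, assumed)."""
    total = sum(n_animals)
    p_bar = sum(n_responders) / total
    t = sum(d * (r - n * p_bar) for d, n, r in zip(doses, n_animals, n_responders))
    var = p_bar * (1 - p_bar) * (
        sum(n * d * d for d, n in zip(doses, n_animals))
        - sum(n * d for d, n in zip(doses, n_animals)) ** 2 / total
    )
    if var <= 0:
        return 1.0  # no variation in response or dose: no evidence of trend
    return 0.5 * erfc(t / sqrt(2 * var))  # upper tail of the standard normal


def significance_rank(doses, n_animals, n_responders, alpha=0.05):
    """Rank 1 if either test is significant at alpha, otherwise rank 2."""
    # Group 0 is assumed to be the control group.
    pairwise = any(
        fisher_one_sided(r, n, n_responders[0], n_animals[0]) < alpha
        for n, r in zip(n_animals[1:], n_responders[1:])
    )
    trend = trend_one_sided(doses, n_animals, n_responders) < alpha
    return 1 if (pairwise or trend) else 2
```

For example, `significance_rank([0, 1, 2], [50, 50, 50], [1, 5, 12])` gives the preferred rank because the high-dose group differs clearly from the controls, whereas equal response counts in all groups give the higher rank.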

The ranking  scheme based on experimental protocol, i.e. on length of
experiment and number of treated animals, is depicted in Table 1-6.
Note that this is just one  of infinitely many  ranking schemes possible.
Of the two features  (experiment  length and number of dosed animals)
slightly more weight, in terms of  the perceived  quality of the study,
has been given to length of observation.  This part of the sieve is
labeled the  quality  screen.

Given the two  screens described  above,  four  sets of analyses  have been
defined, one with no screenings, one with the  significance screen alone,
one with the quality screen alone,  and  one with  both  screens.  As
described earlier, when  both  screens are  used, the  significance  screen
applied to  individual responses  operates  before  the quality  screen.   Of
course, the  entire  sieve procedure comes  into  play  only after the
application  of  the  exclusion  criteria  that  define each analysis  method.

-------
COMPARISON WITH EPIDEMIOLOGICAL RESULTS

Once the bioassay data have been analyzed by the many methods defined,
one wishes to use the results to compare and evaluate those methods.
The techniques that have been selected to do this use the RRD estimates
obtained from the epidemiological data as the basis for comparison.  A
method of bioassay analysis that yields estimates "close to" the
epidemiologically derived estimates is judged to be satisfactory.  The
following describes the techniques for determining how close a set of
bioassay-based estimates is to the human-based estimates.
Correlation Analysis

In the analysis of the epidemiological data, we have produced a "best"
estimate of the RRD corresponding to a one-in-four risk, $RRD_H$, and upper
and lower bounds on that dose, $RRD_{HU}$ and $RRD_{HL}$, respectively.  The
interval [$RRD_{HL}$, $RRD_{HU}$] represents the range of estimates that are in
some sense consistent with the epidemiological data because of data
uncertainty or statistical variability.

Because of the many bioassays for any given chemical and because of
statistical variability within each bioassay, the bioassay analysis
results may also be reasonably characterized by a range of RRD
estimates.  The interval selected in the correlation analysis to
represent that range is defined by the median RRDs; it extends from the
median of the lower bound estimates to the median of the upper bound
estimates.  That choice of interval considers statistical variability in
the sense that both lower and upper statistical confidence limits are
used in its definition.  Moreover, the use of median values avoids some
difficulties with outliers and behaves properly in an asymptotic sense.
Should anomalous results appear in some bioassays, estimators of the
appropriate range such as that from the minimum lower bound to the
maximum upper bound are adversely affected.  Such "minimum-to-maximum"
ranges are highly sensitive to outliers and once outliers appear, those
estimators are not self-correcting.  As more bioassays of a particular
chemical are performed (and so as the chance of outliers increases) the
minimum lower bound and maximum upper bound estimates can only get more
extreme.  Median values, on the other hand, should behave more properly
in the sense of discounting truly anomalous results and converging to
the "true" value.  Let [$L_{2Q}$, $U_{2Q}$] be the interval from the median
(second quartile) of the lower bound RRD estimates to the median of the
upper bound RRD estimates obtained for any given chemical.
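The contrast between the median-based interval and a minimum-to-maximum range can be illustrated with a short Python sketch. The numbers are invented for illustration and the function names are hypothetical; one deliberately anomalous bioassay is included to show its effect.

```python
import math
from statistics import median


def median_interval(lower_bounds, upper_bounds):
    """Interval from the median lower bound to the median upper bound."""
    return median(lower_bounds), median(upper_bounds)


def min_max_interval(lower_bounds, upper_bounds):
    """Interval from the minimum lower bound to the maximum upper bound."""
    return min(lower_bounds), max(upper_bounds)


# Hypothetical RRD bounds from five bioassays of one chemical; the fourth
# bioassay is an outlier and the fifth has an infinite upper bound.
lowers = [0.8, 1.1, 1.5, 0.0004, 2.0]
uppers = [6.0, 9.0, 12.0, 0.01, math.inf]

robust = median_interval(lowers, uppers)     # barely moved by the outlier
extreme = min_max_interval(lowers, uppers)   # driven entirely by the outlier
```

Adding further well-behaved bioassays would leave `extreme` unchanged, while `robust` would continue to track the bulk of the results.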

We are interested in the correlation between the epidemiologically based
estimates and the bioassay-based estimates.  That is, we wish to know if
chemicals with larger estimated human RRDs also tend to have larger
estimated animal RRDs.  In the absence of known or suspected
distributions for the RRD estimates, nonparametric tests of correlation
are appropriate.  The standard nonparametric measures of correlation
(notably Spearman's rho) use the ranks of point estimates without
consideration of variabilities.  When the variability or uncertainty
about point estimates is not the same for each observation, as is the
case with the data in the present analysis, such a method may be
inappropriate.  Ng (8) has proposed a concept of generalized ranks that
does consider variabilities as reflected in the intervals surrounding a
point estimate.  That concept is used to rank the human intervals,
($RRD_{HL}$, $RRD_{HU}$), and, separately, to rank the animal intervals,
($L_{2Q}$, $U_{2Q}$), and to determine the degree of correlation between
the two sets of intervals.

A partial ordering for the intervals is defined as follows (the
definition is given in terms of the animal intervals; exactly equivalent
definitions hold for the human intervals).  Let interval i,
corresponding to chemical i, be labelled ($L_{2Q,i}$, $U_{2Q,i}$) = $I_i$.
Then $I_i$ is less than $I_j$ if $L_{2Q,i} < L_{2Q,j}$ and $U_{2Q,i} < U_{2Q,j}$
(if $U_{2Q,i} = U_{2Q,j} = \infty$, then $I_i$ is less than $I_j$ if
$L_{2Q,i} < L_{2Q,j}$).  $I_i$ is greater than $I_j$ if $I_j$ is less
than $I_i$.  Otherwise, $I_i$ and $I_j$ cannot be ordered (we will say they are
"tied").

A ranking of the intervals can be defined on the basis of the partial
ordering.  Let $n_i$ be the number of intervals less than interval $I_i$ and
let $m_i$ be the number of intervals tied with $I_i$.  Define the rank of $I_i$,
$R_i$, to be

     $R_i = n_i + 1 + m_i/2$.

We will use $R_i$ to denote the rank of the ith chemical when based on the
animal intervals and $S_i$ to denote the rank of the ith chemical when
based on the human intervals.  Ng (8) has shown that the ranks so
defined have desirable properties including the fact that the sum of the
ranks, $\sum R_i$ or $\sum S_i$, is N(N+1)/2 (i.e. the sum of the ordinary ranks of N
numbers) and that these generalized ranks reduce to ordinary ranks if
the partial ordering is also a total ordering.  The $R_i$ and $S_i$ values are
used here to estimate correlations.
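The generalized ranking can be sketched as follows. Because the rank formula is damaged in the available copy of the report, this sketch assumes a midrank form, n + 1 + m/2, where n counts the intervals that are less and m counts the other intervals that are tied; that form is an assumption chosen because it reproduces both properties quoted from Ng (8), namely the sum N(N+1)/2 and agreement with ordinary ranks under a total ordering. Function names are hypothetical.

```python
import math


def is_less(a, b):
    """True if interval a = (lower, upper) is less than interval b."""
    (lower_a, upper_a), (lower_b, upper_b) = a, b
    if math.isinf(upper_a) and math.isinf(upper_b):
        return lower_a < lower_b  # both upper bounds infinite: compare lower bounds
    return lower_a < lower_b and upper_a < upper_b


def generalized_ranks(intervals):
    """Assumed midrank form: (number less) + 1 + (number tied) / 2."""
    ranks = []
    for i, a in enumerate(intervals):
        n_less = 0
        n_tied = 0
        for j, b in enumerate(intervals):
            if i == j:
                continue
            if is_less(b, a):
                n_less += 1
            elif not is_less(a, b):
                n_tied += 1  # neither interval is less: the pair is tied
        ranks.append(n_less + 1 + n_tied / 2)
    return ranks


# Hypothetical intervals; the last one has an infinite upper bound.
example = [(1.0, 2.0), (1.5, 5.0), (3.0, 4.0), (2.0, math.inf)]
example_ranks = generalized_ranks(example)
```

In the example the second and third intervals overlap in a way that prevents ordering, so they share credit for the contested rank positions rather than being forced into an arbitrary order.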

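By analogy to Spearman's rho, a correlation coefficient, $\rho$, is defined
as follows:

     $\rho = \sum_i (R_i - \bar{R})(S_i - \bar{S}) / [\sum_i (R_i - \bar{R})^2 \cdot \sum_i (S_i - \bar{S})^2]^{1/2}$     (1-4)

Note that $\bar{R} = \bar{S} = (N+1)/2$.  The statistic $\rho$ behaves appropriately for
a measure of correlation ($-1 \le \rho \le 1$; larger positive (negative) values
indicate more positive (negative) correlation; etc.).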

The significance of $\rho$ is assessed by simulation.  The $R_i$'s are held fixed
at their observed values while the $S_i$'s are permuted over the set of
observed $S_i$'s.  That is, each permutation selects $p(i)$, i = 1,2,...,N,
from the set {1,2,...,N} such that $p(i) \ne p(j)$ for $i \ne j$.  Let
$S_i' = S_{p(i)}$.  The correlation coefficient is evaluated for each permutation
(in the numerator of equation 1-4, $R_i$ is paired with $S_i'$).  If among a
total of M permutations, K of them yield coefficients at least as large
as the observed $\rho$, then the probability of observing a coefficient as
large as or larger than $\rho$ under the null hypothesis of no correlation is
estimated to be K/M.  The null hypothesis is rejected in favor of the
alternative, $\rho > 0$, for small values of K/M.  In the present analysis,
10,000 permutations were created (M = 10,000).
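A sketch of the permutation procedure is given below. It assumes the correlation coefficient takes the usual product-moment form applied to the ranks (by analogy to Spearman's rho, as the text says); the function names, the fixed random seed, and the example ranks are hypothetical.

```python
import random
from math import sqrt


def rank_correlation(r, s):
    """Product-moment correlation of two rank vectors (assumed form of rho)."""
    n = len(r)
    r_bar = sum(r) / n
    s_bar = sum(s) / n
    num = sum((ri - r_bar) * (si - s_bar) for ri, si in zip(r, s))
    den = sqrt(sum((ri - r_bar) ** 2 for ri in r) * sum((si - s_bar) ** 2 for si in s))
    return num / den if den > 0 else 0.0


def permutation_p_value(r, s, n_perm=10_000, seed=0):
    """Estimate P(rho >= observed) by permuting s while r is held fixed."""
    rng = random.Random(seed)
    observed = rank_correlation(r, s)
    shuffled = list(s)
    k = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance so that floating-point noise does not drop exact ties.
        if rank_correlation(r, shuffled) >= observed - 1e-12:
            k += 1
    return observed, k / n_perm


# Hypothetical generalized ranks for five chemicals (animal, then human).
animal_ranks = [1, 2.5, 3, 3.5, 5]
human_ranks = [1, 2, 3.5, 3.5, 5]
rho, p_value = permutation_p_value(animal_ranks, human_ranks)
```

With only five chemicals the smallest attainable p-value is limited by the number of distinct permutations, which is one reason a simulation estimate K/M is reported rather than an asymptotic approximation.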

Prediction Analyses

The correlation analysis just discussed concentrates on intervals of
RRD estimates to determine whether or not the human and animal estimates
generally behave in the same way (i.e., RRDs for chemical i are lower
when estimated from the epidemiology when they are lower when based on
the bioassay).  If that correlation analysis is positive, then it is
reasonable to go on to ask if particular points obtained via bioassay
analysis are good predictors of the results that are obtained directly
from epidemiological investigation.  At this stage also, one can examine
the magnitude of errors, i.e. the uncertainty that results from the use
of any predictor.  The following is a description of the methods
employed to compare and evaluate various predictors.

Unlike the correlation analysis, the prediction analysis selects a
single point from the bioassay analysis results as the estimate of RRD
for each chemical.  Each of the 38 analyses described in Table 1-2 could
supply any number of predictors.  The four that have been investigated
are the minimum of the lower bound estimates, $L_M$, the median of the
lower bound estimates, $L_{2Q}$, and the minimum and median of the maximum
likelihood estimates, $MLE_M$ and $MLE_{2Q}$, respectively.  These values are
available for each chemical analyzed by each of the thirty-eight
methods.

The behavior and properties of the predictors are assessed, again, by
comparison with the epidemiologically derived estimates.  Those human
estimates are not distilled to a single point.  Instead the best
estimate, $RRD_H$, and/or the interval from $RRD_{HL}$ to $RRD_{HU}$ form the basis
for evaluating the predictors.  In particular, a straight line with
slope of 1 is fit to the base ten logarithmic transform of predictor
values and the logarithmic transform of the human estimates.  That is,
the bioassay-based estimate of the human RRD corresponding to a risk of
one in four, $H_A$, is given by

     $\log_{10}(H_A) = \log_{10}(P_i) + C$,

where $P_i$ is the predictor from the analysis (either $L_M$, $L_{2Q}$, $MLE_M$, or
$MLE_{2Q}$) for chemical i and C is the y-intercept to be estimated.  This
relationship implies that

     $H_A = 10^C \cdot P_i$,

i.e. that a linear relationship exists between the untransformed
estimates.  Consequently, the potency of chemicals as determined from
bioassay data relative to the estimates of human potency is not related
to their absolute potency, which seems reasonable.

Suppose that $A_i = \log_{10}(P_i) + C$, where $P_i$ is one of the predictors for
chemical i from the bioassay data as described above, for any given
value of C.  The y-intercept, C, is determined (i.e. the line is fit to
the data) by minimizing the sum of the losses for each chemical
associated with the predicted RRD, $A_i$, and the estimates derived from
the epidemiological data.  Clearly, a loss function must be defined in
order to carry out this procedure.

Three different forms of loss function are considered.  The first and
simplest, called DISTANCE2, defines the loss associated with the
prediction for chemical i to be proportional to the square of the
distance from the predicted value to the interval defined by the lower
and upper endpoints of the epidemiologically derived RRDs.  Though this
formulation of loss is straightforward, it does have some disadvantages.
First, it does not consider the best estimates of RRDs obtained from the
epidemiological analysis, the $RRD_H$s.  Moreover, it cannot be applied
when the animal predictors can have infinite values.  Since $MLE_M$ and
$MLE_{2Q}$ can be infinite, but $L_M$ and $L_{2Q}$ cannot, DISTANCE2 can be used to
evaluate only the latter two predictors.  This same difficulty with
infinite values arises when the loss function utilizes the $RRD_H$
estimates, which may indeed be infinite.  Because of these limitations
of DISTANCE2 and because we wish to consider possibly infinite
estimates (particularly in $RRD_H$ since we made a point of including in
these analyses chemicals that may not be carcinogenic as determined by
epidemiological investigation, i.e. for which $RRD_H$ is infinite) two
other loss functions have been developed.  These are called CAUCHY and
TANH.  All three forms of loss function are described in detail below.

DISTANCE2 Loss Function.  Loss associated with chemical i is defined
solely in terms of the interval ($RRD_{HL,i}$, $RRD_{HU,i}$).  That loss is
given by

     $l_{1,i} = 0$             if $\log_{10}(RRD_{HL,i}) \le A_i \le \log_{10}(RRD_{HU,i})$,
     $l_{1,i} = d^2$           if $A_i < \log_{10}(RRD_{HL,i})$,
     $l_{1,i} = k \cdot d^2$   if $A_i > \log_{10}(RRD_{HU,i})$.

Here d is the absolute distance between $A_i$ and the closest of
$\log_{10}(RRD_{HL,i})$ and $\log_{10}(RRD_{HU,i})$.  The constant k is the asymmetry
parameter.

If, as appears reasonable, it is worse to overpredict RRDs, then k > 1
can be used to reflect the belief about the degree of asymmetry.
Nevertheless, this approach to fitting the line is essentially
equivalent to determining the line that is closest to the intervals
defined by the lower and upper endpoints of the human estimates.  The
total loss associated with a particular analysis is the unweighted sum
of the losses associated with each chemical in the analysis.
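The DISTANCE2 fit can be sketched in Python as below. The loss follows the three-case definition given above; the search method is an assumption made for illustration (a ternary search, which is valid because the total loss is a convex function of C), and the bracketing range, function names, and example data are hypothetical. The report does not state which minimization routine was used.

```python
import math


def distance2_loss(a, log_lo, log_hi, k=1.0):
    """Squared distance from prediction a to the interval; k penalizes overprediction."""
    if a < log_lo:
        return (log_lo - a) ** 2
    if a > log_hi:
        return k * (a - log_hi) ** 2
    return 0.0


def total_loss(c, predictors, human_intervals, k=1.0):
    """Unweighted sum of losses over chemicals for a given intercept c."""
    return sum(
        distance2_loss(math.log10(p) + c, math.log10(lo), math.log10(hi), k)
        for p, (lo, hi) in zip(predictors, human_intervals)
    )


def fit_intercept(predictors, human_intervals, k=1.0, lo=-10.0, hi=10.0, iters=200):
    """Ternary search for an intercept minimizing the (convex) total loss."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if total_loss(m1, predictors, human_intervals, k) <= total_loss(
            m2, predictors, human_intervals, k
        ):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


# Hypothetical predictors and human intervals (lower bound, upper bound).
predictors = [1.0, 10.0, 100.0]
human_intervals = [(5.0, 20.0), (50.0, 200.0), (300.0, math.inf)]
c_hat = fit_intercept(predictors, human_intervals)
```

When some range of intercepts places every prediction inside its interval, as in this example, the minimum is not unique and the search simply returns one zero-loss value.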

A simple extension of the reasoning presented in the discussion of the
loss function $l_1$ allows definition of a fitting algorithm for results
expressed as intervals in both the horizontal and vertical directions.
Such results are obtained in the correlation analyses.  The extended $l_1$
routine has been run with the same intervals used to determine
correlations.  Such a procedure allows us to identify individual
chemicals whose intervals of RRD estimates are far from the fitted line
and, therefore, may be thought of as outliers and may, in fact, detract
from the correlation.
CAUCHY and TANH Loss Functions.  Suppose $H_i = \log_{10}(RRD_{H,i})$, the
logarithmic transform of the best estimate from the epidemiological
analysis for chemical i.  Then, recalling that $A_i = \log_{10}(P_i) + C$, we
wish to find C that minimizes

     $\sum_i l(A_i, H_i) \cdot W_i$                                   (1-6)

where $l(\cdot,\cdot)$ is a nonnegative loss function, $W_i$ is the weight attached
to the loss for chemical i, and the sum runs over all chemicals in the
analysis.  Considering that $A_i$ or $H_i$ may be infinite, these are
properties that we considered it desirable for $l$ to have:

     P1:    $l(A, H) = 0$ if $A = H$
     P2a:   $l(\infty, H) < \infty$ and $l(A, \infty) < \infty$
     P2b:   $l(\infty, \infty) = 0$
     P3:    $l(A_1, \infty) > l(A_2, \infty)$ if $A_1 < A_2$
     P4:    $l(A, H) < l(A, \infty)$ for $A, H < \infty$
     P5:    $l(A_1, H) > l(A_2, H)$ if $A_1 > A_2 > H$ or $A_1 < A_2 < H$
     P6:    $l(A_1, H) \le l(A_2, H)$ for $H - A_1 = A_2 - H > 0$.

These properties can be interpreted as stating that loss is minimized
only when the prediction matches the observation (P1); that loss is
finite even for infinite RRD estimates (P2a) and that predictions of
infinite RRDs are good (i.e., have zero loss) when the observed RRDs are
infinite (P2b); that the loss is less if the prediction is larger when
the observed RRD is infinite (P3); that the loss is greater when the
observed RRD is infinite than when the observation is finite if the
prediction is finite (P4); that the farther away from the observed RRD
one goes in one direction the greater is the loss (P5); and that the
loss from underestimating an observed RRD by a certain amount is no
greater than the loss from overestimating by that amount (P6).  The last
property allows one to choose an asymmetric loss function if one wants
to reflect the belief that it is worse to overestimate RRDs than to
underestimate them.

Unfortunately, it is easy to show that the properties P1 through P6 are
mutually inconsistent.  Two approaches have been taken to get a set of
consistent properties to motivate the choice of loss function.  The
first approach is to drop property 3.  This is the only property that
prevents the loss function from being expressed as a function of the
distance (A-H).  Consequently, we defined $l_2$ as follows:

     $l_2(A,H) = 1 - (1/(1 + f($ ...

where $\mathrm{sgn}(\cdot)$ is the sign function and $f(\cdot)$ is some positive function
allowing the introduction of asymmetry.  It is clear that $l_2$ satisfies
P1, P2a, P2b, and P4 - P6.  Moreover, once we are given a set of $H_i$'s
and $P_i$'s, then we can approximate P3 but retain the other properties.
To do so, infinity must be approximated by some large number.  That
number must be chosen large enough so that P ...

The set of properties obtained by replacing P4 with P ... $l(A_2, H)$ even
though $H - A_1 = A_2 - H > 0$ (in violation of P6).

Consequently, we are led to try a slightly different loss:

     $l(A,H) = g(H) - g(A)$,             $A \le H$
     $l(A,H) = m[g(H) - g(2H-A)]$,       $A > H$,

where g is some monotone increasing function and $m \ge 1$.  When this is
the case, $l(A_1, H) = g(H) - g(A_1)$ and
$l(A_2, H) = m[g(H) - g(2H - A_2)] = m[g(H) - g(A_1)]$
when $H - A_1 = A_2 - H > 0$, and so P6 is not violated.
If we adopt this form of the loss function, then both
$\lim_{x \to \infty} g(x)$ and $\lim_{x \to -\infty} g(x)$ must be finite
(by P2a).  A monotone increasing function that
has all these nice properties is

     $g(x) = \tanh(c_1 x) = (e^{c_1 x} - e^{-c_1 x})/(e^{c_1 x} + e^{-c_1 x})$.

The constant $c_1 > 0$ can be thought of as a scaling factor.  The
resulting loss function (called TANH) is

     $l_3(A,H) = \tanh(c_1 H) - \tanh(c_1 A)$,              $A \le H$      (1-8)
     $l_3(A,H) = m[\tanh(c_1 H) - \tanh(c_1 (2H-A))]$,      $A > H$.

The factor m is chosen to reflect the desired degree of asymmetry.  For
this investigation, asymmetry considerations have been examined by
setting m equal to 1.5, 2, 5, 10, 50, and 100.  Larger values of m
reflect stronger beliefs about the undesirability of overestimating
RRDs.  Small $c_1$ shrinks everything (except infinity) toward zero where
tanh is more nearly linear, so that loss when an infinite value is
observed or predicted is exaggerated compared to the loss when both
observed and predicted are finite.  Small enough $c_1$ may also make P4
true for any given set of observations and reasonable values of C.  A
value of 0.1 has been assigned to $c_1$ throughout these analyses.
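The TANH loss of Eq. (1-8) is simple enough to write out directly. The sketch below is illustrative only; the function name and default arguments are hypothetical, although the default `c1=0.1` matches the value quoted in the text. It relies on the fact that Python's `math.tanh` returns 1.0 and -1.0 for positive and negative infinity, so infinite predictions or observations need no special handling.

```python
import math


def tanh_loss(a, h, m=2.0, c1=0.1):
    """TANH loss for prediction a and observation h (both on the log10 scale)."""

    def g(x):
        return math.tanh(c1 * x)

    if a <= h:
        return g(h) - g(a)            # underprediction, or an exact match
    return m * (g(h) - g(2 * h - a))  # overprediction, scaled by m


inf = math.inf
underestimate = tanh_loss(1.0, 2.0, m=2.0)  # prediction one log unit too low
overestimate = tanh_loss(3.0, 2.0, m=2.0)   # prediction one log unit too high
```

Reflecting the overprediction about H before applying g is what makes `overestimate` exactly m times `underestimate` for equal distances, which is the asymmetry that property P6 permits.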

Given the alternatives $l_2$ and $l_3$, we can calculate the loss contributed
by any given chemical.  What remains is to specify the weights, $W_i$, that
allow accumulation of the individual losses into an overall loss value
as shown in Eq. (1-6).  It seems clear that less weight should be given
to chemicals whose human RRD estimates are less certain, i.e. to those
whose intervals surrounding $H_i$ are longer.  Once again, the problem of
infinite values exists, in this case infinite interval lengths.
Consequently one should consider positive, monotone decreasing functions
that go to a positive limit as the representation for the $W_i$'s.

Let $D_i = \log_{10}(RRD_{HU,i}) - \log_{10}(RRD_{HL,i})$.  We wish to set
$W_i = h(D_i)$, where $\lim_{x \to \infty} h(x) = r > 0$.  The function selected is

     $h(x) = \coth^2(x) = ((e^x + e^{-x})/(e^x - e^{-x}))^2$.

Note that $\lim_{x \to \infty} \coth^2(x) = 1$.  Also consider the following.  If we were
doing ordinary weighted least squares, we would want the weights
proportional to the inverses of the variances.  In our case, we have
quasi-variance represented by the intervals from $RRD_{HL}$ to $RRD_{HU}$.  In the
ordinary situation, the intervals would be proportional to the standard
deviations and so weights could be formulated in terms of the inverses
of the interval lengths squared.  Note that $\coth(x)$ behaves like 1/x for
x close to zero, so that $\coth^2(x)$ would behave like $1/x^2$.  That is, if
we choose to use $\coth^2(D_i)$, we have a function that mimics ordinary
least squares for small $D_i$ but that allows us to consider
infinite-valued $D_i$.

For each analysis method and for each predictor, lines have been fit to
the results using both loss functions $l_2$ and $l_3$.  In both cases
$W_i = \coth^2(D_i)$ is the weighting scheme employed.  For those predictors
that are guaranteed to be finite ($L_M$ and $L_{2Q}$) the loss defined by $l_1$
(distance to the interval) with $W_i$'s set equal to 1 has also provided
estimates of the best fitting lines.  Average loss for any analysis and
for any loss function is the total loss (weighted sum of the
chemical-specific losses) divided by the sum of the weights.
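The weighting scheme and the average loss can be sketched as follows. The function names and the example numbers are hypothetical; the sketch assumes every human interval has positive width, since $\coth$ is undefined at zero.

```python
import math


def interval_width(rrd_hl, rrd_hu):
    """D = log10(upper) - log10(lower); infinite when the upper bound is infinite."""
    return math.log10(rrd_hu) - math.log10(rrd_hl)


def coth2_weight(d):
    """W = coth^2(D): about 1/D^2 for small D, tending to 1 as D grows."""
    return (1.0 / math.tanh(d)) ** 2


def average_loss(losses, widths):
    """Weighted sum of chemical-specific losses divided by the sum of the weights."""
    weights = [coth2_weight(d) for d in widths]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)


# Hypothetical human intervals and chemical-specific losses.
widths = [
    interval_width(2.0, 6.0),
    interval_width(1.0, 100.0),
    interval_width(5.0, math.inf),
]
losses = [0.2, 0.1, 0.4]
mean_loss = average_loss(losses, widths)
```

A chemical with an unbounded human interval still contributes with weight 1, so it is down-weighted relative to well-determined chemicals but never ignored.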

Uncertainty

Two types of uncertainty are investigated in this project, what we have
called residual uncertainty and component-specific uncertainty.  The
methods for quantifying these uncertainties are described below.

Residual Uncertainty.  The lines fit to the data using any of the loss
functions described above will not eliminate all uncertainty.  That is,
there will remain differences between the values predicted on the basis
of the best-fitting line and the observed epidemiologically derived
estimates.  The DISTANCE2 loss function is used to quantify those
differences.

Let $A_i$ be the prediction (in this case from the DISTANCE2-fitted line)
for chemical i in any particular analysis.  Let $G_i$ be defined as
follows:

     $G_i = 1$                        if $\log_{10}(RRD_{HL,i}) \le A_i \le \log_{10}(RRD_{HU,i})$,
     $G_i = 10^{A_i}/RRD_{HU,i}$      if $A_i > \log_{10}(RRD_{HU,i})$,
     $G_i = RRD_{HL,i}/10^{A_i}$      if $A_i < \log_{10}(RRD_{HL,i})$.
Then the average of the $G_i$'s over all the chemicals in the analysis
yields a result called the residual uncertainty factor.  It is the
average factor by which the predicted values must be multiplied or
divided in order to yield predictions that lie within the intervals
defined by the $RRD_{HL,i}$'s and the $RRD_{HU,i}$'s.

Alternatively, one can examine separately those chemicals for which the
human RRDs are overestimated by the best-fitting line
[$A_i > \log_{10}(RRD_{HU,i})$] and those for which they are underestimated
[$A_i < \log_{10}(RRD_{HL,i})$].  These are the two groups of chemicals for which
the best-fitting line does not intersect the interval of human RRD
estimates.  Factors analogous to the uncertainty factor defined above,
but pertinent to one or the other of these two groups, are defined in a
slightly different manner.
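The overall residual uncertainty factor can be illustrated with a short Python sketch. Two points are assumptions: the factor for an underestimated chemical is taken to be the lower bound divided by the prediction (the mirror image of the overestimation case), and "average" is read as an arithmetic mean. Function names and numbers are hypothetical.

```python
import math


def residual_factor(a, rrd_hl, rrd_hu):
    """Factor (>= 1) needed to move the prediction 10**a into [rrd_hl, rrd_hu]."""
    if a > math.log10(rrd_hu):
        return 10.0 ** a / rrd_hu   # prediction too high: divide by this factor
    if a < math.log10(rrd_hl):
        return rrd_hl / 10.0 ** a   # prediction too low: multiply by this factor
    return 1.0


def residual_uncertainty_factor(predictions, human_intervals):
    """Arithmetic mean of the chemical-specific factors (assumed form of 'average')."""
    factors = [
        residual_factor(a, lo, hi) for a, (lo, hi) in zip(predictions, human_intervals)
    ]
    return sum(factors) / len(factors)


# Hypothetical log10 predictions and human intervals for three chemicals.
predictions = [1.0, 3.0, 0.0]
human_intervals = [(5.0, 50.0), (10.0, 100.0), (20.0, math.inf)]
overall = residual_uncertainty_factor(predictions, human_intervals)
```

Here the first prediction falls inside its interval, the second is ten times too high, and the third is twenty times too low, so the three factors are 1, 10, and 20.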

For those chemicals whose human intervals lie completely below the line
[$A_i > \log$ ...
-------
Component-Specific Uncertainty.  A histogram approach has been used to
investigate the uncertainty associated with any one component of risk
assessment.  Only the supplemental analyses (Analyses 30 through 50) are
used in this investigation.  However, all chemicals with relevant animal
bioassay data can be used since epidemiological data is not required.

For any given predictor (e.g., the median lower bound RRD estimate) each
analysis results in a single result for each chemical, $P_i$.  Let us
denote the dependence of the results on the analysis by letting $P_{X,i}$ be
the result for chemical i in Analysis X.  Component-specific uncertainty
addresses the issue of how the $P_{X,i}$ values change with the analyses, X.
The investigation of this uncertainty is limited to analyses that differ
from a standard analysis (Analysis 30) in only one component.  Analyses
31 through 50 are such single-component variants of Analysis 30.  The
ratios $R_{X,i} = P_{X,i}/P_{30,i}$, where X = 31,32,...,50, are the raw data for
this component-specific uncertainty investigation.

For each Analysis, X, for X between 31 and 50, inclusive, there is a
corresponding histogram of the ratios, $R_{X,i}$.  The cut points of the
histogram are 0, 0.01, 0.02, 0.05, 0.10, 0.20, 0.50, 0.80, 1.25, 2.0,
5.0, 10.0, 20.0, 50.0, 100.0, and $\infty$.  Each ratio is located in one of
the subintervals defined by these cut points.  Its location indicates
how the results for the corresponding chemical change when the component
associated with the analysis (the one that differs from the standard
choice, that in Analysis 30) is changed.

For each histogram, the mode is determined.  Moreover, a dispersion
factor ($\ge 1$) is defined that indicates how spread out the ratios are.
Let I be the subinterval containing the mode of the distribution and let
$G_I$ be the geometric mean of the endpoints of interval I.  For example,
if I = [0.8, 1.25], then $G_I = ((0.8) \cdot (1.25))^{1/2} = 1$.  Generally, let J
be any subinterval with geometric mean $G_J$.  For the intervals on the
ends of the histograms, [0, 0.01] and [100, $\infty$], the geometric means are
determined from the ratios that fall within them.  If, for instance, two
ratios are greater than 100, say 400 and 1000, then the geometric mean
for the interval [100, $\infty$] in the histogram in question is
$((400) \cdot (1000))^{1/2} = 632$.  This procedure is followed since the entries in
the intervals on the ends of the histograms may vary over many orders of
magnitude, unlike the entries in any other interval.  It does not appear
reasonable or consistent to fix means in these cases.
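The histogram construction can be sketched as follows. The cut points are those listed in the text; everything else is an assumption made for illustration, including the function names, the convention that a ratio exactly on a cut point belongs to the bin above it, the choice of the first bin when several bins share the largest count, and the requirement that ratios in the end bins be positive and finite.

```python
import math
from bisect import bisect_right

CUT_POINTS = [0, 0.01, 0.02, 0.05, 0.10, 0.20, 0.50, 0.80, 1.25, 2.0,
              5.0, 10.0, 20.0, 50.0, 100.0, math.inf]
LAST_BIN = len(CUT_POINTS) - 2


def bin_index(ratio):
    """Index j such that CUT_POINTS[j] <= ratio < CUT_POINTS[j + 1]."""
    return min(bisect_right(CUT_POINTS, ratio) - 1, LAST_BIN)


def histogram(ratios):
    """Group the ratios by subinterval, keeping the values for the end bins."""
    bins = [[] for _ in range(len(CUT_POINTS) - 1)]
    for r in ratios:
        bins[bin_index(r)].append(r)
    return bins


def mode_bin(bins):
    """Index of the subinterval holding the most ratios (first one on ties)."""
    return max(range(len(bins)), key=lambda j: len(bins[j]))


def bin_geometric_mean(j, members):
    """Endpoint geometric mean for interior bins; data-based mean for end bins."""
    if j in (0, LAST_BIN):
        if not members:
            return None
        return math.exp(sum(math.log(r) for r in members) / len(members))
    return math.sqrt(CUT_POINTS[j] * CUT_POINTS[j + 1])


# Hypothetical ratios P_X,i / P_30,i for seven chemicals.
ratios = [0.9, 1.0, 1.1, 2.5, 0.3, 400.0, 1000.0]
bins = histogram(ratios)
mode = mode_bin(bins)
```

In this example most ratios fall in [0.8, 1.25], so the mode indicates that changing the component leaves the estimates essentially unchanged for the typical chemical, while the two ratios above 100 are summarized by their own geometric mean.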

The dispersion factor for any histogram is defined as
where $N_J$ is the number of ratios (chemicals) in interval J and the sums
run over all intervals.

The dispersion factor indicates the average factor by which the ratios
differ from the mode.  A dispersion factor of 1 corresponds to a
histogram with all the ratios in one subinterval.  Since the mode can be
construed as an indication of the direction and magnitude of the change
in RRD estimates when the approach to a single component is changed, the
dispersion factor indicates how consistent that change in estimates is.
It is the average factor by which the RRD estimates from the altered
(nonstandard) analysis must be multiplied or divided to yield ratios in
the interval of the mode.  Since the altered analyses differ from the
standard in only one component and a histogram corresponds to one
altered analysis, a dispersion factor is associated with one component
and indicates how well-determined are the changes that result from a
change in approach to that component.

-------
REFERENCES

1.  Howe, R. and Crump, K. (1982).  GLOBAL 82:  A Computer Program to
    Extrapolate Quantal Animal Toxicity Data to Low Doses.  Prepared for
    the Office of Carcinogen Standards, OSHA, U.S. Department of Labor,
    Contract 41USC252C3.

2.  Sawyer, C., Peto, R., Bernstein, L., and Pike, M. C. (1984).
    Calculation of carcinogenic potency from long-term animal
    carcinogenesis experiments.  Biometrics 40:27-40.

3.  Freireich, E., Gehan, E., Rall, D., Schmidt, L., and Skipper, H.
    (1966).  Quantitative comparison of toxicity of anti-cancer agents
    in mouse, rat, hamster, dog, monkey, and man.  Cancer Chemotherapy
    Reports 50:219-244.

4.  Crump, K. and Howe, R. (1980).  A Small Sample Study of Some
    Multivariate and Dose Response Permutation Tests for Use with
    Teratogenesis or Carcinogenesis Data.  Prepared for the Food and
    Drug Administration under contract to Ebon Research Systems, 34
    pages.

5.  International Classification of Diseases for Oncology (1976).
    World Health Organization.  Geneva, Switzerland.

6.  Bickel, P. J. and Doksum, K. A. (1977).  Mathematical Statistics.
    Holden-Day, Inc., San Francisco.

7.  Armitage, P. (1955).  Tests for linear trends in proportions and
    frequencies.  Biometrics 11:375-386.

8.  Ng, T-H. (1965).  A Generalized Ranking on Partially Ordered Sets
    and Its Applications to Multivariate Extensions to Some
    Nonparametric Tests.  (unpublished report).

-------
                               Table 1-1

               APPROACHES TO RISK ASSESSMENT COMPONENTS
1.   Length of the experiment

    a.   Use data from any experiment but correct for short observation
        periods.
    b.   Use data from experiments which last no less than 90% of the
        standard experiment length of the test animal.

2.   Length of dosing

    a.   Use data from any experiment, regardless of exposure duration.
    b.   Use data from experiments that expose animals to the test
        chemical no less than 80% of the standard experiment length.

3.   Route of exposure

    a.   Use data from experiments for which route of exposure is most
        similar to that encountered by humans.
    b.   Use data from any experiment, regardless of route of exposure.
    c.   Use data from experiments that exposed animals by gavage,
        inhalation, any oral route, or by the route most similar to
        that encountered by humans.


-------
                          Table  1-1  (continued)

                APPROACHES TO RISK ASSESSMENT COMPONENTS
 7.   Malignancy  status  to consider

     a.   Consider  malignant tumors  only.
     b.   Consider  both  benign and malignant  tumors.

 8.   Particular  tumor type to use

     a.   Use combination of tumor types with significant  dose-response.
     b.   Use total tumor-bearing animals.
     c.   Use the response that occurs in humans.
     d.   Use any individual response.

 9.   Combining data from males and  females

     a.   Use data from  each sex within a study separately.
     b.   Average the results of different sexes within a  study.

10.   Combining data from different  studies

     a.   Consider every study within a species separately.
     b.   Average the results of different studies within a species.

11.   Combining data from different species

     a.   Average results from all available species.
     b.   Average results from mice and rats.
     c.   Use data from a single, preselected species.
     d.   Use all species separately.

NOTE:  Underlines indicate approach used in Initial Standard
(Analysis 0).
                                1-31

-------
                               Table 1-2

           APPROACHES USED FOR INITIAL THIRTY-EIGHT ANALYSES
Component
Analysis
0
1
2
3a
3b
An asterisk marks those approaches that differ from those in
 Analysis 0, the first standard.

                               1-32

-------
                               Table 1-3

        STANDARD VALUES USED IN ANALYSIS OF ANIMAL BIOASSAY DATA
            Surface Area   Body     Food          Inhalation   Drinking     Experiment
            Coefficient    Weight   Consumption   Rate         Water Rate   Length
Animal      (a)            (kg)     (mg/day)      (m3/day)     (mg/day)     (weeks)

Dog         10.1           12.7     508000        1.5          350000       312
Guinea pig   9.5            0.43     12900        0.074        145000       104
Hamster      9.0            0.12      9600        0.037         30000       104
Monkey      11.8            3.5     140000        1.4          450000       364
Mouse        9.0            0.03      3900        0.05           6000       104
Rabbit      10.0            1.13     33900        1.6          300000       156
Rat          9.0            0.35     17500        0.26          35000       104

(a) Surface area in m2 is calculated as K W^(2/3) / 100, where W is weight
    in kilograms and K is the surface area coefficient (2).
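
The surface-area formula in the footnote to Table 1-3 can be sketched in Python as follows.  This is a minimal illustration, not code from the report.  The function names are hypothetical, and the conversion from mg/kg/day to mg/m2/day (multiply by body weight, divide by surface area) is an assumed unit conversion rather than a procedure quoted from the text.

```python
def surface_area_m2(weight_kg: float, k: float) -> float:
    """Surface area in m2 as K * W**(2/3) / 100 (Table 1-3 footnote)."""
    if weight_kg <= 0 or k <= 0:
        raise ValueError("weight and coefficient must be positive")
    return k * weight_kg ** (2.0 / 3.0) / 100.0


def mg_per_kg_to_mg_per_m2(dose_mg_per_kg_day: float,
                           weight_kg: float, k: float) -> float:
    """Convert a daily dose per kg body weight to a daily dose per m2.

    Assumed conversion: total daily dose divided by surface area.
    """
    total_mg_per_day = dose_mg_per_kg_day * weight_kg
    return total_mg_per_day / surface_area_m2(weight_kg, k)


# Standard values for two species from Table 1-3: (K, body weight in kg).
STANDARD_VALUES = {"Mouse": (9.0, 0.03), "Rat": (9.0, 0.35)}

if __name__ == "__main__":
    for animal, (k, w) in STANDARD_VALUES.items():
        print(animal, round(surface_area_m2(w, k), 5), "m2")
```

Because surface area grows as the two-thirds power of weight, the same mg/kg/day dose corresponds to a smaller mg/m2/day dose in a small animal than in a large one.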
                                1-33

-------
                                Table  1-4

                APPROACHES  USED  FOR  SUPPLEMENTAL ANALYSES
Component
Analysis

31
33
34
35
36
37
38
40
41
42
43
44
45
tc
HO
47
48
49
50
1

a
a
a
b"
a
a
a
a
a
a
a
a
a
a
Q
a
a
a
a
2

a
a
a
o
b"
a
a
o
a
a
a
a
a
a
a
a
a
a
a
3

b
b
b
b
b
a"
c"
b
b
b
b
b
b
b
b
b
b
4


c"
d"
a
a
a
a
Q
a
a
a
a
a
a
c
a
a
a •
a
5

a
a
a
a
a
a
a
a
a
a
a
a
o
a
a
o
a
6

a
a
a
a
a
a
a
a
b"
a
a
a
a
a
a
a
0
a
7

b
b
b
b
b
b
b
b
a'
b
b
b
b
b
b
b
b
8

d
d
d
d
d
d
d
d
d
a"
b"
c"
d
d
d
d
d
9

a
a
a
a
a
a
a
a
a
a
a
a
o
b"
a
a
a
a
10

a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
11

ct
d
d
d
d
d
d
d
d
d
d
d
d
a"
b"
e-b
c*b
(a) The letters in this table correspond to the labeling of approaches in
    Table 1-1.
(b) Analyses 49 and 50 differ in that the single species considered in 49
    is rats and in 50 it is mice.
*   An asterisk marks those approaches that differ from those in
    Analysis 0, the first standard.
                                1-34

-------
                                Table 1-5

                      DESCRIPTIONS OF ALL ANALYSES
Analysis       Template(a)        Differences(b)
   0        Initial Standard   mg/m2/day, no averaging of results; oral,
                                gavage, inhalation or route like humans
   1               0           limited to experiments of long
                                observation
   2               0           limited to experiments of long dosing
   3a              0           route like human route only
   3b              0           any route
   
-------
                          Table 1-5 (continued)

                      DESCRIPTIONS OF ALL ANALYSES
Analysis       Template(a)        Differences(b)
  24a             12           mg/kg/day
  24b             12           ppm diet
  24c             12           ppm air
  24d             12           mg/kg/lifetime
  25               0           route and response that humans get only
  30          Alternative      mg/kg/day; no averaging; any route
              Standard
  31              30           mg/m2/day
  32              30           ppm diet
  33              30           ppm air
  34              30           mg/kg/lifetime
  35              30           limited to experiments of long
                                observation
  36              30           limited to experiments of long dosing
  37              30           route like humans only
  38              30           oral, gavage, inhalation, or route like
                                humans
  41              30           malignant responses only
  42              30           combination of significant responses only
  43              30           total tumor-bearing animals only
  44              30           response that humans get only
  45              30           results averaged over sex within study
  46              30           results averaged over study within
                                species
  47              30           results averaged over all species
  48              30           results averaged over rats and mice only
  49              30           rat data only
  50              30           mouse data only

(a) The template is the analysis which most closely resembles and on which
    is based the analysis in question.  Analyses 0 and 30 are the two
    standards; they have no template but rather are the bases for defining
    the other analyses.
(b) The differences listed are the ways in which the analysis in question
    differs from its template.  For Analyses 0 and 30, for which there are
    no templates, no "differences" are defined.  In these two cases the
    approaches for several prominent components are listed.
                                1-36

-------
                                1-36

-------
                                Table 1-6

                   RANKS BASED  ON  LENGTH OF EXPERIMENT
                      AND NUMBER OF  TREATED ANIMALS
                              Number of Dosed Animals
Length of Experiment(a)       50+       15-49     < 15

> 75%                         1         2         5
50-75%                        3         4         7
< 50%                         6         8         9

(a) These values are expressed as percentages of the standard experiment
    length of the test species.  Table 1-3 lists the standard experiment
    lengths.
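
The ranking in Table 1-6 amounts to a two-way lookup, which can be sketched in Python as below.  This is an illustrative sketch only.  The function name is hypothetical, and the handling of boundary values (for example, a study lasting exactly 75% of the standard length is placed in the 50-75% row) is an assumption taken from the table headings.

```python
# Rows: experiment length (> 75%, 50-75%, < 50% of standard length).
# Columns: number of dosed animals (50+, 15-49, < 15).
RANKS = [
    [1, 2, 5],
    [3, 4, 7],
    [6, 8, 9],
]


def quality_rank(length_pct: float, n_dosed: int) -> int:
    """Return the Table 1-6 rank (1 = best, 9 = worst) for one experiment."""
    if length_pct < 0 or n_dosed < 0:
        raise ValueError("length and animal count must be non-negative")
    if length_pct > 75:
        row = 0
    elif length_pct >= 50:
        row = 1
    else:
        row = 2
    if n_dosed >= 50:
        col = 0
    elif n_dosed >= 15:
        col = 1
    else:
        col = 2
    return RANKS[row][col]


if __name__ == "__main__":
    # A 104-week rat study with 50 dosed animals (100% of standard length).
    print(quality_rank(100.0, 50))
```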
                                1-37

-------
                                              Figure 1-1


[Figure not reproduced.  Legible labels: final estimate; Species I;
Species II; a third species box.]

-------
                                Section 2

                                 RESULTS

This section describes the results of the evaluation of the animal
bioassay data and of its comparison with the epidemiologically derived
risk estimates.  The evaluation is logically divided into two steps.
The first is a correlation analysis, the goal of which is to determine
whether or not the estimates of risk-related doses obtained from
analysis of the bioassay data (the animal estimates) are correlated with
the estimates obtained from epidemiology (the human estimates).  If no
correlation is found, then it may not be appropriate to attempt to
estimate human risk from animal data.  If, on the other hand, a positive
correlation does exist, then it seems reasonable to assume that the
animal models are relevant to human risk estimation and to proceed to
the second step, that of identification of useful predictors.  The goal
of that process is to determine which particular point estimates
calculated from the bioassay data can be satisfactorily employed as
predictors of the human RRDs, and to evaluate the variability (the
remaining uncertainty) associated with the identified predictors.

The correlation analysis reveals that there is, indeed, a significant
positive correlation between the human estimates and most of the animal
estimates.  Those analysis methods that demonstrate the best
correlations provide viable alternatives for the choice of the
predictors.  The detailed results of the correlation and prediction
analyses are described below.

CORRELATION ANALYSIS

Table 2-1 presents the correlation coefficient estimates (and their
associated p-values) corresponding to each method of analyzing the
bioassay data.  The four columns of that table represent the four data
sieves we have defined.  Graphs of animal RRDs vs. human RRDs for many
of the analysis methods are contained in Figures 2-1 to 2-34.
(Abbreviations for all the chemicals considered in this project are
given in Table 2-2.)

Generally speaking, the results in Table 2-1 show a strongly positive
relationship between animal and human RRDs.  The number of analyses
resulting in correlation coefficients greater than 0.6 ranges from 26 to
29 out of 38, depending on the sieve used.  When the full sieve is used,
26 analyses yield results with ρ > 0.7.  In no instance was a negative
value of ρ obtained.  Out of the 38 p-values associated with the analyses
employing the full sieve, 28 were less than 0.01, and 35 less than 0.05.

Given these results, it is inconceivable that these correlations are due
to chance.  It is also highly unlikely that they are due to bias in the
methods employed.  The coding of the animal data into the computerized
data base was made by individuals who were unaware of the results for
the RRD estimates for the human data.  The calculations for the animal
RRDs were made using uniform approaches implemented by an impartial
computer program.  Although the calculations of the human RRDs were made
individually and required judgements on the part of the analyst, they
also were made blindly without knowledge of animal RRDs for any of the
chemicals.

Thus, by any reasonable standard, the animal RRDs are substantially
correlated with the human RRDs.  This correlation is very important
because it demonstrates that it is both possible and scientifically
appropriate to estimate human risk from animal data.  The range of
finite, best RRD estimates from human data represented by these 23
chemicals spans roughly six orders of magnitude (from 10^-3.5 mg/kg/day
for melphalan to 10^2.8 mg/kg/day for saccharin).  Human and animal RRD
estimates are fairly consistent over this range considering the
crudeness of much of the underlying data (see, for example, Figure
2-12).

These analyses are considered  in greater  detail  below.   Individual
analysis methods will be  studied;  methods that yield  the  best
correlations  are  identified  and  discussed.   Similarly, methods  that
yield the  poorest  correlations will  be discussed.  We  begin with  an
evaluation  of  the  sieve.
                                 2-2

-------
Evaluation of Sieve

The purpose of the sieve is to select only the better data for analysis
whenever data of varying quality are available, while at the same time
not excluding any chemicals from analysis on the basis of the sieve.
The sieve consists of two parts: a quality screen that discriminates
among data sets on the basis of number of animals tested and length of
observation, and a significance screen that selects only data sets in
which a statistically significant response was found, whenever such data
sets are available for a chemical (cf. Section 1).  The idea behind the
sieve is that use of better data should improve the observed
correlations between the human and animal results.  This desired result
is in fact achieved, since ρ is higher when the full sieve is used in 28
out of the 38 analyses (Table 2-1).  The effect of the sieve can be
observed by comparing, for example, the graphs in Figures 2-1 through
2-4 or in Figures 2-9 through 2-12.  The data appear to be more closely
grouped about the best-fitting line when the screenings are applied
(especially when the full sieve is applied, Figures 2-4 and 2-12) than
when they are not applied (Figures 2-1 and 2-9).  This improvement in
correlations when better data are used is further evidence that the
observed correlations between the animal and human RRDs are real.
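
The two-part sieve can be illustrated with a short Python sketch.  The details are assumptions, since the precise rules are given in Section 1 rather than here: the quality screen is taken to keep only the data sets with the best available rank for a chemical, the significance screen is applied first, and the names `DataSet`, `significance_screen`, `quality_screen`, and `full_sieve_by_chemical` are hypothetical.  The point of the sketch is the property stated above, that a screen falls back to whatever data exist and so never removes a chemical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DataSet:
    chemical: str
    rank: int           # quality rank, 1 (best) to 9 (worst), as in Table 1-6
    significant: bool   # True if a significant dose-related response was found


def significance_screen(data_sets: List[DataSet]) -> List[DataSet]:
    """Keep significant data sets when any exist; otherwise keep everything."""
    kept = [d for d in data_sets if d.significant]
    return kept if kept else list(data_sets)


def quality_screen(data_sets: List[DataSet]) -> List[DataSet]:
    """Keep the data sets with the best rank present (assumed rule)."""
    if not data_sets:
        return []
    best = min(d.rank for d in data_sets)
    return [d for d in data_sets if d.rank == best]


def full_sieve_by_chemical(data_sets: List[DataSet]) -> Dict[str, List[DataSet]]:
    """Apply both screens separately within each chemical."""
    grouped: Dict[str, List[DataSet]] = {}
    for d in data_sets:
        grouped.setdefault(d.chemical, []).append(d)
    return {chem: quality_screen(significance_screen(sets))
            for chem, sets in grouped.items()}
```

In this sketch a chemical with no significant data sets passes through the significance screen unchanged, so it remains in the analysis.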

Aside from Analyses 3b, 8a and 11a, almost all of the benefit obtained
from applying a sieve is seen when the significance screen is applied.
That screen limits attention to the carcinogenic responses that are
significantly dose-related when such responses are available.  The
quality screen, which focuses on the number of animals tested and the
length of the experiment, does not appear to provide much of an
improvement over and above the significance screen for most analysis
methods (compare the third and fourth columns of Table 2-1).  Such a
result is expected for analyses, like number 1, that already restrict
attention to a subset of the experiments, in this case the "long"
experiments.  It is somewhat surprising in other cases, especially since
the majority of the carcinogenic responses in the data base are there
because of their apparent dose-related action.  It is possible that the
criterion limiting attention to studies utilizing gavage, inhalation, or
oral routes of exposure (or the route most similar to human exposure), a
criterion underlying every analysis except 3b, also had the side effect
of eliminating many of the studies that would receive lower ranks in the
quality screening.  This is conceivable if, for example, other methods
of dosing involved fewer animals (possibly because these routes are more
difficult to administer) or if other routes tend to involve bolus doses
that might cause early deaths (carcinogenic or non-carcinogenic) and
consequently shorten the duration of observation.  It is possible that
experiments employing "nonstandard" routes of dosing were designed to
investigate special questions and so may not have been overly concerned
with number of animals or length of observation.

Also in relation to the action of the sieves, those analyses that
average RRD values at each stage (over sex within study, over study
within species, and finally across species; Analyses 12 through 24d) are
relatively impervious to the application of any screenings.  The
correlation coefficients within any of the rows in Table 2-1
corresponding to those analyses are very similar, no matter which sieve
is applied (cf. Figures 2-28 through 2-31).  It seems likely that the
averaging that occurs in these analyses acts in much the same way as the
screens are intended to work; much as a sieve acts to eliminate
outliers, so averaging works to pull outliers toward the "middle" of the
results.  This effect is enhanced by the use of harmonic averaging, which
severely limits the influence of infinite values.  Since RRDs are
bounded below but not above, infinite-valued estimates are obvious
candidates for outliers.  Similarly, the action of both the quality
screen and the significance screen would tend to eliminate infinite-
valued estimates, since experiments that are too short or employ too few
animals would tend to find no carcinogenicity of a chemical (i.e., give
infinite RRD estimates) and responses not significantly related to dose
generally also produce infinite-valued RRDs.
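
The way harmonic averaging limits the influence of infinite values can be seen in a small Python sketch.  The function name is hypothetical and the numbers are made up for illustration; the only fact relied on is the definition of the harmonic mean, in which an infinite value contributes zero to the sum of reciprocals.

```python
import math
from typing import Sequence


def harmonic_mean(values: Sequence[float]) -> float:
    """Harmonic mean of positive values; infinite values are allowed."""
    if not values:
        raise ValueError("at least one value is required")
    if any(v <= 0 for v in values):
        raise ValueError("values must be positive")
    # An infinite RRD contributes 0 to the sum of reciprocals.
    reciprocal_sum = sum(0.0 if math.isinf(v) else 1.0 / v for v in values)
    if reciprocal_sum == 0.0:
        return math.inf  # every estimate was infinite
    return len(values) / reciprocal_sum


if __name__ == "__main__":
    rrds = [1.0, math.inf]            # hypothetical RRDs for two sexes
    print(harmonic_mean(rrds))        # finite
    print(sum(rrds) / len(rrds))      # arithmetic mean is infinite
```

An arithmetic mean of the same two values would be infinite, whereas the harmonic mean stays within a factor of two of the finite estimate.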

Analyses other than 12 through 24d employ averaging, but not at every
level.  Analysis 9 averages only across sex within study; Analysis 10
only across study within species; Analysis 11a only across species; and
Analysis 11b only across the species rats and mice.  Since the
experiments employing species other than rats and mice do not appear to
be as "clean" as those using rats and mice (compare the first columns in
the rows corresponding to Analyses 11a and 11b and note the sizable
increase in ρ when a quality screen is applied to Analysis 11a), let us
concentrate on Analyses 9, 10, and 11b (cf. Figures 2-18 through 2-25).
In those cases, one notes a similar but slightly lessened independence
from the sieve.  Especially when averaging across sex or across study,
an effect similar to that of the screenings may already be in operation.

In this connection note that Analysis 12, which averages at all levels
and that uses the same data as Analysis 0, results in larger correlation
coefficients than does Analysis 0 when the significance screen is not
used (columns 1 and 2 of Table 2-1).  Correlation coefficients
associated with Analysis 12 are somewhat smaller than those associated
with Analysis 0 when the significance screen is employed (third and
fourth columns) and, moreover, the application of the significance
screen to Analysis 0 produces larger coefficients than the nonscreened
Analysis 12.  This suggests that application of an appropriate sieve may
be a better approach than merely averaging at all levels.  Since use of
a sieve appears to improve most analyses, and since use of the full
sieve is about as good or better for most analyses than use of either
screen by itself, the remaining discussion will emphasize analyses that
employ the full sieve.

Analyses that Use Combination of All Significant Individual Responses

Two of the endpoints defined and included in the bioassay data base
whenever possible are the combination of all individual carcinogenic
responses that are significantly dose-related and the combination of all
such responses that are malignant.  Analyses that use the first of
these, combination of significant responses, i.e. Analyses 8a, 16, and
17, provide relatively poor correlations (ρ < 0.6) no matter which data
screening procedure is implemented (cf. Figures 2-14 and 2-32).  The
p-values associated with these correlations range between 0.02 and 0.05
which, given the number of comparisons performed, might reasonably be
considered only marginally significant.  The response, combination of
significant responses, could not be defined for every study or every
chemical; only 13 of the 23 chemicals had one or more experiments
presenting data that allow calculation of this endpoint.  It is the case
that the experiments that do provide the necessary information in this
regard are generally more complete and better studies, notably the NTP
bioassays.  This is indicated by the fact that rank 1 studies (those
observing over 50 dosed animals for at least 75% of the standard
observation period) are available for 12 of the 13 chemicals included in
Analyses 8a and 16.  That being the case, it is less likely that the
relatively poor correlation is due to use of data of poorer quality.

Interestingly, the analyses using the combination of malignant
statistically dose-related responses, Analyses 18 and 19, provide very
good (and in some cases, the largest) correlation coefficients, ranging
from 0.73 to 0.79.  However, the difficulty of defining the response is
even more severe with this endpoint than with the previous one.  No more
than 10 chemicals included studies with the necessary information.  Note
that the p-values associated with these correlations range between 0.003
and 0.009; such p-values are associated with ρ's on the order of 0.61
when more chemicals are included (cf. Analysis 2, no screens).  So,
while use of this endpoint may well be appropriate, more data would have
to be made available before any stronger conclusion would be warranted.

Analyses That Utilize Malignant Neoplasms Only

Analysis 7  is identical  to  Analysis  0 except that the former analysis
utilizes animal  data on  malignant  neoplasms only, where the latter
analysis permits data on benign neoplasms to be used as well; Analyses
12 and  14 have a similar relationship.   Analyses 7 and 14, utilizing
data  on malignant neoplasms,  yield results that are quite similar to
those obtained from Analyses  0 and 12,  respectively, analyses that used
data  on both benign and  malignant  neoplasms.  The graphs for Analyses 0
and 7 (Figures 2-4 and 2-14)  are very similar, the major difference
being that  data  for benzidine are  utilized in Analysis 0 but not
Analysis 7.  It  is important  to  note that  inclusion of both benign and
malignant tumors does  not degrade  the correlations  (in fact, it improves
them  somewhat) despite  the fact  that the human results are for malignant
tumors  exclusively.
                                 2-6

-------
epidemiologically derived estimates even when no screening is used.
This may reflect an underlying difference in the overall quality of rat
and mouse experiments.

One of the few analyses that derives some benefit from the quality
screening, over and above that obtained by significance screening, is
11a.  The same is not true for Analysis 11b.  This suggests that the
improvement obtained by screening the quality of the data in Analysis
11a is derived primarily from elimination of experiments in species
other than rats or mice that were too short or that tested too few
animals.  Indeed, the correlation coefficients for Analyses 11a and 11b
are nearly identical when the quality screen is applied (columns 2 and 4
of Table 2-1; cf. Figures 2-23 and 2-25).  With that screening, either
of these two analyses compares favorably with the standard analysis
(Analysis 0) and yields results similar to those obtained from rats
alone or mice alone.  This is perhaps not surprising given the
preponderance of rat and mouse experiments in the data base and the
previously noted similarity of rat-alone and mouse-alone correlation
coefficients when the data are appropriately screened.

Choice of Dose Units

 Analyses 0,  
-------
prediction analyses.

Identification of Analyses Yielding Higher Correlations

Analysis 3b yields the highest correlation; when the full sieve is
applied, ρ = 0.90 (cf. Figure 2-12).  This analysis method also yields
correlation coefficients that are among the best when less than the full
sieve is used.  Interestingly, Analysis 3b is the least restrictive of
the methods examined.  Whereas all other analyses are restricted to
experiments that expose animals by gavage, inhalation, oral, or the
route of exposure that humans encounter, 3b also allows instillation,
injection, and implantation experiments.  These additional routes are
often not considered in quantitative risk assessment.

This discussion of Analysis 3b provides an opportunity to consider the
effect that changes in the data have on the correlation coefficients.
Some changes in data are the result of changing the criteria used to
pick experiments and carcinogenic responses for particular analysis
methods.  In this case, allowing all routes of exposure adds three
chemicals that only have studies that expose animals by "nonstandard"
means: chlorambucil, chromium, and melphalan.  Moreover, RRD estimates
for certain other chemicals change dramatically when all routes are
allowed.  Arsenic is a prime example; note the change in location of the
animal lower bound for this chemical (compare Figure 2-4 to Figure 2-12).
The animal lower bound RRD for arsenic in Analysis 3b is derived from an
experiment in which exposure was accomplished via intratracheal
instillation and for which a dose-related increase in lung tumors was
found.  Note that the animal upper limit RRD is infinite whether or not
all routes of exposure are included in the analysis, a fact consistent
with the commonly held view that arsenic has not been shown conclusively
to be carcinogenic in animals.

The correlation coefficient, ρ, is derived from the ranks determined by
the relative positions of the intervals.  In Analysis 0, the human rank
of arsenic is 6 and its animal rank is 15, a major discrepancy.  In
Analysis 3b, with the addition of the three chemicals, arsenic's human
rank increases to 9 but its animal rank, due to the reduction in the
lower bound discussed above, rises to only 16.5.  So, while the values
of the RRDs can and do change substantially, the ranks based on the RRDs
may be relatively insensitive to those changes.
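
The insensitivity of ranks to changes in the underlying values can be illustrated with an ordinary rank correlation, sketched in Python below.  This is a simplification and not the report's procedure: the report ranks intervals (Section 1), whereas the sketch ranks single point values, gives tied values their average rank, and computes the Pearson correlation of the ranks (Spearman's coefficient).  The function names and the example numbers are hypothetical.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0   # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("ranks have zero variance")
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    human = [0.001, 0.5, 3.0, 40.0]     # hypothetical RRDs, mg/kg/day
    animal = [0.01, 0.2, 50.0, 9.0]
    shifted = [0.0001, 0.2, 50.0, 9.0]  # first value lowered 100-fold
    print(rank_correlation(human, animal), rank_correlation(human, shifted))
```

Lowering the first animal value by two orders of magnitude leaves its rank, and therefore the coefficient, unchanged, because it was already the smallest value.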

Nevertheless, ranks can be greatly altered.  Comparing the same two
analyses, 0 and 3b, one notes a great change in the lower and upper
bounds associated with estrogen.  This results in a change of rank, from
11.5 in Analysis 0 to 6 in Analysis 3b, despite the addition of three
more chemicals in the latter analysis.  Estrogen has only two bioassays,
one an implantation study and the other a feeding study.  The feeding
study, which was the only animal estrogen study utilized in Analysis 0,
failed to find any significantly dose-related responses (hence the
infinite MLE for estrogen in Analysis 0); however, the implantation
study entered Analysis 3b and included an increased incidence of kidney
tumors.  So, while we are interested in changes in the underlying base
of data and their effect on risk estimates, one must be aware that other
changes may be confounded with the changes in which we are interested.
Thus, with estrogen, including other routes (the implantation study)
actually eliminates the feeding study (because of the action of the
sieve), an unforeseen change in the underlying data.  Another
manifestation of this is the fact that certain chemicals are eliminated
from some analyses because they lack the data to support those methods.

One can attempt to minimize such confounding data dependency.  It is of
interest, for example, whether the high correlation obtained in Analysis
3b is due to the addition of the three chemicals mentioned earlier,
chlorambucil, chromium, and melphalan.  The human and animal ranks of
chromium and melphalan are well matched, but those for chlorambucil are
7.5 and 3, respectively, showing moderate discrepancy.  If these three
chemicals are not included in Analysis 3b, so that the only differences
between Analyses 0 and 3b are due to inclusion of additional routes of
exposure, then ρ = 0.88 when the full sieve is applied.  This is very
close to the original ρ, 0.90, and is still notably better than the
correlation obtained from any other analysis.

Nevertheless, it is not possible to conclude unequivocally from these
analyses that the improved correlation due to inclusion of additional
routes of exposure will hold generally and is not simply a feature of
the particular data available for analysis.  A substantial part of the
improved correlation is due to a data-dependent change in one chemical
(estrogen).  This may, however, be an indication that inclusion of
additional routes may allow improved estimates for some human
carcinogens that, for some reason, are not easily shown to be
carcinogenic in animals via routes through which humans are normally
exposed.  Further investigation of this issue may be warranted.

Aside from Analysis 3b, no other analysis stands out as being superior
to the others based on the correlation analysis.  The largest of the
remaining correlation coefficients is 0.81 (Analysis 25, full sieve) and
another 16 of the 38 analyses performed with the full sieve yield ρ's
between 0.76 and 0.81.  Those and perhaps many of the other analyses
provide ample indication that animal-based estimates of RRDs are
applicable to estimation of human RRDs and can be considered viable
alternative procedures for use in human risk assessment.

PREDICTION ANALYSIS

The three loss  functions  described  in the discussion of methodology
(Section 1 of this volume)  have  been used to determine the lines of best
fit for 30
-------
provided notable additional improvement in the correlation.  Addition of
the quality screen ensures that possibly questionable experiments of
short duration or employing few test animals do not compromise the
results obtained from the lower ranked studies (the "better" studies).
This may be particularly important when extreme values, such as the
minimum lower bound or minimum MLE, are used as predictors.

Sieve

Examination of Tables 2-3 through 2-5 reveals that use of the sieve
generally does improve the predictive ability of the bioassay results.
No matter which loss function is used, the predictive power of those
analyses employing the minimum values (LM or MLEM) is improved (in the
sense of yielding smaller average loss) 74% to 82% of the time when the
sieve is applied.  This effect is seen slightly more often among the
analyses that do not average over sex, study, and species (Analyses 0
through 11d and 25) than among those that do average over sex, study,
and species (Analyses 12 through 2
-------
When the epidemiologically derived RRD estimates are finite, the loss is
exacerbated.  The sieve eliminates such data sets from consideration
when better ones are available.

The results for the median lower bound predictor, L2Q, are apparently
the most stable, as reflected in the fact that the action of the sieve
is neutral in a general sense.  Note, however, that certain analyses
demonstrate definite improvement when the sieve is applied, even with
L2Q as the predictor.

Predictors

The four predictors that have been selected, LM, L2Q, MLEM, and MLE2Q,
can be compared on the basis of the average loss suffered when each
predictor is used in any particular analysis.  Tables 2-3 through 2-5
clearly indicate that, no matter which loss function is employed, L2Q is
the best predictor to use.  In only 2 to 6 (depending on the loss
function) of the 76 analyses (38 pairs, with and without the sieve) does
that predictor fail to yield smaller loss than does LM.  Analysis 2,
for example, appears to yield better prediction when LM is used instead
of L2Q.  Comparison of the loss values among the analyses, however,
reveals that Analysis 2 is not one of the better methods of calculating
RRD estimates, so this observation has little bearing on the noted
superiority of L2Q.  In three to five instances (again depending on the
choice of loss function) L2Q does not provide loss smaller than results
from using MLE2Q.  Analyses 6 and 18 account for most of these
exceptions.  Recall that these analyses could be performed on only six
and ten chemicals, respectively, so again, the fact that MLE2Q yields
the smaller loss should not be given too much weight.
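
The difference between a minimum predictor and a median predictor can be made concrete with a short Python sketch.  It assumes that LM and L2Q are, respectively, the minimum and the median of the lower-bound RRDs over the data sets retained for a chemical (the "2Q" subscript is read here as "second quartile"); the function name and the numbers are hypothetical.

```python
import math
from statistics import median
from typing import Sequence, Tuple


def minimum_and_median(lower_bounds: Sequence[float]) -> Tuple[float, float]:
    """Return (minimum, median) of a chemical's lower-bound RRD estimates."""
    if not lower_bounds:
        raise ValueError("at least one estimate is required")
    return min(lower_bounds), median(lower_bounds)


if __name__ == "__main__":
    typical = [0.5, 2.0, 8.0]          # hypothetical lower bounds, mg/kg/day
    with_outlier = [0.001, 2.0, 8.0]   # one unusually low estimate
    with_infinite = [0.5, 2.0, math.inf]
    for estimates in (typical, with_outlier, with_infinite):
        print(minimum_and_median(estimates))
```

A single unusually low estimate moves the minimum by orders of magnitude but leaves the median where it was, which is one plausible reason a median predictor would behave more stably than a minimum predictor.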

The superiority of L2Q is independent of the choice of loss function.
Also independent of loss function is the fact that MLE2Q is superior to
MLEM among those analyses employing a sieve; the MLE2Q losses are
smaller in 17 or 18 of the 22 such analyses for which MLEM and MLE2Q
differ.  (Note that the analyses that average over sex, study, and
species, Analyses 12 through 24d, provide a single lower bound estimate
and a single maximum likelihood estimate.  Consequently, LM = L2Q and
MLEM = MLE2Q for those analyses and the losses associated with use of LM
are identical to those associated with use of L2Q and similarly for MLEM
and MLE2Q.)  Included in the set of 17 or 18 analyses are those that
yield the smallest losses when a maximum likelihood predictor is used, so
the superiority of MLE2Q over MLEM is clear when the sieve is applied.

The case is not so obvious when no sieve is applied.  In this instance,
the results do depend on the choice of loss function.  When fit is
measured by the CAUCHY loss function (Table 2-4), MLE2Q is superior to
MLEM in 15 of the 22 no-sieve analyses.  When the TANH loss function is
used (Table 2-5), MLE2Q is superior in only 10 of 22 no-sieve analyses.
However, with both loss functions, MLE2Q is better than MLEM for most of
those analyses that yield the smallest losses (Analyses 2 and 11c being
the exceptions).  Thus, unless one wishes to analyze bioassays by method
2 (using only those experiments that dosed treated animals for at least
80% of the standard experiment length) or by 11c (use rat experiments
only) without a sieve, one must conclude that in the case of the maximum
likelihood estimates, as well as in the case of the lower bounds, a
median predictor is a better choice than a minimum predictor.
Interestingly, Analysis 2 (though not 11c) is also the method
consistently yielding smaller losses with LM than with L2Q.

One final observation will conclude this comparison of the predictors.
Since L2Q is better than LM and MLE2Q is better than MLEM, it is of
interest to determine whether LM is better than MLE2Q.  The answer
depends to some extent on the choice of the two loss functions that can
be used to compare these predictors, CAUCHY and TANH (Tables 2-4 and
2-5, respectively).  Among the 22 pairs of analysis methods that do not
average over sex, study, and species the CAUCHY loss function indicates
that LM produces smaller average loss than does MLE2Q in 21 cases, i.e.
less than half the time.  Among those same analyses, LM outperforms
MLE2Q for **u analyses when measured by TANH loss.  Both loss functions
indicate that MLE2Q is superior for Analyses 3b with and without the
sieve (analyses providing some of the best correlations in the
correlation analysis) and for Analysis 6 with or without the sieve (the
method applicable to only six chemicals).  For those 32 analyses that do
average over sex, study, and species (Analyses 12 through 24d, with and
without the sieve), LM is better in every case but two (Analysis 18 with
and without the sieve) no matter which loss function is used.  Thus, LM
outperforms MLE2Q in the majority of cases, especially when assessed by
the TANH loss function, but any conclusion about the superiority of LM
may depend strongly on the analysis methods that are of interest.

Comparison of Analysis Methods

The comparison of the analyses and the identification of the best ones
are complicated because four separate predictors have been used and
three different loss functions have been defined.  If the different
predictors or different loss functions result in distinct orderings of
the methods, interpretation is more difficult.  Table 2-6 presents the
five best analyses (those giving the smallest average losses) by
predictor and loss function.

Analyses 6, 18, and 19 dominate the list of methods giving smallest
losses.  This is true no matter which predictor or which loss function
is used.  (Analysis 18 does not appear in any TANH list, however.)
Recall that these are the analyses cited in the discussion of correlation
analysis results as those that yield relatively large correlation
coefficients but that are applicable to few chemicals (six, ten, and
nine chemicals for Analyses 6, 18, and 19, respectively).  So, as with
the correlation results, the prediction results are suggestive for these
analysis methods, but no firm conclusions are warranted.

Table 2-7 lists the analyses that yield the smallest average losses
after eliminating Analyses 6, 18, and 19, which are based on relatively
few chemicals.  Analysis 17 appears on the list frequently.  That method
uses the response that is the combination of significant individual
responses and is limited to experiments that dosed and observed the test
animals for a suitably long period.  Furthermore, RRD estimates are
averaged over sex, study, and species.  That this method should appear
to provide good fit to the data is somewhat surprising since the
correlation coefficients associated with it are on the order of 0.58,
not among the better correlation results.  Once again, however, the
number of chemicals that can supply data meeting the requirements of
this approach is limited; only 11 chemicals had studies that dosed and
observed animals long enough and for which the combination of
significant responses could be defined.

Analysis 16 is similar to Analysis 17 in that the endpoint chosen is the
combination of significant responses and the estimates are averaged over
sex, study, and species.  It does not, however, exclude experiments on
the basis of their length of observation and dosing, so that thirteen
chemicals can be analyzed by this method.  Analysis 16 also appears in
the list given in Table 2-7, predominantly when the CAUCHY loss function
is used.  The analysis that does not average over sex, study, and
species but that does use the combination of significant responses
(Analysis 8a) does not yield average losses that are among the smallest,
for any loss function or predictor.  Thus, it appears that averaging at
each level may be the most useful method when the combination of all
significant responses is the endpoint used.

Analysis 20 is identified as a method yielding small losses, but only
when a lower bound predictor is used.  That method is best when loss is
measured by the TANH function.  This analysis method selects as the
endpoint of interest total tumor-bearing animals and averages estimates
over sex, study, and species.  The corresponding unaveraged method
(Analysis 8b) does yield losses not much larger than those associated
with Analysis 20, 0.127 vs. 0.121 and 0.125 vs. 0.121 for LM and L2Q,
respectively (measured by TANH).  Consequently, use of total tumor-
bearing animals in conjunction with a lower bound estimate appears to be
an appropriate technique, if TANH is a suitable measure of loss.

Note that, of the twenty analyses listed as providing the smallest
average losses determined by CAUCHY and TANH (the two functions that
consider the best epidemiological estimates of RRD) for the lower bound
predictors (LM and L2Q), all but four use an endpoint that is a
combination of individual responses, either total tumor-bearing animals
or the combination of significant responses.  Due to limitations in the
data available for analysis, not inherent limitations of the methods
themselves, some of these analyses were applicable to relatively few
chemicals.  Nevertheless, the consistency with which these endpoints
yield small average loss indicates that they should be considered viable
candidates for estimation of human risk.

The DISTANCE2 loss function identifies a set of good methods that
intersects with the sets identified by CAUCHY and TANH infrequently.
All but one of the ten analyses listed in Table 2-7 under DISTANCE2 use
individual carcinogenic responses rather than a combined response.  In
two cases (Analysis 25, the best when LM is the predictor, and
Analysis 8c, also associated with the LM predictor) the response is
limited to those that are associated with human exposure.  This response
may be identifiable when human data exist, but when such data are
absent, as would be the case for a new chemical, then the appropriate
choice of endpoint is unknown and application of these methods is
problematical.

The DISTANCE2 loss function identified Analysis 3b with the sieve as the
best method (in terms of average loss) when L2Q is the predictor.  This
analysis was also clearly superior in the correlation analysis
(t = 0.90).  It is not surprising that DISTANCE2 would tend to match the
results of the correlation analysis, especially when L2Q is the
predictor.  First, L2Q was one end of the interval of animal RRDs used
in the correlation analysis.  Second, DISTANCE2 does not consider the
location of the best estimate of RRD derived from the epidemiological
data and so is concerned only with the position of the human interval.
In any case, Analysis 3b yields the smallest loss with the DISTANCE2
function and reasonably small losses with CAUCHY and TANH, 0.413 and
0.140, respectively.  (Note: it is not appropriate to compare the loss
values obtained using different loss functions.  The fact that different
formulations of loss are used entails that the values in the different
columns of Table 2-7, for example, are not comparable.)

Since L2Q is the predictor that produces the best fit of the animal
results to the human results (a fact that is reinforced by examination
of Table 2-7), we concentrate on those analyses that perform best with
that predictor.  Table 2-7 shows that Analyses 3b, 17, and 20 are the
analyses yielding the smallest average losses for one of the three loss
functions.  One would like to have results that are independent of the
choice of loss function.  That is, a good analysis method should be
robust with respect to differences in loss functions.  To investigate
the analyses in this manner, we have defined what is called "total
incremental normalized loss" as follows.  For each loss function, the
difference between the smallest average loss and the largest average
loss among the analyses (still ignoring Analyses 6, 18, and 19) when
L2Q is used is known.  For each analysis the difference between the
average loss for that analysis and the minimum average loss, divided by
the difference between the minimum and maximum average losses, is
defined as the "incremental normalized loss".  The sum of these across
all three loss functions gives the total incremental normalized loss
(Table 2-8).  Normalization eliminates the difference in scale of the
three loss functions and should allow an overall appraisal of the
analyses.
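
The definition just given can be sketched in a few lines of Python.  This
is only an illustration of the arithmetic: the function name, the
analysis labels A1 to A3, and the loss values are hypothetical and are
not taken from Tables 2-7 or 2-8.  The sketch also assumes that every
analysis has an average loss under every loss function.

```python
def total_incremental_normalized_loss(avg_losses):
    """Sum, over loss functions, of each analysis's normalized excess loss.

    avg_losses maps loss-function name -> {analysis name -> average loss}.
    Assumption: every analysis appears under every loss function.
    """
    totals = {}
    for by_analysis in avg_losses.values():
        smallest = min(by_analysis.values())
        largest = max(by_analysis.values())
        spread = largest - smallest
        for analysis, loss in by_analysis.items():
            # 0 for the best analysis under this loss function, 1 for the worst.
            incremental = (loss - smallest) / spread if spread > 0 else 0.0
            totals[analysis] = totals.get(analysis, 0.0) + incremental
    return totals


# Hypothetical average losses, NOT values from the report's tables.
example_losses = {
    "CAUCHY": {"A1": 0.40, "A2": 0.50, "A3": 0.60},
    "TANH": {"A1": 0.12, "A2": 0.14, "A3": 0.13},
    "DISTANCE2": {"A1": 0.02, "A2": 0.01, "A3": 0.03},
}
totals = total_incremental_normalized_loss(example_losses)
best = min(totals, key=totals.get)  # analysis with the smallest total
```

Because each term lies between 0 and 1, the total for three loss
functions lies between 0 and 3, and an analysis that is best under all
three would score 0.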

Table 2-8 reveals that Analysis 17 (with the sieve) obviously adds least
to the average loss incurred.  Analysis 17 without the sieve is nearly
as good.  Analyses 3b with the sieve and 20 without the sieve, the other
two analyses picked as best by one of the three loss functions, yield
total incremental losses that are about the same, 0.555 and 0.558,
respectively, and follow the pair of Analysis 17 results as the next
best methods of analysis.

Figures 2-35 through 2-38 display the plots of those four analysis
methods.  One thing that is clear from these figures is that Analysis 17
derives much of its good performance from the specific subset of
chemicals to which it can be applied.  For only three of those eleven
chemicals does the best fitting line fail to pass through the interval
of human RRD estimates, for any loss function.  However, even when
Analyses 3b with the sieve and 20 without the sieve are limited to the
same eleven chemicals (Table 2-9), Analysis 17 with the sieve is better
when measured by CAUCHY and TANH.  On the other hand, Analysis 3b with
the sieve, restricted to the seventeen chemicals to which Analysis 20
can be applied, yields smaller losses than does Analysis 20 as measured
by all three loss functions.

Other analyses that yield relatively good, robust results can be
identified from Table 2-8.  Those for which the total incremental
normalized losses are less than 1.0, for example, for at least one
member of the pair of results (with or without the sieve) include
Analyses 4a through 4d (analyses that differ from the standard only with
respect to the dose units used to extrapolate from animals to humans);
8b, utilising total tumor-bearing animals as the endpoint (as does
Analysis 20 discussed above); 8c, which is limited to carcinogenic
responses that humans get; 9, which averages results over sex within a
study; 11b, a method that averages the results from rats and mice; and
11c, which uses rat data only.  The best total incremental normalized
losses among these analyses range from 0.698 to 0.963.

Note, in passing, that an alternative ranking procedure that can be
applied is a minimax scheme.  That is, for any analysis, the maximal
loss over the three loss functions can be determined.  The analyses that
have the smallest maxima are best in a minimax framework.  Since the
loss functions have different scales, this approach should also be based
on the incremental normalized losses.  If this is done, then
Analyses 17, 3b, and 20 remain the best three, in order, and several of
the others just cited, notably 8c and 8b, remain in the list of good
analyses.  Analysis 22 also satisfies the minimax criterion well.
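
The minimax ranking described above can be sketched as follows.  As
before, this is a minimal illustration rather than the procedure actually
coded for the study: the function names, the labels A1 to A3, and the
loss values are hypothetical.

```python
def incremental_normalized_losses(by_analysis):
    """Rescale one loss function's average losses to the range 0..1."""
    smallest = min(by_analysis.values())
    spread = max(by_analysis.values()) - smallest
    return {
        analysis: ((loss - smallest) / spread if spread > 0 else 0.0)
        for analysis, loss in by_analysis.items()
    }


def minimax_ranking(avg_losses):
    """Order analyses by their worst normalized loss over all loss functions."""
    worst = {}
    for by_analysis in avg_losses.values():
        for analysis, inc in incremental_normalized_losses(by_analysis).items():
            worst[analysis] = max(worst.get(analysis, 0.0), inc)
    # Smallest maximum first: the minimax-best analysis leads the list.
    return sorted(worst, key=worst.get), worst


# Hypothetical average losses, NOT values from the report's tables.
example_losses = {
    "CAUCHY": {"A1": 0.40, "A2": 0.45, "A3": 0.60},
    "TANH": {"A1": 0.13, "A2": 0.12, "A3": 0.16},
    "DISTANCE2": {"A1": 0.020, "A2": 0.030, "A3": 0.040},
}
ranking, worst = minimax_ranking(example_losses)
```

In this made-up example A1 is never the worst under any loss function and
so leads the ranking, even though A2 is best under TANH.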

Asymmetric Loss

The  discussion up  to  this  point  has  concerned observations relating  to
loss that is  symmetric.  That  is,  the  loss  functions employed reflected
the  assumption that it  is  no worse to  overestimate RRDs by a given
amount  than to underestimate them by the  same amount.  In fact,  it is
reasonable  to think otherwise,  i.e.  to think and  base our decisions  on
the  premise that overestimation  is worse  than underestimation.   The
health  considerations involved in cancer  risk assessment make this a
prudent approach.

The effect of incorporating asymmetry into the loss calculations is
investigated in the following manner.  The TANH loss function has been
used to fit a line to the results of each analysis method.  In the
definition of TANH is a factor, m, called the asymmetry constant, that
reflects the degree of asymmetry thought to be pertinent.  The symmetric
version has m = 1.  The fitting is performed now with m equal to 1.5, 2,
5, 10, 50, and 100.  Larger values of m reflect stronger beliefs about
the inadvisability of overestimating RRDs.  Tables 2-10 and 2-11 display
the results for the lower bound predictors.

Note that the losses incurred when m = 50 and m = 100 are identical (to
three decimal places) for every analysis.  Any high degree of asymmetry
drives the line toward the chemicals in the lower right-hand corner of
the plots.  Figure 2-39 displays this phenomenon for Analysis 3b, the
method applicable to the most chemicals.

As with the symmetric version of the TANH loss function (cf. Table 2-5),
Analyses 6 and 19 perform well with moderate degrees of asymmetry.
Analysis 6 continues to be among the five best for larger degrees of
asymmetry while 19 does not.  When the minimum lower bound is the
predictor, Analysis 20 (for moderate degrees of asymmetry) and Analysis
17 (for all degrees, with the interesting exception of m = 5) perform
well, as they did with symmetric loss.  When the predictor is L2Q,
Analysis 17 is again good for all degrees of asymmetry, but Analysis 8b,
not Analysis 20, moves to the top five for moderate asymmetry constants.
For both predictors, Analysis 8c (which uses an endpoint that humans
get) produces small losses for m > 5.  Most notable, however, is the
fact that Analysis 22 (using total malignancy-bearing animals and
averaging over sex, study, and species), a method moderately good with
no asymmetry, is second only to Analysis 6 (which is applicable to only
six chemicals) for high degrees of asymmetry.  This implies that, if it
is deemed necessary or desirable not to overestimate any RRD, method 22
is the best analysis to use (once again ignoring the suggestive results
of Analysis 6).

It  is  possible  to  characterize those  analyses  that  will  provide the
smallest  losses  with an  asymmetric  loss function.   If  those chemicals
that  fall below the line fit with  the symmetric function  are nearly
colinear  with  slope equal to one,  then a  great reduction  in asymmetric
loss  can  be  achieved by  moving the  line to the right (decreasing  the
y-intercept).   Of  course,  if those  chemicals that lie  above the
symmetrically  fit  line also have  this colinear relationship,  then the
 increase in  loss for those chemicals when the line moves to the  right
 (as it always  will with  the introduction of  asymmetry) can be minimized.
 Hence,  those analyses that do not produce outliers falling in the upper
 left corner  or,  especially, in the lower right corner  of the plots will
suffer relatively less loss than analyses that do produce such outliers.

In this regard, compare Analyses 20 and 22 (Figures 2- 50.  In these cases the conversion is driven by the chemicals
that overestimate the most (cf. asbestos in Figures 2-40 and 2-41).  If
one believes that it is not 50 times worse to overestimate than to
underestimate RRDs, then conversions that are still protective (what has
been called "conservative") can be obtained.  These correspond to
smaller values of m and tend to include more, though not necessarily
all, chemicals above the fitted line, the region where bioassays predict
larger risks than are obtained from the epidemiology for the given
conversion.

Since the question of asymmetric loss is closely linked to degrees of
belief about the relative desirability of underestimation and
overestimation of RRDs, no further investigation of this issue is
undertaken.  It should be borne in mind, however, that all of the
analyses reported in this document can be undertaken using asymmetric
loss.  The remainder of the results and the discussion focus solely on
symmetric loss.

Animal-to-Human Conversion

In the previous discussion of asymmetry, conversion of animal RRD
estimates to human RRD estimates was mentioned.  This conversion is
based on the best-fitting line that relates the two sets of RRD
estimates.  Specifically, it depends on the y-intercept, c, that defines
the line

     Log10(RRDH) = Log10(RRDA) + c                                 (2-1)

or

     RRDH = 10^c · RRDA                                            (2-2)

which of course depends on the analysis method and predictor.   This
conversion is over and above those that are used to equate the units
between humans and animals: recall that, for each of the possible
choices of units used to extrapolate doses from animals to humans,
species- and chemical-specific dose conversions were used to arrive at
the units mg/kg/day in humans, the units in which all RRD estimates are
expressed.

The conversion that is discussed here is a multiplicative factor that is
the empirical result of fitting Eq. 2-1 to the ensemble of bioassay
data.  The fitted line will rarely pass through a data point.  That is,
for any given chemical used to fit the line, the conversion determined
by Eq. 2-1 rarely describes the exact relationship between RRDH and RRDA
for that chemical alone.  Rather, all the study chemicals together
determine c, and this factor may then be applied to estimate RRDH for
any other chemical without direct epidemiological estimates.  Tables
2-12 through 2-15 display the y-intercept values for each analysis and
each predictor.

It is of some interest to determine the conversion factor suggested by
the data that applies to the standard analysis, which is modelled after
the Carcinogen Assessment Group's (CAG's) usual procedure.  That group
uses the minimum lower bound as its predictor.  Table 2-12 shows that
Analysis 0 with LM yields y-intercepts between 0.51 and 1.71 when no
sieve is applied and between 0.83 and 1.07 when the full sieve is used.
The ratios, RRDH/RRDA (which we will call conversion factors), with
these intercepts range between 3.24 (= 10^0.51) and 51.7 or 6.71 and 11.7
without or with the sieve, respectively.  These figures are uniform in
suggesting that CAG's procedure is conservative, in the sense of
underestimating RRDs or overestimating risk and so being protective of
human health.  Given that CAG screens its data to select the best
available studies, a process that may act like our sieve, the degree of
underestimation is likely to be about an order of magnitude for the
level of risk of interest here.
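
The step from a y-intercept to a conversion factor is a single
exponentiation, as the following sketch shows.  The function names are
illustrative only.  The intercepts are the two-decimal values quoted
above for Analysis 0, so the computed factors can differ slightly from
the quoted ratios, presumably because the quoted ratios were derived
from unrounded intercepts.

```python
def conversion_factor(c):
    """Ratio RRDH/RRDA implied by a y-intercept c of the unit-slope line."""
    return 10.0 ** c


def human_rrd(animal_rrd, c):
    """Human RRD predicted from an animal RRD, on the original dose scale."""
    return conversion_factor(c) * animal_rrd


# Two-decimal intercepts quoted in the text for Analysis 0 with LM.
no_sieve = [conversion_factor(c) for c in (0.51, 1.71)]
full_sieve = [conversion_factor(c) for c in (0.83, 1.07)]
```

A positive intercept gives a factor greater than one, meaning the animal
RRD has to be scaled up to match the human estimate, which is the sense
in which the unadjusted animal-based estimate is conservative.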

Since L2Q was found to be the best predictor regardless of loss
function, the remainder of the discussion of conversion factors focuses
on that predictor (Table 2-13).  Over all analyses and loss functions,
the ratio, RRDH/RRDA, determined by Eq. 2-2 ranges from 0.184 to 151.
Among those analyses that are based on extrapolation assuming mg/m2/day
human-and-animal equivalence (which include almost all of the analyses
since the standard analysis assumes such equivalence) the ratio ranges
from 0.184 (from the CAUCHY loss function applied to Analysis 21 with
the sieve) to 74.6 (from the TANH loss function line fit to Analysis 6
with or without the sieve).  Since Analysis 6 results are based on only
six chemicals, an alternative upper value that is more firmly supported
can be obtained from the CAUCHY line fit to Analysis 3a without the
sieve, 28.4.  On the other hand, if we limit attention to those analyses
that appear to yield the smallest average losses with the L2Q predictor
and the loss functions for which they are best (cf. Table 2-7) then the
range is from 1.29 (Analysis 20 without the sieve, TANH loss function)
to 16.7 (Analysis 3b with the sieve, DISTANCE2 loss function).

At this point one can compare and contrast the results of the analyses
that are identical except for choice of the dose units assumed to yield
animal and human equivalence with respect to carcinogenic response
(component 4, cf. Table 1-1).  To facilitate this comparison, the
supplemental analyses discussed in Section 1 are examined as well.  The
results for these analyses are presented in Tables 2-16 and 2-17.

It is possible  to  identify three  sets of five analyses each such that  the
analyses within a set differ only with respect to the dose units assumed
to yield equivalence.   These sets are (0, 4a, 4b, 4c.  
-------
DISTANCE2 loss function and in the second set, {12, 24a-24d}, with the
DISTANCE2 and TANH loss functions.  In all these instances,
mg/kg/lifetime are the units producing the smallest average losses.  The
units mg/kg/lifetime are linear transformations of mg/kg/day dependent
only on the length of experiment, so a weight-based extrapolation
appears good when the sieve is used.  For Analysis 30 with the sieve,
the analysis that yields the smallest loss of any in Table 2-18, the
y-intercepts (Table 2-19) indicate the ratio RRDH/RRDA is between 1.079
and 2.438, depending on which of the loss functions is used.  If
attention is restricted to the loss functions that base the fit of the
lines on the location of the best epidemiological RRD estimates (i.e.
CAUCHY and TANH) the range is narrowed to between 1.079 and 1.698.
Thus, these calculations indicate that RRDs obtained from Analysis 30,
the least restrictive analysis using mg/kg/day, very slightly
underestimate human RRDs.  This is interesting in light of the fact that
Analysis 4a, which is like Analysis 30 in every way except that routes
of exposure are limited to inhalation, gavage, oral, and the route that
humans encounter, overestimates RRDs on average (note the negative
intercepts in Table 2-19).  This is an instance of a general phenomenon:
no matter what units are used for extrapolation, the analysis that is
less restrictive with respect to routes of exposure yields larger
y-intercepts than the more restrictive analysis.  The effect of
including all routes of exposure appears generally to be to decrease the
median lower bound.  Using the restricted set of exposure routes but
averaging results over sex, study, and species has the same effect.
Conversion factors (ratios) for all units of extrapolation are given in
Table 2-20.

To close out this discussion of conversion factors, it is of some
interest to compare conversion from rats to humans and from mice to
humans.  The comparison can be made using Analyses 11c and 11d (rats
alone and mice alone, respectively, restricted routes of exposure, with
extrapolation based on mg/m2/day) and Analyses 49 and 50 (rats alone and
mice alone, respectively, any route of exposure, with extrapolation
based on mg/kg/day).  For the first pair of analyses, the rat bioassay
conversion factor ranges from 0.81 to 1.85 with no sieve and from 1.43 to
1.92 with the sieve whereas the mouse bioassay conversion factor may
vary between 1.78 and 11.67 without the sieve and between 3.72 and 4.30
with the sieve.  These results indicate that, unadjusted, the rat
results come closer to the direct epidemiological results.  (Average
losses are generally smaller with rat data also.)  For the supplemental
pair, Analyses 49 and 50, rat data fit better only when the sieve is
not applied (Table 2-16) but tend to overestimate human RRDs whereas the
mouse data underestimate (Table 2-17).  When no sieve is applied, the
degree of underestimation with mouse data is comparable to the degree of
overestimation with rat data.  However, when the sieve is applied, the
underestimation with mouse data (conversion factors between 1.31 and
1.53) is less extreme than the overestimation with rat data (conversion
factors between 0.32 and 0.58).  [All conversion factors are based on
the CAUCHY and TANH loss functions, not DISTANCE2.]

Uncertainty

It is important to characterize the sources and amount of uncertainty
associated with any method of estimating human risks from animal data.
As described  in Section  1, two approaches are taken to investigate
uncertainty.   The first, which is referred to as residual uncertainty,
is the analog of the residual error aspect of statistical analyses.   It
applies to each analysis method as a  whole and delineates the degree  of
uncertainty that remains even when the best unit-slope line describes
the  data.  The other uncertainty  investigation attempts to say something
about the uncertainty  associated  with each of the  components of  risk
assessment.   This  investigation  is more  qualitative,  but aids in
identification of major  sources  of uncertainty and in the degree of
variation attributable to  those  sources.

Residual Uncertainty.  The DISTANCE2 loss function is ideally suited to
an investigation of residual uncertainty.  This function finds the line
that minimizes the squared distances to the intervals defining the range
of epidemiologically derived RRD estimates.  That being the case, the
contribution to the total loss of any individual chemical indicates how
far that line is from the chemical's interval and thus indicates
uncertainty over and above that associated with the epidemiologically
derived estimates.  In this sense it is called residual uncertainty: it
is uncertainty remaining after the epidemiological uncertainty is
considered.
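
A rough sketch of a fit of this kind is given below.  It is one reading
of the description above, not the report's definition of DISTANCE2,
which is given in Section 1: the sketch assumes a unit-slope line on the
log10 scale, takes a chemical's distance to be zero when the predicted
value falls inside its human interval, and averages the squared
distances.  All names and data are hypothetical.

```python
def interval_distance(predicted, lower, upper):
    """Distance from a predicted value to the interval [lower, upper]."""
    if predicted < lower:
        return lower - predicted
    if predicted > upper:
        return predicted - upper
    return 0.0


def distance2_loss(c, chemicals):
    """Mean squared distance between the line y = x + c and each interval.

    chemicals: list of (log10 animal RRD, log10 lower, log10 upper) tuples.
    """
    total = 0.0
    for log_animal, log_lower, log_upper in chemicals:
        d = interval_distance(log_animal + c, log_lower, log_upper)
        total += d * d
    return total / len(chemicals)


def fit_intercept(chemicals, iterations=200):
    """Ternary search for the intercept; the loss is convex in c."""
    lo = min(log_lower - x for x, log_lower, _ in chemicals)
    hi = max(log_upper - x for x, _, log_upper in chemicals)
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if distance2_loss(m1, chemicals) < distance2_loss(m2, chemicals):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0


# Two hypothetical chemicals whose intervals cannot both be reached.
example = [(0.0, 1.0, 2.0), (0.0, 4.0, 5.0)]
c_hat = fit_intercept(example)
loss_at_fit = distance2_loss(c_hat, example)
```

In the example the fitted line settles midway between the two intervals,
so each chemical contributes equally to the remaining loss.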

For any analysis method, the DISTANCE2-fitted line determines a
predicted dose, RRDp, for each chemical.  If RRDp for any chemical lies
between the upper and lower bounds of the epidemiologically derived
estimates, RRDH,U and RRDH,L, respectively, then no residual uncertainty
exists for that chemical.  Otherwise, residual uncertainty remains.  The
residual uncertainties are aggregated in two ways so as to indicate
something about the uncertainty in terms of multiplicative factors that
may be applied to the predictions to give a range of estimates about the
predicted values which are consistent with the data (cf. the description
of the methods in Section 1 of this volume).  Of course, larger factors
(wider ranges) indicate greater residual uncertainty.

When all the chemicals included in any analysis method, even those with
no residual uncertainty, are used to characterize uncertainty, a single
factor is estimated.  This factor is the average amount by which the
predicted RRDs must be multiplied or divided so as to eliminate residual
uncertainty.  Alternatively, two sets of chemicals, those whose
epidemiological estimates lie completely above the line of predicted
values and those whose epidemiologically derived estimates lie
completely below that line, can be used separately to determine two
multiplicative factors, one to accommodate underprediction and one to
accommodate overprediction.  Tables 2-21 and 2-22 present these factors
for all analyses (including the supplemental analyses) using the L2Q
predictor.
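
The two aggregations can be illustrated with the sketch below.  The
per-chemical factor follows directly from the description: it is the
multiplier that moves the predicted RRD to the nearer end of the human
interval, and it equals one when the prediction already lies inside the
interval.  The averaging rule, however, is specified in Section 1 and is
not reproduced here; the geometric mean used below is an assumption made
only for illustration, as are all names and numbers.

```python
import math


def residual_factor(predicted, lower, upper):
    """Multiplier that moves a predicted RRD onto the human interval."""
    if predicted < lower:
        return lower / predicted   # interval above the line: factor > 1
    if predicted > upper:
        return upper / predicted   # interval below the line: factor < 1
    return 1.0                     # no residual uncertainty


def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))


def uncertainty_factors(chemicals):
    """chemicals: list of (predicted RRD, lower bound, upper bound).

    Assumption: factors are averaged geometrically; the report's own
    averaging rule is defined in its Section 1.
    """
    factors = [residual_factor(p, lo, hi) for p, lo, hi in chemicals]
    above = [f for f in factors if f > 1.0]
    below = [f for f in factors if f < 1.0]
    return {
        # Fold difference regardless of direction, over all chemicals.
        "overall": geometric_mean([max(f, 1.0 / f) for f in factors]),
        "above_line": geometric_mean(above) if above else 1.0,
        "below_line": geometric_mean(below) if below else 1.0,
    }


# Hypothetical predictions and human intervals (same dose units).
example = [(1.0, 2.0, 4.0), (10.0, 5.0, 20.0), (100.0, 10.0, 25.0)]
result = uncertainty_factors(example)
```

Under this convention a below-the-line factor is less than one, so a
value of 0.25 corresponds to dividing the prediction by 4.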

Analyses 6, 18, and 19 are the analyses yielding the smallest factors.
These are the analyses with the fewest numbers of chemicals.  As in
previous discussions, no more will be said about these analyses.

The only other analyses for which overall uncertainty factors (the
factors based on all chemicals included in an analysis) are less than
2.0 are Analyses 45 and 47, with the sieve.  These supplemental analyses
average either over sex (45) or over all species (47).  As can be seen
from Table 2-12, these two analyses are two of the best of the
supplemental, indeed of all, analyses.  Of course, the overall
uncertainty factors are closely tied to loss as determined by the
DISTANCE2 function.  Consequently, those producing small average loss
(e.g. those in Table 2-6) also yield relatively small uncertainty
factors.

The factors estimated using only those chemicals with positive residual
uncertainty (those for which the line does not intersect their vertical
interval) generally follow the same pattern as the overall uncertainty
factors.  Since fewer chemicals are used to estimate these values, they
may be less stable than the overall factors, however.  The usefulness of
separate "above the line" and "below the line" estimates can be
visualized if one considers that the chemicals completely below the line
are the ones of primary concern.  They are the chemicals for which
bioassay data overestimate RRDs (even given the conversion factor
suggested by the best-fitting line).  As long as one accepts that the
health implications are worse when RRDs are overestimated than when they
are underestimated, it may be reasonable to want to eliminate residual
uncertainty with respect to the former but not with respect to the
latter.  One approach mentioned earlier is to use asymmetric loss
functions; high degrees of asymmetry do act to eliminate the residual
uncertainty of concern.  Another approach, embodied here, is to estimate
an uncertainty factor tailored to those chemicals below the line.

That uncertainty factor can be seen to vary between 0.009 (Analysis 21
without the sieve) and 0.363  (Analysis 45 with the sieve), still
ignoring Analyses 6,  18, and  19.  Generally, among the better analyses,
the values indicate that predictions would need to be divided by a
factor of 3 to 5 to account for the chemicals that overpredict RRDs.

Component-Specific Uncertainty.  The supplemental analyses consist of an
alternative standard  (Analysis 30)  and 18 variations of the standard.
Each variant differs from the standard in only one respect, i.e. in the
approach taken to one of the  components defining the analyses.   This
supplemental  set  is used to  investigate the uncertainty associated with
each of  those components.

The alternative standard accepts any experiment and assumes a mg/kg/day
equivalence between humans and animals.  This alternative is used in
place of Analysis 0 because the correlation analysis and certain of the
prediction analysis results suggest that allowing all routes of exposure
(Analysis 3b) is preferable to restricting the routes to those that
humans encounter, gavage, inhalation, or oral.  Moreover, mg/kg/day
rather than mg/m2/day may be the preferred units for extrapolation when
L2Q is the predictor.  All component-specific uncertainty investigations
are limited to this predictor.

In such investigations, one is interested in how the RRD estimates
change when a component is changed.  Consequently, it is not necessary
that there be epidemiologically derived estimates to use for comparison
and so all 44 chemicals (not only the 23 with human data) can be used to
address this question.

It is usually the case that any change in an assumption underlying a
quantitative risk assessment will result in a change in the risk
estimates.  A component-specific uncertainty investigation should then
tell us two things: how the risk estimates change and how consistently
they change.  A histogram approach has been used to address these
issues.

Figures 2-42 through 2-59 display the histograms resulting from this
investigation.  Each histogram corresponds to one of the variations of
the alternative standard analysis.  The entry for a chemical in any
histogram indicates the magnitude of the ratio of the RRD estimate
(L2Q) from the variant to that from the alternative standard.  In this
way, the distribution of the changes among the chemicals can be
visualized.
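
The ratio-and-histogram step described above can be sketched as follows. This is only an illustration under stated assumptions: the cut points, the function names, and the example RRD values are hypothetical. The only cut points taken from the report are the central interval from 0.8 to 1.25.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical cut points. Only the central 0.8-1.25 interval is taken from the
# report; the remaining boundaries are assumed for illustration.
CUT_POINTS = [0.001, 0.01, 0.1, 0.8, 1.25, 10.0, 100.0, 1000.0]


def rrd_ratios(variant, standard):
    """Ratio of variant RRD to alternative-standard RRD, per chemical.

    Chemicals missing from either analysis are skipped, since no ratio
    can be formed for them.
    """
    return {chem: variant[chem] / standard[chem]
            for chem in variant if chem in standard}


def bin_index(ratio, cut_points=CUT_POINTS):
    """Index of the histogram bin holding `ratio` (0 = below the first cut point)."""
    return bisect_right(cut_points, ratio)


def ratio_histogram(ratios, cut_points=CUT_POINTS):
    """Count of chemicals falling in each histogram bin."""
    return Counter(bin_index(r, cut_points) for r in ratios.values())


# Hypothetical RRD estimates (not values from the report).
standard = {"AB": 2.0, "AS": 0.5, "CS": 10.0}
variant = {"AB": 2.1, "AS": 0.05, "CS": 0.9}
histogram = ratio_histogram(rrd_ratios(variant, standard))
```

A chemical whose estimate is essentially unchanged lands in the central bin, so a tall central bar indicates a component whose choice has little effect.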

Table 2-23 displays the mode of the distribution of each histogram.
Also presented in that table is a dispersion factor that is analogous to
the uncertainty factor used in the residual uncertainty analysis and
specifies the average factor by which the ensemble of chemicals differs
from the mode (cf. the Section 1 description of the methodology).  This
factor is dependent on the specific cut points chosen for the
histograms, but because those cut points are the same for each analysis
method, it is a valid means of comparing the components with respect to
uncertainty.  The greater the factor, the less consistent is the change
in RRD estimates that results from the component change corresponding to
the histogram.  Less consistency (more chemical dependency) indicates

                                2-29

-------
more uncertainty is associated with the corresponding component.
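
The exact definition of the dispersion factor is given in Section 1 and is not reproduced here. The sketch below shows one plausible reading of "the average factor by which the ensemble of chemicals differs from the mode": a geometric average of fold-differences between each chemical's histogram bin and the modal bin. The formula, the cut points, and the use of geometric bin midpoints are all assumptions made for illustration, not the report's method.

```python
import math
from bisect import bisect_right
from collections import Counter

# Hypothetical cut points; only the 0.8-1.25 interval comes from the report.
CUT_POINTS = [0.01, 0.1, 0.8, 1.25, 10.0, 100.0]


def bin_center(index, cuts):
    """Geometric midpoint of a bin; open-ended bins fall back to the nearest cut point."""
    if index == 0:
        return cuts[0]
    if index == len(cuts):
        return cuts[-1]
    return math.sqrt(cuts[index - 1] * cuts[index])


def mode_and_dispersion(ratios, cuts=CUT_POINTS):
    """Return (mode, dispersion factor) for a list of RRD ratios.

    Assumed definition: dispersion = exp(mean |ln(bin center / mode)|),
    so a value of 1 means every chemical sits in the modal bin.
    """
    bins = [bisect_right(cuts, r) for r in ratios]
    modal_bin = Counter(bins).most_common(1)[0][0]
    mode = bin_center(modal_bin, cuts)
    mean_abs_log = sum(abs(math.log(bin_center(b, cuts) / mode)) for b in bins) / len(bins)
    return mode, math.exp(mean_abs_log)
```

Because the calculation works on bins rather than raw ratios, the result depends on the cut points, which is consistent with the caveat in the text. It remains comparable across components as long as the same cut points are used throughout.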

Figures 2-42 through 2-45 pertain to the choice of dose units used for
animal-to-human extrapolation.  These figures show relatively little
dispersion of the chemicals and hence indicate little uncertainty
associated with the choice of dose units.  That is not to say that the
resulting RRD estimates are not dramatically affected by changes in
units.  These are the only analyses for which the mode is not in the
interval from 0.8 to 1.25 (Table 2-23).  However, the dispersion factors
and the figures indicate that changing units has a relatively
predictable effect, one that is not chemical dependent.  The plot in
Figure 2-42, for example, largely reflects the standard values (body
weight, surface area coefficient; cf. Table 1-3) used in the conversion
from mg/kg/day to mg/m²/day in rats and mice.  When those standards are
used, the ratio of the RRD estimates (in all cases, the standard, using
mg/kg/day, is in the denominator) is about 0.09 for mice and 0.21 for
rats.  The chemicals falling between 0.1 and 0.2 in Figure 2-42 are the
result of using experiment-specific body weights or of cases in which
changing units also changes the ordering of the experiments (due to
species-specific changes) and so changes the experiment yielding the
median estimate.  Hence this figure, showing the greatest dispersion of
the four because of the fairly even split between use of mice and rats,
may even exaggerate the uncertainty here.
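
The predictable, species-driven effect of changing dose units can be illustrated with a short calculation. The sketch assumes that surface area is modeled as a coefficient times body weight to the 2/3 power. The body weights and coefficients below are placeholders, not the Table 1-3 standards, so the output will not reproduce the 0.09 and 0.21 quoted above exactly.

```python
def unit_change_ratio(animal_weight_kg, human_weight_kg=70.0,
                      animal_coeff=1.0, human_coeff=1.0):
    """Ratio of the human RRD (mg/kg/day) obtained under mg/m2/day equivalence
    to the one obtained under mg/kg/day equivalence.

    Assumes surface area = coeff * weight**(2/3), so weight per unit surface
    area is weight**(1/3) / coeff. All default values are illustrative only.
    """
    animal_kg_per_m2 = animal_weight_kg ** (1.0 / 3.0) / animal_coeff
    human_kg_per_m2 = human_weight_kg ** (1.0 / 3.0) / human_coeff
    return animal_kg_per_m2 / human_kg_per_m2


# Hypothetical body weights (kg); the report's standard values are in Table 1-3.
mouse_ratio = unit_change_ratio(0.03)
rat_ratio = unit_change_ratio(0.35)
```

Under these assumptions the ratio depends only on the species' body weight and coefficient, not on the chemical, which is why the histograms for dose units cluster around one value per species.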

The next two histograms (Figures 2-46 and 2-47) relate to criteria
placed on the length of observation and length of dosing, respectively.
The relatively large dispersion noted is due to extreme changes in one
or two chemicals.  Restriction to long experiments decreases the RRD
estimate for cigarette smoke by a factor over 1000.  Similarly,
restriction to experiments that dosed the treated animals for at least
80% of the standard experiment length increases RRD estimates for
asbestos and cadmium by over three orders of magnitude.

No extreme changes are noted when experiments are limited to those using
the route of exposure by which humans encounter the chemicals in
question (Figure 2-48).  However, for only 24 of the 44 chemicals were
there studies employing that route.  When a less restrictive criterion
is used (the route just mentioned plus gavage, inhalation, and oral

                                2-30

-------
routes; Figure 2-49), moderately extreme values do appear:
benzo(a)pyrene and arsenic, which change by factors of 847 and 410,
respectively.  These chemicals were not included in Figure 2-48 but
account for the majority of the dispersion seen in Figure 2-49.
Otherwise Figure 2-49 is less disperse than Figure 2-48.

The next four histograms (Figures 2-50 through 2-53) relate to the
choice of endpoints to be analyzed.  These are among the most disperse
of the figures in the sense of including several extreme changes
(Figures 2-50 and 2-53) and also in the sense of having less
dominant modes (Figures 2-51 through 2-53).  If not for two extreme
changes (cigarette smoke and saccharin, for which the ratios are
9.77x10-** and 8.69x10-6, respectively), Figure 2-50 (malignant tumors
only) would display much less dispersion, being on the order of 13
rather than 291.

All of the histograms discussed above, except for those that relate to
choice of dose units, depict changes that occur because a subset of the
data is used.  They demonstrate the effect such selections have on the
location of the median lower bound RRD.  A chemical with a ratio greater
than 1 is one for which the selection tends to eliminate smaller
estimates.  Components such as these that relate to bioassay or response
inclusion criteria can be very sensitive to the data that are available,
certainly more so than those components that relate to manipulation of
whatever data are available (such as the component related to the choice
of dose units).  This sensitivity is reflected in the fact that for
these histograms fewer than the maximum number of chemicals (44) are
addressable once the inclusion criteria are applied, and it may contribute
strongly to the appearance of the extreme changes that have been noted.
One must be aware that some confounding due to data availability may be
present in the histograms of Figures 2-46 through 2-53.
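
The effect of an inclusion criterion on the median estimate can be sketched as below. The record layout, the field names, and the numbers are hypothetical. `statistics.median_low` is used here so that the median corresponds to an actual experiment, which is an assumption about how the report's median is selected.

```python
from statistics import median_low


def median_rrd(experiments, keep=lambda exp: True):
    """Median experiment-specific RRD among experiments passing `keep`.

    Returns None when no experiment survives the criterion, i.e. the
    chemical is not addressable by that analysis.
    """
    values = [exp["rrd"] for exp in experiments if keep(exp)]
    return median_low(values) if values else None


def selection_ratio(experiments, keep):
    """Ratio of the restricted median RRD to the unrestricted median RRD."""
    restricted = median_rrd(experiments, keep)
    if restricted is None:
        return None
    return restricted / median_rrd(experiments)


# Hypothetical bioassay records for one chemical (not data from the report).
experiments = [
    {"rrd": 0.01, "long_observation": False},
    {"rrd": 0.1, "long_observation": False},
    {"rrd": 1.0, "long_observation": True},
    {"rrd": 10.0, "long_observation": True},
    {"rrd": 100.0, "long_observation": True},
]
ratio = selection_ratio(experiments, lambda exp: exp["long_observation"])
```

In this example the criterion removes the two smallest estimates, so the ratio exceeds 1. A chemical with only a handful of bioassays can show a very large ratio simply because one experiment is dropped, which is the data-availability confounding the text warns about.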

On the other hand, those analyses that dictate how the experiment-
specific RRDs are averaged are all based on the same data.  In the
standard analysis no averaging is performed.  Analyses 45 through 47
(Figures 2-54 through 2-56) average results over sex alone, study alone,
and species alone, respectively.  The uncertainty associated with any of
these procedures is small; the dispersion factors indicate that the

                                2-31

-------
average change in the RRD estimates is less than a factor of 2.2.

However, if we limit attention to rats and mice (Figures 2-57 through
2-59), uncertainty is again great.  This, too, may be in part a
reflection of dependence on data availability.  Consider the case of
saccharin, for which the data consist predominantly of mouse and rat
bioassays.  The rat studies are of better quality (they get rank 1 by the
quality screen) than are the mouse studies (rank 3), so that when both
are considered, only the rat studies are analyzed (the full sieve is used
in the analyses represented in the histograms).  Therefore, no change is
seen when rats alone or the average of rat and mouse data are used
(Figures 2-47 and 2-46, respectively).  The mouse results are over five
orders of magnitude smaller than the rat results.

Nevertheless, some species-specific changes can be discerned.  Cigarette
smoke is apparently more potent in rats than in other species.  Arsenic
is less potent in rats and mice, although this may also reflect some data
dependence.  Overall, the choice of species appears to be a highly
uncertain component of risk assessment, as indicated by the large
dispersion factors for Analyses 49 and 50 and by the difference in
dispersion between Analyses 47 and 48 (Figures 2-56 and 2-57), which
differ only in that species other than rats and mice are included in
Analysis 47.  It is easy to see how data availability can affect the
estimates from any given species, above and beyond the question of the
most appropriate species for any given chemical.
                                2-32

-------
                                  Table 2-1

                   CORRELATION COEFFICIENTS AND ASSOCIATED
                   p-VALUES, BY ANALYSIS METHOD AND SIEVE

                                                                       Quality and
                                                      Significance     Significance
                     No Screens      Quality Screen   Screen           Screen
          # of
Analysis  Chemicals  Corr.  p-value  Corr.  p-value   Corr.  p-value   Corr.  p-value
0         20         .68    .0002    .73    .0001     .78    <.0001    .78    .0001
1         18         .55    .0095    .55    .0083     .68    .0013     .63    .0015
2         19         .61    .0034    .55    .0075     .49    .0187     .49    .0153
3a        17         .62    .0041    .64    .0026     .74    .0005     .73    .0007
3b        23         .80    <.0001   .77    <.0001    .78    <.0001    .90    <.0001
4a        20         .70    .0008    .76    <.0001    .77    <.0001    .78    .0001
4b        20         .67    .0004    .73    .0001     .76    <.0001    .76    .0001
4c        20         .67    .0008    .71    .0004     .77    .0002     .78    <.0001
4d        20         .68    .0004    .7t    .0002     .77    .0001     .78    <.0001
5         20         .69    .0003    .74    .0001     .78    .0001     .75    <.0001
6          6         .96    .0028    .79    .0317     .93    .0106     .79    .0342
7         19         .55    .0079    .64    .0015     .72    .0003     .76    .0001
8a        13         .50    .0379    .56    .0207     .50    .0435     .56    .0214
8b        17         .80    .0050    .60    .0052     .66    .0013     .66    .0022
8c        18         .76    .0004    .71    .0009     .76    .0002     .76    .0001
9         20         .69    .OOOn    .70    .0007     .77    <.0001    .76    .0003
10        20         .71    .0004    .73    <.0001    .75    .0001     .77    .0002
11a       20         .60    .0025    .73    .0002     .69    .0011     .76    <.0001
11b       20         .66    .0009    .72    .0001     .73    .0003     .73    <.0001
11c       19         .77    .0002    .74    .0001     .79    .0001     .79    <.0001
11d       13         .62    .0121    .69    .0046     .80    .0006     .76    .0023
12        20         .75    .0005    .75    .0001     .73    .0001     .75    <.0001
13        18         .48    .0240    .50    .0172     .43    .0368     .43    .0416
14        19         .71    .0005    .75    .0001     .70    .0007     .71    .0005
15        18         .48    .0267    .50    .0177     .45    .0321     .46    .0316
16        13         .48    .0489    .49    .0472     .49    .0470     .49    .0436
17        11         .57    .0358    .57    .0369     .58    .0280     .58    .0301
18        10         .79    .0036    .76    .0046     .74    .0090     .73    .0090
19         9         .79    .0062    .79    .0057     .79    .0060     .79    .0058
20        17         .67    .0020    .64    .0035     .65    .0024     .63    .0043
21        13         .43    .0715    .37    .1046     .43    .0698     .38    .1023
22        15         .34    .1078    .34    .1075     .35    .1001     .35    .1036
23        13         .18    .2832    .01    .4904     .18    .2744     .18    .2821
24a       20         .76    <.0001   .76    .0004     .72    .0006     .75    .0001
24b       20         .73    .0004    .75    .0001     .71    .0003     .74    .0001
24c       20         .74    .0001    .76    <.0001    .72    <.0001    .74    .0001
24d       20         .73    .0003    .76    .0002     .71    .0001     .75    <.0001
25        16         .69    .0023    .64    .0042     .79    .0001     .81    .0002
               2-33

-------
                                  Table 2-2

              ABBREVIATIONS FOR CHEMICALS INCLUDED IN THE STUDY

Those with Suitable
Epidemiological Data                  Others

Abbreviation  Chemical                Abbreviation  Chemical
AB            Asbestos                AC            Acrylonitrile
AF            Aflatoxin               AL            Allyl Chloride
AS            Arsenic                 AM            4-Aminobiphenyl
BN            Benzene                 BA            Benzo(a)pyrene
BZ            Benzidine               CO            Chlordane
CB            Chlorambucil            CT            Carbon Tetrachloride
CD            Cadmium                 DB            3,3-Dichlorobenzidine
CR            Chromium                DE            1,2-Dichloroethane
CS            Cigarette Smoke         DL            Vinylidene Chloride
DS            DES                     DP            Diphenylhydrazine
EC            Epichlorohydrin         ED            EDB
EO            Ethylene Oxide          FO            Formaldehyde
ES            Estrogen                HC            Hexachlorobenzene
IS            Isoniazid               HY            Hydrazine
MC            Methylene Chloride      LE            Lead
ML            Melphalan               MU            Mustard Gas
NC            Nickel                  NA            2-Naphthylamine
PC            PCBs                    NT            NTA
PH            Phenacetin              TD            TCDD
RS            Reserpine               TE            Tetrachloroethylene
SC            Saccharin               TP            2,4,6-Trichlorophenol
TC            Trichloroethylene       TO            Toxaphene
VC            Vinyl Chloride

-------
                         Table 2-3

   AVERAGE LOSS AS DETERMINED BY THE SYMMETRIC DISTANCE2
  LOSS FUNCTION, BY ANALYSIS METHOD, PREDICTOR, AND SIEVE
    0
    1
    3a
    3b
    4o
    4b
    4c
   <»d
   5
   6
   7
   8a
   80
   8c
   9
  10
  110
 lid
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
24b
24c
24d
25
	
No Sieve
.650
.7rn
.7W
.779
.701
.684
.646
.767
.640
.240
.495
.514
.550
.367
.645
.541
.639
.659
.732
.490
.541
.352
.279
.368
.352
.289
.052
.090
.441
1 . 100
.530
1 . 181
.605
.580
.526
.630
.278
Predi
Sieve
.570
.630
.364
.620
.552
.599
.578
.516
.548
.159
577
.279
1.088
.246
.529
.412
.465
.488
.577
.256
.310
.459
.363
.430
234
.290
.064
.075
.752
.964
.574
1 . 170
.298
.318
.296
.276
.191
ctor
L20
No Sieve
.146
.261
.215
.266
.124
.166
.157
.134
.142
.001
.200
.430
.678
.189
. 141
.523
.272
.241
.253
.280
.541
.352
.279
.368
.289
.052
.090
.441
1 . 100
.530
1.181 1
.605
.580
.526
. 630
.232

Sieve
.298
.377
.310
. 113
.273
.316
.272
.267
.285
.001
.331
.224
.597
.243
.298
.268
.239
.274
.277
.228
.310
.459
.368
.430
.234
.290
.064
.075
.752
.962
.574
.170
.298
.319
.296
.277
.190
                     2-35

-------
                       Table 2-4

  AVERAGE LOSS AS DETERMINED BY THE SYMMETRIC CAUCHY
LOSS FUNCTION, BY ANALYSIS METHOD, PREDICTOR, AND SIEVE
Predictor
Analysis
0
1
3o
3b
4o
4b
4c
4d
5
6
7
8a
8b
8c
9
10
lla
11b
11C
11d
12
13
14
15
16
17
18
19
20
21
22
23
24a
24b
24C
24d
25
LM
L20
No No
Sieve Sieve Sieve Sieve
.566
.540
.547
.551
.597
.546
.546
.605
.560
.453
.496
.467.
.459
.511
.559
.530
.560
.477
.429
.528
.524
.490
.498
.500
.413
.375
.366
.344
.410
.242
.445
.460
.541
.522
.516
.550
.463
.509
.508
.464
.477
.492
.528
.493
.485
.511
.359
.523
441
.442
.533
.516
.482
.486
.447
.421
.490
.460
.519
.493
.489
.409
.333
.377
.322
.456
.452
.432
.445
.448
.477
.U51
,'*50
.494
.440
.478
.487
.453
.423
.420
.398
.428
.440
.270
.521
.430
.439
.432
.433
.499
.476
.381
.378
.488
.524
.490
.498
.500
.413
.375
.366
.344
.410
.424
.445
.460
.541
.•>i>2
.516
.550
.470
.457
.506
.467
.413
.437
.466
.440
.454
.455
.309
.491
.419
.447
.494
.643
.465
.448
.408
.390
.451
.460
.519
.493
.489
.409
.363
.377
.322
.456
.452
.432
. 443
. 448
.477
.451
.4r,0
.506
ML EM
NO
Sieve
.586
.569
.606
.558
.570
595
b84
.573
.591
.446
.559
.478
.604
.604
.589
.563
.575
.500
.445
.529
.566
.572
.564
.618
.437
.412
.366
.354
.614
.817
.66*
.665
.559
.571
.560
.570
.558
Sieve
.415
.513
.479
.483
.502
.536
.502
.492
.515
.360
.535
.463
.674
.578
.524
.495
.492
.466
.449
.458
.474
.567
.510
.579
.431
.399
.373
.327
.664
.766
.694
.680
.468
.491
.466
.468
.530
MLE2Q
NO
Sieve
.507
.619
.551
.520
.509
.492
.471
.508
.500
.261
.645
.453
.685
.711
.535
.538
.575
517
.527
.527
.566
.572
.564
.61ft
.437
.412
.366
.354
.614
.817
.664
.665
.559
.571
.560
.570
.494
Sieve
.482
.428
.492
.423
.463
.495
.467
.480
.476
.297
.510
.444
.651
.599
.490
.485
.488
.433
.428
.421
.474
.567
.579
.579
.431
.399
.573
.32.'
.664
.766
.694
.680
.468
.491
.466
.468
.561
                       2-36

-------
                       Table 2-5

   AVERAGE LOSS AS DETERMINED BY THE SYMMETRIC TANH
LOSS FUNCTION, BY ANALYSIS METHOD,  PREDICTOR.  AND SIEVE
Predictor
LM
Analysis
0
1
3a
3(3
4a
4b
4c
4d
5
6
7
80
8b
8c
9
10
11a
11b
lie
I1d
12
13
14
15
16
17
18
19
20
21
22
23
2
-------
                                  Table 2-6

                 COMPARISON OF ANALYSES; FIVE BEST ANALYSES,
                       BY PREDICTOR AND LOSS FUNCTION

                                      Loss Function
              DISTANCE2                CAUCHY                  TANH
Predictor  Analysis  Avg. Loss(a)  Analysis  Avg. Loss     Analysis  Avg. Loss
LM         18        .052 (ns)     19        .322 (s)      19        .120 (s)
           19        .075 (s)      6         .359 (s)      20        .121 (ns)
           6         .159 (s)      17        .363 (s)      17        .121 (s)
           25        .191 (s)      18        .366 (ns)     8b        .127 (ns)
           2         .209 (ns)     16        .409 (s)      6         .134 (s)

L2Q        6         .001 (s)      6         .270 (ns)     6         .110 (ns)
           18        .052 (ns)     19        .302 (s)      19        .120 (s)
           19        .075 (s)      17        .363 (s)      20        .121 (ns)
           3b        .113 (s)      11c       .378 (ns)     17        .121 (s)
           4a        .124 (ns)     11b       .381 (ns)     8b        .125 (s)

MLEM       --(b)                   19        .327 (s)      19        .121 (s)
                                   6         .360 (s)      6         .134 (s)
                                   18        .366 (ns)     3b        .175 (s)
                                   17        .399 (s)      24d       .203 (s)
                                   16        .431 (s)      17        .203 (s)

MLE2Q      --(b)                   6         .261 (ns)     6         .111 (ns)
                                   19        .327 (s)      19        .121 (s)
                                   18        .366 (ns)     3b        .131 (s)
                                   17        .399 (s)      11b       .199 (s)
                                   11d       .421 (s)      11c       .202 (s)

(a) The loss given is the smaller of the two losses (with and without the sieve)
    for any analysis.  The code in parentheses indicates whether it comes from
    the analysis without the sieve (ns) or with the sieve (s).
(b) The DISTANCE2 loss function is not used with MLE predictors.

-------
                                  Table 2-7

            COMPARISON OF ANALYSES; FIVE BEST ANALYSES, EXCLUDING
           ANALYSES 6, 18, AND 19, BY PREDICTOR AND LOSS FUNCTION

                                      Loss Function
              DISTANCE2                CAUCHY                  TANH
Predictor  Analysis  Avg. Loss(a)  Analysis  Avg. Loss     Analysis  Avg. Loss
LM         25        .191 (s)      17        .363 (s)      20        .121 (ns)
           2         .209 (ns)     16        .409 (s)      17        .121 (s)
           16        .234 (s)      20        .410 (ns)     8b        .127 (ns)
           8c        .254 (s)      11c       .421 (s)      22        .137 (s)
           11d       .256 (s)      21        .424 (ns)     21        .141 (ns)

L2Q        3b        .113 (s)      17        .363 (s)      20        .121 (ns)
           4a        .124 (ns)     11c       .378 (ns)     17        .121 (s)
           3a        .130 (s)      11b       .381 (ns)     8b        .125 (s)
           4d        .134 (ns)     16        .409 (s)      22        .137 (s)
           9         .141 (ns)     20        .410 (ns)     3b        .140 (s)

MLEM       --(b)                   17        .399 (s)      3b        .175 (s)
                                   16        .431 (s)      24d       .203 (s)
                                   11c       .445 (ns)     17        .203 (s)
                                   11d       .458 (s)      24a       .204 (s)
                                   11b       .466 (s)      12        .204 (s)

MLE2Q      --(b)                   17        .399 (s)      3b        .141 (s)
                                   11d       .421 (s)      11b       .199 (s)
                                   3b        .423 (s)      11c       .202 (s)
                                   11c       .428 (s)      24d       .203 (s)
                                   16        .431 (s)      17        .203 (s)

(a) The loss given is the smaller of the two losses (with and without the sieve)
    for any analysis.  The code in parentheses indicates whether it comes from
    the analysis without the sieve (ns) or with the sieve (s).
(b) The DISTANCE2 loss function is not used with MLE predictors.

-------
                                 Table 2-8

                 TOTAL INCREMENTAL NORMALIZED LOSSES,
                       BY ANALYSIS AND SIEVE(a)

            Total Incremental Normalized Loss
Analysis    No Sieve    Sieve
0           1.019       1.418
1           1.693       1.997
2           1.543       2.175
3a          1.719       1.390
3b          1.079       0.555
4a          0.920       1.198
4b          0.900       1.559
4c          0.698       1.258
4d          0.853       1.313
5           1.015       1.385
7           1.825       1.736
8a          1.534       1.176
8b          0.996       0.963
8c          0.758       1.239
9           0.961       1.466
10          1.944       1.M7
11a         1.1*50      1.194
11b         0.746       0.983
11c         0.741       0.968
11d         1.627       1.215
12          2.156       1.430
13          1.751       2.158
14          1.695       1.767
15          1.836       1.804
16          1.324       1.133
17          0.259       0.166
20          0.558       1.231
21          1.553       1.700
22          1.117       1.043
23          2.001*      1.888
24a         2.398       1.325
24b         2.242       1.591
24c         2.084       1.369
24d         2.484       1.301
25          1.183       1.292

(a) Calculated using L2Q as the predictor.
                               2-40

-------
                                Table 2-9

              AVERAGE LOSS FOR RESTRICTED SETS OF CHEMICALS
             FOR ANALYSES 3b, 17, AND 20, BY LOSS FUNCTION(a)

                                           Loss Function
Set of Chemicals     Analysis         DISTANCE2   CAUCHY   TANH
11 to which          3b with sieve    0.053       0.414    0.131
Analysis 17          17 with sieve    0.290       0.363    0.121
is applicable        20 w/o sieve     0.161       0.409    0.133

17 to which          3b with sieve    0.082       0.360    0.100
Analysis 20          20 w/o sieve     0.441       0.410    0.121
is applicable

(a) The L2Q predictor is used.
                                2-41

-------
                               Table 2-10

    AVERAGE LOSS AS DETERMINED BY THE ASYMMETRIC TANH LOSS FUNCTION
              FOR LM, BY ANALYSIS AND DEGREE OF ASYMMETRY
Asymmetry Constant (m)
Analysis01
0
1
2
3a
3b
4a
4b
4c
4d
5
6
7
8a
8b
8c
9
10
11a
lib
11e
11d
12
13
14
15
16
17
18
19
20
21
22
23
24a
24b
:>4c
i.4d
25
1 .5
.207
.220
.204
.200
.191
.201
.211
.205
.199
.207
.144
.213
.200
.157
.177
.208
.197
.200
.192
.191
.201
.192
.211
.199
.200
.189
. 140
.198
.134
. 154
.168
. 149
. 170
.188
.195
.189
.188
. 177
2
.224
.244
.222
.217
.205
.218
.228
.222
.218
.225
.149
.232
.218
.170
.186
.226
.214
.218
.209
.207
.220
.208
.224
.216
.219
.205
.157
.210
.146
.169
.177
.160
.187
.204
.211
.205
.204
.196
5
.279
.296
.264
.260
.247
.272
.282
.275
.270
.276
.161
.281
.282
.215
.211
.279
.266
.273
.264
.260
.271
.260
.256
.265
.263
.275
.218
.279
.215
.210
.225
.196
.257
.255
.259
.253
.254
.228
10
.312
.328
.284
.293
.283
.306
.312
.305
.305
.310
.173
.310
.290
.276
.232
.311
.294
.305
.296
.292
.291
.287
.267
.292
.276
.293
.23^
.335
.267
.270
.293
.225
.342
.283
.288
.281
.282
.253
50
.319
.332
.290
.321
.336
.312
.319
.312
.310
.316
.173
.313
.296
.352
.280
.314
.301
.309
.300
.297
.291
.288
.267
.301
.285
.312
.250
.363
.288
.340
.340
.230
.377
.287
.289
.282
.287
.305
100
.319
.332
.290
.321
.336
.312
.319
.312
.310
.316
.173
.313
.296
.352
.280
.314
.301
.309
.300
.297
.291
.288
.267
.301
.285
.312
.250
.363
.288
.340
.340
.230
.377
.287
.289
.282
.287
.305
°Analyses have been performed using the sieve.
                               2-42

-------
                              Table 2-11

    AVERAGE LOSS AS DETERMINED BY THE ASYMMETRIC TANH LOSS FUNCTION
             FOR L2Q. BY ANALYSIS AND DEGREE OF ASYMMETRY
Asymmetry Constant (m)
Analysis0
0
1
2
3a
3b
4a
4b
4c
4d
5
6
7
8a
8b
8c
9
10
lla
11b
lie
11d
12
13
14
15
16
17
18
19
20
21
22
23
24a
24b
24c
24d
25
1.5
.188
.208
.207
.195
.156
.184
.193
.185
.186
.188
.126
.199
.189
.148
.170
.189
.190
.184
.181
.182
.189
.192
.211
.199
.200
.189
.140
.198
.134
.154
.168
. 149
.170
.188
..195
.189
.188
.176
2
.203
.220
.223
.207
.169
.198
.205
.199
.200
.202
.130
.214
.204
.162
.179
.205
.207
.199
.197
.196
.207
.208
.224
.216
.219
.205
. 157
.210
.146
.169
.177
.160
.187
.204
.211
.205
.204
.189
5
.249
.267
.264
.237
.210
.240
.250
.243
.240
.247
. 144
.260
.269
.201
.205
.257
.251
.247
.246
.236
.258
.260
.256
.265
.263
.275
.218
.279
.215
.210
.225
.196
.257
.255
.259
.253
.254
.220
10
.283
.299
.279
.264
.235
.275
.281
.274
.275
.282
.149
.290
.277
.261
.227
.290
.279
.270
.272
.269
.279
.287
.267
.292
.276
.293
.234
.335
.267
.270
.293
.225
.342
.283
.288
.281
.282
.244
50
.290
.304
.287
.293
.283
.281
.288
.281
.281
.288
. 149
.293
.283
.339
.272
.292
.286
.284
275
.275
.279
.288
.267
.301
.285
.312
.250
.363
.288
.340
.340
.230
.377
.287
.289
.282
.287
.297
100
.290
.304
.287
.293
.285
.281
.288
.281
.281
.288
.149
.293
.283
.339
.272
.292
.286
.284
.275
.275
.279
.288
.267
.301
.285
.312
.250
.363
.283
.340
.340
.230
.377
.287
.289
.282
.287
.297
°Analyses have been performed using the sieve.
                               2-43

-------
                 Table 2-12

 Y-INTERCEPT VALUES FOR BEST-FITTING LINES,
LM PREDICTOR, BY ANALYSIS, SIEVE, AND LOSS FUNCTION
No Sieve
Analysis
0
1
2
3o
3b
4a
4b
4c
4d
5
6
7
8a
8b
8c
9
10
110
11b
11c
1id
-2
13
14
15
16
17
18
19
20
21
22
23
24a
24b
24c
24d
25
DISTANCE
1.587
1 .450
0.927
1.683
1.528
0.890
1.352
1.309
2.563
1.546
2.453
1.151
1.120
0.869
0.860
1.565
1.162
1.370
1.322
1.235
1.301
0.939
0.319
0.482
0 . 298
0.905
0.7C9
1.274
1.188
0.450
0.106
0.233
-0.073
0.245
0.691
0.634
1 .812
1.033
CAUCHY
0.510
0.931
0.824
2.004
1.820
-0.462
0.308
0.344
1.230
0.440
1.519
1.723
1.135
0 . 071
1.732
0.493
0.665
0.340
0.359
0.361
0.228
0 . 664
0.709
1.003
0.693
0.613
0.443
0.518
0.374
-0.161
-0.159
-0 . 679
-0.564
0.113
0,182
0.328
1.943
1.758
TANH
1.714
1 .067
1.067
1.922
2.164
1.517
1.032
1.079
2.919
1.610
2.086
1.417
1.493
0.302
1.391
1.714
1.208
1.439
1.391
0.519
1.097
0.939
0.727
0.929
0.744
0.710
0.467
0.634
0.447
-0.110
0.233
-0.549
-0.058
0.361
0.498
0.545
2.179
1 .714
DISTANCE2
0.827
0.821
0.404
1.117
1.976
0.087
0.609
0.596
1.582
0 . 765
2.300
0.730
0.907
0.455
0.583
0.770
0.611
0.734
0.701
0.748
0.197
0.444
0.168
0.456
0.237
0.749
0 . 772
1.308
1 .254
0.159
0.053
0.286
-0.120
-0.305
0.226
0.209
1.201
0.722
Sieve
CAUCHY
1.066
1.357
0 . 537
1 «;84
O.S30
-0.095
0.546
0.599
1.621
1.084
1.488
0.960
1.045
-0.174
1.247
0.956
0.771
0.839
0.462
0.291
0.849
0.540
-0.045
0.360
0.291
0.679
0.451
0.616
0.450
0.038
-0.736
-0.657
-0.715
-0.372
0.249
0.261
1.296
1.381

TANH
1.067
1 .067
0.665
1 .374
1.260
0.072
0.742
0.725
1.617
1.071
2.086
1 .067
1.555
0.233
0.813
0.966
0.955
0.874
0.603
0.447
0.788
0.749
0.272
0.731
0.27J
0.631
0.447
0.988
0.'~',7
0.233
0.233
-0.549
-0.613
-0.212
0.470
0.471
1.364
0.955
                  2-44

-------
                     Table 2-13

     Y-INTERCEPT VALUES FOR BEST-FITTING LINES.
L2Q PREDICTOR, BY ANALYSIS, SIEVE, AND LOSS FUNCTION
No Sieve
Analysis
0
1
2
'a
5b
Ha
4&
4c
4d
5
6
7
80
8b
8c
9
10
11 a
lib
1lc
1id
12
13
1i»
15
16
17
IS
19
20
21
22
23
24a
24b
24c
24d
25
DISTANCE2
0.?28
0.416
0.630
0.635
1 .314
-0.257
0.225
0.150
1.282
0.493
1 .667
0.370
1 .045
0.145
0.258
0.607
0.925
0.664
0.515
0 047
0.808
0.939
0.319
0.482
0.298
0.905
0.769
1.274
1.188
0.450
0.106
0.233
-0.073
0.245
0.691
0.634
1.817
0.516
CAUCHY
0.380
-0.270
0.258
1.454
0.612
-0.428
0.115
0.211
1.310
0.334
1 .670
0.872
0.686
-0.081
0.289
0.433
0 . 292
0.447
-0.064
-0.093
0.251
0.664
0.709
1.003
0.693
0.613
0.443
0.518
0.374
-0.161
-0. 159
-0.679
-0.564
0.113
0.182
0.328
1 .943
0.492
TANH
0.532
0.272
0.315
1.339
1.067
-0.143
0.226
0.338
1.598
0.434
1.873
0.977
0.742
0.233
0.413
0.532
1.065
0.841
0.173
0.267
1.067
0.939
0.727
0.929
C.744
0.710
0.447
0.634
0.447
-0.110
0.233
-0.549
-0.058
0.361
0.498
0.545
2.179
0.571
DISTANCE2
0.474
0.550
0.201
0.695
1.223
-0.358
0.236
0.186
1.183
0.435
1 .667
0.428
0.793
0.072
0.478
0.494
0.356
0.161
0.347
0.297
0.085
0.444
0.168
0.456
0.237
0.749
0 . 772
1.308
1.254
0.159
0.053
0.286
-0.120
-0.305
0.226
0.209
1 .201
0.651
Sieve
CAUCHY
0.199
0.135
-0.067
0.624
0.927
-0.566
-0.230
-0.079
1.017
0.154
1 .415
0.210
0.683
-0.036
-0.357
0.278
0.461
0.183
0.293
0.155
0.633
0.540
-0.045
0.360
0.291
0.679
0.451
0.616
0.450
0.038
-0.736
-0.657
-0.715
-0.372
0.249
0.261
1 .296
0.722

TANH
0.315
0.583
0.272
0.822
1.080
-0.401
0.069
0.024
1.222
0.315
1.873
0.555
0.742
0.233
0.654
0.449
0.459
0.283
0.447
0.283
0.571
0.749
0.272
0.731
0.272
0.631
0.447
0.988
0.447
0.233
0.233
-0.549
-0.613
-0.212
0.470
0.471
1 .364
0.813
                      2-45

-------
                     Table 2-14

     Y-INTERCEPT VALUES FOR BEST-FITTING LINES,
MLEM PREDICTOR, BY ANALYSIS. SIEVE,  AND LOSS FUNCTION
No Sieve
Analysis
0
1
2
3o
3b
40
4b
'»c
4d
5
6
7
8a
8b
8c
9
10
11a
11b
11c
11d
12
13
14
15
16
17
18
19
20
21
22
23
24a
24b
24c
24d
25
DISTANCES
-1 .481
-1.617
-2.307
-1.68*
2.116
-2.192
-1.666
-1.685
-0.710
-1.528
2.129
-1.936
-2.600
-4.428
-4.192
-1.522
-1.903
-1.689
-1.713
-1.852
-2.861
-2. 150
-2.828
-2.520
-4.601
-2.932
-3.238
1.098
1.027
-4.892
-6.075
-7.516
-8,007
-2.874
-2.348
-2.381
-1 . 371
-2.502
CAUCHY
0.327
1.059
0.257
1.811
0.783
-0.496
-0.065
-0.011
1.238
0.278
1.470
1.113
1.026
0.444
1.231
0.292
0.338
0.519
0.294
0.162
0.663
0.132
-0.768
0.460
0.372
0.466
0.308
0.343
0.210
-0.352
-4.276
-0.969
-0.300
-0.670
-0.085
-0.015
0.980
1 .341
TANH
1.366
1.366
0.514
1.762
1.708
0.478
1.487
1.534
2.002
1.299
1.946
0.928
1.366
0.352
0.882
1.366
0.759
0.894
0.505
0.365
.0.908
0.430
-0.081
0.455
0.514
0.471
0.299
0.387
0.299
-0.378
0.023
-0.499
-0.294
-0.429
0.209
0.210
1.411
0.882
DISTANCE2
-1.814
-1.895
-4.060
-2.009
1.815
-2.556
-2.020
-2.014
-1.117
-1 .860
2.129
-2.154
-2.681
-6.527
-4.292
-1 .864
-2.079
-1.920
-1.958
-2.056
-3.358
-2.215
-4.380
-2.441
-4.502
-2.917
-3.222
1.176
1.094
-6.720
-8.220
-7.438
-8.063
02.961
-2.409
-2.416
-1 .519
-2.581
Sieve
CAUCHY
0.994
1.186
0.567
1.159
0.768
-0.081
0.624
0.487
1.557
0.994
1.319
0.846
0.917
0.633
1.261
0.783
0.680
0.784
0.374
0.177
0 . 75';
0.423
0.200
0.267
0.068
0.539
0.308
0.446
0.276
0.744
-1.706
-1.086
-0.902
-0.482
-0.183
0. 169
1 . 174
1.302

TANH
0.928
1.201
0.514
1.201
1.124
-0.069
0.610
0.592
1.476
0.933
1.946
0.882
1.008
0.449
0.882
0.752
0.859
0.737
0.505
0.299
0.908
0.555
0.299
0.528
0.031
0.733
0.299
0.739
0.299
0.306
0.549
-0.192
-0.255
-0.353
-0.279
0.281
1 . 178
0.882
                      2-46

-------
                   Table 2-15

Y-INTERCEPT VALUES FOR BEST-FITTING LINES.  MLE2Q
PREDICTOR, BY ANALYSIS,  SIEVE,  AND LOSS FUNCTION
Analysis
0
1
2
3a
3b
4o
4b
bo
4d
5
6
7
8a
3b
8c
9
10
110
11b
11 C
I1d
12
13
14
15
16
17
18
19
20
21
22
23
240
24b
24c
24d
25

DISTANCE2
-4.168
-5 . 846
-5.944
-6.189
-1.873
-4.923
-4.358
-4.365
-3.482
-4.203
1.524
-8.983
-2.775
-8.556
-15.326
-5.517
-3.880
-4.041
-2.631
-4.437
-5.660
-2.150
-2.828
2.520
-4.601
-2.933
-3.238
1.098
1.027
-4 . 982
-6.075
-7.516
-8.007
-2.874
-2.349
-2 . 380
-1.371
-10.801
No Sieve
CAUCHY
0.010
-0.057
0.035
1.016
-0.045
-0.785
-0.303
-0.241
0.931
-0.027
1.310
0.537
0.563
1.231
-0.697
0.387
-0.046
0.240
-0.162
-0.986
0.451
0.132
-0.768
0.460
0.372
0.466
0.308
0.343
0.210
-0.352
-4.264
-0.969
-0.300
-0.679
-0.085
-0.015
0.980
-0.227

TANH
0.080
0.579
0.199
0.643
0.643
-0.594
-0.226
-0.179
1 .179
-0.029
1.454
0.574
0.604
0.145
-0.564
0.207
0.500
0.363
-0.095
-0.465
0.643
0.480
-0.087
0.455
0.514
0.471
0.299
0.387
0.299
-0.378
0.022
-0.499
-0.294
-0.429
0.209
0.021
1 .411
-0.047

DISTANCE2
-2.253
-2.302
-5.788
-2.417
1.075
-2.999
-2 . 484
-2.488
-1 .544
-2.303
1.524
-2.444
-2.845
-6.719
-4.436
-2.218
-2.294
-4.031
-2.327
-2.366
-3.491
-2.215
4.380
-2.440
-4.502
-2.917
-3.222
1.176
1 .094
-6.720
-8.220
-7.438
-8.063
-2.961
-2.<»03
-2.416
-1.518
-2.735
Sieve
CAUCHY TANH
0.074 0.372
0.154 5.793
0.041 0.199
0.596 0.064
0.744 0.946
-0.693 -0.543
-0.324 0.115
-0.204 -0.065
0.90S 1.179
0.034 0.273
1.180 1.454
0.114 0.372
0.561 0.603
1.166 0.248
0.463 0.574
0.147 0.372
0 310 0.520
0.156 0.251
0.168 0.299
0.009 0.131
0.535 0.643
0.423 0.555
0.200 0.299
0.267 0.528
0.068 0.031
0.539 0.733
0.308 0.299
0.446 0.739
0.276 0.299
0.744 0.306
-1.706 0.549
-1.086 -0.192
-0.902 -0.255
-0.482 -0.353
0.183 0.279
0.169 0.281
1.174 1.178
0.644 0.602
                    2-47

-------
                    Table 2-16

AVERAGE LOSS FOR SUPPLEMENTAL ANALYSES WITH THE L2Q
 PREDICTOR, BY ANALYSIS, SIEVE, AND LOSS FUNCTION
Analysis
30
31
32
33
2k
35
36
37
38
41
42
43
44
45
46
<*7
48
49
50


No
Sieve
DISTANCE2 CAUCHY
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
.224
.266
.263
.273
.288
.213
.370
195
.124
.330
.622
.185.
.229
.163
.220
.271
.367
.172
.496
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
C
0
0
0
.441
.453
.434
.442
.458
.468
.457
.489
.434
.555
.462
.403
.461
.405
.454
.469
.486
.404
.538
TANH
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
149
151
152
154
150
176
173
181
156
183
174
110
146
142
154
163
166
149
168
Sieve
DISTANCE2 CAUCHY
0.
0.
0.
0.
C.
0.
0.
0.
0.
1 .
0.
0.
0.
0.
0.
0.
0.
0.
0.
107
113
124
141
131
738
549
129
273
303
567
253
887
072
234
084
792
953
711
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
0.
390
413
420
437
400
511
502
445
437
535
461
435
577
376
411
378
509
509
448
TANH
0.137
0.140
0.142
0.149
0.138
0. 198
0.182
0. 174
0. 164
0.223
0.172
0.118
0. 18?
0.134
0. 145
0.133
0.180
0. 199
0.154
                     2-48

-------
                              Table 2-17
              Y-INTERCEPT VALUES FOR BEST-FITTING LINES,
                      AMONG  SUPPLEMENTAL ANALYSES,0
                 BY  ANALYSIS,  SIEVE, AND LOSS FUNCTION
No Sieve
Analysis
30
31
32
33
34
35
36
37
38
41
42
43
44
45
46
47
48
49
50
DISTANCE2
0.431
1.314
1.099
1.056
2.217
0.011
-0.177
-0.113
-0.257
-0.021
0.803
-0.238
-0.437
-0.561
0.799
0.531
0.321
-0. 105
0.446
CAUCHY
-0.147
0.612
0.250
0.157
1.606
-1.063
-0.578
0.673
-0.428
0.921
0.034
-0.747
-0.308
-0.014
0.161
-0.076
-0.363
-0.516
0.628
TANH
0.072
1.067
0.575
0.557
1.784
-0.226
-0.402
0.350
-0.143
0.072
0.476
-0.545
-0.217
0. 148
0.467
0.230
-0.007
-0.344
0.230
DISTANCE2
0.387
1.223
0.950
0.868
2.015
0.475
-0.597
-0.076
-0.358
0.549
0.689
-0.0^5
0.475
0.467
0.588
0.249
0.229
0.257
0.443
Sieve
CAUCHY
0.033
0.927
0.655
0.277
1.863
-0.362
-0.910
-0.308
-0.566
-0.529
0.005
-0.338
-0.030
3.063
-0.028
0.000
-0.120
-0.490
0. 184

TANH
0.230
1.080
0.774
0.820
1.901
-0.180
-0.650
-0.209
-0.401
-0.180
0.476
-0.295
0.106
0.230
0. 148
0.230
0.230
-0.238
0.117
(a) The L2Q predictor is used.
                                2-49

-------
                                                    Table 2-18


                               AVERAGE LOSS.  BY DOSE UNITS.  SIEVE AND LOSS FUNCTION0
No Sieve
Units
mg/m^/day
mg/kg/day
ppm diet
ppm air
mg/kg/life
Analysis
0
12
31
4a
24a
30
4b
24b
32
4c
24c
33
4d
24d
34
DISTANCE2
0.146
0.541
0.266
0.124
0.605
0.224
0.166
0.580
0.263
0.157
0.526
0.273
0 134
0.630
0.288
CAUCHY
0.440
0.524
0.453
0.434
0.541
0.441
0.420
0.522
0.434
0.398
0.516
0.442
0.428
0.550
0.458
TANK
0.159
0.180
0.151
;56
v*.186
0.149
0.157
0.104
0.15?
0.152
0.179
0.154
0.153
0.187
0.150
DISTANCE2
0.298
0.310
0.113
0.273
0.298
0.107
0.316
0.319
0.124
0.272
0.296
0.141
0 . 2f, 7
0.277
0.131
Sieve
CAUCHY
0.457
0.460
0.413
0.437
0.448
0.390
0.466
0.477
0.420
0.440
0.451
0.437
0.454
0.450
0.400

TANH
0.170
0.169
0.140
0.164
0.167
0.137
0.175
0.173
0.142
0.167
0.169
0.149
0.166
0.166
0.138
                °The l2Q predictor is used.

-------
                                                    Table 2-19

                               Y-INTERCEPTS BY DOSE UNITS, SIEVE, AND LOSS FUNCTION(a)

No Sieve
Units:     mg/m2/day, mg/kg/day, ppm diet, ppm air, mg/kg/life
Analysis:  0, 12, 31, 4a, 24a, 30

-------
                              Table 2-20

                 CONVERSION FACTORS(a) FOR ALL DOSE UNITS,
                    BY METHOD OF ANALYSIS AND SIEVE(b)

Units        Analysis Method                           No Sieve           Sieve
mg/m2/day    Restricted routes, unaveraged (0)          2.40 - 3.40        1.58 - 2.07
             Restricted routes, averaged(c) (12)        4.61 - 8.69        3.47 - 5.61
             Unrestricted routes, unaveraged (31)       4.09 - 11.67       8.45 - 12.02
mg/kg/day    Restricted routes, unaveraged (4a)         0.37 - 0.72        0.28 - 0.40
             Restricted routes, averaged(c) (24a)       1.30 - 2.30        0.43 - 0.61
             Unrestricted routes, unaveraged (30)       0.72 - 1.18        1.08 - 1.70
ppm diet     Restricted routes, unaveraged (4b)         1.30 - 1.68        0.59 - 1.17
             Restricted routes, averaged(c) (24b)       1.52 - 3.15        1.77 - 2.95
             Unrestricted routes, unaveraged (32)       3.76 - 8.91        4.52 - 5.94
ppm air      Restricted routes, unaveraged (4c)         1.62 - 2.18        0.83 - 1.06
             Restricted routes, averaged(c) (24c)       2.13 - 3.51        1.82 - 2.96
             Unrestricted routes, unaveraged (33)       1.43 - 3.61        1.89 - 6.61
mg/kg/life   Restricted routes, unaveraged (4d)        20.42 - 39.63      10.40 - 16.67
             Restricted routes, averaged(c) (24d)      87.70 - 151.01     19.63 - 23.12
             Unrestricted routes, unaveraged (34)      40.36 - 60.81      72.95 - 79.62

(a) The factor by which a bioassay-based RRD estimate is multiplied to give
    best fit, on average, to the human RRD estimates (RRD_H/RRD_A).
(b) The range given is that suggested by the CAUCHY and TANH loss
    functions, the two that use point estimates of human RRDs.
(c) Averaged analyses average over sex, study, and species, in that order.
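
The tabulated conversion factors appear to be antilogs of the Table 2-17 y-intercepts: for Analysis 31 without the sieve, the CAUCHY and TANH intercepts of 0.612 and 1.067 give roughly 4.09 and 11.67, the range listed above. The Python sketch below assumes that relationship (a unit-slope line in log10 space) and applies a factor range to an animal RRD. The function names and the example dose of 2.0 are hypothetical, chosen for illustration; they are not taken from the report.

```python
def conversion_factor(y_intercept):
    """Antilog of a log10 y-intercept (assumes a unit-slope fitted line)."""
    return 10.0 ** y_intercept


def human_rrd_range(animal_rrd, factor_low, factor_high):
    """Scale a bioassay-based RRD by a conversion-factor range."""
    return animal_rrd * factor_low, animal_rrd * factor_high


# Table 2-17 intercepts for Analysis 31, no sieve (CAUCHY, then TANH).
cauchy_factor = conversion_factor(0.612)
tanh_factor = conversion_factor(1.067)
print(round(cauchy_factor, 2), round(tanh_factor, 2))

# Hypothetical animal RRD of 2.0 mg/m2/day, scaled by the Table 2-20 range
# for Analysis 31 without the sieve.
low, high = human_rrd_range(2.0, 4.09, 11.67)
print(low, high)
```

Under these assumptions the predicted human RRD is reported as a range rather than a single value, mirroring the way the table brackets each factor by the two loss functions.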
                               2-52

-------
                              Table 2-21

           UNCERTAINTY FACTORS FOR ANALYSES WITHOUT THE SIEVE(a)

             All Chemicals     Chemicals Below Line(b)   Chemicals Above Line(b)
Analysis     n(c)   Factor       n(d)   Factor             n(e)   Factor
0             20     2.257        3     0.188               5     3.292
1             18     3.381        4     0.185               5     5.194
2             19     6.852        3     0.068               4     8.248
3a            17     2.862        3     0.146               6     3.242
3b            23     4.216        6     0.210               4     8.251
4a            20     2.046        4     0.291               4     3.752
4b            20     2.488        3     0.171               6     3.228
4c            20     2.504        3     0.155               6     2.735
4d            20     2.131        4     0.269               5     3.336
5             20     2.239        3     0.194               6     2.950
6              6     1.048        1     0.874               1     1.14
7             19     2.800        4     0.204               6
8a            13     8.005        2     0.071               2
8b            17    29.008        4     0.043               4
8c            18     2.936        4     0.252               5
9             20     2.227        3     0.185               5
10            20    10.240        5     0.127               3
11a           20     3.745        4     0.183               3
11b           20     5.040        5     0.315               2
11c           19     4.530        4     0.137               3
11d           13     4.570        3     0.233               1
12            20    10.065        5     0.119               3
13            18     5.731        4     0.159               5
14            19     3.918        4     0.160               5
15            18     6.018        4     0.150               5
16            13     5.871        2     0.092               2
17            11     4.174        1     0.047               2
18            10     1.467        2     0.462               1
19             9     1.790        2     0.428               1
20            17     7.113        5     0.132               4
21            13    62.713        2     0.009               3
22            15     8.561        5     0.141               5
23            13    89.156        3     0.017               4
24a           20    11.292        5     0.100               3
24b           20     9.954        5     0.107               3
24c           20    10.541        5     0.126               3
24d           20    10.455        4     0.067               4
25            16     3.032        3     0.180               3
30            23     3.275        6     0.231               4
31            23     4.216        6     0.210               4
32            23     4.328        8     0.278               4
33            23     4.408        7     0.249               4
34            23     4.728        6     0.190               4
35            20     2.858        6     0.255               4
-------
                         Table 2-21 (continued)

           UNCERTAINTY FACTORS FOR ANALYSES WITHOUT THE SIEVE(a)

             All Chemicals     Chemicals Below Line(b)   Chemicals Above Line(b)
Analysis     n(c)   Factor       n(d)   Factor             n(e)   Factor
36            19     5.954        4     0.119               4     8.894
37            17     2.662        3     0.161               6     3.068
38            20     2.046        4     0.219               4     3.752
41            20     4.604        4     0.110               4     7.186
42            16    17.063        3     0.039               3    14.234
43            17     2.817        3     0.199               4     4.362
44            19     3.164        3     0.148               4     5.718
45            23     2.557        5     0.264               4     5.261
46            23     3.104        6     0.235               5     5.397
47            23     3.856        7     0.211               5     6.016
48            23     4.623        8     0.190               6     6.973
49            21     2.657        4     0.239               3     6.728
50            18     6.807        4     0.073               6     6.186

(a) The L2Q predictor is used.
(b) The line is the best-fitting line determined by the DISTANCE2 loss
    function.
(c) Number of chemicals in analyses.
(d) Number of chemicals with human RRD intervals completely below line.
(e) Number of chemicals with human RRD intervals completely above line.
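
The counts n(d) and n(e) in the footnotes above can be illustrated with a short Python sketch that tallies human RRD intervals lying wholly below or wholly above a fitted line. The function name, the unit default slope, and the three data points are assumptions made for illustration; the intercept 0.431 is the Table 2-17 DISTANCE2 value for Analysis 30 without the sieve, used here only as a sample input.

```python
def count_outside_line(chemicals, intercept, slope=1.0):
    """Count human RRD intervals lying wholly below or above a fitted line.

    chemicals: list of (log_animal_rrd, log_human_low, log_human_high).
    The line is log_human = intercept + slope * log_animal; the unit
    default slope is an assumption of this sketch.
    """
    below = above = 0
    for log_animal, log_low, log_high in chemicals:
        predicted = intercept + slope * log_animal
        if log_high < predicted:
            below += 1      # whole interval under the line
        elif log_low > predicted:
            above += 1      # whole interval over the line
    return len(chemicals), below, above


# Hypothetical log10 values, for illustration only.
example = [
    (0.0, -1.5, -0.8),   # interval entirely below the line
    (1.0, 0.9, 1.6),     # interval straddles the line
    (-1.0, 0.2, 0.9),    # interval entirely above the line
]
n_total, n_below, n_above = count_outside_line(example, intercept=0.431)
print(n_total, n_below, n_above)
```

Intervals that straddle the line contribute to the total count only, which is why n(d) and n(e) need not sum to n(c) in the table.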
                                2-54

-------
                              Table 2-22

            UNCERTAINTY FACTORS FOR ANALYSES WITH THE SIEVE(a)

             All Chemicals     Chemicals Below Line(b)   Chemicals Above Line(b)
Analysis     n(c)   Factor       n(d)   Factor             n(e)   Factor
0             20     5.300        3     0.076               5     4.623
1             18     6.676        3     0.066               4     6.407
2             19    10.029        4     0.077               4    15.975
3a            17     2.119        4     0.307               3     4.190
3b            23     2.008        4     0.249               4     3.342
4a            20     4.552        3     0.091               4     5.534
4b            20     5.454        3     0.072               5     4.968
4c            20     4.616        3     0.086               5     4.327
4d            20     4.448        3     0.092               4     5.322
5             20     5.060        3     0.081               5     4.462
6              6     1.048        1     0.874               1     1.144
7             19     5.422        3     0.078               5     5.275
8a            13     3.406        1     0.044               2     5.471
8b            17    22.834        3     0.031               4     8.148
8c            18     3.235        3     0.149               4     5.786
9             20     5.426        3     0.073               6     3.917
10            20     4.493        3     0.088               7     3.432
11a           20     3.611        4     0.143               5     4.175
11b           20     4.535        3     0.094               4     5.777
11c           19     4.444        3     0.099               4     5.589
11d           13     3.066        2     0.115               3     4.282
12            20     5.393        3     0.079               4     6.250
13            18     8.254        5     0.145               4    12.263
14            19     6.022        3     0.073               3    10.427
15            18     7.670        4     0.117               4    10.793
16            13     3.508        1     0.049               2     6.523
17            11     4.181        1     0.047               2     6.931
18            10     1.560        2     0.433               1     4.128
19             9     1.675        3     0.529               1     4.675
20            17    31.128        4     0.040               4    13.087
21            13    36.455        3     0.030               4    22.798
22            15     9.010        5     0.128               4    11.914
23            13    82.564        3     0.018               4    17.378
24a           20     5.162        3     0.084               4     6.150
24b           20     5.518        3     0.072               5     5.054
24c           20     5.101        3     0.082               5     4.869
24d           20     4.718        3     0.090               4     5.639
25            16     2.645        4     0.250               3     5.322
30            23     2.026        5     0.283               4     ?.097
31            23     2.008        4     0.249               4     ?.432
32            23     2.216        4     0.231               4     ?.578
33            23     2.256        5     0.272               4     ?.362
34            23     2.293        3     0.172               4     ?.647
35            20    78.202        6     0.146               4    42.184
                    2-55

-------
                         Table 2-22 (continued)

            UNCERTAINTY FACTORS FOR ANALYSES WITH THE SIEVE(a)

             All Chemicals     Chemicals Below Line(b)   Chemicals Above Line(b)
Analysis     n(c)   Factor       n(d)   Factor             n(e)   Factor
36            19    10.196        ?     0.111               4    16.301
37            17     2.076        3     0.247               4     3.319
38            20     4.552        3     0.091               4     5.584
41            20    96.444        5     0.034               4    91.020
42            16    14.512        3     0.048               3    13.458
43            17     4.200        4     0.219               3     8.401
44            19    84.246        5     0.089               3    87.708
45            23     1.670        5     0.363               4     2.608
46            23     3.594        7     0.256               4     7.166
47            23     1.770        4     0.298               3     3.354
48            23    67.502        5     0.059               4    42.975
49            21   129.229        6     0.074               4    57.072
50            18    23.545        4     0.062               4    20.743

(a) The L2Q predictor is used.
(b) The line is the best-fitting line determined by the DISTANCE2 loss
    function.
(c) Number of chemicals in analyses.
(d) Number of chemicals with human RRD intervals completely below line.
(e) Number of chemicals with human RRD intervals completely above line.
                                2-56

-------
                              Table 2-23

          COMPONENT-SPECIFIC UNCERTAINTY: MODES AND DISPERSION
        FACTORS FOR RATIOS OF RRDs(a), BY SUPPLEMENTAL ANALYSIS(b)

             Number of      Mode of         Dispersion    Number of
Analysis     Chemicals      Histogram       Factor(c)     Extremes(d)
31              44          0.05 - 0.1          2.3           0
32              44          0.2  - 0.5          1.7           0
33              44          0.2  - 0.5          1.8           0
34              44          0.02 - 0.05         1.3           3
35              40          0.8  - 1.25        28.5           1
36              34          0.8  - 1.25        86.0           4
37              24          0.8  - 1.25         5.3           0
38              40          0.8  - 1.25        33.7           2
41              39          0.8  - 1.25       290.6           3
42              29          0.8  - 1.25        75.6           1
43              31          0.8  - 1.25        39.6           1
44              37          0.8  - 1.25        54.1           4
45              44          0.8  - 1.25         1.2           0
46              44          0.8  - 1.25         1.7           0
47              44          0.8  - 1.25         2.2           0
48              43          0.8  - 1.25        23.2           2
49              39          0.8  - 1.25        39.6           3
50              36          0.8  - 1.25       335.6           3

(a) The ratios are of the chemical-specific RRD estimates from the
    indicated analysis to those of Analysis 30, the alternative standard.
(b) The analyses were performed with the L2Q predictor and using the full
    sieve.
(c) The dispersion factor is the average factor by which the chemicals
    differ from the mode.
(d) The number of chemicals for which the ratios are greater than 100 or
    less than 0.01.
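
As a rough illustration of two of the table's columns, the Python sketch below bins a set of RRD ratios, reports the modal bin, and counts extremes as defined in the footnote (ratios greater than 100 or less than 0.01). The bin edges are those printed on the ratio axis of Figures 2-42 onward; the function names and the sample ratios are hypothetical, and the dispersion factor is not computed because its exact formula is not given here.

```python
from bisect import bisect_right
from collections import Counter

# Bin edges read from the ratio axis of Figures 2-42 onward.
EDGES = [0.01, 0.02, 0.05, 0.10, 0.20, 0.50, 0.80,
         1.25, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0]


def bin_of(ratio):
    """Return the (low, high) bin containing a ratio."""
    i = bisect_right(EDGES, ratio)
    low = 0.0 if i == 0 else EDGES[i - 1]
    high = float("inf") if i == len(EDGES) else EDGES[i]
    return low, high


def modal_bin(ratios):
    """Return the bin holding the most ratios (mode of the histogram)."""
    counts = Counter(bin_of(r) for r in ratios)
    return counts.most_common(1)[0][0]


def count_extremes(ratios):
    """Count ratios greater than 100 or less than 0.01."""
    return sum(1 for r in ratios if r > 100 or r < 0.01)


# Hypothetical ratios of chemical-specific RRDs, for illustration only.
ratios = [0.9, 1.1, 1.0, 0.3, 4.0, 250.0, 0.005, 1.2]
print(modal_bin(ratios), count_extremes(ratios))
```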
                                2-57

-------
                              Figure 2-1
             Correlation Analysis: Standard Analysis (0)
-------
                              Figure 2-2
             Correlation Analysis: Standard Analysis (0)
                          Panel label: Quality
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-3
             Correlation Analysis: Standard Analysis (0)
                    Panel label: Significance Screen
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-4
             Correlation Analysis: Standard Analysis (0)
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-5
           Correlation Analysis: Long Experiments Only (1)

-------
                              Figure 2-6
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-7
             Correlation Analysis: Long Dosing Only (2)
                        Panel label: Full Sieve
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-8
        Correlation Analysis: Route That Humans Encounter (3a)

-------
                              Figure 2-9
           Correlation Analysis: Any Route of Exposure (3b)
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-10
           Correlation Analysis: Any Route of Exposure (3b)
                          Panel label: Quality
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-11
           Correlation Analysis: Any Route of Exposure (3b)

-------
                              Figure 2-12
           Correlation Analysis: Any Route of Exposure (3b)
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-13
     Correlation Analysis: Average Dose over 80% of Experiment (5)
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-14
           Correlation Analysis: Malignant Tumors Only (7)
                        Panel label: Full Sieve
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-15

-------
                              Figure 2-16
           Correlation Analysis: Tumor-Bearing Animals (8b)
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-17
-------
                              Figure 2-18
             Correlation Analysis: Average over Sex (9)
                         Panel label: No Sieve
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-19
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-20
            Correlation Analysis: Average over Study (10)
                         Panel label: No Sieve
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-21
            Correlation Analysis: Average over Study (10)
                        Panel label: Full Sieve
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-22
         Correlation Analysis: Average over All Species (11a)
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-23
         Correlation Analysis: Average over All Species (11a)
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-24
        Correlation Analysis: Average over Rats and Mice (11b)
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-25
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-26
             Correlation Analysis: Rat Data Only (11c)
                        Panel label: Full Sieve
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-27
            Correlation Analysis: Mouse Data Only (11d)
                        Panel label: Full Sieve
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-28
    Correlation Analysis: Average over Sex, Study, and Species (12)

-------
                              Figure 2-29
    Correlation Analysis: Average over Sex, Study, and Species (12)
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-30
    Correlation Analysis: Average over Sex, Study, and Species (12)
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-31
    Correlation Analysis: Average over Sex, Study, and Species (12)
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-32
        Correlation Analysis: ... All; Combination of ... (18)
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-33
 Correlation Analysis: Average over All; Total Tumor-Bearing Animals (20)
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-34
      Correlation Analysis: Route and Response Like Humans (20)
                        Panel label: Full Sieve
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-35
  Prediction Analysis: Analysis 17, Median Lower Bound Predictor
                        Panel label: Full Sieve
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-36
  Prediction Analysis: Analysis 17, Median Lower Bound Predictor
                         Panel label: No Sieve
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-38
  Prediction Analysis: Analysis 20, Median Lower Bound Predictor
                         Panel label: No Sieve
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-39
 Prediction Analysis: Analysis 3b, Median Lower Bound; Best-Fitting Lines with ...
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-40
 Prediction Analysis: Analysis 20, Median Lower Bound; Best-Fitting Lines with ...
                 X-axis: Log of Animal RRD Estimates

-------
                              Figure 2-41
 Prediction Analysis: Analysis 22, Median Lower Bound; Best-Fitting Lines with ...
                 X-axis: Log of Animal RRD Estimates

-------
                                 Figure 2-42

          Component-Specific Uncertainty; Ratios of RRDs for
            Analysis 31 (mg/m2/day) to RRDs for Analysis 30
 Ratio axis: .00, .01, .02, .05, .10, .20, .50, .80, 1.25, 2.00, 5.00, 10.00, 20.00, 50.00, 100.00, ∞
                                 Figure 2-43

          Component-Specific Uncertainty; Ratios of RRDs for
             Analysis 32 (ppm diet) to RRDs for Analysis 30
 Ratio axis: .00, .01, .02, .05, .10, .20, .50, .80, 1.25, 2.00, 5.00, 10.00, 20.00, 50.00, 100.00
                                          2-99

-------
                                 Figure 2-44

          Component-Specific Uncertainty; Ratios of RRDs for
              Analysis 33 (ppm air) to RRDs for Analysis 30
 Ratio axis: .00, .01, .02, .05, .10, .20, .50, .80, 1.25, 2.00, 5.00, 10.00, 20.00, 50.00, 100.00
                                 Figure 2-45

          Component-Specific Uncertainty; Ratios of RRDs for
          Analysis 34 (mg/kg/lifetime) to RRDs for Analysis 30
 Ratio axis: .00, .01, .02, .05, .10, .20, .50, .80, 1.25, 2.00, 5.00, 10.00, 20.00, 50.00, 100.00
                                           2-100

-------
                                 Figure 2-46

          Component-Specific Uncertainty; Ratios of RRDs for
       Analysis 35 (Long Experiments Only) to RRDs for Analysis 30
 Ratio axis: .00, .01, .02, .05, .10, .20, .50, .80, 1.25, 2.00, 5.00, 10.00, 20.00, 50.00, 100.00
                                 Figure 2-47

          Component-Specific Uncertainty; Ratios of RRDs for
          Analysis 36 (Long Dosing Only) to RRDs for Analysis 30

          [Plot not reproduced; chemicals are plotted by two-letter code
          against a ratio axis with tick marks from .00 to 100.00.]
                                           2-101

-------

                                 Figure 2-48

          Component-Specific Uncertainty; Ratios of RRDs for
          Analysis 37 (Route Like Humans) to RRDs for Analysis 30

          [Plot not reproduced; chemicals are plotted by two-letter code
          against a ratio axis with tick marks up to 100.00.]
                                 Figure 2-49

             Component-Specific Uncertainty; Ratios of RRDs
               for Analysis 38 (Inhalation, Oral, Gavage,
               Route Like Humans) to RRDs for Analysis 30

          [Plot not reproduced; chemicals are plotted by two-letter code
          against a ratio axis with tick marks from .00 to 100.00.]

-------
                                 Figure 2-50

          Component-Specific Uncertainty; Ratios of RRDs for
        Analysis 41 (Malignant Tumors Only) to RRDs for Analysis 30

          [Plot not reproduced; chemicals are plotted by two-letter code
          against a ratio axis with tick marks from .00 to 100.00.]
                                 Figure 2-51

             Component-Specific Uncertainty; Ratios of RRDs
               for Analysis 42 (Combination of Significant
                   Responses) to RRDs for Analysis 30

          [Plot not reproduced; chemicals are plotted by two-letter code
          against a ratio axis with tick marks from .00 to 100.00.]
                                          2-103

-------

                                 Figure 2-52

               Component-Specific Uncertainty; Ratios
            of RRDs for Analysis 43 (Total Tumor-Bearing
                   Animals) to RRDs for Analysis 30

          [Plot not reproduced; chemicals are plotted by two-letter code
          against a ratio axis.]
                                 Figure 2-53

          Component-Specific Uncertainty; Ratios of RRDs for
        Analysis 44 (Response Like Humans) to RRDs for Analysis 30

          [Plot not reproduced; chemicals are plotted by two-letter code
          against a ratio axis.]

-------
                                 Figure 2-54

          Component-Specific Uncertainty; Ratios of RRDs for
          Analysis 45 (Average Over Sex) to RRDs for Analysis 30

          [Plot not reproduced; chemicals are plotted by two-letter code
          against a ratio axis with tick marks from .00 to 100.00.]
          [Plot for Figure 2-55 not reproduced; the ratio axis has tick
          marks from .00 to 100.00.]

                                           Figure 2-55

                    Component-Specific  Uncertainty;  Ratios  of RRDs for
                Analysis  
-------
                                 Figure 2-56

               Component-Specific Uncertainty; Ratios of
                RRDs for Analysis 47 (Average Over All
                   Species) to RRDs for Analysis 30

          [Plot not reproduced; chemicals are plotted by two-letter code
          against a ratio axis with tick marks from .00 to 100.00.]
                                 Figure 2-57

               Component-Specific Uncertainty; Ratios of
           RRDs for Analysis 48 (Average Over Rats and Mice)
                       to RRDs for Analysis 30

          [Plot not reproduced; chemicals are plotted by two-letter code
          against a ratio axis with tick marks up to 100.00.]
                                         2-106

-------
                                 Figure 2-58

          Component-Specific Uncertainty; Ratios of RRDs for
          Analysis 49 (Rat Data Only) to RRDs for Analysis 30

          [Plot not reproduced; chemicals are plotted by two-letter code
          against a ratio axis with tick marks up to 100.00.]
                                 Figure 2-59

          Component-Specific Uncertainty; Ratios of RRDs for
         Analysis 50 (Mouse Data Only) to RRDs for Analysis 30

          [Plot not reproduced; chemicals are plotted by two-letter code
          against a ratio axis with tick marks from .00 to 100.00.]
                                        2-107

-------
                                Section 3

                               DISCUSSION

POSITIVE CORRELATION

The results presented in the previous section reveal that estimates of
risk-related doses from animal bioassay data are generally highly
correlated with the estimates derived directly from epidemiological
data.  Of the thirty-eight initial analysis methods investigated, 35 had
p-values less than 0.05 when the full sieve was applied, and with that
same sieve, 17 had p-values of 0.0001 or smaller.  Even with no sieve,
so that data from experiments of highly variable quality are included,
thirty-five analyses have p-values less than 0.05.  Not only do most of
the analyses yield correlation coefficients that are statistically
significantly positive, but the coefficients are large in an absolute
sense as well.  Twenty-six of the analyses have coefficients larger
than 0.7.

The strongly positive result of the correlation analysis was obtained
even though a number of uncertainties had to be accounted for.  First,
the uncertainty of the human RRD estimates is explicitly incorporated
since the ranking that underlies the correlation analysis is based on
the lengths and positions of the intervals of RRD estimates derived from
the epidemiology.  Those intervals reflect the uncertainty in the
exposure estimates and statistical variability.  Analysis of the
epidemiological data, including the exposure uncertainty derivation, was
conducted prior to the analysis of the animal data and, therefore,
without knowledge of its outcome.  Moreover, since the criteria used to
determine the exposure uncertainty values were consistent across all
chemicals, the subjective aspect of their derivation should not affect
the correlations.  That is, if the individual subfactors corresponding
to sources of uncertainty in exposure estimation were to be altered, the
bounds on exposure estimates would change in a predictable and largely
consistent manner for all chemicals.  The relative rankings of the
chemicals should be minimally affected.  This is one advantage of using

                                 3-1

-------
a nonparametric (rank-based) approach in the correlation  analysis.
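
The invariance described above can be illustrated with a minimal sketch.
It assumes point estimates and a plain Spearman rank correlation as a
stand-in; the ranking used in this study works on intervals of RRD
estimates rather than single values, and every number below is
hypothetical.

```python
def ranks(values):
    """Return 1-based ranks of values (assumes no ties, for simplicity)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation coefficient for tie-free data."""
    n = len(x)
    rank_x, rank_y = ranks(x), ranks(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical RRD estimates (mg/kg/day) for five chemicals.
human_rrd = [0.02, 0.5, 3.0, 40.0, 900.0]
animal_rrd = [0.01, 1.2, 0.8, 15.0, 300.0]

baseline = spearman(animal_rrd, human_rrd)

# Shift every human estimate by the same factor, as if one exposure
# uncertainty subfactor had been altered for all chemicals at once.
rescaled_human = [value * 10 for value in human_rrd]
after_rescaling = spearman(animal_rrd, rescaled_human)

print(baseline, after_rescaling)
```

Because a consistent rescaling leaves the ordering of the chemicals
unchanged, the two coefficients agree; a correlation computed on the raw
values would not in general share this property.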

The second uncertainty accounted for in the correlation analysis
pertains to the bioassay data.   The intervals defined for the animal
results also incorporate statistical variability;  statistical lower
bound and upper bound estimates define the endpoints of the intervals.
In addition, the entire ensemble of data for a given chemical is
considered in the sense that that ensemble defines the median lower
bound and the median upper bound.  So, while discounting extreme values,
the intervals defined take into consideration the RRD estimates that can
be obtained from each experiment in the data base.

These various uncertainties and the methods used to account for them
will tend to wash out any real correlations that may exist, in the sense
of producing small correlation coefficients that may not be
significantly different from zero (no correlation).  Despite this,
strong correlations are obtained.  The positive correlations exist for
chemicals whose RRDs (and, therefore, potencies) span several orders of
magnitude.  Indeed, the strong correlations obtained despite these
factors make it highly unlikely that the positive results are due to
chance or to other factors not incorporated in the analysis.

The fact that these positive correlations exist is very important.  The
assumption that risk estimates derived from bioassay data are relevant
to the estimation of human risk is crucial to all risk assessments for
which epidemiological data are limited.  Heretofore, it has been a
largely untested assumption.  The correlations determined in this
investigation strongly support that assumption and thereby strengthen the
scientific support for quantitative risk assessment.

The thirty-eight initial analysis methods represent a wide variety of
approaches to bioassay-based risk assessment.  Although a few of them
appear to be less well correlated with the epidemiological assessment
results, the fact that most are highly correlated makes it reasonable
to attempt to determine which methods are best when point estimates of
risk are desired.  The variety of methods ensures that a variety of
point estimates will be available to discriminate between different
predictors and different acceptable analysis approaches.  This is the

                                 3-2

-------
subject of the prediction analysis,  the interpretation of which will
follow a discussion of data quality and the data screenings.

DATA QUALITY AND DATA SCREENING

As discussed in the second volume of this report, the extent and quality
of the bioassay data vary greatly from chemical to chemical.  Some
chemicals have few acceptable experiments (e.g., estrogen has two),
some have experiments testing only one species (e.g., chlorambucil with
mouse data only), and some only have experiments of short duration or
dosing (e.g., benzidine and chromium).  Still other limitations exist
that affect the calculation of RRDs, such as the number of animals on
test (which can greatly affect computation of the statistical confidence
limits) as well as the actual conduct of the experiment (animal
husbandry and care, adherence to protocol, etc., which we have not used
to rate experiments) and, most importantly, data reporting limitations.
Aside from reports like those produced by the National Toxicology
Program, rarely were full details of the bioassay results available.

Even though correlation coefficients were large and significant for many
of the unscreened (no sieve) analyses, as mentioned above, it was felt
that some attempt should be made to use the "best" data that was
available.  On the other hand, it would not be appropriate to eliminate
chemicals from the analyses on the basis of "quality" considerations.
First, the maximum number of chemicals is 23, so that elimination of
chemicals could lead to very small sample size.  This is seen, for
instance, when very restrictive criteria on carcinogenic endpoint and/or
experimental protocol define an analysis method (e.g., Analysis 19 with
only nine chemicals) or when the data requirements are not satisfied by
the published bioassay results available to us (e.g., Analysis 6 with
only six chemicals).  Second, in any future risk assessment on a
particular chemical, the data will undoubtedly be limited in certain
respects.  Part of our task is to try to determine how best to proceed
even with those limitations.

Consequently, the data screening (sieve) selects the better data
("better" being  defined  solely on the basis of  the definition of the

                                 3-3

-------
sieves) for use in the calculation of RRD estimates.  In the present
investigation, two screenings have been defined: a screening based on
the significance of the dose-dependency of the carcinogenic responses
and a screening based on the number of dosed animals and length of
observation.  The use of these two screenings yields three possible
sieve approaches.  Of course many reasonable alternatives, whether
based on these criteria or others, are possible.  No examination of
other sieves has been undertaken.

The goal of screening the data is to produce a data base that will
perform better when compared to the epidemiological estimates.  The
sieves defined here appear generally to achieve that goal.  The
significance screen, in particular, worked to increase the correlation
coefficients for 25 of the 38 initial analyses.  Although the quality
screen does not provide substantial improvement over the significance
screen in most cases, certain analyses are much better correlated when
both screens (the full sieve) are applied.  Rarely does the addition of
the quality screen to the significance screen decrease the correlation.
Thus, we have selected the full sieve to represent the action of data
screening in the prediction and uncertainty analyses.

However, in the prediction analysis, it is frequently the case that the
average loss for an analysis method is greater when the full sieve is
applied than when no sieve is applied (cf. Tables 2-3 through 2-5 and
Table 2-13), especially when the median lower bound estimate is the
predictor.  One might be tempted to conclude that only those methods
yielding smaller loss when the sieve is applied should be considered as
good risk assessment procedures.  Conversely, one might conclude that
either the sieve is not working correctly to select the better data or
that it is working but the data it selects are not, in actuality, better
for risk assessment purposes.  We argue that any of these conclusions is
unwarranted.

First, the results of the correlation analysis strongly indicate that
the better data are being selected by the sieve and that these data are,
in actuality, better for risk assessment purposes.  This is in
accordance with common sense: if too few animals are tested or if the
period of observation is too short, then it is difficult to elicit an

                                 3-*

-------
observable (dose-related) carcinogenic response.  Similarly, those
responses that are significantly related to dosing tell us more about
the carcinogenicity of a chemical than the endpoints that lack a
significant relationship with dose (unless all responses lack a
significant relationship, but then the significance screen does not
eliminate any of the responses from consideration).  Had the results of
the correlation analysis been less consistent in indicating the benefit
of the sieve, then one might have reason to suspect that the "common
sense" reaction is not supported and may be in error.
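
The behavior of the significance screen noted in the parenthetical above
(nothing is eliminated when no response is significant) can be sketched
as follows.  The 0.05 cutoff, the field names, and the sample responses
are assumptions made for illustration only; the criterion actually used
is the one given with the definition of the sieves.

```python
def significance_screen(responses, alpha=0.05):
    """Keep responses with a significant dose trend.

    If no response is significant, every response is retained, so the
    screen never leaves an experiment with nothing to analyze.
    """
    significant = [r for r in responses if r["trend_p"] < alpha]
    return significant if significant else list(responses)


# Hypothetical tumor responses from one bioassay, each with the p-value
# of a test for dose-related trend.
liver = {"site": "liver", "trend_p": 0.003}
lung = {"site": "lung", "trend_p": 0.40}
kidney = {"site": "kidney", "trend_p": 0.25}

kept_when_one_is_significant = significance_screen([liver, lung, kidney])
kept_when_none_is_significant = significance_screen([lung, kidney])

print([r["site"] for r in kept_when_one_is_significant])
print([r["site"] for r in kept_when_none_is_significant])
```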

Secondly, we prefer the correlation analysis results over the prediction
analysis results as indicators of the action of the sieve since the
former does not select a single estimate from each analysis method and
it is not dependent on the specification of loss.  The correlation
analysis utilizes a range of estimates consistent with the ensemble of
data available for each chemical and employs a general measure of the
degree of similarity between the animal and human estimates.  This
framework is less sensitive to variations in the data and results that
are due to unintentional changes (confounders) in the data base.  It is
entirely possible that application of the sieve may tend to eliminate
certain routes of exposure, for example, although such a result is not
the intended result of the sieve.  Unless the elimination entails
substantial change in the RRD estimates (as in the case of arsenic or
estrogen, as discussed in Section 2) the generalized ranking scheme is
not unduly affected.

Indeed, we feel that such confounding changes in the data base and
random variation may largely explain the occurrence of average losses
that are greater when the sieve is applied than when it is not.  For any
experiment, random factors affect the response rates and, consequently,
the estimation of RRDs.  For all the bioassays of a particular chemical,
then, the changes seen when a sieve is applied depend on these random
variations.  [Again, this is one reason for preferring the correlation
analysis over the prediction analysis as a test of the sieve: the
correlation analysis accounts for the random variation by using upper
and lower confidence limits instead of a single point.]
                                 3-5

-------
As a consequence of this observation, it is appropriate to compare the
analysis methods in the prediction analysis both with and without the
sieve.  An analysis that yields small average loss under either of these
conditions should be considered a suitable option in the sense of
determining the best risk assessment procedures.  Thus, for example,
Analysis 20 without the sieve is the best approach, as measured by the
TANH loss function, when L2Q is the predictor (cf. Table 2-7).  Average
loss for that analysis is increased when the sieve is applied so that,
even among the analyses employing the sieve, Analysis 20 is no longer
the best.  We wish to continue to consider Analysis 20 as a good
potential procedure since it is not known whether the increase in
average loss may be due to random variation or to data base changes
confounded with the application of the sieve, though we suspect that it
is.  This procedure is followed throughout this discussion; those
analyses cited as being good are those with small average losses for at
least one of the pair (with sieve and without sieve).  However, in the
suggested guidelines for presenting risk estimates, and in the examples
provided, screening of the data is always performed, no matter which
analysis method is applied.

APPLICATION OF ANALYSIS RESULTS IN EXTRAPOLATING FROM ANIMALS TO HUMANS

Heretofore, animal-to-human extrapolation has generally been conducted
by assuming that equal doses will produce the same lifetime risks in
animals and humans, when both animal and human doses are measured in the
same particular units.  Dose units that have been applied in the past
include mg/kg body weight/day, mg/m² surface area/day, ppm in diet or
air, and mg/kg body weight/lifetime.  Because of differences between
animals and humans in body weights, life spans, etc., use of different
units produces different estimates of human risk.  There is limited
scientific support for use of any particular dose units (1).  Results
from the present study can be used empirically to determine appropriate
methods for animal-to-human extrapolation.  Specifically, multiplication
of the animal RRD by 10^c, where c is the y-intercept from the
best-fitting line, provides an estimate of the human RRD in which the
bias due to systematic differences in animal and human risk estimates
found in this study has been eliminated.  With this approach, the dose units

                                 3-6

-------
can be selected on the basis of those that, along with other facets of
an analysis, produced the best correlations between animals and humans
(or smallest losses).  The bias correction factor 10^c corrects for any
overestimation or underestimation by the analysis method used.
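
The correction can be sketched numerically.  The example below assumes
that the best-fitting line has unit slope on log10 axes and that it is
fitted by minimizing squared error, in which case the intercept c is
simply the mean of log10(human RRD) minus log10(animal RRD).  Squared
error is a stand-in chosen for simplicity, not one of the loss functions
of this study, and the function names and RRD values are hypothetical.

```python
import math


def unit_slope_intercept(animal_rrds, human_rrds):
    """Intercept c of log10(human) = log10(animal) + c.

    For a line of unit slope, the squared-error fit places c at the
    mean of the log10 differences between human and animal RRDs.
    """
    differences = [
        math.log10(human) - math.log10(animal)
        for animal, human in zip(animal_rrds, human_rrds)
    ]
    return sum(differences) / len(differences)


def bias_corrected_rrd(animal_rrd, c):
    """Estimate the human RRD by applying the conversion factor 10^c."""
    return animal_rrd * 10 ** c


# Hypothetical paired estimates (same dose units) for three chemicals.
animal = [0.1, 2.0, 50.0]
human = [1.0, 20.0, 500.0]

c = unit_slope_intercept(animal, human)
conversion_factor = 10 ** c

# Apply the factor to a new, hypothetical animal-based RRD of 5.0.
estimate = bias_corrected_rrd(5.0, c)
print(c, conversion_factor, estimate)
```

A positive c means the animal-based RRDs fall below the human RRDs on
average, so the factor raises them; a negative c lowers them.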

IDENTIFICATION OF GOOD METHODS

Predictors
Each analysis method was run with the four predictors examined in
this investigation, the median and the minimum of the lower bound RRDs
and of the maximum-likelihood RRDs.  Three loss functions were defined
that determine the lines of unit slope that minimize the total loss for
the collection of chemicals being analyzed.  Despite the fact that the
three loss functions calculate loss in different ways, all three are
consistent in indicating that the median lower bound RRD predictor, L2Q,
yields the smallest average losses for most analysis methods.  It should
be emphasized that this is a strong result not only because of the
consistency of the loss functions but primarily because it is not
dependent on the particular data that were available for analysis.  For
any given analysis (with rare exceptions) the loss when L2Q is used is
smaller than losses with other predictors even though the very same data
are used to calculate the estimates and, hence, the losses for each
predictor.
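
As an illustration of how the four predictors relate to one another, the
sketch below computes them for a single chemical from a set of
per-experiment estimates.  The names L2Q and LM follow the text; the
keys used for the two maximum-likelihood predictors, the function name,
and all numerical values are hypothetical.

```python
from statistics import median


def predictors(lower_bound_rrds, mle_rrds):
    """Return the four candidate predictors for one chemical."""
    return {
        "L2Q": median(lower_bound_rrds),  # median lower bound RRD
        "LM": min(lower_bound_rrds),      # minimum lower bound RRD
        "median_MLE": median(mle_rrds),   # median maximum-likelihood RRD
        "min_MLE": min(mle_rrds),         # minimum maximum-likelihood RRD
    }


# Hypothetical estimates from three experiments on one chemical.  A
# maximum-likelihood RRD may be infinite when an experiment shows no
# dose-related response; a lower bound RRD is always finite.
lower_bound_rrds = [0.5, 2.0, 8.0]
mle_rrds = [1.0, 4.0, float("inf")]

result = predictors(lower_bound_rrds, mle_rrds)
print(result)
```

Since the minimum can never exceed the median, LM is never larger than
L2Q for the same set of experiments.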

It is important to note that the predictor L2Q, which is derived from
lower bound RRDs, yielded smaller losses than any of the predictors
based on maximum likelihood estimated RRDs.  This is probably related to
the fact that small changes in the bioassay data can result in sizable
changes in MLE estimates of RRDs, which suggests that the desirable
large-sample theoretical statistical properties possessed by MLE
estimates (such as consistency and asymptotic efficiency) are not
operative to any practical extent in this situation given the usual
sample sizes encountered in bioassays.  This lack of stability of the
MLE estimates is a much more severe problem when extrapolating to low
doses.  Regulatory agencies have in the past relied more on lower bound
RRDs than maximum likelihood estimates, mainly in the interest of being
protective of human health.  This study shows that lower bound RRDs are,

                                 3-7

-------
in fact, better predictors of the human data than are the MLE estimates,
and thus provides additional rationale for emphasizing lower bound RRDs
in risk assessment.  This study also estimates the level of conservatism
(or anti-conservatism) that may be inherent in specific analyses that
use lower bound RRDs and estimates compensating or bias-removing
factors, i.e., the conversion factors (10^c).  This issue will be
discussed further later.

One potential problem with use of lower bound RRDs is that they are
always finite, even when the data show no evidence of carcinogenicity
(consistent with infinite maximum likelihood RRDs).  To some this might
imply that use of lower bound RRDs will lead a regulatory agency to
treat every chemical as a carcinogen, irrespective of bioassay results.
This need not be the case.  For most purposes, there must be reasonably
convincing evidence of carcinogenicity from bioassay results before an
agency will undertake the assessment of risk.  Moreover, the problem may
be further mitigated if we recall that the correlation analysis
demonstrated the strong positive correlation between ranges of human and
ranges of animal RRDs.  This result does not depend on the position of
the best epidemiological or bioassay estimates, only on the bounds for
estimates.  Consequently, we know that those chemicals that tend to have
larger RRD estimates (lower bounds) from epidemiological analyses also
tend to have larger bioassay-based estimates (lower bounds) so that
chemicals with large L2Q's (in a relative sense, compared to other
chemicals) are those that may be of less concern when it comes to
regulation and/or control.  One corollary of this line of reasoning is
that the degree of correlation, in addition to the average losses
calculated for specific predictors, is an important factor in comparing
the analysis methods and deciding which are better.

At any rate, there will always be the possibility that a noncarcinogen
may be regulated as a carcinogen on the basis of false-positive data.
Use of MLEs would not remove this problem; MLE RRDs from bioassays of
noncarcinogens will be finite about 50% of the time.  In this regard, it
is of interest to note that in this study chemicals with infinite RRD
estimates based on the epidemiological analyses did not in general have
infinite maximum likelihood RRD estimates based on the animal data.

                                 3-8

-------
However, this was to some extent prearranged because for a chemical to
be included in the analysis, positive evidence of carcinogenicity
(implying finite RRDs) had to exist for either animals or humans.

Use of L2Q as the predictor is in a sense less conservative than use of
the minimum lower bound, LM.  The y-intercepts for the analyses are
almost always larger when LM is used in place of L2Q (compare Tables
2-12 and 2-13).  This means that LM is more conservative, in fact,
generally overconservative.  Of course, if one applies the conversion
factors suggested by the y-intercepts, then no approach is more or less
conservative than another; estimates obtained by using the conversion
factors are those that come closest to the epidemiological estimates.
In this sense, the remaining error, expressed as average loss, is the
primary determinant of good or bad analyses or predictors.  As
previously mentioned, L2Q is preferable to LM in this regard.  But it is
also the case that the conversion factors are less extreme with L2Q than
with LM.

Analysis Methods

Given the superiority of L2Q over the other predictors examined,
one can compare the analysis methods on the basis of how they perform
with L2Q.  This has been done for each loss function separately (cf.
Table 2-7) and for the three functions combined (Table 2-8) for the
initial 38 analyses.  The supplemental analyses (Table 2-16) should also
be considered, especially since their template is Analysis 3b, which is a
method producing excellent correlation and which is also identified as
resulting in small average loss.

Analyses 6, 18, and 19, which are applicable to limited numbers of
chemicals (six, ten, and nine, respectively), will not be examined in
detail.  Although both the correlation and prediction analyses suggest
that these methods may be beneficial, the data are not sufficient to
warrant detailed examination of these methods.  In order to use the
methods routinely, data availability would have to be improved.  To
perform Analysis 6, one must have available the number of animals alive
in each dose group at the time of first occurrence of each tumor type.
For Analyses 18 and 19 (as well as any other method that uses an

                                 3-9

-------
endpoint that is a combination of individual carcinogenic responses) one
must know which animals got which tumors in order to combine responses.
Detailed reporting procedures like those in many National Toxicology
Program reports are ideally suited for these purposes.  Bioassay results
published in peer-reviewed journals rarely contain such detail.
Nevertheless, some other means must be found to disseminate the full
results before analyses like 6, 18, or 19 can be more thoroughly
investigated.  The incomplete but suggestive results of Analyses 6, 18,
and 19 indicate that this may be worthwhile.

Comparing the results in Table 2-16 to those in Table 2-7 reveals that
several analysis methods from the supplemental list are as good as or
better than the best of the initial analyses.  With the DISTANCE2 loss
function, Analyses 30, 45, and 47 yield the smallest losses of any
analyses.  Similarly, Analysis 43 results in the smallest loss as
measured by the TANH loss function; Analyses 45 and
-------
<»3, 45, and ols is the single endpoint evaluated  in  the former

                                3-11

-------
case.  Similarly, Analyses 45 and 47 differ from Analysis 30 because
some averaging of RRDs does take place; for Analysis 45, estimates are
averaged over bioassays identical except for the sex of the test species
(i.e., over sex within study) and for Analysis 47, results obtained for
each species are averaged to yield the ultimate RRD estimates.

The Base Analysis (Analysis 0) employing the minimal lower bound
estimator, LM (second row of Table 3-1), has both the largest normalized
loss and the largest residual error.  Moreover, RRDs derived from this
analysis underestimate the human RRDs on average by a factor of 12.  By
all standards, this method is the poorest of those listed.  However,
this method is perhaps most like that presently employed by EPA.
Modification of this method by using the median lower bound estimator,
L2Q, rather than LM, as represented in the first row of Table 3-1,
improves both normalized loss and residual error and requires a smaller
conversion factor.  These results illustrate further the finding
discussed earlier that analysis methods that use median lower bound
RRDs as estimators provide smaller losses than use of minimum
estimates.

Although Analyses 0, 7, 11c, and 11d were associated with generally good
correlation values, their normalized loss values do not compare with the
best of the remaining methods (e.g., Analyses 43, 45, and 47).  Moreover,
the residual uncertainty factors associated with these analyses are
among the largest presented.  Analyses 0, 7, 11c, and 11d therefore are
not considered to be among the better methods for predicting human risk
on the basis of bioassay results.

The case is somewhat more complex for Analyses 17 and 20.  As previously
mentioned, Analysis 17 is the best method determined by the CAUCHY loss
function.  In part because of that result, the total incremental
normalized loss for Analysis 17 is nearly the smallest.  Nevertheless,
its correlation coefficient is also the smallest of those presented.
Even if one notes that Analysis 17 is applicable to only 11 chemicals
and that consequently the coefficient would be less stable, the
importance attached to the correlation results when using the L2Q
predictor (as described above) tends to make the use of method 17 less
desirable than use of the other methods.  In addition, the residual
                                3-12

-------
uncertainty factor is relatively large, in the range of those associated
with Analyses 0, 7, 11c, and 11d.

Analysis 20 also has a large uncertainty factor.  That fact, plus the
large incremental loss value, makes Analysis 20 less appealing than the
remaining five analyses, 30, 31, 43, 45, and 47.  Note also that
Analysis 43 is the only other method in Table 3-1 that uses the endpoint
used by Analysis 20, total tumor-bearing animals.  Analysis 43 is
superior in all respects to Analysis 20.  This is one other reason why
Analysis 20 should not be considered among the better methods for
extrapolating human risk.

There is another reason not to recommend Analyses 17 and 20 for use in
extrapolating between humans and animals.  Aside from Analyses 0, 7,
11c, and 11d, which have already been deemed inappropriate, only
Analyses 17 and 20 are restricted to specific routes of exposure.  It is
likely, given the pattern seen for other analysis methods, that methods
identical to 17 and 20 but without this restriction on route would do
even better.  This would seem to be the case because the supplemental
analyses generally yield smaller losses than those analyses in the
initial set that differ only with respect to allowable routes of
exposure and units of extrapolation, the latter having little effect on
average loss.

Analysis 20 and the other method using total tumor-bearing animals as
the endpoint, Analysis 43, are the only two that overestimate RRDs on
average.  On the other hand, Analysis 31, a method extrapolating risk on
the basis of mg/m²/day, underestimates RRDs by roughly an order of
magnitude.  Analyses should not be compared on the basis of these
conversion factors, however.  When the estimates from any analysis
method are multiplied by the indicated conversion factors, a line fit to
the converted estimates (on the x-axis) and the epidemiological results
(on the y-axis) would pass through the origin.  The conversion factor
represents a degree of freedom in the prediction analysis corresponding
to the estimation of the intercept.  A conversion factor estimated here
for any method can be used to adjust the results obtained for a
particular risk assessment on a single chemical when the bioassay data
is analyzed by that method.

One might be tempted to conclude that Analysis 45, which extrapolates
from animals to humans on a mg/kg/day basis, accepts all routes of
exposure, and averages results over pairs of experiments that differ
only with respect to the sex of the test animals (i.e. the
experimenters, protocol, and strain of the test animal are the same), is
the best of the analyses presented in Table 3-1.  Its correlation
coefficient is as good as any other, its incremental loss is smallest,
and its residual uncertainty factor is smallest (Figure 3-1).  While
this analysis is certainly a good one by all these criteria, it is not
possible to conclude unequivocally that Analysis 45 is better than some
of the others listed.  For one thing, the ranking of analyses differs
depending on the choice of loss function.  Analysis 45 is best with the
DISTANCE2 loss function, Analysis 17 is best with CAUCHY and Analysis 43
with TANH.  A minimax criterion would select Analysis 17, followed by
Analysis 43.  Moreover, it is not clear which of the loss functions is
most appropriate for determining fits of RRDs, and no statistical
development that would allow us to test for lack of fit or to test
differences in values of average (or total) loss is available.  For
these reasons, the loss functions have been used in this investigation
as a method of ranking the analyses.  Since no one loss function is
obviously more appropriate, an overall measure such as total incremental
normalized loss has been employed to find analyses that are fairly
robust with respect to calculation of loss.  The analyses in Table 3-1
remaining after elimination from consideration of Analyses 0, 7, 11c,
11d, 17, and 20 (i.e. Analyses 30, 31, 43, 45, and 47) demonstrate such
robustness.
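
The way a ranking can depend on the loss function, and the way a
minimax or a summed criterion resolves it, can be sketched as follows.
This is an illustration only: the methods "A", "B", and "C" and their
loss values are hypothetical (they are not taken from Table 3-1), and
the losses are assumed to be already normalized so that they are
comparable across loss functions.

```python
# Hypothetical normalized losses for three made-up methods under the three
# loss functions named in the text. These numbers are NOT from Table 3-1.
losses = {
    "A": {"DISTANCE2": 0.10, "CAUCHY": 0.30, "TANH": 0.25},
    "B": {"DISTANCE2": 0.20, "CAUCHY": 0.20, "TANH": 0.22},
    "C": {"DISTANCE2": 0.15, "CAUCHY": 0.35, "TANH": 0.18},
}

def best_by_function(table):
    """For each loss function, return the method with the smallest loss."""
    functions = next(iter(table.values())).keys()
    return {f: min(table, key=lambda m: table[m][f]) for f in functions}

def minimax_rank(table):
    """Rank methods by their worst (largest) loss over all loss functions."""
    return sorted(table, key=lambda m: max(table[m].values()))

def total_rank(table):
    """Rank methods by the sum of their losses over all loss functions."""
    return sorted(table, key=lambda m: sum(table[m].values()))

winners = best_by_function(losses)   # a different winner per loss function
minimax_order = minimax_rank(losses)
total_order = total_rank(losses)
```

In this made-up table each loss function prefers a different method,
while the minimax and summed criteria both prefer the method that is
never far from the best.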

Let us call these five analyses the set of recommended analyses.  RRD
estimates derived using these analyses for each of the chemicals
included in this investigation but lacking epidemiological data
sufficient for quantitative assessment are presented in Table 3-2.  The
values in Table 3-2 have not been adjusted by the conversion factors.
When this is done the range of RRD estimates for each chemical appears
as in Table 3-3.

One final comment will conclude this discussion of recommended analysis
methods.  Four of the five members of the recommended set make no
restrictions on the carcinogenic endpoints considered, and data are
generally available for conducting these analyses.  Analysis 43 utilizes
total tumor-bearing animals as an endpoint.  While, in theory, this
should pose no restrictions on the analysis of bioassay data, in
practice these endpoints often cannot be defined.  Availability of
needed data is an important consideration when assessing human risk and
expressing the results as a range of RRDs consistent with the data but
quantitatively incorporating uncertainties.

COMPONENT-SPECIFIC UNCERTAINTY

The discussion to this point has not considered uncertainty associated
with any specific components of the risk assessment process.  Rather, we
have emphasized the analysis methods as wholes and examined the
uncertainty remaining after the predictions have been obtained and
compared to the human RRDs (residual uncertainty).  This course has been
followed because of the apparent interaction of the components.  This
interaction takes two forms.  First, certain components are not mutually
independent.  A component that defines approaches to length of dosing
obviously also influences choices concerning length of observation; a
study cannot dose animals for 80 weeks without also observing the
animals for at least 80 weeks.  Moreover, as discussed above, altering
the approach to some components can also, unintentionally, affect the
make-up of the underlying base of data and, hence, changes attributed to
changing those components may be confounded by changes that may be
partially explicable by changes in other components.  If, for example,
limiting experiments to those that last at least 90 percent of the
standard length also, unintentionally, excludes routes of exposure
besides inhalation, oral, or gavage, then the change in RRDs attributed
to changing requirements on the length of observation is confounded by
changes due to restricting routes of exposure.

It is also the case that a component-specific investigation is not
sufficient to characterize the best approaches because of the second
type of interaction, the empirical interaction of the components on the
results.  Consider, for example, the components relating to choice of
dose units (specifically, the approaches specifying use of mg/m^2/day
and mg/kg/lifetime) and allowable routes of exposure (unrestricted
versus restriction to an oral route, gavage, inhalation, or the route
that humans encounter).  These two components are not inherently
interrelated.  Nevertheless, the effect on the RRD estimates and on the
estimation of loss (i.e. the adequacy of the predictions) resulting from
selection of approaches to the indicated components is not readily
attributable to one component or the other.  Note that, when L2Q is the
predictor, Analysis 0 (mg/m^2/day, restricted routes) yields average
loss of 0.298 as measured by the DISTANCE2 function (sieve applied).  If
the units are changed (Analysis 4d: mg/kg/lifetime, restricted routes)
or if the allowable routes are augmented (Analysis 31: mg/m^2/day,
unrestricted routes) the loss decreases, to 0.267 or 0.113,
respectively.  When both components are changed, however (Analysis 34:
mg/kg/lifetime, unrestricted routes), the decrease is intermediate
between the two; average loss in that case is 0.131.  Hence, the two
components do not act independently on the estimates for some or all of
the chemicals.  In this sense, it is pointless to debate whether
mg/m^2/day is an appropriate dose measure for animal-to-human
extrapolation without taking into consideration the approaches taken for
other components.  Evaluation of risk assessment methods should focus on
the complete process rather than on individual components.
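
The arithmetic behind this conclusion can be checked directly from the
four average losses quoted above.  The sketch below compares the
observed loss when both components are changed with what would be
expected if the two changes acted independently.  The additive and
multiplicative forms of "independence" are assumptions made here for
illustration; the report does not specify either.

```python
# Average DISTANCE2 losses quoted in the text (sieve applied), keyed by
# (dose units, allowable routes).
loss = {
    ("mg/m^2/day", "restricted"): 0.298,
    ("mg/kg/lifetime", "restricted"): 0.267,    # units changed only
    ("mg/m^2/day", "unrestricted"): 0.113,      # routes changed only
    ("mg/kg/lifetime", "unrestricted"): 0.131,  # both changed
}

base = loss[("mg/m^2/day", "restricted")]
units_only = loss[("mg/kg/lifetime", "restricted")]
routes_only = loss[("mg/m^2/day", "unrestricted")]
observed = loss[("mg/kg/lifetime", "unrestricted")]

# Assumption 1: the two changes shift the loss additively.
additive_prediction = base + (units_only - base) + (routes_only - base)

# Assumption 2: the two changes scale the loss multiplicatively.
multiplicative_prediction = base * (units_only / base) * (routes_only / base)

# Departure of the observed loss from the additive expectation.
interaction = observed - additive_prediction

print(f"additive expectation:       {additive_prediction:.3f}")
print(f"multiplicative expectation: {multiplicative_prediction:.3f}")
print(f"observed:                   {observed:.3f}")
```

Under either assumption the expected loss would fall below the
routes-only value of 0.113, whereas the observed 0.131 lies above it,
which is the sense in which the components do not act independently.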

Consequently,  one must be cautious  in  interpreting  results  of  component-
specific  changes  in  analysis methods and  should not evaluate  analysis
methods  solely on the  basis  of  component-specific changes.  Neverthe-
less,  such an  examination may  be useful  in  determining  sources of
uncertainty  in risk  assessment  and  in  suggesting means  of  improving  risk
estimation through  additional  research  or data acquisition.   The results
of our component-specific uncertainty investigation may also be useful
for presenting a  range of  human  risk estimates and  so can  be
 incorporated into the guidelines for determining  that range.

The components can be divided into two sets.  First are those that
do not change the data base underlying the assessment.  Included in this
group are the components dictating the dose units used for extrapolation
and those specifying the manner in which results are averaged.  Such
components are not susceptible to confounding due to unintentional
changes in the data base.  These are the components that show very
consistent changes when approaches to them are altered (cf. Table 2-23).
The components related to averaging results have relatively little
effect on the RRD estimates; the modes are in the interval 0.8 to 1.25
and the dispersion factors are between 1.2 and 2.2.  Interestingly, two
of the analyses included in the recommended set (45 and 47) differ from
the standard, Analysis 30 (also in the recommended set), only in the
way they average results.  It appears that Analysis 30 is a satisfactory
method of bioassay analysis and RRD prediction; the analyses that differ
from it only in the approach to a component that produces consistent
changes in RRDs also tend to be satisfactory.  Changing dose units also
produces consistent changes in RRD estimates (dispersion factors between
1.3 and 2.3) although the modes of the distributions are shifted, often
substantially.  Again, the analyses that differ from 30 only with
respect to dose units yield relatively good prediction; Analysis 31 is
included in the recommended set.

The second category of components  includes those that change the data
base on which a risk assessment is based.  These display the least
amount of consistency with respect to RRD changes and so are the most
uncertain aspects  of quantitative  risk assessment.  This conclusion is
not diminished by  the fact  that these components are subject to
confounding  due to unintentional data changes.   In any assessment of a
particular chemical, which  may  have  more  limited data than  many  of  the
chemicals  in our  data base,  such confounding  remains a potential
problem.

With one exception (Analysis 43), the analyses that incorporate
alternative approaches to these components are relatively poor methods
of human risk prediction; the predictive power and good correlation
noted for Analysis 30 are diminished by altering one component.  It
seems likely that the high degree of chemical-specific change (lack of
consistency) is responsible.  That is not to say that some degree of
chemical-specificity  is  not desired.  One  would li'
-------
large dispersion factor,  39.6,  associated with the change in choice of
endpoint.

A corollary of these observations is that these highly uncertain
components, namely those related to length of observation and dosing,
route of exposure, carcinogenic responses to use, and species to use,
deserve much more investigation (certainly more than choice of dose
units for extrapolation).  The goals of such an investigation include
elucidation of the reasons behind the observed changes in RRDs and
identification of new approaches that would produce the desired changes
in RRDs, that is, ones that improve the predictiveness of the bioassay
analyses.  Potentially useful studies of the high degree of
chemical-specific changes may start with identification of groups of
chemicals (e.g. aromatic hydrocarbons, epigenetic carcinogens,
early-stage carcinogens, etc.) and examination of patterns within the
groups.  For some components, notably the one associated with choice of
species, other considerations, such as pharmacokinetic or genetic
differences, may need to be examined.  The empirical approach adopted
for the present investigation may not be sufficient to explicate all of
the changes seen.  But above all else, availability of good data sets
presenting information sufficient for studying these components with a
minimum of confounding is essential.

OPTIONS FOR PRESENTING A RANGE OF RISK ESTIMATES

 In  this  section  we discuss three options  for  presenting a range of  risk
 estimates suggested by the data.  These options are  derived  from  the
 five recommended analyses discussed  in the  previous  section.   Option  1
 requires selection of a single  analysis method from  among the  five,
 while Options 2  and 3 involve combining results from more than one
 analysis.

Regardless of the option selected, it seems reasonable to screen the
data that are going to be used.  The correlation analysis indicates that
data screening improves the correlations in general.  Consequently, a
process akin to the sieve that has been defined here, one that selects
the best of the available bioassays, is recommended.  In applications to
a single chemical, a less automated, more customized procedure could be
applied.  On the other hand, if a consistent and uniform approach is
desired for many chemicals, some automated sieve may be preferable.

Option 1

This option involves selecting one from among the five analyses
discussed in the previous section.  The selected analysis method is
applied to each of the eligible data sets, the median of the resulting
lower bound estimates is used as the predictor, and the conversion
factor, 10^c (cf. Table 3-1), is applied to the predictor to correct for
bias.  The resulting estimate is multiplied and divided by the residual
uncertainty factor (cf. Table 3-1) and the resulting range of RRDs is
the desired range.  Analysis-specific results are shown in Table 3-4 for
the twenty-one chemicals in the data base for which human data are not
available.  Any one of the five intervals displayed for each chemical
can be used to represent the range of risk estimates.  Note that for
several chemicals, it was not possible to apply Analysis 43.
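
The Option 1 calculation can be sketched in a few lines.  The function
name and all numerical inputs below are hypothetical and chosen only to
make the arithmetic easy to follow; actual conversion factors and
residual uncertainty factors are those tabulated in Table 3-1.

```python
from statistics import median

def option1_range(lower_bound_rrds, conversion_factor, residual_uf):
    """Return (lower, converted prediction, upper) for one analysis method.

    lower_bound_rrds  -- lower bound RRD estimates, one per eligible data set
    conversion_factor -- bias correction applied to the predictor
    residual_uf       -- residual uncertainty factor for the chosen method
    """
    predictor = median(lower_bound_rrds)       # median lower bound
    converted = predictor * conversion_factor  # correct for bias
    return converted / residual_uf, converted, converted * residual_uf

# Hypothetical inputs (not values from the report).
example = option1_range([0.8, 2.0, 5.0], conversion_factor=1.5, residual_uf=4.0)
print(example)
```

With these made-up inputs the median lower bound is 2.0, the converted
prediction is 3.0, and the range runs from 0.75 to 12.0.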

Options 2 and 3

For these options, all five analyses must be performed, using the
appropriate endpoints and dose units for extrapolation.  For each
analysis, select as the predictor the median of the lower bounds
resulting from the analysis and apply the corresponding conversion
factors.  The values obtained in this manner represent the results of
the methods of bioassay analysis that appear to be most appropriate for
estimating human risk and form the basis for determining the range of
those human estimates consistent with the data.

It is always possible to determine estimates via Analyses 30, 31, 45,
and 47.  It may, however, be the case that Analysis 43 cannot be
completed given the data available (cf. Table 3-2 in which several
chemicals lack estimates associated with this analysis).  When this
occurs, additional uncertainty is associated with the risk estimates:
the full characterization of the range of estimates consistent with the
recommended analyses is not possible.  To account for this, one may wish
to impute values for the missing estimates.  The component-specific
uncertainty analysis, with its dispersion factor, provides the means to
do so.

Analysis 43 is a single-component variant of Analysis 30.  The histogram
associated with this variation (Figure 2-52) indicates the mode lies
between 0.8 and 1.25 (the geometric mean of these values being 1.0).
The RRDs for Analysis 43 are imputed by taking the RRDs from Analysis 30
and multiplying them by a factor indicating the average ratio of the RRD
pairs.  We have used the geometric mean of the interval which contains
the mode, i.e. 1.0.  (Another reasonable factor could be based on the
median ratio.)  In doing this, the uncertainty is increased (reflecting
the uncertainty due to lack of the complete ensemble of results) which
is estimated by the dispersion factor.  The imputed Analysis 43 results
are multiplied and divided by the dispersion factor (39.6) since that
factor is the average amount by which the ratios differ from the mode.
The imputation of predictions for Analysis 43 is completed by applying
the conversion factors for Analysis 43 just as if the estimates were not
imputed.
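
A minimal sketch of this imputation follows.  The mode factor (1.0) and
dispersion factor (39.6) are the values given above, and the two
conversion factors for Analysis 43 (0.18 and 0.29) are those quoted
later in this chapter.  The function name, the example input, and the
choice to pair the smaller conversion factor with the lower end of the
interval and the larger with the upper end are assumptions made here for
illustration.  The residual uncertainty factor is not applied at this
stage.

```python
def impute_analysis43(rrd_analysis30,
                      mode_ratio=1.0,
                      dispersion_factor=39.6,
                      conversion_factors=(0.18, 0.29)):
    """Impute an interval of Analysis 43 predictions from an Analysis 30 RRD.

    Returns (low, high) after the dispersion and conversion factors have
    been applied.
    """
    center = rrd_analysis30 * mode_ratio   # imputed Analysis 43 RRD
    low = center / dispersion_factor       # divide by dispersion factor
    high = center * dispersion_factor      # multiply by dispersion factor
    # Assumed pairing: smallest factor widens the bottom, largest the top.
    return low * min(conversion_factors), high * max(conversion_factors)

# Hypothetical Analysis 30 median lower bound RRD of 2.0 (arbitrary units).
imputed_low, imputed_high = impute_analysis43(2.0)
print(imputed_low, imputed_high)
```

The resulting interval spans more than three orders of magnitude, which
illustrates why imputed predictions are so much wider than those from an
analysis that could actually be performed.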

At this stage, the assessment (i.e. prediction) of risk is completed.
One has derived the best predictions of human RRDs that are possible
from the data available: for an analysis that could be performed, a
short interval (derived from the range of conversion factors pertinent
to that method) of predictions is available, whereas for an analysis
whose results have had to be imputed a generally much wider interval of
predictions is the best that can be obtained.  However, because of the
variability characterized by the residual uncertainty factor, these
intervals are not sufficient indicators of the range of risk estimates
consistent with the data.  The converted predictions must be multiplied
and divided by the residual uncertainty factors to derive upper and
lower uncertainty bounds for the risk estimates from each analysis
method.

Since the recommended set of analyses contains methods that are good
with respect to prediction of human risk, the ranges of estimates
associated with those analyses characterize the human RRDs for the
chemical in question.  The ranges extending from the lower to the upper
uncertainty bounds for each individual method can be considered as
self-contained results (this is Option 1) or they may be considered as a
whole to present overall ranges of risks.  Two lines of reasoning
dictate how this might be accomplished.  First (Option 2), one may
reason as follows: to be most certain of including the true RRD in the
overall range, one must consider each analysis (since the best one for
any given chemical is not known) and characterize the range of estimates
by the interval from the smallest lower bound to the largest upper bound
(the "full range").  On the other hand (Option 3), one might argue that
any of these methods is adequate and that the range of human estimates
is suitably represented by the estimates from the method(s) that are
most consistent with the entire ensemble of results.  In this context,
consistency can only be determined modulo the degree of uncertainty.
[This is analogous to a statistical argument concerning the difference
between point estimates, for example, which can only be resolved to the
extent that the statistical variability allows.]  Consequently, the
Option 3 characterization of the range of risk estimates is defined as
the union of the intervals from the lower to the upper bounds associated
with some subset of the analyses such that the union contains the
predictions from all the analyses (i.e. the values, like those in Table
3-3, that have been adjusted by the conversion factors but have not had
the residual uncertainty factors applied) and is the smallest union
satisfying that condition.  This is a reasonable representation of the
range of estimates consistent with the results from all recommended
analyses, given our present degree of uncertainty.  We will call it the
smallest consistent range.
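
The two overall ranges can be sketched as follows.  Several things here
are assumptions made for illustration: each analysis is given a single
point prediction rather than a short interval, the "size" of a union is
measured as the log10 ratio of its largest upper bound to its smallest
lower bound, the union is reported by its two end points, and all
predictions and uncertainty factors are hypothetical numbers rather than
values from Tables 3-1 through 3-5.

```python
from itertools import combinations
from math import log10

# Hypothetical converted predictions and residual uncertainty factors.
predictions = {"30": 1.0, "31": 2.0, "43": 0.05, "45": 1.5, "47": 1.2}
factors = {"30": 10.0, "31": 12.0, "43": 8.0, "45": 5.0, "47": 6.0}
bounds = {a: (p / factors[a], p * factors[a]) for a, p in predictions.items()}

def full_range(bounds):
    """Option 2: smallest lower bound to largest upper bound."""
    lows, highs = zip(*bounds.values())
    return min(lows), max(highs)

def smallest_consistent_range(predictions, bounds):
    """Option 3: narrowest union of intervals covering every prediction.

    Returns (low, high, analyses_used), or None if no subset qualifies.
    """
    names = list(bounds)
    best = None
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            # Every prediction must fall inside at least one chosen interval.
            covered = all(
                any(bounds[a][0] <= p <= bounds[a][1] for a in subset)
                for p in predictions.values()
            )
            if not covered:
                continue
            low = min(bounds[a][0] for a in subset)
            high = max(bounds[a][1] for a in subset)
            width = log10(high / low)
            if best is None or width < best[0]:
                best = (width, low, high, subset)
    return None if best is None else best[1:]

option2 = full_range(bounds)
option3 = smallest_consistent_range(predictions, bounds)
print("Option 2 full range:", option2)
print("Option 3 smallest consistent range:", option3)
```

With five analyses there are only 31 subsets, so exhaustive search is
adequate.  In this made-up example the low prediction from "43" is
covered only by its own interval, so that interval fixes the lower end
of the Option 3 range, while the upper end comes from the narrowest
interval that covers the remaining predictions.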

Comparison  of Options

All three options presented define ranges of estimates by utilizing one
or more of the methods that have been shown empirically in this
investigation to do well with respect to prediction of human risk.
Moreover, they all incorporate quantitatively those aspects of
uncertainty that are summarized by the residual uncertainty factor.
Several of the advantages and disadvantages of the options are discussed
below.

Option 1 requires analysis of the bioassay data by a single method only.
The selection of the single method may be somewhat problematical,
however.  It has been argued that the ability of the analyses in the
recommended set to predict human risk is not clearly distinguishable by
the empirical approach adopted for the present investigation.
Nevertheless, other factors, based on toxicological considerations for
example, may dictate the choice of one of the analysis methods.  In that
case, there is no question about the method that should underlie
Option 1.

It is hard to conceive of other factors that could clearly dictate the
choice of a single analysis method, however.  Aside from Analysis 43,
all the analyses in the recommended set use exactly the same experiments
and carcinogenic responses to estimate risk.  If Analysis 43 is deemed
inappropriate because it uses total tumor-bearing animals, for instance,
one is left with four other methods one of which must, a priori, supply
the range of risk estimates, if one follows the procedure of Option 1.
A priori selection of a method may suit regulatory purposes very well.

Options 2 and 3, however, consider the intervals of estimates derived
from all of the methods.  No a priori decision is made about the
particular method to use.  Rather, the results of all the methods are
examined for consistency and the summary range of estimates reflects
that consistency as well as the analysis-specific uncertainty.  (In this
sense, these options reflect across-method uncertainty in addition to
within-method uncertainties.)  Greater consistency across analysis
methods yields smaller ranges.  Of course, the overall range is no
smaller than the smallest range associated with any given method (which,
given the conversion and residual uncertainty factors, must be from
Analysis 45); an overall range can reflect no more certainty than the
method with least uncertainty.

It may be the case that the full range estimated via Option 2
overestimates uncertainty.  It is true that inclusion of more analysis
methods in the preferred set can never diminish the Option 2 range.  So,
for example, should further investigation reveal other analysis methods
warranting inclusion in the recommended set, their inclusion could not
shrink the range determined by Option 2 and the current set of analyses.
Furthermore, no particular use is made of the analysis with least
residual uncertainty.  If all analyses predicted the same RRDs, the
method with the largest uncertainty factor, not the one with the
smallest factor, determines the full range.

Option 3 does not share these disadvantages with Option 2.  Because the
third option selects the smallest range that is consistent with all the
predictions, priority is given to the methods with least uncertainty.
Moreover, it is entirely possible that additional methods could reduce
the smallest consistent range, even if they have larger uncertainty
factors, if the added methods "cover" more of the original predictions.
In this manner, additional information of comparable quality (i.e. as
good in terms of predicting human risk) can refine our estimates of
human health effects.

At first glance, it appears that the necessity of imputing values when
particular analyses cannot be performed is a major disadvantage of
Options 2 and 3.  Indeed, the need to impute adds greatly to the
uncertainty and may provide some justification for dropping from the
recommended set those analyses for which imputation may be required.
(Note that Options 2 and 3 are equally applicable no matter how many
analysis methods are considered.)  It must be emphasized, however, that
the problem with imputation is not a methodological one, i.e. there is
nothing inherent in Options 2 or 3 that makes them suffer from this
difficulty.  (In fact, Option 1 would have the same difficulty if the
single method selected by that option was 43.)  The increased
uncertainty that results from imputation is caused by inadequacies in
data reporting or data dissemination.  If complete results, especially
those allowing definition of the responses needed (total tumor-bearing
animals or the combination of significant responses), were available,
then no imputation would be required and uncertainty caused by lack of
data would be considerably reduced.  The need to impute values is not a
legitimate criticism of Option 2 or Option 3.

In closing this comparison, it should be noted that the uncertainties
discussed in connection with all three of the options may not completely
characterize uncertainty.  In particular, there is uncertainty about the
shape of the dose-response curve that is not quantitatively estimated.
Moreover, the residual uncertainty factors represent only that part of
the uncertainty that is not explainable by uncertainty in the human
estimates.  The uncertainties not quantified fall outside the definition
of those that are particularly associated with any given analysis
method, but they should be borne in mind when considering the ranges of
estimates of human risk derived from any option.

Examples

The chemicals included in this investigation but lacking epidemiological
data sufficient for quantitative risk assessment (Table 2-2) can serve
as examples of the application of the three options.  Tables 3-2 through
3-4 present the median lower bound RRDs, the converted predictions, and
intervals of estimates derived by application of the analysis-specific
residual uncertainty factors, respectively, that underlie the
application of these options.  As mentioned earlier, any one column of
Table 3-4 represents the output from Option 1.  Table 3-5 contains the
two overall ranges from Options 2 and 3.

Several interesting features are illustrated by the ranges in Table 3-5.
First, as an example of the procedure for determining the smallest
consistent range (Option 3), consider acrylonitrile.  The interval
(Table 3-4) associated with Analysis
-------
involves other features of the data as well.  In particular, note in
Table 3-2 that the median lower bound estimates from Analysis 43 are
generally smaller than those from the other analyses.  Moreover, the
conversion factors for Analysis 43 are 0.18 and 0.29; i.e. the converted
values are even smaller than the raw values, whereas the factors for
other analyses in the restricted recommended set are greater than or
equal to one.  (This also explains why, even when no imputation is
necessary, Analysis 43 uncertainty bounds always determine the lower end
of the smallest consistent range.)  The difference in conversion factors
and the reason why they tend to separate the predictions of Analysis 43
from those of the other analyses can be explained by reference to Figure
2-52.  This histogram depicts the chemical-specific ratios of RRD
estimates derived from Analysis 43 to those derived from Analysis 30.
The six chemicals whose ratios are greater than 1.25 are only from the
set that have epidemiological data suitable for estimating the
conversion factors.  Given that the conversion factor for Analysis 30 is
roughly unity, these six chemicals, especially, have shifted the best
fitting line to the right, decreasing the y-intercept, and entailing
conversion factors substantially less than one.  But, as already noted,
the chemicals represented in Table 3-2 generally have ratios in Figure
2-52 that are less than 0.80.  Hence the divergence of the RRD
predictions.  The dichotomy displayed in Figure 2-52 between those
chemicals with suitable epidemiological data and those without is
undoubtedly fortuitous.  Nevertheless, it does show that a dispersion
factor as large as 39.6 used in the context of the imputation of values
is necessary to cover such occurrences.  It is important to note that
this added uncertainty is unnecessary: imputation is dictated solely by
data availability (having the ability to define the total tumor-bearing
animal response).  Better data reporting procedures can substantially
reduce the ranges of risk estimates.

For purposes of comparison, Table 3-6 presents the ranges of estimates
that are obtained from Options 2 and 3 when Analysis 43 is not
considered.  This eliminates the wide ranges produced as a result of
imputation.  However, by simply ignoring a method of extrapolation that
has been deemed to be of a value comparable to those of the other
methods in the recommended set, these ranges may be too narrow to the
extent that across-method uncertainty is underestimated.  Certainly, the
ranges presented in Table 3-5 are to be preferred over those in Table
3-6 when no imputation is necessary.  When Analysis 43 cannot be
performed and imputation is necessary, it is not clear which range is
more appropriate; those based on fewer analyses (Table 3-6) may be too
narrow while those based on an ad hoc imputation procedure may be too
wide.  It bears repeating that this dilemma could be avoided entirely if
some better means of data dissemination were to be found.

At present, there are no quantitative estimates of RRDs derived from the
epidemiological literature to which these predictions can be compared.
It might be possible to qualitatively compare the predictions to the
epidemiology in a couple of ways.  The predictions could be used to rank
the chemicals in order of their RRDs (reverse order of their
carcinogenic potencies).  Another ordering could be based on a
comparative examination of the epidemiology.  The degree of
correspondence of the two orders might provide information about the
predictions.  Of course, without quantitative estimates, the
epidemiologically based ordering would be subject to considerable
uncertainty in and of itself.  A chemical-specific examination of the
epidemiology might be useful in uncovering predictions that are way off
the mark.  Such a comparison would probably be quite crude and may be
limited to identifying those chemicals for which the predictions (being
finite) indicate carcinogenicity but the epidemiology indicates no
carcinogenicity.  Neither type of comparison has been undertaken for
this project.
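
One simple way to quantify the "degree of correspondence" between two
such orderings is a rank correlation.  The sketch below uses Spearman's
coefficient for untied ranks; the choice of this statistic is an
assumption made here and is not a method used in this project, and the
chemical labels and orderings are hypothetical.

```python
def spearman_rho(order_a, order_b):
    """Spearman rank correlation between two orderings of the same items.

    Both arguments list the same items, each exactly once, from first to
    last rank (no ties).
    """
    n = len(order_a)
    if n < 2 or len(set(order_a)) != n or set(order_a) != set(order_b):
        raise ValueError("orderings must contain the same distinct items")
    rank_b = {item: i for i, item in enumerate(order_b)}
    # Sum of squared rank differences over all items.
    d2 = sum((i - rank_b[item]) ** 2 for i, item in enumerate(order_a))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical orderings of five chemicals, from smallest to largest RRD.
by_prediction = ["chem1", "chem2", "chem3", "chem4", "chem5"]
by_epidemiology = ["chem2", "chem1", "chem3", "chem5", "chem4"]

rho = spearman_rho(by_prediction, by_epidemiology)
print(round(rho, 3))
```

A value near 1 would indicate that the two orderings largely agree, and
a value near 0 that they are unrelated.  Because the epidemiologically
based ordering is itself uncertain, such a coefficient would be only a
rough indicator.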

GENERAL CONSIDERATIONS AND MAJOR CONCLUSIONS

It is apparent that the animal data base and the methods used in this
study provide a useful basis for evaluating quantitative risk
assessment.  Their use in the present context has demonstrated the
relevance of animal carcinogenicity experiments to human risk
estimation.  Moreover, it has been possible to identify methods of
analysis of the bioassay data, including the choice of the median lower
bound predictor, that satisfactorily predict risk-related doses in
humans.  Application of these methods has led to suggested guidelines
concerning the prediction of human risks and the presentation of ranges
of estimates incorporating the relevant uncertainties.

Certain features of this investigation must be borne in mind.  Primary
among these is the fact that the level of risk for which RRDs have been
determined is 0.25.  This value is a compromise between the need to use
a value high enough to be fairly independent of the choice of
dose-response model in the bioassay analyses and the desire not to greatly
exceed the risk found in most epidemiologically studied cohorts.  A risk
level of 0.25 is higher than that which exists in most human exposure
situations.  While we would not expect some of the results to be altered
if the investigation had utilized a different risk level (e.g., 10^-6), it
is not certain that all the conclusions would remain the same.  In
particular, the evaluation of uncertainty in this report does not
encompass that related to the shape of the dose-response curve.  It may be
worthwhile to check some of the results at lower levels of risk,
although it must be noted that the increased uncertainty associated with
the shape of the dose-response curve at low doses may make interpretation
of results concerning other components of risk assessment difficult.
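
The dependence of an RRD on the dose-response model at different risk
levels can be illustrated with a small numerical sketch.  The sketch is
not taken from the analyses in this report.  It assumes zero background
risk, so that extra risk equals P(d), and compares two hypothetical
quantal models, a one-hit model P(d) = 1 - exp(-q1*d) and a purely
quadratic model P(d) = 1 - exp(-q2*d^2), both calibrated to the same
hypothetical observation (extra risk of 0.30 at a dose of 10 mg/kg/day).
All function names and numerical values are illustrative assumptions.

```python
import math


def rrd_one_hit(q1, extra_risk):
    """Dose giving `extra_risk` under P(d) = 1 - exp(-q1 * d)."""
    return -math.log(1.0 - extra_risk) / q1


def rrd_quadratic(q2, extra_risk):
    """Dose giving `extra_risk` under P(d) = 1 - exp(-q2 * d**2)."""
    return math.sqrt(-math.log(1.0 - extra_risk) / q2)


# Hypothetical calibration point (an assumption, not data from this study):
# both models are forced through extra risk 0.30 at a dose of 10 mg/kg/day.
OBS_DOSE, OBS_RISK = 10.0, 0.30
k = -math.log(1.0 - OBS_RISK)
q1 = k / OBS_DOSE          # one-hit coefficient
q2 = k / OBS_DOSE ** 2     # quadratic coefficient


def model_ratio(extra_risk):
    """Ratio of the quadratic-model RRD to the one-hit RRD at a risk level."""
    return rrd_quadratic(q2, extra_risk) / rrd_one_hit(q1, extra_risk)


if __name__ == "__main__":
    for risk in (0.25, 1e-6):
        print(risk, rrd_one_hit(q1, risk), rrd_quadratic(q2, risk),
              model_ratio(risk))
```

Under these assumptions the two models give RRDs within roughly 10
percent of each other at a risk level of 0.25, but differ by a factor of
several hundred at 10^-6.  This is the kind of model dependence that the
choice of 0.25 is intended to avoid.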

Also recall that the bioassay data, though extensive, is rather crude in
many respects.  We have already noted the problems associated with data
deficiencies, mostly caused by incomplete reporting of results.  Over
and above that, however, the analyses performed did not use time-to-tumor
data, i.e., a quantal model has been used to estimate RRDs.  Time
and data constraints dictated that choice, but it is of interest to
determine if time-to-tumor analyses, which utilize more of the
information obtained from a bioassay, could refine our results and
conclusions.

It must be recalled that when several forms of a suspected carcinogen
have been tested in animal bioassays, the results for all forms have
been grouped together.  This primarily influences the data and results
for the metals.  Since all forms are individually identified, it is
possible to perform the analyses on each form separately.  Of course,
the number of experiments for the affected chemicals would be reduced.
Moreover, it is often not known, even for substantiated human
carcinogens, which particular moieties cause cancer or which are most
potent.

Other reasonable approaches to the components of risk assessment could
have been defined.  Thus,  for example,  the component related to length
of dosing did not have to include only two approaches,  one including all
experiments and the other including only those experiments lasting at
least 90 percent of the standard length.  Short experiments by
themselves could have been studied.  This may have led to an examination
of the correction factor that has been used to adjust for short
observation periods.

As discussed in Volume 1 of this report, the epidemiological data used
in this study are of variable quality.  The bounds determining ranges of
exposure are somewhat arbitrary and, for each chemical, one cancer
endpoint from a single study was selected to represent the range of RRD
estimates and to be the target of the bioassay analyses.  It would be of
interest to determine how robust our findings are with respect to these
choices.  Moreover, the pattern of exposure for which RRDs have been
estimated (45 years of constant exposure starting at age 20) is not
realistic for some of the study chemicals (such as DES and estrogen).
This choice, too, is a compromise between the usual lifetime exposure
administered in bioassays (that is, lifetime after start of exposure,
which may be several weeks after the birth of the test animals) and the
less consistent exposures which humans encounter.

In the prediction analysis, three loss functions have been defined.
None of them is the standard squared-error loss routinely applied, since
the latter is clearly not appropriate for the estimates derived here.
No statistical development of these loss functions exists to inform us
about lack of fit, significance of differences in loss, etc.  If a
statistical underpinning did exist, it would be possible to use the loss
functions in some capacity besides as ranking procedures and, thereby,
to be able to better differentiate between the analysis methods and
refine the conclusions.

Finally, only 55  distinct bioassay analysis  methods  have been  defined.
This  is  only a  small  fraction of  those  that  could be considered,  even
fixing the  approaches to  the components at  those  defined here.

Despite the caveats just presented,  the following major conclusions have
emerged from the present investigation.

     •  Animal and human RRDs are strongly correlated.  The knowledge
        that this correlation exists should strengthen the scientific
        basis for cancer risk assessment and cause increased confidence
        to be placed in estimates of human cancer risk made from animal
        data.

     •  In the majority of cases considered, analysis methods for
        bioassay data that utilize lower statistical confidence limits
        as predictors yield better predictions of human risk than do the
        same methods using maximum likelihood estimates.

     •  Analysis methods for bioassay data that utilize median lower
        bound RRDs determined from the ensemble of data for a chemical
        generally yield better predictions of human results than
        analyses that utilize minimum lower bound RRDs (assuming
        approaches to other risk assessment components are chosen
        appropriately).

     •  Use of the "mg intake/kg body weight/day" (body weight) method
        for animal-to-human extrapolation generally causes RRDs
        estimated from animal and human data to correspond more closely
        than the other methods evaluated, including the "mg intake/m²
        surface area/day" (surface area) method.

      •  The risk assessment approach for animal data that was intended
        to mimic that used by the EPA underestimates the RRDs
        (equivalent to overestimating human risk) obtained from the
        human data in this study by about an order of magnitude, on
        average.  However, it should be understood that the risk
        assessment approaches implemented in this study are computer
        automated and do not always utilize the same data or provide the
-------
       the average multiplicative factor by which the RRD predictors
       obtained from the animal data are inconsistent with the ranges
       of human RRDs consistent with the human data) to 1.7.  This is
       not the same as saying that the predictors are accurate to
       within a factor of 1.7, because the estimated ranges of human
       RRDs that are consistent with the human data cover an order of
       magnitude or more for most chemicals.

     •  It has been  possible to identify a  set of analysis methods using
       the median lower bound estimates that are most appropriate for
       extrapolating risk from animals to humans, given the current
       state of  knowledge  and  data  analysis.  It  is possible  to  use the
       information  and results presented  in  this  investigation to
       calculate ranges of risk  estimates  that are consistent with  the
       data  and  also  incorporate many  uncertainties associated with the
       extrapolation  procedure.

     •  Evaluation of  risk  assessment methods should focus on  the
       complete  risk  assessment  process rather  than on  individual
       components.

     •   The  data  base  and methods used  in  this  study can  provide  a
        useful basis for the evaluation of various risk  assessment
       methods.
DIRECTIONS FOR FUTURE RESEARCH

In the course of the previous discussion, several proposed extensions of
this project have been mentioned.  Several fall under the heading of
sensitivity analyses of the results already obtained.  These include
investigation of the robustness of the results to reasonable alternative
choices for the
-------
The data that is available from this project could provide an
interesting and pertinent example to which that development could apply.

Also discussed in connection with component-specific uncertainty are
efforts directed at reducing or explaining that uncertainty.  The
greatest uncertainties are related to the components specifying how to
handle experiments of different lengths of dosing, routes of exposure,
or test species and specifying the carcinogenic responses to use.  Many
aspects of these components and their uncertainties can be addressed in
an investigation of pharmacokinetics.  The data base contains detailed
data on the timing and intensity of exposure for each bioassay, so a
pharmacokinetic study, which requires such information, is entirely
feasible with the currently collected data.  Two specific proposals are
discussed here.

Risk estimates incorporating pharmacokinetic data could be used to
determine appropriate surrogate doses.  It is sometimes assumed that a
given dose measured as average concentration of the active metabolite at
the target tissue will produce the same risk in animals and humans.
However, given the many differences between animals and humans (size,
life span, and metabolic rates, to mention a few), it is not clear
which, if any, surrogate dose is the most appropriate.  This issue is
similar to that of choice of the most appropriate surrogate dose measure
for animal to human extrapolation (e.g., mg/kg/day versus mg/m²/day)
considered in this study and can be studied in a similar manner.  Risk
estimates using pharmacokinetic data could be used to determine
empirically the most appropriate surrogate dose.  Even though the range
of RRDs consistent with the human data generally covers a range of an
order of magnitude or greater, the potential surrogate doses cover an
even wider range.  Just as the present study indicates that certain dose
measures appear to predict human results well in conjunction with
appropriate choices for other risk assessment components, a study using
pharmacokinetic data should allow similar conclusions regarding the
surrogate dose.  A preliminary investigation indicates that possibly 16
of the 23 chemicals with suitable human data used in this study might
also have data that would support a risk assessment that incorporates
pharmacokinetic data.

A second potentially useful investigation incorporating pharmacokinetic
data involves using the data in the data base on different routes of
exposure to study the best means of extrapolating from route to route in
animal studies.  Risk assessment methods, including the ones examined in
this study, often assume a given dose rate involves the same risk,
regardless of route.  This clearly is a gross oversimplification.  The
animal data collected for this study contains numerous examples of
carcinogenicity studies on the same chemical and animal species, but for
which exposure is through different routes.  Those studies could be used
to determine how pharmacokinetic data could best be applied to perform
route-to-route extrapolation.  Since human data would not be essential
in these investigations, our total data base that encompasses 44
chemicals could be used.

The question of different chemical classes and the consistency that may
be apparent within any of the classes is deserving of further study.  It
would be reasonable to couple this work with pharmacokinetic methods.
In the present data base, several classes are represented.  However, the
number within any particular class is somewhat limited.  An expanded
data base may be necessary for a thorough investigation.

In fact, one desirable goal in and of itself, but one that would enhance
the prospects for successful completion of these other proposals, is the
maintenance and updating of the bioassay data base.  All aspects of
this, including accumulation of more data sets for the chemicals already
included and addition of more substances, may be necessary.  Some
revamping of the data coding format may also make future analyses easier
and more accurate.  Especially for pharmacokinetic studies, for
instance, dose patterns could be recorded on a daily rather than weekly
basis.

As a counterpart to the bioassay data base enhancement, updating and
augmenting the epidemiological data is essential.  Since the
epidemiological data (in particular, data on exposure) is the single
most limiting factor preventing use of human data, any hope of
increasing the size of the sample of chemicals useful in estimating
conversion factors and residual uncertainty must be based on an effort
to acquire such data.  For those chemicals already analyzed, more
specific exposure data would reduce the uncertainty bounds surrounding
epidemiological RRD estimates and refine our estimates.  As is the case
with the bioassay data, much of the limitation or uncertainty is solely
a matter of inadequate reporting of data.

It should be noted in passing that the methods and portions of the
computer programs developed and applied in this project may be useful in
other contexts.  Of particular interest is a study of other types of
health effects, e.g., reproductive effects.  The investigation of these
issues could include determinations of uncertainty as well as
identification of the most appropriate methods.  Other projects,
including investigation of other types of extrapolations, e.g., from one
temporal dosing pattern to another or from rats to mice, could also be
facilitated by use of the data base, methods, and programs developed in
the present work.

Finally, one would  like to  investigate cancer risk assessment methods
appropriate when  data  available  to a particular assessment are  limited.
We have  mentioned this problem  in connection with component  specific
uncertainty  (i.e.  noting  that confounding like that  affecting those
uncertainty  calculations  will often be present in any given  risk
analysis setting) and  in  connection with the  set of  recommended bioassay
analysis methods.   In  the latter instance,  it was pointed  out that each
analysis in  the  recommended set,  save  for Analysis  17,  is  capable of
being applied  to any data base  but  that  data  limitations due to
incomplete data presentation may entail that Analyses 20 and 43 are not
possible.  The remaining analyses (30, 31, 45, and 47) can be performed
no matter what the data  set contains,  but they may  be seriously affected
by the extent  and nature  of the contents.

Consequently, the following investigation is proposed as a means of
studying the effects of the limitations on the data for any chemical of
interest and of determining how best to extrapolate risks to humans.
Pick the data in the data base that most nearly matches the data for the
chemical in question.  The matching may be based on species, routes of
exposure, and quality of the data.  Moreover, one may wish to restrict
attention to chemicals that are in the same class as the substance of
interest.  Suppose, for example, a volatile organic chemical is under
investigation and that the only data available are from rat inhalation
studies.  Then, the proposed procedure would first select rat inhalation
bioassays conducted using appropriate chemicals (i.e., perhaps limited
to volatile organics).  The components of risk assessment not fixed by
the selection could be varied and the method that works best with the
selected data would be the basis for extrapolating to humans risks due
to the chemical in question.  Since we also have a recommended set
consisting of methods that appear to perform well for the data and
chemicals considered as a whole, the risks estimated on that basis (i.e.,
using the recommended set) would be available for comparison.  These
estimates reveal what would happen if other species, other routes, and
other chemicals are included.  The relationship between the estimates
obtained by the two approaches would suggest a general type of
uncertainty attributable to use of a limited data base (in this example,
rat inhalation studies).  A pilot study could investigate the
feasibility of such a chemical-specific approach to risk assessment.
REFERENCE

1.  Crump, K., Silvers, A., Ricci, P., and Wyzga, R. (1985).  Interspecies
    Comparison for Carcinogenic Potency to Humans.  Principles
    of Health Risk Assessment.  Ricci, P. (ed.).  Prentice Hall.

-------
                                Table 3-1

              COMPARISON OF RESULTS FOR SELECTED ANALYSESᵃ

                                                           Bias-
                                         Total           Correcting      Residual
           Number of   Correlation    Incremental        Conversion     Uncertainty
Analysis   Chemicals   Coefficient  Normalized Lossᵇ      Factorsᶜ        Factorᵈ

0             20          0.78           1.15            1.6 - 2.1          5.3
0ᵉ            20          0.78           1.71            12 - 12           16.2
7             19          0.76           1.40            1.6 - 3.6          5.4
11c           19          0.77           0.62            0.81 - 1.9         4.5
110           13          0.78           1.01            3.7 - 4.3          3.1
17            11          0.58           0.27            2.8 - 7.8          4.2
20            17          0.67           0.62            0.69 - 0.78        7.1
30            23          0.91           0.39            1.1 - 1.7          2.0
31            23          0.90           0.53            8.5 - 12           2.0
43            17          0.74           0.28            0.18 - 0.29        2.8
45            23          0.91           0.27            1.2 - 1.7          1.7
47            23          0.89           0.28            1.0 - 1.7          1.8

ᵃThe results correspond to the member of the pair (with sieve, without
 sieve) that gives best results.  For Analyses 11c, 20, and 43 this is
 without the sieve; for other analyses this is with the sieve.  The
 median lower bound predictor, L2Q, is used in all analyses except for
 the exception noted.
ᵇThis value is not the same as that in Table 2-8 because the inclusion
 of the supplemental analyses reduced the minimum average loss for two
 of the three loss functions and increased the maximum loss for all
 three of the functions.
ᶜThese values are the factors, 10^c, based on the y-intercepts from the
 CAUCHY and TANH loss functions (cf. Tables 2-13 and 2-17) and represent
 the average ratio of human RRDs to animal RRDs.
ᵈResidual uncertainty is from Table 2-21 or 2-22.  It is the factor
 computed for all chemicals and represents the average factor by which a
 prediction must be multiplied or divided in order to eliminate
 uncertainty not due to uncertainty in the human estimates.
ᵉUsing minimal lower bound estimator, LM.

-------
                               Table 3-2

  MEDIAN LOWER BOUND RRD ESTIMATES, BY CHEMICAL AND ANALYSIS METHODᵃ

                                              Analysis
Chemical                30          31          43          45          47

Acrylonitrile           4.39        9.29E-1     1.01        3.57        4.39
Allyl Chloride          6.92E+1     1.02E+1     6.71E+1     7.27E+1     1.11E+2
4-Aminobiphenyl         2.17E+1     2.03        --ᵇ         2.42E+1     2.17E+1
Benzo(a)pyrene          5.21E-1     7.02E-2     4.80E-2     5.87E-1     5.21E-1
Carbon Tetrachloride    3.10E+1     2.89        --          3.57E+1     3.10E+1
Chlordane               2.36        1.96E-1     1.56        1.99        4.43
3,3-Dichlorobenzidine   1.24E+1     2.62        5.59E-1     1.91E+1     1.24E+1
1,2-Dichloroethane      2.79E+1     4.62        1.28E+1     3.34E+1     4.31E+1
EDB                     3.77        2.94E-1     2.96        3.29        4.82
Formaldehyde            1.88        3.09E-1     --          1.15        3.16
Hexachlorobenzene       1.30        2.00E-1     2.84E-1     1.30        2.48
Hydrazine               1.87        1.74E-1     --          1.87        9.15
Mustard Gas             1.40E-7     1.31E-8     --          1.40E-7     1.40E-7
Lead                    6.14        1.30        6.09        6.14        6.14
2-Naphthylamine         1.20E+1     2.54        --          1.20E+1     1.20E+1
NTA                     6.24E+2     5.66E+1     2.23E+1     6.24E+2     6.97E+2
2,4,6-Trichlorophenol   1.73E+2     1.61E+1     --          2.11E+2     1.90E+2
TCDD                    7.32E-5     1.55E-5     2.56E-5     6.87E-5     9.05E-5
Tetrachloroethylene     9.22E+1     8.25        8.06E+1     8.70E+1     1.13E+2
Toxaphene               2.58        1.89E-1     1.37        4.23        4.74
Vinylidene Chloride     1.34        2.45E-1     7.26E-1     5.56E-1     2.34

ᵃThe full sieve was used to screen the data; the estimates have not been
 adjusted by the appropriate conversion factors.
ᵇA "--" indicates that the data were not available to apply the method
 to the chemical.

-------
                                                        Table 3-3

                                   RRD PREDICTIONSᵃ, BY CHEMICAL AND ANALYSIS METHOD

                                                                Analysis
Chemical                30                    31                    43                    45                    47

Acrylonitrile           [4.74, 7.46]          [7.85,       ]                              [4.14, 6.07]          [4.39, 7.46]
Allyl Chloride          [7.47E+1, 1.18E+2]    [8.62E+1,       ]                           [8.43E+1, 1.24E+2]    [1.11E+2, 1.89E+2]
4-Aminobiphenyl         [2.34E+1, 3.69E+1]    [1.72E+1,       ]                           [2.81E+1, 4.11E+1]    [2.17E+1, 3.69E+1]
Benzo(a)pyrene          [5.63E-1, 8.86E-1]    [5.93E-1,       ]                           [6.81E-1, 9.98E-1]    [5.21E-1, 8.86E-1]
Carbon Tetrachloride    [3.35E+1, 5.27E+1]    [2.44E+1,       ]                           [4.14E+1, 6.07E+1]    [3.10E+1, 5.27E+1]
Chlordane               [2.55, 4.01]          [1.66,       ]                              [2.31, 3.58]          [4.43, 7.53]
3,3-Dichlorobenzidine   [1.34E+1, 2.11E+1]                                                [2.22E+1, 3.25E+1]    [1.24E+1, 2.11E+1]
1,2-Dichloroethane      [3.01E+1, 4.74E+1]                                                [3.87E+1, 5.68E+1]    [4.31E+1, 7.33E+1]
EDB                     [4.07, 6.41]                                                      [3.82, 5.59]          [4.82, 8.19]
Formaldehyde            [2.03, 3.20]                                                      [1.33, 1.96]          [3.16, 5.37]
Hexachlorobenzene       [1.40, 2.21]                                                      [1.51, 2.21]          [2.48, 4.22]
Hydrazine               [2.02, 3.18]                                                      [2.17, 3.18]          [9.15, 1.56E+1]
Mustard Gas             [1.51E-7, 2.38E-7]                                                [1.62E-7, 2.38E-7]    [1.40E-7, 2.38E-7]
Lead                    [6.63, 1.04E+1]                                                   [7.12, 1.04E+1]       [6.14, 1.04E+1]
2-Naphthylamine         [1.30E+1, 2.04E+1]                                                [1.39E+1, 2.04E+1]    [1.20E+1, 2.04E+1]
NTA                     [6.74E+2, 1.06E+3]                                                [7.24E+2, 1.06E+3]    [6.97E+2, 1.18E+3]
2,4,6-Trichlorophenol   [1.87E+2, 2.94E+2]                                                [2.45E+2, 3.59E+2]    [1.90E+2, 3.23E+2]
TCDD                    [7.91E-5, 1.24E-4]                                                [7.97E-5, 1.18E-4]    [9.05E-5, 1.54E-4]
Tetrachloroethylene     [9.96E+1, 1.57E+2]                                                [1.01E+2, 1.48E+2]    [1.13E+2, 1.92E+2]
Toxaphene               [2.79, 4.39]                                [       , 3.97E-1]    [4.91, 7.19]          [4.74, 8.06]
Vinylidene Chloride     [1.45, 2.28]                                [       , 2.11E-1]    [6.45E-1, 9.45E-1]    [2.34, 3.98]

ᵃThe predictions are derived from the values in Table 3-2 by application of the appropriate conversion factors.
ᵇThe intervals are the result of applying the two conversion factors given in Table 3-1 for each analysis method.
ᶜA "--" indicates that the data were not available to apply the method to the chemical.

-------
                                                      Table 3-4

                     UNCERTAINTY INTERVALS FOR RRD PREDICTIONSᵃ, BY CHEMICAL AND ANALYSIS METHOD

                                                                Analysis
Chemical                30                    31                    43                    45                    47

Acrylonitrile           [2.37, 1.49E+1]       [3.93, 2.24E+1]       [6.50E-2, 8.20E-1]    [2.44, 1.03E+1]       [2.44, 1.34E+1]
Allyl Chloride          [3.74E+1, 2.36E+2]    [4.31E+1, 2.46E+2]    [4.32, 5.46E+1]       [4.96E+1, 2.11E+2]    [6.17E+1, 3.40E+2]
4-Aminobiphenyl         [1.17E+1, 7.38E+1]    [8.60, 4.88E+1]       --ᵇ                   [1.65E+1, 6.99E+1]    [1.21E+1, 6.64E+1]
Benzo(a)pyrene          [2.82E-1, 1.77]       [2.96E-1, 1.69]       [3.09E-3, 3.89E-2]    [4.01E-1, 1.70]       [2.89E-1, 1.59]
Carbon Tetrachloride    [1.68E+1, 1.05E+2]    [1.22E+1,       ]     --                    [2.44E+1, 1.03E+2]    [1.72E+1, 9.49E+1]
Chlordane               [1.28, 8.02]          [8.30E-1, 4.72]       [1.00E-1, 1.27]       [1.36, 5.75]          [2.46, 1.36E+1]
3,3-Dichlorobenzidine   [6.70, 4.22E+1]       [1.10E+1, 6.30E+1]    [3.61E-2, 4.54E-1]    [1.31E+1, 5.53E+1]    [6.89, 3.80E+1]
1,2-Dichloroethane      [1.51E+1, 9.48E+1]    [1.95E+1, 1.11E+2]    [8.21E-1, 1.04E+1]    [2.28E+1, 9.66E+1]    [2.39E+1, 1.32E+2]
EDB                     [2.04, 1.28E+1]       [1.24, 7.06]          [1.90E-1, 2.40]       [2.25, 9.50]          [2.68, 1.47E+1]
Formaldehyde            [1.02, 6.40]          [1.30, 7.42]          --                    [7.82E-1, 3.33]       [1.76, 9.67]
Hexachlorobenzene       [7.00E-1, 4.42]       [8.45E-1, 4.80]       [1.82E-2, 2.31]       [8.88E-1, 3.76]       [1.38, 7.60]
Hydrazine               [1.01, 6.36]          [7.35E-1, 4.18]       --                    [1.28, 5.41]          [5.08, 2.81E+1]
Mustard Gas             [7.55E-8, 4.76E-7]    [5.55E-8, 3.14E-7]    --                    [9.53E-8, 4.05E-7]    [7.78E-8, 4.28E-7]
Lead                    [3.32, 2.08E+1]       [5.50, 3.12E+1]       [3.93E-1, 4.96]       [4.19, 1.77E+1]       [3.41, 1.87E+1]
2-Naphthylamine         [6.50, 4.08E+1]       [1.08E+1, 6.10E+1]    --                    [8.18, 3.47E+1]       [6.67, 3.67E+1]
NTA                     [3.37E+2, 2.12E+3]    [2.39E+2, 1.36E+3]    [1.43, 1.81E+1]       [4.26E+2, 1.80E+3]    [3.87E+2, 2.12E+3]
2,4,6-Trichlorophenol   [9.35E+1, 5.88E+2]    [6.80E+1, 3.88E+2]    --                    [1.44E+2, 6.10E+2]    [1.06E+2, 5.81E+2]
TCDD                    [3.96E-5, 2.48E-4]    [6.55E-5, 3.72E-4]    [1.65E-6, 2.08E-5]    [4.69E-5, 2.01E-4]    [5.03E-5, 2.77E-4]
Tetrachloroethylene     [4.98E+1, 3.14E+2]    [3.48E+1, 1.98E+2]    [5.18, 6.55E+1]       [5.94E+1, 2.52E+2]    [6.28E+1, 3.46E+2]
Toxaphene               [1.40, 8.78]          [8.00E-1, 4.54]       [8.82E-2, 1.11]       [2.89, 1.22E+1]       [2.63, 1.45E+1]
Vinylidene Chloride     [7.25E-1, 2.56]       [1.04, 5.88]          [4.68E-2, 5.91E-1]    [3.79E-1, 1.61]       [1.30, 7.16]

ᵃThe intervals are derived from the values in Table 3-3 by application of the residual uncertainty factors (cf.
 Table 3-1).
ᵇA "--" indicates that the data were not available to apply the method to the chemical.
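
The arithmetic linking Tables 3-1 through 3-4 can be sketched as
follows.  This is a minimal illustration, not the program used in this
project, and the function names are hypothetical.  It takes the Analysis
47 entry for acrylonitrile in Table 3-2 (4.39 mg/kg/day), applies the two
conversion factors listed for Analysis 47 in Table 3-1 (1.0 and 1.7) to
obtain the prediction interval of Table 3-3, and then divides and
multiplies by the residual uncertainty factor (1.8) to obtain the
uncertainty interval of Table 3-4.

```python
def prediction_interval(rrd_estimate, conv_low, conv_high):
    """Apply the two conversion factors to a median lower bound RRD."""
    return (rrd_estimate * conv_low, rrd_estimate * conv_high)


def uncertainty_interval(prediction, residual_factor):
    """Widen a prediction interval by the residual uncertainty factor."""
    low, high = prediction
    return (low / residual_factor, high * residual_factor)


# Values as read from the tables for acrylonitrile, Analysis 47.
RRD_ESTIMATE = 4.39             # Table 3-2, mg/kg/day
CONV_LOW, CONV_HIGH = 1.0, 1.7  # Table 3-1, conversion factors
RESIDUAL = 1.8                  # Table 3-1, residual uncertainty factor

pred = prediction_interval(RRD_ESTIMATE, CONV_LOW, CONV_HIGH)
unc = uncertainty_interval(pred, RESIDUAL)

if __name__ == "__main__":
    print("prediction interval:  [%.2f, %.2f]" % pred)   # cf. Table 3-3
    print("uncertainty interval: [%.2f, %.1f]" % unc)    # cf. Table 3-4
```

Because the factors in Table 3-1 are rounded to two significant figures,
the same calculation for other analyses reproduces the tabulated
intervals only approximately.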

-------
                                                               Table 3-5

                                  RANGES OF HUMAN RRDS DERIVED FROM THE RECOMMENDED SET OF ANALYSESᵃ

                        Option 2:                 Option 3:
Chemical                Full Rangeᵇ               Smallest Consistent Rangeᶜ

Acrylonitrile           [6.50E-2, 2.24E+1]        [6.50E-2, 8.20E-1] U [2.44, 1.34E+1] (43, 47)ᵈ
Allyl Chloride          [4.32, 3.40E+2]           [4.32, 2.11E+2] (43, 45)
4-Aminobiphenyl         [3.52E-2, 6.98E+2]*       [3.52E-2, 6.98E+2] (43)
Benzo(a)pyrene          [3.09E-3, 1.77]           [3.09E-3, 3.89E-2] U [4.01E-1, 1.70] (43, 45)
Carbon Tetrachloride    [5.03E-2, 9.97E+2]*       [5.03E-2, 9.97E+2] (43)
Chlordane               [1.00E-1, 1.36E+1]        [1.00E-1, 1.27] U [1.28, 8.02] (43, 30)
3,3-Dichlorobenzidine   [3.61E-2, 6.30E+1]        [3.61E-2, 4.54E-1] U [6.89, 3.80E+1] (43, 47)
1,2-Dichloroethane      [8.21E-1, 1.32E+2]        [8.21E-1, 1.04E+1] U [2.28E+1, 9.66E+1] (43, 45)
EDB                     [1.90E-1, 1.47E+1]*       [1.90E-1, 9.50] (43, 45)
Formaldehyde            [3.05E-3, 6.05E+1]*       [3.05E-3, 6.05E+1] (43)
Hexachlorobenzene       [1.82E-2, 7.60]           [1.82E-2, 4.42] (43, 30)
Hydrazine               [3.04E-3, 6.01E+1]*       [3.04E-3, 6.01E+1] (43)
Mustard Gas             [2.27E-10, 4.50E-6]*      [2.27E-10, 4.50E-6] (43)
Lead                    [3.93E-1, 3.12E+1]        [3.93E-1, 1.77E+1] (43, 45)
2-Naphthylamine         [1.95E-2, 3.86E+2]*       [1.95E-2, 3.86E+2] (43)
NTA                     [1.43, 2.12E+3]           [1.43, 1.81E+1] U [4.26E+2, 1.80E+3] (43, 45)
2,4,6-Trichlorophenol   [2.81E-1, 5.56E+3]*       [2.81E-1, 5.56E+3] (43)
TCDD                    [1.65E-6, 3.72E-4]        [1.65E-6, 2.08E-5] U [4.69E-5, 2.01E-4] (43, 45)
Tetrachloroethylene     [5.18, 3.46E+2]           [5.18, 1.98E+2] (43, 31)
Toxaphene               [8.82E-2, 1.45E+1]        [8.82E-2, 1.11] U [1.40, 8.78] (43, 30)
Vinylidene Chloride     [4.68E-2, 7.16]           [4.68E-2, 5.88] (43, 45, 31)

ᵃValues of RRDs are in mg/kg/day.
ᵇThe full range extends from the smallest lower bound to the largest upper bound among
 analyses in the recommended set.
ᶜThe smallest consistent range is the union of intervals from analyses in the recommended
 set such that the union includes all predictions (from Table 3-3) and is the smallest union
 that does so.
ᵈWhen the union is of disjoint parts, both parts are shown, connected by the union symbol,
 "U".  In parentheses are the analyses whose union defines the smallest consistent range.
*An asterisk marks those intervals that are the result of imputing values for Analysis 43.

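How the ranges in Table 3-5 are assembled from the per-analysis
uncertainty intervals of Table 3-4 can be sketched as below.  The sketch
is illustrative only and the function names are hypothetical.  It
computes the full range (Option 2) and merges a given set of intervals
into a union of disjoint parts.  It does not attempt to choose which
analyses make up the smallest consistent range (Option 3), since that
choice also depends on the predictions in Table 3-3.

```python
def full_range(intervals):
    """Smallest lower bound to largest upper bound over all intervals."""
    intervals = list(intervals)
    return (min(lo for lo, _ in intervals), max(hi for _, hi in intervals))


def merge_union(intervals):
    """Merge overlapping intervals; disjoint parts are kept separate."""
    merged = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1]:
            # Overlaps the previous part: extend it if necessary.
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


# Acrylonitrile uncertainty intervals as read from Table 3-4 (mg/kg/day),
# keyed by analysis number.
acrylonitrile = {
    30: (2.37, 14.9),
    31: (3.93, 22.4),
    43: (0.0650, 0.820),
    45: (2.44, 10.3),
    47: (2.44, 13.4),
}

option_2 = full_range(acrylonitrile.values())
union_43_47 = merge_union([acrylonitrile[43], acrylonitrile[47]])

# Hexachlorobenzene, Analyses 43 and 30: the intervals overlap and merge.
hcb_43_30 = merge_union([(0.0182, 2.31), (0.700, 4.42)])

if __name__ == "__main__":
    print(option_2)
    print(union_43_47)
    print(hcb_43_30)
```

For acrylonitrile the full range agrees with the Option 2 entry in Table
3-5, and the union of the intervals for Analyses 43 and 47 remains in two
disjoint parts, as shown in the Option 3 column.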
-------
                                Table 3-6

            RANGES OF HUMAN RRDS DERIVED FROM THE RECOMMENDED
                 SET OF ANALYSES IGNORING ANALYSIS 43ᵃ

                        Option 2:                 Option 3:
Chemical                Full Rangeᵇ               Smallest Consistent Rangeᶜ

Acrylonitrile           [2.37, 2.24E+1]           [2.44, 1.34E+1] (47)
Allyl Chloride          [3.74E+1, 3.40E+2]        [4.96E+1, 2.11E+2] (45)
4-Aminobiphenyl         [8.60, 7.38E+1]           [1.65E+1, 6.99E+1] (45)
Benzo(a)pyrene          [2.82E-1, 1.77]           [4.01E-1, 1.70] (45)
Carbon Tetrachloride    [1.22E+1, 1.05E+2]        [2.44E+1, 1.03E+2] (45)
Chlordane               [8.30E-1, 1.36E+1]        [1.28, 8.02] (30)
3,3-Dichlorobenzidine   [6.70, 6.30E+1]           [6.89, 3.80E+1] (47)
1,2-Dichloroethane      [1.51E+1, 1.32E+2]        [2.28E+1, 9.66E+1] (45)
EDB                     [1.24, 1.47E+1]           [2.25, 9.50] (45)
Formaldehyde            [7.82E-1, 9.67]           [1.30, 7.42] (31)
Hexachlorobenzene       [7.00E-1, 7.60]           [1.38, 7.60] (47)
Hydrazine               [7.35E-1, 2.81E+1]        [1.28, 2.81E+1] (45, 47)
Mustard Gas             [5.55E-8, 4.76E-7]        [9.53E-8, 4.05E-7] (45)
Lead                    [3.32, 3.12E+1]           [4.19, 1.77E+1] (45)
2-Naphthylamine         [6.50, 6.10E+1]           [8.18, 3.47E+1] (45)
NTA                     [2.39E+2, 2.12E+3]        [4.26E+2, 1.80E+3] (45)
2,4,6-Trichlorophenol   [6.80E+1, 6.10E+2]        [1.06E+2, 5.81E+2] (47)
TCDD                    [3.96E-5, 3.72E-4]        [4.69E-5, 2.01E-4] (45)
Tetrachloroethylene     [3.48E+1, 3.46E+2]        [5.94E+1, 2.52E+2] (45)
Toxaphene               [8.00E-1, 1.45E+1]        [1.40, 8.78] (30)
Vinylidene Chloride     [3.79E-1, 7.16]           [3.79E-1, 5.88] (31, 45)

ᵃValues of RRDs are in mg/kg/day.
ᵇThe full range extends from the smallest lower bound to the largest
 upper bound among analyses in the recommended set.
ᶜThe smallest consistent range is the union of intervals from analyses
 in the recommended set such that the union includes all predictions
 (from Table 3-3) and is the smallest union that does so.

-------
                                                   Figure 3-1

                                   Plot for Analysis 45 (All Routes, ...)

-------